AI is the sixth great revolution in filmmaking
The first motion picture in human history was filmed nearly 146 years ago, almost to the day, by the famous photographer Eadweard Muybridge (who had been tried for murder and acquitted just a few years earlier) on June 19, 1878, in Palo Alto, California.
It featured a jockey riding a horse — as viewers of Jordan Peele’s modern horror film Nope will recall — part of an effort by his client Leland Stanford, the railroad magnate who would later found Stanford University, to settle the intense debate at the time over whether a galloping horse ever has all four hooves off the ground at once, or always keeps at least one hoof down (the former is true).
Ever since then, by my count, there have been five great technological revolutions in the medium of filmmaking.
- Silent Film Era (1878-1929)
- Sound/Talkies Era (1927-early 1950s)
- Color Film Era (1930s-1960s)
- Camcorders/Home Video Era (late 1970s-1990s)
- Internet and Mobile Device Era (late 1990s-present)
Each one of these revolutions ushered in entire new eras of film creation and consumption, unlocking new possibilities for the kinds of stories that could be told and increasing their realism and speed of creation, but arguably more importantly — they greatly expanded the accessibility of film creation and consumption to a much wider swath of the world’s people.
I am starting to think, based on the public release of the new, free Luma AI Dream Machine model this week — which turns a user’s raw text and still images into fluid videos in seconds, rivaling or exceeding the realism and quality of OpenAI’s unreleased Sora — that we are now at the cusp of the sixth great revolution in filmmaking: AI.
The origin of movies: turning static pictures into fluid movements
The birth of filmmaking in the late 1800s was all about transforming what had been the prior dominant immersive art format, live theater (which dates back roughly 2,500 years to Ancient Greece), into recorded entertainment that could be shown to audiences without the original performers or directors present.
It was, in essence, a fusion of photography and theater, but it used the same principles as the older phenakistiscope and zoetrope machines of the 1830s, which themselves can be thought of as fancy flip books.
These were mechanical wheels with images painted or carved on them, spun at high enough rotational speeds to blur the imagery and create the optical illusion of motion. Arrange the frames vertically and put a light in the middle or behind them, and suddenly you could project the animation on a wall for an audience to enjoy.
While these devices could be used to show simple characters moving, they were more like animated GIFs in that they looped and couldn’t be used to tell anything but a brief, simple story thanks to the constraints of the space and time.
But around 40 years after those devices hit the scene, film cameras with fast enough shutter speeds (1/25th of a second instead of 15 seconds) and large enough light apertures were developed, allowing a photographer like Muybridge to capture an object’s (or animal’s, or person’s) motion fluidly on film stock across multiple frames.
These frames, in turn, could then be arranged around a mechanical wheel like those of the zoetrope/phenakistiscope machines, a central light projected through them, and voilà: the motion picture was born!
The 1st revolution was all about space and time
This technological achievement unlocked something more powerful than just a new medium for art and storytelling, however: it enabled a temporal revolution, as well.
Thanks to the advent of motion pictures, you could watch something that had been recorded yesterday or years ago, featuring real live performers, just like it was happening right now, in front of you.
Until this point, it was simply impossible to witness the same live action, human performance more than once.
Even if you attended the same live play two nights in a row and all the performers had tons of experience, there would be inevitable slight perturbations and differences between the two.
Movies removed this variance, allowing for the same exact singular performance to be re-syndicated indefinitely.
The advent of motion pictures freed these performances from the shackles of space as well, since a film could be exhibited anywhere there was equipment to project it.
As mentioned earlier, this suddenly brought the art of performance to a much wider potential audience and created the first movie stars, since people all around the country and world could see actors at work without traveling to the site of the original performance.
The 2nd and 3rd revolutions were all about immersion and realism
Of course, there were some major technical limitations back then: despite Thomas Edison’s invention of a sound recording and playback machine called the phonograph back in 1877 (a year before the first motion picture footage was shot), it proved difficult for the early filmmakers to sync sound with motion reliably.
The first sound recording discs and cylinders could only store about four minutes worth of audio, resulting in a three-decade-long era of silent films accompanied by live music.
Yet by the mid-1920s, early film studios began an arms race to acquire systems for synchronizing longer audio tracks — including music, recorded dialog and sound effects — more reliably with movies, beginning with Warner Brothers’ use of a sound-syncing system called the Vitaphone, developed by Western Electric and Bell Labs. This showcased, again, how the history of film and its advancement is inexorably linked to new technologies, even controversial ones: many studios initially resisted embracing and filming “talkies” because of their then-high cost.
The third revolution, which occurred concurrently with the development and progression of sound in film, was one of new advances in chemistry and dyes for film stock, bringing all the colors of the rainbow to movie screens, making them much more immersive and reflective of our own real lives and leading to the “technicolor” era.
The 4th and 5th filmmaking revolutions democratized creation and consumption
The fourth great revolution, depicted aptly near the end of Paul Thomas Anderson’s Boogie Nights, was the development of commercially available camcorders and video cassette players and recorders (VCRs) in the 1970s-1980s, which brought both filmmaking and viewing into many more homes and non-theatrical venues, dramatically democratizing both the creation and consumption of the art of cinema.
These devices also made home movies much more popular.
Now, it may seem obvious, but it’s worth noting that the creators of home movies were not professional filmmakers and, by and large, didn’t aspire to make art.
Most of them were just ordinary people working in completely different fields, parents of young families, and weren’t really trying to tell fictional narrative stories or coherent documentaries.
Thanks to relatively affordable camcorders, it was possible for everyday people with middle-class incomes to capture humble yet significant human moments from their lives and those of their loved ones — graduations and birthdays and parties and other life milestones, even playing outside in the yard, mundane occurrences that the creators wanted to remember and intended to share with small, select private audiences going forward.
This is important because it shows that even as the earlier revolutions led to a larger total audience of film viewers and more extravagant productions like Gone with the Wind, the development of more compact, personalized and cheaper filmmaking and exhibition tech led to the personalization of film creation and production.
Thanks to camcorders and VCRs, a single person could suddenly make movies and display them, without the need for a studio, sets, or other fancy equipment. More importantly, they didn’t hesitate to do so, because the tech was affordable enough for middle-class households. And it led to the development of films that were tailored to specific, niche audiences of even just a single family, rather than the large audiences of the prior filmmaking era. So this era was all about the personalization of film and the creation of smaller, targeted film audiences.
The next great revolution, the web and mobile, was more staggered: first came the World Wide Web, invented in 1989 and aided by the PC revolution, and then, in 2005, YouTube.
But it wasn’t until the launch of Apple’s iPhone two years later that ordinary, non-businesspeople realized the tremendous potential of having an internet-connected device in your pocket everywhere you went — and later, with the release of the iPhone 3GS in 2009, the power to capture and upload videos to the web.
Those three ingredients — film + internet + smartphones — led to a veritable Cambrian explosion of video that has shown no signs of slowing down. TikTok, Instagram Reels, and Facebook Video now give people a steady stream of short video clips on their mobile devices, captured by their peers, large brands, major movie studios running promotions, and yes, even indie filmmakers, at all hours of the day, whenever they like, for as long as they wish. Video is omnipresent now, thanks to filmmaking revolutions 1-5.
Most of the video ever shot by humanity was captured in the last 10 years — much of it in the last year alone — dwarfing all that came before. And AI will only further fuel this trend.
Computers also gave people tools to create their own special effects and layer them atop their films, or create fully animated films from scratch, opening the creation side of the art to a much wider group than ever before.
The sixth revolution, AI, brings your imagination directly to audiences
Whereas all the prior cinematic revolutions required you to film real people in front of you in live action, or be artistic enough and skilled enough with tools to create animations, AI is a revolution because for the first time in history, ordinary people can transform their imagination into a film within minutes or seconds, without relying on any outside actors, crew, visual effects, or even other tools.
Simply type a text prompt into Dream Machine, Sora, Runway’s Gen-2, Pika, Kling, Krea, or any of the other rapidly emerging AI video makers — or upload a single still image you’ve captured, drawn, or generated with an AI image generator — and voilà, you have the first clip of your film.
Interestingly, all the prior filmmaking revolutions were externally focused — allowing filmmakers to capture their external environments and external actors more vividly and accurately, or use external tools to animate stories, and share them with external audiences more easily and affordably.
The AI revolution is different because of how internally focused it is.
AI, more so than any filmmaking technology that preceded it, allows a creator to directly visualize their internal feelings, ideas, scenes, and worlds. AI is the most direct conduit we’ve yet developed for expressing what’s in your imagination. And as such, it may be the most important and impactful revolution since the motion picture itself.
Now, much like the birth of film nearly 150 years ago, AI movie generators are in their infancy and limited to creating clips of just a few seconds at a time (5 seconds in the case of Dream Machine, up to 18 for Runway).
Aside from Dream Machine, many AI video generator models produce largely slow-motion clips, limiting their ability to generate fully lifelike scenes (though of course, you can speed them up manually with an external editing tool or program).
Also, because AI video generation models remain fairly unpredictable in their outputs, it can be hard to maintain character and setting consistency across clips or even frames.
Not to mention, most of the AI video generators I mentioned above don’t automatically include sound generation as you generate a clip, though Pika is among the few that offers AI sound generation as an option.
All of these issues are real, and will prevent AI from making a full Hollywood film from one person’s text prompt at least for the foreseeable future. But they are surmountable even right now, and people are already creating full feature-length AI generated films and serialized TV shows with recurring characters and scenarios, using the current tech and simply working around the limitations to get the results they want (such as using Midjourney’s new character consistency feature to create a character moving across multiple still images, then uploading this image set and turning it into motion with an AI video model).
Of equal importance is the fact that AI models are already being used to generate portions of feature films such as the Academy Award Best Picture winner Everything Everywhere All at Once and The People’s Joker. Like color and sound before it, the AI revolution is occurring piecemeal, but I expect that soon enough it will overtake some film productions entirely.
Trained on the shoulders of giants
I need to say at least a short word about the issue of AI video generators and training data. Most AI video generators (I believe all those listed above) have not publicly shared the sources of their training data. Indeed, OpenAI’s CTO Mira Murati became a meme after she was asked in an interview what Sora was trained on and answered vaguely that it was publicly available videos and licensed data, such as content from Shutterstock.
In fact, it seems highly likely that vast amounts of copyrighted data were used to make all of the current popular generative AI models across video, imagery, and text — data whose original creators and rights holders likely saw no direct payment, nor even a request to use their work in this way.
That has, understandably, pissed many creators off and even led to some of them filing lawsuits against AI model providers such as OpenAI and Runway.
Perhaps the courts will side with creators and mandate that AI model companies compensate them somehow. Though, as best I can tell, it is difficult for even the AI model makers to say exactly how much of each piece of training data influences each AI model, especially when the models have trained on millions or hundreds of millions of pieces of content.
Should the AI companies have scraped data en masse like this, including lots of copyrighted data? Ethically, the answer is a tough one. I myself as a writer whose work was undoubtedly scraped have, to a degree, mixed feelings about it.
But ultimately, I am a proponent of AI in general and in the arts specifically. I view it as an extremely exciting, cool, and compelling new tool — one that is controlled by and aids human creators, not one that necessarily replaces them or obsoletes them or their work made by other, older means.
The way the AI companies went about creating it is definitely “sus,” as the kids say, but I also think the AI companies had a rational belief they were operating in good faith, since Google itself and many other web companies had long ago scraped large swaths of the internet to power their own, pre-generative-AI commercial products such as Google Ads, and most everyone seemed to accept that.
I don’t view AI scraping as morally, ethically, or even technologically different enough from these prior scraping techniques and outcomes to warrant banning or penalizing it, really.
More to the point: every new technology and art form is inspired by what came before. Some of our greatest filmmakers from Michael Mann to Sofia Coppola to the late, great William Friedkin were directly inspired by works of still art to create iconic movie shots, for which the original art creators did not receive direct credit or payment as a result.
Now, those critical of gen AI companies scraping copyrighted data without express permission will cry foul at this point, stating that a human creator being inspired by prior work is part of some long-established, unofficial social contract, and that it is different because a human individual has neither the resources nor the technical capability to scrape and learn from nearly as much data in their lifetime as the companies producing large language models (LLMs) do. To which I say — poppycock! The difference, then, is only a matter of degree.
If I, a human being, were a superhero who could read and watch everything in all of history and learn how to mimic or derive inspiration from all of it whenever I wanted, instantly, would I be prohibited from doing so? Just because an AI model is better at gathering, emulating and recombining data than we humans are doesn’t make doing so any less moral, justifiable, or legal, in my humble opinion.
We’re all standing on the shoulders of giants, as the expression goes — all of us inspired by what came before to greater or lesser extent. Which is why I believe — as many established filmmakers do — that AI is simply another tool in the toolbox for expressing human vision and creativity, and yes, even originality. It is perhaps the most interesting filmmaking tool developed in my lifetime, certainly, but it is still ultimately a tool to be used by humans for human expression.
And as George Lucas recently said, “It’s inevitable…it’s like saying ‘I don’t believe these cars are going to work. Let’s just stick with the horses.’ And you say, ‘yeah, you can say that, but that isn’t the way the world works.’”