A neural network served as screenwriter, director, and editor of a short film. Silicon Valley star Thomas Middleditch “acted” in the first film made by an artificial intelligence [video]
48 hours and the film is done. Isn’t this the future? Photo: www.youtube.com
The first film made by an artificial intelligence has been released. A neural network named “Benjamin” directed the short film Zone Out from its own script, in the genre of nonsensical sci-fi horror. The lead role is played by Thomas Middleditch, best known as Richard Hendricks in the series “Silicon Valley.”
The artificial intelligence made the film in just 48 hours. It assembled it from thousands of hours of old films and footage of professional actors shot against a green screen. The resulting film was made as an experiment and, for now, can hardly count on any awards.
Zone Out is not the first film that Los Angeles director Oscar Sharp and his team have made together (Sharp calls himself the “assistant director,” since all the main work was done by “Benjamin”). The neural network made its debut in 2016. That year Sharp and Ross Goodwin, a creative technologist at Google, took part in a 48-hour competition at a science fiction film festival in London. There they showed the short film Sunspring, which also starred Middleditch. The film went on to make the festival’s top 10.
The film Sunspring.
In 2017 Sharp and Goodwin decided to enter the competition again with the short film It's No Game, starring David Hasselhoff. This time “Benjamin” created dialogue based on scenes from Shakespeare’s works and subtitles from other films, which were then tied together by a story about an artificial intelligence that exploits people to make movies. The film took third place.
The short film It's No Game.
For Zone Out, Sharp and Goodwin again brought in the cast of the first film but decided to take the production process in a different direction: let “Benjamin” do everything and create the film from scratch. The team tasked the neural network not only with writing the script but also with selecting scenes, placing the actors’ faces onto existing characters, and arranging the characters’ lines.
“What I’m really trying to do is automate every part of the human creative process,” Sharp said.
In the end, production of Zone Out took just two days. Goodwin and Sharp trained the neural network using the Amazon Web Services platform. Eleven different generative adversarial networks (GANs) were used to swap the faces. They are built on TensorFlow, Google’s open-source machine learning library.
The film.
The resulting short turned out plotless and, on the whole, nonsensical. In one scene, for example, Middleditch’s character muses about sex with a jar of salsa. The algorithm also failed to swap the actors’ faces correctly. And the team ran out of time to create dialogue in the actors’ own voices, so it used robotic voices instead.
“If the technology fails in the future, I will always have work. If it succeeds and I stop working as an actress, at least I will have been there for the moment when we realized that computers had replaced us,” said Elisabeth Gray, an actress who took part in the production.
The recent boom in artificial intelligence has produced impressive results in a somewhat surprising realm: the world of image and video generation. The latest example comes from chip designer Nvidia, which today published research showing how AI-generated visuals can be combined with a traditional video game engine. The result is a hybrid graphics system that could one day be used in video games, movies, and virtual reality.
“It’s a new way to render video content using deep learning,” Nvidia’s vice president of applied deep learning, Bryan Catanzaro, told The Verge. “Obviously Nvidia cares a lot about generating graphics [and] we’re thinking about how AI is going to revolutionize the field.”
The results of Nvidia’s work aren’t photorealistic and show the trademark visual smearing found in much AI-generated imagery. Nor are they totally novel. In a research paper, the company’s engineers explain how they built upon a number of existing methods, including an influential open-source system called pix2pix. Their work deploys a type of neural network known as a generative adversarial network, or GAN. These are widely used in AI image generation, including for the creation of an AI portrait recently sold by Christie’s.
But Nvidia has introduced a number of innovations, and one product of this work, it says, is the first ever video game demo with AI-generated graphics. It’s a simple driving simulator where players navigate a few city blocks of AI-generated space, but can’t leave their car or otherwise interact with the world. The demo is powered using just a single GPU — a notable achievement for such cutting-edge work. (Though admittedly that GPU is the company’s top of the range $3,000 Titan V, “the most powerful PC GPU ever created” and one typically used for advanced simulation processing rather than gaming.)
Nvidia’s system generates graphics using a few steps. First, researchers have to collect training data, which in this case was taken from open-source datasets used for autonomous driving research. This footage is then segmented, meaning each frame is broken into different categories: sky, cars, trees, road, buildings, and so on. A generative adversarial network is then trained on this segmented data to generate new versions of these objects.
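To make the segmentation step concrete, here is a minimal Python sketch of one common way to hand a segmented frame to an image-generating network: as one binary mask per category. The category list comes from the examples named above; the function name `one_hot_label_map` and the toy frame are hypothetical, and nothing here is confirmed to match Nvidia’s actual pipeline.

```python
# Category names taken from the examples in the article; a real dataset
# would define its own, much longer list.
CATEGORIES = ["sky", "cars", "trees", "road", "buildings"]


def one_hot_label_map(label_map):
    """Turn an H x W grid of category names into one binary H x W mask per category."""
    index = {name: i for i, name in enumerate(CATEGORIES)}
    height, width = len(label_map), len(label_map[0])
    masks = [[[0] * width for _ in range(height)] for _ in CATEGORIES]
    for y, row in enumerate(label_map):
        for x, name in enumerate(row):
            masks[index[name]][y][x] = 1
    return masks


# A toy 3 x 4 "frame" standing in for one segmented video frame.
frame = [
    ["sky", "sky", "sky", "trees"],
    ["buildings", "cars", "road", "trees"],
    ["road", "road", "road", "road"],
]
masks = one_hot_label_map(frame)
```

Each pixel ends up switched on in exactly one mask, so a generator trained on such input knows which kind of object it is supposed to paint at every position.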
Next, engineers created the basic topology of the virtual environment using a traditional game engine. In this case the system was Unreal Engine 4, a popular engine used for titles such as Fortnite, PUBG, Gears of War 4, and many others. Using this environment as a framework, deep learning algorithms then generate the graphics for each different category of item in real time, pasting them on to the game engine’s models.
“The structure of the world is being created traditionally,” explains Catanzaro, “the only thing the AI generates is the graphics.” He adds that the demo itself is basic, and was put together by a single engineer. “It’s proof-of-concept rather than a game that’s fun to play.”
To create this system Nvidia’s engineers had to work around a number of challenges, the biggest of which was object permanence. The problem is, if the deep learning algorithms are generating the graphics for the world at a rate of 25 frames per second, how do they keep objects looking the same? Catanzaro says this problem meant the initial results of the system were “painful to look at” as colors and textures “changed every frame.”
The solution was to give the system a short-term memory, so that it would compare each new frame with what’s gone before. It tries to predict things like motion within these images, and creates new frames that are consistent with what’s on screen. All this computation is expensive though, and so the game only runs at 25 frames per second.
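The idea of a short-term memory can be illustrated with a toy Python sketch. It assumes a fixed-size buffer of recent output frames and swaps in plain averaging for Nvidia’s learned model, which is not described in detail here; `generate_frame`, `render`, and `MEMORY` are hypothetical names. It also works out the time budget implied by 25 frames per second.

```python
from collections import deque

FPS = 25                      # frame rate quoted in the article
FRAME_BUDGET_MS = 1000 / FPS  # 40.0 ms available to produce each frame
MEMORY = 2                    # hypothetical number of past frames to remember


def generate_frame(guess, history):
    """Stand-in for the neural generator (an assumption, not Nvidia's model).

    Averages the new per-pixel guess with the remembered frames so that
    pixel values cannot jump abruptly from one frame to the next.
    """
    frames = list(history) + [guess]
    return [sum(px) / len(frames) for px in zip(*frames)]


def render(guesses, memory=MEMORY):
    """Produce output frames one by one, feeding recent outputs back in."""
    history = deque(maxlen=memory)  # the oldest frame drops out automatically
    output = []
    for guess in guesses:
        frame = generate_frame(guess, history)
        history.append(frame)
        output.append(frame)
    return output


# Three flickering guesses for a two-pixel "frame".
noisy = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
smooth = render(noisy)
```

In this toy run the raw guesses jump by 1.0 between frames, while the version with memory never changes by more than 0.5. The extra work per frame is also why a fixed budget of 40 milliseconds becomes a hard constraint.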
The technology is very much at the early stages, stresses Catanzaro, and it will likely be decades until AI-generated graphics show up in consumer titles. He compares the situation to the development of ray tracing, the current hot technique in graphics rendering where individual rays of light are generated in real time to create realistic reflections, shadows, and opacity in virtual environments. “The very first interactive ray tracing demo happened a long, long time ago, but we didn’t get it in games until just a few weeks ago,” he says.
The work does have potential applications in other areas of research, though, including robotics and self-driving cars, where it could be used to generate training environments. And it could show up in consumer products sooner, albeit in a more limited capacity.
For example, this technology could be used in a hybrid graphics system, where the majority of a game is rendered using traditional methods, but AI is used to create the likenesses of people or objects. Consumers could capture footage themselves using smartphones, then upload this data to the cloud where algorithms would learn to copy it and insert it into games. It would make it easier to create avatars that look just like players, for example.
This sort of technology raises some obvious questions, though. In recent years experts have become increasingly worried about the use of AI-generated deepfakes for disinformation and propaganda. Researchers have shown it’s easy to generate fake footage of politicians and celebrities saying or doing things that they didn’t, a potent weapon in the wrong hands. By pushing forward the capabilities of this technology and publishing its research, Nvidia is arguably contributing to this potential problem.
The company, though, says this is hardly a new issue. “Can [this technology] be used for creating content that’s misleading? Yes. Any technology for rendering can be used to do that,” says Catanzaro. He says Nvidia is working with partners to research methods for detecting AI fakes, but that ultimately the problem of misinformation is a “trust issue.” And, like many trust issues before it, it will have to be solved with an array of methods, not just technological.
Catanzaro says tech companies like Nvidia can only take so much responsibility. “Do you hold the power company responsible because they created the electricity that powers the computer that makes the fake video?” he asks.
And ultimately, for Nvidia, pushing forward with AI-generated graphics has an obvious benefit: it will help sell more of the company’s hardware. Since the deep learning boom took off in the early 2010s, Nvidia’s stock price has surged as it became obvious that its computer chips were ideally suited for machine learning research and development.
So would an AI revolution in computer graphics be good for the company’s revenue? It certainly wouldn’t hurt, Catanzaro laughs. “Anything that increases our ability to generate graphics that are more realistic and compelling I think is good for Nvidia’s bottom line.”