From AI to Z in Entertainment in 2024 – Part Two
In Part One, we explored the evolving nexus between artificial intelligence and entertainment. In Part Two, we take a closer look at some specifics, as well as some of the challenges left to be overcome.
AI Before Today
The primary business uses of AI to date have been in:
- IT and business process automation,
- Security and threat detection,
- Marketing and sales, and
- Business analytics or intelligence.
Source: IBM Global AI Adoption Index
Growing Developments in AI in Entertainment
Some of the more recent applications of AI in TV, film, and video include:
- AI-based recommendation systems – Key players already using AI for this include Netflix, Amazon Prime, YouTube, Disney, Ubisoft (video games), and Spotify (music) – essentially using machine learning (ML) to predict customer preferences (a toy recommender sketch follows this list).
- AI for content creation, scriptwriting and storytelling – E.g., to analyze vast datasets and/or repurpose old content to generate new stories, dialogues, or even complete screenplays. Players include ScriptBook and HyperWrite.
- Audience engagement and advertising – Key players include Google Performance Max (PMax, which integrates campaigns across Google’s services such as Search and Gmail), Canys (social media focus), Zefr, and Amazon Personalize. Typical uses range from ML models or chatbots for CRM to natural language processing (NLP) algorithms that analyze social media trends, comments, and sentiment to gauge public opinion and reactions to specific movies, TV shows, or events.
- Sentiment analysis – I.e., analyzing digital text, such as customer reviews and social media posts, to determine whether the emotional tone of a message is positive, negative, or neutral (see the sentiment sketch after this list).
- Video Generation – Creating or stylistically modifying video footage using a text prompt, single image or existing video footage.
- 3D Modeling – Creating new 3D assets (e.g., representations of scenes) given a natural language prompt or 2D reference images.
- Video editing and post-production tools – E.g., using AI to streamline workflows.
- Tools for animation and VFX – E.g., to automate the enhancement or modification of images and video, character animation, motion tracking, and rendering, reducing production time and costs.
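None of the companies above disclose their production recommender systems, but the core idea behind the first item is simple: infer a user’s unknown preferences from users with similar tastes. Below is a toy sketch of user-based collaborative filtering in plain numpy; the rating matrix is entirely hypothetical.

```python
import numpy as np

# Toy user-by-title rating matrix; 0 means "not yet watched".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def cosine_sim(a, b):
    """Cosine similarity between two users' rating vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def predict(user, item):
    """Predict a rating as a similarity-weighted average of other users' ratings."""
    others = [u for u in range(len(ratings)) if u != user and ratings[u, item] > 0]
    sims = np.array([cosine_sim(ratings[user], ratings[u]) for u in others])
    vals = np.array([ratings[u, item] for u in others])
    return sims @ vals / (sims.sum() + 1e-9)

print(round(predict(user=0, item=2), 2))  # estimate user 0's rating for title 2
```

A real service learns latent factors from billions of interactions rather than averaging raw ratings, but the predict-the-missing-entry framing is the same.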
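The sentiment-analysis item is just as easy to make concrete with off-the-shelf open-source tooling. Here is a minimal sketch using the Hugging Face transformers pipeline; the example reviews are invented, and the default model is a general-purpose positive/negative classifier, not anything the vendors above necessarily use.

```python
from transformers import pipeline  # pip install transformers

# The default "sentiment-analysis" pipeline loads a binary positive/negative model.
classifier = pipeline("sentiment-analysis")

reviews = [
    "The finale was breathtaking, easily the best season yet.",
    "Two hours of my life I will never get back.",
]
for review in reviews:
    result = classifier(review)[0]  # {'label': 'POSITIVE'/'NEGATIVE', 'score': ...}
    print(f"{result['label']:>8}  {result['score']:.2f}  {review}")
```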
Case Study: Content Creation
AI-powered tools like ScriptBook and HyperWrite show impressive potential in scriptwriting and storytelling. ScriptBook utilizes AI algorithms to analyze vast amounts of existing content and generate coherent narratives by identifying patterns and structures.
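ScriptBook’s and HyperWrite’s pipelines are proprietary, but the prompt-driven drafting pattern they build on can be sketched against any public large language model API. The sketch below uses OpenAI’s Python client; the model name and logline are illustrative assumptions, not a claim about either company’s stack.

```python
from openai import OpenAI  # pip install openai; expects OPENAI_API_KEY in the environment

client = OpenAI()

logline = "A washed-up stunt double discovers her action scenes are premonitions."
response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any chat-capable model would do
    messages=[
        {"role": "system",
         "content": "You are a screenwriting assistant. Write in standard screenplay format."},
        {"role": "user", "content": f"Draft a one-page opening scene for: {logline}"},
    ],
)
print(response.choices[0].message.content)
```

In practice such tools layer story-structure analysis, character tracking, and human review on top of this raw generation step.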
Case Study: Video Editing and Post-Production Tools
AI tools such as Adobe Sensei and Magisto are making a significant impact. Adobe Sensei, integrated into Adobe Premiere Pro, uses AI algorithms to analyze visual content, enabling automatic video editing features such as intelligent scene cut detection, color grading, and content-aware fill.
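Adobe does not publish Sensei’s internals, but the idea behind scene cut detection can be approximated naively: a hard cut shows up as an abrupt change in the color distribution between consecutive frames. A hedged OpenCV sketch follows; the input file name is hypothetical, and the threshold would need tuning per source.

```python
import cv2  # pip install opencv-python

def find_cuts(path, threshold=0.6):
    """Return frame indices whose color histogram correlates poorly with the prior frame."""
    cap = cv2.VideoCapture(path)
    cuts, prev_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None:
            sim = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if sim < threshold:  # sudden drop in similarity suggests a hard cut
                cuts.append(idx)
        prev_hist, idx = hist, idx + 1
    cap.release()
    return cuts

print(find_cuts("episode.mp4"))  # hypothetical input file
```

Production tools add ML models trained on labeled edits to catch dissolves and fades, which simple histogram comparison misses.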
Case Study: Tools for Animation and Visual Effects
Autodesk’s Maya (3D computer graphics software used for creating interactive 3D animations, models, and simulations) with Bifrost (a visual programming environment in Maya used for procedurally building effects, like explosions, combustion, sand, snow, etc.) and NVIDIA’s AI-based deep learning technologies are transforming animation and visual effects.
Other uses on the current or near horizon include: plagiarism and fake-story detection; production design; virtual filmmaking/virtual environments; use of neural radiance fields (NeRFs) in VFX; predictive storytelling/narrative development based on ML; and AR and VR immersive experiences to enhance audience enjoyment.
The Big Plus-Minus: Generative AI
Generative AI can be used to enhance or replace various aspects of the movie production process, such as scriptwriting, casting, editing, visual effects, sound design, and marketing. Some of the benefits of using generative AI include reducing costs, saving time, increasing creativity, and improving quality. However, it’s not without controversy.
OpenAI’s Sora
Unveiled by OpenAI in February 2024, Sora is a generative AI system that creates photorealistic video from natural language inputs. Users simply describe the scene, the characters, the actions, and the emotions they want to see, and Sora generates a clip of up to one minute that matches their specifications.
Runway, Pika and Others
Runway’s Gen-2 and Pika, from Pika Labs, use video diffusion models to synthesize novel video, creating short, soundless clips from text prompts, images, or existing footage. Currently, Runway’s video output caps out at 18 seconds per generation.
Others include Wonder Dynamics and Metaphysic.
Issues with Gen AI
“Ultimately, the prevalence of GenAI in the production process will be gated by consumer acceptance, not technology.”
Is Gen AI in the “Messy” Teen Period of Growth?
Two key factors were necessary to train generative AI models to achieve their present capabilities: the newly available (i) massive scale and variety of metadata-rich internet data and (ii) computational power needed to run models at scale, most notably Nvidia’s graphics processing units (GPUs).
Amid the gen AI hype, venture investment poured in, with hundreds of startups building atop foundation models. But since summer 2023, investment has begun to recede. Many VCs now believe we’re in generative AI’s “messy middle” or “awkward teenage years,” as investors realize some early bets haven’t demonstrated market readiness, user retention, or enterprise adoption.
Startup wobbliness has led some to proclaim that generative AI is overblown, just another dot-com bubble or crypto misfire. Others have questioned the tech’s actual usability, given legitimate questions around hallucination, bias, and copyright.
Expected Cost Drivers
A recent Variety VIP+ survey identified the following categories of expected costs in developing gen AI platforms:
- Tools, systems and infrastructure integration costs (APIs, integrations, monitoring tools);
- Model development and training costs (human capital/talent);
- Application development for user interface;
- Data preparation (organizing, cleaning, formatting);
- Time required by subject matter experts to refine model for accuracy based on use case;
- Model development and training costs (compute/GPUs);
- Time required to train workforce on application using generative AI; and
- Running costs (compute/token costs, GPUs, specialized talent). A back-of-envelope sketch of token spend follows the source note below.
(Source: Generative AI in Film and TV, Variety VIP+ Special Report, 5th Edition, December 2023)
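To make the running-costs item concrete, here is a back-of-envelope sketch of monthly token spend for a consumer-facing gen AI feature. Every figure is an illustrative assumption, not a quote from any vendor or from the Variety report.

```python
# Back-of-envelope monthly token spend; all figures are illustrative assumptions.
price_per_1k_tokens = 0.002   # assumed blended $ per 1K input+output tokens
tokens_per_request = 1_500    # prompt plus completion
requests_per_day = 50_000

monthly_tokens = tokens_per_request * requests_per_day * 30
monthly_cost = monthly_tokens / 1_000 * price_per_1k_tokens
print(f"~{monthly_tokens:,} tokens/month -> ${monthly_cost:,.0f}/month")  # ~2.25B tokens -> ~$4,500
```

Token spend is only one of the categories above, of course; the others in the list are largely people and infrastructure costs.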
Introducing Control Processes
One of the ironic “defects” of gen AI is that the tool’s power can outstrip human control:
For now, raw video outputs from such tools are still far too limited to be usable as onscreen footage for a high-production-value film or premium TV series. Questions of copyright aside, these tools are considerably constrained in giving professional artists the necessary control over the output, meaning the ease with which they can direct or manipulate a result to achieve a specific look.
As a workaround, powerful new control parameters are being added (and are needed) in gen AI tools to let users more specifically control how the video renders, such as Runway’s Director Mode in Gen-2, which allows zoom, speed adjustments, and “camera” rotations. Runway also recently released Multi Motion Brush, which lets video editors apply independent motion to selected areas of a video.
Avatars and the Uncanny Valley
Generative AI tools developed by Synthesia, Soul Machines and HeyGen can create entirely synthetic, photorealistic avatars that combine deepfake video and synthetic speech to precisely replicate a specific person’s appearance, voice, expressions and mannerisms. These unique personal AI avatars have been variously referred to as digital humans, twins, doubles or clones.
AI systems create a person’s custom model by training on varying amounts of audiovisual data, whether captured in studios or as video footage of a person speaking directly to camera. AI avatars fall on a wide spectrum of realism, with some being hyper-realistic (almost indiscernible from the real person), while others still tend to look like 3D graphics or “gamelike.”
For now, however realistic some avatars appear, many have only a limited range of motion and facial expressiveness and overall remain in “the uncanny valley,” the term for the uneasy emotional response we have toward not-quite-real humanoid figures.
Other Creative Issues
Current knocks on gen AI video tools like Sora include:
- They are only capable of creating short clips.
- The creator can’t control camera angles, lighting, composition, or camera movements; i.e., Sora takes the role of creator rather than assistant.
- Most current output skews sci-fi rather than everyday narrative, which weakens the audience’s ability to see themselves in the story.
- The AI is, so far, not capable of maintaining an entire storyline.
“People have this idea that you can instantly convert your thoughts into a feature film,” says [a filmmaker using AI]. “In reality, it’s a lot more complicated. You never get what you want. You get pushback from the platform. For example, if I try to get an image of the Chinese president, I can’t because they blocked it. So you have to find all these workarounds to get what you need.”
If you would like our help with your AI entertainment concept, please reach out to us at https://www.caycon.com.