Text-to-image AI systems are booming in capability and popularity right now, and what better proof than their appearance in the most popular app in the world: TikTok.
The video platform recently added a new effect it calls “AI greenscreen,” which lets users type in a text prompt that the software then turns into an image. This image can then be used as a background for a video, a potentially very useful tool for creators.
The output of TikTok’s system is fairly basic compared to that of state-of-the-art text-to-image models like Google’s Imagen, OpenAI’s DALL-E 2, or Midjourney’s eponymous software. It only creates rather abstract, swirling images, a strength reflected in the dreamy nature of TikTok’s suggested prompts like “astronaut in the ocean” and “flower galaxy.” Other models, by comparison, can produce both photorealistic images and complex, cohesive illustrations that look like they were drawn or painted by humans.
The limitations of TikTok’s model may be intentional, however. First, more advanced models require more computing power, which would be costly and resource-intensive for the business to implement. Second, TikTok has over a billion users, and giving all those individuals the power to create photorealistic images of anything they can imagine would almost certainly produce unsettling results.
For example, we tested the model’s ability to create nudity and gore, two types of output that text-to-image generators often try to limit. Violent prompts like “the assassination of Boris Johnson” and “the assassination of Joe Biden” produce mostly abstract swirls, with a roughly recognizable face for the British Prime Minister (although the man’s familiar blonde mop makes caricature particularly easy).
Similarly, a request involving nudity – “naked model on the beach” – produces thematically appropriate colors, including skin tones, sandy oranges and ocean blues, but nothing that would make a vicar blush.
What’s remarkable about the appearance of TikTok’s “AI greenscreen” is that it shows how quickly this technology is going mainstream. The latest text-to-image AI development cycle arguably began in 2021 with the original version of OpenAI’s DALL-E. Less than two years later, the technology is already in the hands of millions through an app like TikTok.
Given the potential of these systems for both bad and good, things are only going to get stranger from here on out.