Imagine a Whopper. Now imagine a Whopper topped with oatmeal, kiwis and sirloin steak.
Until March 17, Burger King is offering customers a blank canvas to create such wondrously creative and hideously repulsive concoctions using generative AI. Ingredients need to be edible and not trigger common allergies, so nuts and tires are a no-go.
Burger King will award the winning artist/mad scientist $1 million.
Judges will invite three finalists to the brand’s Miami HQ where they can sculpt their creations before they appear on menus nationwide for a limited time later this year. Customers can try all three and vote for the winner.
Burger King will also host a pop-up experience in Los Angeles on February 17 and 18 where visitors can sample sandwiches, create one themselves and buy exclusive merch.
Burger King built the Whopper generator with Media.Monks. It relies on three multi-modal generative AI models that work together to visualize people’s inventions, then adds backgrounds and a jingle based on ingredients and the inventor’s name.
Stable Diffusion and ControlNet kick things off by generating images of the Whoppers and their backgrounds. Media.Monks fed the diffusion model images of Whoppers to keep the resulting images consistent with how Whoppers are supposed to look, no matter what they’re topped with.
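The consistency constraint can be pictured as anchoring every generation to a fixed base description of a Whopper, with user toppings only varying the additions. A minimal sketch of that prompt-building step (the base text and function name here are invented for illustration, not Media.Monks' actual code):

```python
# Illustrative only: a fixed base prompt keeps every generated image
# anchored to how a Whopper is "supposed" to look, while user-chosen
# toppings vary the rest. The real system conditions a fine-tuned
# diffusion model; this shows just the prompt-assembly idea.

BASE_PROMPT = (
    "a flame-grilled Whopper on a sesame-seed bun, studio food photography"
)

def build_prompt(toppings: list[str]) -> str:
    """Combine the fixed Whopper description with user-chosen toppings."""
    if not toppings:
        return BASE_PROMPT
    return f"{BASE_PROMPT}, topped with {', '.join(toppings)}"

prompt = build_prompt(["oatmeal", "kiwi slices", "sirloin steak"])
```

The resulting string would then be passed to the diffusion pipeline; keeping the base description constant is what keeps wildly different toppings from distorting the burger itself.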
Large language models (LLMs) then take user-submitted text and adapt it into prompts that produce accurate visuals. These LLMs also reject inputs that are too salacious, describe inedible items or contain common allergens.
The LLMs also transform brand-specific ingredients, such as Sour Patch Kids, into more generic equivalents, working in tandem with the diffusion model to produce visuals that aren't replicas of another brand's products.
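The substitution step can be thought of as a brand-to-generic mapping applied before the prompt reaches the image model. The real system uses an LLM for this; the lookup table below is an invented stand-in that just shows the idea:

```python
# Hypothetical sketch: swap trademarked ingredient names for generic
# descriptions before prompting the diffusion model. The entries in
# this table are invented examples, not Burger King's actual mappings.

GENERIC_SUBSTITUTES = {
    "sour patch kids": "sour gummy candies",
    "oreos": "chocolate sandwich cookies",
    "doritos": "nacho-cheese tortilla chips",
}

def genericize(ingredient: str) -> str:
    """Replace a brand-name ingredient with a generic description,
    leaving unbranded ingredients untouched."""
    return GENERIC_SUBSTITUTES.get(ingredient.strip().lower(), ingredient)
```

In production an LLM handles names no table could anticipate, but the contract is the same: branded text in, brand-free text out.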
These LLMs are composed of several fine-tuned models called “experts,” said Iran Reyes, VP and global head of engineering at Media.Monks. One expert knows what’s edible, one knows what’s brand-safe — meaning ingredients don’t belong to competitors and aren’t lewd — and another serves as a last line of defense for things that the other experts haven’t anticipated.
“We can create a list of 500 ingredients, but what if you come up with [that] 501[st one] that we didn’t think about?” Reyes said.
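The expert pattern Reyes describes can be sketched as a chain of independent veto checks with a catch-all at the end. The rule-based functions below are stand-ins for what are actually fine-tuned LLMs, and every list here is an invented example:

```python
# Sketch of the "experts" pattern: each expert can independently reject
# an ingredient, and a catch-all runs last as the line of defense for
# anything the others didn't anticipate. In the real system each expert
# is a fine-tuned model, not a word list.

INEDIBLE = {"tires", "rocks"}
COMPETITOR_BRANDS = {"big mac"}
ALLERGENS = {"peanuts", "tree nuts"}

def edibility_expert(item: str) -> bool:
    return item not in INEDIBLE

def brand_safety_expert(item: str) -> bool:
    return item not in COMPETITOR_BRANDS

def catch_all_expert(item: str) -> bool:
    # Last line of defense; in production this would be a broader
    # moderation model rather than a fixed allergen list.
    return item not in ALLERGENS

EXPERTS = [edibility_expert, brand_safety_expert, catch_all_expert]

def approve(item: str) -> bool:
    """An ingredient passes only if every expert accepts it."""
    item = item.strip().lower()
    return all(expert(item) for expert in EXPERTS)
```

Splitting the checks this way lets each expert be trained and updated on its own narrow job, which is the point of Reyes' 501st-ingredient example.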
Finally, an ElevenLabs text-to-speech model creates a personalized jingle based on the burger's ingredients. Media.Monks had to program a separate model to dynamically adjust the speed at which the ElevenLabs voice reads so that every jingle runs about the same length.
“If the ingredients are long, it needs to add time from another place [in the jingle],” Reyes said. “So it knows when to speed up and when to slow down. It was pretty challenging.”
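The underlying arithmetic is straightforward: estimate how long the text takes at a natural pace, then scale the speaking rate to hit a fixed target. The constants and clamping range below are assumptions for illustration; ElevenLabs' actual speed controls differ:

```python
# Hypothetical timing sketch for fitting every jingle into the same
# duration. BASE_WPS and TARGET_SECONDS are invented values, and the
# clamp keeps the voice intelligible rather than matching any real API.

BASE_WPS = 2.5          # assumed baseline reading speed, words per second
TARGET_SECONDS = 12.0   # assumed fixed jingle length

def speaking_rate(jingle_text: str) -> float:
    """Return a speed multiplier (1.0 = normal pace) that fits the
    jingle into TARGET_SECONDS, clamped to a listenable range."""
    words = len(jingle_text.split())
    natural_seconds = words / BASE_WPS
    rate = natural_seconds / TARGET_SECONDS
    return max(0.8, min(1.3, rate))
```

A long ingredient list pushes the rate above 1.0 (the voice speeds up), a short one pulls it below (the voice slows down), which matches Reyes' description of "adding time from another place" in the jingle.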
He added that the tech used for audio generation can be used for other creative endeavors.
“You can take that component, isolate it from this project, and now you’re going to have a video ad creator that is dynamic,” he said.
To handle a huge amount of entries with countless ingredients, Media.Monks relied on AI experts to predict and group probable ingredients often found on burgers. Data scientists charted and clustered Whoppers by those ingredients to make judging each one easier. A quality assurance team then tested the models to ensure that they’d be usable by the average person.
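The grouping step can be sketched as assigning each entry to whichever predefined ingredient cluster it overlaps most. The group names and ingredient sets below are invented examples; the team's actual clustering would use real entry data and data-science tooling:

```python
# Illustrative stdlib-only sketch of bucketing entries by shared
# ingredients to make judging easier. The groups here are invented,
# not the ones Media.Monks' data scientists actually charted.

INGREDIENT_GROUPS = {
    "classic": {"lettuce", "tomato", "onion", "pickles"},
    "breakfast": {"egg", "bacon", "oatmeal", "hash browns"},
    "sweet": {"kiwi", "pineapple", "chocolate"},
}

def assign_group(ingredients: set[str]) -> str:
    """Pick the predefined group sharing the most ingredients
    with this entry's ingredient set."""
    overlaps = {
        name: len(ingredients & group)
        for name, group in INGREDIENT_GROUPS.items()
    }
    return max(overlaps, key=overlaps.get)
```

With entries bucketed this way, judges can compare like with like instead of scoring every submission against every other.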
Fewer than 20 Media.Monks employees worked on the team that put the tech together, Reyes said.
Media.Monks is housing data from AI-generated Whoppers on Amazon Web Services. The environment is designed to keep Burger King’s data secure and out of the public’s hands.