In the vast expanse of technological innovation, the fusion of art and artificial intelligence has led to remarkable creations. Microsoft has unveiled a new artificial intelligence (AI) model that can produce remarkably lifelike videos of talking human faces.
AI Makes Mona Lisa Sing
Microsoft just dropped VASA-1.
This AI can make single image sing and talk from audio reference expressively. Similar to EMO from Alibaba
10 wild examples:
1. Mona Lisa rapping Paparazzi pic.twitter.com/LSGF3mMVnD
— Min Choi (@minchoi) April 18, 2024
The AI image-to-video model, known as VASA-1, can turn a still image of a person’s face into a vibrant animation. According to the company, the generated videos feature natural-looking facial expressions and head movements, along with lip movements synchronized to the audio track. A video showcasing the model’s capabilities recently went viral on social media and astounded viewers. In the AI-generated clip, Leonardo da Vinci’s famous painting, the Mona Lisa, lip-syncs to Anne Hathaway’s rap, Paparazzi.
The video, which was first released on Microsoft’s official website, has gained a lot of popularity on social media due to its captivating portrayal of the Mona Lisa rapping. Many viewers found the clip funny and shared it widely. “The Mona Lisa clip had me rolling on the floor laughing,” a user commented. Another said, “Oh, man, if only Leonardo da Vinci could have seen this.”
Several people also voiced concerns about potential misuse of the technology, particularly for creating deepfakes. Microsoft made it clear that it currently has no plans to release the technology to the public as an online demo, API, or other implementation until it is confident the technology will be used responsibly and in compliance with legal requirements.
What Is VASA?
2. Realism and liveliness – example 1 pic.twitter.com/Kz0Bm2NRNy
— Min Choi (@minchoi) April 18, 2024
VASA is a framework for creating virtual characters with lifelike talking faces and appealing visual affective skills (VAS). According to Microsoft researchers, its first model, VASA-1, not only produces lip movements that are closely synchronized with the audio but also captures a wide range of emotions, facial nuances, and natural head motions. As the technology evolves, it moves beyond mere imitation and into artistic innovation.
The marvels of AI-powered image animation offer a glimpse into a future where creativity knows no bounds and imagination knows no limits. So, what do you think of this new technology?
Cover image credits: X/Min Choi