Source: AFP
Microsoft researchers have unveiled a new artificial intelligence tool that can create deeply realistic human avatars — but offered no timetable for making it publicly available, citing concerns about facilitating deepfake content.
The AI model, known as VASA-1, for “visual affective skills,” can create an animated video of a person speaking, with synchronized lip movements, using just one image and a speech audio clip.
Disinformation researchers fear rampant abuse of AI-powered apps to create deepfake images, videos and audio clips in a crucial election year.
“We oppose any behavior to create misleading or harmful content about real people,” wrote the authors of the VASA-1 report, released this week by Microsoft Research Asia.
“We are committed to developing artificial intelligence responsibly, with the goal of advancing human well-being,” they said.
“We do not intend to release a web demo, API, product, additional application details, or any related offerings until we are confident that the technology will be used responsibly and in accordance with appropriate regulations.”
Microsoft researchers said the technology can capture a wide range of facial nuances and natural head movements.
“It paves the way for real-time engagements with realistic avatars that mimic human conversational behaviors,” the researchers said in the post.
VASA can work with artistic photos, songs and non-English speech, according to Microsoft.
The researchers cited potential benefits of the technology, such as providing virtual tutors for students or therapeutic support for those in need.
“It is not intended to create content that is used to mislead or deceive,” they said.
The VASA videos still have “artifacts” that reveal they were created by AI, according to the post.
ProPublica’s chief technology officer, Ben Werdmuller, said he would be “excited to hear about someone using it to represent them in a Zoom meeting for the first time.”
“So how did it go? Did anyone notice?” he said on the Threads social network.
ChatGPT maker OpenAI in March unveiled a voice cloning tool called “Voice Engine” that can essentially copy someone’s speech based on a 15-second audio sample.
However, it said it was “taking a cautious and informed approach to a wider release due to the potential for synthetic voice misuse.”
Earlier this year, a consultant working for a long-shot Democratic presidential candidate admitted to being behind a robocall impersonating Joe Biden that was sent to voters in New Hampshire, saying he was trying to highlight the dangers of artificial intelligence.
The call sounded like the voice of Biden urging people not to vote in the state's January primary election, raising alarm among experts who fear a deluge of AI-driven deepfake disinformation in the 2024 race for the White House.