OmniHuman: ByteDance’s New AI Creates Realistic Videos From a Single Photo

ByteDance researchers have developed an AI system that transforms single photographs into realistic videos of people speaking, singing and moving naturally — a breakthrough that could reshape digital entertainment and communications. The new system, called OmniHuman, generates full-body videos that show people gesturing and moving in ways that match their speech, surpassing previous AI models that could only animate faces or upper bodies.

“End-to-end human animation has undergone notable advancements in recent years,” the ByteDance researchers wrote in a paper published on arXiv. “However, existing methods still struggle to scale up as large general video generation models, limiting their potential in real applications.” The team trained OmniHuman on more than 18,700 hours of human video data using a novel approach that combines multiple types of inputs: text, audio and body movements. This “omni-conditions” training strategy allows the AI to learn from much larger and more diverse datasets than previous methods.

“Our key insight is that incorporating multiple conditioning signals, such as text, audio and pose, during training can significantly reduce data wastage,” the research team explained. The technology marks a significant advance in AI-generated media, demonstrating capabilities that range from creating videos of people delivering speeches to depicting subjects playing musical instruments. In testing, OmniHuman outperformed existing systems across multiple quality benchmarks.

The development emerges amid intensifying competition in AI video generation, with companies like Google, Meta and Microsoft pursuing similar technologies. The breakthrough could give ByteDance, TikTok’s parent company, an advantage in this rapidly evolving field. Industry experts say such technology could transform entertainment production, educational content creation and digital communications. However, it also raises concerns about potential misuse in creating synthetic media for deceptive purposes. The researchers will present their findings at an upcoming computer vision conference, although they have not yet specified when or which one.

Terron Gold
