Shenzhen, China--(Newsfile Corp. - June 29, 2026) - Sumeru AI announced a major update to Mugen3D, its real-time interactive 3D content engine, enabling users to turn one photograph and voice sample into a talking, emoting 3D human ready for live conversation. The technology is already deployed across university classrooms in China, marking an early operational use of geometry-based spatial AI in education.
From a Single Photo to a Physics-Aware 3D Asset
A user uploads one photograph. Within minutes, proprietary geometric algorithms combined with 3D Gaussian Splatting (3DGS) produce a true-to-source model preserving facial structure, hair, fabric texture and surface lighting at 4K resolution. A single unified pipeline handles generation across humans, objects and scenes.

Image 1: Single Photo to Live 3D workflow - from 2D portrait input to 3DGS reconstruction to live 3D AI teacher output
To view an enhanced version of this graphic, please visit:
https://images.newsfilecorp.com/files/10816/302531_f411a9cef431dba1_001full.jpg
The update shifts Mugen3D from reconstructing the body to enabling real-time interaction. SumeruAI, the company's interaction engine, connects the 3D figure to voice input, multilingual dialogue, role-based knowledge and a speech-to-face animation pipeline with under 150 milliseconds of latency. The result is not a prerendered loop or rigid avatar, but a live persona that speaks, reacts, answers questions and converses.
Geometry-First Architecture
Sumeru AI built its generation model on eight RTX 5090 GPUs. A single model generates on one RTX 5090, and real-time interaction runs on standard consumer hardware, including smartphones. The pipeline completes full generation and optimization in minutes on a single RTX 5090 GPU.

Image 2: Compute & Cost Efficiency infographic - comparison of traditional video generation versus Mugen3D geometry pipeline
To view an enhanced version of this graphic, please visit:
https://images.newsfilecorp.com/files/10816/302531_f411a9cef431dba1_002full.jpg
Because Mugen3D produces spatial geometry rather than flat pixels, each asset is generated once and can be reused across any rendering environment without additional compute. Where video platforms require continuous high-cost inference to sustain output, a 3D asset generated once can be reused, re-rendered and redeployed indefinitely. This reusability makes the geometry-based approach practical for deployments that require consistent, repeatable interaction at scale.
Real-World Deployment
In its latest public demo, Sumeru AI showcases a Math Teacher persona generated from a real photograph. The avatar explains concepts, answers live questions and switches languages on demand.

Image 3: Classroom deployment - students interacting with 3D AI teacher on large display
To view an enhanced version of this graphic, please visit:
https://images.newsfilecorp.com/files/10816/302531_f411a9cef431dba1_003full.jpg
Institutions already using the platform include Beijing Institute of Technology, Shanghai Jiao Tong University, Shenzhen University, and Hong Kong University of Science and Technology. Nearly 1,000 educators and tens of thousands of students now interact with Sumeru AI's digital teaching personas across these institutions.
On June 17, 2026, Sumeru AI earned second place in the AI Agent track, the "Second Brain" Challenge, at the 36Kr National AI+ Scenario Application Competition during WAVES 2026 in Guangzhou.

Image 4: Character Plaza presentation at 36Kr WAVES 2026 competition by Dr. Cheng Feng, CEO of Sumeru AI
To view an enhanced version of this graphic, please visit:
https://images.newsfilecorp.com/files/10816/302531_f411a9cef431dba1_004full.jpg

Image 5: Robot Human-Likeness Test demo at 36Kr WAVES 2026
To view an enhanced version of this graphic, please visit:
https://images.newsfilecorp.com/files/10816/302531_f411a9cef431dba1_005full.jpg
"Most image-to-3D tools stop at the render. We treat that as the starting point," said Dr. Cheng Feng, CEO of Sumeru AI. "World models cannot be built on flat video. Reality is 3D. Whether training robots in simulated rooms or teaching students in virtual classrooms, you need geometry with physical bounds, not hallucinated pixels. A 3D teacher can explain a concept, answer a follow-up and remain with a student until the idea sticks. That is presence, not content."
Mugen3D replaces manual sculpting, rigging and animation with a photo-driven workflow that cuts production time while maintaining precise visual reconstruction. The pipeline supports Unity, Unreal Engine and WebGL environments, positioning the platform for education, robotics simulation, spatial computing, 3D printing and interactive entertainment.
Sumeru AI was founded by Dr. Cheng Feng, a physicist and University of Leicester PhD who previously built a computer vision company to a 1.5 billion RMB valuation, and Dr. Tomohiro Nagasaka, a Kyoto University mathematics PhD and International Mathematical Olympiad medalist who built China's first optical motion capture system. The company completed a ten-million-yuan angel round in 2025 and joined the NVIDIA Inception program in 2024.
Mugen3D is available at www.sumeruai.us/mugen3d.
About Sumeru AI
Sumeru AI, also known as Shenzhen Quxiang Spacetime Technology, builds spatial intelligence infrastructure for education, robotics and spatial computing. Its Mugen3D platform turns images and voice into precise 3D assets and conversational digital personas capable of real-time interaction.
Media Contact
Name: Yanchen Dong
Email: yanchen.dong@sumeruai.com
To view the source version of this press release, please visit https://www.newsfilecorp.com/release/302531
Source: The Alchemists Group
