For more than a quarter-century, Digital Domain has been at the forefront of film visual effects, developing innovative technology and techniques for boundary-pushing films such as Titanic and What Dreams May Come.
Increasingly, the studio has included digital doubles—intricate, near-identical recreations of real-life actors—in its repertoire, such as in The Curious Case of Benjamin Button, Tron Legacy, and the Avengers films. Over the last two years, Digital Domain has turned its attention to the possibilities of real-time digital humans, which are rendered in a game engine and can be “driven” or performed by a real person.
Digital Domain’s Digital Human Group recently showcased the early results of its work during a TED Talk, in which Doug Roble, the studio’s head of software R&D, gave a presentation about the technology. Only it wasn’t just Doug talking to the audience—it was also “DigiDoug,” a realistic digital duplicate of Roble himself, who mimicked Roble’s own body and facial movements on the screen above him.
The results were astounding, and worn on the real Doug's body on the TED stage was the Xsens MVN inertial motion capture suit that helped realize his team's vision. The possibilities for Digital Domain's tech—much like for the Xsens suit itself—are seemingly endless.
The birth of DigiDoug
According to Roble, the team was first inspired to investigate real-time digital humans after witnessing a “Meet Mike” demo from facial animation specialists Cubic Motion at SIGGRAPH two years back. It was a different approach to the kind of work that Digital Domain already specialized in, and also paired well with other ongoing in-house R&D experiments. The team quickly saw the potential of exploring it further.
“The combination of the power of rendering that’s possible now with gaming engines, and the speed that we were getting out of our machine learning experiments… we thought, ‘Holy smokes, could we take almost all of the technology that we’ve been developing for feature-quality work, and then put it into real-time?'” Roble recalls. “Over the last two years, that’s what we did.”
Bringing DigiDoug to life requires the fusion of extensive technology and expertise. The USC Institute for Creative Technologies’ Vision and Graphics Lab captured Doug’s face in incredible detail, Dimensional Imaging was tapped to capture his facial movements, and a Fox VFX Lab helmet camera was used onstage.
The Xsens MVN motion capture system is used to capture Doug’s exacting body movements, while Manus VR gloves capture his hand and finger movements in real-time. IKINEMA streams, retargets, and cleans up his body performance and synchronizes movements, with Epic Games’ Unreal Engine rendering DigiDoug and NVIDIA GPUs enabling the rendering and machine learning processes behind the demonstration.
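The pipeline above is, in essence, a real-time data flow: a mocap source streams per-segment poses over the network each frame, and a downstream consumer (retargeting, then the renderer) parses them. As a rough illustration only, the sketch below parses a hypothetical, simplified pose packet; the byte layout, segment IDs, and field order are invented for this example and are not the actual Xsens MVN or IKINEMA protocol.

```python
import struct

# Hypothetical, simplified layout for one body-segment sample in a
# real-time mocap stream: segment id (uint32), position x/y/z (float32),
# orientation quaternion w/x/y/z (float32), big-endian. Real systems
# such as Xsens MVN define their own network protocol; this is only
# an illustration of the general idea.
SEGMENT_FORMAT = ">I3f4f"
SEGMENT_SIZE = struct.calcsize(SEGMENT_FORMAT)

def parse_pose_packet(payload: bytes):
    """Split a packet of concatenated segment samples into dicts."""
    segments = []
    for offset in range(0, len(payload), SEGMENT_SIZE):
        seg_id, x, y, z, qw, qx, qy, qz = struct.unpack_from(
            SEGMENT_FORMAT, payload, offset)
        segments.append({
            "id": seg_id,
            "position": (x, y, z),
            "orientation": (qw, qx, qy, qz),
        })
    return segments

# Build a two-segment test packet (pelvis at the origin, head 1.7 m up)
# and parse it back, as a renderer's network thread might each frame.
packet = struct.pack(SEGMENT_FORMAT, 0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
packet += struct.pack(SEGMENT_FORMAT, 7, 0.0, 0.0, 1.7, 1.0, 0.0, 0.0, 0.0)
pose = parse_pose_packet(packet)
print(len(pose), pose[1]["position"][2])
```

In a live setup, each parsed frame would then be retargeted onto the digital character's skeleton and handed to the game engine for rendering within the same frame budget.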
Introducing DigiDoug to TED
Roble says that much of the team’s time on the project has been spent working out the kinks and polishing the system. When the opportunity came to demonstrate the technology at a TED Talk—and feature the first-ever digital double to deliver such a presentation—Digital Domain jumped at the chance. However, they also knew that it would be an incredible challenge to bring the technology out of their lab for the first time.
“We had imagined that the system would be used where the actors would be backstage, but they wanted him onstage for the TED Talk,” explains Digital Domain software engineer, Melissa Cell. “Luckily, we were using Xsens, and we were able to have him go out on stage and be mocapped, because it doesn’t require cameras and is inertial-based.”
The Xsens MVN system can be used essentially anywhere, as sensors within the suit capture all movements without the need for a capture stage or a camera setup. The suits can be easily stored and transported, and then set up within 15 minutes—you can use a laptop to capture data right there on the spot. The ability to move freely around the stage was essential to delivering the TED demonstration.
With the TED Talk successfully executed, Roble and team decided to double up for SIGGRAPH’s Real-Time Live!, in which two DigiDougs were piloted in real-time by both Roble and Cell. It was a new level of challenge for the Digital Human Group, but it’s one that Xsens was already equipped to handle.
“Connecting two Xsens suits to the Xsens MVN software was pretty easy—it’s already set up to do that,” Cell notes. According to Roble, their system could potentially scale up further with even more digital humans sharing a stage in real-time. “The challenge was going from one to two,” he says. “Adding more shouldn’t be… well, I should stop talking right there.”
Why Xsens was key
Roble sees incredible potential ahead for believable and utterly lifelike digital doubles. While it’ll be another hugely useful tool for Digital Domain’s film VFX team, he also sees a future in which digital doubles are used for virtual reality communication, and also to represent A.I. systems that people can interact with.
Digital Domain has an enormous camera-based mocap system and stage used for its feature work, but the Digital Human Group needed something different for this project—something portable, adaptable, and complementary to other capture tech regardless of location. Digital Domain needed Xsens for that.
“Because we are primarily an R&D group, it’s been really helpful for us to be able to set up anywhere,” Cell says. “Xsens lets us move around. We can run it right in the room where our engineers are working, or we can move to the stage. We don’t have to work on a mocap stage, and that’s been really valuable for us.”
“We could use our capture stage, but the flexibility of Xsens—being able to take it on the road and being able to do it in our cubicles—was staggeringly important,” Roble affirms. “Once we decided to go Xsens, we just kept on going. It’s been a fine solution for that.”