
Out of Control
Chapter 16: THE FUTURE OF CONTROL

At Colossal Picture Studios in the industrial outskirts of San Francisco, Brad de Graf works on faking human behavior. Colossal is the little-known special effects studio behind some of the most famous animated commercials on TV, such as the Pillsbury Doughboy. Colossal also did the avant-garde animation series for MTV called Liquid TV, starring animated stick figures, low-life muppets on motorbikes, animated paper cutouts, and the bad boys Beavis and Butt-head.

De Graf works in a cramped studio in a redecorated warehouse. In several large rooms under dimmed lights about two dozen large computer monitors glow. This is an animation studio of the '90s. The computers -- heavy-duty graphic workstations from Silicon Graphics -- are lit with projects in various stages, including a completely computerized bust of rock star Peter Gabriel. Gabriel's head shape and face were scanned, digitized, and reassembled into a virtual Gabriel that can substitute for his live body in his music videos. Why waste time dancing in front of cameras when you could be in a recording studio or in the pool? I watched an animator fiddle with the virtual star. She was trying to close Gabriel's mouth by dragging a cursor to lift his jaw. "Ooops," she said, as she went too far and Gabriel's lower lip sailed up and penetrated his nose, making a disgusting grimace.

I was at de Graf's workshop to see Moxy, the first completely computer-animated character. On the screen Moxy looks like a cartoon dog. He's got a big nose, a chewed ear, two white gloves for hands, and "rubber hose" arms. He's also got a great comic voice. His actions are not drawn. They are lifted from a human actor. There's a homemade virtual reality "waldo" in one corner of the room. A waldo (named after a character in an old science-fiction story) is a device that lets a person drive a puppet from a distance. The first waldo-driven computer animation was an experimental Kermit the Frog animated by a hand-size muppet waldo. Moxy is a full-bodied virtual character, a virtual puppet.

When an animator wants to have Moxy dance, the animator puts on a yellow hardhat with a stick taped to the peak. At the end of the stick is a location sensor. The animator straps on shoulder and hip sensors, and then picks up two foam-board pieces cut out in the shape of very large cartoon hands -- gloves. He waves these around -- they also have location sensors on them -- as he dances. On the screen Moxy the cartoon dog, in his funky toon room, dances in unison.
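How a performer's motions might be mapped onto a character like Moxy is simple to sketch in code: each tracked sensor streams a position and orientation that is copied, frame by frame, onto a matching joint of the on-screen puppet. The sketch below is a guess at that mapping, assuming a hypothetical read_sensor() tracker interface and puppet API; the real rig was certainly richer.

```python
# Hypothetical sketch: remap waldo sensor poses onto puppet joints.
# The sensor names, joint names, and the read_sensor()/set_joint()
# interfaces are all invented for illustration.

SENSOR_TO_JOINT = {
    "hardhat":        "head",
    "left_shoulder":  "left_arm_root",
    "right_shoulder": "right_arm_root",
    "hips":           "pelvis",
    "left_glove":     "left_hand",
    "right_glove":    "right_hand",
}

def update_puppet(puppet, read_sensor):
    """Copy each tracked sensor pose onto the matching puppet joint."""
    for sensor_name, joint_name in SENSOR_TO_JOINT.items():
        position, rotation = read_sensor(sensor_name)  # 3-D pose from the tracker
        puppet.set_joint(joint_name, position, rotation)

# Run once per video frame: whatever the sensors do, the joints do,
# and the cartoon dances in unison with the performer.
```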

Moxy's best trick is that he can lip sync automatically. A recorded human voice pours into an algorithm that figures out how Moxy's lips should move, and then moves them. The studio hackers like to have Moxy say all kinds of outrageous things in other people's voices. In fact, Moxy can be moved in many ways. He can be moved by twirling dials, typing commands, moving a cursor, or even by autonomous behavior generated by algorithms.
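The automatic lip sync can be sketched in the same hedged spirit: the recorded voice is reduced to a sequence of timed phonemes, each phoneme is mapped to a mouth shape (a "viseme"), and the character's mouth is keyed accordingly. The table and interface below are illustrative assumptions, not the studio's actual algorithm.

```python
# A minimal lip-sync sketch: timed phonemes in, mouth-shape keyframes out.
# The viseme table and set_mouth_shape() callback are assumptions.

VISEMES = {
    "AA": "open_wide", "IY": "narrow_smile", "UW": "pucker",
    "M": "closed", "B": "closed", "P": "closed",
    "F": "teeth_on_lip", "V": "teeth_on_lip",
}

def lip_sync(phonemes, set_mouth_shape):
    """phonemes: list of (start_time, phoneme) pairs from speech analysis."""
    for start_time, phoneme in phonemes:
        shape = VISEMES.get(phoneme, "relaxed")  # default for unlisted sounds
        set_mouth_shape(time=start_time, shape=shape)
```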

That's the next step for de Graf and other animators: to imbue characters like Moxy with elementary moves -- standing up, bending over, lifting a heavy object -- which can be recombined into smooth believable action. And then to apply that to a complex human figure.

To calculate the movement of a human figure is marginally possible for today's computers, given enough time. But done on the fly, as your body does it in real life, in a world that shifts while you are figuring out where to put your foot, that movement becomes nearly impossible to simulate well. The human figure has about 200 moving joints. The total number of possible positions a figure with 200 moving parts can assume is astronomical. To simply pick your nose in real time demands more computational power than we have in large computers.
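A back-of-the-envelope calculation shows why "astronomical" is not an exaggeration. Assume, very crudely, that each joint can hold only ten distinguishable positions; the numbers below are mine, not measurements.

```python
# Crude combinatorics: 200 joints x 10 assumed positions each.
joints = 200
positions_per_joint = 10              # a deliberately rough assumption
print(positions_per_joint ** joints)  # 10**200 -- far more configurations than
                                      # atoms in the observable universe (~10**80)
```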

But the complexity doesn't stop there, because each pose of the body can be reached by a multitude of pathways. When I raise my foot to slip into a pair of shoes, I steer my leg through that exact pose by any of hundreds of combinations of thigh, leg, foot, and toe actions. In fact, the sequences my limbs take while walking are so complex that they leave room for a million individual variations. Others can identify me -- often from a hundred feet away and without seeing my face -- entirely by my unconscious choice of which foot muscles I engage when I walk. Faking someone else's combination is hard.

Researchers who try to simulate human movement in artificial figures quickly discover what animators of Bugs Bunny and Porky Pig have known all along: that some linkage sequences are more "natural" than others. When Bugs reaches for a carrot, some arm routes to the vegetable appear more human than other routes. (Bugs's behavior, of course, does not simulate a rabbit but a person.) And much depends on the sequential timing of parts. An animated figure following a legitimate sequence of human movements can still appear robotic if the relative speeds of, say, swinging upper arm to striding leg are off. The human brain detects such counterfeits easily. Timing, therefore, is yet another complexifying aspect of motion.

Early attempts to create artificial movement forced engineers far afield into the study of animal behavior. To construct legged vehicles that could roam Mars, researchers studied insects, not to learn how to build legs, but to figure out how insects coordinated six legs in real time.

At the corporate labs of Apple Computer, I watched a computer graphics specialist endlessly replay a video of a walking cat to deconstruct its movements. The videotape, together with a pile of scientific papers on the reflexes of cat limbs, was helping him extract the architecture of cat walking. Eventually he planned to transplant that architecture into a computerized virtual cat. Ultimately he hoped to extract a generic four-footed locomotion pattern that could be adjusted for a dog, cheetah, lion, or whatever. He was not concerned at all with the look of the animal; his model was a stick figure. He was concerned with the organization of the complicated leg, ankle, and foot actions.

In David Zeltzer's lab at MIT's Media Lab, graduate students developed simple stick figures that could walk across an uneven landscape "on their own." The animals were nothing more than four legs on a stick backbone, each leg hinged in the middle. The students would aim the "animat" in a certain direction; it would then figure out where the low and high spots were and move its legs accordingly, adjusting its stride to compensate. The effect was a remarkably convincing portrait of a critter walking across rugged terrain. But unlike an ordinary Road Runner animation, no human decided where each leg had to go at every moment of the picture. The character itself, in a sense, decided. Zeltzer's group eventually populated their world with autonomous six-legged animats, and even got a two-legged thing to ramble down a valley and back.
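A toy version of that terrain-adaptive stepping is easy to imagine, though the details below are my own guesses rather than Zeltzer's code: each leg samples the ground height ahead, shortens its stride if the rise is too steep, and lifts its foot just enough to clear the bump.

```python
# Toy sketch of one leg planning its next foothold on uneven ground.
# terrain_height() and the gait constants are illustrative assumptions.

NOMINAL_STEP = 1.0   # preferred stride length
MAX_RISE = 0.3       # tallest bump a leg will step onto at full stride

def plan_step(leg_x, heading, terrain_height):
    """Return (target_x, foot_lift) for a leg's next step."""
    target_x = leg_x + NOMINAL_STEP * heading
    rise = terrain_height(target_x) - terrain_height(leg_x)
    if rise > MAX_RISE:                               # too steep: shorten the stride
        target_x = leg_x + 0.5 * NOMINAL_STEP * heading
        rise = terrain_height(target_x) - terrain_height(leg_x)
    foot_lift = max(rise, 0.0) + 0.1                  # clear the obstacle, then settle
    return target_x, foot_lift
```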

Zeltzer's students put together Lemonhead, a cartoony figure that could walk on his own. His walking was more realistic and more complicated than the stick figures' because he relied on more body parts and joints. He could skirt around obstacles such as fallen tree trunks with realistic motion. Lemonhead inspired Steve Strassman, another student in Zeltzer's lab, to see how far he could get in devising a library of behavior. The idea was to make a generic character like Lemonhead and give him access to a "clip book" of behaviors and gestures. Need a sneeze? Here's a disk-full.

Strassman wanted to instruct a character in plain English. You simply tell it what to do, and the figure retrieves the appropriate behaviors from the "four food groups of behavior" and combines them in the right sequence for sensible action. If you tell it to stand up, it knows it has to move its feet from under the chair first. "Look," Strassman warns me before his demo begins, "this guy won't compose any sonatas, but he will sit in a chair."
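One way to picture such a clip book -- purely as a sketch, with names I have made up -- is a library of behaviors tagged with preconditions, where asking for "stand up" automatically pulls in whatever clip clears the feet from under the chair first.

```python
# Sketch of a behavior "clip book" with preconditions.
# Clip names and state flags are invented for illustration.

BEHAVIORS = {
    "clear_feet": {"requires": [], "achieves": ["feet_clear_of_chair"]},
    "stand_up":   {"requires": ["feet_clear_of_chair"], "achieves": ["standing"]},
    "sneeze":     {"requires": [], "achieves": []},
}

def expand(action, state, plan=None):
    """Prepend whatever clips are needed to make the requested clip legal."""
    plan = plan if plan is not None else []
    for condition in BEHAVIORS[action]["requires"]:
        if condition not in state:
            provider = next(name for name, clip in BEHAVIORS.items()
                            if condition in clip["achieves"])
            expand(provider, state, plan)
            state.update(BEHAVIORS[provider]["achieves"])
    plan.append(action)
    return plan

print(expand("stand_up", set()))   # ['clear_feet', 'stand_up']
```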

Strassman fired up two characters, John and Mary. Everything happened in a simple room viewed from an oblique angle above the ceiling -- a sort of god's-eye view. "Desktop theater," Strassman called it. The setting, he said, was that the couple occasionally had arguments. Strassman worked on their goodbye scene. He typed: "In this scene, John gets angry. He offers the book to Mary rudely, but she refuses it. He slams it down on the table. Mary rises while John glares." Then he hits the PLAY key.

The computer thinks about it for a second, and then the characters on the screen act out the play. John frowns; his actions with the book are curt; he clenches his fists. Mary stands up suddenly. The end. There's no grace, nothing very human about their movements. And it's hard to catch the fleeting gestures because they don't call attention to their motions. One does not feel involved, but there, in that tiny artificial room, are characters interacting according to a god's script.

"I'm a couch-potato director," Strassman says. "If I don't like the way the scene went I'll have them redo it." So he types in an alternative: "In this scene, John gets sad. He's holding the book in his left hand. He offers it to Mary kindly, but she refuses it politely." Again, the characters play out the scene.

Subtlety is the difficult part. "We pick up a phone differently than a dead rat," Strassman said. "I can stock up on different hand motions, but the tricky thing is what manages them? Where does the bureaucracy that controls these choices get invented?"

Taking what they learned from the stick figures and Lemonhead, Zeltzer and colleague Michael McKenna fleshed out the skeleton of one six-legged animat into a villainous chrome cockroach and made the insect a star in one of the strangest computer animations ever made. Facetiously entitled "Grinning Evil Death," the token plot of the five-minute video was the story of how a giant metallic bug from outer space invaded Earth and destroyed a city. While the story was a yawner, the star, a six-legged menace, was the first animat -- an internally driven artificial animal.

When the humongous chrome cockroach crawled down the street, its behavior was "free." The programmers told it, "Walk over those buildings," and the virtual cockroach in the computer figured out how its legs should go and at what angle its torso should be held, and then it painted a plausible video portrait of itself wriggling up and over five-story brick buildings. The programmers aimed its movements rather than dictated them. Coming down off the buildings, an artificial gravity pulled the giant robotic cockroach to the ground. As it fell, the simulated gravity and simulated surface friction made its legs bounce and slip realistically. The cockroach acted out the scene without its directors being drowned in the minutiae of its foot movements.
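The "artificial gravity" and "surface friction" in that scene amount to a few lines of physics applied to each foot every frame. A minimal sketch, assuming point-mass feet and made-up constants:

```python
# Minimal per-frame physics for one foot: gravity pulls it down,
# friction scrubs off sideways speed when it lands. All constants
# and the Foot structure are illustrative assumptions.
from dataclasses import dataclass

GRAVITY = -9.8    # m/s^2
FRICTION = 0.6    # crude friction coefficient
DT = 1 / 30       # one video frame

@dataclass
class Foot:
    x: float
    y: float
    vx: float
    vy: float

def step_foot(foot):
    """Advance one foot by a single frame of gravity plus ground contact."""
    foot.vy += GRAVITY * DT
    foot.y += foot.vy * DT
    if foot.y <= 0.0:                    # hit the ground
        foot.y, foot.vy = 0.0, 0.0
        foot.vx *= (1.0 - FRICTION)      # landing makes the foot slip, then stick
    foot.x += foot.vx * DT
    return foot
```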

The next step toward birthing an autonomous virtual character is now in trial: Take the bottom-up behavioral engine of the giant cockroach and surround it with the glamorous carcass of a Jurassic dino to get a digital film actor. Wind the actor up, feed it lots of computer cycles, and then direct it as you would a real actor. Give it general instructions -- "Go find food" -- and it will, on its own, figure out how to coordinate its limbs to do so.

Building the dream, of course, is not that easy. Locomotion is merely one facet of action. Simulated creatures must not only move, they must navigate, express emotion, react. In order to invent a creature that could do more than walk, animators (and roboticists) need some way to cultivate indigenous behaviors of all types.
