Coming to a wireless phone near you: video. It won't be ready by this weekend's match-up between the Cowboys and the Steelers, but in the near future, you could be watching events like the Super Bowl over your phone -- even from remote outposts accessible only through wireless devices.
Scientists from around the world are meeting this week in Germany to develop a new standard that will slim video images so thoroughly that they could be sent over wireless gadgets such as cellular phones. The standard would also speed the transmission of video over conventional telephone lines, opening the door to teleconferencing, videophones, and interactive television.
Among the contributors to this new technology is a team at the University of Rochester. Two proposals prepared by Murat Tekalp, professor of electrical engineering, and graduate student Yucel Altunbasak are among the 85 or so technologies being considered in Munich as part of the new standard for compressing video images. The pair has filed for two patents on the work.
Dozens of members of the Moving Picture Experts Group (MPEG) are evaluating the proposals, which come from companies such as Microsoft, AT&T and Texas Instruments, as well as universities like Columbia, Berkeley and Rochester. The Rochester proposals are included in 25 core experiments that MPEG has decided to pursue as it pieces together a video standard to be known as MPEG-4, expected to be completed by 1998.
Industry has already set a few standards, including MPEG-1 and MPEG-2, that govern the technologies that compress and run video images off hard drives, CDs and other devices with massive amounts of storage space. Those video images typically are compressed and then transmitted at several megabits per second.
Now the industry is setting the standard for MPEG-4, which will compress video images dramatically. This could mean face-to-face conversations from anywhere a mobile phone can reach. You could watch your teacher present a class even though you're sick at home. Emergency personnel could transmit images of ill patients to hospital personnel while still in the field. Soldiers might choose to beam images of the front lines in such places as Bosnia back to commanders miles away. The standard will also help clear the logjam of traffic on the Internet by slimming routine video transmissions and making movies and images flow more sleekly and quickly.
MPEG-4 is also being designed to make wireless communications more reliable and to make interactivity for videogames, home shopping and other applications easier.
Transmitting real-time video images over lean data connections like telephone lines requires dramatic compression, since such connections typically can handle several thousand bits per second at most. Engineers are working on slimming down images without losing vital information, a loss that would degrade the images.
Tekalp and Altunbasak say their codes can transmit high-quality video images of simple scenes, such as someone speaking or teaching, at roughly 20 to 30 kilobits per second, slim enough to send over videophones and wireless devices.
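To give a rough sense of what 20 to 30 kilobits per second means, the short Python sketch below works out the bit budget per frame and the implied compression ratio. The 30 frames per second matches the rate quoted for the Rochester test sequence; the uncompressed picture format (176 x 144 pixels at 12 bits per pixel) is purely an assumption for illustration and is not taken from the Rochester proposals.

```python
def bits_per_frame(bitrate_bps, fps):
    """Average number of bits available to describe one frame."""
    return bitrate_bps / fps


def raw_bitrate(width, height, bits_per_pixel, fps):
    """Bit rate of the same video with no compression at all."""
    return width * height * bits_per_pixel * fps


def compression_ratio(raw_bps, target_bps):
    """How many times smaller the compressed stream must be."""
    return raw_bps / target_bps


if __name__ == "__main__":
    FPS = 30                       # rate quoted for the test sequence
    # Assumed uncompressed format (hypothetical, for illustration only).
    raw = raw_bitrate(176, 144, 12, FPS)
    for target in (20_000, 30_000):
        print(target, "bps:",
              round(bits_per_frame(target, FPS)), "bits per frame,",
              "about", round(compression_ratio(raw, target)), "to 1")
```

Under these assumptions each frame has on the order of a thousand bits or fewer to work with, which is why every bit has to be spent carefully.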
The Rochester technology incorporates several features widely coveted in the digital imaging industry. For instance, the code is able to zero in and focus a computer's resources on those areas that are moving the most. In the case of a person speaking, resources would be devoted to the eyes and mouth, with just a minimum devoted to the rest of the person's head and face, which usually move only slightly. This processing is hidden to the viewer, who sees one continuous, clear image.
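The idea of concentrating resources on the parts of a picture that move the most can be sketched in a few lines of Python. This is a generic illustration, not the Rochester code: the block-based activity measure, the function names, and the proportional split with a small guaranteed minimum per block are all assumptions made for the example.

```python
def block_activity(prev, curr, block):
    """Sum of absolute pixel differences for each block of the frame.

    Frames are lists of rows of integer pixel values. The result maps
    (block_row, block_col) to a motion-activity score.
    """
    height, width = len(curr), len(curr[0])
    activity = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            sad = 0
            for y in range(by, min(by + block, height)):
                for x in range(bx, min(bx + block, width)):
                    sad += abs(curr[y][x] - prev[y][x])
            activity[(by // block, bx // block)] = sad
    return activity


def allocate_bits(activity, budget, floor):
    """Give every block `floor` bits, then share the rest by activity."""
    count = len(activity)
    remaining = budget - floor * count
    if remaining < 0:
        raise ValueError("budget too small for the per-block minimum")
    total = sum(activity.values())
    allocation = {}
    for key, score in activity.items():
        # With no motion anywhere, fall back to an even split.
        share = remaining * score / total if total else remaining / count
        allocation[key] = floor + share
    return allocation


if __name__ == "__main__":
    previous = [[0] * 4 for _ in range(4)]
    current = [row[:] for row in previous]
    current[2][2], current[3][3] = 10, 30   # motion only in one corner
    scores = block_activity(previous, current, block=2)
    print(allocate_bits(scores, budget=1000, floor=10))
```

In this toy frame only the lower-right block changes, so it receives nearly the whole budget while the still blocks get just the minimum, much as a talking head's eyes and mouth would outweigh the rest of the face.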
The formula is designed to reduce the errors that most algorithms make from frame to frame. For instance, when the speaker's head moves suddenly, or when a person's hand leaves and then enters the field of view, most compression techniques send the new information as large, cumbersome chunks which take up the bulk of a transmission. The Rochester code eliminates such chunks, sleekly processing almost all new information on the fly.
"We spend our bits wisely," says Tekalp, who last year wrote the first book on Digital Video Processing, which quickly sold out and is being reprinted. "We send only the information that is important for each frame. When you need bits to correct for your errors, you're wasting bandwidth."
The team's content-based mesh design is a geometric pattern (invisible to the viewer) filled with triangles that fits snugly over a scene, like a glove over a hand. On the mesh are 200 flexible nodes that move with the scene, lending a certain ease of motion from frame to frame. The 10-second test sequence on which the Rochester team tested its algorithm is of a mother holding and speaking to a child, transmitted at 30 frames per second. As the mother and child move, the nodes dance around the screen, transmitting the information vital to reconstructing each frame in real time, including the positions of the node points, information about their motion, and the coordinates of new nodes.
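One common way a triangular mesh carries motion from frame to frame is to move every point inside a triangle along with the triangle's three corner nodes. The Python sketch below shows that general technique using barycentric weights; it is an assumption-laden illustration of mesh-based motion in general, not the Rochester team's algorithm, and the function names are hypothetical.

```python
def barycentric(p, a, b, c):
    """Weights (w1, w2, w3) such that p = w1*a + w2*b + w3*c."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1 - w1 - w2


def warp_point(p, tri_current, tri_previous):
    """Find where point p of the current frame sat in the previous frame.

    The point keeps the same barycentric weights while the triangle's
    nodes move, which amounts to an affine motion model per triangle.
    """
    w1, w2, w3 = barycentric(p, *tri_current)
    (ax, ay), (bx, by), (cx, cy) = tri_previous
    return (w1 * ax + w2 * bx + w3 * cx,
            w1 * ay + w2 * by + w3 * cy)


if __name__ == "__main__":
    # Node positions of one triangle in two consecutive frames.
    now = ((0, 0), (4, 0), (0, 4))
    before = ((1, 2), (5, 2), (1, 6))
    print(warp_point((1, 1), now, before))
```

With this kind of model, only the node positions and their motion need to be sent for each frame; the receiver can rebuild the motion of every pixel inside a triangle from its three nodes.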
The mesh is equipped with many special features. Most important, the mesh estimates motion very accurately, so there is little error from frame to frame. The mesh fits itself to any scene without special instructions; the nodes organize themselves so that triangles don't straddle different surfaces, such as the child's face and the mother's arm; and the triangles move so that they do not block other parts of the image.
This work was funded by the Center for Electronic Imaging Systems, which is supported by the National Science Foundation, New York State, and several industrial partners.