Tuesday, May 31, 2016

Compression on Newell Road

Silicon Valley's Newell Road starts at West Bayshore Road in East Palo Alto. After crossing the narrow bridge over the San Francisquito Creek into Palo Alto, it continues south-southwest then dead south to the public library and the art center. After crossing Embarcadero Road, it proceeds southeast and terminates at David Starr Jordan Middle School.

Jordan became famous around 1977 when Xerox PARC gave it an Alto (which PARC called the "interim Dynabook"), catapulting the pupils into a new world of windows, icons, mice, and pointing (WIMP), as described by Howard Rheingold.

Perhaps the most famous person to live on Newell Road was Dr. Norman J. Lewiston, a professor of pediatrics at the Stanford University School of Medicine who had three legal wives and whose life was made into a movie (The Man with Three Wives, in which his last name was changed to Grayson). Newell Road was where he lived with his original wife and three children.

In late spring 1994, Yoko held one of her BBQs on her patio on Newell Road, attended by the group of researchers working on color fax. In this technology, the page images are compressed with JPEG. At the receiving end, the pages looked fuzzy, and the group discussed how to improve the image quality.

The usual sharpening kernels could not be used because they would also prominently sharpen the compression artifacts, making the image quality worse. Maybe it was the Chardonnay that made the group come up with one of those crazy ideas: why not do the sharpening in the compressed domain, where the artifacts are explicit? Essentially, the original is boosted, and this is accomplished by using different quantization tables (the tables carried in the JPEG DQT segment) for compressing and decompressing, a trick that made the operation computationally free.
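The trick of decoding with a different quantization table can be illustrated with a minimal sketch on a single 8×8 block. This is not the patented method: the flat table `q_enc`, the linear `boost` ramp, and the helper names `encode_block` and `decode_block` are all hypothetical, chosen only to show that the sharpening costs nothing extra at decode time because it is folded into the dequantization multiply.

```python
import numpy as np

N = 8  # JPEG block size


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that C @ x @ C.T is the 2-D DCT of x."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


C = dct_matrix()


def encode_block(block, q_enc):
    """Level-shift, transform, and quantize one block with the encoder table."""
    coeffs = C @ (block - 128.0) @ C.T
    return np.rint(coeffs / q_enc).astype(int)


def decode_block(levels, q_dec):
    """Dequantize with the (possibly different) decoder table and invert the DCT."""
    coeffs = levels * q_dec
    return C.T @ coeffs @ C + 128.0


# Hypothetical flat encoder table; real JPEG tables vary with frequency.
q_enc = np.full((N, N), 16.0)

# Hypothetical boost that grows with spatial frequency and leaves DC untouched.
u, v = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
boost = 1.0 + 0.1 * (u + v)
q_dec = q_enc * boost

# Test block: a horizontal ramp, standing in for a soft (fuzzy) edge.
block = np.tile(np.linspace(64, 192, N), (N, 1))

levels = encode_block(block, q_enc)
plain = decode_block(levels, q_enc)  # ordinary JPEG-style decoding
sharp = decode_block(levels, q_dec)  # same data, boosted table

print("contrast (std) plain:", plain.std(), "sharpened:", sharp.std())
```

The decoder performs exactly the same multiply either way; only the table values differ, and the mean brightness of the block is preserved because the DC entry is not boosted.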

The method, described in US 5850484 A, worked like a charm, and other algorithms working in the compressed domain ensued, for example, to change the lightness or the contrast. Konstantin optimized the Huffman code and got an additional 14% compression for the color fax protocol.

This method improved the image quality, but it did not sufficiently reduce the file size, which was a showstopper. Indeed, it took 6 minutes to transmit an A4 page, but the upper time limit was considered to be 2 minutes. Optimizing the Huffman tables was not enough.
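The size of the gap can be made concrete with a little arithmetic. The sketch below assumes that transmission time is proportional to file size and that the 6 minutes were measured before the 14% Huffman gain; neither assumption is stated in the text, so the numbers are only indicative.

```python
current_minutes = 6.0  # time to transmit one A4 page
limit_minutes = 2.0    # accepted upper limit
huffman_gain = 0.14    # additional compression from the optimized Huffman code

# Factor by which the file would have to shrink (assumes time ~ file size).
required_ratio = current_minutes / limit_minutes

# The same requirement expressed as a fractional saving.
required_saving = 1.0 - limit_minutes / current_minutes

# Transmission time if only the Huffman gain is applied (assumption, see above).
after_huffman = current_minutes * (1.0 - huffman_gain)

print(f"needed: {required_ratio:.1f}x smaller ({required_saving:.0%} saving)")
print(f"with Huffman gain only: {after_huffman:.2f} minutes")
```

Under these assumptions the page would have to be about three times smaller, while the Huffman optimization alone leaves it at more than five minutes.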

A few of Yoko's BBQs later, the group realized that insisting on visually lossless compression was too strict and made a run for perceptually lossy JPEG compression. The idea was to compress the image much more but to move the artifacts to places where, although the image would admittedly look bad, readability (i.e., the reading performance for a human) would not be affected.

The heuristic process behind this method is interesting because it is the same one that is behind today's deep learning algorithms for image recognition. In 1994, the hardware was not ready for that.

[Figure: heuristic process for image recognition in deep learning]

This led to patent US 5883979 A, finally making the new color fax technology viable. However, the employer was not convinced by the product and pulled the plug, only to resuscitate it a couple of years later. By then the momentum had been lost and the market window was closing. In the end, the only thing that came out of the Newell Road BBQs was the book Image and Video Compression Standards: Algorithms and Architectures.

All this work was on lossy compression. The work on lossless compression came a couple of years later, 3.1 km further northwest along Middlefield Road, in Menlo Park. There, it did not take place on a patio but at Peter's kitchen table. The issue was that Unisys was enforcing its LZW patent. Peter used LZ, which was not patented, and followed it with Huffman encoding, obtaining a better compression ratio than LZW.

In compression, it is customary to standardize the decoding of the signal, not the encoding. This leads to the somewhat funny name of FLATE for the method, although the RFC calls it DEFLATE. File formats like PNG and PDF use FLATE.
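A minimal round trip through DEFLATE can be sketched with Python's standard `zlib` module, which wraps the DEFLATE stream in a small zlib header and checksum. The sample text and the compression level are arbitrary choices for illustration.

```python
import zlib

# Arbitrary, highly repetitive sample data; LZ matching thrives on repetition.
text = b"Compression on Newell Road. " * 200

# Compress with DEFLATE (LZ77 matching followed by Huffman coding) at maximum effort.
compressed = zlib.compress(text, 9)

# Lossless: decompression restores the input byte for byte.
restored = zlib.decompress(compressed)

print(len(text), "bytes ->", len(compressed), "bytes")
print("round trip ok:", restored == text)
```

Any conforming decoder can read the stream regardless of how hard the encoder searched for matches, which is the practical consequence of standardizing only the decoding side.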

There was a related event on Newell Road in 2002, when the US National Body's proposal for color preference semantics was written in Yoko's garage (references USNB-CD-DIA-044 to 051). This became part of the MPEG-21 standard for color vision deficiency in the digital item adaptation (DIA) part. However, although this is part of MPEG, it has nothing to do with compression.

This being Palo Alto, people are running all kinds of stuff in their garages. For example, at one point a neighbor was running Napster from his garage for a couple of months. That was at the time Palo Alto was running its Fiber to the Home (FTTH) pilot, when about 100 houses in our neighborhood had 100 Mbps symmetric Internet connections. Legal action was then taken by private industry; the conduits many of us ran from the curb to the utility entrance in January 2000 remain empty, and the fiber under Newell Road is dark. In our case, we get by with a 6 Mbps asymmetric VDSL line with a 300 GB data cap that costs us $77.44 per month for a double play including VoIP. On Newell Road in Silicon Valley's Palo Alto, Internet access is pretty miserable and expensive.

Newell Road was named for Dr. William A. Newell, a prominent physician in San Francisco, who bought 47 acres from Henry Seale in 1864 for a country estate. Parts of his house, located at 1456 Edgewood Drive, date back to 1866. In 2011, Mark Zuckerberg bought 1456 for $7 million. The updated carriage house and "cow barn," once part of the estate, remain next door at 1450.

Henry Seale, like John Greer, came from Ireland and started a contracting business in San Francisco with his brother, Thomas. In 1853 the Seale brothers acquired a large part of Rancho Rinconada del Arroyo de San Francisquito from the Soto heirs. The Seales at one time owned most of the land on which early Palo Alto was located.