Friday, May 25, 2007

The past ten years in digital photography

Recently I received a comment on an old 5 March 2007 post, asking for my thoughts on how digital photography's development over the last ten years has changed the way we take photographs. I am writing a new post instead of adding a comment to the old post because this page shows only the current posts, and making the topic current allows more readers to comment.

First of all, here is the comment I received:

Re: HPL-97-164. Professional Portrait Studio

Your post, though I recognize it is designed to direct people away from this older research, does make some interesting points, as well as summarizing in brief how far digital photography has come. Given that you were involved in the early test projects for digital photography, I wonder if you have any thoughts about how its development in the last ten years has changed the way we take photographs. Obviously it has made photography easier in many ways, but I suppose I am interested in the broader impact it may have had. Does the fact that it is easier, for instance, mean that photography as an art form is more accessible to more people? Since as you say current research is always built on previous research, how do you regard the progression of that research over the past ten years both from the subjective viewpoint of making better and better cameras, and the more objective viewpoint of considering how those better and better cameras have shaped our world at large?

Posted by rzrsej on 5/22/2007 12:06 AM

Photography has always had three types of users: professionals, serious amateurs, and amateurs. Recently these categories have been renamed with the trendier terms pro, prosumer, and consumer, but the distinction is the same. Professionals take photographs to make a living, be it as portraitists, photojournalists, commercial photographers, or artists. Serious amateurs are mostly motivated by artistic self-expression, but also document events and family history. Amateurs are mostly chroniclers of the lives of family and friends. Digital photography has had a different impact on each of these three categories.

Professionals act with a purely economic rationale. Time is money, and the ability to reshoot while the model is still in the studio can justify quite a high equipment investment, making professionals early adopters of technology. The way it has changed our lives is that today we see pictures of important events in remote areas a few minutes after the events happen. The boundaries of the "here and now" have become much wider, including even Mars with its rovers.

Serious amateurs adopt new technology only later, mostly once its price per unit of quality has descended to approximately the level of the technology it is replacing. For digital cameras this happened in early 2005, but for digital photography in general it is just starting to happen. Serious amateurs do not just take a picture and email it to a friend — they invest the time it takes to post-process the image, interpret it, and tell a story. In digital photography this post-processing happens on a computer, and only now are computers in these users' price range becoming fast enough to be viable, especially in terms of bus and disk speeds.

For amateurs, who are mostly chroniclers, image quality is not just megapixels and sharpness. Immediacy and candidness are just as important, if not more so. To understand this we have to look at societies with a vibrant middle class, like Japan, Korea, and China. The first sign that digital photography was getting ready was the sticker photo booth, which all of a sudden popped up all over urban areas, especially around the entrances of department stores, where teens like to hang out. The photo sticker booths introduced the concept of a fun pose to be shared immediately with friends.

The second event was the commercialization of the Internet, which in these three countries was followed immediately by the wide availability of broadband access, with email replacing the fax machine as a personal communication medium. Casio (pronounced Kashio), a company specializing in mass-produced consumer goods, realized this and came out with the first popular amateur digital camera. It allowed chroniclers to illustrate their email with fresh pictures of their loved ones.

The third event was the integration of the PDA (personal digital assistant), mobile phone, and camera. This means that everybody carries a camera all the time and can communicate pictures instantaneously. As we have become more nomadic and dispersed, digital photography allows us to still share our lives as if we were there. In February I wrote about We are all photographers now! and you may want to follow the link there to the exhibition in Lausanne.

When you compare the image quality of a phone camera to that of a standalone digital camera, you quickly note that the technology still has quite a bit of catching up to do. Sophisticated imaging pipelines will migrate to phone cameras, exotic lens designs will bring 10× zooms, and last but not least, manufacturers will have to realize that, like a watch of the same price, a phone camera needs scratch-proof sapphire glass.

While in terms of resolution we have all that is needed, there will be improvement in bit depth. Today a good camera records 12 bits per photosite (pixel), and an LCD panel also operates at 12 bits. Why then do we store and communicate only 8 bits in our files? This is the main technical innovation that Microsoft's new JPEG proposal brings us.
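
To make the arithmetic concrete, here is a minimal sketch (my own, not from the post) of what is lost when a 12-bit photosite value is squeezed into an 8-bit file: 16 adjacent tonal levels collapse into a single code.

```python
def to_8bit(v12: int) -> int:
    """Quantize a 12-bit value (0..4095) to 8 bits (0..255)."""
    return v12 >> 4  # drop the 4 least significant bits

def back_to_12bit(v8: int) -> int:
    """Re-expand 8 bits to the 12-bit range (0..4095)."""
    return (v8 << 4) | (v8 >> 4)  # replicate high bits into the gap

# Two distinct 12-bit shadow values collapse into one 8-bit code:
assert to_8bit(100) == to_8bit(110) == 6
# The round trip cannot recover the lost tonal steps:
assert back_to_12bit(to_8bit(100)) == back_to_12bit(to_8bit(110)) == 96
```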

In terms of social impact, something bigger than 12 bits is coming upon us. Remember the slide shows of the old days? The moment at a party when the host announced the projection of the vacation slides was when everybody suddenly remembered they had to run home and do their taxes. Well, it is coming back.

Non-professionals take pictures to document and chronicle their lives. The most effective way to do this is by telling stories. Computers are just becoming fast enough to finally produce stories that are compelling and do not put people to sleep like the old slide shows used to do. The new technology is called remix and the first successful product is iLife, which consists of a tightly integrated suite of tools to archive and edit photos (iPhoto) and music (iTunes), weave stories based on the images (iMovie), assemble sound tracks and create podcasts (GarageBand), and publish them instantly on the web (iWeb) or on DVD (iDVD).

The counterpart for serious amateurs is not yet quite ready for prime time, because the computers are still too expensive, but we are getting there, maybe by the end of this year. Of course, you can also do this on a machine running Linux or Windows, but it is not as well integrated and it is not preinstalled. More importantly, there will be a new breed of applications running on your cameraphone that will create a rewarding experience of enjoying remixed stories during dead time (like while commuting on the train), enabled by MPEG-A.

Monday, May 21, 2007

Movies in our eyes

A month ago jlrevilla commented on my 30 April post on non-local realism with a link to the article The Movies in Our Eyes in the April issue of Scientific American. I am finally following up on jlrevilla's suggestion. Essentially, this article is a well-written account of some of the processing taking place in the retina's early mechanisms. Anybody who still imagines the eye as a sort of camcorder should read this article, which can be found in any library.

The article is an easy read and reports on the latest results in this research area. Building on Santiago Ramón y Cajal's original work on the retina's physiology, the authors quickly fast-forward to the latest physiological findings. Very well-made illustrations allow readers to immediately grasp the complexity of the early mechanisms in the retina, like the non-intuitive fact that the rods and cones point inward toward the brain, so that the image has to be detected through all the "wet mess" in the retina.

One aspect that is maybe oversimplified is that the reader could easily assume that color vision is hierarchical, i.e., that the visual signals follow the simple model of a distal event causing a proximal stimulus in the retina, which in turn causes a brain event. In fact, there are about 26 known areas of visual processing and feedback loops exist between all of them, as indicated by the bidirectional arrows in this simple cognitive model for color appearance.

operational model for color appearance

It has been proven that input from "higher" areas is a necessary condition for some activity even in the primary visual cortex. In essence, vision is not a bottom-up process in which the "inner eye's" function is to understand what the sensory states indicate. In a top-down model, we maintain a model of the exterior visual world and then use the visual system to confirm or revise this model. This is how sleep researchers know when we are dreaming — our eyes move during so-called REM (rapid eye movement) sleep. For more, see Science 17 March 2006: Vol. 311, No. 5767, pp. 1606-1609.

However, even this is an oversimplification. For example, a rapid movement in the periphery can immediately activate the amygdala through the magnocellular pathway. In fact, if you read or write at your desk with a large CRT in your peripheral view, this is what gives you a headache: the continuous stimulation of the amygdala, each alarm then cancelled by a subconscious verification that the trigger is just a CRT.

Another oversimplification is that many sensory signals are non-correlational — a given signal does not always indicate the same property or event in the world.

Finally, it is important to notice that the signals from the retinal ganglion cells to the lateral geniculate nucleus are not amplitude encoded; they are frequency encoded, and more than one signal can be multiplexed on a single optic nerve fiber. This actually created a moment of panic at the 1993 AIC congress in Budapest, because the flicker frequency used in many old psychophysics experiments related to color perception lay between the typical modulation frequencies of chromatic and luminance signals, meaning that old experiments might actually have shut off certain channels and delivered incorrect results.

Thursday, May 17, 2007

Low Commitment Spectrophotometer Care

On March 5, I posted an entry on the aging of technical publications and how their derivatives need to be monitored. Today I write a similar update regarding the repeatability of spectral measurements.

Almost a decade ago I had the good fortune to work with Kathleen Berrigan and David Wolf on detecting very small drifts on a manufacturing line. That line was instantiated on various continents, operating at different ambient temperatures. Part of the result was published in the technical report HPL-1999-2, Low Commitment Spectrophotometer Care. This report has had quite a few downloads, and it is time to publish an important update.

The report mostly discusses a very structured procedure to follow; the technical aspect requiring the most attention is thermochromism. The statistical part was a very simple repeatability estimate based on calculating ΔE values.
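
As a rough illustration of such a simple ΔE-based repeatability estimate (the readings below are made up for this sketch, not data from HPL-1999-2), the Mean Color Difference from the Mean over repeated CIELAB readings is a common elementary statistic:

```python
import math

def delta_e_ab(lab1, lab2):
    """CIE 1976 color difference: Euclidean distance in CIELAB."""
    return math.dist(lab1, lab2)

def mcdm(readings):
    """Mean Color Difference from the Mean over repeated readings
    of the same sample: a simple repeatability statistic."""
    n = len(readings)
    mean = tuple(sum(r[i] for r in readings) / n for i in range(3))
    return sum(delta_e_ab(r, mean) for r in readings) / n

# Hypothetical repeated readings of one white tile (L*, a*, b*):
readings = [(95.1, -0.2, 2.1), (95.0, -0.1, 2.0), (95.2, -0.2, 2.2)]
print(round(mcdm(readings), 3))
```

The multivariate methods of ASTM E2214 go well beyond this single number, but this is the baseline they improve on.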

A lot of water has flowed under the bridge in the past decade, and today's faster PCs allow for much more sophisticated statistical methods than ΔE. In fact, in the meantime the ASTM has created specification E2214-02, a Standard Practice for Specifying and Verifying the Performance of Color-Measuring Instruments.

The recommended multivariate methods allow for a much more accurate assessment of repeatability. Maybe more importantly, however, multivariate methods allow you a much faster diagnosis when a problem does occur.

A more comprehensive procedure entails a steeper learning curve, and this might have held off some adopters. Fortunately, the current issue of Color Research and Application (Volume 32, Number 3, June 2007) has an easy-to-read paper by David Wyble and Danny Rich entitled Evaluation of methods for verifying the performance of color-measuring instruments. Part I: Repeatability. Their paper describes the methods in a clear way, then presents repeatability results from a long-term study of twelve commercial spectrophotometers.

If you are using what I described in HPL-1999-2, I recommend you study this paper and update your practice. There is also a Part II, but that is on inter-instrument reproducibility, which was not addressed in my technical report.

Tuesday, May 15, 2007

A sea change in intellectual asset management

An article senior editor Roger Parloff wrote in yesterday's Fortune magazine had quite an echo in the blogosphere during the last two days. Recently there have been some other events related to intellectual assets in color science that suggest we may be on the verge of a sea change in intellectual asset management.

Before I go any further, I need to make a big disclaimer. What I am writing in this post is my personal view based on what I see from my window and has absolutely nothing to do with my employer. I am just a foot soldier working in the trenches and do not have a clue of what is going on in HP regarding intellectual assets; hence do not even try to read anything in this post other than the textual words.

Bankers keep their assets in vaults and take them out only when necessary for business. Lawyers lock their intellectual assets in sturdy filing cabinets and safes every evening before they go home. We scientists take our employer’s intellectual assets home every evening, and the employers can only hope to see us back the next morning—and hopefully with our brains intact.

Society would not work well if its capitalists lived in constant fear of losing their assets. Therefore, governments came up with the concept of a patent. When technologists invent something useful, novel, and non-obvious, their employers are granted a patent protecting this invention, so they can recuperate their research investment and invest in new research. This allows society to progress.

Most of the time this system works quite well, but sometimes sand gets in the cogs. In the color community we have seen quite a bit of this, like the Schreiber patent, the blue noise halftoning issue, FlashPix, remote proofing, and so on. Often the issue can be resolved through cross-licensing or just plain licensing. Patents are true intellectual assets that have a business value and can be traded for the greater benefit of all in society.

Unfortunately our world is suffering under the increased diffusion of greed. Think for example how much more inexpensive health care would be if there were no greedy injury lawyers. With the increased conglomeration of companies and the ensuing deepening of their pockets, intellectual assets are increasingly attracting the attention of such greedy ilk, which use the proceeds for luxurious life styles instead of reinvesting them into research useful to all in society.

The phenomenon is not new—in our field it started maybe two decades ago—but recently one has the impression the matter is spiraling out of control. As a lowly technical worker, just by looking out of my window I cannot tell which patent disputes are driven by greed and which are genuine business transactions, but a number of issues have lately found a lot of echo in the press.

Besides the open-software issue mentioned in the Fortune article, the trade journals have reported disputes directly related to color science and imaging. Examples are the impact of the Forgent patent on JPEG, Canon's attempt to commercialize the surface-conduction electron-emitter display (SED) TV technology, and the litigation around LED illumination with variable correlated color temperature. As stated, let us be positive and hope they are all genuine business transactions.

In such a degradation of our ecosystem, what is a capitalist supposed to do to avoid going paranoid? The knee-jerk reaction is to use one's technology dollars for patent litigation instead of paying nerdy researchers. However, with an engineer costing $80 an hour and a litigation lawyer billing $800 an hour here in Silicon Valley, one should hope that investing in 10 scientists may be more economical than retaining an extra lawyer.

Of course, this requires that the 10 scientists be well managed. For example, they should work on things nobody else is doing, so they can realize their full potential in securing patents and other intellectual assets. When good researchers are leashed to established technologies, the result is less beneficial for society. For example, my personal bet is that without the JPEG patent litigation, Microsoft would probably not have had to invent the new HD Photo technology it recently proposed as a new JPEG standard.

In summary, if influential companies have to do CYA research to survive greedy assaults on their intellectual assets, we will see a sea change in the form of more small incremental technology improvements instead of big leaps, and higher product costs to cover the licensing and litigation fees. And when researchers go paranoid, what they fear are not lawyers but headhunters, their brains being the locus of their own assets…

Monday, May 14, 2007

Synaptic communication in the visual cortex

The traditional view of pyramidal neurons, which are excitatory, is that they can only excite their downstream target cells. A new study of the mouse visual cortex shows that cortical pyramidal neurons can elicit an inhibitory synaptic current in another neighboring pyramidal neuron.

This very complex study was performed by the Komatsu Group (mouse over the Neuroscience Department) at the Research Institute of Environmental Medicine at Nagoya University and reported in Science 4 May 2007: Vol. 316, No. 5825, pp. 758-761, in the article Specialized Inhibitory Synaptic Actions Between Nearby Neocortical Pyramidal Neurons. It appears to build on earlier research by Callaway at the Salk Institute for Biological Studies (see Nature Neuroscience 3, 701-707).

This inhibition may play a crucial role in the regulation of the cortical output signal. According to traditional views, pyramidal neurons receive inhibitory inputs via action potentials initiated in inhibitory interneurons after integration of synaptic inputs to their somatodendritic domain. Thus, synaptic transmission from inhibitory nerve terminals to layer-2/3 pyramidal cells is driven by two distinct signaling pathways:

  1. via an integration of feedforward and feedback signals in inhibitory interneurons, and
  2. more directly via output signals of nearby pyramidal cells.

The presence of interpyramidal inhibition suggests that the functional influence of inhibitory neurons can be far greater than might be predicted by their relatively small numbers. Understanding what this important role is will take quite a lot more research.

Tuesday, May 8, 2007

How to buy a printer

I am often asked for printer recommendations. Many people think there is a single metric to rate a printer, for example the resolution in dots per inch, but then when they visit a store, they are overwhelmed by the choice. Often a manufacturer has even more than one model at the same price. How do you buy a printer?

The hardware technology is very advanced for all manufacturers, so a single metric will be of no help in choosing a printer. For example, the resolution is meaningless, because there simply are no longer any printers on the market where resolution is a determining quality criterion.

Because all products are at a high quality level, price is not a criterion either—you get what you pay for. Today's markets operate at the highest efficiency level, and the competition is so fierce that printers are priced at what they cost to manufacture. There is no room for price padding or charging what the market can bear. You can be sure that all manufacturers work with an anorexic staff and use just-in-time manufacturing. Nobody can afford people or material on standby in case something unforeseen happens. Still, if manufacturer honesty is of paramount importance to you, today it has become very easy to find out how much margin there is in a printer: just look up the SG&A (selling, general, and administrative expenses) in the annual report to determine the cost of the executives, then add the dividend paid to the shareholders for their investment in the R&D and manufacturing plants.

The proof of the printer market's honest pricing is that there are no counterfeits and no generic printers, not even in emerging-economy countries. Trust me, you really get what you pay for.

How then do you decide which printer to buy? When designing printers, and many other products, manufacturers build a number of customer profiles and then design a product that best fits each profile. Therefore, the best way to buy a printer is to profile yourself, and then look at the printer specifications for your profile; the printer with the best fit is the one you want to buy. If you cannot afford it up front, usually you can lease it.

What are typical criteria to consider for a user profile? The most important one is probably how much you print. If the printer is personal and you print only a few pages a day, an ink jet printer will be most economical for a given image quality. If you print a lot, a laser printer will be more economical, because the supplies cost less.
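
The economics behind this rule of thumb can be sketched with a back-of-the-envelope break-even calculation (all prices and per-page costs below are assumptions for illustration, not vendor figures):

```python
# Assumed figures for illustration only:
INKJET_PRICE, INKJET_PER_PAGE = 80.0, 0.08   # cheap hardware, costly ink
LASER_PRICE, LASER_PER_PAGE = 300.0, 0.02    # pricier hardware, cheap toner

def total_cost(printer_price, cost_per_page, pages):
    """Total cost of ownership after printing `pages` pages."""
    return printer_price + cost_per_page * pages

# The laser pays off once its supply savings cover the price difference:
break_even_pages = (LASER_PRICE - INKJET_PRICE) / (INKJET_PER_PAGE - LASER_PER_PAGE)
print(round(break_even_pages))  # about 3667 pages
```

With these made-up numbers, a few pages a day never reaches the break-even point within the printer's life, while an office printing hundreds of pages a day crosses it in weeks.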

The second important criterion is print speed. If you are a gamer who rarely prints and then does not mind waiting a couple of minutes, a $50 printer may be all you need. However, if you are a litigation lawyer who charges $800 an hour and you are not buying a printer capable of printing at least 40 pages per minute, you may be cheating your clients. You will have to read independent printer reviews to find out what the real printer speed is for your software application, rather than how fast the print engine can mark pages, because it typically takes much longer for the software driver to image a page than for the printer hardware to print it out.

About a decade ago my wife told me we were wasting too many trees by printing only on one side of the paper, so we replaced our printers with duplex models, which can print on both sides of a page. Duplex capability may be a feature that might be important also for you. Good quality paper has little strike-through and is very suitable for duplex printing.

If the printer is to be shared, it may be handy to get one with a network card, so you do not have to turn on the computer to which it is attached when you need to print from another computer. In our home, we have a local color inkjet printer and a networked laser printer.

In inkjet printers there is a strong interaction between the paper and the ink. Modern papers consist of 60% or more so-called fillers, and the paper chemist has many degrees of freedom when designing a paper. Ink chemists have a very good understanding of paper chemistry too, and often they make a recommendation for a certain paper formula and then optimize the ink for that paper. Because of this, in an inkjet printer spending a little more on paper can make a big difference in print quality, especially when you buy the paper from the same brand as the printer. For documents that are given to customers or sent out of the house, spend a little more and use a high-quality paper, because the cost per page is in the ink, not the paper. When you print photographs, use only a photo-quality paper, because the image quality is so much better.

Even on a laser printer, where there is less interaction between paper and toner, you get much better quality for your important documents, and especially photographs (e.g., when you print brochures or term papers), when you use a glossy paper formulated for your printer. In general, however, the paper quality is less important than on an inkjet printer, except when you need a high-speed printer, because the mechanical engineer will have optimized the paper path for the manufacturer's own brand of paper, and you are less likely to get a paper jam when the paper is a little out of spec, for example when the printer is in a high-humidity environment.

If you end up regularly using two papers, one for in-house documents and one for documents that go out, you may want to add to your profile the requirement of two paper cassettes, so you do not have to keep changing the paper in the printer.

When you use multiple paper types, e.g. one for letters and one for photos, you have to set the paper type in the print dialog. If you are forgetful and end up wasting media because you printed with the incorrect settings, it might be wise to add a paper type sensor to your customer profile.

At this point you should be able to write down your user profile. Other criteria like borderless printing, wireless connectivity, input sheet capacity, etc. should be easy to pin down. Now you can check the various vendors’ Web sites and find the best printer that matches your profile. It is very likely that there will be just one or two printers ideally suited for your requirements. Your choice has become easier.

Tuesday, May 1, 2007

Natural language color editing

One of the aims of color science—at least in an industrial setting—is the communication of color, and researchers strive for effective communication. In this post we visit some new research by Dr. Geoff Woolfe, a Principal Research Scientist in the Xerox Innovation Group.

When color started appearing on workstations in the mid-Seventies, color was specified with physical dials. In computer-aided design (CAD) applications the dials were for the amounts of red, green, and blue; in electronic publishing applications they were for the cyan, magenta, yellow, and black quantities. The workstations were very expensive, and considerable resources were invested in training the users.

When color workstations came within reach of what at the time were called "casual users," it became clear that a more ergonomic user interface was needed to specify color, and in 1978 Alvy Ray Smith came up with a very efficient transformation of RGB into a triplet he called HSV, for Hue, Saturation, and Value. This model was much better, although it has a serious shortcoming: the h, s, and v quantities do not correlate well with perceived hue, saturation, and value. For example, in HSV 100% yellow and blue have the same value, which is obviously incorrect.
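
This shortcoming is easy to demonstrate with Python's standard colorsys module; the Rec. 601 luma weights below stand in as a rough proxy for perceived lightness (my choice for the sketch, not part of Smith's model):

```python
import colorsys

yellow = (1.0, 1.0, 0.0)
blue = (0.0, 0.0, 1.0)

# Both colors get the identical HSV "value" of 1.0 ...
v_yellow = colorsys.rgb_to_hsv(*yellow)[2]
v_blue = colorsys.rgb_to_hsv(*blue)[2]
assert v_yellow == v_blue == 1.0

# ... yet their perceived lightness differs greatly (Rec. 601 luma
# used here as a rough stand-in for perceived lightness):
def luma(r, g, b):
    return 0.299 * r + 0.587 * g + 0.114 * b

print(round(luma(*yellow), 3), round(luma(*blue), 3))  # 0.886 vs 0.114
```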

Another problem with HSV is that it is not clear how to specify a hue. For example, what is the h value corresponding to red? [Answer: 30] Half a dozen years later, when Maureen Stone implemented her interactive color selection tool, in addition to sliders for RGB, HSV, and CMYK she also added a box to specify a color by name. This leads to the following operational model for color appearance:

a cognitive model for color perception

Maureen based her implementation on the ISCC-NBS Method of Designating Colors and a Dictionary of Color Names. Her tool worked very well, but for her colleague Frank Crow, who incorporated the tool in his 3-D World editor, a color solid divided into 267 color-name blocks was too coarse. At his instigation, Maureen beefed up her parser to allow iterating among modifiers, so that a user could, for example, specify a moderate reddish orange, but could then add dark and grayish to make it a grayish dark moderate reddish orange.
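
The modifier-accumulation idea can be sketched roughly like this (a toy reconstruction of my own, not Maureen Stone's actual parser, with an assumed modifier vocabulary much smaller than the real ISCC-NBS set):

```python
# Assumed modifier vocabulary for illustration; the real set is larger.
MODIFIERS = {"dark", "light", "grayish", "moderate", "vivid", "deep"}

def parse_color_phrase(phrase: str):
    """Split a phrase into accumulated modifiers and a base hue name."""
    words = phrase.split()
    mods = [w for w in words if w in MODIFIERS]
    base = " ".join(w for w in words if w not in MODIFIERS)
    return mods, base

mods, base = parse_color_phrase("grayish dark moderate reddish orange")
assert base == "reddish orange"
assert set(mods) == {"grayish", "dark", "moderate"}
```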

A couple of years later I replaced the HSV sliders in Maureen's tool with sliders for L*, C*, and hab, i.e., the correlates for lightness, chroma, and hue in CIELAB. Although the new sliders were much more usable, the color name specification mode remained more user-friendly. More recently, in the tool for pre-flighting variable-data print jobs I presented at AIC 2005 in Granada, I used the Coloroid System color names to assess the readability of colored text on a colored background.

Fast-forward to 30 April 2007. Geoff Woolfe has substantially improved on our now crude methods. First of all, he no longer constrains himself to a rectangular tiling of hue leaves, which we used for performance reasons. Dr. Woolfe partitions the color name space using a Voronoi diagram, which yields a more natural model of color naming, and he achieves performance by storing his data in a kd-tree.
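
The core of such a scheme is nearest-neighbor color naming, which a brute-force sketch makes clear (the names and CIELAB centroids below are illustrative placeholders of my own, not Dr. Woolfe's data; his system partitions the space with a Voronoi diagram and uses a kd-tree so the lookup scales to thousands of names):

```python
import math

# Illustrative (name, CIELAB centroid) pairs, not real dictionary data:
NAMED_COLORS = {
    "vivid red": (53.0, 80.0, 67.0),
    "vivid yellow": (97.0, -21.0, 94.0),
    "vivid blue": (32.0, 79.0, -108.0),
    "moderate gray": (53.0, 0.0, 0.0),
}

def nearest_name(lab):
    """Return the name whose centroid is closest in CIELAB (ΔE*ab)."""
    return min(NAMED_COLORS, key=lambda name: math.dist(NAMED_COLORS[name], lab))

print(nearest_name((50.0, 70.0, 60.0)))  # vivid red
```

A kd-tree replaces the linear `min` scan with a logarithmic search, which is where the performance comes from.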

The second substantial improvement is in how the user performs a selection, which is by specifying a color by name and then iterating with the system, which interactively generates a mask from the color name by analyzing the samples in the image.

This is what I understand from his paper "Natural Language Color Editing," presented yesterday at the ISCC Annual Meeting in Kansas. I used to attend ISCC's Annual Meeting every year, but in today's anorexic companies such a luxury has not been possible for several years, so I am unable to report more details. However, the author concludes his paper with:

"The verbal presentation of this paper will discuss the details of how mappings between verbal descriptions and numerical color spaces are achieved and demonstrate the results that can be obtained using such a system."

If you can find attendees, you can ask them for the details, or maybe some of them are reading this post and can add comments.

Thank you to Neil Gunther for the pointer.