Wednesday, April 11, 2007

Stuffing the toolbox

In industrial and government research organizations there are often religious wars on what tools researchers should have in their toolboxes. In part this is due to the onerous purchasing processes, which tend to have researchers cling to whatever tools they have in a sort of a conservative reflex—need to pound a nail? Use your shoe! In academia the situation is healthier, because generous educational discounts allow researchers to use whatever tools allow them to accomplish their job most efficiently by the deadline.

Every transition in your research is a good time to look at your toolbox and reconsider what you have. If I did not keep revisiting my toolbox, I would still be doing my backups on paper tape; after all, it would still work, at least sort of.

Artisans of days past used to build their own tools at the beginning of their careers and then keep improving them, because this was their competitive advantage over their colleagues. This is not a good paradigm for a modern research lab, because the tools are so sophisticated that they take a lifetime to build. I would not even advise anybody to build something as elementary as an integrating sphere; it would take you ages to get it completely smooth, and you would probably never figure out where to mount the baffles, let alone determine their geometry.

Every color scientist should have a spectrophotometer. Measuring color is difficult and it requires a lot of intuition to assess the correctness of a measurement; the only way to build intuition is through daily practice. The instrument should be regularly tested, so you become confident in it but not overconfident. Do use your spectrophotometer in emission mode to regularly calibrate your monitor, because that is also a tool in which you need to be able to have confidence.

You should periodically self-administer the Farnsworth-Munsell 100 hue test, so you know what you can see and you know when you are getting out of shape.

The other tools depend on your specific area in color research. In this post, I will focus on the software, because you will have to use software tools and you will have to write software. Let us first look at the programming environment.

What you use depends on your deliverable. When your deliverable changes, your programming tools should be reconsidered. If you are doing wet color science, your deliverable will be an experimental procedure. Your software will be used to design the experiment, run it, and evaluate it. A nice recent example of wet color science is in the 30 March 2007 issue of Science, where Buschman and Miller developed a novel electrode system that allowed them to record across up to 50 channels simultaneously, allowing them to study top-down versus bottom-up control of attention in the prefrontal and posterior parietal cortices.

In such a case, the best strategy is to ask the vendor of the instrument you are interfacing with for a recommendation, because vendors often choose one system for which they write drivers and demo software first. The MathWorks and Wolfram Research are two software vendors with interpretive environments that allow you to cut software development to a minimum and get to your computational results quickly. Your organization probably has a site license for one or the other, so you do not need to go through a purchasing process. There are a number of freely available toolboxes for this software that can give you a running start; just run a search on the Internet. If you end up using MATLAB, invest in Westland and Ripamonti's computational color science book.

If you are doing computational color science, sooner or later you will have to deliver a system, so you should have a full-fledged system in your toolbox. First, talk to your colleagues. Have they built a shared set of tools? Sometimes a colleague who used to work on your new project has a system, no longer wants to maintain it, and would be very happy to have you take it over. Second, since it has become hard to get budget for tools, look for open source software; SourceForge is the best place to start looking. If you find a pertinent system, download it and give it a spin. If you like it, join the team as a contributor and get engaged.

If you have to develop your own system, there are a few points to note. Do not use an integrated development environment (IDE) targeted at software developers, because you do not need all the bells and whistles, and over time you would spend too much time keeping up with the stream of feature updates and dealing with complex team version management tools. Ideally you want an IDE that allows you to easily switch platforms. Check out Eclipse.

For the programming language, you want a language that is platform independent and commonly used in open source projects. Inheritance is a good thing to have because you can then, for example, implement a generic color model operator, which you then subclass every time you need to add a new color space. A garbage collector is also handy because then you can forget about memory management. For high-level work Java and C++ are popular choices; for low-level work OpenGL is a popular choice. If you have to write stand-alone Java applications, use SWT with JFace and, if you like visual programming, SWT Designer.
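To make the inheritance idea concrete, here is a minimal Java sketch of a generic color model operator that is subclassed once per color space. All class and method names (ColorSpace, LinearSrgb, XyY, convertTo) are hypothetical and chosen only for illustration, and the choice of CIE XYZ as the connection space is my assumption. The matrices are the commonly quoted rounded values for linear sRGB with a D65 white point; the sRGB transfer curve is deliberately left out to keep the sketch short.

```java
/** Generic color model operator: each color space converts to and from CIE XYZ. */
abstract class ColorSpace {
    abstract double[] toXyz(double[] color);
    abstract double[] fromXyz(double[] xyz);

    /** Written once here and inherited by every subclass: convert through XYZ. */
    double[] convertTo(ColorSpace target, double[] color) {
        return target.fromXyz(toXyz(color));
    }
}

/** Linear-light sRGB (no transfer curve), D65, rounded matrices (assumed values). */
class LinearSrgb extends ColorSpace {
    private static final double[][] TO_XYZ = {
        {0.4124, 0.3576, 0.1805},
        {0.2126, 0.7152, 0.0722},
        {0.0193, 0.1192, 0.9505}
    };
    private static final double[][] FROM_XYZ = {
        { 3.2406, -1.5372, -0.4986},
        {-0.9689,  1.8758,  0.0415},
        { 0.0557, -0.2040,  1.0570}
    };

    @Override
    double[] toXyz(double[] rgb) {
        return multiply(TO_XYZ, rgb);
    }

    @Override
    double[] fromXyz(double[] xyz) {
        return multiply(FROM_XYZ, xyz);
    }

    /** Multiplies a 3x3 matrix by a 3-vector. */
    private static double[] multiply(double[][] m, double[] v) {
        double[] out = new double[3];
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                out[row] += m[row][col] * v[col];
            }
        }
        return out;
    }
}

/** CIE xyY: chromaticity coordinates (x, y) plus luminance Y. */
class XyY extends ColorSpace {
    @Override
    double[] toXyz(double[] xyY) {
        double x = xyY[0], y = xyY[1], lum = xyY[2];
        if (y == 0.0) {
            return new double[] {0.0, 0.0, 0.0}; // chromaticity undefined: treat as black
        }
        return new double[] {x * lum / y, lum, (1.0 - x - y) * lum / y};
    }

    @Override
    double[] fromXyz(double[] xyz) {
        double sum = xyz[0] + xyz[1] + xyz[2];
        if (sum == 0.0) {
            return new double[] {0.0, 0.0, 0.0}; // black has no chromaticity
        }
        return new double[] {xyz[0] / sum, xyz[1] / sum, xyz[1]};
    }
}

class ColorModelDemo {
    public static void main(String[] args) {
        ColorSpace rgb = new LinearSrgb();
        ColorSpace xyy = new XyY();
        // Convert linear sRGB white to xyY using the inherited generic operator.
        double[] white = rgb.convertTo(xyy, new double[] {1.0, 1.0, 1.0});
        System.out.printf("x=%.4f y=%.4f Y=%.4f%n", white[0], white[1], white[2]);
    }
}
```

Adding a new color space then means writing one more subclass with its two conversion methods; convertTo works unchanged between any pair of spaces.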

A recent general book on computational color science is Kang's Computational Color Technology, published by SPIE Press.

Interpreted systems are not a good choice when you have to develop a system. They are difficult to maintain, and even though all of them have good documentation facilities, these are not used, so after a year you have forgotten all the details of your implementation. Since there is no error handling, you cannot debug the system, and you would spend less time just rewriting it than fixing it. Use the right tool for the job!

If you develop for color management, SourceForge has a color management system called Little cms that will give you a running start.

If you develop for the Web, stay away from scripting like the plague. Scripts are not only impossible to debug, they also consume an input, try to guess what it means, and then execute powerful server code to process putative data. There is a legion of entrepreneurs who exploit this “feature” to run businesses for spam, phishing, identity theft, peer-to-peer sharing, etc., hijacking your server. This would piss off your organization, and instead of working on your research you would be patching up holes in your system.

The best approach for color science on the Web is to rely on a robust system like Apache and Tomcat. Program in a strongly typed language and religiously perform consistency checks on the input. Never reuse a string you receive from the Internet; just use it to create a new string with fresh bytes, because you have no idea of what a creative entrepreneur can hide in a string. Initialize all your variables, check boundaries, and religiously catch all exceptions.
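The input hygiene described above can be sketched in a few lines of Java. This is only an illustration under assumptions of my own: the class name SafeColorInput, the method parseRgb, the "r,g,b" text format, and the 32-character length limit are all hypothetical. The sketch copies only whitelisted characters into a fresh string, checks boundaries, and catches the parsing exception.

```java
final class SafeColorInput {
    /** Hypothetical upper bound on the accepted input length. */
    private static final int MAX_LENGTH = 32;

    private SafeColorInput() {
    }

    /**
     * Parses untrusted text of the assumed form "r,g,b" into three integers
     * in the range 0..255. Throws IllegalArgumentException on anything else.
     */
    static int[] parseRgb(String untrusted) {
        if (untrusted == null || untrusted.length() > MAX_LENGTH) {
            throw new IllegalArgumentException("missing or oversized input");
        }
        // Never reuse the received string: copy whitelisted characters
        // (ASCII digits and commas only) into a fresh one.
        StringBuilder clean = new StringBuilder(untrusted.length());
        for (int i = 0; i < untrusted.length(); i++) {
            char ch = untrusted.charAt(i);
            if ((ch >= '0' && ch <= '9') || ch == ',') {
                clean.append(ch);
            } else {
                throw new IllegalArgumentException("unexpected character at index " + i);
            }
        }
        // Limit -1 keeps trailing empty fields so "1,2," is rejected below.
        String[] parts = clean.toString().split(",", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected exactly three components");
        }
        int[] rgb = new int[3]; // Java initializes the elements to 0
        for (int i = 0; i < 3; i++) {
            if (parts[i].isEmpty() || parts[i].length() > 3) {
                throw new IllegalArgumentException("component " + i + " has a bad length");
            }
            int value;
            try {
                value = Integer.parseInt(parts[i]);
            } catch (NumberFormatException e) {
                // Should be unreachable after the checks above; caught anyway.
                throw new IllegalArgumentException("component " + i + " is not a number");
            }
            if (value > 255) {
                throw new IllegalArgumentException("component " + i + " is out of range");
            }
            rgb[i] = value;
        }
        return rgb;
    }
}
```

A caller would wrap SafeColorInput.parseRgb(...) in a try block and answer a rejected request with an error page rather than letting the exception escape.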

Your data is valuable and you will be glad to be able to access it in the future to mine it. So store it in a database you can search and access from any programming environment. Do not forget to include all the metadata you can, because the more metadata you have, the more valuable your data will be in the future. A good relational SQL database is MySQL. By the way, spreadsheets are for modeling and trying out (financial) models and scenarios; they are not for storing data.

Periodically, when you discover something interesting, you have to publish your results. If you are using Mathematica or MATLAB, use the built-in editor to take advantage of the seamlessly integrated systems. Otherwise, there is still no better technical typographic design system than LaTeX, which uses the best-in-class TeX typesetter. Today you no longer have to type your text in Emacs, but can use a modern GUI for LaTeX, such as TeXShop.

If you write a long document like a book or slides for a course, you need a document preparation system that allows you to break the document into a file per chapter or module, can manage hypertext links across files, autonumber across files, generate indices, tables of contents, etc., typeset formulas, and include floating figures by reference. Unfortunately only one such generally available system has ever been implemented, and its development stopped a decade ago, so the GUI is quite ancient. However, if your organization still has a FrameMaker site license, you are in luck, even if it is for an old version, because not much has changed for technical and scientific document preparation after release 4. Most important, it always works and you never get surprises.

Finally, if you are not using an integrated system, you need graphical software. XGRAPH will create your plots. Xfig can help you draw any kind of illustration. If you need to draw many diagrams, you may want to invest in a diagramming application like OmniGraffle. Last but not least, there is also an inexpensive application for fitting your data, called proFit.

That is my contribution so far. Now it is up to you to share yours: please write a comment with your tool recommendations.