The Mostly Color Channel: Binarization

Tuesday, October 16, 2012

Binarization

This year I have been working on document information retrieval, which is about as far from color as you can imagine. Business documents are pretty dry black-and-white items, so the first step, before even doing optical character recognition, is to binarize the document images so we can work efficiently with bitmaps. In the old days binarization was relatively easy, because non-uniform scanner illumination can almost always be compensated for easily (see US patent 5,901,243).

Today binarization is much harder, because an increasing number of documents are imaged with digital cameras, most often the kind found in smartphones. Much work has gone into extending existing binarization algorithms to text in pictorial images, alas with little success. It turns out that a completely different algorithmic approach is required, as was recently published in the paper

Yan Wang and Chuanjiang He, Binarization method based on evolution equation for document images produced by cameras, Journal of Electronic Imaging 21 (2012), no. 2, 023030

Here is the abstract:

We present an evolution equation-based binarization method for document images produced by cameras. Unlike the existing thresholding techniques, the idea behind our method is that a family of gradually binarized images is obtained by the solution of an evolution partial differential equation, starting with an original image. In our formulation, the evolution is controlled by a global force and a local force, both of which have opposite sign inside and outside the object of interests in the original image. A simple finite difference scheme with a significantly larger time step is used to solve the evolution equation numerically; the desired binarization is typically obtained after only one or two iterations. Experimental results on 122 camera document images show that our method yields good visual quality and OCR performance.
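
To make the mechanism in the abstract a bit more concrete, here is a toy sketch in Python. It is not the equation from the paper, which the abstract does not spell out. I am assuming a simple stand-in where the global force is the pixel's difference from the mean of the original image and the local force is its difference from a box-filtered neighborhood mean. The names evolve_binarize and local_mean and the parameters dt, alpha, beta and radius are all hypothetical. The only point is to show how an explicit finite difference step with a large time step saturates pixels to 0 or 1 within an iteration or two.

```python
def local_mean(img, radius):
    """Box-filter mean over a (2*radius+1)^2 window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = sum(img[j][i] for j in range(y0, y1) for i in range(x0, x1))
            out[y][x] = total / ((y1 - y0) * (x1 - x0))
    return out


def evolve_binarize(img, dt=50.0, alpha=1.0, beta=1.0, radius=1, max_iters=5):
    """Evolve a grayscale image (rows of floats in [0, 1]) toward a binary one.

    Assumed toy model, not the paper's equation:
        du/dt = alpha * (u - global_mean) + beta * (u - local_mean(u))
    solved with explicit Euler steps of size dt and clipping to [0, 1].
    Returns the evolved image and the number of iterations used.
    """
    u = [row[:] for row in img]  # work on a copy, leave the input untouched
    h, w = len(u), len(u[0])
    # Global reference level, taken once from the original image.
    global_mean = sum(map(sum, u)) / (h * w)
    for it in range(1, max_iters + 1):
        m = local_mean(u, radius)
        for y in range(h):
            for x in range(w):
                # Both forces are negative for pixels darker than their
                # reference level and positive for brighter ones.
                force = alpha * (u[y][x] - global_mean) + beta * (u[y][x] - m[y][x])
                u[y][x] = min(1.0, max(0.0, u[y][x] + dt * force))
        # Stop as soon as every pixel has saturated to black or white.
        if all(v in (0.0, 1.0) for row in u for v in row):
            return u, it
    return u, max_iters


if __name__ == "__main__":
    # A 5x5 light page with one dark "ink" pixel in the middle.
    page = [[0.8] * 5 for _ in range(5)]
    page[2][2] = 0.2
    result, iterations = evolve_binarize(page)
    for row in result:
        print("".join("#" if v == 0.0 else "." for v in row))
    print("iterations:", iterations)
```

With a large dt even a small force pushes a pixel past the clipping bounds, which is why so few iterations are needed. This toy version does nothing clever about shadows or uneven lighting; how the forces are built to cope with camera images is precisely the contribution of the paper.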

About this blog

The Internet is an amalgam of forms blurred under epistemological pressures. In Søren Kierkegaard’s words, under this flat shower of leveled information, where everybody is interested in everything and nothing is too trivial or too important, people just accumulate information and postpone decisions indefinitely, i.e., nobody takes action and nobody is responsible for truth — there is no mastery, just gossip. He called this the æsthetic sphere of existence, exhorting us to evolve to the ethical sphere, where we do not just accumulate information but take action and make commitments. Blogs are instruments to overcome flatness by creating opportunities for vertical activities. In this sense this blog is a view from my window — a collection of tidbits I judged relevant to computational color science and in general to the promotion of scientific excellence in areas of strategic importance for the future of research, economy and society.