From Stephen Wolfram's blog comes a post about natural language processing and Mathematica 8. First, I like that he used an example that separates color channels.
(image via wikimedia: Category:Images with Mathematica source code)
Second, in addition to mentioning color, he has the following to say about corpora:
"One issue that we have faced is a lack of linguistic corpora in the area. (...) But as of yesterday we now have an important new source of data: actual examples of natural language programming being done in Mathematica 8. And taking a glance right now at our real-time monitoring system for the Wolfram|Alpha server infrastructure, I can see that very soon we’re going to have lots of data to study."
Looks like an interesting effort to follow.