Monday, November 29, 2010

Crowdsourcing numbers

www.trackyourhappiness.org

When we want to perform experiments with a large number of subjects, we resort to crowdsourcing. For his color naming experiment, Nathan Moroney uses the Web. An alternative approach is to write a smartphone application that periodically prods the owner for data.

Researchers often hide the size of their data. Matthew Killingsworth and Daniel Gilbert recently published a paper in which they reveal how much data they were able to collect with a smartphone app. Over an undisclosed time interval, they collected almost 250,000 responses from about 5000 subjects in 83 countries, ranging from 18 to 88 years of age and representing 86 occupations.

For comparison, the xkcd color survey harvested over five million color terms across 222,500 user sessions. Nathan's thesaurus has so far served almost 300,000 color terms in English alone.
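To put these totals on a comparable footing, here is a rough back-of-the-envelope sketch that divides each total by the number of participants. It uses only the rounded figures quoted above ("almost 250,000", "about 5000", "over five million"), so the results are approximations. Note also that a subject and a user session are not the same unit. The names `surveys` and `per_participant` are made up for this illustration.

```python
# Rounded figures quoted in the post; treat them as approximations.
# Each entry is (total items collected, number of participants).
surveys = {
    "Killingsworth & Gilbert app": (250_000, 5_000),   # responses, subjects
    "xkcd color survey": (5_000_000, 222_500),         # color terms, user sessions
}

def per_participant(total_items, participants):
    """Average number of items contributed per participant."""
    return total_items / participants

rates = {name: per_participant(*counts) for name, counts in surveys.items()}

for name, rate in rates.items():
    print(f"{name}: about {rate:.0f} items per participant")
```

With these rounded inputs, the smartphone app comes out at roughly 50 responses per subject and the xkcd survey at roughly 22 color terms per session.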

One of the worries in crowdsourcing is the number of disruptive subjects. In the calibrated lunch color naming experiment, this number turned out to be surprisingly low: 4% of the participants. For Killingsworth and Gilbert, with their more intrusive and possibly obstreperous method, the data came from 2250 adults (58.8% male, 73.9% residing in the United States, mean age of 34 years). The details are in their paper's supporting online material.

By the way, the topic of their research is not color terms but stimulus-independent thought, also known as mind wandering. The outliers in their data are the subjects who were making love when the application woke up to poll them about their happiness status. This is also an unexpected response, as we would think a very happy person would not answer the phone during such an activity…

You can find the paper here: Science 12 November 2010: Vol. 330 no. 6006 p. 932