The debate words those Democrats used
Well, The Times' Ben Welsh has done it again. For the third time this week, he's created a word cloud examining the words used over time in public, this time by Sens. Hillary Clinton and Barack Obama in Thursday night's Democratic presidential debate in Hollywood.
Ticket readers will recall this website published Welsh's creative word clouds twice before this week, once on the Republicans' debate and once examining the word patterns in all of President Bush's State of the Union Addresses.
Just put your cursor on the slider here and move it to the left, going back in time through 18 Democratic debates to last April 26 in South Carolina. Watch how the choice of words changes; the bigger the letters the more often they were used in that debate.
-- Andrew Malcolm



These are cool, and I appreciate the work that goes into them. As a suggestion, I think it would be more meaningful to compare the frequency of each word in a given debate to the frequency of that word over a big corpus of English-language debates. For example, "think" is a popular word when people opine ("I think...") but it was probably not much more common in these debates than in all English-language debates over the last half-century. It is large in almost every single debate you've analyzed, but not terribly informative in any of them.
I see that common words like "a", "the", "of", "for", et cetera are omitted, and that is good, but a further comparison like the one I've suggested might do more in that regard. Also, if you can do the analysis I suggest, you could show words that are underused as well. That might be even more interesting.
Anyway, thanks for all the coverage and work. I hope you don't take my suggestion as a complaint -- it certainly isn't!
Posted by: Keith Henderson | February 01, 2008 at 01:17 PM
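The comparison suggested in the comment above could be sketched roughly as follows. This is only an illustration, not how the word clouds on this page were built: the function names, the tokenizing rule and the small `floor` value (used to avoid dividing by zero when a word never appears in the reference text) are all assumptions. A ratio well above 1 marks a word used more heavily in the debate than in the reference corpus, and a ratio near 0 marks an underused word.

```python
import re
from collections import Counter


def word_freqs(text):
    """Return each word's share of all words in the text (hypothetical helper)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def relative_usage(debate_text, reference_text, floor=1e-6):
    """Ratio of a word's frequency in the debate to its frequency in a reference corpus.

    `floor` is an assumed smoothing value so that words missing from the
    reference corpus do not cause a division by zero.
    """
    debate = word_freqs(debate_text)
    reference = word_freqs(reference_text)
    vocabulary = set(debate) | set(reference)
    return {
        word: debate.get(word, 0.0) / max(reference.get(word, 0.0), floor)
        for word in vocabulary
    }


# Tiny made-up texts, for illustration only.
debate = "i think health care i think jobs"
reference = "i think we think they think health"
ratios = relative_usage(debate, reference)
for word in sorted(ratios, key=ratios.get):
    print(word, round(ratios[word], 2))
```

In a cloud built this way, "think" would shrink wherever it is no more common than usual, while words with a ratio near zero could be listed separately as underused.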
Was the word "I" omitted? I know Senator Clinton said that word several times in last night's debate, but I don't see it in the tag cloud. It may be a common word, but as many times as she repeated it, she showed a streak of egoism.
Posted by: Jerry Tsai | February 01, 2008 at 05:27 PM
Keith, Jerry,
If you're interested in reviewing American political rhetoric over a larger sample, you should check out Chirag Mehta's Presidential Speeches tag cloud at the link below. Our clouds are created with my modification of an open-source application he developed.
http://chir.ag/phernalia/preztags/
If you want to get a look at the most common words, with omissions, you might enjoy wordcount.org, which says it's plotting out the most common words in the English language.
In response to your question, Jerry: yes, the word 'I' was omitted from our debate analysis. Deciding what to keep and what to toss is a judgment call I made, for better or worse.
The angle you raise is interesting, though. Let's tease it out. If you wanted to measure the number of self-referencing words the candidates used, what words would you need to pull? I, me, my? What else?
Ben Welsh.
Posted by: Ben Welsh | February 02, 2008 at 08:38 AM
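One way to try the self-reference count raised in the last comment is sketched below. The word list is an assumption, not a settled answer to the question asked above: it starts from "I, me, my" and adds "mine" and "myself" as guesses. The function name is hypothetical, and the tokenizer splits contractions so that "I'm" and "I've" still count as a use of "I".

```python
import re

# Assumed list of self-referencing words; adjust to taste.
SELF_WORDS = {"i", "me", "my", "mine", "myself"}


def self_reference_rate(text, self_words=SELF_WORDS):
    """Return (self-referencing word count, total word count, share of total)."""
    # Splitting on non-letters turns "I'm" into "i" and "m", so the "i" is counted.
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words)
    hits = sum(1 for word in words if word in self_words)
    share = hits / total if total else 0.0
    return hits, total, share


# Made-up sentence, for illustration only.
hits, total, share = self_reference_rate("I think my plan is good. I'm ready.")
print(hits, total, round(share, 2))
```

Comparing the share rather than the raw count keeps a candidate who simply spoke longer from looking more self-referential.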