Wednesday, June 25, 2008

Things to come

For the curious, here are the improvements I’m most likely to make in the coming weeks.

  • Completely rewrite Wordle’s handling of “stop words”, common words that, for most uses, should not appear in the visualization. Currently, there is one huge list. There needs to be a separate list for each language (easy), and Wordle needs to make a heuristic guess as to which language is in use (harder).
  • Custom palettes.
  • “Next”, “Previous”, and “Random” buttons on the wordle-viewing page.
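
For the curious, here is one way the stop-word item above might work. This is only a rough sketch in Python, not Wordle's actual code: the tiny word lists and the names STOP_WORDS, tokenize, guess_language, and word_frequencies are all made up for illustration. The guess is simply whichever language's stop words show up most often in the text.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word lists, one per language.
# Real lists would be far longer; these exist only to illustrate the idea.
STOP_WORDS = {
    "english": {"the", "and", "of", "to", "a", "in", "is", "that", "it"},
    "french": {"le", "la", "les", "et", "de", "un", "une", "est", "que"},
    "spanish": {"el", "la", "los", "y", "de", "un", "una", "es", "que"},
}


def tokenize(text):
    """Lower-case the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def guess_language(text):
    """Return the language whose stop words occur most often, or None."""
    counts = Counter(tokenize(text))
    scores = {
        lang: sum(counts[word] for word in words)
        for lang, words in STOP_WORDS.items()
    }
    best = max(scores, key=scores.get)
    # If no stop word from any list appears, decline to guess.
    return best if scores[best] > 0 else None


def word_frequencies(text):
    """Count words, dropping the stop words of the guessed language."""
    stop = STOP_WORDS.get(guess_language(text), set())
    return Counter(word for word in tokenize(text) if word not in stop)
```

Calling word_frequencies on a piece of text gives the counts that would drive font size. Short texts, or texts that mix languages, would probably fool a guess this simple, so a manual language override seems worth having alongside it.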

There are other design problems I’m aware of—no search!? no [insert your favorite language here]!?—and I’ll get to them as soon as I’m able.

Please keep your emails coming. They’ve been critical in showing me where the flaws lie.

20 comments:

mark said...

Jonathan, what a fun piece of code.. I cannot get a question mark to show. Any chance you can add that to the next release? Thanks..
Mark

thingything said...

You are the coolest guy in the world. I'm obsessed with Wordle, have done everything from online conversations with friends to Obama and MLK's landmark speeches on race. Totally want to print some of these out and frame them for my room - thanks so much for this amazing software!!

Rina said...

Endlessly fascinating...it makes poetry of anything you type in. Question though - the font colors are not so attractive...is there a way in the works to use a regular palette of colors that the user picks? The black background is awesome, but doesn't print well. Would also love to see punctuation show up. AND, when I enter just one, two, or three words in plural- i.e. word word word WORD WORD WORD Word Word Word - they only show up once. I'd like a nice melange of them, but only get the one showing...

I am a Wordler!

Jonathan Feinberg said...

Mark,

Wordle is all about words; punctuation is a necessary casualty.

Rina,

I have created a new FAQ to address your question about multiple words. But I'd think that the blog post you're commenting on would cover your other question.

CG said...

What fun!

Would be kinda neat to see some of the stats when hovering over each of the words. Maybe something tool-tippy?

Jonathan Feinberg said...

cg,

I'm not sure there *are* any stats. There's only word frequency, and that's indicated by font size. What did you have in mind?

CG said...

Yeah, I suppose you're right, that's not too interesting. It's just word count? I was thinking that the font size gives a relative relationship between the words, but the number would show the actual count.

Rina said...

Yeah, I missed that little blurb about the duplicate words, thanks. Maybe I missed something else about the colors - I'll go back and read it more carefully.

How long has this been active? Is it new? The blurb I saw on Boing Boing was the first I heard about it

kate said...

Can i just ask you a very basic question Jonathan...how do i put a wordle directly into my blog?? i tried but have only managed to put the link in... yet i see other blogs with the complete package..i.e. the way it looks as a wordle..
thanks for your creative help with this.. it's brilliant by the way!
kate

Laura said...

I would like to add words accompanied by a number to indicate frequency, so that I can see how many times a particular term was used in a search without needing to add the term to the application the number of times it was searched. Does this make sense? Would anyone else be interested?

g_google said...

Hi,

I'm really impressed with Wordle. In your FAQ regarding JSON interfaces, Flickr provides one:
http://www.flickr.com/services/api/response.json.html

It would be interesting if Wordle was able to do something similar to this:
http://www.flickr.com/photos/chimpaction/tags/

But obviously, in a more attractive and stylish way ;)

Graham

Jonathan Feinberg said...

g_google,

Unfortunately, flickr's tag API is useless, as it doesn't provide tag counts. It also requires a cumbersome API key. Most foolishly, it requires the weird flickr ID format, instead of a sensible user name. If you know otherwise, please do correct me.

J Cubed said...

The language guessing shouldn't be too hard: just match on stop words!

OgleOodles said...

Jonathan, I am having so much fun with your program. My friends and I have now used it for an online scrapbook. It's so cool. :)

Anonymous said...

Could you include an option in the future for excluding short words from Wordle, letting the user set the minimum number of letters a word must have to be included, plus an additional option to include all-capital-letter words anyway (like XML, DOC, SOAP ...)?
It might be easier than making a Stop words list.
Otherwise great thing.

Paul Taylor said...

Jonathan, love the wordle. Regarding the new code to exclude common words, could you make that an option rather than the default? I like to include the common words, and with the exclusion as default, it's an extra step to include them. I also like to go back to the keywords and modify and re-create the wordle, where I then have to allow common words again.

Maybe preferences? Maybe memory during my session?

How about adding a URL? I see that punctuation is a casualty, but I'd like to include .com or .net in my wordles.

Great app and thanks!

Anonymous said...

Wordle is incredibly cool! I'd love to be able to include numbers as well; although I realize words are the emphasis, it'd be really nice to be able to include numbers for some things (like years). Possible?

Rina said...

Re the previous post re numbers - I had no trouble having dates show up, like June 30 2007. I just used the ~ between each, so try

June~30~2007. No commas though.

I do agree - to use this as creatively as possible, punctuation would be nice. I'd love to do this to illustrate poems, or speeches, or just nice quotes.

Gilbert said...

Can I suggest that wordle is not case sensitive? wordle, Wordle and WORDLE would then be counted as the same.

Max said...

It would be great with some more four colour palettes that are good for printing (both on paper and shirts). Right now you have five colors and RGB which makes it quite expensive to print stuff.