NPR had an interesting piece over the weekend about "Stats Monkey," a project at Northwestern University's Intelligent Information Laboratory that has computers generate sports stories based solely on a game's statistics.
From the Northwestern Stats Monkey website:
Imagine that you could push a button, and magically create a story
about a baseball game.
Just like blogging!
That’s what the Stats Monkey system does.
Given information commonly available online about many games—the box
score and the play-by-play—the system automatically generates the text
of a story about that game that captures the overall dynamic of the
game and highlights the key plays and key players. The story includes
an appropriate headline and a photo of the most important player in the
game.
The NPR piece notes that the system is still in its early stages and produces only basic stories without much color.
For now, StatsMonkey's stories are fairly basic play-by-plays — the
program isn't yet able to capture unexpected events or subtle details.
For instance, StatsMonkey wouldn't be able to capture details like Babe
Ruth pointing to the outfield before hitting a home run in Game 3 of
the 1932 World Series, or Kirk Gibson hobbling around the bases in the
1988 World Series.
Or Cliff Lee catching a ball behind his back.
The goal of the project is not to replace sports writers (so they say) but rather to generate stories where there would otherwise be none, such as for every Little League game ever played.
There is a sample computer-generated story at the bottom of the article here, and it's pretty much what you would expect: dry. But give those geeks up at Northwestern some time and they'll be able to build a database consisting of every anecdote Bill Conlin has ever told. That database should be completed in 2076.
>>Program Creates Computer Generated Sports Stories [NPR, audio available]