Vocabulary Analysis of the Gutenberg Collection
I found this page while looking for material on vocabulary density, which is relevant to books designed for language learners. The guy who wrote it is also interesting, in that he took a non-traditional career path into academia.
He presents a vocabulary-based analysis of ~2000 Gutenberg texts, the kind of thing I like to muck around with. It is “unpublished” work, so it lacks a few things, like references and axis labels, which makes it less useful than it otherwise might be.
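For anyone who wants to muck around with this too, here is a minimal sketch of one common way to put a number on vocabulary density: the type-token ratio, i.e. distinct words divided by total words. This is my own assumption about what such a measure could look like, not necessarily the method used on that page; the function name `vocabulary_density` and the simple tokenizing regex are made up for illustration.

```python
import re


def vocabulary_density(text):
    """Return the type-token ratio: distinct words / total words.

    Assumption: a "word" is a run of ASCII letters or apostrophes,
    compared case-insensitively. Real texts may need better tokenizing.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    return len(set(tokens)) / len(tokens)


sample = "The cat sat on the mat. The dog sat on the log."
# 7 distinct words out of 12 total
print(round(vocabulary_density(sample), 2))  # prints 0.58
```

One caveat: the ratio tends to fall as a text gets longer, because common words keep repeating, so comparing whole books of different lengths this way can mislead. Comparing equal-sized samples (say, the first 10,000 words of each text) is a fairer test.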