Monthly Archives: June 2015

Easy Reader Genres

I’ve been taking part in the Tadoku competition again this month, in French, Japanese and German.

My Japanese is pretty basic, so I’m reading beginner readers that have at most a couple of sentences on each page, most of them repetitious in order to give practice with certain phrases. Only a few of these are particularly enjoyable in terms of text content: the Shinkansen series previously published by Heinemann is good. The DEE Publications readers are good practice and have nice illustrations, but can’t be classed as particularly entertaining. The interesting part of those books is actually the cultural notes at the end, in English. Some little Japanese books for very young children that I acquired in Japan are amusing, partly for their innovative layout (Inai inai baa!).

In French I’ve been reading books at the 700-1000 word vocabulary level, plus a few other books of a similar level of difficulty. After reading quite a few books for adolescents about adventures, mysteries and so on, I seem to have hit saturation point with the genre. I’m still enjoying crime mysteries and some classic stories (though not all), but my new interest is stories from Africa. There is a series of African stories published by Heinemann in five levels of difficulty. I stumbled across these when visiting a Dutch shopping site, and ordered a couple at Niveau 3 to try. They are a refreshing change from the fodder I’ve been reading recently. They don’t pull any punches though. I’ve read “La Valise Ensorcelée”, which has an element of magic to it, as well as a moral. I’ve also read “L’usine de la Mort”. This book shocked me a little, but I’m glad I read it. I don’t think it’s great literature by any stretch, but it is certainly interesting, moving, and sufficiently different for the jaded reader of easy readers. As a result I’ve bought more books from the series.

Thoughts on Up Goer Five and Constrained Vocabulary Writing

When I first saw the Up Goer Five comic by xkcd, I loved it. It epitomised what I do with my comic book and my research, and is a convenient example to show people when explaining the idea of constrained vocabulary writing.

Fans figured out that the 1,000 words xkcd used for it came from the contemporary fiction list shown in Wiktionary. This frequency list is based on over 9 million words of online contemporary fiction. It combines plurals and simple verb forms into one listed word (a lemma), which is a good choice: if the root word is known, then plurals with “s” and simple verb forms are usually also understood.
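To make the idea of combining plurals and simple verb forms concrete, here is a minimal Python sketch. It is not how the Wiktionary list was built. The function name `fold_simple_forms`, the suffix list and the sample counts are all assumptions made up for illustration. It only merges a form into a root when that root is itself in the list.

```python
from collections import Counter

def fold_simple_forms(counts):
    """Merge counts of simple inflected forms into their root word.

    Hypothetical helper: a form is merged only when stripping "s", "ed"
    or "ing" leaves a word that is itself present in the counts.
    """
    folded = Counter()
    for word, count in counts.items():
        root = word
        for suffix in ("s", "ed", "ing"):
            candidate = word[: -len(suffix)]
            if word.endswith(suffix) and candidate in counts:
                root = candidate
                break
        folded[root] += count
    return folded

# Invented counts, purely for illustration.
raw = Counter({"dog": 5, "dogs": 3, "walk": 4, "walked": 2, "walking": 1, "his": 7})
folded = fold_simple_forms(raw)
print(folded.most_common())
```

A rule this crude will make mistakes on real data (it would fold “as” onto “a”, for instance), so a proper lemmatiser or a hand-checked list is the safer choice for real work.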

As someone who writes using lists generated from word frequency, I’ve noticed that several problems arise. One is that, typically, male pronouns and nouns occur at higher frequencies than female ones. The Wiktionary list is not overly biased in this way, possibly because it is based on contemporary fiction. “he” is ranked at 8, “her” and “she” at 12 and 13 respectively, and “his” at 16. However, we find “man” at 163 and “woman” at 452, but “girl” at 133 and “boy” at 217. This hints at what has been termed the systemic “infantilization” of women in society. The gap between “girl” and “boy” is probably due in part to the common pairing of “guy” (at 178) with “girl” in colloquial speech. Google’s auto-suggest, which is also based on frequency, has occasionally come up with phrases that are considered racist, sexist or otherwise problematic, and it is purely a reflection of what we as a society tend to write. When writing in a principled manner for language learners, it may be important to balance what word frequency lists tell us against what would be a more equitable representation. I didn’t really think very much about this when I started writing Gnomeville years ago, but I have become more aware of these issues thanks to some of my friends who are more knowledgeable about them.

Another issue that needs to be considered is what is culturally appropriate to write for the target audience.  For example, I have recently been made aware that it is inappropriate to use words referring to alcoholic beverages when the audience is Islamic.  Obviously for work intended for children (or for experimental subjects) it is customary to exclude expletives.  For this reason, several words on the list would need to be excluded.  There seems to be an expressive set of expletives in the list.

In the method of writing I employ for the Gnomeville story, I introduce one new high-frequency word per page of story, and somewhat less frequently I introduce a grammatical pattern. Sometimes I’ve changed the order in which I add words to suit the story. This happened in episode one, in which I introduced “se” very early instead of after about a dozen other words. Also, I recall that “le” was added before “de”, even though their ranks are reversed. Having said that, my first 20 words were based on a corpus of newspaper articles, and every corpus gives a different ranking of words. There are some similarities across corpora, however. For example, if the corpus is large enough, the word “the” is likely to make up about 7% of English text.
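For anyone who wants to check a figure like that on their own corpus, the following is a rough Python sketch of how a word’s share of a text can be computed. The names `tokenize` and `relative_frequencies` and the sample sentence are my own illustrative assumptions, and the tokeniser is deliberately simple (lowercase ASCII letters and straight apostrophes only).

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def relative_frequencies(text):
    """Return each word's share of all tokens in the text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}

# A toy sample; a real estimate needs a corpus of many thousands of words.
sample = "the cat sat on the mat and the dog sat on the rug"
freqs = relative_frequencies(sample)

# Words ordered from most to least frequent, as in a ranked list.
ranked = sorted(freqs, key=freqs.get, reverse=True)
print(ranked[0], round(freqs[ranked[0]], 3))
```

In a tiny sample like this one the share of “the” is far higher than in real text, which is why the size of the corpus matters.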

Anyway, back to Up Goer Five. The upcoming book “Thing Explainer”, as well as the text uploaded to the Up Goer Five text editor, provides some good reading practice for people still consolidating their first 1,000 words of English. For texts that go beyond that, fewer than 5% of the words should fall outside the reader’s vocabulary set if the text is to be suitable for improving language skill while reading fluently for comprehension. A text editor with more flexibility is the OGTE Editor, designed for writing English text for different language learner levels.
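The 5% guideline can be checked roughly with a few lines of Python. This is only a sketch under my own assumptions: the function name `out_of_vocabulary_share`, the tiny word set and the sample sentence are invented for illustration, and it has nothing to do with how the Up Goer Five or OGTE editors work internally. It also compares exact word forms, so plurals and verb forms would need to be in the set or folded first.

```python
import re

def out_of_vocabulary_share(text, vocabulary):
    """Return the fraction of word tokens in text that are not in vocabulary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    unknown = [token for token in tokens if token not in vocabulary]
    return len(unknown) / len(tokens)

# Hypothetical known-word set; a real one would hold the learner's full list.
known_words = {"the", "up", "goer", "is", "a", "big", "thing", "that", "goes"}
text = "The rocket is a big thing that goes up"

share = out_of_vocabulary_share(text, known_words)
is_suitable = share < 0.05  # the 5% guideline from the paragraph above
print(f"{share:.1%} of words are outside the list; suitable: {is_suitable}")
```

Here one word in nine (“rocket”) is outside the list, so this sentence on its own would be too hard by that measure. In practice the check only makes sense over a whole page or chapter.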

An extract of my French comic book is now available

I have finally produced an extract of the comic book for people to look at.  It contains 12 of the 28 pages, with images reduced to readable low resolution.

The extract contains all the text that explains the rationale for the approach, as well as showing a summary of the language covered in the first episode.  There are 3 pages of the Gnomeville story in the extract.  The first two show how the story begins with no prior French knowledge, and how the language is introduced.  The third page shows how the text increases in complexity and length later in the story, with a very short word definition on the page, so that the person reading is not slowed down too much in their reading in French.

Note that the extract doesn’t show the true page format, as it is an ordinary A4 PDF file, whereas normally the pages are processed into book form, renumbered appropriately and trimmed to size. In the physical copies, the story pages are in colour right to the edge of the paper.