Category Archives: My Publications

Book cover with musketeer holding a boot, saying "Diable !"

Repetition in Graded Readers

There is a class of graded readers that make much use of repetition in order to give learners the best chance of picking up vocabulary via their reading. This is the approach used by TPRS, Wayside Publishing, and the Old English book Osweald Bera. On re-reading Otto Bond’s Sept d’un Coup, published in 1936, I noticed that a similar technique was used, albeit more subtly, so it is not a new idea.

My experience of reading books with lots of repetition is that the repetition is fine and helpful on the first read, but becomes tedious on rereading. If things get too dull, the learner won’t engage. It occurs to me that the design of my Gnomeville comics saves people from rereading the repetitive parts unless they really want to.

The first couple of episodes of Gnomeville are moderately repetitive: they are not the most repetitive French graded readers around (Bootstrapping the Three Musketeers is the most repetitive I know of, followed by Le français par la méthode nature, followed by Édi l’éléphant), but they are more repetitive than other books, based on analyses of roughly the first 100 words. I try to ensure at least five occurrences of each new vocabulary word in the comic, but I don’t force them into the Gnomeville story. Instead, I have a separate story (La Question du Moment), which uses any words that haven’t had enough repetition yet. Then the following episode has a revision page at the start, which recaps the story so far, using all the vocabulary and grammar introduced so far. Learners can read this and decide whether they are familiar with everything or need to revisit the previous episode. When revisiting, learners can skip La Question with all its repetition and just reread the Gnomeville story. Or they can use some other method of checking the words that are still unfamiliar.
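The five-occurrence check can be illustrated with a few lines of Python. This is only a sketch of the idea, not the process actually used for the comics: the function name, the simple whitespace tokenisation and the sample sentence are all assumptions made for the example.

```python
from collections import Counter

def words_needing_repetition(story_words, new_words, minimum=5):
    """Map each new word seen fewer than `minimum` times to its shortfall."""
    counts = Counter(word.lower() for word in story_words)
    return {
        word: minimum - counts[word.lower()]
        for word in new_words
        if counts[word.lower()] < minimum
    }

# Invented sample text, split on whitespace (an assumption for brevity).
story = "le gnome est petit le gnome est grand le dragon est ici".split()

# A threshold of 3 keeps the toy example short; the default is 5.
print(words_needing_repetition(story, ["le", "gnome", "est", "dragon"], minimum=3))
```

The words returned are the ones that a supplementary story such as La Question du Moment would need to pick up.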

When rereading the first few chapters of Osweald Bera, which I do periodically because I only dip into the book occasionally in my world of many demands, commitments and distractions, I have been wishing for a summary of each chapter that recaps the story so far and includes all the vocabulary I should have picked up from it, without the additional repetition. I consider it my “homework” to create this for my own use. For languages like French, Italian, German and Spanish, there are quite a few options for reading material, so it is probably better to read something else than to reread these often long, verbose, repetitive books. For Old English, however, Osweald is the only reader of its kind, so rereading is likely to be inevitable.

Gnomeville Comics are Easier than I Thought

On reviewing my readability measure results for various items in my collection, I suddenly thought, “hang on, how can the expected vocabulary size for Gnomeville Episode 1 be 25 when only 12 very frequent words are introduced?” Clearly something had gone wrong somewhere.

I blame the fact that part of my analysis is manual, and I probably didn’t follow the procedure very well. I run various scripts to produce a list of the words in the text, ranked by their frequency in a large corpus of written French (mostly from Project Gutenberg). The manual part is dealing with cognates: I start at the least frequent end of the list and count upwards, skipping cognates and names, until I have counted 5% of the words. I think I went astray previously by having a less reliable process.
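To make the counting step concrete, here is a minimal Python sketch of one way to read it. It is not the scripts mentioned above. It assumes the 5% is counted over running words (tokens) rather than distinct words, that cognates and names are supplied by hand as a set, and that words missing from the frequency list count as rarer than everything in it. The function name, parameters and sample ranks are invented for illustration.

```python
import math

def expected_vocab_size(tokens, corpus_rank, skip=frozenset(), unknown_percent=5):
    """Return the corpus rank a reader must know up to so that at most
    `unknown_percent` percent of the tokens are unfamiliar.

    tokens       words of the extract, repeats included
    corpus_rank  word -> rank in a general frequency list (1 = most frequent)
    skip         cognates and names, treated as already known
    """
    if not tokens:
        raise ValueError("the extract is empty")
    # Cognates and names get rank 0; unlisted words get an infinite rank.
    ranks = sorted(
        0 if word in skip else corpus_rank.get(word, math.inf)
        for word in tokens
    )
    allowed_unknown = len(ranks) * unknown_percent // 100
    # After setting aside the rarest `allowed_unknown` tokens, the rarest
    # token that remains determines the vocabulary size.
    index = max(len(ranks) - allowed_unknown - 1, 0)
    return ranks[index]

# Invented 20-token extract and invented ranks, for illustration only.
tokens = ["le"] * 10 + ["de"] * 5 + ["gnome"] * 3 + ["château", "dragon"]
ranks = {"le": 1, "de": 2, "gnome": 500, "château": 800, "dragon": 1200}

print(expected_vocab_size(tokens, ranks, skip={"dragon"}))
print(expected_vocab_size(tokens, ranks))
```

In this toy extract, treating "dragon" as a cognate lowers the result from 800 to 500, which shows how much the cognate decisions can move the score.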

Results can differ depending on decisions that are made, such as whether to include titles (which I treat as sentences) and the “Présentation” section that has brief notes about each character, and what is counted as a cognate. Cognates are reasonably clear-cut for Gnomeville, but less so for other texts. Should “habiter” be considered a cognate due to its similarity to “inhabit”? And there are other words that are cognates in the linguistic sense but not particularly obvious from a learner’s perspective. The choice of general frequency list will also make a difference. Spoken text has different characteristics to written text, especially in French. Also, the very frequent words used for Episodes 1 and 2 are the 20 most frequent in French newspapers, which is not the same set of words as in any other corpus of text. The corpus I use for calculating expected vocabulary size has some of those words at lower ranks (“se” at 25, “au” at 31, and “on” at 40), which explains why there was the potential for the expected vocabulary size to be larger than the number of words introduced. But unless those words made up about 5% of the extract, the episodes were unlikely to receive those scores.

Anyway, on revisiting my incorrect assessments of the Gnomeville episodes, I have the following updated vocabulary sizes.

| Episode | Old Expected Vocab Size | New Expected Vocab Size | New Readability Score |
|---|---|---|---|
| 1 | 25 | 3 | 2.20 |
| 2 | 16 | 14 | 3.23 |
| 3 | 40 | 17 | 3.83 |
| 4 | | 15 | 3.66 |

You may notice that Episode 4 has a lower expected vocabulary size at 95% and a lower readability score than Episode 3. There’s not a lot in it, but Episode 3 had longer sentences in the extract.

Well, there you are. Gnomeville’s expected vocabulary size is much smaller than originally calculated – at least for Episodes 1 and 3.

Gnomeville Episode 4 Soon to be Released!

Slowly (6 years!) but surely, my next comic for learners of French has been completed! I am holding a launch party for it on Sunday, where attendees will hear the Gnomeville songs performed, and have the opportunity to buy the comics at greatly reduced prices. Then, the physical comics will appear in the Square store, and not too much later, I intend to publish the ebook “wide”, as they call it, meaning it will be available from Kobo, Apple, and other ebook platforms. I intend to make Episodes 1 to 3 available in a bundle format for the platforms that haven’t had the comics before. So, more work to do. But first, we have the launch on Sunday!

Bootstrapping the Three Musketeers

Those who have visited my blog this year will know that I have put up some “filtered French”, such as a list of the most common one-word sentences in French classic literature, and sentences that fit the highly constrained vocabulary of my comic books. After musing on language acquisition, in particular how babies learn, not to mention our experience of picking up a few words and phrases in a foreign language by ear, I thought I’d try a different approach. This has resulted in producing a book (with more volumes to come) where I filter Les trois mousquetaires, and add vocabulary one word at a time based on which word will complete the most sentences. Using a combination of manual and automatic filtering, I have created extracts that have sufficient repetition in their vocabulary for people to become familiar with the words.

It has been fascinating to see what happens as I add each new word. The algorithm tends to find dialogue first, with the average sentence length gradually increasing. Short non-dialogue sentences only started to appear after the 93rd word of vocabulary was added.
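The idea of adding one word at a time, chosen by how many sentences it completes, can be sketched as a simple greedy loop. This is an illustration under assumptions, not the combination of manual and automatic filtering used for the book: here a sentence counts as "completed" when the candidate is its only unknown word, repeated sentences are counted each time they occur, and ties are broken alphabetically. The function names and the toy sentences are invented.

```python
from collections import Counter

def next_word(sentences, known):
    """Pick the unknown word that would complete the most sentences."""
    completions = Counter()
    for sentence in sentences:
        missing = set(sentence) - known
        if len(missing) == 1:  # exactly one word away from readable
            completions[next(iter(missing))] += 1
    if not completions:
        return None
    # Highest count first; alphabetical order breaks ties.
    return min(completions, key=lambda word: (-completions[word], word))

def greedy_vocabulary(sentences, steps):
    """Return the order in which words are added, up to `steps` words."""
    known, order = set(), []
    for _ in range(steps):
        word = next_word(sentences, known)
        if word is None:
            break
        known.add(word)
        order.append(word)
    return order

# Toy corpus: each sentence is a list of lower-cased words.
sentences = [
    ["oui"], ["oui"], ["non"], ["monsieur"],
    ["oui", "monsieur"], ["non", "monsieur"],
]
print(greedy_vocabulary(sentences, 5))  # ['oui', 'monsieur', 'non']
```

Because only sentences that are one word short can be completed, very short sentences win the early rounds, which fits the observation that dialogue turns up first.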

Anyway, if you’d like to have a look, it’s on Amazon, with a substantial preview.

(Affiliate links in this post.)