For #inktober, here is a draft of a comic idea that I had 5 years ago when obsessing about French pronunciation. I might correct it and clean it up another day. Meanwhile, enjoy this unedited version.
I recently re-watched Dilili in Paris, which is a fabulous animated movie for children, with French dialogue that is slow enough for French language learners to follow. I originally watched the movie during the Melbourne French Film Festival and considered buying the movie later so I could try watching it without English subtitles.
Frustration 1: Memory
There is a frequently repeated phrase when Dilili meets new people: “Je suis heureuse de vous rencontrer”. It was semi-humorous, and certainly designed to be remembered, to teach how to be polite when meeting someone new. However, what I actually remembered after a week or two was: “Je suis heureuse __ vous rencontrer.” Despite being exposed to many occurrences, the function word was lost. Function words don’t provide semantic content and therefore appear to be harder to retain. There is certainly research evidence that concrete nouns are easier to remember than various other types of words. This movie brought that home to me in a big way.
Frustration 2: Resources
(Not really about function words…)
I bought the DVD of the movie, and then when viewing it, discovered that the subtitles could not be switched off, and that the only subtitles were in English. I don’t know who makes these decisions when preparing DVDs for sale, but perhaps they don’t consider their audience carefully enough. A French movie sold in Australia would have various audience segments: French ex-pats (possibly including some who are hard of hearing), Australian francophiles, and Australians learning French. To me, movies and TV episodes are highly useful for practising comprehension of the spoken language. Ideally it can be done at three levels of difficulty (where L2 refers to the language being learnt and L1 to the native language):
- L2 audio with L1 subtitles,
- L2 audio with L2 subtitles,
- L2 audio without subtitles
I even do this with DVDs that were originally in English. I’ve watched two entire series of Perry Mason with French audio, which was quite illuminating. If you are short of practice material, check your DVD collection for audio in your target language. You may be pleasantly surprised to find a good selection amongst your favourite shows.
Frustration 3: Vocabulary Size
(Function words are frequent words…)
One of the excellent things about some graded readers was that they were designed for a specific vocabulary size. For me, vocabulary makes all the difference between a readable text and an unreadable one. CLE International used to publish books targeting a specific vocabulary size. For example, Niveau 1 had vocabularies of 400-700 words. Through extensive reading, I have successfully moved from 300-word vocabulary books to 700-1000 word ones, and I hope to continue to progress through further reading. However, as with other publishers, the publications have now been converted to CEFR levels: A1, A2 etc. and as far as I can tell, the subtleties of vocabulary size have been removed from the book information.
I have completed a CEFR B1 in French, yet I’m most comfortable reading A1 texts (and texts with vocabularies of fewer than 1000 words). With few exceptions, even these are not easy: the grammar is too easy for me, but the vocabulary is still sometimes challenging. What frustrates me is that A2 covers such a wide range of vocabularies, depending on the source material, from readable to incomprehensible. Published vocabulary sizes for A2, where they occur at all, vary from 400 to >1200 words. The level of frustration with some of these graded readers is the same as for texts written for native speakers. I oscillate between A1, A2, native texts and back again. The original memoirs of Céleste de Chabrillan are as easy as, and more exciting than, many A2 texts.
CEFR is designed, as far as I can tell, to describe a person’s practical skill in a language, and for that it is useful. However, the jumps between levels are quite large, so that the defined levels are not very useful for the learner themselves. Some publishers solve this by dividing up levels. ELI uses A0, A1, A1.1. The Danish Teen Readers/Easy Readers also divide up the levels, and still appear to quote target vocabulary sizes. Indie publishers tend to ignore vocabulary size in their writing. However, writers and publishers should remember that:
- Extensive reading is at its best if learners are reading at a comfortable level while not being familiar with all vocabulary. Ideally learners should know 98% of the words in the text they are reading.
- Readability of text largely consists of grammar and vocabulary components.
- The more readable AND interesting reading material is, the more learners will read, the better their vocabularies will become, and the better their skill in a language will be.
- Publishing vocabulary levels required for 95-98% coverage of the text will assist learners in finding materials of the right level for them at any point. Vocabulary levels should be (loosely) based on general word frequency.
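As a rough illustration of the coverage figures in the list above, here is a minimal Python sketch that measures what fraction of the running words in a text a learner already knows. The function names, the simple tokeniser and the sample sentence are my own assumptions for illustration, not a published tool.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into runs of letters (accents included).

    Apostrophes and hyphens act as separators, so "l'ami" becomes "l", "ami".
    This is a deliberately crude tokeniser, assumed here for illustration.
    """
    return re.findall(r"[^\W\d_]+", text.lower())


def coverage(text, known_words):
    """Return the fraction of running words (tokens) found in known_words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


# Hypothetical learner vocabulary and sample text.
known = {"le", "la", "est", "sur", "table", "chat"}
sample = "Le chat est sur la table. Le chien est sous la table."
print(f"coverage: {coverage(sample, known):.0%}")
```

In this made-up sample the learner knows 10 of the 12 running words, well short of the 95-98% range, so the text would be uncomfortable to read.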
This is why I write my comic books for language learners. This is why I research extensive reading, readability and language acquisition.
I’ve just been updating my database of French readers and observing the types of books or stories in the different ranges of my current preferred readability measure.
Scores under 4 are ridiculously easy for people with an English speaking background. Currently this consists only of episodes 1 and 2 of my Gnomeville comics. Sentences are short and vocabulary is highly constrained, exploiting French-English cognates.
Scores in 5-5.99 tend to be the short illustrated graded readers such as Bibliobus, as well as La Spiga’s Zazar for grands débutants (target vocabulary of 150). Gnomeville Episode 3 sits here due to having longer sentences compared to the first two episodes.
Scores in 6-6.99 tend to have longer sentences, including some classic graded readers such as Si nous lisions and Contes Dramatiques, as well as the 300 word vocabulary Teen Reader Catastrophe au Camping des Roses.
Scores in 7-7.99 also include the more text-like graded readers, such as Sept-d’un-Coup by Otto Bond, which tends to have long sentences but well-controlled vocabulary.
In the 8-8.99 range I find the first story for native speaking children, as well as more graded readers, including one with a target vocabulary of 1000 words.
The first books for adult native speakers occur with scores between 10 and 12.
Looking at the stories in the list, my own level seems to be from 7 to 10, suggesting I should continue reading more challenging graded readers in addition to stories written for French children. That is pretty much what I have been doing for a while, as well as incidental reading on the web and elsewhere.
A quick look at the relationship between stated vocabulary sizes and the 95% coverage vocabulary that I have been using indicates that the required vocabulary is roughly 1.5x + 2600, where x is the stated vocabulary size. However, I am using a token-based vocabulary whereas most would use a word family one. If I assume token vocabulary sizes are 5 times word family sizes, then the equivalence point for this model is when the vocabulary is about 770, meaning that the vocabulary load will be excessive for stated vocabulary sizes less than 770 but be ok for sizes greater than 770. That’s reasonably reassuring. Mind you, this is an extremely rough estimate.
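The break-even arithmetic can be sketched in a few lines of Python. This assumes the rounded figures quoted above (a slope of 1.5, an intercept of 2600 and 5 tokens per word family); the function names are mine. With those rounded coefficients the crossover comes out a little above 740, in the same ballpark as the 770 quoted, which presumably came from unrounded values.

```python
def required_token_vocab(stated_size, slope=1.5, intercept=2600):
    """Token vocabulary needed for 95% coverage under the rough linear model."""
    return slope * stated_size + intercept


def break_even(slope=1.5, intercept=2600, tokens_per_family=5):
    """Stated (word family) size at which the learner's assumed token
    vocabulary, tokens_per_family * x, equals the required token vocabulary.

    Solves tokens_per_family * x = slope * x + intercept for x.
    """
    if tokens_per_family <= slope:
        raise ValueError("the two lines never cross for these parameters")
    return intercept / (tokens_per_family - slope)


x = break_even()
print(f"break-even stated vocabulary: about {x:.0f} word families")
print(f"required tokens at 300 stated: {required_token_vocab(300):.0f}")
```

Below the break-even point the model demands more tokens than the stated vocabulary supplies; above it, the stated vocabulary is more than enough.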
This work was based on about 100 words from the start of the text of 40 stories, but it does seem to sort things fairly usefully. The outlier based on my experience of reading the stories is Aventure en Normandie, with a score of 9.49. I don’t recall it being a difficult read.
Meanwhile I am making more progress on Episode 3 of my comic book. I decided to divide one page into three pages, as it had a lot of text and too many new language concepts for a single page. So Episode 3 will probably be 32 pages long, breaking the standard Gnomeville pattern of 28 page episodes. Hopefully it will be ready within a month.
I succeeded in acquiring more classic French readers recently. One of my new favourites is Dantès from Otto Bond’s Basic French Readings alternate series. It is a simplified extract from Dumas’s Comte de Monte Cristo. The story starts with an assumed knowledge of 97 frequent words, much like Sept-d’un-coup by the same publisher, but succeeds in having a higher proportion of cognates, leading to an impressive expected vocabulary for 95% coverage of 316 (based on my word list). This makes it the lowest I’ve seen so far, apart from my own series.
However, the important thing is also whether it was an enjoyable read. I definitely got hooked on the story, and then all of a sudden the extract ends, and I’m left wanting to read the rest of the story. That can only be a good thing.
I was less captured by the remaining stories in the five-story volume, but still enjoyed most of them.
Regarding the match between publicised vocabulary sizes of graded readers and the reality of reading them, I can say from my cursory investigations that there is not always a good match between the two. Perhaps it averages out across the books, as I only take the first chunk of text for my comparisons, but if the first few paragraphs are too challenging, then a language learner may lose interest.
I’ve developed a new estimate of readability now, which is more complex than ones I’ve previously used and seems to match the foreign language learner’s experience reasonably well. Based on this, and my more recent acquisitions, I now recommend the following as first reads for French beginners with English-speaking background.
Young children: Luc et Sophie series, or Bonjour Berthe, which I find more entertaining. Le Petit Napoléon series is also quite good, and suitable for all ages, for those who like cats.
Older children: Gnomeville, Le Chapeau Rouge, select stories from Mary Glasgow’s Bibliobus, or Sue Finnie’s Lire Davantage.
Teenagers: I quite like the Teen Readers series. Catastrophe au Camping des Roses is rated as a vocabulary of 300 words, and my estimate has the 95% coverage vocabulary at 2421, which isn’t too bad. But Dantès mentioned above is easier vocabulary-wise.
Adults: Dantès is my current favourite as a first read. Becky Tucker’s Histoires pour les grandes appears to be easy, but I haven’t read enough to know whether it is interesting. I have yet to rate other ebooks.
However, the only stories that you can read immediately in French without having studied it are those in the Gnomeville series. There are some minor issues with it though, as was brought to my attention recently. There are places where I have used “de la Fantasia” that should be “de Fantasia”. I was uncertain of the rule for this, but now I have discovered it. Mostly “de” is used with a country, but “de la” is used in expressions that have a temporal sense to them, such as “le gouvernement de la France” (since governments are not permanent), or if there is an adjective applied: l’Histoire de France but l’Histoire économique de la France. Very subtle indeed, and I hope I can be forgiven for getting that subtlety wrong in my comic. I intend to make a second edition of Episode 2 at some point to rectify this. Another error in Episode 2 is the use of the verb “voyage” combined with “à” (“voyage à la Place des Roses”). “Voyage” doesn’t get used this way, and “à” should probably be “vers” to communicate this idea. The sentence will be removed from the second edition.
In other news I attended the Applied Linguistics Association of Australia 2018 conference a couple of weeks ago. It was very inspiring, and also emphasised that the important thing with language acquisition is communication, not perfection. Perfection is unlikely to be achieved, but improvement is always possible. So let’s keep improving our language skills. Read, listen, write and speak. With practice comes improvement. Until next time.
Since my last graded reader update I’ve looked at a few more books, some of which are “classics”, in the sense they were from the “direct reading” era of the first half of the twentieth century, following the influence of Michael West’s constrained vocabulary for language teaching, the various word and idiom frequency lists created at the time, and the idea of readability. Some of these books I had already acquired earlier; but through reading some papers published at that time, I was able to compile a shopping list of other books written according to the same philosophy.
As a result, I have a new winner in terms of expected vocabulary size at the 95% threshold of reading comfort. A New French Reader by Ford and Hicks received a 95% vocabulary size of 3532, and Otto Bond’s Sept-d’un-Coup was a close second, with 3650. Bond’s book starts with a much smaller initial assumed vocabulary (97 words) than the Ford and Hicks book (523), so Bond’s book may be a better first read despite the slightly higher vocabulary score here. As seen in my first post on expected vocabulary size for 95% coverage, these are much higher scores than my Gnomeville comics, as my comics take readability criteria to the extreme.
So based on the current stats available on vocabulary, I recommend the following first graded readers for English speakers learning French.
For 6-9 year olds: Bonjour Berthe.
For 10+: Gnomeville
For adults who don’t like fantasy comics: Sept-d’un-Coup by Otto Bond – though I think there are some errors in it, and it’s out of print (and it probably counts as fantasy…).
Stay tuned for further updates.
I have my moments of doubt with my French comic book project. It is virtually impossible to write something that is absolutely correct French in terms of the expressions used if one is not a native speaker. Grammar is relatively easy to get right, apart from minor slip-ups, but having something sound like natural French, especially while intentionally writing in a constrained vocabulary, is almost impossible.
I’ve been attempting to get Episode 3 of my comic book ready this month, spurred on by a potential launch date at the language-themed concert I’m involved with, happening tomorrow, as well as #inktober. However, before finalising my comic it was important to get it checked by a native speaker of French. This happened today, and as usual there are errors that need to be fixed. Unlike text that is free to vary without consequences, fixing these means making decisions about whether to leave out phrases or whole sentences, find an alternative French-English cognate, or find an alternative way of saying the same thing. As I also try to ensure there are a certain number of repetitions of key words, phrases, or grammatical points, further changes also need to be made. Then there’s the crossword… As a result, I will need to delay the release of Episode 3 for a bit longer.
Unfortunately, despite multiple checks by francophone proofreaders, some things do get missed. I appear to have an error in Episode 2, which has already been published. I may need to set up an errata page here, and perhaps release a second edition at some point. It’s all a little discouraging, but I’m not giving up yet.
Reading graded readers can be quite educational on how to write good stories. Previously I wrote an article about a graded reader “Hall of Shame”, in which I highlighted various problems I had observed within the genre, and then provided a summary of recommendations on how to write better graded readers. Elsewhere I wrote another summary of advice on writing graded readers (I called them Easy Readers in my earlier posts).
In this article I’d like to talk about suspense. I think I was in my twenties when I first really thought about suspense at all when reading. A fine simple example that crystallised it for me was Dirk Gently’s Holistic Detective Agency by Douglas Adams. Two memorable simple bits of suspense occurred in it. The first is the sofa stuck in the staircase, which is explained toward the end of the story. The second, which amazed me in its simplicity, was explaining two of three things, and not answering the third one until late in the book. I was reading on, wondering what the third thing was. This illustrated that it didn’t really matter what the suspense is, as long as it’s suspense. It doesn’t need to be figuring out who committed a crime, or whether the romance concludes happily. It can be pretty much anything.
The next observation, which inspired the way I’ve organised my Gnomeville comics, was the use of cliffhangers. I was reading a collection of X-Men comics, and noticed that they always ended with a cliffhanger and unanswered questions. Because the story never fully ends, people get hooked and need to read the next issue. Soap operas seem to work the same way. It struck me that this is a very good strategy for graded readers, since we want to motivate people to keep reading to improve their language skill.
I buy and read graded readers by other authors, and I was struck by the contrast suspense made between two otherwise very similar stories by the same author, Sylvie Lainé.
Voyage en France tells of an English couple who go to France, because Louis is reminded of a creative project he commenced with his best friend decades earlier, who later moved to France, and with whom he had lost contact. The project was a movie about an old man trying to find an old friend, echoing the current situation. Louis wants to see how the story will end, by finding his old friend. We read the story of Louis and Melba as they follow a series of clues to find Louis’s old friend. This suspense worked for me, as I wanted to know how the story would end. I also wanted to know whether they tried to finish the movie, but that question wasn’t answered. I read the story quickly, despite many chapters of fairly mundane travel activities, all because of the suspense.
Contrast the above story with Voyage à Marseille. This contains the same two main characters, doing the same things, that is, travelling through France to get to a destination. However, it lacks the suspense of wondering whether they will find the person they are looking for. The first bit of excitement happens quite late, the disappearance of the car, and is resolved quite quickly. There is another unanswered question that had potential as a simple bit of suspense, the title of the biography of Louis’s friend that they were visiting. Unfortunately that wasn’t answered in the final chapter. It took me a lot longer to finish this story, because there wasn’t anything I wanted to know the answer to.
So, when writing graded readers, please provide suspense. It makes a lot of difference to the reading experience.
I’ve been tinkering with ways of comparing different easy readers for language learners. In previous posts I’ve used a type-token ratio or vocabulary density, which gives some idea of how likely it is you might learn new words through repetition from a text. But for something to be readable, the general consensus is that you need to know at least 95% of the words that you read. This is a level that allows people to guess the meaning of the words they don’t know.
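For reference, the type-token ratio mentioned above is simply the number of distinct word forms divided by the number of running words. A minimal sketch follows, with a crude tokeniser and function names that are my own assumptions; note that the ratio depends heavily on sample length, so it is only comparable across extracts of similar size.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into runs of letters (accents included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(text):
    """Distinct word forms (types) divided by running words (tokens).

    A lower ratio means more repetition, hence more chances to meet
    each word again while reading.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Made-up sample: 5 running words, 4 distinct forms.
print(type_token_ratio("Le chat et le chien"))
```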
So something I’ve been messing with recently is predicting the general vocabulary size needed for different beginner stories in French, assuming people know all cognates and all proper nouns. I’ve only been working with short samples of text so far, and there are many other assumptions and issues that make it not a perfect comparison – including bugs in my code…
Given a small set of extracts, and assuming you don’t learn the words via their introduction one at a time, as in my comic books, we have the following:
|Text|Estimated vocabulary size needed|
|---|---|
|Gnomeville Episode 1|25|
|Gnomeville Episode 2|25|
|Gnomeville Episode 3|40|
|Easy French Reader|5008|
|Martine à la Ferme|11854|
Note that this vocabulary size assumes that each conjugation of verbs is a separate vocabulary item, as are plurals etc. so will be much larger than word family figures normally used.
You can see that the one text written for native French speaking children (Martine) has a much richer vocabulary than the texts written for language learners. The figures for these look worse than they seem, because there are many words that are typically taught early to allow conversation, but which feature much lower on word frequency lists. For example, “maman” was at rank 6163 in my list. In contrast, my Gnomeville comics are designed to prioritise frequent words and cognates to optimally improve reading, at the expense of conversation. Hence the very small vocabulary sizes required.
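To make the idea concrete, here is a minimal sketch of one way such an estimate could be computed. It is not my actual code, and the function name, parameters and sample data are assumptions for illustration. Each token is looked up in a frequency-rank list, with cognates and proper nouns treated as known for free, and the result is the lowest rank that covers the target share of running words. Apart from the rank of “maman” quoted above, the ranks in the example are made up.

```python
import math


def vocab_size_for_coverage(tokens, rank_of, free_words=frozenset(), target=0.95):
    """Smallest frequency rank R such that knowing every word of rank <= R
    (plus the free words) covers at least `target` of the tokens.

    tokens     -- list of lowercased word forms from the text sample
    rank_of    -- dict mapping word form -> rank in a general frequency list
    free_words -- forms assumed known already (cognates, proper nouns)
    Forms missing from the list get an infinite rank.
    """
    if not tokens:
        raise ValueError("need at least one token")
    ranks = sorted(
        0 if token in free_words else rank_of.get(token, math.inf)
        for token in tokens
    )
    # Number of tokens that must be known; the tiny epsilon guards against
    # floating-point noise such as 0.95 * 20 landing a hair above 19.
    needed = max(1, math.ceil(target * len(ranks) - 1e-9))
    return ranks[needed - 1]


# Hypothetical 20-token sample; only the rank for "maman" comes from the post.
sample = ["le"] * 10 + ["gnome"] * 5 + ["est"] * 4 + ["maman"]
ranks = {"le": 1, "est": 5, "maman": 6163}
print(vocab_size_for_coverage(sample, ranks, free_words={"gnome"}))
```

In this toy sample a single occurrence of “maman” is exactly 5% of the text, so it can be left unknown at the 95% threshold; one more rare word would push the required vocabulary into the thousands, which is how a short children’s text ends up with such a large figure.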
Recently I’ve been reading a 1939 paper by Tharp that looked at measuring vocabulary difficulty. He appears to have had similar ideas about measuring vocabulary load based on the general frequency of the words, as well as a measure of the density of difficult words. I also recently acquired yet another very early graded reader, “Si nous lisions”, from 1930, which attempted to introduce new words every ~60 running words, in the style of Michael West, who seems to have been the first to use the approach. However, I have a graded reader published in 1909 in my collection, which was intended for “rapid reading”, and was part of a series that commenced with short easy texts. I’m not sure if they methodically introduced words at specific intervals as was done by West and others following his example.
In searches on-line, I found a French adapted reader from 1790, so we’ve been at it for quite a while. I’d like to say we know more about how to write graded readers these days, but I think West had it fairly right. The only thing we can do now is make them more interesting and relevant.
Here’s one from 1800 published for those with a German background. There seem to be quite a few published in the 1800s.
Anyway, I’ll finish off here with the usual things: we need 95% coverage to read comfortably (on average). To do that with native texts requires quite a large vocabulary. But vocabulary increases as you read more. So we should read as much as possible at the level that is right for us and of reading material that interests and motivates us. My Gnomeville comics are ideal first readers in French for those with an English language background and a good vocabulary in English. The Berthe and Luc et Sophie series are reasonable alternatives for children that are possibly too young for Gnomeville, as are the ELI A0 series. Until next time…
I recently came across a new word frequency list for French words, which I’m placing here partly for my own benefit. This one is like some others that combine all conjugations of a verb together, which is not helpful for all applications. Typically present tense is much easier than other less frequently used tenses, particularly for irregular verbs.
Anyway, the list is still useful. It was created by Étienne Brunet, a statistical linguist, based on a corpus of written French.
Here are the top 20 words. Interestingly, compared to the newspaper corpus list I used for designing Episodes 1 and 2 of my comic, this corpus has first person singular (je) occurring much more frequently, as well as “have” (avoir). “ce”, “son” and “elle” also occur in this list higher than “au”, and were not in the newspaper list. “avoir” may be higher because of all conjugations of it being grouped together.
- 95%+ vocabulary coverage,
- focus on very frequent words (e.g. “le”/“the”) to give best coverage sooner,
- repetition of new vocabulary,
- images to enhance recall,
- high interest story (I hope).