
Gnomeville Comics are Easier than I Thought

On reviewing my readability measure results for various items in my collection, I suddenly thought, “hang on, how can the expected vocabulary size for Gnomeville Episode 1 be 25 when only 12 very frequent words are introduced?” Clearly something had gone wrong somewhere.

I blame the fact that part of my analysis is manual, and I probably didn’t follow the procedure very well. I run various scripts to produce a ranked list of words in the text in the frequency order of a large corpus of written French (mostly from Project Gutenberg). The manual bit is counting up cognates, or at least starting at the least frequent word end and counting up until I find 5% of the words that are not cognates or names. I think I went astray previously by having a less reliable process.
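The counting step can be sketched in Python roughly as follows. This is not the actual script used for the analysis, just a minimal illustration under stated assumptions: the 5% allowance is taken over all tokens in the extract, cognates and names are supplied by hand as a set and treated as already known, and any word missing from the frequency list is ranked below every listed word. The function name, parameters, and sample data are all hypothetical.

```python
def expected_vocab_size(tokens, corpus_rank, known_anyway, unknown_percent=5):
    """Corpus rank a reader needs to reach so that at most `unknown_percent`
    of the tokens are unknown, treating cognates and names as known."""
    # Assumption: words absent from the frequency list are rarer than all listed words.
    unseen_rank = len(corpus_rank) + 1
    # Ranks of the tokens that must be known through vocabulary, rarest first.
    ranks = sorted(
        (corpus_rank.get(t, unseen_rank) for t in tokens if t not in known_anyway),
        reverse=True,
    )
    # Integer arithmetic avoids floating-point surprises at the 5% boundary.
    allowed_unknown = len(tokens) * unknown_percent // 100
    if allowed_unknown >= len(ranks):
        return 0  # cognates and names alone already give enough coverage
    # Skip the rarest tokens the reader may leave unknown; the next one sets the size.
    return ranks[allowed_unknown]


# Hypothetical 20-token extract and frequency ranks, for illustration only.
ranks = {"le": 1, "de": 2, "et": 3, "se": 25}
cognates = {"dragon"}
extract = ["le"] * 8 + ["de"] * 5 + ["et"] * 4 + ["se"] * 2 + ["dragon"]
print(expected_vocab_size(extract, ranks, cognates))  # 25

extract = ["le"] * 9 + ["de"] * 5 + ["et"] * 4 + ["se"] + ["dragon"]
print(expected_vocab_size(extract, ranks, cognates))  # 3
```

In the first extract the rank-25 word accounts for two of the twenty tokens, more than the 5% allowance, so it pushes the result up to 25. In the second it appears only once and falls inside the allowance, so the result drops to 3.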

Results can differ depending on the decisions made, such as whether to include titles (which I treat as sentences) and the “Présentation” section that has brief notes about each character, and what is counted as a cognate. It is reasonably clear-cut for Gnomeville, but for other texts, it is less clear. Should “habiter” be considered a cognate due to its similarity to “inhabit”? And there are other words that are cognates in the linguistic sense but not particularly obvious from a learner’s perspective. The choice of general frequency list will also make a difference. Spoken text has different characteristics to written text, especially in French. Also, the very frequent words used for Episodes 1 and 2 are the 20 most frequent in French newspapers, which is not the same set of words as the top 20 of any other corpus of text. The text I use for calculating expected vocabulary size has some of those words at lower ranks (“se” at 25, “au” at 31, and “on” at 40), which explains why there was the potential for the expected vocabulary size to be larger than the number of words introduced. But unless those words made up about 5% of the extract, it was unlikely the episodes would receive those scores.

Anyway, on revisiting my incorrect assessments of the Gnomeville episodes, I have the following updated vocabulary sizes.

| Episode | Old Expected Vocab Size | New Expected Vocab Size | New Readability Score |
|---|---|---|---|
| 1 | 25 | 3 | 2.20 |
| 2 | 16 | 14 | 3.23 |
| 3 | 40 | 17 | 3.83 |
| 4 | | 15 | 3.66 |

You may notice that Episode 4 has a lower expected vocabulary size at 95% and a lower readability score than Episode 3. There’s not a lot in it, but Episode 3 had longer sentences in the extract.

Well, there you are. Gnomeville’s expected vocabulary size is much smaller than originally calculated – at least for Episodes 1 and 3.

Review: Le Français par la méthode nature

I’ve seen this book by Arthur M. Jensen mentioned a few times and I thought it was worth a look, given its philosophy. This book, which was originally published in 1958 (and still in copyright, according to the death+70 rule), is a reading-based introduction to French, with the pronunciation of the text in IPA under each line. The text is quite mundane and repetitive, but the repetition is intentional to allow the language to be acquired by reading. Things get a little more interesting after a dozen or so chapters. A similar approach is used in several (relatively entertaining) stories in French, such as those by Wayside Publishing and TPRS, as well as in stories for other languages. It is also used for an engaging story in Old English called Osweald Bera.

Jensen’s book makes no assumption about the learner’s first language. There are no glosses or definitions in another language. There are pictures to illustrate nouns that are introduced and names of characters talked about. This makes it a good choice for those with a language background that is not English, since many books assume English (or another common European language). It may be less useful for those who have no prior exposure to the Roman alphabet. That would need to be learnt first.

I ran my usual analysis on the first approximately 100 words and confirmed that based on my measures, it is easy French. Its type-token ratio (measure of repetition and learnability from text) is the lowest I have found so far, meaning there is a high chance of learning the vocabulary when reading the text. Its overall score, encompassing expected vocabulary at 95% coverage and sentence length (assuming that the text above and below the pictures on the first page, such as “une fille”, were sentences) was 4.84, making it the 7th easiest in my small table of the readability of French texts for learners.

One weakness of the text is that it is a bit old-fashioned. A lot of the conversation examples are not how people speak nowadays. Also, some of the words are more likely to be read than heard, such as “demeurer”, which I only come across in texts, whereas to my knowledge “habiter” has been the most common verb for this meaning for decades.

Ayan Academy has audio of many chapters on YouTube. This can be useful as simple, comprehensible audio input.

In summary, its strengths are that there is no assumption of first language, it is comprehensive, and there is a high chance of learning the language from reading due to its high level of repetitiveness. Its weaknesses are that it is dull and dated.

Gnomeville Episode 4 Soon to be Released!

Slowly (6 years!) but surely, my next comic for learners of French has been completed! I am holding a launch party for it on Sunday, where attendees will hear the Gnomeville songs performed, and have the opportunity to buy the comics at greatly reduced prices. Then, the physical comics will appear in the Square store, and not too much later, I intend to publish the ebook “wide”, as they call it, meaning it will be available from Kobo, Apple, and other ebook platforms. I intend to make Episodes 1 to 3 available in a bundle format for the platforms that haven’t had the comics before. So, more work to do. But first, we have the launch on Sunday!

Extensive Listening

Mostly I have focused on extensive reading, both in my research and practice. This is obvious from my DELF results, where listening, and especially speaking, were much worse than my reading and writing results in French. Building skill in listening requires suitable comprehensible input. There are various podcasts and YouTube channels that are recommended. Some have beginner content, others intermediate, and others cater to multiple skill levels. Partly for my own benefit and partly as a resource for others, I thought I’d list those I have found out about.

I should also mention that if something is too hard to comprehend because of the speed, you can always slow it down. If it is too hard because you are unfamiliar with most of the words, it is best to start with something easier and/or do more reading first. (Also, if it is too easy, you can speed it up.)

Beginner channels

There are also a few easy songs you can listen to (based on repetitiousness and vocabulary size). I’ve also become aware of a few easy children’s programmes.

Intermediate channels

For the intermediate channels, you may need to have developed your vocabulary a bit more via extensive reading with graded readers and the like.

At the intermediate level, innerfrench.com is the most recommended. It starts slowly and has many episodes, which generally become gradually more difficult, apart from a few exceptions. There are some interesting topics amongst the series.

Another that has some interesting comprehensible content is Français avec Fluidité. There is an A2+ playlist (start here) as well as various interesting topic playlists. I feel as though it is slightly more difficult than the innerfrench podcasts but still quite comprehensible for an intermediate.

Little Talk in Slow French is beginner-intermediate level but has some words translated into English during the podcast.

Lingua.com has recordings that are labelled with CEFR competence levels, from A1 to B2, with questions to test your comprehension. They also have resources for other languages.

Elsewhere

Since creating this post, I have learnt that there is a comprehensible input wiki with resources for many languages. The French list of listening resources is more comprehensive than what I list here, so worth checking out.

Excerpt of sheet music for the song La Mission by Uitdenbogerd

The Easiest Songs in French

Someone recently asked whether there were any A0 songs in French for improving listening skills. There may be some learner songs. I wrote a few to go with my comic books but they are optimised for reading, not listening.

So what would be the criteria for easy songs? I think that the easiest would have a very small set of words and be repetitive. Where two songs equate in vocabulary and repetition, perhaps the one with the easiest grammar or the most standard expressions would be ranked easier.

I cannot comment on whether there are any easy popular songs. Most have extensive lyrics, with only the chorus being repetitive. However, in the collection of children’s songs (comptines) and folk songs, several can be found. Some were printed in my old Horan and Wheeler textbooks. Others I have found elsewhere. Given the above criteria, here is my list and a rough sequence of difficulty based on vocabulary size; type-token ratio, which captures repetitiveness; and a grammar scale. Beginners will probably want to read the lyrics for the first few listens.

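| Song | Vocabulary Size | Lyrics Length | Type Token Ratio | Grammar Level | Measure |
|---|---|---|---|---|---|
| Bonsoir mes amis | 3 | 21 | 0.143 | 0 | 0.06 |
| Savez-vous planter les choux ? | 19 | 118 | 0.161 | 1 | 0.23 |
| Frère Jacques | 7 | 14 | 0.500 | 1 | 0.25 |
| Sur le pont d’Avignon | 23 | 130 | 0.177 | 1 | 0.27 |
| Alouette | 13 | 96 | 0.135 | 3 | 0.28 |
| Quand trois poules | 14 | 29 | 0.483 | 1 | 0.29 |
| Dansons la capucine (or more seriously) | 26 | 91 | 0.286 | 1 | 0.32 |
| J’ai mangé un croissant | 10 | 20 | 0.500 | 1 | 0.32 |
| Au petit pas militaire | 12 | 39 | 0.31 | 3 | 0.33 |
| J’ai mangé un croissant remix | 19 | 56 | 0.339 | 3 | 0.39 |
| Didi et Dada | 32 | 72 | 0.444 | 1 | 0.41 |
| J’ai du bon tabac | 34 | 111 | 0.306 | 3 | 0.48 |
| La mission | 33 | 48 | 0.688 | 1 | 0.49 |
| Au clair de la lune | 70 | 133 | 0.526 | 4 | 0.86 |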
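For anyone wanting to compute the type-token ratio for other songs, here is a minimal Python sketch. It assumes the lyrics have already been split into lower-cased words. How elisions, hyphenated forms, and repeated refrains are counted is a judgment call that will change the numbers. The combined "Measure" column is not reproduced here, and the token list below is made up for illustration rather than taken from real lyrics.

```python
def type_token_ratio(tokens):
    """Number of distinct words (types) divided by total words (tokens)."""
    if not tokens:
        raise ValueError("need at least one token")
    return len(set(tokens)) / len(tokens)


# Made-up token list: 2 distinct words across 8 tokens.
tokens = "bonsoir la la la bonsoir la la la".split()
print(round(type_token_ratio(tokens), 3))  # 0.25

# The same ratio from the counts in the first table row (3 types, 21 tokens).
print(round(3 / 21, 3))  # 0.143
```

A lower ratio means more repetition, which is why the songs with only a handful of distinct words sit at the top of the list.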

Book cover with musketeer holding a boot, saying "Diable !"

A Tale of Three Three Musketeers and Another One or Two

No, that is not a typo in my title. Thanks to my recent obsession with this novel, leading to the Bootstrapping the Three Musketeers book, with a second on the way, I thought I’d look at three different abridged versions of Les Trois Mousquetaires. I have one published by CLE International, who, in the copy I have, state that it is for a vocabulary of 700 words, at the top level of the Niveau 1. New editions call that CEFR level A1. I also borrowed the CIDEB version, which is aimed at B1, with no mention of vocabulary size, as is common in current publications. When looking online at Mousquetaires books, I found another one adapted by Frédéric de Lavenne de Choulot (FLC).

One thing I’ve noticed with these, as with the recent movies, is that writers select different scenes to include in their version. CLE and FLC both include d’Artagnan’s anger at being ridiculed for riding an old nag. Both the recent movies and CIDEB chose to exclude that scene. The movies also took many more liberties with the story.

OK, now to details…

The CLE version

The CLE version, which aims for a vocabulary of 700 words, has a text length of about 10,000 words of story. It is written in present tense, and the volume includes a brief biography of Dumas, vocabulary support, and some short questions at the end of the book, with solutions. The books of this series provide two types of vocabulary support: general vocabulary is defined in French in a footnote, whereas words that are specific to the story, such as épée, are listed at the back. There are several ink illustrations throughout the 64-page book. (I notice the new edition has new greyscale images.)

On my simple readability score, based on a sample of the first 115 words, it gets 13.15, due to long sentences in the introduction. The expected vocabulary size for 95% coverage is estimated to be 7,427 (based on types, not word families).

The CIDEB version

CIDEB always include lots of additional material, such as many exercises, images, and additional articles to read that are related to the story. The book is 128 pages long, with less than half of that being for the story. I estimate it to have about 13,000 words of story. In addition there is an imaginary interview with Dumas, historical information about the period, and information about movies made of the story. The vocabulary support is provided as footnotes in French.

The story itself is written in passé simple. Based on a sample of 125 words at the start of the story, it gets a score of 8.71, also having a vocabulary size score of 7,427 and an average sentence length of 12.1.

The FLC version

The caveat for this review is that I have only looked at the sample, so I do not know what additional material may be found in the book. There appears to be some vocabulary support, though I cannot say what form it takes, since the sample doesn’t allow the links to be followed. I am uncertain why certain words were selected for vocabulary support. Given that the target audience appears to be English speakers, I would think it pointless to define “armes” and “crient”, but perhaps some nuance was discussed about these words. I noted one minor typo, a missing circumflex on the “i” of “boîte”. That’s not a big issue, given that I’ve seen many errors in books published by major publishing houses (and have been guilty of a couple in my published papers and comics).

The text is in present tense and received 8.18 on my readability measure, with a lower sentence length in the sample of 113 words analysed. The expected vocabulary size at 95% is 5,010.

Summary and Another Mousquetaire

Based on my simple measure, the FLC book comes out in front as the easiest, but there’s not much in it. I might add that the original only scored 12.3 in a short sample from the introduction, due to many low frequency cognates giving it an expected vocabulary size of only 5,543. All three books are simpler than the original in some way, and based on my reading, they are all engaging. All are a similar length in terms of story at 10-16 thousand words. The readability score of the original does highlight a potential problem that happens in graded readers that are not catering for a specific language pair. A different, higher general frequency word may replace a low frequency French-English cognate, making the resulting text harder, not easier for someone with an English-speaking background (although it can help an English-speaker get more experience of those higher frequency words). I have noticed in the past that some vocabulary support is more difficult than the word being defined, due to its similarity to an English word.

Both the CLE and FLC are very much reading-focused, whereas CIDEB tends to turn each story into a bunch of lessons, which, while often interesting and useful, get in the way of reading the story. Depending on my mood, I’ll either skip all the extra material and read the story, potentially going back to read the extra things later, or I’ll look through the extra material as I go. The one activity that I do find useful and make use of in CIDEB books is the pre-chapter vocabulary activities. My favourites are the mix and match vocabulary to images. This is great for people with different vocabulary backgrounds, as there will be some words you know, and others that you can puzzle through, based on the images and the other words to be matched. The activities help increase familiarity with the words, which appear in the chapter to follow.

I have another musketeer story in my collection, not written by Dumas. It is L’autre mousquetaire by Rupert Besley, illustrated by Bob Moulder. This is from the Mary Glasgow Bibliobus set of books, which are no longer in print. It is a short comic about Pathos, a musketeer wannabe. I’ve noticed several phrases in it that come up frequently in the original book, as determined by my bootstrapping algorithm: parbleu ! bah ! diable ! morbleu ! pardieu ! These words don’t come up so much in the adapted versions, since they are not “useful” language and are mostly dated, but they are an essential part of the character of the original novel. Something to look forward to if you decide to tackle it.


Bootstrapping the Three Musketeers

Those who have visited my blog this year will know that I have put up some “filtered French”, such as a list of the most common one-word sentences in French classic literature, and sentences that fit the highly constrained vocabulary of my comic books. After musing on language acquisition, in particular how babies learn, not to mention our experience of picking up a few words and phrases in a foreign language by ear, I thought I’d try a different approach. This has resulted in producing a book (with more volumes to come) where I filter Les trois mousquetaires, and add vocabulary one word at a time based on which word will complete the most sentences. Using a combination of manual and automatic filtering, I have created extracts that have sufficient repetition in their vocabulary for people to become familiar with the words.

It has been fascinating to see what happens as I add each new word. The algorithm tends to find dialogue first, gradually increasing in average sentence length, then short non-dialogue sentences – after the 93rd word of vocabulary was added.
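The core idea of adding one word at a time can be sketched as a greedy loop in Python. This is only an illustration, not the code behind the book, which also involved manual filtering. It assumes sentences are already tokenised, that a sentence counts as "completed" once every one of its words is known, and that when no single word completes a sentence the most common missing word is added instead. The fallback and the alphabetical tie-breaking are assumptions of this sketch, and the sample sentences are invented.

```python
from collections import Counter


def bootstrap_order(sentences, max_words=None):
    """Greedy sketch: repeatedly add the word that completes the most sentences."""
    remaining = [frozenset(s) for s in sentences if s]
    known = set()
    order = []
    while remaining and (max_words is None or len(order) < max_words):
        # For each candidate word, count the sentences that are missing only that word.
        completes = Counter()
        for words in remaining:
            missing = words - known
            if len(missing) == 1:
                completes[next(iter(missing))] += 1
        # Assumed fallback: if nothing can be completed, use the most common missing word.
        scores = completes or Counter(w for words in remaining for w in words - known)
        # Highest count wins; ties are broken alphabetically so the result is repeatable.
        best = min(scores, key=lambda w: (-scores[w], w))
        known.add(best)
        order.append(best)
        # Drop the sentences that are now fully readable.
        remaining = [words for words in remaining if not words <= known]
    return order


# Invented sample sentences, already split into words.
sample = [["diable"], ["diable"], ["ah"], ["ah", "diable"], ["oui", "monsieur"]]
print(bootstrap_order(sample))  # ['diable', 'ah', 'monsieur', 'oui']
```

One-word sentences are the cheapest to complete, so a loop like this reaches short exclamations first and longer sentences only later.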

Anyway, if you’d like to have a look, it’s on Amazon, with a substantial preview.

(Affiliate links in this post.)

French Novels Recommended to Learners of French

At some point learners of French should tackle novels in French that are written for native speakers. However, they vary considerably in difficulty. I thought I’d keep a list of those I’ve seen recommended. At some point I’ll add a readability score, as modelled for learners of French with an English-speaking background.

Several others are listed at https://www.private-frenchlessons-paris.com/blog/10-books-for-french-learners.

Novels that are translations from English to French are often easier than those originally written in French, but it is definitely not always the case.

(This page includes affiliate links.)

The Book Flood Study

In 1983, Elley and Mangubhai published their influential study comparing the reading of high-interest stories with ordinary language instruction. They found considerable improvement in reading comprehension and other measures in the two reading-based groups compared to the language instruction group.

I’ve been reminded recently that the paper is behind a paywall, so I thought I would produce a few figures from it here and highlight some of the aspects of the study.

The study participants were primary school students in Fiji, who normally received instruction in their native Fijian for the first three years, switching to English in Class 4.

Here are the residual gains for each Class 4 group (300 students from 12 primary schools) and each type of assessment. The shared book group experienced the teacher reading aloud, sharing the story in an enlarged format, with students joining in to read easier sections, and doing story-related activities. The silent reading group read books of their own choice for 20-30 minutes a day. The control group did the normal curriculum (SPC/Tate audio-lingual program).

Another table showed that the gains a year later, continuing with the same reading activities, were even greater. Exam marks in other subjects, including maths, also improved.

The Second Easiest Series of Books in French

At last I’ve found them. The books that can be read after Gnomeville.

As those who have been following my blog or buying my comics know, my comics start from a vocabulary of zero French but an English speaking background. Episode 1 introduces twelve very frequent words (with over 300 words of text); Episode 2 adds the remaining eight of the top twenty words occurring in French newspapers (while giving over 700 words of text to read); Episode 3 adds nine more frequent words (with 1200 words of text) and Episode 4 adds ten more (in 1800 words of text). This makes a total of 39 frequently occurring words. In addition, the comic uses many French-English cognates to make entertaining stories.

While I’m sure that the books I’ve found don’t restrict themselves to frequent words, they do start with a very small vocabulary and include repetition to allow the vocabulary to be acquired easily. The book with the smallest vocabulary, at 55 words in an illustrated text of 2100 words, is Edi l’éléphant. From there you can go to Les abeilles exploratrice at 88 words, then Émeraude, le bébé tortue, at 90 words. After that come Brandon Brown dit la vérité (95), Brandon Brown veut un chien (104), Brandon Brown à la conquête de Québec (165), and Obsession dangereuse (200). Some of the “Novice Mid” books have smaller vocabularies than these but use past tense.

I’ve now had a chance to look at a couple of sample chapters of two of the books. I can say there is definitely a narrative, but the small vocabulary spread over many words of text means that there is quite a bit of repetition. This is great for acquiring vocabulary, but if you already have this vocabulary, you will probably want to choose something a little more challenging.