
A tale of three French picture books: passé simple is not that hard!

One of the weird things about studying French is that we seem to have three levels:

  • Beginners use present tense, imperatives, infinitives, and futur proche;
  • Intermediate learners use passé composé, imparfait, future and conditional tenses;
  • Advanced learners use passé simple and subjonctif.

Yet, if we look at picture books written for French children, many use passé simple straight off.

I remember when I started reading (in English) in Grade 1 of primary school, one thing I had to get used to was constructs like “said Dora”. It doesn’t happen in spoken English, so it felt a little weird. But it wasn’t overly difficult. Perhaps people from English-speaking backgrounds who had stories read to them would already have been familiar with it before reading it themselves. The same thing must be true for French children reading or hearing passé simple. It’s a little different but not hard.

I recently read three French picture books. The first (Le Grand Antonio by Élise Gravel) was a fairly easy one with few words, written in present tense. The second (Quel est mon superpouvoir? by Aviaq Johnston) was a translation from English, written in passé simple (and imparfait). It was a comfortable read for me. The third (Dounia by Marya Zarif) was (mostly) written in present tense but was more difficult due to its vocabulary and more descriptive text. It is obvious to me that it is possible for texts in passé simple to be easier than those in the easiest tenses.

The thing is, you don’t need to know how to conjugate passé simple to read it. You just need to recognise the endings of third person singular (3ps) and plural (3pp) for regular verbs plus know a few of the irregular verbs. Here they are.

For -er verbs, 3ps ends in -a and 3pp ends in -èrent.
For -ir and -re verbs, 3ps ends in -it and 3pp in -irent.
You may come across a few -oir verbs, which have -ut and -urent.

The main irregular verbs to watch out for are:

être: fut, furent
faire: fit, firent
avoir: eut, eurent

The regular ones should not pose any problems. The avoir ones are recognisable thanks to already knowing the past participle of avoir (eu). The main difficulty is not mixing up the être and faire words. A simple rule is that faire has an ‘i’ in it, and so does its passé simple conjugation.
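For readers who like to see rules written down as code, here is a minimal Python sketch of the recognition rules above. It is only an illustration: the function name `guess_passe_simple` and the labels are made up for this example, and matching on endings alone will produce false positives (for instance, "finit" is also the present tense, and plenty of nouns end in "-it" or "-a").

```python
# Irregular third-person forms listed above, mapped to their infinitives.
IRREGULAR = {
    "fut": "être", "furent": "être",
    "fit": "faire", "firent": "faire",
    "eut": "avoir", "eurent": "avoir",
}

# Endings from the rules above; plural endings come first so that the
# longer ending is matched before the shorter singular one.
ENDINGS = [
    ("èrent", "3pp of an -er verb"),
    ("irent", "3pp of an -ir or -re verb"),
    ("urent", "3pp of an -oir verb"),
    ("it", "3ps of an -ir or -re verb"),
    ("ut", "3ps of an -oir verb"),
    ("a", "3ps of an -er verb"),
]


def guess_passe_simple(word):
    """Return a hint if `word` looks like a third-person passé simple form.

    This is a reading aid based on endings only, not a conjugator:
    it returns None when no rule matches and may over-match.
    """
    w = word.lower()
    if w in IRREGULAR:
        return f"passé simple of {IRREGULAR[w]}"
    for ending, label in ENDINGS:
        # Require at least one letter of stem before the ending.
        if w.endswith(ending) and len(w) > len(ending):
            return f"possibly passé simple, {label}"
    return None


for w in ["regarda", "regardèrent", "voulut", "fut", "firent", "chat"]:
    print(w, "->", guess_passe_simple(w))
```

In practice context does most of the work: if the word follows a name or "il", "elle", "ils", "elles" in a story told in the past, the guess is usually right.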

I hope that helps. It helps me.

Beginner French Resources

tldr: Easy French sentences from classics here.

Years ago I was tinkering with creating my beginner comic book in French, and researching what makes text easy to read in French for people with English-speaking backgrounds. I learnt that the two main aspects that characterise text difficulty are grammar and vocabulary, with other aspects usually playing a much smaller role. Through my own research, inspired by my own frustration and anecdotal experience, I learnt that for French the typical readability measures, which use word length or even word frequency as a proxy for vocabulary difficulty, just don’t work for people with English-speaking backgrounds. This is because so many of the longer “difficult” words in French are identical to those in English, or close enough not to matter. My experiment demonstrated that you may as well just use sentence length, the simplest measure of grammatical complexity, to decide on difficulty. Despite this, vocabulary matters. It’s just that the difficult words are distributed differently than in languages that don’t have this peculiar French-English relationship.

In another of my experiments, I tried to filter a large collection of French text to find extracts that are easy for English speakers. While the extracts that are very easy are not long, they do exist. It’s a matter of playing around with the constraints to get something sizeable. It should also be noted that the text I used consists of French classics, which can be challenging to read. Anyway, it’s been a while since I looked at this. The other day I created a page on this site that contains all the sentences and extracts I found that restrict themselves to the vocabulary and grammar of Episode 1 of my comic book (le, la, les, de, du, des, et, est, se, que, and present tense third person singular of -er verbs) plus cognates and names. I hope it is useful. More to come.

Comic Books versus Text-Only Books for Language Learning

Recently I have been reading a few comics in French, mainly by French-Canadian authors, or translated by them. The target audience for most of them is children and young adults. It had me thinking again about how best to grade comics in terms of difficulty.

My experience in attempting to read various Japanese books for children or learners showed me that it is possible to read a picture book that is really just an illustrated vocabulary without knowing any of the words beforehand. At the other extreme, it is theoretically possible to read everything in a parallel text, since the translation is right there to refer to, just very slow if every sentence needs to be analysed. That is known as “intensive reading”, which has been shown to be less useful than “extensive reading” for language acquisition. Complete glosses similarly make it possible to read a text without prior knowledge of the language, albeit with lots of interruptions to look things up.

Translations and glosses aside, a comic book will be easier than its text presented without illustration, since the illustrations provide clues to what is happening. It is also easier than text that describes in words the scenes the illustrations show – a point that was made elsewhere in favour of learning language from comic books. In other words, “a picture paints a thousand words”.

In general, there is more dialogue and less descriptive text in comics than in novels, so the sentences are shorter on average. (This also applies to scripts of plays.) In addition, the pictures give clues as to what the text is about. A further benefit is that comics often provide more examples of speech than would be found in a novel – or at least, more as a proportion of the text read. This can be useful for absorbing speech patterns, particularly for people who are not exposed to much speech directly.

While the shorter average sentence length means that comic book text will generally be scored as easier than text from novels by readability measures, I think that a measure of difficulty of a comic may need to consider whether concrete nouns are illustrated when used. For example, a picture containing a wild boar with the text clearly indicating that it is “un sanglier” could be almost as easy as reading a French-English cognate, such as “village”. Or perhaps it is roughly equivalent to having a gloss entry, albeit introduced in the story instead of in a footnote.

Either way, comic books should be easier to read than books that have no illustrations. See my list of easy comic books in French for some that are a good starting point for beginners.

Review: Kill the French

Today I came across the book Kill the French by Vincent Serrano Guerra in a list of recommendations on Amazon and thought I would have a look. It appears to follow similar principles to others that do strict vocabulary control, pioneered by Michael West in the early years of the 20th century: restrict to cognates, introduce frequently occurring words first, include repetition, and slowly build up the assumed vocabulary. The author has also followed the principle of spaced repetition with the goal that readers will retain vocabulary at optimum levels. So how does it compare to other books and comics that do the same thing? Let’s have a look.

I have analysed approximately the first 100 words, which covers the Day 1 text and the title of the Day 2 text. According to Style, it has an average sentence length of 8.8 words and an average word length of 4.3 characters. Word lengths don’t really tell us much for French, since longer words are often easy for those with an English-speaking background. Sentence lengths do, however, have a stronger impact on readability.
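To make the two measures concrete, here is a rough Python sketch of how average sentence length and average word length can be computed. It is not how Style does it internally; the tokenisation (runs of letters count as words, so "l’ami" counts as two) and the sample sentence are assumptions made up for illustration.

```python
import re

# A "word" here is any run of letters, accented letters included.
WORD = re.compile(r"[^\W\d_]+")


def sentence_stats(text):
    """Return (average words per sentence, average characters per word)."""
    # Split on sentence-final punctuation and keep pieces that contain a word.
    sentences = [s for s in re.split(r"[.!?]+", text) if WORD.search(s)]
    words = WORD.findall(text)
    avg_sentence_len = len(words) / len(sentences)
    avg_word_len = sum(len(w) for w in words) / len(words)
    return avg_sentence_len, avg_word_len


# Invented sample, not taken from any of the books discussed.
sample = "Le lion est dans le musée. Il regarde la table !"
avg_sentence_len, avg_word_len = sentence_stats(sample)
print(avg_sentence_len, avg_word_len)
```

Different tools split sentences and words slightly differently, so figures from this sketch will not necessarily match those reported by Style on the same text.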

Other stats on the sample: vocabulary is 45 words out of 95 words of text, making a vocabulary density (type-token ratio) of 0.47. Naturally the author has made heavy use of cognates. Some of these are exact cognates, such as “lion”, and in other cases they are more challenging without context, such as “musée”. If we assume that all cognates are known, then the assumed vocabulary size for 95% coverage is 41 (when words are ranked in general frequency order), which is an excellent achievement. The only books in my collection that achieve that level or better are:

Rank  Title                                            Required Vocabulary Size for 95% Coverage
1     Gnomeville 2: Les pythons et les potions         16
2     Gnomeville 1: Introductions                      25
3     Longman’s Modern French Course Part 1            35
4     Gnomeville 3: Les six protections de la potion   40
5     Kill the French                                  41

So from the perspective of readability in French for people with an English-speaking background, I put it at the same level as Gnomeville 3 initially, as they both have similar sentence lengths as well as vocabulary coverage.
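The two vocabulary statistics used in this review, type-token ratio and the vocabulary size needed for 95% coverage, can be sketched in Python as follows. This is one plausible reading of the description, not the exact procedure behind the table: cognates are treated as already known, the remaining words are added in frequency-rank order until 95% of the running words are covered, and the sample text, `COGNATES` set and `RANKED` list are all invented for the example.

```python
import re
from collections import Counter


def tokens(text):
    """Lower-cased runs of letters; a deliberately simple tokeniser."""
    return [w.lower() for w in re.findall(r"[^\W\d_]+", text)]


def type_token_ratio(toks):
    """Distinct words divided by running words."""
    return len(set(toks)) / len(toks)


def vocab_for_coverage(toks, ranked_words, known=frozenset(), target=0.95):
    """Smallest N such that `known` plus the top-N ranked words cover
    at least `target` of the tokens. Returns None if never reached.
    Assumes `ranked_words` has no duplicates."""
    counts = Counter(toks)
    covered = sum(c for w, c in counts.items() if w in known)
    needed = target * len(toks)
    if covered >= needed:
        return 0
    for n, word in enumerate(ranked_words, start=1):
        if word not in known:
            covered += counts.get(word, 0)
        if covered >= needed:
            return n
    return None


# Invented data for illustration only.
toks = tokens("Le lion est dans le village et le lion regarde la table.")
COGNATES = {"lion", "village", "table"}
RANKED = ["le", "de", "la", "et", "est", "dans", "regarde"]

print(type_token_ratio(toks))                            # 0.75
print(vocab_for_coverage(toks, RANKED, known=COGNATES))  # 7
```

For comparison, the figure quoted for the Kill the French sample follows the same ratio: 45 distinct words out of 95 gives 45 / 95 ≈ 0.47.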

Unfortunately, like many graded readers out there, the text of Kill the French is quite dull. I checked the 18th day in the sample to see if it was more interesting, having gained extra vocabulary. Sadly, no. I can’t comment on the final stories in the book, which may be more interesting, since I have only examined the sample.

So, here is my conclusion. If you are an absolute beginner in French and are a huge fan of spaced repetition-based learning and willing to put up with texts that are mildly interesting at best, then this is an excellent graded reader for getting you to become familiar with the 500 most frequent French words efficiently. It certainly beats just memorising vocabulary in isolation. The Gnomeville comics may be more exciting and fun, but unfortunately they currently only take you to a frequent vocabulary of about 30, until the author gets cracking with the rest of the series. Perhaps the best approach at this stage is to use both together.

The first day of Kill the French uses frequent words that are introduced in Gnomeville Episodes 1 to 3. All except “avec” are introduced in the first two episodes. Day 2 introduces two words occurring in Episode 1, one from Episode 2, and one that doesn’t feature in the Gnomeville series yet, since it is far less frequent in text. Gnomeville’s first two episodes introduce the twenty most frequent words occurring in French newspapers, which is a slightly different frequency profile to spoken language, and somewhat different to other text corpora. Kill the French introduces words in an order that doesn’t resemble any specific corpus frequency list, but they are still frequent words. For example, the second day includes the word “aussi”, which ranks about 91 in movie vocabulary, 78 in books, and 79 in the Minnesota spoken corpus. It is still a frequent word, and I know from personal experience that being a bit flexible about the order of introduced words makes it easier to produce a coherent story.

Given that the order of word introduction varies enough that words will be introduced in one book and not the other, it doesn’t really matter too much which you read first. You could, for example, read Day 1, then reward yourself with Episode 1, then after Day 2, do the same with Episode 2. Day 3 is where the two texts diverge the most in terms of vocabulary, but there is still overlap. After that, you are stuck with Kill the French. But at some point you might be able to switch to Première Étape: Basic French Readings: Alternate Series by Otto Bond (published 1937), if you can locate a copy. According to my stats the expected vocabulary works out to 316, but it is another principled graded reader, using cognates, frequent words, and slowly adding new words as you read. It’s also an entertaining read. However, from memory, it does use more difficult tenses typically found in French literature right from the start, so can be challenging grammatically. The average sentence length is also quite long, making it potentially daunting.

In summary, I recommend using Kill the French in the following manner: for the first three days, read the day’s material and follow it with an episode of Gnomeville. After that, if you can keep going with the spaced repetition from Kill the French for about 100 days, you might then be able to start reading Première Étape: Basic French Readings: Alternate Series, which is interesting right from the start with an initial vocabulary of 97 frequent words, and Si Nous Lisions, which starts being interesting from Chapter 6 with a vocabulary of about 100 words. Best of luck!

Function word frustrations

I recently re-watched Dilili in Paris, which is a fabulous animated movie for children, with French dialogue that is slow enough for French language learners to follow. I originally watched the movie during the Melbourne French Film Festival and considered buying the movie later so I could try watching it without English subtitles.

Frustration 1: Memory

There is a frequently repeated phrase when Dilili meets new people: “Je suis heureuse de vous rencontrer”. It was semi-humorous, and certainly designed to be remembered, to teach how to be polite when meeting someone new. However, what I actually remembered after a week or two was: “Je suis heureuse __ vous rencontrer.” Despite being exposed to many occurrences, the function word was lost. Function words don’t provide semantic content and therefore appear to be harder to retain. There is certainly research evidence that concrete nouns are easier to remember than various other types of words. This movie brought that home to me in a big way.

Frustration 2: Resources

(Not really about function words…)

I bought the DVD of the movie, and then when viewing it, discovered that the subtitles could not be switched off, and that the only subtitles were in English. I don’t know who makes these decisions when preparing DVDs for sale, but perhaps they don’t really consider their audience carefully enough. A French movie sold in Australia would have various audience segments: French ex-pats – possibly including some French people who are hard of hearing, Australian francophiles, Australians learning French. To me, movies and TV episodes are highly useful for practising comprehension of the spoken language. Ideally it can be done at three levels of difficulty (where L2 is the language being learnt and L1 is the native language):

  1. L2 audio with L1 subtitles,
  2. L2 audio with L2 subtitles,
  3. L2 audio without subtitles.

I even do this with DVDs that were originally in English. I’ve watched two entire series of Perry Mason with French audio, which was quite illuminating. If you are short of practice material, check your DVD collection for audio in your target language. You may be pleasantly surprised to find a good selection amongst your favourite shows.

Frustration 3: Vocabulary Size

(Function words are frequent words…)

One of the excellent things about some graded readers was that they were designed for a specific vocabulary size. For me, vocabulary makes all the difference between a readable text and an unreadable one. CLE International used to publish books targeting a specific vocabulary size. For example, Niveau 1 had vocabularies of 400-700 words. Through extensive reading, I have successfully moved from 300-word vocabulary books to 700-1000 word ones, and I hope to continue to progress through further reading. However, as with other publishers, the publications have now been converted to CEFR levels: A1, A2 etc. and as far as I can tell, the subtleties of vocabulary size have been removed from the book information.

I have completed a CEFR B1 in French, yet I’m most comfortable reading A1 texts (and texts with vocabularies of fewer than 1000 words). With few exceptions, even these are only easy grammatically: the grammar is too easy for me, but the books are still sometimes challenging vocabulary-wise. What frustrates me is that A2 covers such a wide range of vocabularies, depending on the source material, from readable to incomprehensible. Published vocabulary sizes for A2, where they occur at all, vary from 400 to >1200 words. The level of frustration with some of these graded readers is the same as for texts written for native speakers. I oscillate between A1, A2, native texts and back again. The original memoirs of Céleste de Chabrillan are as easy as, and more exciting than, many A2 texts.

CEFR is designed, as far as I can tell, to describe a person’s practical skill in a language, and for that it is useful. However, the jumps between levels are quite large, so that the defined levels are not very useful for the learner themselves. Some publishers solve this by dividing up levels. ELI uses A0, A1, A1.1. The Danish Teen Readers/Easy Readers also divide up the levels, and still appear to quote target vocabulary sizes. Indie publishers tend to ignore vocabulary size in their writing. However, writers and publishers should remember that:

  1. Extensive reading is at its best if learners are reading at a comfortable level while not being familiar with all vocabulary. Ideally learners should know 98% of the words in text they are reading.
  2. Readability of text largely consists of grammar and vocabulary components.
  3. The more readable AND interesting reading material is, the more learners will read, the better their vocabularies will become, and the better their skill in a language will be.
  4. Publishing vocabulary levels required for 95-98% coverage of the text will assist learners in finding materials of the right level for them at any point. Vocabulary levels should be (loosely) based on general word frequency.

This is why I write my comic books for language learners. This is why I research extensive reading, readability and language acquisition.

Readability Zones

I’ve just been updating my database of French readers and observing the types of books or stories in the different ranges of my current preferred readability measure.

Scores under 4 are ridiculously easy for people with an English-speaking background. Currently this consists only of episodes 1 and 2 of my Gnomeville comics. Sentences are short and vocabulary is highly constrained, exploiting French-English cognates.

Scores in the 4-4.99 range are very easy: Bonjour Luc, A First French Reader by Whitmarsh, and Histoires pour les grands. They tend to be conversation-based.

Scores in 5-5.99 tend to be the short illustrated graded readers such as Bibliobus, as well as La Spiga’s Zazar for grands débutants (target vocabulary of 150). Gnomeville Episode 3 sits here due to having longer sentences compared to the first two episodes.

Scores in 6-6.99 tend to have longer sentences, including some classic graded readers such as Si nous lisions and Contes Dramatiques, as well as the 300 word vocabulary Teen Reader Catastrophe au Camping des Roses.

Scores 7-7.99 also have the more text-like graded readers, including Sept-d’un-Coup by Otto Bond, which tends to have long sentences but well-controlled vocabulary.

In the 8-8.99 range I find the first story for native speaking children, as well as more graded readers, including one with a target vocabulary of 1000 words.

The first books for adult native speakers occur with scores between 10 and 12.

Looking at the stories in the list, my own level seems to be from 7 to 10, suggesting I should continue reading more challenging graded readers in addition to stories written for French children. That is pretty much what I have been doing for a while, as well as incidental reading on the web and elsewhere.

A quick look at the relationship between stated vocabulary sizes and the 95% coverage vocabulary that I have been using indicates that the required vocabulary is roughly 1.5x + 2600, where x is the stated vocabulary size. However, I am using a token-based vocabulary whereas most would use a word-family one. If I assume token vocabulary sizes are 5 times word family sizes, then the equivalence point for this model is when the stated vocabulary is about 770, meaning that the vocabulary load will be excessive for stated vocabulary sizes below 770 but OK for sizes above 770. That’s reasonably reassuring. Mind you, this is an extremely rough estimate.
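The equivalence point is where the token-based equivalent of the stated vocabulary (5 times the word-family size) equals the required vocabulary from the linear model. A small Python sketch of that calculation follows; the function and parameter names are made up for illustration, and it uses the rounded coefficients quoted above.

```python
def equivalence_point(slope=1.5, intercept=2600, tokens_per_family=5):
    """Solve tokens_per_family * x == slope * x + intercept for x,
    where x is the stated (word-family) vocabulary size."""
    return intercept / (tokens_per_family - slope)


x = equivalence_point()
print(round(x))  # 743
```

With the rounded coefficients the crossover comes out at roughly 743 rather than 770. Presumably the gap comes from rounding in the quoted slope and intercept; either way it is the same ballpark, which fits an estimate described as extremely rough.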

This work was based on about 100 words from the start of the text of 40 stories, but it does seem to sort things fairly usefully. The outlier based on my experience of reading the stories is Aventure en Normandie, with a score of 9.49. I don’t recall it being a difficult read.

Meanwhile I am making more progress on Episode 3 of my comic book. I decided to divide one page into three pages, as it had a lot of text and too many new language concepts for a single page. So Episode 3 will probably be 32 pages long, breaking the standard Gnomeville pattern of 28-page episodes. Hopefully it will be ready within a month.

Recent French Reader Reads plus Errata

I succeeded in acquiring more classic French readers recently. One of my new favourites is Dantès from Otto Bond’s Basic French Readings alternate series. It is a simplified extract from Dumas’s Comte de Monte Cristo. The story starts with an assumed knowledge of 97 frequent words, much like Sept-d’un-coup by the same publisher, but succeeds in having a higher proportion of cognates, leading to an impressive expected vocabulary for 95% coverage of 316 (based on my word list). This makes it the lowest I’ve seen so far, apart from my own series.

However, the important thing is also whether it was an enjoyable read. I definitely got hooked on the story, and then all of a sudden the extract ends, and I’m left wanting to read the rest of the story. That can only be a good thing.

I was less captured by the remaining stories in the five-story volume, but still enjoyed most of them.

Regarding the match between publicised vocabulary sizes of graded readers and the reality of reading them, I can say from my cursory investigations that there is not always a good match between the two. Perhaps it averages out across the books, as I only take the first chunk of text for my comparisons, but if the first few paragraphs are too challenging, then a language learner may lose interest.

I’ve developed a new estimate of readability now, which is more complex than ones I’ve previously used and seems to match the foreign language learner’s experience reasonably well. Based on this, and my more recent acquisitions, I now recommend the following as first reads for French beginners with English-speaking background.

Young children: Luc et Sophie series, or Bonjour Berthe, which I find more entertaining. Le Petit Napoléon series is also quite good, and suitable for all ages, for those who like cats.

Older children: Gnomeville, Le Chapeau Rouge, select stories from Mary Glasgow’s Bibliobus, or Sue Finnie’s Lire Davantage.

Teenagers: I quite like the Teen Readers series. Catastrophe au Camping des Roses is rated as a vocabulary of 300 words, and my estimate has the 95% coverage vocabulary at 2421, which isn’t too bad. But Dantès mentioned above is easier vocabulary-wise.

Adults: Dantès is my current favourite as a first read. Becky Tucker’s Histoires pour les grandes appears to be easy, but I haven’t read enough to know whether it is interesting. I have yet to rate other ebooks.

However, the only stories that you can read immediately in French without having studied it are those in the Gnomeville series. There are some minor issues with it though, as have been brought to my attention recently. There are places where I have used “de la Fantasia” that should be “de Fantasia”. I was uncertain of the rule for this, but now I have discovered it. Mostly “de” is used with a country, but “de la” is used in expressions that have a temporal sense to them, such as “le gouvernement de la France” (since governments are not permanent), or if there is an adjective applied: l’Histoire de France but l’Histoire économique de la France. Very subtle indeed and I hope I can be forgiven for getting that subtlety wrong in my comic. I intend to make a second edition of Episode 2 at some point to rectify this. Another error in Episode 2 is the use of the verb “voyage” combined with “à” (“voyage à la Place des Roses”). Voyage doesn’t get used this way and “à” should probably be “vers” to communicate this idea. The sentence will be removed from the second edition.

In other news I attended the Applied Linguistics Association of Australia 2018 conference a couple of weeks ago. It was very inspiring, and also emphasised that the important thing with language acquisition is communication, not perfection. Perfection is unlikely to be achieved, but improvement is always possible. So let’s keep improving our language skills. Read, listen, write and speak. With practice comes improvement. Until next time.

A few more French graded reader book stats

Since my last graded reader update I’ve looked at a few more books, some of which are “classics”, in the sense they were from the “direct reading” era of the first half of the twentieth century, following the influence of Michael West’s constrained vocabulary for language teaching, the various word and idiom frequency lists created at the time, and the idea of readability. Some of these books I had already acquired earlier; but through reading some papers published at that time, I was able to compile a shopping list of other books written according to the same philosophy.

As a result, I have a new winner in terms of expected vocabulary size at the 95% threshold of reading comfort. A New French Reader by Ford and Hicks received a 95% vocabulary size of 3532, and Otto Bond’s Sept-d’un-Coup was a close second, with 3650. Bond’s book starts with a much smaller initial assumed vocabulary (97 words) than the Ford and Hicks book (523), so Bond’s book may be a better first read despite the slightly higher vocabulary score here. As seen in my first post on expected vocabulary size for 95% coverage, these are much higher scores than my Gnomeville comics, as my comics take readability criteria to the extreme.

So based on the current stats available on vocabulary, I recommend the following first graded readers for English speakers learning French.

For 6-9 year olds: Bonjour Berthe.

For 10+: Gnomeville

For adults who don’t like fantasy comics: Sept-d’un-Coup by Otto Bond – though I think there are some errors in it, and it’s out of print (and it probably counts as fantasy…).

Stay tuned for further updates.

Mots fréquents français

I recently came across a new word frequency list for French words, which I’m placing here partly for my own benefit. This one is like some others that combine all conjugations of a verb together, which is not helpful for all applications. Typically present tense is much easier than other less frequently used tenses, particularly for irregular verbs.

Anyway, the list is still useful. It was created by Étienne Brunet, a statistical linguist, based on a corpus of written French.

Here are the top 20 words. Interestingly, compared to the newspaper corpus list I used for designing Episodes 1 and 2 of my comic, this corpus has first person singular (je) occurring much more frequently, as well as “have” (avoir). “ce”, “son” and “elle” also occur in this list higher than “au”, and were not in the newspaper list. “avoir” may be higher because of all conjugations of it being grouped together.

1050561 le (dét.)
862100 de (prép.)
419564 un (dét.)
351960 être (verbe)
362093 et (conj.)
293083 à (prép.)
270395 il (pron.)
248488 avoir (verbe)
186755 ne (adv.)
184186 je (pron.)
181161 son (dét.)
176161 que (conj.)
168684 se (pron.)
148392 qui (pron.)
141389 ce (dét.)
139185 dans (prép.)
143565 en (prép.)
127384 du (dét.)
126397 elle (pron.)
123502 au (dét.)

List of frequent words in French.

Extensive Reading Musings

I’ve been reading some more research on extensive reading and readability lately. One paper showed gains in reading rate, vocabulary and comprehension with students reading about 150K words over 15 weeks at an intermediate level. This was contrasted with another study where learners read ~65K words over 28 weeks and failed to show improvement. I think there is probably a threshold of some kind where you need to read a certain amount per week to improve language skill. The amount probably varies with the level of skill you already have. Someone still improving their knowledge of the most frequent 400 words of the language will not need to read as much to achieve vocabulary gain (assuming appropriate graded readers) as someone reading at the 2000 word level. The study that showed gains had students reading with vocabularies of 800+.

Given the 10K words per week guide, and the typical reading rate in foreign languages often being around 150 words per minute, that equates to about an hour of reading per week, or 10 minutes a day. That’s not a bad aim for maintaining and hopefully improving your language skills.
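The arithmetic behind these figures is easy to check. Here is a short Python sketch using only the numbers quoted above; the function names are invented for the example.

```python
def words_per_week(total_words, weeks):
    """Average weekly reading volume."""
    return total_words / weeks


def reading_minutes(words, words_per_minute=150):
    """Time needed to read `words` at a given reading rate."""
    return words / words_per_minute


successful = words_per_week(150_000, 15)      # study that showed gains
unsuccessful = words_per_week(65_000, 28)     # study that showed no gains
weekly_minutes = reading_minutes(successful)  # at 150 words per minute
daily_minutes = weekly_minutes / 7

print(successful)             # 10000.0 words per week
print(round(unsuccessful))    # 2321 words per week
print(round(weekly_minutes))  # 67 minutes per week
print(round(daily_minutes, 1))  # 9.5 minutes per day
```

So the study that showed gains had students reading at about four times the weekly volume of the one that did not, and 10K words per week works out to a little over an hour of reading.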