Tag Archives: language learning

Vocabulary Needed for 95% Coverage

I’ve been tinkering with ways of comparing different easy readers for language learners. In previous posts I’ve used a type-token ratio, or vocabulary density, which gives some idea of how likely it is that you will learn new words from a text through repetition. But for something to be readable, the general consensus is that you need to know at least 95% of the words that you read. This is the level that allows people to guess the meaning of the words they don’t know.

So something I’ve been messing with recently is predicting the general vocabulary size needed for different beginner stories in French, assuming people know all cognates and all proper nouns. I’ve only been working with short samples of text so far, and there are many other assumptions and issues that make it not a perfect comparison – including bugs in my code…

Given a small set of extracts, and assuming you don’t learn the words via their introduction one at a time, as in my comic books, we have the following:

Title                    Vocab Size
Gnomeville Episode 1             25
Gnomeville Episode 2             25
Gnomeville Episode 3             40
Bonjour Berthe                 4179
Easy French Reader             5008
Martine à la ferme            11854
Bonjour Luc                    6163

Note that this vocabulary size treats each conjugation of a verb as a separate vocabulary item, as well as plurals and so on, so the figures will be much larger than the word-family figures normally used.
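For readers who want to try something similar, here is a minimal Python sketch of one way such a figure could be estimated. It is not the code behind the table above: the function name, the `freq_rank` dictionary (word form to rank in a general frequency list) and the toy ranks for “le” and “de” are assumptions made for illustration; only the rank for “maman” comes from this post. Cognates and proper nouns are passed in as already known.

```python
import math


def vocab_size_for_coverage(tokens, freq_rank, known=frozenset(), target_percent=95):
    """Smallest frequency-list rank N such that knowing the N most frequent
    word forms, plus everything in `known`, covers target_percent of tokens.

    tokens: list of lowercased word forms from the text sample.
    freq_rank: dict mapping a word form to its 1-based rank (hypothetical list).
    known: word forms assumed known already (cognates, proper nouns).
    Returns None if the target cannot be reached with the given list.
    """
    if not tokens:
        return 0
    # Rank 0 for words known for free; infinity for words missing from the list.
    ranks = sorted(
        0 if tok in known else freq_rank.get(tok, math.inf) for tok in tokens
    )
    # Number of tokens that must be understood, rounded up (integer arithmetic).
    needed = max(1, (target_percent * len(tokens) + 99) // 100)
    rank = ranks[needed - 1]  # rank of the rarest word that still has to be known
    return None if rank == math.inf else rank


# Toy sample of 20 tokens: the ranks for "le" and "de" are illustrative only.
sample = ["le"] * 10 + ["de"] * 5 + ["dragon"] * 3 + ["maman"] * 2
ranks = {"le": 1, "de": 2, "maman": 6163}
print(vocab_size_for_coverage(sample, ranks, known={"dragon"}))  # 6163
print(vocab_size_for_coverage(sample, ranks, known={"dragon"}, target_percent=90))  # 2
```

In this toy sample the two occurrences of “maman” make up 10% of the tokens, so 95% coverage is only reached once the list extends to its rank. A single conversational word can inflate the figure in this way.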

You can see that the one text written for native French-speaking children (Martine) has a much richer vocabulary than the texts written for language learners. The figures for the learner texts look worse than they really are, because many words that are typically taught early to allow conversation sit much lower on word frequency lists. For example, “maman” was at rank 6163 in my list. In contrast, my Gnomeville comics are designed to prioritise frequent words and cognates to optimally improve reading, at the expense of conversation. Hence the very small vocabulary sizes required.

Recently I’ve been reading a 1939 paper by Tharp that looked at measuring vocabulary difficulty. He appears to have had similar ideas about measuring vocabulary load based on the general frequency of the words, as well as a measure of the density of difficult words. I also recently acquired yet another very early graded reader, “Si nous lisions”, from 1930, which attempted to introduce new words every ~60 running words, in the style of Michael West, who seems to have been the first to use the approach. However, I have a graded reader published in 1909 in my collection, which was intended for “rapid reading”, and was part of a series that commenced with short easy texts. I’m not sure if they methodically introduced words at specific intervals as was done by West and others following his example.

In searches on-line, I found a French adapted reader from 1790, so we’ve been at it for quite a while. I’d like to say we know more about how to write graded readers these days, but I think West had it fairly right. The only thing we can do now is make them more interesting and relevant.

Here’s one from 1800 published for those with a German background. There seem to be quite a few published in the 1800s.

Anyway, I’ll finish off here with the usual things: we need 95% coverage to read comfortably (on average). To do that with native texts requires quite a large vocabulary. But vocabulary increases as you read more. So we should read as much as possible at the level that is right for us and of reading material that interests and motivates us. My Gnomeville comics are ideal first readers in French for those with an English language background and a good vocabulary in English. The Berthe and Luc et Sophie series are reasonable alternatives for children that are possibly too young for Gnomeville, as are the ELI A0 series. Until next time…

 


Luc et Sophie – a review

In my recent exploration of graded readers intended for children, I found the Luc et Sophie series. I have the première partie, and read through all 14 booklets.

Each booklet has 6 pages of story, a page of vocabulary, and a colouring in page with blank speech bubbles. The text is entirely conversation, shown in speech bubbles. The booklets are neatly presented in full colour, with a consistent style across the series.

The first booklet, “Bonjour”, has ~33 words (tokens) and ~20 different words (types). The average sentence length is 2.2 words (according to “style”). The last (14th) booklet, “Où est ma trousse?”, has 71 tokens and 37 types, and an average sentence length of 7.3 words. The low type-token ratios (61% and 52% respectively) provide sufficient repetition for language acquisition, and with such a large set of booklets the series can provide good extensive reading practice in the early stages.
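As a rough illustration of how a type-token ratio can be computed, here is a small Python sketch. The tokeniser is a simplifying assumption (it just takes runs of letters, lowercased) and the sample sentence is made up rather than quoted from the booklets, so counts from it may differ slightly from the ones reported above.

```python
import re


def tokenize(text):
    """Lowercase the text and return runs of letters as word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(text):
    """Number of distinct word forms (types) divided by running words (tokens)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Hypothetical speech-bubble text, not quoted from the booklets.
print(type_token_ratio("Bonjour Luc ! Bonjour Sophie !"))  # 0.75

# The percentages quoted above, recomputed from the reported counts.
print(round(100 * 20 / 33))  # 61
print(round(100 * 37 / 71))  # 52
```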

The stories centre around a brother and sister who are 7/8 and 6 years old respectively. The brother is annoying. The punch-line of the stories is usually something to do with the annoying brother.

I find the series generally annoying – perhaps it is reminding me of my own childhood and sibling issues. The artwork bugs me, but I’m not sure why. While it’s a comprehensive series, it is too narrow in style and theme for it to be the only books for children to read. I prefer the Berthe witch series (admittedly based on a sample of one book), but that could just be my preference for a touch of the magical and the unusual in stories. It would be best to have the class library contain a variety of stories to cater to different tastes – Luc et Sophie for the realists and Berthe for the dreamers, and hopefully other stories for yet other children. Gnomeville might fit into such a library, but may be a bit complex for the very young, due to the difficult French-English cognates (eg. se matérialise, utilise, vulnérable) in it. It seems to suit 11-year-olds well enough.

Using Martine by Marlier for French Extensive Reading Practice

The Martine series was recently recommended to me for children learning French. I managed to purchase a couple of books from the series from FNAC. My review is of course biased by my own preferences in reading (and writing), and clearly I am not in the 5-12 age range for whom they were recommended, but hopefully it will be useful nevertheless.

I read Martine à la ferme, one of about 60 books in the series, which tells the adventures of Martine, a young girl. This particular book is about Martine visiting a farm with her friend Lucie.

From a story perspective, there is no driving narrative. It’s just a bunch of twee pastoral scenes with text. It is beautifully presented, and for children who love animals and dream of interacting with them, it may be an enjoyable experience. I found it dull, however.

From a language perspective, the series can be quite useful. It is authentic French in present tense, so great for learners to get reading practice without getting bogged down in passé simple. Plus, with 60 volumes to go through, that’s a good amount of practice at the level of the books – if you enjoy the genre.

There are 18 pages of illustrated text to read in the book, with about 60 words per page, making approximately 1000 words per book. The vocabulary and language appear to be sufficiently generic to be useful, and easier than other French children’s books I have seen in that regard. Sentences are fairly straightforward, and rarely longer than 15 words in length.

Vocabulary will be the main difficulty for foreign language learners. A sample of the first ~130 words had a vocabulary of 94 (including names and apostrophe’d words as separate words), making a vocabulary density of ~72% (unique words divided by total words). To put this into context, here are some vocabulary densities on the first ~100 words of other texts.

Consuelo                                   76%
Le Petit Prince                            74%
Minnesota spoken corpus                    68%
Gnomeville Episode 3 (not yet released)    58%
The French Bible                           52%
Gnomeville Episode 2                       46%
Gnomeville Episode 1                       43%

Basically, any normal native French text is likely to have a vocabulary density of about 75% in a sample of ~100. (The density typically drops a little as the length of the text sample increases.) Conversation (eg. Minnesota corpus) seems to be lower, and translations may also be lower. To get lower than that requires stories that are intentionally written with a small vocabulary, such as the Gnomeville comics listed above, and some Dr Seuss stories (in English) – especially Green Eggs and Ham.
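To see how the density changes with sample length, one can measure it on successively longer prefixes of the same text. The following Python sketch is one way to do that, under stated assumptions: the function name and the default sample sizes are illustrative, and the tokeniser splits on apostrophes and punctuation, which may not match exactly how the figures above were counted.

```python
import re


def density_by_sample_size(text, sizes=(100, 200, 500)):
    """Vocabulary density (unique words / total words) of the first n words,
    for each n in `sizes` that the text is long enough to supply."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return {
        n: len(set(tokens[:n])) / n
        for n in sizes
        if 0 < n <= len(tokens)
    }


# Hypothetical ten-word sample: 3 unique words in the first 4, 5 in all 10.
toy = "a b c a b d a b c e"
print(density_by_sample_size(toy, sizes=(4, 10)))  # {4: 0.75, 10: 0.5}
```

Comparing texts at the same sample size matters for exactly this reason: a longer sample of the same text will usually show a lower density.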

So, in summary, if you are after authentic French text that has easy grammar, then the Martine series will be very useful for those who enjoy the genre. The books are also fairly short, allowing children to feel a sense of achievement in finishing them sooner than for a longer work like Le Petit Prince. Personally I would prefer to read more books that are specifically written for language learners until my vocabulary was large enough to read books that are more entertaining. The J’Aime Lire series of books for French children is much more entertaining and written for the 7-11 age group. The difficulty of the text does vary quite a lot though, depending on the author, so expect to occasionally struggle or skip stories. My current recommended sequence for primary-aged children is:

  1. Gnomeville series (for English-speaking background only)
  2. Mary Glasgow series (English-speaking background)
  3. EMC’s À l’aventure! Readers (English-speaking background)
  4. Aquila’s readers (English-speaking)
  5. CLE International’s Collection Découverte
  6. La Spiga Grand Débutant series (150 word vocabulary)
  7. ELI for children
  8. Martine or J’Aime Lire books

This is not a strict reading sequence, since the various series overlap in levels of difficulty (except Gnomeville). There are other series out there, such as CIDEB and Edition Maison des Langues. There are more books for adolescents, such as Teen Readers, and the adolescent FLE series by Hachette.

I will publish more detailed, up-to-date lists as I become aware of more books and series. Stay tuned.

 

 

Two Great Language Acquisition Resources

German Extensive Reading Stories

I’ve recently been reading the ebooks by André Klein. He provides a collection of reading material in authentic German for beginners and intermediate learners, with comprehensive glosses of expressions used in the stories.

I have now finished reading the first four in his Dino lernt Deutsch series for beginners. I think they are a fantastic resource for German learners. Now, as a caveat, I must say that I’m a false beginner for German, since my Dutch background makes German pretty easy to understand, so I can’t comment on how it reads for someone of purely English-speaking background or other language backgrounds.

Each book consists of ten short stories, but all follow the adventures of Dino, as he lives in and travels through Germany. I’m guessing the format is as it is, so that when commencing reading, the learner can feel that they have achieved something by reading one short story.

The stories are not greatly dumbed down, in that there are smatterings of dialect (also translated), which somewhat increase the load on the learner. As I’m generally interested in language, I find this quite interesting. However, it may make the stories somewhat more challenging for the beginner.

He also has a couple of picture story books. These have little text and the language is not necessarily easier in terms of vocabulary load and grammar. They are, however, pleasant reads. I think the language is somewhat more constrained in the Dino series, so they are probably more useful for extensive reading.

André Klein’s books are found on Amazon, Smashwords and elsewhere. He also has more language resources on his website.

French Listening Resources

I’ve been floating around some Facebook groups about languages lately. One provided a link to the series Extra French. I hadn’t come across this before, but for me it is entertaining, in sitcom format, and simple enough to follow. It’s probably a good place to start for listening practice, other than the practice of listening to stories while following the written text.

Another resource I’ve heard about recently is related to specific CEFR levels. Find French oral comprehension activities there. Good practice for those wanting to sit DELF/DALF exams.

Episode 2 Launch Tomorrow!

I’m launching Episode 2 of my French language comic for beginners in French who have English as a first (or accomplished) language. The launch happens three years after Episode 1’s launch, and both are associated with concerts of my choir. I use appropriate choir concerts as a deadline for me to push myself to complete things. It works for me, although it does wreak havoc with my health in the short term. Last time it was a concert of music from France. This time it’s a fantasy-themed concert featuring a dragon.

The concert is The Quest, an entertaining night of music interspersed with a fantasy narrative involving a dragon. Music from my second (La Potion des Pythons) and third (La Mission) comic books will be featured in the concert. The song La Mission is also available on my third album On the Rocks.

Episode 2 (and Episode 1) will be available on the night in large format comic book, which is roughly a standard comic book size. Episode 1 is also available as an ebook from Amazon, and I’m running a special countdown deal starting on the day of the concert (Thursday 1st June), so Thursday is the best day to get your copy of Episode 1 for US$0.99.

Episode 1 provides incidental repeated exposure to 12 of the most frequently occurring words in French, but also provides gloss support and explanations of the new word of the page at the bottom of the page. Episode 2 uses the remaining 8 of the 20 most frequently occurring words in French newspapers. All the rest of the words used in the story are French-English cognates, like “dragon”, or names, like “Jacques”. In Episode 2 the amount of text in the main story reaches a level at which it starts to be possible to guess the meaning of the new word of the page before checking the meaning provided in the gloss. This is considered optimal for vocabulary acquisition.
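The link between the amount of text on a page and the ability to guess can be made concrete with a little arithmetic. If a page has one new word that occurs once, it needs at least 20 running words for the rest to reach 95% coverage (19/20). Here is a minimal Python sketch of that check. The function, the word list and the sample page are all hypothetical and not taken from the comic.

```python
def page_coverage(page_tokens, known_words):
    """Return (coverage, new_words) for one page.

    coverage: share of running words on the page that are in known_words.
    new_words: sorted list of distinct word forms not in known_words.
    """
    if not page_tokens:
        return 1.0, []
    known_count = sum(1 for tok in page_tokens if tok in known_words)
    new_words = sorted({tok for tok in page_tokens if tok not in known_words})
    return known_count / len(page_tokens), new_words


# Hypothetical page: 19 known tokens plus a single new word, "et".
known = {"le", "dragon", "attaque", "village"}
page = ["le", "dragon", "attaque", "le", "village"] * 3 + [
    "le", "dragon", "et", "le", "village"
]
coverage, new_words = page_coverage(page, known)
print(len(page), coverage, new_words)  # 20 0.95 ['et']
```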

Have a look at the preview on Amazon and get ready to be entertained while reading the easiest French books you’ve seen. Then perhaps you’d like to read Episode 2.


Gnomeville comic book cover containing head of dragon with smoke billowing out of its mouth and the title "DRAGON!" in large red letters

The Evolution of a Language Comic Book

I sometimes reflect on the journey that led to me now having published a comic book in French as an eBook. Like many who studied languages in school, I realised that despite 5 years of French, I couldn’t understand a native speaker, and I struggled to read a book written for native speakers. This didn’t seem right. I had some reading books written for learners of French, but there still seemed to be a lot of vocabulary that I didn’t know.

I started tinkering with the idea of a story that would only use words that English speakers already knew, plus one new word per page. The Taxi story in my comic was the first of these ideas. The Gnomeville story came later.

In the early 2000s, I showed my scribbled draft to a near-native French speaker, who gave me excellent feedback on my French. In addition to picking up a few grammatical nuances, I learnt that I couldn’t trust my textbook or my French-English dictionary. Around this time I also showed the draft to an artist colleague who was an author of children’s books. She suggested inking over the sketches and sending it to a publisher. I started working on the artwork and really enjoyed it. While I did send to a couple of publishers, they were not interested, so I decided to self-publish.

Feedback from other colleagues and friends led to the language summary section at the back of the comic, as well as page numbers. All the while I was learning more about language acquisition via reading, through my research in the field of computer-assisted language learning and computational linguistics. I learnt that 95% coverage is needed for comprehension (guessing of unknown words), and that glosses help with vocabulary acquisition, as do images. A subtlety I learnt more recently is that the best strategy for vocabulary retention is to first guess the word, and then look at the meaning.

I attended the Alliance Française to improve my French language skills. I continued to work on the comic, and brought the latest draft along to one of my lessons and was pleased to hear chuckles from my classmates at some of the humour. A few more language issues were sorted out thanks to feedback from the teachers.

Initially I was producing an A4 draft. This switched to A5 at some point. Then later I had the peculiar idea to make the comic fit into a DVD case, so the comic and CD could be sold together in a protective case. This became the default format for the comic that was released in 2014, at my choir’s La Musique de France concert. Feedback from someone at le forum led me to create a large format, which is roughly the size of a typical American comic book. So I ended up with many formats: small, small + CD in DVD case, large, large + CD. Finally, I now have it as an eBook as well. For my next issue I will be sticking with large format (and eBook), and will make the audio a separate product. This reduces the number of ISBNs required, and the paperwork side of things.

Tonight I received feedback on my draft of the second issue, and the language side of things is in pretty good shape, so I’m hoping to have the full issue finalised by the end of May. I’m keen to have some progress after all these years. Also, the more episodes I do, the more useful it is for learners. I can see future learners reading one issue per day to get maximum benefit, or at least one issue per week.

Comprehensible Input

I came across this article recently while looking at on-line language learning groups and resources. Apparently there is friction between those who understand the research on language acquisition and those who believe in language lessons. If one tries to learn or memorise language, it uses a different mental process to that used for communication, and doesn’t contribute to communication skill in the language, which explains a lot about people’s frustration with language education.

One point raised in the article is that early stage language acquirers tend to focus on content words, and not absorb the surrounding function words. This agrees with the observation that it is often easier to remember concrete nouns than the words that connect them in sentences.

How does this apply to my comic book? Well, my comic book attempts to make the input as comprehensible as possible for the complete novice. Anecdotal evidence suggests that this works. It also attempts to be as engaging as possible. Having heard chuckles from students of French when reading earlier drafts, I’d say that it does achieve that goal. Also, a recent customer said the following: “BTW, my 11yo read your book and I saw him giggling”. This makes me happy, as I was advised to develop this as a children’s reading resource.

Gnomeville comic book cover containing head of dragon with smoke billowing out of its mouth and the title "DRAGON!" in large red letters

Episode 1 Progress

I’m very happy that people are starting to buy the ebook edition of my comic for learners of French with an English-speaking background. I’ve had purchases from the UK, Canada and the USA, as well as some Kindle Unlimited reads. It can be deflating when nobody buys the work you’ve put your heart and soul into, but then when they do, you are inspired to keep going with the vision.

On my Facebook page, I made a special offer for people who have read the comic book to write an Amazon review in order to receive a free copy of the narration by native French speaker Jeremy Marozeau. This offer expires at the end of the month.

It’s also not too late to receive all audio tracks by emailing the Amazon Kindle receipt showing purchase of Episode 1 to magnifica@gnomevillecomics.com.

Meanwhile, the first additional resource for Episode 1 is now available. I decided to focus on fashion and celebrity for this one. It’s mainly pictures and links to articles in French about “le total look”. I hope it’s useful for beginners in French with an interest in fashion.

Where’s the Quality?

As a conscientious writer with an academic background I tend to try very hard to write correctly in all my publications and communications. Obviously sometimes one is rushed or typing on a small smartphone, so a few typos get by the self-editing phase. Occasionally I’m surprised at myself that I have typed the wrong spelling for a word, such as “their” for “they’re”, when I know very well the correct word to use, but in my haste the wrong word came out of my fingers. This seems to happen even for parts of words, where I always mistype some words the first time because they contain a sequence of characters that occurs frequently that leads me to follow with an incorrect one. An example for me is words that end in “in”, which will often automatically get a “g” after them, which I then need to backspace.

Some people don’t care about editing, and so be it. However, there are some situations when I think it is our responsibility to be as correct in our writing as we can be. One of those situations is in resources for language learners. I have learnt through my attempts to write in another language that it is nearly impossible to write like a native speaker of the language. Languages are just too large to know all the phrases and collocations, let alone the vocabulary and grammar that most people manage to master. So, if you care about quality then the thing to do is to have a native-speaking proof-reader for your work. Some of the books in my collection have clearly made use of colleagues to do language checking, and that gives me a bit more faith in the authenticity of the language that I’m being exposed to. But in my recent scan of language books on Smashwords I was horrified at the poor quality writing, even just in the introductory blurb. There were some very poorly written English stories aimed at the Chinese ESL market. On the plus side, Chinese students of English would find them easier to read than stories with more English-like English, but it doesn’t give them the chance to absorb correct English grammar as they read. Likewise, I found a Canadian book in French that, even with my B1-level French, I could tell had incorrect grammar in the blurb.

So, advice to those writing stories for language learners (and anyone wanting to write as well as possible in a foreign language):

  1. Write stories in your own native (or best) language. It’s more likely to be correct.
  2. If you write in a foreign language (as I do), then it is imperative that you have a native speaker check it for you. You can’t trust (old) dictionaries, or sometimes even textbooks, to help you write correctly.
  3. Some techniques that can help you write correctly (before you get it checked) are to use a corpus-based dictionary, a concordancer, and a search engine. Check that words you want to use are actually used in the way you intend. Check the preposition that is normally used.
  4. Software is being developed that helps users improve or check their writing. MS Word has a grammar checker, so it can be useful for checking (but you can’t rely on it completely). Other prototype systems are being developed, some of which I saw at CoLing 2016 in Osaka recently, and another at the English Australia 2016 Conference in Hobart. Use the tools available to you.

The first time I had a near-native speaker check my comic book draft it was an eye-opener. I learnt that I couldn’t trust my old Cassells French-English dictionary, and that I couldn’t trust my high school textbook. The second (or was it the third?) time I had a native speaker read through the story, she picked up an error that the first proof-reader didn’t. The final proof-reader was my narrator, who remarked upon only one phrase, “Le total?”, which occurs where a native speaker would be more likely to use the expression “l’addition”. It is grammatically and semantically correct but unusual, so I’ve allowed that expression to remain in the comic.

The sad thing for those who aim for quality writing is that there is possibly not much reward in it. There are many stories on Amazon and Smashwords that are full of grammatical errors, but they probably still earn dollars. Producing quality work takes more time and effort. Hopefully my comic book will find its audience that recognises the quality of the work and that it is worth the cover price.

 

 

 

Gnomeville eBook is Finally Here!

After many years in development, and release in physical form in 2014, my comic is finally available as an eBook.

Gnomeville comic book cover containing head of dragon with smoke billowing out of its mouth and the title "DRAGON!" in large red letters
Cover of Gnomeville Dragon! Episode 1.

This is the first episode in what is arguably the easiest book in French for native English speakers. It is designed to introduce one or two new words or concepts per page, and to exploit the more than 1,000 words that are the same in French and English, so you learn the most frequently occurring words in French while being entertained with a story about gnomes, mages and dragons. While the series is optimised for language learning, by using sight gags and visual humour it still manages to be entertaining from the first few pages. Follow the story of Jacques, Magnifica the mage, the gnomes Didi and Dada, and the griffon as they commence a quest to capture a rogue dragon.

The book includes further stories to reinforce the vocabulary learnt so far, as well as a crossword and songs. The mp3 file of the narration of the Gnomeville Episode 1 story by a native French speaker is available from the author to the first 500 buyers who email their receipt as proof of purchase. The first 10 customers will receive all audio tracks of Episode 1 (3 stories, 2 songs), while the first 100 customers will receive the narration and one song.

The comic book has been checked by three native/near-native speakers of French to ensure authenticity. It exploits several principles of language acquisition:

  • language can be acquired by reading extensively at a comfortable level of difficulty;
  • images increase retention of language;
  • glosses increase vocabulary retention;
  • repeated occurrences of new vocabulary increase vocabulary retention;
  • comprehension-based activities (eg. crossword) related to the reading improve retention of language;
  • once ~95% vocabulary coverage is achieved (episode 2), then it is possible to guess the meaning of new words, and confirm by checking the gloss after guessing, which further increases vocabulary retention.

In summary, this is a well-researched, well-edited, entertaining introduction to reading French via an extremely easy to read comic book. Read it before you read anything else in French. Read it now!