Harry Potter: The book that taught me Polish…

The title of this post is very misleading... I could already speak Polish at an intermediate-ish level before I started reading Harry Potter. However, this experience has really taught me a lot!

What and why?

First, some background. Along with my own evolving ideas about language acquisition, I've recently (maybe over the last 6 months?) become interested in the ideas of Stephen Krashen and Steve Kaufmann. They both emphasize "comprehensible input" as the primary means of language acquisition. This means reading and listening.

So, I decided that I wanted to read more in Polish. I had two primary goals:

  • To expand my vocabulary.
  • To read something interesting that I would enjoy.

I chose Harry Potter for the following reasons:

  • It's relatively simple: Like it or not, in Polish I read at about a 6th grade level, tops. Before I started reading it, people told me that it's simple enough for children but still interesting to adults. I gambled that it'd be easy enough to get the gist without a dictionary.
  • I knew I'd enjoy it: I've really liked all the Harry Potter movies. But! I've never read the books. So, it'd still be interesting because it'd reveal more about a story I already like.

The only major downside is that it's not a Polish novel! Well, I didn't know where to start with something Polish. How could I be sure that it'd be at a level that I could understand without a dictionary and that it'd have a story I would enjoy?

Now, of course, I want to read classic literature and works by Polish authors. But everyone has to start somewhere! After successfully reading a few Harry Potter books, I should be able to tackle some weightier stuff.


I am using the following technique: I have two bookmarks. First, I read, purely for enjoyment. I don't bother with words I don't know; I just try to pick up the meaning from context. I place my first bookmark where I finish reading.

Later, I go back through the pages I read, looking up all the words I don't know, making cards for them in Memorati™. I call this second phase "translating." I place my second bookmark where I finish translating.

When I started, I was also underlining the words I didn't know in pencil during the first reading phase. By the time I had read through not quite half the book (around page 130), I decided that this was too disruptive. I was worried about missing words that I should underline, and it interfered with getting into the "reading zone." According to research by Stephen Krashen, the reading zone is very important. So, I've stopped underlining and I've been noticeably more immersed.

Creating Flash Cards

To date, I've translated only through page 109 (I began, in earnest, around the beginning of January), creating over 1,600 flash cards in Memorati along the way. This part of the process has undergone the most tweaking, since it needs to be as efficient as possible to maintain a good ratio between (A) the time I spend doing busy work and (B) the time I spend taking in comprehensible input.

At first, I looked up each word in a paper dictionary and entered it directly into the web interface of Memorati. I was adding about 20 words/page and taking about 1.5 hours to do so. This was slooow!

So I started looking up the words in an online dictionary. After trying a few, my favorite is the "Portal Wiedzy" on Onet.pl. This helped. But after working on it enough, I began to wish that I already had Lingwo.dictionary complete and could wire a translating dictionary directly into Memorati. I knew that the process could be made more efficient.

I managed to locate a free Polish-English dictionary online. It's just a simple text file, listing words and their English glosses. I wrote a helper script, which I now use to read a YAML file containing the Polish words. This generates another YAML file, with the gloss from the dictionary if it can find one. I edit and fill in this file, then run another script which loads the cards directly into Memorati. Et voilà!
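To make the lookup step concrete, here is a minimal Python sketch of the idea. It is not the actual helper script: the tab-separated dictionary format, the function names and the "front"/"back" card fields are all assumptions made for illustration, and plain Python lists stand in for the YAML files.

```python
def load_glossary(text):
    """Parse 'word<TAB>gloss' lines into a dict (the file format is an assumption)."""
    glossary = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        word, sep, gloss = line.partition("\t")
        if sep:  # ignore lines without a tab separator
            glossary[word.strip().lower()] = gloss.strip()
    return glossary


def draft_cards(words, glossary):
    """Pair each word with its gloss, or an empty string to fill in by hand."""
    return [{"front": w, "back": glossary.get(w.lower(), "")} for w in words]


# Hypothetical sample data: two known words and one missing from the dictionary.
sample_dictionary = "sowa\towl\nkot\tcat\n"
cards = draft_cards(["sowa", "zamek", "kot"], load_glossary(sample_dictionary))
for card in cards:
    print(card)
```

Words missing from the dictionary come back with an empty gloss, which corresponds to the "edit and fill in this file" step.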

I could expand this to call out to online translation sites when a word isn't in the dictionary file, but for now this is still the process I use for creating new cards.

Quizzing the Flash Cards

I also had to make some adjustments to the way I quiz myself with Memorati. Originally, I was aiming at "perfection." I would require that I answer each card forwards and backwards one time before moving it to the next level. This meant that it took a ton of time to do my flashcards each day!

So, next, I switched to only requiring "backward" (seeing the English and having to give the Polish gloss) because it was the more difficult of the two. I figured if I could get the hard one, it was enough to say I know it and move on. This helped very little, because it would still take so long to know a card this well.

This is when I realized that I needed the flashcards only to help me become familiar with the word. I didn't need to have it down cold. In fact, I realize now that it isn't possible to get "perfection" from flashcards alone. But if I used the flash cards to constantly stimulate my familiarity with a word, it would increase comprehension when I encountered it again in real context. It's the "comprehensible input," after all, that yields the best acquisition of the vocabulary.

So, now I only require that I answer the card "forward" (see Polish and give English gloss) before moving it to the next level. This is quicker, easier and focuses on familiarity with the word, which is the purpose of using the flashcards.

When to quiz again?

Memorati is a spaced repetition system. This means that after you answer a card correctly, you won't be quizzed on that card again for some period of time. This period of time is calculated based on the "level" of the card using some mathematical algorithm. To demonstrate, here is an example using a linear algorithm with an interval of 1 day:

  • When a card is first added, it is at level 0. This means when you hit "Quiz" it will definitely be quizzed.
  • If you answer the card correctly, it will be moved to the next level.
  • At level 1, the card won't be quizzed again for 1 day.
  • At level 2, the card won't be quizzed again for 2 days.
  • At level 3: 3 days, and so on.
  • If you answer the card incorrectly at any point, then it goes back to level 0, and will be asked again until you get it right.
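As a rough sketch (not Memorati's actual code), the linear schedule in the list above could be written in Python like this. The function names are made up for illustration.

```python
def linear_interval(level, interval_days=1):
    """Days to wait before a card at this level is quizzed again."""
    return level * interval_days


def next_level(level, correct):
    """Move up one level on a correct answer; drop back to 0 on a miss."""
    return level + 1 if correct else 0


# Follow one card through a hypothetical run of answers.
level = 0
history = []
for correct in [True, True, False, True]:
    level = next_level(level, correct)
    history.append((level, linear_interval(level)))
print(history)
```

Each pair in `history` is the card's new level and the wait in days before it is quizzed again. The miss on the third answer sends the card back to level 0, where the wait is zero.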

There is an endless number of algorithms you could use. For example, an exponential algorithm with a base of 2 and an interval of 1 day would go like this:

  • At level 1, the card won't be quizzed again for 1 day.
  • At level 2: 2 days.
  • At level 3: 4 days.
  • At level 4: 8 days, and so on.
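The exponential schedule can be sketched the same way. This is only an illustration of the formula implied by the list above (interval × base^(level − 1)), with a hypothetical function name. Treating level 0 as "no wait" is an assumption carried over from the linear example.

```python
def exponential_interval(level, base=2, interval_days=1):
    """Days to wait at a given level: interval_days * base ** (level - 1)."""
    if level <= 0:
        return 0  # level 0 cards are quizzed right away
    return interval_days * base ** (level - 1)


print([exponential_interval(n) for n in range(1, 5)])          # base 2
print([exponential_interval(n, base=3) for n in range(1, 6)])  # base 3
```

Changing only the `base` argument switches between a gentle and an aggressive schedule.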

When I started creating flash cards for Harry Potter, I was using the above exponential algorithm. But I found that I was being quizzed far too often on cards that I already knew well.

I decided that I needed an algorithm with two phases: (I) one where I am first familiarizing myself with the card, and (II) one where I am periodically reminded of the word. So, basically, if I got a card right every time (not that likely), I want to see it again 2-3 times and then I want it to be "shot into space" only to return once in a while.

This means a more aggressive increase at each level. I decided on exponential with a base of 3, meaning:

  • At level 1, the card won't be quizzed again for 1 day.
  • At level 2: 3 days.
  • At level 3: 9 days.
  • At level 4: 27 days.
  • At level 5: 81 days.

As you can see, we have phase I at levels 0-3, where we first learn the card. Assuming I get it right every time, this means it takes only about two weeks (1 + 3 + 9 = 13 days) before it gets shot into space. And, oh, does it get shot into space! Level 4 is approximately a month, and level 5 is approximately 2.5 months.
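To check the timing, here is a small sketch that adds up the waits a card has to sit through before it reaches a given level, assuming every answer is correct and is given on the first day the card comes due. The function name is hypothetical.

```python
def days_until_level(target_level, base=3, interval_days=1):
    """Minimum days to reach target_level if every answer is correct."""
    # Waits are served at levels 1 .. target_level - 1; level 0 has no wait.
    return sum(interval_days * base ** (level - 1)
               for level in range(1, target_level))


for level in range(1, 6):
    print(level, days_until_level(level))
```

For level 4 this gives 1 + 3 + 9 = 13 days, so roughly two weeks of phase I before the card leaves active play.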

The main thing I learned is that it's OK to use a very aggressive algorithm because I likely won't get it right every time anyway. A new card will hover between 0 and 3 for a long time. But if I get it right 3 consecutive times (which takes about two weeks, at minimum) then it's launched into orbit. If after the 27 days I've forgotten it, well, then it comes back into active play again. I've only just started to see some of my level 4's come back.

How much translating?

The translation phase is useful. It helps make the reading flow progressively better, although it can't be given full credit. The two phases really work in tandem. A word is first introduced in reading, familiarity is increased with flash cards, and then understanding is further increased by encountering the word again in reading.

Anyway, the translation phase is also very tedious. So, I put a lot of effort into trying to keep trudging forward but without burning out! This should still be fun. And I do enjoy going back over the text. Looking up the words does reveal new depth in the text. But it's still easy to do too much.

In the beginning, I decided to limit myself to 2 pages/day. This amounted to about 2 hours/day, adding about 40 cards, after all my process optimizations.

I also decided to limit myself to translating only through half of the book (up to page 156). After all, I needn't worry about missing any important words I might need later. If a word is important, I will encounter it again in the next book or other texts I read. At 2 pages/day, I calculated that I would reach page 156 on March 24th. This seemed like a reasonable goal.

However, starting sometime last week, I began to notice that I was adding only about 5 words/page during action or conversation and about 10 words/page during descriptive passages. I was averaging only about 25 minutes/page. So, I decided to double my pages per day but to keep the end date of March 24th, putting me on page 250 when all is said and done!

It's amazing to imagine, but if I can keep up this current rate, I will end up having made a flash card for every unknown word in the first 3/4 of the book. And maybe the number of unknown words/page will drop even lower and I can increase my rate to finish translating the whole book! Yeah, yeah, I know it's not that important. But it'd be bad-ass.


My vocabulary is soaring. I'm enjoying my book. I've learned a lot about spaced repetition and flash cards. The one thing I see missing from the whole picture is audio.

I think after I've finished the first Harry Potter book, I will move on to the second one but also integrate use of the audio book. This will likely increase the amount of time it will take me to get through the whole book, but after finishing this one I will already be familiar with quite a bit of its vocabulary, which I hope will help to even out the total time.

All in all, I'd say this project is going swimmingly...


Awesome post! Right now I have just begun to read my first bit of fiction in Japanese. However, I'm finding that my vocabulary is very very bad at the moment, and I have far too much unknown vocabulary per paragraph. I would estimate about 50 words a page - probably about 75% of the words I don't know. Once I know the words, the grammar is mostly fine.

So I am sort of taking the opposite approach to you. First, reading through a section and picking out the words I don't know. Adding them to my SRS. Also focusing on vocab building in other sources. Then I will go through each section for enjoyment, perhaps a week or two after learning the vocab. I am hoping after getting through maybe one third of the book, maybe half, I will be able to switch around the other way, and read for enjoyment first. Or, if not this book, then the next.

I am learning Japanese, so I have the added difficulty of not recognising all the characters. But, I am confident that I can get through this book, still enjoy it in the end, and well, the next book will be infinitely easier.