Chinese-Forums

Who is interested in a Pimsleur ZDT Flashcard project?


flameproof



I have the written transcript and have put lessons 1-10 into the ZDT format for use with the ZDT flashcard software.

http://zdt.sourceforge.net

I will keep working on it and put more lessons into the ZDT format. Feel free to download the attachment and import it into ZDT.

Pimsleur1-10.txt


I am still working on it. The problem is that I could not find a Pimsleur wordlist. http://www.ezmandarin.com has only the first sentences of each lesson, but not a wordlist for each lesson.

http://marcelnijman.demon.nl/mandarin/pimsleur/ has a wordlist, but it is not complete for parts of II and III. The first appearances of new words are also mixed up a little.

I now have a word list up to III-21 and will post it here as soon as I am through with it.

For the ZDT list I posted earlier, I will work through that too. I wonder how many lessons I can put into one ZDT file? 15 lessons come to about 100 words, which is maybe a bit too much for one flashcard session.

Stay tuned!


Hi flameproof,

Sounds like a really cool project. I just want to add that the ZDT is capable of exporting its lists to other formats as well. Currently it supports VTrain's rtf format right out of the box. But it should be just as easy to add support via plugins for other programs like SuperMemo or whatever your favorite flashcard program is. If people express an interest (and can give me the format), I can add it. Hopefully this can make your work even more valuable and accessible to everyone.

Keep up the good work!

Chris


The ZDT format is a simple text format, so it should be a piece of cake to put it into other formats, incl. Excel or whatever. I'll be busy with work this week, but hope to update next weekend. The one I posted earlier should then be more or less obsolete.


  • 8 months later...

Hi flameproof,

I'm a big Pimsleur fan. I whipped up a little web-based flashcard app (I wanted to have access to flashcards from public computers where I couldn't install ZDT) and I'm curious whether I can add your Pimsleur wordlists to it (with full credit to you, of course). (Apologies for the broadcast message; I couldn't figure out how to backchannel this.)

The site is http://hyquiz.com

Thanks in advance,

Joe


  • 2 weeks later...

I just got Pimsleur from a friend and really like it. I'm learning unit 4 now. The bad thing about Pimsleur is that it doesn't provide a "textbook", so I tried to write out part 1 as follows. Correct me if I typed anything wrong.

*************************************************************************************************

Man: 對不起 (duì bu qǐ), 請問 (qǐng wèn), 你會說英文嗎? (nǐ huì shuō yīng wén ma?)

Woman: 不會, 我不會說英文 (bù huì, wǒ bù huì shuō yīng wén)

Man: 我會說一點普通話 (wǒ huì shuō yī diǎn pǔ tōng huà)

Woman: 你是美國人嗎? (nǐ shì měi guó rén ma?)

Man: 是, 我是美國人 (shì , wǒ shì měi guó rén)

***************************************************************************************************

Should I continue with this project?


  • 9 years later...

10 years have passed already...

However, I've just written software to automatically transcribe any Pimsleur Mandarin lesson. The transcriptions can be imported into ZDT, Anki, or whatever you like, so that you see first the English and then the Chinese transcription. You can also listen to the corresponding part of the audio. Accuracy is about 90%.

 

Each lesson (30 minutes of audio) produces around 500 single entries, of which only about 140 matter because many are redundant (resulting in about 70 cards per lesson). So these 500 entries per lesson need to be reviewed, some of them corrected, and the relevant ones selected for later export *manually*. This takes 5-10 minutes per lesson, depending on the Chinese skill level of the reviewer.

 

If there are still people interested, I would put all 90 transcriptions on SourceForge so we can collectively work on them to iron out the 10% of transcription errors. Once corrected, we can export them to ZDT or any other learning software.
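To give a feeling for the total effort, here is a rough back-of-the-envelope sketch in Python. It only multiplies out the figures quoted above (90 lessons, about 500 entries, 140 useful entries and 70 cards per lesson, 5-10 minutes of review each). The constant and function names are made up for illustration, and the idea that a card pairs one English entry with one Chinese entry is my reading of the numbers, not something the tool guarantees.

```python
# Figures quoted in the post; nothing here is measured independently.
LESSONS = 90
ENTRIES_PER_LESSON = 500   # raw transcript entries per lesson
USEFUL_PER_LESSON = 140    # entries left once redundant ones are dropped
CARDS_PER_LESSON = 70      # assumption: one English + one Chinese entry per card
REVIEW_MINUTES = (5, 10)   # manual review time per lesson (low, high)


def totals(lessons: int = LESSONS) -> dict:
    """Scale the per-lesson figures up to the whole course."""
    low, high = REVIEW_MINUTES
    return {
        "entries": lessons * ENTRIES_PER_LESSON,
        "useful_entries": lessons * USEFUL_PER_LESSON,
        "cards": lessons * CARDS_PER_LESSON,
        "review_hours": (lessons * low / 60, lessons * high / 60),
    }


if __name__ == "__main__":
    for name, value in totals().items():
        print(name, value)
```

If those per-lesson figures hold for all lessons, the manual review for the whole course is a matter of hours rather than weeks for one person, and less if the lessons are shared out.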


I'm attaching an example transcript CSV file to demonstrate how it works. The columns are separated by tabs.

 

Columns explained:

 

1) The first column is the audio file. The whole lesson has been split up into small audio files so that, ideally, only one speaker is speaking per audio file. During a lesson, many single words are repeated multiple times, which is useful when listening but annoying when reviewing a transcript. This is why I have cut out segments shorter than 1.5 seconds that are surrounded by pauses longer than 1 second. Requiring a one-second pause makes sure that single words which belong to a sentence are not cut out if the speaker took a short breath between words.

 

2) The second column is the length in seconds of the audio file.

 

3) The third column is the start time of the audio file within the lesson.

 

4) The fourth column is the detected speaker. CHIN_FRAU means Chinese woman, CHIN_MANN means Chinese man, and ENGL_MANN is the English speaker.

 

5) The 5th column is the confidence or probability in percent that the correct speaker has been detected.

 

6) The 6th column is the first transcript, either in English or in Chinese, depending on the detected speaker.

 

7) The 7th column is the confidence or probability in percent that the transcript is correct.

 

8) The 8th column is the pinyin of the Chinese transcript. If the transcript is English, it's empty.

 

9) The 9th column, "Status", needs to be filled in manually in order to decide which entries belong together and which ones have to be exported to ZDT or other learning tools. There are a few codes for this column:

e = take the English text

c = take the Chinese text

w = wrong speaker

t1 = take the first transcript alternative

t2 = take the second transcript alternative

t3 = take the third transcript alternative

 

e and c may be followed by an integer, meaning that the following rows need to be merged with this one.

Since an English sentence or word is always followed by the corresponding Chinese translation, you will first enter e, followed by c some rows later.

Example:

e2 = take the English text of this row and append the English text of the following row

c3,t1,t2,t1 = the corresponding Chinese translation spans 3 rows, whereby transcript 1 has to be taken for this row, transcript 2 for the next row and transcript 1 for the row after that

 

10) The 10th column detects duplicates. It shows the audio file of the first duplicate found for the transcript in column 6. This is useful when deciding whether to take a row into account, because duplicates are annoying while reviewing transcripts.

 

11 to end) These columns hold 2 further transcript alternatives. Each alternative has its own "Transcript", "Confidence", "Pinyin" and "Duplicate" columns, which work the same way as described above.
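For anyone who wants to process the file with a script instead of a spreadsheet, here is a minimal Python sketch of how the columns described above could be read. All class and field names are my own invention, and the total of 18 columns (10 plus 2 alternatives with 4 columns each) is just my count from the description. The start time is kept as text because its format isn't stated, and numbers are parsed leniently in case the file uses "%" signs or decimal commas.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional

COLUMN_COUNT = 18  # assumption: 10 base columns + 2 alternatives x 4 columns


@dataclass
class Alternative:
    transcript: str
    confidence: Optional[float]
    pinyin: str
    duplicate_of: str  # audio file of the first duplicate, "" if none


@dataclass
class Row:
    audio_file: str
    length_s: Optional[float]
    start: str  # kept as text; the time format is not specified
    speaker: str  # CHIN_FRAU, CHIN_MANN or ENGL_MANN
    speaker_confidence: Optional[float]
    status: str
    alternatives: List[Alternative]


def to_float(text: str) -> Optional[float]:
    """Parse a number leniently; return None if the cell is not numeric."""
    cleaned = text.strip().rstrip("%").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


def parse_row(cells: List[str]) -> Row:
    """Map one tab-separated line (already split) onto named fields."""
    cells = list(cells) + [""] * (COLUMN_COUNT - len(cells))  # pad short rows
    # Columns 6, 7, 8 and 10 form the first transcript alternative.
    alternatives = [Alternative(cells[5], to_float(cells[6]), cells[7], cells[9])]
    # Columns 11 onwards hold the further alternatives, 4 columns each.
    for i in range(10, COLUMN_COUNT, 4):
        transcript, confidence, pinyin, duplicate = cells[i:i + 4]
        alternatives.append(
            Alternative(transcript, to_float(confidence), pinyin, duplicate)
        )
    return Row(cells[0], to_float(cells[1]), cells[2], cells[3],
               to_float(cells[4]), cells[8], alternatives)


def read_rows(text: str) -> List[Row]:
    """Parse the whole file content; skip a header line yourself if there is one."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [parse_row(cells) for cells in reader if cells]
```

Grouping each alternative's four columns into one object means the later choice between t1, t2 and t3 is just an index into `alternatives`.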

 

Let's take a look at the attached CSV file, to which I have already added the "Status" column:

 

As you can see, the first row I want to take into account is row 41. The status code I typed in is "e2", meaning that I want to take the first English transcript alternative of this row and of the following row, resulting in "today the weather is very good isn't that so".

 

Btw, row 40 contains a weird English transcript, "archer Chungcheong saucer", because the speaker was detected incorrectly. So instead of being transcribed to Chinese, it was transcribed to English, which does not make sense here. I could have typed "w" as the status code, but since I'm not interested in this sentence, I just skip it.

 

The next status code is "c" in row 44, which means I want to take the first transcript alternative of the current row, "今天 天气 很好 是吧", which is the Chinese counterpart of the English text before.

 

As you can see in this example, of the 92 rows only 18 have been taken into account, resulting in 9 learning cards.

The rest is either redundant or does not have a translation, for example the Chinese dialogues at the beginning of each lesson.
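The status codes could be interpreted by a script roughly like this. It is only a sketch of how I read the rules above, not the tool's actual code: `Status`, `parse_status` and `merge` are hypothetical names, the number after e or c is taken as the count of rows to merge including the current one, and rows without their own t code fall back to the first alternative. A standalone t1/t2/t3 code is not handled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Status:
    kind: str  # "e" (English), "c" (Chinese) or "w" (wrong speaker)
    span: int  # rows to merge, counting the current row (my reading)
    alternatives: List[int]  # 1-based transcript alternative per merged row


def parse_status(code: str) -> Status:
    """Turn a code such as "e2" or "c3,t1,t2,t1" into a Status."""
    parts = [p.strip() for p in code.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty status code")
    head, rest = parts[0], parts[1:]
    kind, count = head[0], head[1:]
    if kind not in ("e", "c", "w"):
        raise ValueError(f"unsupported status code: {code!r}")
    span = int(count) if count else 1
    chosen = []
    for part in rest:
        if not part.startswith("t"):
            raise ValueError(f"expected t1, t2 or t3, got {part!r}")
        chosen.append(int(part[1:]))
    # Rows without an explicit choice use the first alternative.
    chosen += [1] * (span - len(chosen))
    return Status(kind, span, chosen[:span])


def merge(rows: List[List[str]], start: int, status: Status) -> str:
    """Join the chosen transcripts; rows[i] lists the alternatives of row i."""
    picked = [
        rows[start + offset][status.alternatives[offset] - 1]
        for offset in range(status.span)
    ]
    return " ".join(picked)


if __name__ == "__main__":
    # Where the sentence is split between the two rows is made up for the demo.
    demo = [["today the weather is very good"], ["isn't that so"]]
    print(merge(demo, 0, parse_status("e2")))
```

Pairing each merged e text with the merged c text that follows it then gives one flashcard, which fits the example above where 18 selected rows end up as 9 cards.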

 

Your feedback is welcome.


Legalities aside, you would be better off spending the time learning Chinese rather than preparing to learn.

 

I think there is a good reason there are no transcripts for Pimsleur: it is supposed to be an audio-based course.


You are quite wrong, Shelley.

There is even an *official* transcript for Pimsleur Mandarin 1. You can purchase it legally.

However, as for me, I didn't want to read a traditional book or paper. Instead I want to use flashcards on my mobile phone or computer. I have tried both ways and must say that flashcards on my mobile phone are much more efficient for me than a transcript book, because the flashcard system monitors my progress and knows which cards I still need to learn. Moreover, flashcards can play sound, which a paper book cannot. Also, searching for a forgotten word is much more painful in a paper or audio version than in an electronic version.

 

By the way, I listened to more than the first 60 lessons of Pimsleur Mandarin *before* I ever took a look at a transcript. This took me multiple *years* (with some longer pauses in between, though) because I wanted to continue only when I could remember more than 80% or even 90% of a lesson.

 

Now I'm quite good at understanding simple Chinese sentences. My teacher, a native Chinese speaker, was blown away when she heard me speaking Chinese for the first time because, she said, I did not sound like a beginner at all, even though I was in an advanced beginner class at the time.

So as for me, I think Pimsleur is very good for training listening skills and pronunciation. The problem is that I can't read or write Chinese characters yet. I'm just beginning to learn them, and it's a *great* help to have flashcards with the transcriptions of what I have already listened to thoroughly.

 

But ok, this is only *my* preferred way to learn. You may have another preference.

 

Whoever is interested in the transcripts, just contact me.


Many apologies. I obviously didn't do enough research on the subject.

 

I listen to Pimsleur when I want "hands free" learning and never thought there would be texts. I inherited the audio only, from a late friend's book collection; I may have to go and check if he had the transcripts too.

 

I understand about using electronic "books, etc."; I too like using things like that.

