
Little bit of help



Hey, so looking for a bit of help. Looking to use the AJATT model for Chinese, using subs2srs to produce the anki cards. 


Currently looking for the following in the software department. 


1. Ways to get full mkv/mp4 files for some mainland tv shows (红高粱,士兵突击,北京无战事 in particular)

2. Software for manually adding softsubs to the above shows


Ideally all software that runs on Mac (but I'm fine with buying a copy of Windows 8.1 and installing it as a VM in order to be able to use the necessary utilities) 


Any and all help would be greatly appreciated. 


You can usually use a PC client for sites such as Youku to download an MP4 or FLV file.  You'll probably need to make an account.  

I think the HD downloads tend to be MP4s but not sure.


Or if they are on YouTube you can use http://deturl.com/ to grab video files in a number of formats. 

The torrent site Asiatorrents is another possible source - large TV series are usually set as free downloads (so you don't need to upload to download them).


You can use WorkAudioBook to add softsubs and get timing done (you can export an SRT file from it).  

You'll need to make the mp4 file into an mp3 file to get this done.
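One way to do the mp4-to-mp3 step is with ffmpeg, if you have it installed. This is only a rough sketch: the file name is made up, and the quality setting is just a reasonable default, not something WorkAudioBook requires.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: str) -> list[str]:
    """Return an ffmpeg command that extracts the audio track as an MP3."""
    src = Path(video_path)
    dst = src.with_suffix(".mp3")  # same name, .mp3 extension
    return [
        "ffmpeg",
        "-i", str(src),             # input video
        "-vn",                      # drop the video stream
        "-codec:a", "libmp3lame",   # encode the audio as MP3
        "-q:a", "2",                # VBR quality (lower number = higher quality)
        str(dst),
    ]


def extract_audio(video_path: str) -> None:
    """Run the conversion; assumes ffmpeg is installed and on PATH."""
    subprocess.run(build_ffmpeg_command(video_path), check=True)


# "episode01.mp4" is a hypothetical file name.
print(" ".join(build_ffmpeg_command("episode01.mp4")))
```

Calling `extract_audio("episode01.mp4")` would write `episode01.mp3` next to the video.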

You'd have to type them in yourself unless you have a script (scripts are not available for most Chinese TV).

It's time consuming but I find it's good practice - if you plan to study it intensively it's fine.  

Of course you only actually need to type up the lines of dialog you want to study.
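In case it helps to see what an SRT file actually looks like, here is a small sketch that builds one from lines you have typed up and timed by hand. It is not how WorkAudioBook does its export; the function names and the timings are invented for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(lines: list[tuple[float, float, str]]) -> str:
    """Turn (start, end, text) tuples into SRT text, numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(lines, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Hypothetical timings; only the lines you want to study need to be listed.
srt_text = build_srt([(75.5, 78.0, "白族人家用相似的手法")])
print(srt_text)
```

Save the result as UTF-8 (for example `open("episode01.srt", "w", encoding="utf-8")`) so the Chinese text survives.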


I don't know if you are doing bilingual MCDs.  If so you'll need to translate the scripts yourself (unless there's a translation out there somewhere) and make SRTs for them too.

Or you can do it "on demand"  - if there's no translation available I just wait until I need the sentence and then translate it myself (or add a bit of context in English or Chinese).


Since I've already been down this path, here's some ideas:


There are SRT files available for A Bite of China which has very nice audio narration (and lots of it).  Very natural sounding Chinese (although obviously it's a documentary style).

And there are English translation SRT files available for several episodes of Season 1 at least (and it's available for download from YouTube).


And another possibility is to use dubbed Hollywood films, which sometimes have SRT files available (as well as English translation) - although it's tricky to get the right SRT matching the audio, and frequently the Chinese subs are just a translation and don't match the dubbing (depends on the movie).  A few movies seem to be pretty nicely dubbed with fairly idiomatic Chinese, but of course it's not the same as actual native speech from native scripts (but I have found it very useful for learning about super powers and gamma radiation and playboy billionaires).  


If you want a more extensive sentence bank, there are also Anki decks of Mastering Chinese Characters and HSK examples that have sentences and audio.  The sentences are boring but OK if you want some examples to learn simple stuff like nouns.  I use them to make MCD cards too, if my movies/TV don't have anything more exciting.


Appreciate all the info, few questions that spring to mind. 


I'm mostly sticking to sentence banks for now, and trying to transfer to monolingual (with uneven results), but am curious how you do MCDs with subs2srs. Got the feeling it was intended more for smaller, sentence-sized bits of dialogue. If you had any insights into the benefits and drawbacks of SRS/MCDs I'd be curious to hear. 


Second, maybe getting a little too deep in the weeds here, but for the MCDs are you using the built-in Anki cloze cards or an add-on? I used the anki clozes for a while and got pretty frustrated with them for a variety of reasons. 


Last, I'm currently in China so I'll probably go the Youku route. That said, I noticed Asiatorrents is invite only. Any chance someone here might be able to extend an invite, or is that an admin-level thing? 


I think http://www.letv.com/ has a lot of shows and movies that are downloadable. The only problem is that some of them (or maybe all of them, not sure) are hard-subbed. I use the picture that Subs2SRS produces, as well as the English subs, to help cue me, so I couldn't have the front of the card showing the picture along with the hard-subbed Chinese... But with the Ubuntu terminal and a few lines of code I was able to just chop off the bottom 1/5 or so of each picture, after I had converted the movie with Subs2SRS. Seemed to work really well - doesn't seem to take anything away from the experience.
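For anyone who wants to do the same cropping, here is a rough Python equivalent. It is not the terminal code mentioned above, just a sketch of the same idea. It assumes the snapshots are .jpg files in one folder and that Pillow is installed; the function names and the folder name are made up.

```python
from pathlib import Path


def crop_box(width: int, height: int, fraction: float = 0.2) -> tuple[int, int, int, int]:
    """Return the (left, upper, right, lower) box that drops the bottom fraction."""
    keep = height - int(height * fraction)
    return (0, 0, width, keep)


def crop_snapshots(folder: str, fraction: float = 0.2) -> None:
    """Crop every .jpg in folder in place. Needs Pillow (pip install Pillow)."""
    from PIL import Image  # imported here so crop_box works without Pillow

    for path in Path(folder).glob("*.jpg"):
        with Image.open(path) as img:
            cropped = img.crop(crop_box(img.width, img.height, fraction))
        cropped.save(path)  # overwrites the original, so keep a backup


# A 1280x720 snapshot keeps its top 576 rows of pixels.
print(crop_box(1280, 720))
```

Run something like `crop_snapshots("my_deck_media")` on a copy of the media folder first, since the images are overwritten.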


Ok, first, Subs2srs has an option to crop the bottom of the video by X pixels, so you can chop off the hardcoded subtitles.


For the benefit of others: MCD = Massive [or Micro] Cloze Deletion - meaning instead of drilling SRS on vocabulary words, you drill on a paragraph or a sentence of Chinese where you need to guess what the missing word/character is.

E.g. at a micro level:

  • 白族人家用相似的手[----]   The Bai clan uses a similar method
  • The correct answer is 法

The context is the sentence itself (in Chinese) and optionally English translation.  If more advanced you might just have a hint in Chinese rather than a whole translation in English.
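As a small illustration, the example above can be written in Anki's built-in cloze markup, `{{c1::answer}}` or `{{c1::answer::hint}}`. The helper below is just a sketch for producing that markup; `make_cloze` is a made-up name, not part of Anki.

```python
def make_cloze(sentence: str, answer: str, hint: str = "", index: int = 1) -> str:
    """Wrap the first occurrence of answer in Anki cloze markup."""
    if answer not in sentence:
        raise ValueError(f"{answer!r} does not occur in {sentence!r}")
    body = f"{answer}::{hint}" if hint else answer
    return sentence.replace(answer, f"{{{{c{index}::{body}}}}}", 1)


# The micro-MCD from the example above.
print(make_cloze("白族人家用相似的手法", "法"))
# With an optional hint shown in place of the blank.
print(make_cloze("白族人家用相似的手法", "法", hint="method"))
```

The English translation would go in a separate field on the note rather than inside the cloze text.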


Regarding using subtitles for MCDs:


  • One of the key ideas of AJATT is that immersion is good, time spent is more important than perfect technique, and fun is good, so don't get too caught up in the "perfect" technique or source material or whatever - anything in Chinese will help, especially if you find it interesting.
  • So -- don't worry too much.  Having thousands of sentences with audio that are native (or close to native) is super useful even if there are some drawbacks. 
  • The MCD idea includes the idea of "micro" MCDs - where the MCD is very small.  So sentence level dialogs are in this category.
  • The context doesn't need to be a perfect translation - just enough to give you a hint as to what the Chinese should be
  • I think subtitles are ideal for bilingual MCDs as you can often find line by line translations.  Even an approximate translation is good enough for context.  Sometimes I add a better hint in English than the original translation.  Make it easy for yourself.
  • I use the sentence bank as a bank.  When I learn a word via conversation, textbook, reading - I check how frequent it is (first - if it's an HSK word I consider it frequent, second if it's in the SUBTLEX-CH list as a common word, third, if I am going to use it in my work/daily life, fourth, if it's a cool word like spider or diabetes or playboy).  Then if it's important to me, I search through my sentence bank for examples of the word.
  • I try to find a few sentences with that word in it.  
  • If the word is super easy to remember since I know both characters and they are logical, like 长跑 (long distance running) I'll just cloze the more important character like 长.  
  • If it's not so easy and is two characters, but I know each character, I'll cloze delete one character in one sentence and one character in another, e.g. 淘气.  
  • If one of the characters is unfamiliar, I'll cloze it twice, preferably as used in another word, like 淘气 and 淘汰.  This helps reinforce the character while you are learning it via multiple examples.  (Works best if you know all the other characters, if you don't know 汰 this will introduce a new character and maybe you need to find yet another word, etc etc.)
  • I have also found that learning related characters like 攒 and 赞 at the same time helps differentiate them when you first learn them.  Only if they are all about your level and decent frequency.  
  • When testing yourself, you have a choice.  I recommend being strict on pronunciation (you must reproduce the sound including tones, else mark as failed).  But be soft on writing (got pronunciation correct but 80% of strokes right?  mark as hard, write the character a few times and continue).  You could even completely disregard writing correctness and just rewrite the character.  These days writing skill is expensive to learn but not too valuable to have.
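To make the "search the bank, then cloze a different character in each sentence" step concrete, here is a minimal sketch. It assumes the sentence bank is just a list of strings, and the two sample sentences are made up for the demo, not taken from any deck.

```python
def find_sentences(bank: list[str], word: str, limit: int = 3) -> list[str]:
    """Return up to limit sentences from the bank that contain word."""
    return [s for s in bank if word in s][:limit]


def alternate_clozes(sentences: list[str], word: str) -> list[str]:
    """Cloze a different character of word in each successive sentence."""
    cards = []
    for i, sentence in enumerate(sentences):
        offset = i % len(word)                  # cycle through the characters
        start = sentence.index(word) + offset   # position of the target character
        target = word[offset]
        cards.append(sentence[:start] + "{{c1::" + target + "}}" + sentence[start + 1:])
    return cards


# Hypothetical sentence bank.
bank = ["这个孩子很淘气。", "今天天气很好。", "他小时候特别淘气。"]
for card in alternate_clozes(find_sentences(bank, "淘气"), "淘气"):
    print(card)
```

For the "unfamiliar character" case you would run the search once per word (say 淘气 and 淘汰) and cloze the shared character in both.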


Benefits/drawbacks of SRS-ing with MCDs:


  • Drawback:  You will learn vocabulary more slowly than people SRS-ing just words.  They will grind through vocabulary lists while you are still learning whether you want to say 记不下来, 记不起, 记不住 or whatever.  Their listening ability will be greater for a while as they listen for more nouns and verbs to get approximate meaning of simple sentences.
  • Benefit: You'll be better at using 上 去 下 来 到 了 过 呢 呗 什么 等 可 却 又 再 而已 都 那 的 得 地 because you can drill on these characters/words in context and get a feel for them, but vocab drillers will memorize a meaning and move on.  
  • Recommendation: Also drill full sentences (read them again and again until you can't get them wrong).  MCDs break up the sentence a bit much and you can't necessarily reproduce the full sentence fluently.  I frequently replay the audio in Anki and shadow it even after getting the missing piece right.  Even so I need to do it more.
  • Benefit - your reading skill and skimming skill will improve because you'll be constantly reading sentences (this is even better when it's monolingual -- actually I rarely read the English for sentences I'm familiar with because it's not necessary)
  • Benefit - you will memorize sentences and produce the right character even without reading well (just like you do with your favorite movies)
  • Benefit - words will pop into your head while you are speaking without even thinking about them - because you just know what will fit - you are drilling in context and so the same context will produce similar words
  • Drawback - lists of vocab words are cheap and easy to obtain.  Making MCDs is a constant process and you will need to study, add, cloze, drill.  (Using subs2srs and other tools is a bit of a shortcut)
  • Benefit - your MCDs are things you are interested in.  Wordlists are dry and boring.
  • Benefit - you can do paragraph level MCDs by grabbing stuff from wikipedia or any other source.

I use the built in Anki cloze functionality.  I usually cloze single characters - several in a sentence/paragraph.  
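With the built-in cloze note type, each numbered deletion (c1, c2, ...) in a note becomes its own card, so several single characters can be clozed in one sentence. A quick sketch of generating that markup, where `cloze_characters` is a hypothetical helper name:

```python
def cloze_characters(text: str, targets: list[str]) -> str:
    """Cloze the first occurrence of each target character as c1, c2, ..."""
    numbered = {}
    for number, ch in enumerate(targets, start=1):
        pos = text.find(ch)
        if pos == -1:
            raise ValueError(f"{ch!r} not found in text")
        numbered[pos] = number

    parts = []
    for pos, ch in enumerate(text):
        if pos in numbered:
            parts.append("{{c" + str(numbered[pos]) + "::" + ch + "}}")
        else:
            parts.append(ch)
    return "".join(parts)


print(cloze_characters("白族人家用相似的手法", ["相", "法"]))
```

Pasting the output into the Text field of a Cloze note should give one card per numbered character.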


I use MDBG's Chinese Reader to generate word level definitions that I put into a field called "notes" that is on the back of each card.

So any unfamiliar words are listed there.  
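If you wanted to build a similar "notes" field yourself, one option is a lookup against a CC-CEDICT-style word list, where each line looks like `Traditional Simplified [pin1 yin1] /definition/definition/`. This is not how MDBG's Chinese Reader works internally; it is only a sketch using greedy longest-match lookup, and the two dictionary lines below are hand-typed to show the format.

```python
import re

# One CC-CEDICT-style entry per line: trad simp [pinyin] /def1/def2/
CEDICT_LINE = re.compile(r"^(\S+) (\S+) \[([^\]]+)\] /(.+)/$")


def parse_cedict(lines: list[str]) -> dict[str, tuple[str, list[str]]]:
    """Map simplified headwords to (pinyin, definitions)."""
    entries = {}
    for line in lines:
        if line.startswith("#"):  # skip comment lines
            continue
        match = CEDICT_LINE.match(line.strip())
        if match:
            _trad, simp, pinyin, defs = match.groups()
            entries[simp] = (pinyin, defs.split("/"))
    return entries


def build_notes(sentence: str, entries: dict, max_len: int = 4) -> str:
    """List dictionary words found in sentence, longest match first."""
    notes = []
    i = 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            word = sentence[i:i + size]
            if word in entries:
                pinyin, defs = entries[word]
                notes.append(f"{word} [{pinyin}] {'; '.join(defs)}")
                i += size
                break
        else:
            i += 1  # no entry starts at this character
    return "\n".join(notes)


# Illustrative entries only; a real run would read the full dictionary file.
sample = ["孩子 孩子 [hai2 zi5] /child/", "淘氣 淘气 [tao2 qi4] /naughty/mischievous/"]
print(build_notes("这个孩子很淘气", parse_cedict(sample)))
```

You would still want to trim the output by hand so that only the unfamiliar words end up on the back of the card.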


I don't have any invites for Asiatorrents but if you PM me I will try to find someone who has.  No guarantee your exact shows are there but they do have many many Chinese movies and quite a few TV shows downloadable.





Asiatorrents has open registration at the moment.

