Using Chinese Text Analyser with Subs2SRS and Anki

Yadang

Edit: All of the things the spreadsheet did can now be accomplished natively in CTA.



This post describes how to combine Chinese Text Analyser (CTA) with Subs2SRS. You can add definitions and pinyin to every subtitle card containing words you don't know, and suspend all the other cards. This saves the time of sifting through each line of the movie, deciding whether you know every word on the card, adding definitions and pinyin for the unknown ones, and suspending the subtitles you already fully understand.


It also tries to add more incentive to studying: because you don't have to sift through each and every subtitle card, you only get glimpses of the movie from the cards you do study, which motivates you to learn all the unknown words so you can watch the movie with good understanding afterwards.


This is done through Excel. Basically, the spreadsheet takes the output from the exported deck and temporarily strips out all the non-essential stuff. You paste the result into CTA, mark any known words as known, then export. You paste that export back into the Excel document, which then combines all of the definitions and pinyin for each card (CTA exports each unknown word as a new card, so Excel merges that information for each subtitle). Excel also adds a tag (which you can choose) to the cards that it adds definitions to, so after importing into Anki you can suspend all the cards without unknown words on them.
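As a rough illustration of the merge step described above (one exported row per unknown word, combined back into one entry per subtitle card, plus a tag), here is a minimal Python sketch. It is not the spreadsheet's actual formulas or CTA's real export format: the row layout (note ID, word, pinyin, definition), the function name `merge_word_rows`, and the default tag `has_unknown` are all assumptions made for the example.

```python
from collections import OrderedDict


def merge_word_rows(rows, tag="has_unknown"):
    """Combine per-word rows into one glossary string per subtitle note.

    rows: iterable of (note_id, word, pinyin, definition) tuples, one per
    unknown word. This layout is hypothetical, not CTA's export format.
    """
    grouped = OrderedDict()
    for note_id, word, pinyin, definition in rows:
        # Collect every unknown word that belongs to the same subtitle note.
        grouped.setdefault(note_id, []).append(f"{word} [{pinyin}] {definition}")
    # Notes that never appear in rows have no unknown words and get no tag,
    # so they can be found and suspended after import.
    return {
        note_id: {"glossary": "; ".join(entries), "tag": tag}
        for note_id, entries in grouped.items()
    }


example_rows = [
    ("1", "电影", "diànyǐng", "movie"),
    ("1", "字幕", "zìmù", "subtitles"),
    ("3", "暂停", "zàntíng", "to pause"),
]
merged = merge_word_rows(example_rows)
for note_id, fields in merged.items():
    print(note_id, fields["glossary"], fields["tag"])
```

In this sketch note "2" has no rows, so it comes out untagged, which is what lets you select and suspend the fully understood cards in Anki.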


This is particularly helpful for movies where only the Chinese subtitles can be found: you can create cloze deletion cards from the Chinese subtitle and use CTA to add definitions and pinyin for any unknown words.
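For anyone building the cloze cards by script rather than by hand, here is a small Python sketch that wraps words in Anki's `{{c1::...}}` cloze syntax. It assumes you want the unknown words themselves to be the cloze deletions, which is only one way to do it; the function name `make_cloze` is made up for the example and is not part of CTA, Subs2SRS, or the spreadsheet.

```python
import re


def make_cloze(subtitle, unknown_words):
    """Wrap each unknown word in the subtitle in Anki cloze markup.

    Assumption: the unknown words are the parts to hide. Each distinct
    word gets its own cloze number (c1, c2, ...).
    """
    # Try longer words first so "电影院" is not split by the shorter "电影".
    words = sorted(set(unknown_words), key=len, reverse=True)
    if not words:
        return subtitle
    pattern = re.compile("|".join(re.escape(word) for word in words))
    numbers = {}

    def wrap(match):
        word = match.group(0)
        # Reuse the same cloze number if the word appears more than once.
        number = numbers.setdefault(word, len(numbers) + 1)
        return "{{c%d::%s}}" % (number, word)

    # A single pass over the text avoids re-wrapping already wrapped words.
    return pattern.sub(wrap, subtitle)


print(make_cloze("我想去电影院看电影", ["电影", "电影院"]))
```

Doing the replacement in one regular-expression pass, rather than one `str.replace` per word, keeps a short word from being wrapped again inside a longer word that has already been turned into a cloze.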


More detailed instructions are on the first sheet of the Excel document. I've also made a short video on how to use the spreadsheet.


Please let me know if you find any problems or have suggestions for improvements.





P.S. If you're looking for movies to use with Subs2SRS and CTA, I've made a thread with Anki decks people have made using Subs2SRS that you can download. Note that this spreadsheet can only handle 10,000 notes, so at this time it's not possible to download the master deck (a deck consisting of all the movies converted thus far) and use this spreadsheet to process all of the words at once. It looks like Imron's CTA feature for this is going to come out soon, and it will be able to handle larger volumes of sentences.

Add Pinyin, Defs to Subs2SRS Anki Deck.xlsx

imron

Good stuff!


For those interested, this was the reason I decided to add Lua scripting support to CTA.


Once it's done, instead of needing to have spreadsheets like this you should just be able to open your original data in CTA, run a script, and get the output you need without needing to run any intermediate steps (i.e. removing the need to do all of paragraph 3 in the OP).  It'll also be far more flexible and configurable.

Flickserve

Good stuff.


Imron, expect another order for CTA. :P

imron

The more the merrier :mrgreen:

imron

The latest version of CTA (0.99.16) now has Lua scripting support.


Included with the release are a bunch of Lua scripts, including one I've called subs2anki.lua.


This script replicates all the steps from the above spreadsheet, so you can export the Subs2SRS subtitles from Anki, run the script, and get as output the same information you would have got from the last worksheet in the spreadsheet (all the unknown words, and their pronunciations and definitions).


Let me know if you have any problems or questions about it.
