Chinese-forums.com
Learn Chinese in China

Graded Watching - TV series ranked by difficulty


wibr

Might be worth it to add genres.

I haven't watched any of these series, mostly because most of them are from Taiwan, but also because all the Chinese series are (ancient) period-focused.

Distinguishing between present-day and period series is, I would say, important, as period series include a lot of dated words you'd never hear nowadays.


12 hours ago, wibr said:

a ranking based on the number of words, to find TV series at your level

 

Thank you for this project!

 

Just some feedback: I ran the vocabulary list of "Decoded" through CTA:

5950 words

2292 unique characters

but, apparently 68% (~4000) of the words are non-HSK words. 😲

 

Having a lot of non-HSK words is not all that surprising, but I wonder how your lists really help identify the difficulty level. Is it just the total number of words?


@Weyland

Genre would be nice to have, but I don't think it's worth the effort. Usually you can guess it from the name or check the linked Wikipedia page. You can sort the table by any column. While the majority of the shows are currently from Taiwan, I counted more than ten shows from China that are set in modern times and use normal language.

 

@Jan Finster

Keep in mind that the list does not include the basic 1000 words which are used by more than 90% of all shows and are available in a separate list. The HSK coverage for those should be higher, although also not 100%. HSK doesn't contain words like 閉嘴. I think in general HSK vocabulary is more oriented towards written language.

 

I would say the number of words you need to know is a good indicator for difficulty? The differences can be pretty large, especially if you look at the number of words per hour in the first four hours.
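To make the "words per hour" idea concrete, here is a minimal sketch of one possible reading of it: count how many previously unseen words each hour of a show introduces. This is not necessarily how the ranking was computed. The function name `new_words_per_hour`, the `(start_seconds, words)` input shape, and the assumption that the subtitles are already segmented into words are all hypothetical.

```python
def new_words_per_hour(cues, known=frozenset(), hours=4):
    """Count distinct new words introduced in each of the first `hours` hours.

    cues  -- iterable of (start_seconds, [words]) pairs, one per subtitle cue;
             the words are assumed to be pre-segmented (hypothetical input shape)
    known -- words the learner already knows, e.g. a basic word list
    """
    seen = set(known)
    counts = [0] * hours
    for start, words in cues:
        hour = int(start // 3600)
        if hour >= hours:
            continue  # outside the window we are measuring
        for word in words:
            if word not in seen:
                seen.add(word)
                counts[hour] += 1
    return counts


cues = [
    (10, ["我", "很", "好"]),
    (3700, ["我", "不", "好"]),
    (7300, ["閉嘴"]),
]
print(new_words_per_hour(cues, known={"我"}, hours=2))
```

A show whose early hours introduce many new words would then rank as harder to get into than one that reuses a small vocabulary.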


  • 2 weeks later...
On 2/9/2020 at 9:38 PM, wibr said:

If you have soft subs for more shows I'd be happy to include them.

 

How did you do your statistics? Did you manually download all subs and then use CTA?

Is there a way to automate this for the whole show (not just for single episodes)?


4 hours ago, Jan Finster said:

Is there a way to automate this for the whole show (not just for single episodes)?

Merging all subtitles into a single file is one way.  The other way would be to write a Lua script to process all files in a single directory (or similar).
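For anyone who prefers Python to Lua, a minimal sketch of the merge approach might look like the following. It assumes the subtitles are SRT files encoded as UTF-8 and sitting in one folder; the function names and the folder layout are made up for illustration and are not part of CTA.

```python
import re
from pathlib import Path

# SRT timing lines look like "00:01:02,345 --> 00:01:04,567".
TIMING = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt):
    """Return only the dialogue lines from the contents of one SRT file."""
    lines = []
    for raw in srt.splitlines():
        line = raw.replace("\ufeff", "").strip()
        # Skip blanks, cue numbers and timing lines.
        # Caveat: a dialogue line made up only of digits is dropped too.
        if not line or line.isdigit() or TIMING.search(line):
            continue
        lines.append(line)
    return lines


def merge_subtitles(folder, out_file):
    """Merge every .srt file in `folder` into `out_file`; return the line count."""
    merged = []
    # Files are taken in name order; for word counts the order does not matter.
    for path in sorted(Path(folder).glob("*.srt")):
        merged.extend(srt_to_text(path.read_text(encoding="utf-8")))
    Path(out_file).write_text("\n".join(merged) + "\n", encoding="utf-8")
    return len(merged)
```

Calling `merge_subtitles("subs", "merged.txt")` (both names hypothetical) would produce one plain-text file for the whole show that can then be loaded into CTA.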


22 minutes ago, wibr said:

I manually downloaded the subtitles,

 

I thought so... That is a lot of work! 😞 I thought of doing it for 都挺好, but it has 45 episodes...😨

 

22 minutes ago, wibr said:

I guess you could merge all the subtitles files into one and load it in CTA.

 

18 minutes ago, imron said:

Merging all subtitles into a single file is one way.

 

This is the obvious solution for us mere mortals, who have no idea what Python or Lua is...😉

 


32 minutes ago, wibr said:

Where is 都挺好 available with Chinese subs?

https://www.youtube.com/watch?v=YtzqsA-a8MM&list=PLQqbdnAgoRmYhfPJgYB9YQxDsNQ-ErQBd

 

Thanks for your tip, but I do not really understand how to run those GitHub scripts...

But, you are right, it should not take too long. I could use downsub.com. I will do it at some point and send you the merged files.


10 minutes ago, imron said:

Are these official subs or youtube 'speech to text' subs?

I do not know. All I can say is that the subtitles do not always follow the speech word for word. Sometimes they use a different way to express what is being said. I guess this is not something automatic speech-to-text would do (?)


OK here it is: 

All is Well 都挺好

46 Episodes

Available on Youtube: https://www.youtube.com/watch?v=YtzqsA-a8MM&list=PLQqbdnAgoRmYhfPJgYB9YQxDsNQ-ErQBd

Topic: family relationships, family conflicts

 

My CTA data:

Total words: 76839

Unique words: 5213

56.97% of unique words are non-HSK

23.76% of all words are non-HSK

2022 unique characters.
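The two non-HSK percentages above differ because one counts each distinct word once and the other counts every occurrence, and common words repeat far more often than rare ones. A minimal sketch of that distinction follows; it is not CTA's actual implementation, and `non_hsk_share`, the token list and the `hsk_words` set are all hypothetical stand-ins.

```python
from collections import Counter


def non_hsk_share(tokens, hsk_words):
    """Return (share of unique words, share of all words) not in `hsk_words`.

    tokens    -- list of already segmented words from the subtitles
    hsk_words -- set of words on the HSK lists (hypothetical input)
    """
    counts = Counter(tokens)
    if not counts:
        raise ValueError("no tokens given")
    total = sum(counts.values())
    unique = len(counts)
    non_hsk_unique = sum(1 for word in counts if word not in hsk_words)
    non_hsk_total = sum(n for word, n in counts.items() if word not in hsk_words)
    return non_hsk_unique / unique, non_hsk_total / total


tokens = ["我", "我", "我", "好", "閉嘴"]
by_unique, by_total = non_hsk_share(tokens, hsk_words={"我", "好"})
print(f"{by_unique:.0%} of unique words, {by_total:.0%} of all words are non-HSK")
```

In this toy input the one non-HSK word is a third of the distinct words but only a fifth of everything spoken, which is the same pattern as the 56.97% versus 23.76% figures.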

 

 

I think this show is fantastic for learning the following vocabulary: everyday life, family relationships, food, and some business language!

I am going to binge watch the last 4 episodes today and then I am done :) 

AllisWell-complete1-46.txt


On 2/22/2020 at 9:54 AM, Jan Finster said:

Yes, when you watch there are only English soft subs, but using Downsub.com or Lingq you get the Chinese subtitles too.

I will send it soon.

I didn't know that website, downsub.com; it's always interesting to learn new things. It seems to download subtitles from different video platforms, and it allows auto-translation, but is that translation good? Any experience with that?

 

Thanks!


13 hours ago, Lu Jo dido said:

I didn't know that website, downsub.com; it's always interesting to learn new things. It seems to download subtitles from different video platforms, and it allows auto-translation, but is that translation good? Any experience with that?

 

All I can say is that the subtitles do not always follow the speech word for word. Sometimes they use a different way to express what is being said. I guess this is not something automatic speech-to-text would do (?) They are the same subtitles that can be extracted by Lingq.com, and I am quite sure Lingq.com does not extract auto-translations. I have not come across any nonsensical text in the subs of that show either. As far as I can assess the correctness of the grammar at my stage, it sounded OK.


