Most frequently used characters vs combinations

@imron I agree in general, but I do find having frequency lists based on popular media to be very useful.... for consuming popular media.


The problem is that what one person considers popular can be hugely different from what someone else considers popular. The Sun and The Times use different vocabulary and different grammar. The same is true for Popular Science and a fashion magazine. A few decent lists do exist, but they are basically just a 'random' selection of sources. If you're interested in these, you can search for 'corpus mandarin'.




It's just I don't have a very complete collection of texts of popular Chinese media... otherwise I could make a great big text file and CTA it.



If you want a more personalised approach, collect what you like to read and analyse it with a tool like Chinese Text Analyser. IMHO a 'very complete collection' is a ghost; it does not really exist. You can collect a large amount of data from a very large number of sources, but some sources will always be missed, or the amounts taken from the different sources will not be properly balanced. For all practical study purposes, a decent amount of text from a decent number of sources yields good enough results. Even if you have a very complete set, the moment you open the newspaper you're out of luck: after refugees, Greece and a nuclear deal, a new 'pet subject' is likely to come up, with yet another vocabulary distribution. Vocabulary distributions in popular media change over time depending on the latest hype.
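To give an idea of what analysing your own collection can look like, here is a minimal Python sketch. It is not how Chinese Text Analyser works (as far as I know, a proper tool segments text into real words, which this does not attempt). It only counts single characters and adjacent two-character combinations. The function names and the folder of `.txt` files are my own assumptions for illustration, and only the basic CJK block (U+4E00 to U+9FFF) is treated as Chinese.

```python
from collections import Counter
from pathlib import Path


def is_cjk(ch: str) -> bool:
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def char_frequencies(texts):
    """Count how often each Chinese character occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(ch for ch in text if is_cjk(ch))
    return counts


def bigram_frequencies(texts):
    """Count adjacent two-character combinations.

    This is a crude stand-in for word counting, not real segmentation:
    every pair of neighbouring characters is counted, word or not.
    """
    counts = Counter()
    for text in texts:
        for a, b in zip(text, text[1:]):
            if is_cjk(a) and is_cjk(b):
                counts[a + b] += 1
    return counts


def load_texts(folder):
    """Read every UTF-8 .txt file in a (hypothetical) folder of collected texts."""
    return [p.read_text(encoding="utf-8") for p in sorted(Path(folder).glob("*.txt"))]


if __name__ == "__main__":
    # Tiny inline sample; replace with load_texts("my_reading") for real files.
    sample = ["我喜欢看中文报纸。", "我喜欢中文。"]
    print(char_frequencies(sample).most_common(5))
    print(bigram_frequencies(sample).most_common(5))
```

Run it on whatever you actually read, then run it again a few months later on newer material. Comparing the two lists shows how much the distribution shifts with the topics of the day.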


I understand where you're coming from. I chased the same ghost for some time, but for learning purposes I think a shorter-term approach is better. Learn what's useful 'today' and don't worry about tomorrow.

