anthony_barker Posted June 7, 2006 at 08:20 PM
I find it surprising that there is so much emphasis on learning characters. I have always found it easier to learn words and then recognize the characters within the words, the way Chinese speakers do naturally. Of course, some characters are words in their own right, in which case they can be learnt individually. That said, there are a lot of lists of characters by frequency, but what about words by frequency? The only one I know of is in Wenlin. What about word flashcard software? Has anyone done a frequency list of spoken vocabulary? I suppose this could be done by analyzing subtitles from movies and TV shows and breaking the text up into words by doing lookups on the adso db.
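The subtitle idea can be sketched in a few lines. This is only an illustration under stated assumptions: the tiny `LEXICON` set is a hypothetical stand-in for a real dictionary such as the adso db, the sample subtitle lines are made up, and greedy longest-match lookup is one simple segmentation strategy, not necessarily what any existing tool does.

```python
import re
from collections import Counter

# Hypothetical toy lexicon standing in for a real dictionary lookup.
LEXICON = {"我们", "知道", "不知道", "什么", "时候", "现在", "没有"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def segment(text, lexicon=LEXICON, max_len=MAX_WORD_LEN):
    """Greedy forward longest-match segmentation.

    At each position, take the longest dictionary word that starts there;
    if nothing matches, fall back to a single character.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def word_frequencies(lines):
    """Count segmented words over an iterable of subtitle lines."""
    counts = Counter()
    for line in lines:
        # Segment only runs of Chinese characters, skipping punctuation,
        # timestamps and Latin text.
        for run in re.findall(r"[\u4e00-\u9fff]+", line):
            counts.update(segment(run))
    return counts


if __name__ == "__main__":
    subtitles = ["我们不知道。", "现在我们知道"]  # made-up sample lines
    for word, count in word_frequencies(subtitles).most_common():
        print(word, count)
```

Longest-match is easy to implement but makes mistakes on ambiguous strings, so a list built this way would still need checking by hand.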
roddy Posted June 8, 2006 at 02:13 AM
Agreed. I think you ideally need a two-pronged approach - you need to learn words while learning vocab, but to learn characters while learning to write those words. I think there's a spreadsheet on here somewhere with a list of word frequency; I'll see if I can find a link. The only ones I've seen, though, are based on large chunks of text from the internet, usually news - which means you get very 'written' vocabulary and virtually nothing spoken. I'm not even sure if there are any significant corpora of spoken Chinese to work from. The subtitle idea is a good one though . . . Word flashcard software - look at zdt or plecodict.
student Posted June 8, 2006 at 02:29 AM
I think you can find what you are looking for in this thread: The first 30,000 Chinese words by frequency http://www.chinese-forums.com/index.php?/topic/3997-the-first-30000-chinese-words-by-frequency
gato Posted June 8, 2006 at 03:04 AM
That list is of questionable value because its frequencies seem to be based on government press releases (circa 1975). Here are its most popular 20 words: http://www.chinese-forums.com/phrases.xls
我们 /we/us/ourselves/
可以 /can/may/possible/able to/
他们 /they/
进行 /to advance/to conduct/underway/in progress/to do/to carry out/to carry on/to conduct/to execute/
没有 /haven't/hasn't/doesn't exist/to not have/to not be/
工作 /job/work/construction/work/task/
人民 /(the) people/
生产 /to produce/manufacture/
这个 /this/
发展 /development/growth/to develop/to grow/to expand/
就是 /(emphasizes that something is precisely or exactly what is stated)/precisely/exactly/even/if/just like/in the same way as/
问题 /problem/issue/topic/
国家 /country/nation/
中国 /China/Chinese/
我党 /our party/
这样 /this (kind of, sort of)/this way/such/like this/such/
革命 /(political) revolution/
自己 /self/(reflexive pronoun)/own/
不能 /cannot
gato Posted June 8, 2006 at 03:35 AM
This site provides a word list (limited to two-character words, or "bigrams"). http://lingua.mtsu.edu/chinese-computing/statistics/bigram/form.php?Selected=FICT
Its top-20 words based on a selection of fiction writings (somewhat closer to the conversational language) are:
1 一个 57186 4.21834619678 57186
2 什么 45946 7.66008832863 103132
3 没有 40408 5.89442863226 143540
4 自己 35411 8.22590596472 178951
5 我们 33208 4.37625854551 212159
6 他们 31875 4.56612189359 244034
7 知道 22390 7.78745189666 266424
8 起来 21237 5.37620423292 287661
9 这个 21137 3.83614462594 308798
10 时候 19493 7.78539684589 328291
11 这样 19203 5.23431393801 347494
12 怎么 17118 7.3134229469 364612
13 已经 16604 8.44503875864 381216
14 现在 16279 5.7008234155 397495
15 出来 14315 4.45046966356 411810
16 不能 13410 4.15616711186 425220
17 还是 13211 3.71041616582 438431
18 不知 12635 4.47906889549 451066
19 可以 12462 6.5032612462 463528
20 女人 12426 5.08744035395 475954
The top-20 based on news articles are:
1 中国 53185 4.69339339612 53185
2 美国 31840 5.19176188171 85025
3 发展 26535 7.14269896701 111560
4 经济 22087 8.07034210667 133647
5 国家 21076 4.84091033157 154723
6 问题 19992 8.50211672358 174715
7 一个 19835 5.08647326702 194550
8 工作 18904 6.89133239246 213454
9 台湾 18020 8.39363573728 231474
10 社会 17841 6.6029564076 249315
11 政府 17667 8.04318829791 266982
12 记者 17154 8.08978189963 284136
13 我们 16676 7.17933037085 300812
14 人民 16651 5.06174414404 317463
15 进行 15084 6.18844527135 332547
16 北京 14659 8.76122765868 347206
17 企业 13554 8.02096355855 360760
18 表示 13347 8.00097495209 374107
19 国际 12712 5.70882614905 386819
20 他们 12326 6.58766409264 399145
gougou Posted June 8, 2006 at 07:15 AM
Quote: "top-20 words based on a selection of fiction writings"
Their definition of a word seems to be any two adjacent characters, no matter whether the word actually ends after those two characters (I'd be surprised if 12635 instances of 不知 were not followed by 道) or even whether a word boundary falls between them (不能).
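To make that suspicion concrete, here is a small sketch of what counting every adjacent character pair does. It assumes the list was built by plain overlapping bigram counting, which the site does not confirm here; the sample sentence and the helper names are made up for illustration.

```python
from collections import Counter


def char_bigrams(text):
    """Return every overlapping pair of adjacent characters in text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def followed_by(text, bigram, next_char):
    """Count occurrences of a two-character bigram directly followed by next_char."""
    return sum(
        1
        for i in range(len(text) - 2)
        if text[i:i + 2] == bigram and text[i + 2] == next_char
    )


if __name__ == "__main__":
    sample = "我不知道他不知道"  # made-up sentence containing 不知道 twice
    counts = Counter(char_bigrams(sample))
    # 不知 is counted as a "word" even though each occurrence is part of 不知道.
    print(counts["不知"], followed_by(sample, "不知", "道"))
```

Running the same check against the actual corpus would show how many of the 不知 hits are really 不知道.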
Guest realmayo Posted June 8, 2006 at 08:51 AM
I'd agree with the two-pronged approach. I tend to test myself from English to Chinese on the single characters and from Chinese to English on the words (i.e. vocab).
wrbt Posted June 8, 2006 at 08:09 PM
I can go over a character again and again and will often keep forgetting it if I just learned it by itself, but once I've seen it in a few different words it's locked in.