Frequently used chengyu project


chrix


When learning chengyu in Chinese, one has to confront the question: do we go with the classical ones taken from the Grand Masters, or the ones with nice stories behind them, or do we also include chengyu that are frequently used nowadays but might not hail from a classical source? I guess you just have to bite the bullet and learn at least a couple of hundred chengyu, because you need to be able to demonstrate your command of the classics, but you also have to know the run-of-the-mill chengyu nobody would call especially classic, but which get used all the time nonetheless.

Now I have seen the great lists and everything people have provided here on this forum, and I have already profited a lot from using those. But those lists usually include at least a couple of hundred chengyu, for instance 258 in the Singaporean high school list, or 155 in the HSK list. And then of course there are a lot of rather subjective top ten lists around.

What I've tried is to find a list of 30-40 chengyu that can be considered to be essential, both from the classicist's point of view and the perspective of popular usage. For this, I used the appendix provided by the Chengyu dictionary from the Taiwanese Ministry of Education (MOE), which contains 48,000+ chengyu that are assigned numbers according to how many sources include them (by sources they mean chengyu dictionaries and textbooks and the like, they list them here). I think that’s a good compromise. The numbers will still be skewed towards ‘classical’ chengyu because that’s what gets put in most compilations and so on, but frequently used chengyu not from classic sources still have a chance (yes, even words like 馬馬虎虎). So I have taken the 37 chengyu from the MOE Chengyu list that appear in 25 or more sources (27 is the highest score in the sample of 48,000+ chengyu) and put them into a list. (The translations are mostly from CEDICT; I looked up the forms that weren’t in CEDICT.) I hope that these 37 chengyu are a good compromise for a learner of Modern Mandarin: frequently used but still considered classical by many.
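The selection step described above is essentially a threshold filter over the source counts. A minimal sketch in Python, assuming the MOE appendix has been exported to a mapping of chengyu to source count; the function name `select_by_source_count` is made up for illustration, and the counts in the sample are placeholders, not the real MOE figures:

```python
def select_by_source_count(scores, minimum=25):
    """Return (chengyu, count) pairs with count >= minimum, highest count first."""
    picked = [(cy, n) for cy, n in scores.items() if n >= minimum]
    # Sort by count descending, then by the chengyu itself for a stable order.
    picked.sort(key=lambda item: (-item[1], item[0]))
    return picked


# Placeholder counts for illustration only; real values come from the MOE appendix.
sample_scores = {"畫蛇添足": 27, "狐假虎威": 25, "馬馬虎虎": 12, "一敗塗地": 26}
selected = select_by_source_count(sample_scores)
print(selected)
```

Lowering `minimum` to 24 would widen the sample in the same way.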

Please let me know what you think of this list, especially if there are any chengyu on this list that might not be that frequently used in Modern Mandarin. Thanks

1. 一丘之貉 yīqiū zhī háo jackals of the same tribe (idiom); fig. They are all just as bad as each other.

2. 掩耳盜鈴 yǎn'ěr-dàolíng to plug one's ears while stealing a bell (idiom); to deceive oneself / to bury one's head in the sand

3. 朝三暮四 zhāosān-mùsì lit. say three in the morning but four in the evening (idiom); to change sth that is already settled upon / indecisive / to blow hot and cold

4. 杞人憂天 Qǐrén-yōutiān man of Qǐ fears the sky falling (idiom); groundless fears

5. 一毛不拔 yīmáo-bùbá (saying) stingy; parsimonious

6. 班門弄斧 bānmén-nòngfǔ display one's slight skill before an expert

7. 囫圇吞棗 húlún-tūnzǎo to read hastily without thinking

8. 竭澤而漁 jiézé-éryú kill the goose that lays the golden eggs

9. 天衣無縫 tiānyī-wúfèng perfect (job); without a trace; flawless (lies)

10. 名落孫山 míngluò-sūnshān (idiom) to fail in imperial exam

11. 指鹿為馬 zhǐlù-wéimǎ making a deer out to be a horse (idiom); deliberate misrepresentation

12. 水落石出 shuǐluò-shíchū As the water recedes, the rocks appear (idiom); the truth comes to light / When facts are known, doubts dissipate. / Murder will out. / Name of a CCTV soap opera set in police station

13. 胸有成竹 xiōngyǒu-chéngzhú to plan in advance (idiom); a card up one's sleeve / forewarned is forearmed

14. 舉一反三 jǔyī-fǎnsān to raise one and infer three (idiom); to deduce many things from one case

15. 青出於藍 qīngchū-yúlán lit. green is born of blue, but beats blue (idiom); fig. the student becomes superior to the master / same as couplet 青出於藍,而勝於藍|青出于蓝,而胜于蓝

16. 江郎才盡 Jiāngláng-cáijìn Mr Jiang has exhausted his talent (idiom); to have used up one's literary talent or energy

17. 一敗塗地 yībài-túdì failed and wiped over the floor (idiom); to fail utterly / a crushing defeat / beaten and in a hopeless position

18. 信口雌黃 xìnkǒu-cíhuáng to talk off the cuff (idiom); idle talk

19. 口若懸河 kǒuruò-xuánhé mouth like a torrent (idiom) / eloquent / glib / voluble / have the gift of the gab

20. 如火如荼 rúhuǒ-rútú lit. white cogon flower like fire (idiom); fig. a mighty army like wildfire / daunting and vigorous (momentum) / flourishing / magnificent

21. 滄海桑田 cānghǎi-sāngtián lit. blue seas where once were mulberry fields (idiom); time brings great changes / life's vicissitudes

22. 為虎作倀 wéihǔ-zuòchāng to act as accomplice to the tiger (idiom); to take the part of the devil / to help a villain against honest people

23. 狐假虎威 hújiǎ-hǔwēi lit. the fox exploits the tiger's might (idiom); fig. to use powerful connections to intimidate people

24. 狼狽為奸 lángbèi-wéijiān villains collude together (idiom); to work hand in glove with sb (to nefarious ends)

25. 畫蛇添足 huàshé-tiānzú lit. draw legs on a snake (idiom); fig. to ruin the effect by adding sth superfluous / to overdo it

26. 老馬識途 lǎomǎ-shítú an old horse knows the way (idiom); an experienced worker knows what to do / an old hand knows the ropes

27. 草木皆兵 cǎomù-jiēbīng lit. every tree or bush an enemy soldier (idiom); fig. to panic and treat everyone as an enemy / to feel beleaguered

28. 驚弓之鳥 jīnggōng zhī niǎo lit. bird startled by a bowshot (idiom); fig. a frightened person

29. 南轅北轍 nányuán-běizhé (fig.) act in a way that defeats one's purpose

30. 望梅止渴 wàngméi-zhǐkě quench one's thirst by thinking of plums (idiom); to console oneself with false hopes

31. 勢如破竹 shìrú-pòzhú sweeping

32. 司空見慣 sīkōng-jiànguàn a common occurrence

33. 含沙射影 hánshā-shèyǐng insinuate

34. 投筆從戎 tóubǐ-cóngróng (of student or intellectual) give up civilian pursuits to join the military, to join the army voluntarily

35. 東窗事發 dōngchuāng-shìfā to come to light; the game is up

36. 夜郎自大 Yèláng-zìdà ignorant and boastful

37. 作法自斃 zuòfǎ-zìbì a scheme that boomerangs; to get into trouble through one's own scheme

Link to comment
Share on other sites

assigned numbers according to how many sources include them (by sources they mean chengyu dictionaries and textbooks and the like, they list them here) . So I have taken the 37 chengyu from the MOE Chengyu list that appear in 25 or more sources (27 is the highest score in the sample of 48,000+ chengyu) and put them into a list.

Not sure how effective this method is. About half of the "sources" counted are books for teaching chengyu, which have their own instructional purposes and don't necessarily try to teach the most commonly used chengyus. But under this methodology, being referred to once by such an instructional book counts as much as being an entry in a chengyu dictionary.

There are a number of idioms on your list that I have never seen before. I've listed them below. I have already gone through the Singapore Ministry of Education list plus a couple of hundred more, so I question if the list is really of the most commonly used.

1. 一丘之貉 yīqiū zhī háo jackals of the same tribe (idiom); fig. They are all just as bad as each other.

7. 囫圇吞棗 húlún-tūnzǎo to read hastily without thinking

8. 竭澤而漁 jiézé-éryú kill the goose that lays the golden eggs

9. 天衣無縫 tiānyī-wúfèng perfect (job); without a trace; flawless (lies)

15. 青出於藍 qīngchū-yúlán lit. green is born of blue, but beats blue (idiom); fig. the student becomes superior to the master / same as couplet 青出於藍,而勝於藍|青出于蓝,而胜于蓝

18. 信口雌黃 xìnkǒu-cíhuáng to talk off the cuff (idiom); idle talk

19. 口若懸河 kǒuruò-xuánhé mouth like a torrent (idiom) / eloquent / glib / voluble / have the gift of the gab

22. 為虎作倀 wéihǔ-zuòchāng to act as accomplice to the tiger (idiom); to take the part of the devil / to help a villain against honest people

24. 狼狽為奸 lángbèi-wéijiān villains collude together (idiom); to work hand in glove with sb (to nefarious ends)

26. 老馬識途 lǎomǎ-shítú an old horse knows the way (idiom); an experienced worker knows what to do / an old hand knows the ropes

31. 勢如破竹 shìrú-pòzhú sweeping

33. 含沙射影 hánshā-shèyǐng insinuate

34. 投筆從戎 tóubǐ-cóngróng (of student or intellectual) give up civilian pursuits to join the military, to join the army voluntarily

35. 東窗事發 dōngchuāng-shìfā to come to light; the game is up

37. 作法自斃 zuòfǎ-zìbì a scheme that boomerangs; to get into trouble through one's own scheme

Link to comment
Share on other sites

I agree with gato - I'm not sure all of these are amongst the most commonly used. (Although of the ones gato listed, I think 青出於藍 is quite common.)

Plus I can think of some pretty common ones which aren't on this list - just off the top of my head (I'm sure there are plenty of others):

人山人海

坐井觀天

井底之蛙

海闊天空

自相矛盾

一塌糊塗

一無所有

By the way, I just looked up 一丘之貉 in my 成語詞典, the 教育部重編國語辭典修訂本 and dict.cn - all of them list the pronunciation of 貉 as he2.

Link to comment
Share on other sites

about the method: I said it was skewed. However the learning materials are all geared towards teaching chengyu to native speakers, so the selection made should reflect the authors' judgement which chengyu are most important for a literate speaker to know. On the other hand, it wouldn't be too problematic to run a script to recalculate the scores excluding certain works used.
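Such a recalculation could look roughly like the sketch below. It assumes the data records, for each chengyu, which sources include it; whether the MOE appendix exposes that breakdown is an assumption here, and the source labels (`dict_a`, `textbook_x`, ...) and the function name `recalculate_scores` are invented for illustration:

```python
def recalculate_scores(sources_by_chengyu, excluded):
    """Count, for each chengyu, the sources that are not in `excluded`."""
    excluded = set(excluded)
    return {
        cy: len(set(sources) - excluded)
        for cy, sources in sources_by_chengyu.items()
    }


# Hypothetical input: the source labels are made up, not the MOE's own identifiers.
sources_by_chengyu = {
    "自相矛盾": ["dict_a", "dict_b", "textbook_x"],
    "一塌糊塗": ["dict_a", "textbook_x", "textbook_y"],
}
rescored = recalculate_scores(
    sources_by_chengyu, excluded=["textbook_x", "textbook_y"]
)
print(rescored)
```

Passing only the teaching materials as `excluded` would give a dictionary-only score.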

I don't see any other methods based on frequency so far, since the corpora we have are still far too small for a lexical frequency analysis of chengyu. I guess if somebody was able to get ahold of a huge newspaper corpus (and I'd say ideally at least 1 billion characters) then one could run a frequency analysis based on the MOE list perhaps. (Somebody with much better computing power at their disposal than me)
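For what it's worth, the counting itself is simple once a corpus is available. A minimal sketch, assuming the corpus fits in memory as one string; the two-sentence `corpus` below is a toy example, not real newspaper text, and `count_chengyu` is a hypothetical helper name:

```python
from collections import Counter


def count_chengyu(corpus_text, chengyu_list):
    """Count non-overlapping occurrences of each chengyu in the corpus text."""
    counts = Counter()
    for cy in chengyu_list:
        counts[cy] = corpus_text.count(cy)
    return counts


# Toy corpus for illustration only.
corpus = "他畫蛇添足,結果一敗塗地。大家都說他畫蛇添足。"
counts = count_chengyu(corpus, ["畫蛇添足", "一敗塗地", "狐假虎威"])
print(counts.most_common())
```

Scanning the text once per chengyu would probably be too slow for 48,000+ entries over a billion characters; a multi-pattern matcher such as Aho-Corasick would likely be a better fit at that scale.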

I think it boils down to a question of numbers though. I have read somewhere that while the Singaporeans only ask for 250, the Chinese standard is more like 1000 chengyu that high school graduates are expected to know. And the MOE dictionary has 1594 core chengyu (which amounts to 5123, if counting all the variants), all explained in great detail; in my opinion it is the greatest resource on chengyu that's available online!

I think each of us has approached learning chengyu differently, and has mastered different amounts and types of chengyu, so that's why I was especially hoping for a native speaker to chime in. Of the 37 chengyu, the following are not in CEDICT: 囫圇吞棗, 竭澤而漁, 天衣無縫, 江郎才盡, 含沙射影, 投筆從戎, 東窗事發, 望梅止渴, 夜郎自大, 作法自斃. But CEDICT lacks quite a number of chengyu that can indisputably be considered classic, so I'm not sure what this means really. Of course it might help to widen the sample and include the chengyu that had a score of 24 as well, but that would already add 21 more.

If you map CEDICT and the MOE Chengyu frequency list, then you get around 4600-4700 from CEDICT. I'm right now trying to add the most important chengyu that CEDICT doesn't have to my list, and the MOE frequency numbers have been one tool, but also lists like the Singaporean one, and I've probably done about 150 or something so far. When I mapped the HSK Chengyu list, I was happy to see that most of the items not in it were actually not "real" chengyu...
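The mapping step can be sketched as a set lookup. This assumes the usual CC-CEDICT line layout (`Traditional Simplified [pinyin] /gloss/.../`, with `#` for comment lines); the helper names and the three sample lines are made up for illustration:

```python
def cedict_headwords(lines):
    """Collect traditional and simplified headwords from CC-CEDICT-style lines."""
    words = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(" ", 2)
        if len(parts) >= 2:
            words.update(parts[:2])
    return words


def split_by_coverage(chengyu_list, headwords):
    """Split the list into entries found in the dictionary and entries missing."""
    covered = [cy for cy in chengyu_list if cy in headwords]
    missing = [cy for cy in chengyu_list if cy not in headwords]
    return covered, missing


# Hand-written sample lines in CC-CEDICT layout, not an actual extract.
sample_cedict = [
    "# comment line",
    "畫蛇添足 画蛇添足 [hua4 she2 tian1 zu2] /lit. draw legs on a snake (idiom)/",
    "狐假虎威 狐假虎威 [hu2 jia3 hu3 wei1] /lit. the fox exploits the tiger's might (idiom)/",
]
covered, missing = split_by_coverage(
    ["畫蛇添足", "狐假虎威", "竭澤而漁"], cedict_headwords(sample_cedict)
)
print(covered, missing)
```

The `missing` list is then the candidate set to add by hand.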

Link to comment
Share on other sites

Get your list of Chengyu, run a batch of Google searches, rank chengyu by number of results.

I had a quick look for software that might do this, with no joy - it's all SEO stuff designed to tell you where your site is, not how many results there are overall. However, someone with some skillz and a Google API may be able to figure it out.

胡说八道:2.3m

人山人海:1.6m

马马虎虎:780k

一丘之貉:176k
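Ranking figures like these is easy once they are turned into plain numbers. A small sketch, assuming the counts are noted by hand in the "2.3m" / "780k" shorthand used above; `parse_count` is a hypothetical helper:

```python
def parse_count(text):
    """Convert strings like '2.3m' or '780k' to an integer estimate."""
    multipliers = {"k": 1_000, "m": 1_000_000}
    text = text.strip().lower()
    if text and text[-1] in multipliers:
        # round() absorbs floating-point noise such as 2299999.9999999995
        return round(float(text[:-1]) * multipliers[text[-1]])
    return int(text)


# The four figures quoted above.
raw = {"胡说八道": "2.3m", "人山人海": "1.6m", "马马虎虎": "780k", "一丘之貉": "176k"}
ranked = sorted(raw, key=lambda cy: parse_count(raw[cy]), reverse=True)
print(ranked)
```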

Link to comment
Share on other sites

Waiming,

thanks for the note about 一丘之貉. That was a mistake in CEDICT, I've fixed it in my own list now.

Well, I never said these were the 37 MOST frequently used chengyu, I was just trying to find a way to come up with a reasonably small number of chengyu a learner could start with. I think the fundamental problem is that there are so many of them, and each individual chengyu does not get used all that often. But it also might be the case that this is indeed leading in the wrong direction; perhaps there is no easy way and one really needs at least 200-300 chengyu or something like that. FWIW, a couple from gato's list are discussed as part of a "50則易錯成與背後的故事" list in a nice chengyu book I've been reading (不錯用成語 by 石雨祺).

Just out of interest, the scores for the chengyu provided by you:

人山人海 15

坐井觀天 23

井底之蛙 19

海闊天空 18

自相矛盾 22

一塌糊塗 13

一無所有 14

13 is just the breaking point of the 1594. (Unfortunately no list of the 1594 core chengyu is available to map it against the MOE frequency data).

I have checked the sources of 人山人海, 一塌糊塗 and 一無所有, and the reason they score so low is that they are either from the Song era (人山人海), of unknown source (thus strongly suspected to derive from popular usage, 一塌糊塗), or from the Tang era but from an obscure source (一無所有). The more they deviate from the prototypical definition of what most authorities consider chengyu to be, the less they will occur in dictionaries and the like, even if they're used frequently.

So maybe the thread should rather be "37 important chengyu" or something like that.

Link to comment
Share on other sites

--------------------------------------------------------------------------------

Get your list of Chengyu, run a batch of Google searches, rank chengyu by number of results.

That would be great. Just feed the 48,000+ chengyu into the Google and see what happens. Alas, I lack both the skillz and the Google API...

Of course google results have to be taken with a grain of salt (or rice rather), but it would be the next best thing to that one-billion-character newspaper corpus I was daydreaming about...

Link to comment
Share on other sites

  • 5 months later...

Just to revisit this thread, I've talked with Taiwanese friends about the chengyu on this list, and it seems that this list reflects a bias towards those chengyu the school system in Taiwan wants its students to learn. Which of course follows naturally from the source of the list. Chengyu considered worthwhile learning will appear in more sources and get higher points.

And not all commonly taught chengyu are particularly frequent, and sometimes relatively obscure chengyu can be regarded by some committee as essential for a Taiwanese student to know as an adult or something.

As for the most frequent chengyu, since most chengyu are of relatively low frequency anyway, one is still looking at learning at least 300-400 before one can claim to have mastered the basics....

Link to comment
Share on other sites

Of the 37 chengyu in #1, I think "8. 竭澤而漁 drain the pond for the fish" is less common. Its equivalent "殺雞取卵 kill the hen for the eggs" is much more common.

I think learning just 37 chengyu is not enough.

Link to comment
Share on other sites

Oh, I never said that. When I first started out with my little chengyu project, I was hoping to find some kind of core set of 100-200 most frequent chengyu, but alas, it isn't so. At least 500, that's my impression as of now, better 1000....

Link to comment
Share on other sites

That would be great. Just feed the 48,000+ chengyu into the Google and see what happens. Alas, I lack both the skillz and the Google API...

Indeed, that would be pretty awesome. I'd be surprised if there isn't a freeware/plugin/etc that could do something to that effect for us. Hmm.. *goes searching*

Link to comment
Share on other sites

I found a Python function that can help with that. If someone sends me a link to a plaintext file with the chengyu each on its own line, I'll slap something together that'll sort it by Google's 'estimated results'.

BTW, if anybody else is interested, here's the example code: http://stackoverflow.com/questions/1657570/google-search-from-a-python-app
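The sorting part, independent of how the counts are fetched, might look like the sketch below. The search call is deliberately left as a pluggable `lookup` function, since the exact API is not specified here; `rank_by_estimate` and the stand-in `fake_counts` table are hypothetical, with the three numbers taken from results reported elsewhere in this thread:

```python
import io


def rank_by_estimate(lines, lookup):
    """Rank the chengyu in `lines` (one per line) by lookup(chengyu), highest first."""
    phrases = [line.strip() for line in lines if line.strip()]
    unique = list(dict.fromkeys(phrases))  # drop duplicates, keep first-seen order
    scored = [(p, lookup(p)) for p in unique]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Stand-in for a real search call; swap in an actual result-count lookup here.
fake_counts = {"愛莫能助": 77500, "安分守己": 71500, "按兵不動": 135000}
plaintext = io.StringIO("愛莫能助\n安分守己\n\n按兵不動\n愛莫能助\n")
ranking = rank_by_estimate(plaintext, lambda p: fake_counts.get(p, 0))
print(ranking)
```

An open file object can be passed in place of the `io.StringIO` buffer, since both yield one line at a time.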

Link to comment
Share on other sites

After some preliminary Googling, the best list I've found so far that is just plain text (without pinyin, translations, etc) is http://en.wiktionary.org/wiki/Category:Mandarin_idioms however these include both four/five-character 成語 and any other idiomatic phrase. Also, there are only 1,675 entries available at present, so it's hardly comprehensive. However it might be good for a preliminary test.

EDIT: *quickly adds favourite chengyu 三天打魚兩天曬網*
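To separate four-character entries from the other idiomatic phrases in a mixed list like that, a rough filter could be used. This is only a sketch: it assumes "exactly four characters from the basic CJK Unified Ideographs block" as the test, which would wrongly drop longer chengyu such as the eight-character one just mentioned; `is_four_char_chengyu` is a made-up name:

```python
def is_four_char_chengyu(entry):
    """True if the entry is exactly four CJK ideographs (U+4E00 to U+9FFF)."""
    return len(entry) == 4 and all("\u4e00" <= ch <= "\u9fff" for ch in entry)


# Sample entries of mixed length, including one with punctuation.
entries = ["愛不釋手", "愛面子", "八九不離十", "白刀子進,紅刀子出", "安分守己"]
four_char = [e for e in entries if is_four_char_chengyu(e)]
print(four_char)
```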

Link to comment
Share on other sites

hmkay, I got something, but before I let it run over all 1600 chengyu, I let it do the first 30. Can someone with more Chinese knowledge than me tell me if they're more or less in the right ballpark, concerning relative frequency?

阿狗阿猫 4130000

阿狗阿貓 4100000

爱不释手 779000

愛不釋手 771000

八千里路云和月 395000

八千里路雲和月 389000

八九不离十 299000

按劳分配 270000

按勞分配 269000

按兵不动 136000

按兵不動 135000

愛面子 131000

爱面子 131000

阿猫阿狗 93000

阿貓阿狗 90600

八九不離十 87700

愛莫能助 77500

爱莫能助 77500

安分守己 71500

暗暗自责 53700

白刀子进,红刀子出 50300

白刀子進,紅刀子出 50000

八仙过海,各显神通 44400

the number is what google gave me as 'estimated results'. They should be taken with a grain of salt, but are probably good enough to sort by.

So, what do you think?

Link to comment
Share on other sites
