Here is a list of 800 chengyu that elementary school students are supposedly expected to know. That seems like a lot for elementary school kids to me, but more is expected of kids nowadays, so maybe it is realistic.



一败如水 胆小如鼠 引狼入室 风驰电掣 刀山火海 一贫如洗 料事如神
视死如归 对答如流 挥金如土 铁证如山 度日如年 心急如焚 巧舌如簧
如雷贯耳 如履薄冰 如日中天 势如破竹 稳如泰山 骨瘦如柴 爱财如命
暴跳如雷 红叶似火 心如乱麻 高手如林 健步如飞 守口如瓶 栩栩如生
骄阳似火 冷若冰霜 门庭若市 恩重如山 从善如流 观者如云 浩如烟海

怒发冲冠 一目十行 一日千里 一字千金 一泻千里 一触即发 百发百中
一日三秋 不毛之地 不计其数 胆大包天 寸步难行 一步登天 千钧一发

弃暗投明 取长补短 厚今薄古 苦尽甘来 七上八下 九死一生 三长两短
大同小异 大材小用 大智若愚 小题大做 上行下效 上窜下跳 无中生有
天翻地覆 化险为夷 凶多吉少 古为今用 古往今来 旧仇新恨 出生入死
生离死别 有名无实 有备无患 有眼无珠 寻死觅活 异口同声 异曲同工
阳奉阴违 此起彼伏 吐故纳新 同甘共苦 因小失大 优胜劣败 自生自灭
评头论足 远交近攻 求同存异 声东击西 克己奉公 扶弱抑强 改邪归正
里应外合 删繁就简 彻头彻尾 空前绝后 顶天立地 拨乱反正 居安思危
承上启下 承前启后 虎头蛇尾 明争暗斗 明枪暗箭 明察暗访 知彼知己
供不应求 舍本求末 舍生忘死 舍近求远 举足轻重 前因后果 将信将疑
南腔北调 南辕北辙 挑肥拣瘦 柳暗花明 厚此薄彼 除旧更新 畏首畏尾
绝无仅有 谄上欺下 起死回生 顾此失彼 能屈能伸 深入浅出 推陈出新
假公济私 量入为出 惩前毖后 街头巷尾 摇头摆尾 僧多粥少 颠倒是非
颠倒转折 避实就虚 避重就轻 藏头露尾 瞻前顾后

胡言乱语 仁人志士 调兵遣将 手舞足蹈 魂飞魄散 狐群狗党 人山人海
丢盔弃甲 呼风唤雨 龙腾虎跃 胡思乱想 日久天长 金枝玉叶 长吁短叹
争名夺利 虎背熊腰 日新月异 甜言蜜语 长年累月 高瞻远瞩 自由自在
争分夺秒 花言巧语 引经据典 流言蜚语 自轻自贱 崇山峻岭 含辛茹苦
风起云涌 假仁假义 自暴自弃 想方设法 穷凶极恶 雷厉风行


欣欣向荣 夸夸其谈 蒸蒸日上 人人皆知 井井有条 斤斤计较
历历可数 滔滔不绝 人人自危 心心相印 历历在目 比比皆是
循循善诱 蠢蠢欲动 步步为营 娓娓动听 振振有词 奄奄一息
洋洋得意 格格不入 默默无闻 津津有味 摇摇欲坠 耿耿于怀

不闻不问 无边无际 无忧无虑 自由自在 自言自语 自作自受
不折不扣 无影无踪 半信半疑 如痴如醉 人山人海 毛手毛脚
碍手碍脚 缩手缩脚 全心全意 不三不四 不知不觉 一模一样

山山水水 兢兢业业 高高兴兴 战战兢兢 三三两两 鬼鬼祟祟 熙熙攘攘
吞吞吐吐 口口声声 形形色色 大大咧咧 慌慌张张 原原本本 清清楚楚

神采奕奕 眉飞色舞 昂首挺胸 惊慌失措 漫不经心
垂头丧气 没精打采 炯炯有神 愁眉苦脸 大惊失色

奋不顾身 舍己为人 坚强不屈 赤胆忠心 不屈不挠 忠贞不渝
誓死不二 肝胆相照 克己奉公 一尘不染 两袖清风 见利忘义
豁达大度 兢兢业业 卖国求荣 恬不知耻 贪生怕死 威武不屈
舍生忘死 永垂不朽 顶天立地 厚颜无耻

学无止境 学而不厌 废寝忘食 争分夺秒 不甘示弱 全力以赴 真才实学
孜孜不倦 力争上游 好学不倦 笨鸟先飞 披荆斩棘 不学无术 闻鸡起舞
勤学好问 自强不息 发愤图强 只争朝夕

不骄不躁 大智若愚 谨言慎行 功成不居 戒骄戒躁 洗耳恭听 虚怀若谷 自知之明

班门弄斧 孤芳自赏 居功自傲 目空一切 目中无人 恃才傲物 妄自尊大
自鸣得意 自我陶醉 自命不凡 忘乎所以 唯我独尊 自高自大

忐忑不安 心惊肉跳 心神不安 心猿意马 心慌意乱 七上八下 心急如焚

口若悬河 对答如流 滔滔不绝 谈笑风生 高谈阔论 豪言壮语 夸夸其谈

闭月羞花 沉鱼落雁 出水芙蓉 明眸皓齿 心慈面善 老态龙钟 面黄肌瘦
美如冠玉 张牙舞爪 虎背熊腰 绰约多姿 倾国倾城 愁眉苦脸 如花似玉
其貌不扬 国色天香 冰清玉洁 容光焕发 蓬头垢面 鹤发童颜 眉清目秀
和蔼可亲 雍容华贵 文质彬彬 威风凛凛 落落大方 弱不禁风 大腹便便

直言不讳 无所顾忌 拐弯抹角 真心诚意 喋喋不休 娓娓动听 故弄玄虚
慢条斯理 绘声绘色 侃侃而谈 含糊其辞 对答如流 滔滔不绝 唠唠叨叨
自圆其说 虚情假意 推心置腹 旁敲侧击 振振有词 肆无忌惮 大言不惭

眉开眼笑 捧腹大笑 眉飞色舞 手舞足蹈 喜上眉梢 如获至宝 喜笑颜开
相视而笑 喜从天降 谈笑风生 笑容可掬 兴高采烈

亲密无间 推心置腹 肝胆相照 情同手足 促膝谈心 志同道合 情深似海
风雨同舟 拔刀相助 荣辱与共 海誓山盟 同甘共苦 关怀备至 盛情款待

万紫千红 春暖花开 鸟语花香 姹紫嫣红 春花秋月 花红柳绿 百花争艳
遍地开花 过时黄花 花团锦簇 花枝招展 锦上添花 火树银花 明日黄花

云雾迷蒙 九霄云外 腾云驾雾 壮志凌云 孤云野鹤 风云变幻 风起云涌
行云流水 烘云托月 过眼烟云 烟消云散 风卷残云 彤云密布 浮云蔽日

大雨倾盆 血雨腥风 风雨交加 满城风雨 风调雨顺 滂沱大雨 枪林弹雨
春风化雨 风雨同舟 风雨飘摇 风雨无阻 斜风细雨 和风细雨 狂风暴雨

崇山峻岭 山明水秀 高山深涧 山穷水尽 悬崖峭壁 大好河山 峰峦雄伟
刀山火海 漫山遍野 地动山摇 江山如画 锦绣河山 还我河山 湖光山色

水流湍急 一泻千里 波澜壮阔 水乳交融 波涛汹涌 血流成河 水平如镜
滴水不漏 杯水车薪 翻腾怒吼 高山流水 洪水猛兽 千山万水 水滴石穿

五彩缤纷 五颜六色 一碧千里 五光十色 万紫千红 灯红酒绿 花红柳绿
青红皂白 绿水青山 古色古香 光彩夺目 翠色欲流 姹紫嫣红 面如土色

不可多得 凤毛麟角 寥寥无几 宁缺毋滥 前所未闻 九牛一毛 屈指可数
沧海一粟 绝无仅有 三三两两 千古绝唱 空前绝后 铁树开花 微不足道
微乎其微 独具匠心 寥若晨星 一鳞半爪 独树一帜 一丝一毫 百里挑一

接踵摩肩 车水马龙 川流不息 门庭若市 座无虚席 纷至沓来 万人空巷
万籁俱寂 花花世界 水泄不通 鸦雀无声 举袖为云 人声鼎沸 门可罗雀
挥汗如雨 人欢马叫 接踵而至 络绎不绝 人山人海 震耳欲聋

包罗万象 琳琅满目 美不胜收 眼花缭乱 不计其数 漫山遍野 目不暇接
洋洋大观 层出不穷 星罗棋布 无奇不有 无穷无尽 一应俱全 应有尽有
绰绰有余 多多益善 多才多艺 足智多谋 无所不包 应接不暇 多如牛毛
丰富多彩 五花八门 形形色色 比比皆是 不乏其人 俯拾皆是 举不胜举

拔苗助长 狐假虎威 亡羊补牢 精卫填海 坐井观天 掩耳盗铃 刻舟求剑
画饼充饥 叶公好龙 画地为牢 守株待兔 杀鸡儆猴 画蛇添足 愚公移山
对牛弹琴 盲人摸象

完璧归赵 程门立雪 三顾茅庐 四面楚歌 渑池之会 杞人忧天 负荆请罪
画龙点睛 纸上谈兵 南柯一梦 指鹿为马 卧薪尝胆 闻鸡起舞 十面埋伏



心口如一 一笔抹杀 杀一儆百 百家争鸣 鸣锣开道 道不拾遗 遗臭万年
年富力强 强词夺理 理直气壮 壮志凌云 云中白鹤 鹤发童颜 颜筋柳骨
骨肉相残 残兵败将 将信将疑 疑神疑鬼
斗志昂扬 扬眉吐气 气味相投 投机取巧 巧立名目 目送手挥 挥洒自如
如释重负 负荆请罪 罪魁祸首 首屈一指 指鹿为马 马到成功 功德圆满
大快人心 心旷神怡 怡然自得 得意忘形 形势逼人 人浮于事 事出有因
因小失大 大庭广众 众星捧月 月中折桂

一技之长 春风化雨 来日方长 沙里淘金 难能可贵 阳春白雪
声东击西 安居乐业 一步登天 津津有味

千言万语 文通字顺 百炼成钢 笔墨纸砚 不计其数 学有专长 博览群书
包罗万象 九州方圆 规行矩步

胆小如鼠 力大如牛 生龙活虎 守株待兔 叶公好龙 打草惊蛇 悬崖勒马
顺手牵羊 杀鸡儆猴 呆若木鸡 白云苍狗 行同狗彘(猪)

焦头烂额 另眼相看 嗤之以鼻 摇唇鼓舌 口蜜腹剑 铁石心肠 指手画脚
异香扑鼻 画龙点睛 唇齿相依 肝胆相照 别具匠心 一目了然 劈头盖脸
集腋成裘 洗心革面 扬眉吐气 三头六臂 瞠目结舌 千钧一发 一手遮天
卑躬屈膝 掩耳盗铃 了如指掌 摩肩接踵

多此一举 说一不二 三头六臂 四通八达 五湖四海 六神无主 七嘴八舌
八仙过海 九霄云外 十全十美

手舞足蹈 心灵手巧 手到擒来 手忙脚乱 得心应手 爱不释手 手足无措
心狠手辣 情同手足

虎头蛇尾 生龙活虎 虎口拔牙 虎口脱险 龙争虎斗 调虎离山 龙盘虎踞
羊入虎口 骑虎难下 放虎归山 龙潭虎穴 龙腾虎跃 降龙伏虎 虎背熊腰
养虎遗患 狼吞虎咽 谈虎色变 虎视眈眈 如狼似虎 虎口余生 为虎作伥

怡然自得 依然如故 恍然大悟 茫然若失 浑然一体 豁然开朗 安然无恙
焕然一新 庞然大物 蔚然成风 迥然不同 泰然处之 防患未然 怦然心动
哑然失笑 昭然若揭 斐然成章

万象更新 千军万马 对牛弹琴 亡羊补牢 鸡鸣狗盗 爱屋及乌 泥牛入海
蜻蜓点水 声名狼藉 狗急跳墙 黔驴技穷 鱼目混珠 叶公好龙 狗头军师
杯弓蛇影 如鱼得水 鹏程万里 骑虎难下 一箭双雕 狐群狗党 鸡毛蒜皮
惊弓之鸟 鹤发童颜 守株待兔 噤若寒蝉 狗屁不通 指鹿为马 画龙点睛
哀鸿遍野 顺手牵羊 井底之蛙 鹤立鸡群 抱头鼠窜 兔死狐悲 瓮中捉鳖
兵荒马乱 管中窥豹 螳臂挡车 蜂拥而至 蝇头微利 门可罗雀



One could use this website to get a complete list:


First, start with a list of all characters (these can be found online), and convert each one to its Big5 code.

For example, the Big5 code for 一 is A440. You then use this code to search the website:


Notice the %A4%40 in the URL. That is the Big5 code, percent-encoded byte by byte.

So now parse the website's results for each character and extract the 成语. Of course you shouldn't do this by hand; write a small script for it :-)
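The encoding step can be sketched in a few lines of Python. This only covers turning a character into its Big5 code and the percent-encoded form seen in the URL; the function names are made up for illustration, and fetching or parsing the site itself is not shown.

```python
from urllib.parse import quote


def big5_hex(ch: str) -> str:
    """Return the Big5 code of a character as uppercase hex."""
    return ch.encode("big5").hex().upper()


def big5_url_param(ch: str) -> str:
    """Percent-encode the Big5 bytes of a character for use in a URL."""
    # safe="" so that every non-alphanumeric byte is escaped
    return quote(ch, encoding="big5", safe="")


print(big5_hex("一"))        # A440
print(big5_url_param("一"))  # %A4%40
```

Characters that Big5 doesn't cover will raise a UnicodeEncodeError here, so a real script would probably want to catch that and skip them.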

Bert, these are a subset of the 48,000 chengyu, as I posted above:







So that website has 28,508 in total:

- 23,385 in the appendix, i.e. with only a short description

- 1,594 full in-depth descriptions, with variants covering 5,123 chengyu

Here's a list that has chengyu that appear both in the MOE dictionary and MOE chengyu dictionary, about 20,000 items: http://dict.idioms.moe.edu.tw/mandarin/fulu/dict/chengyu.htm

Well, as you say yourself, I find it a tad unrealistic that elementary students should know 800, if high schoolers need to know 1,000-1,500 as per the Chinese MOE. Though I haven't really been able to confirm that number, and would REALLY like to get my hands on that list.... :conf

I haven't cross-referenced the data run by phyrex yet, but since muyongshi wanted a simplified version, I thought I'd put this here: the MOE list, cross-referenced with my own chengyu list, with all chengyu in CEDICT and HANDEDICT.

NOTE: or rather, I was gonna post it, but it turns out the file is too large; even compressed it's 2.8 MB :tong

Oh right, I also wanted to ask you, phyrex: what do the numbers signify? Can I reproduce those numbers by hand when I enter a chengyu in google? I tried some searches, but the numbers returned were always different. I suppose you used the Chinese-language version of google?

Another problem I have found is with characters outside a certain character set, i.e. google chrome hasn't been able to display those (while Excel has):


(for those of you that see squares: this should be the character 金 radical + right part of 稻). If google interpreted those as blanks, then since there are a lot of combinations of the form X神Y鬼, naturally the google count was very high as well... But luckily, once you've cross-referenced the lists, this should go away, because characters that rare shouldn't be part of any chengyu common enough to be in the lists being cross-referenced against...

chrix, I mentioned that somewhere earlier, but through all that back and forth it all got a bit convoluted :)

Basically, there are two ways you can query google. The first way is to use the API (application programming interface) that google provides. This is the official and easy way.

Then, there is the second way, that is going to google.com 'manually', typing something in, and getting the 'estimated result count' from the upper righthand corner. This step *can* be automated, but is complicated, (therefore) very slow, and totally against the google terms of service.

Theoretically, both ways SHOULD yield the same results. But they don't. Nobody knows why and everybody is pissed off about it, but that's the situation. I don't know why the numbers differ, but they do. Considerably so. I don't know, however, if the relation between the numbers is more or less the same, so that the two lists, created by the different methods, should yield roughly the right order of the chengyu.
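One way to check whether the two methods at least agree on the ordering is a Spearman rank correlation over the chengyu both result lists share. A minimal sketch in Python, assuming each result list is a dict from chengyu to hit count; the counts below are made-up placeholders, not real google numbers, and ties are not handled.

```python
def ranks(counts: dict) -> dict:
    """Map each key to its rank (0 = highest count)."""
    order = sorted(counts, key=counts.get, reverse=True)
    return {key: i for i, key in enumerate(order)}


def spearman(a: dict, b: dict) -> float:
    """Spearman rank correlation over the keys present in both dicts."""
    keys = sorted(set(a) & set(b))
    n = len(keys)
    if n < 2:
        raise ValueError("need at least two shared entries")
    ra = ranks({k: a[k] for k in keys})
    rb = ranks({k: b[k] for k in keys})
    d2 = sum((ra[k] - rb[k]) ** 2 for k in keys)
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical counts, for illustration only
api_counts = {"画龙点睛": 300, "守株待兔": 200, "指鹿为马": 100}
web_counts = {"画龙点睛": 9000, "守株待兔": 5000, "指鹿为马": 1000}
print(spearman(api_counts, web_counts))
```

A value near 1 would mean the two methods rank the chengyu in nearly the same order even though the raw numbers differ; a value near 0 would mean the orderings are unrelated.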

I posted the first 100 chengyu of your file, run through each of the methods, a couple of pages back (in different posts). You can have a look at them and compare. I was too lazy to do that ;)

The numbers of the "google.com search", as I think I called it, are the same that you'd get by going to google.com (no matter the language setting) and searching manually.

As to the boxes where the characters should be: that might be my fault. Somewhere along the way a conversion between UTF-16 and UTF-8 happens. I can try to change it all back to UTF-16, or even try to work with Unicode throughout; then we MIGHT see the characters instead of the boxes. If we're gonna work some more on that, I'll give it a try.

OK, I understand. The important thing is that it equals a search with quotation marks, otherwise there will be results with all the characters in them but not in the right sequence :mrgreen:

I don't think it's necessary to compare it by hand, because google searches are by definition an approximation only. There are some theoretical issues with using google as a linguistic corpus, but in the absence of a million-word multi-genre corpus, it's the best we can get.

However, there's a list of 4000+ chengyu culled from a newspaper article corpus, discussed in some other thread. I have integrated the results into my database as well. And it has left me hundreds of new chengyu entries to complete manually :help

EDIT: Actually, 2160 :help :help :help

The API search (the one I used :) ) explicitly adds quotation marks, and the results are different from a search without them, which I take as a sign that they are working :)

If you can give me a list of only the new chengyu, I'll run that through the program, and then merge the result with the master result list. No need to take a couple of weeks off work to ask google for the results of 4000 chengyu by hand ;)

edit: 2160 still takes a loooong time to do manually ;)

Yes, I can give you the list of chengyu that do not appear in that 48,000 monster list. Only problem I have at the moment is that the 4000+ newspaper corpus is in simplified and the 48,000+ list is in traditional. But since these are extremely rare chengyu anyway we wouldn't need to worry too much about getting google numbers for those.
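Picking out the chengyu that are missing from the big list could be sketched roughly like this in Python. The character table here is a toy placeholder with three entries, just to show the idea; a real run would need a complete simplified-to-traditional table, and a character-by-character mapping is lossy because some simplified characters correspond to more than one traditional one.

```python
# Toy simplified-to-traditional table (hypothetical, far from complete)
S2T = {"画": "畫", "龙": "龍", "点": "點"}


def to_traditional(text: str, table: dict = S2T) -> str:
    """Convert character by character; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


def new_entries(newspaper_simplified: list, master_traditional: list) -> list:
    """Return newspaper chengyu whose traditional form is not in the master list."""
    master = set(master_traditional)
    return [c for c in newspaper_simplified if to_traditional(c) not in master]


# Illustration only: one entry is already in the master list, one is not
print(new_entries(["画龙点睛", "守株待兔"], ["畫龍點睛"]))
```

Keeping the original simplified form in the output means the newspaper frequency counts can still be looked up afterwards.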

But if there were a way to automatically extract information from a chengyu dictionary website to auto-complete entries (for the 2,160 chengyu I mentioned, right now I ONLY have the simplified characters and the frequency counts from the newspaper corpus, but no other info such as pinyin, meaning etc.), that would simplify matters a lot... Many dictionary sites rightly guard against automatic extraction, so of course I wouldn't wanna do anything that would upset the proprietors of sites so useful to the student of Chinese chengyu...

Well, of course you *could* use the dictionary sites automatically, which, I suppose, *would* upset their owners (if they noticed). But it's a lot of work, even ignoring that it's not "nice", so I think it might be easier (and less upsetting for everybody involved :) ) to send the list to the owners directly, and ask them for the information you need, and also how you can make this task easier for them (e.g. offering to write a short script they can run if they tell you how they're storing their data, or something like that).

PS: Are the chengyu so rare that they can't be found in open sources like, say, 现代汉语词典, CEDICT, Wikipedia, and the like?

