Chinese-Forums

Simplified characters by frequency


fenlan


On http://fhpi.yingkou.net.cn/bbs/1951/61.htm and other pages on the same BBS, I found a listing of all 6763 characters in the GB set, organised by frequency! They are in the attached spreadsheet.

This list came with the following text:

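Chinese Character Frequency Table: processed from Tsinghua University statistics reposted by Mr. ChenShuyuan

I have processed the Tsinghua University statistics reposted by Mr. ChenShuyuan and publish the results below, for reference only.

Characters covered: 6763 (the GB national standard character set). Total characters in the sample texts: 86,405,823.

Plotting the data in the table above as a chart is illuminating; anyone interested may want to try it.

It is not known whether characters outside the GB character set were encountered during the count, whether specialised texts of various kinds were included in the sample, and so on.

Characters that combine readily into words tend to have higher frequencies; those that do not tend to have lower ones.

Lists of commonly used characters have been published several times in the past and have played a positive role. My estimate is that the common characters, the secondary common characters, and a small number of not-quite-rare characters are best kept to around 4000-5000 characters in total. Beyond that range, unfamiliar characters become markedly more numerous.

I suggest that authors, compilers, editors, and others who work on texts of any kind annotate any rare character they use with its pronunciation in Hanyu Pinyin, and explain it where necessary, so that readers do not have to spend time looking it up in a dictionary.

If word-frequency statistics also exist, please let me know. The Chinese vocabulary contains roughly 120,000 words.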
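
One way to check the 4000-5000 estimate against the spreadsheet is to count how many of the top-ranked characters are needed to reach a given cumulative share. Here is a minimal Python sketch, assuming the occurrence counts have already been read into a list sorted from most to least frequent. The name `chars_needed` and the toy numbers are illustrative only and are not taken from the original data.

```python
def chars_needed(counts_desc, target_per_10000):
    """Return how many top-ranked characters reach the target cumulative share.

    counts_desc      -- occurrence counts, most frequent character first
    target_per_10000 -- target cumulative share out of 10,000 (e.g. 9900 = 99%)
    """
    total = sum(counts_desc)
    running = 0
    for rank, count in enumerate(counts_desc, start=1):
        running += count
        # Integer comparison avoids floating-point rounding at the boundary.
        if running * 10000 >= target_per_10000 * total:
            return rank
    return len(counts_desc)


# Made-up toy counts, NOT values from the spreadsheet.
toy_counts = [50, 30, 15, 5]
print(chars_needed(toy_counts, 7500))  # 2
print(chars_needed(toy_counts, 9900))  # 4
```

Run against the real counts column, the same function would show how many characters are needed for any chosen share of the sample texts.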


1. <<出现次数>> (occurrence count): The number of times the individual character occurs in the roughly 86.4-million-character database.

2. <<累计字数>> (cumulative character count): The cumulative number of characters represented by that individual character and the ones before it in the list.

3. <<万分比>> (share per 10,000): The individual character's share of the total number of characters, expressed out of 10,000.

4. <<累计万分比>> (cumulative share per 10,000): The combined share of the total accounted for by that individual character and the ones before it in the list, expressed out of 10,000.
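
To make the four columns concrete, here is a small Python sketch that derives them from raw counts. It is only an illustration of the definitions above: the function name `frequency_columns` is made up, and the toy counts are invented rather than taken from the spreadsheet.

```python
from itertools import accumulate


def frequency_columns(counts, total=None):
    """Build rows of (char, count, cumulative count, per-10,000, cumulative per-10,000).

    counts -- list of (character, occurrence count) pairs
    total  -- size of the corpus; defaults to the sum of the counts
    """
    # Sort most frequent first, as the list in the spreadsheet is ordered.
    ordered = sorted(counts, key=lambda pair: pair[1], reverse=True)
    if total is None:
        total = sum(n for _, n in ordered)
    running = accumulate(n for _, n in ordered)
    rows = []
    for (char, n), cumulative in zip(ordered, running):
        rows.append((char, n, cumulative,
                     n * 10000 / total, cumulative * 10000 / total))
    return rows


# Invented toy counts, NOT values from the spreadsheet.
toy = [("一", 30), ("的", 50), ("是", 20)]
for row in frequency_columns(toy):
    print(row)
```

For the real data, `total` would be the 86,405,823 figure quoted in the first post, and the last row's cumulative share should come out at 10,000 if the counts add up to that total.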

