How Do You Find Out How Many Characters You Can Actually Read?


Shi Tong

Hello JonBI.

I'm lucky in that I already speak reasonable Chinese. This means I can quite easily link up the two characters of a word when they're zi I know from two other places:

EX: I learned to write 性别 the other day. I know the first and the second character from two different phrases I already know, so it's easy to remember: the 别 from 别的 and the 性 from 性感.

Either way, I've kind of got over it. I'm up to 420 "new" characters, and I'm sure I can write and know the meaning of around 200 more simple ones. I'm just going to keep learning words (with two characters), and at some point I will get to know what I know!


Oh no doubt, but I was pointing to more obscure ones. What level did you test on? My results were totally out of whack: I got 585 on my first test, over 1000 on the next one, then 1300 on the next one, then back to 700 on the next. Personally, I would place myself around 1200-1300 at a comfortable level, but even then... I don't study characters at the moment, I study combinations. That is perhaps a bad idea, but it is essentially the method of choice for HSK-based courseware up until an advanced level, where the ratio of words to characters hits 4 to 1, whereas before it is closer to 1 to 1 or 2 to 1. What that means is that when you are faced with the second character of a verb, coming up with the right answer is difficult. Often the same goes for the second character of a noun.

So of course, this is perhaps my fault, which is why I am now working to get a better sense of individual characters (though one does not have all the time in the world, especially when shackled by a university curriculum). But from my understanding of Chinese now, words, especially nouns, tend to want two characters or more. It is easier for someone brought up with the language, or with more exposure, to get a better sense of individual characters, but I doubt that many people pick out each meaning from each character when they read, at least when reading vernacular Chinese.


Well, I study in two ways- I write combinations of characters when I am unused to both of the Han Zi- so I turn to two word phrases in these cases. I also study single characters when they're very commonly used. Generally I know where they're supposed to go. I do this by writing about 200 characters a day and studying between 5 and 8 new characters to add to the list.

I also have a useful index of just Hanzi in the backs of my books, so I purposefully go through this list once every couple of weeks to make sure that I'm combining the correct zi to make up new shenci (2 "word" "words" as it were). I find this a really useful way of learning because you get every character trained, and then you can combine them and make sure your combinations work.

Either way, thanks for your advice, and yes, I agree that the test above is really poor. If I already know that I can definitely write and read 400 characters, coming up with a result around 200 is ridiculous. I could probably retake it and, like you, get anything ranging from 200 to 1000.


  • 2 weeks later...

I did the print out X amount of 常用汉字 and check method, and came to 1091/2700 (being rather strict with myself).

This is after two years of university level studies including 6 months at Beida. I was humbled by going through this list, and the coming year I'll be focusing on learning new characters and reading.

That said, I probably know a lot of words that use those 1091 characters (or so I hope :D)


A 3000 character list is useful, and I have gone through that kind of thing before, but it is very arduous and I don't have all that much time.

I think actually that there really isn't going to be an easy way of doing this, so I'll just have to spend the time doing it! :D

Pick characters 1, 11, 21, ... of the 3000-character list. Count how many of those 300 selected characters you know and multiply the total by 10. That way you get an estimate of the number of characters you know, without spending too much time.

Of course, the precision of the estimate will increase if you increase the number of characters tested, but you'll need more time.
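
The sampling idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `estimate_known`, `chars` and `known` are made-up names, the character list is a stand-in rather than a real frequency list, and the error figure treats the sample as if it were drawn at random, which is only a rough approximation for an every-nth-entry sample.

```python
import math

def estimate_known(char_list, known, step=10):
    """Estimate how many entries of char_list are known by testing every step-th one."""
    sample = char_list[::step]                 # entries 1, 1+step, 1+2*step, ... (1-based)
    hits = sum(1 for ch in sample if ch in known)
    share = hits / len(sample)                 # fraction of the sample you knew
    estimate = share * len(char_list)          # scale up to the whole list
    # Rough standard error, treating the sample as if it were drawn at random
    std_err = math.sqrt(share * (1 - share) / len(sample)) * len(char_list)
    return estimate, std_err

# Stand-in data: 3000 consecutive CJK code points, NOT a real frequency list.
chars = [chr(0x4E00 + i) for i in range(3000)]
known = set(chars[:1200])                      # pretend the first 1200 are known

estimate, std_err = estimate_known(chars, known, step=10)
print(round(estimate), round(std_err))
```

In real use, `known` would come from checking yourself against each sampled character. A smaller `step` tests more characters and shrinks the error figure, at the cost of more time.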


You memorize it by practicing reading out loud and writing, etc... Might or might not be 死记 or 硬背. It could also be a combination of other things plus 死记 and/or 硬背 or not. Then you use a trusted official list of characters and compare what you know against this list. Or you could add more vocabulary first then check what you've remembered and know against the list.

Why do you really want to know? You're much better off just focusing on what you can read rather than trying to quantify things so abstractly. Knowing random characters out of context doesn't help with understanding the proper use of bigrams, trigrams, etc. which is where most of the work in vocabulary building comes in anyway.

You are wrong. I guess you don't know the meaning of a "trigram" in Chinese culture then? A "trigram" is one of the 8 bagua signs made up of three bars [yin or yang] each. I don't know what you mean by "bigram"?

Ba gua

I did the print out X amount of 常用汉字 and check method, and came to 1091/2700 (being rather strict with myself).

I don't think that's enough. How about checking it against the list of 4500 or more Kanji that Japanese people learn, the list of Hanja that Koreans learn, and the list of Han Tu from which Vietnamese people used to learn Chinese characters [if there's such a list]?


Sigh. Could you start checking better before you correct people?

From Wikipedia: "An n-gram is a subsequence of n items from a given sequence. The items in question can be phonemes, syllables, letters, words or base pairs according to the application."

A three-character word, which is pretty obviously what the poster meant, could easily be considered a trigram based on this definition. While it is an unusual term, especially given the other meaning of trigram, it certainly isn't wrong, and even if it were wrong, it wouldn't paint you in a good light to mention it.
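
To make the n-gram definition concrete, here is a minimal Python sketch that slides a window of n characters over a string. The function name `ngrams` is just an illustrative choice, and the sample string is the 常用汉字 mentioned earlier in the thread.

```python
def ngrams(text, n):
    """Return every run of n consecutive characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(ngrams("常用汉字", 2))  # bigrams:  ['常用', '用汉', '汉字']
print(ngrams("常用汉字", 3))  # trigrams: ['常用汉', '用汉字']
```

Note that not every bigram is a real word (用汉 is not), which is why knowing characters one by one is not the same as knowing the combinations.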


  • 3 weeks later...

Be careful though, the list on xiaoma cidian is NOT the list of all characters per HSK level, but all single-character WORDS of a certain HSK level. There are other words at the same level which have characters that are not listed there. Many common characters which only appear in compounds are not listed.

So I'd suggest the frequency lists, which are available at the same location.
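
The difference between single-character words and all characters used at a level can be illustrated with a small Python sketch. The word list below is a made-up toy example, not an actual HSK list, and `characters_by_role` is a hypothetical helper name, not something Xiaoma Cidian provides.

```python
def characters_by_role(words):
    """Split the characters of a word list into single-character words
    and characters that only ever appear inside compounds."""
    single = {w for w in words if len(w) == 1}
    all_chars = {ch for w in words for ch in w}
    return single, all_chars - single

# Toy word list for illustration only (not a real HSK level).
words = ["人", "好", "朋友", "学习", "好人"]

single, compound_only = characters_by_role(words)
print(sorted(single))          # characters that are words on their own
print(sorted(compound_only))   # characters missed by a single-character-word list
```

A list built only from `single` would miss 朋, 友, 学 and 习 here, even though they are needed to read the words at that level.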

Hmmm... Listing *all* characters per HSK level might be a good addition to Xiaoma Cidian. I'm going to work on it. Stay tuned.


小马词典 Xiaoma Cidian now lists all characters used in each HSK level, as single-character words or as part of compounds.

The word lists have been updated with the option to filter by single-character or compound words.

Some example links:

Do not hesitate to report errors or propose improvements!


@renzhe and @chinopinyin, how is "little horse" better / different than, say, MDBG, since they are both based on CC-CEDICT?

@chinopinyin, sorry for being dense, but I can't find the wordlists on the link you provided, only a pdf with the test and an mp3 with the 听力 section.


Oh, that's what you meant. It will take a lot of work to create something useful (e.g. a list to place into a flash card program) out of that. Not only is it unformatted text, it is a pdf of a scan, so one can't even cut-and-paste.


@renzhe and @chinopinyin, how is "little horse" better / different than, say, MDBG, since they are both based on CC-CEDICT?

The main difference is that it is focused on characters. What I like about it:

1) Radical index -- great for looking up characters you don't know

2) It seems to have a more complete list of variants (this includes the traditional forms, as it's a simplified dictionary)

3) It shows each character in a nice, huge font, which is nicer than the small default at MDBG.

Having said that, I like MDBG, and it is superior when it comes to looking for words and wildcard searches.


Oh, look! MDBG has a radical index! :D I didn't even know.

Well, xiaoma has descriptions for radicals, and it has a separate list of difficult characters.

Other than that, they're about the same, really.


  • 5 months later...

Here's a list of Chinese characters by frequency. You can check yourself against the list, as Renzhe suggested:

http://lingua.mtsu.e...st.php?Which=MO

Thanks for this.

Does anyone know what "Cumulative frequency in percentile" means? I'd like to be able to judge how obscure some of those characters really are.


I guess it means the combined frequency of that character and all those before it in the corpus. So:

152 相 269125 50.0550148783

means that if you know 相 and the 151 characters before it, you can read half (50.055...%) of all the character occurrences in the corpus. Does that make sense?
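
For anyone who wants to see how such a column is built, here is a minimal Python sketch. It assumes you have the raw occurrence counts sorted from most to least frequent; the counts below are invented round numbers for illustration, not figures from the list linked above, and both function names are hypothetical.

```python
from itertools import accumulate

def cumulative_percent(counts):
    """Running share of the corpus covered by the top-ranked characters, in percent."""
    total = sum(counts)
    return [100 * running / total for running in accumulate(counts)]

def chars_needed(counts, target_percent):
    """Smallest rank whose cumulative coverage reaches target_percent, or None."""
    for rank, pct in enumerate(cumulative_percent(counts), start=1):
        if pct >= target_percent:
            return rank
    return None

# Invented counts for a four-character "corpus", most frequent first.
counts = [50, 30, 15, 5]

print(cumulative_percent(counts))   # [50.0, 80.0, 95.0, 100.0]
print(chars_needed(counts, 90))     # 3
```

Read this way, a character far down the list with a cumulative figure close to 100 adds very little coverage, which is one way to judge how obscure it is.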

