Chinese-Forums

How many characters?


WangYuHong

Hello all,

I have a couple of questions actually, but they all fall under the same general category.

Basically, how many characters do you need to recognize in order to read "stuff"?

Like, for the HSK. I saw the word list on this site, which has 1922 single characters in it. I didn't want to leech too much bandwidth, so I didn't play with the searching too much. Is that 1922 character sum the total amount required to read the advanced exam? (Not necessarily comprehend the exam, as that comes with knowing all the 2/3/4-character words that come from combining the characters, but just be able to read all the characters out loud.) In a different topic, someone posted the character totals required by the HSK as 1600, 2200, 2900 for Beginner, Intermediate, and Advanced, respectively. I'm just curious which one is more accurate, since there's quite a big difference between 1922 and 2900.

For reading newspapers, what's a good target to shoot for? Recognizing 2000 characters? 3000?

What about technical topics such as business, legal papers, or computer engineering? I understand there would be a lot of specialized words, but I don't know if it comes more from adding new characters, or from adding new character-combinations.

Also, I've come across several charts depicting the frequency of Chinese characters in large compilations of texts, which calculate the cumulative frequencies of characters: the 1,000 most frequent characters represent 89.14% of the characters in the texts, the 1,500 most frequent represent 94.55%, the 2,000 most frequent 97.12%, and the 3,000 most frequent 99.18% (and for the curious, it took about 9,933 to reach 100%). For general reading, what do you think is enough to get the gist of things and be able to fill in the blanks from context? 90%? 95%? 99%?
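As a rough sketch of what those percentages mean in practice, coverage can be turned into "one unknown character in every N". The coverage figures are the ones quoted above; the function name is only illustrative, and the result is an average over running text, not a guarantee for any particular article.

```python
# Cumulative coverage figures quoted in the post:
# (number of most frequent characters known, % of running text covered)
COVERAGE = [(1000, 89.14), (1500, 94.55), (2000, 97.12), (3000, 99.18)]


def unknown_gap(coverage_percent):
    """Average spacing between unknown characters at a given coverage.

    Hypothetical helper: 90% coverage means 10 unknown characters per 100,
    i.e. roughly one unknown character in every 10.
    """
    return 100.0 / (100.0 - coverage_percent)


for known, pct in COVERAGE:
    print(f"{known} characters: {pct}% coverage, "
          f"about 1 unknown in every {unknown_gap(pct):.0f}")
```

Even at 2,000 characters this still leaves, on average, an unknown character every few dozen characters, which is why the choice between 95% and 99% matters more than the numbers suggest.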

While I know that purely focusing on memorizing characters won't increase my comprehension, it's at least a building block for expanding my knowledge (multi-character vocabulary and grammar are the other two big blocks, as far as I can tell). I just don't know how far I should set my sights. My goal is to be able to read (and understand!) Chinese that you would encounter normally (newspapers, general notices, etc.), with an ultimate goal of being able to recognize all the technical stuff (business, computers, legal) as well.

For reading newspapers, what's a good target to shoot for? Recognizing 2000 characters? 3000?

This is a tricky question. I'm just past about 3000 and recognize 95% of the characters in newspapers. However, do I understand what I'm reading? A lot of the time NO. The problem isn't just how many characters, but their contextual usages. Different combinations of the same characters mean different things. That's my biggest hurdle. I'm better with some contexts than others.

For the characters that you don't recognize, you can make some really good educated guesses for their meanings based on their contexts and on what radicals are within those characters.

Chinese characters are comparable to Latin roots. In classical Chinese (a written language, not spoken), most words consisted of only a single character (to save writing space and time, and there's less problem with homophones when the characters are written). In modern Chinese, which is based on spoken Mandarin, most words are made up of two or more characters. Some characters can still be used as words in and of themselves, just like some Latin roots can be -- and sometimes multi-character words are abbreviated into a single character (comparable to acronyms in European languages) -- but to increase your vocabulary, you really need to learn words, in addition to characters. I've browsed through the HSK vocab list. I think you need to know most of the words on the HSK advanced list to be able to read newspapers comfortably.

Basically, how many characters do you need to recognize in order to read "stuff"?
(Not necessarily comprehend the exam, as that comes with knowing all the 2/3/4-character words that come from combining the characters, but just be able to read all the characters out-loud.)

I think there's also a distinction between being able to recognise characters and being able to read them out loud.

In my situation, I don't get much chance to practice speaking Chinese. Mostly I just read books, which for me at least, has the result that there are many characters which I recognise in terms of meaning, even though I don't remember how to pronounce them. (Does anyone else experience this too?) So I'm sure in principle you could get quite good at reading "stuff" even if you weren't able to read all the characters out loud.

there are many characters which I recognise in terms of meaning, even though I don't remember how to pronounce them. (Does anyone else experience this too?)

I'm definitely a novice with regard to my Chinese learning, but I have this both ways; sometimes I find I can remember the pronunciation of something I've learnt (probably when the phonetic component is obvious and easily remembered) but not its meaning, and other times I can remember the meaning but not the pronunciation.

90% seems a lot, but it means you don't know every tenth character. That's not enough for reading comprehension. They say 2000 characters are enough to read the newspaper, but that's only if you understand the words that these characters form, and that is usually where the problem starts for foreigners.

Most specialized words in the areas you mention are made by combining regular characters, or by using different meanings of a character.

I'd recommend concentrating on learning words; some will consist of characters you already know, some of new characters.

Let me just preface this by mentioning that I'm talking about learning to read, not necessarily to listen and speak. I understand those would take a different approach to learning.

Yeah, I know ultimately I'll need to focus on the words at some point...

I've tried it both ways in the beginning: learning words as a method of learning the underlying characters, and learning characters while learning the words that character combinations form.

For me, I've found that the books that focus on introducing words go by what words are most common, which seems good. Unfortunately, most of these words don't share common characters, so you're forced to learn a huge number of characters to keep up with the vocabulary.

Instead, I've found that John DeFrancis' horribly-outdated "Chinese Reader" series is much more suitable for me. He introduces new characters, and then expands your vocabulary by combining characters that you already know to form new words and uses. It still expands your vocabulary (although perhaps not in the most useful order), but for every 10 new words you only need to learn 1 new character, instead of 10 new characters.

So my original question was a bit of a curiosity, a goal to aim for. Unfortunately, the DeFrancis series only introduces 1200 characters, so I'll be on my own after that. If I'm at 600 characters right now, that's technically 80% "comprehension". But realistically, I'm missing 1 in every 5 characters, and those missing characters turn out to be pretty important for understanding the meaning.

Hopefully things come together at some point, where I can get a grasp of most things and be able to learn the rest on my own. Do you think it's possible to get the gist of things at 1200 characters? Or is it more towards the 2000 character end? (Or maybe even 3000? Eek!)

I think it's impossible to put a number on how many characters you need to know before you can understand what you're reading. Firstly, as people have already pointed out, number of characters isn't as important as number of words. But apart from that, it depends on what you are reading, and your ability to infer the missing bits of information.

I don't think number of characters is a particularly useful concept. It is OK if you are learning from a very structured course such as DeFrancis, when all the characters in a particular text will be ones that you have systematically learnt before. But in real practical situations, it isn't like that. Most real-life reading will also contain some rarely-used characters and words, some of which you will be able to guess from context, and others which you won't. But as you keep on practising, you will gradually expand your vocabulary, and little by little your reading comprehension will improve. However, your character repertoire will no longer be restricted by a text-book syllabus, and the number of characters you know will no longer be such a well-defined concept. There will be a continuum from the set of characters you know very well, to those you have a fuzzy recollection of in the back of your mind.

This is my opinion, borne out by my own personal experience. Frankly I have no idea really how many characters I 'know'. In terms of the number I have ever studied, probably about 5000. In terms of the number I could write out by hand and know the reading of, probably about 1000. And the number whose meaning I could recognise in context, well, I don't know, but I don't have too many problems reading ordinary everyday materials.

Instead, I've found that John DeFrancis' horribly-outdated "Chinese Reader" series is much more suitable for me. He introduces new characters, and then expands your vocabulary by combining characters that you already know to form new words and uses. It still expands your vocabulary (although perhaps not in the most useful order), but for every 10 new words you only need to learn 1 new character, instead of 10 new characters.

Huang and Stimson take that approach in their written series.

http://www.yale.edu/fep/catalog/standard2.html

Vol 1 has copyright 1980, and Vol 4 has a copyright of 1991, so the series was written and planned out before the Soviet Union fell apart, and before stuff really opened up in China. I am about 1/2 way through vol 4.

So my original question was a bit of a curiosity, a goal to aim for. Unfortunately, the DeFrancis series only introduces 1200 characters, so I'll be on my own after that. If I'm at 600 characters right now, that's technically 80% "comprehension". But realistically, I'm missing 1 in every 5 characters, and those missing characters turn out to be pretty important for understanding the meaning.

It gets noticeably better around 1000-1100.

What to do after that, when your beginner/intermediate series runs out? I think I know how it feels -- having relied on the structured essays which build vocabulary in a really smart and careful way. As I said, by 1200 or 1300 you should feel more confident. One thing you might want to do is look at the textbooks used at a selection of US university Chinese programs.

Fortunately there is a post on that:

http://www.chinese-forums.com/index.php?/topic/8091-texts-used-in-us-university-programs

Also, "A New Text for a Modern China" takes a similar approach. It is used by some programs as a 3rd-year reader, and it is more contemporary than Huang and Stimson.

http://www.cheng-tsui.com/sb_catalog-csm.asp?grp=PC++&item=++++++++0887273122

Bugger, I lost my post, browser crashed :evil:

---

New Practical Chinese Reader goes beyond the 1,200 most useful characters, and so does Integrated Chinese. I'll stick to textbooks until I build up enough characters in my memory.

I found a good resource for intermediate learners to read: the "Everyday Chinese" series, which has hanzi, pinyin, vocab and English translations.

The title of the series is not accurate, IMHO: they also have a volume in Classical Chinese.

They are excellent for building vocabulary when you're past beginners' level. Unfortunately, they are no longer in print.

I've got the books; a couple of them I got from our local Chinese bookshop, but they were the last ones!

This Russian site has some audio and Chinese texts of some books. The texts were typed in manually and I found some typos, though.

http://orientaler.com/china/wenxue/index.php

--

I just found one on German Amazon:

http://www.amazon.de/exec/obidos/ASIN/7800050254

  • 2 years later...

I might not be adding anything new here, but I would advise people to ease the learning process with online dictionaries and Chinese software programs.

I used to use the good ol' paper dictionary a lot. All that radical lookup (a very useful skill) became too tiring after a while. Furthermore, it didn't contain enough contemporary words for my liking. After switching to online resources, it has become a much more efficient process. Sites like zhongwen.com offer users an alternative way of looking up characters, using components rather than radicals (which can be hard to find for quite a number of characters). In short, it's the click and copy/cut and paste functions that make it so much less agonising. See a new word that I don't understand on BBC Chinese? Just copy and paste it into a dictionary and voilà!

Try to use the dictionaries on Chinese sites, though. English-based ones are still lagging behind, IMO. The only thing to watch out for is dodgy English translations; there are still plenty of them out there.

  • 2 weeks later...

Haven't posted here in a while...

Regarding the "words vs. characters" discussion, I think the best way is to learn a bunch of characters in isolation (I know this is controversial on this site) and then start to learn the words in context, via the sentence method (also very controversial here). The number of characters isn't really important, but what you're trying to do is get a solid base, where you recognize most or nearly all of the characters you come across when looking for sentences. You can always learn a new character here or there once you've already learned a couple thousand, but it sucks to have to learn a new character for nearly every new word you learn if you didn't put in the work up front to learn a bunch.

I'd say just pick a book or list that makes sense to you and learn the whole thing. Could be the Matthews' book, could be the (hopefully soon-to-be-released) Remembering the Hanzi, could be any of the several "characters by frequency lists" floating around the internet, or the McNaughton book. Just pick one and work your way through.

Then you can move on to sentences. For example, let's say I already have a pretty good knowledge of characters. Now I'm adding sentences. Mind you, I'm not just going around and picking any random sentence and learning it. I'm picking sentences that have just one or two unknowns in them. And most of the time, I actually go looking for a sentence that contains the word I'm learning and is otherwise intelligible (I understand all or most of the other words and the grammar construction). So say I'm learning "冰激淋," or ice cream. I go to this site and type 冰激淋 into the search engine, and select "Easy Sentences" (no need to make it more difficult). And I get:

Could I taste your ice cream?

我可以尝一尝你的冰激淋吗?

Wo3 ke3yi3 chang2 yi4 chang2 ni3 de bing1ji1lin2 ma?

Perfect! If I know all these words, great! But let's say I pick a sentence where I don't know all the words (it's good to have more than one sentence anyway). So I also pick this one:

I like watching television, listening to music, eating ice cream.

我喜欢看电视,听音乐,吃冰激淋。

Wo3 xi3huan1 kan4 dian4shi4, ting1 yin1yue4, chi1 bing1ji1lin2.

OK, so now maybe I don't know 音乐 (unlikely, since I'm a musician, but oh well :lol:). Obviously the translation tells me what it is, but I need to see it in more sentences to learn it. So I put it into the search engine and come up with:

Would you like to listen to music?

你想听音乐吗?

Ni3 xiang3 ting1 yin1yue4 ma?

Your music's too loud!

你的音乐声太大了!

Ni3 de yin1yue4 sheng1 tai4 da4 le!

He has a lot of music tapes.

他有很多音乐磁带。

Ta1 you3 hen3duo1 yin1yue4 ci2dai4.

And so on. Now I can learn (if I don't know them already) how to say "too loud" (声太大了) and tapes. Other stuff I may learn on that page are 天赋 (talent), 盘 (measure word for tapes), 流行音乐 (pop music), 摇滚音乐 (rock music), etc. And I can find a sentence or two for each one, which will most likely have something I don't know, which will lead me to new sentences...etc. Not to mention learning new grammar patterns, synonyms (冰淇淋), etc. You can see how this method can kind of put itself on auto-pilot for a while. You can also set it on the "Look for this word everywhere and give me everything" option instead of the "Easy sentences" option, and find even more stuff, like:

The radio announcer said that there would be a concert in the square tonight.

广播员说今晚在广场上将有一场音乐会。

Guang3bo1yuan2 shuo1 jin1wan3 zai4 guang3chang3 shang4 jiang1 you3 yi4 chang3 yin1yue4hui4.

The bar on the corner has great music and is always crowded.

街角那家酒吧的音乐很好,常常是门庭若市。

Jie1jiao3 na4 jia1 jiu3ba1 de yin1yue4 hen3 hao3, chang2chang2 shi4 men2ting2ruo4shi4.

Put in 20 sentences per day in this way for a few months and you'll be comfortably in what Khatzumoto calls the "Chrysalis Level," which is

"that magical intermediate phase, that strange stage where you understand maybe 75-80% of randomly picked authentic material, which is really good, but at the same time not yet enough to actually comprehend something new and raw in its entirety, since this still implies ignorance of every fourth or fifth word."

At this point, I'd say you could probably be really comfortable getting your sentences from authentic sources such as books (not learner's texts, but books intended for natives), movies, news sites, and TV shows, if you aren't already.

I think that something that is often misunderstood by people who don't like this method is that you aren't just going out and picking 10,000 random sentences. You pick each sentence with a purpose. If you already completely understand the sentence, what's the point? And on the other hand, if you only know 15% of the words in the sentence, it's too difficult for you to try now.

The fastest and most effective way to expand your knowledge is to pick something just slightly above your level. One or two new things in a sentence is all you need, and too much new stuff will just slow you down. It's that feeling of "almost got it" that makes you want to learn more and more, rather than seeing a long string of unknown words and grammar patterns that just makes you want to give up. If you need to, break long sentences into shorter, more manageable chunks (here's a secret: compound sentences are just 2 or more simple sentences combined).
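A minimal sketch of that "one or two unknowns" filter, assuming you keep a set of the characters you already know. It counts unknown characters rather than words, since splitting Chinese text into words needs a segmenter; the function names and the sample known set are hypothetical, and the sentences are the ones from the examples above.

```python
def unknown_chars(sentence, known):
    """Return the distinct Chinese characters in `sentence` not in `known`."""
    found = []
    for ch in sentence:
        # Only look at common CJK ideographs (U+4E00 to U+9FFF);
        # punctuation, digits and Latin letters are ignored.
        if "\u4e00" <= ch <= "\u9fff" and ch not in known and ch not in found:
            found.append(ch)
    return found


def pick_sentences(sentences, known, min_new=1, max_new=2):
    """Keep sentences with at least `min_new` and at most `max_new` unknowns."""
    return [s for s in sentences
            if min_new <= len(unknown_chars(s, known)) <= max_new]


# Hypothetical learner: these are the characters already known.
known = set("我可以尝一你的吗喜欢看电视听吃冰激淋想")

candidates = [
    "我可以尝一尝你的冰激淋吗?",            # nothing new: no point adding it
    "你想听音乐吗?",                        # two new characters: just right
    "广播员说今晚在广场上将有一场音乐会。",  # far too much new material
]

for sentence in pick_sentences(candidates, known):
    print(sentence, unknown_chars(sentence, known))
```

Raising `max_new` lets harder sentences through, and setting `min_new=0` keeps fully known sentences for review; the defaults just mirror the "one or two new things" rule of thumb.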

Hopefully this helped. Khatzumoto explains all this better than I have, so I'd recommend reading through his site if you haven't already done so. Have fun!

Regarding the "words vs. characters" discussion, I think the best way is to learn a bunch of characters in isolation (I know this is controversial on this site) and then start to learn the words in context,

Out of curiosity, how many characters would you say you currently know? I find it interesting to watch how people's opinion on this changes over time, and as the number of characters they know increases.

Out of curiosity, how many characters would you say you currently know? I find it interesting to watch how people's opinion on this changes over time, and as the number of characters they know increases.

I'm a bit sceptical about approaches which wait with everything until they've "finished" learning characters, often without context or even pronunciation. I find that learning words in parallel actually helps you learn characters faster.

But when it comes to brute-forcing thousands of characters, I'm a die-hard proponent and swear by it, as long as it is accompanied by a healthy mix of other learning inputs. In my experience, the only serious way for serious learners to learn characters is to sit down and study characters intensively, the earlier the better. The rest is just details.

And yes, this is the frustration speaking, after trying everything else except hard work for many years and seeing that it doesn't work :mrgreen:

I'm a bit sceptical about approaches which wait with everything until they've "finished" learning characters, often without context or even pronunciation. I find that learning words in parallel actually helps you learn characters faster.

I agree there. I guess my post sounds that way though. I'm adding sentences to my flashcard database now. Most of the examples from my post are actually in my database. I just use sentences that contain characters I know. If there's one or two unknown characters, that's fine too, I'll just put them in my flashcard program and learn them.
