
Reading Milestones


黄有光


On 9/9/2021 at 12:46 AM, phills said:

三体 3 - I liked the first 2, but heard the 3rd is more like short stories rather than a single plotline.  Curious if @Woodford would recommend moving this higher up my pipeline?

 

I can definitely see how someone could come to that conclusion! The book has a lot of plot lines in it. However, they don't really form independent short stories, but they're all related in some way. I remember the third book being extremely huge and ambitious. The scope of the story expands in a massive way. In terms of reading difficulty, I think you would find it to be easier than the second book (although it definitely is longer in terms of page count).

 

On 9/9/2021 at 12:46 AM, phills said:

黄金时代 - I started this a while ago, but I thought it was too hard back then, so it's one to revisit.

 

This one was my 7th book (so I wasn't an absolute beginner when I read it), and I heard rumors that the vocabulary and language were straightforward and simple. So I decided to read it for myself (my edition had a total of three short stories in it, which included 黄金时代). I'd have to say that it's one of the most difficult books I've read so far. I barely understood certain parts of it, and I had to consult a dictionary several times on each page. You were probably wise to put it down and wait to come back later. If I had done that with some of the books I read, then I would have likely enjoyed them and profited from them more.


Man, I am so excited about 三体.  I was gifted an English translation many years ago when I was first starting with my studies. I read about 2/3 of the book before I got distracted by life -- it was quite good, I recall, although definitely esoteric.

 

Anyway, I'm excited because after more than half a year of obsessive vocabulary cramming, 三体 is almost within my grasp. 1,500 words is what I consider the upper range of what I'm willing to tackle, and it is currently sitting almost at an even 3000 生词 (it started the year somewhere above 6000).

 

I expect I'll be reading 三体 by this time next year.

 

Other books which are set to come into range soon include the entire Harry Potter series, Howl's Moving Castle, Ender's Game, and a wide range of other books. And of course, some good native literature as well such as 猫城记 and 九州...and MAN. I am SO STOKED for 九州。

[Attached chart: 小说生词有多少.png]


On 9/9/2021 at 6:37 PM, 黄有光 said:

Anyway, I'm excited because after more than half a year of obsessive vocabulary cramming, 三体 is almost within my grasp. 1,500 words is what I consider the upper range of what I'm willing to tackle, and it is currently sitting almost at an even 3000 生词 (it started the year somewhere above 6000).

Where do you get these numbers?


On 9/9/2021 at 10:37 AM, 黄有光 said:

Other books which are set to come into range soon include the entire Harry Potter series, Howl's Moving Castle, Ender's Game, and a wide range of other books.


That is one impressive looking chart! I'm guessing that the lines that sharply fall down towards zero are the books you've read (or are currently reading), and all the other books slowly go down in value because of what you've acquired from the other books. It makes me wonder if reading a book in a particular genre causes a steeper decline in other books in that genre (or written by the same author).

And oh my, 脑髓地狱 looks like an absolute monster of a book!

I have several charts I've been working on since January 2019 (almost 3 years now), and when my growth enters a definite plateau (in likely about another year of constant reading), I'll probably share those charts. It's always encouraging for beginners who ask, "What will progress look like? How long will it take for me to achieve a goal of X, Y, or Z?" It's also cool for advanced learners who compare each other's experience and "war stories."


On 9/9/2021 at 6:53 PM, alantin said:

Where do you get these numbers?

I use Chinese Text Analyser.
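
For anyone curious what a 生词 count like this boils down to, here is a minimal sketch. It is not how Chinese Text Analyser works internally; it assumes the text has already been split into words and that you keep your known vocabulary as a plain set. The function name and the sample data are made up for illustration.

```python
def count_unknown_words(tokens, known_words):
    """Return the unique tokens not in known_words, in order of first appearance."""
    seen = set()
    unknown = []
    for tok in tokens:
        if tok not in known_words and tok not in seen:
            seen.add(tok)
            unknown.append(tok)
    return unknown


# Hypothetical, already-segmented sample text and known-word list.
tokens = ["我", "喜欢", "章鱼", "和", "章鱼", "小说"]
known = {"我", "喜欢", "和"}

unknown = count_unknown_words(tokens, known)
print(len(unknown), unknown)
```

Repeated words count once, which is why a book's number drops every time a single new word is learned.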

 

On 9/9/2021 at 7:19 PM, Woodford said:

And oh my, 脑髓地狱 looks like an absolute monster of a book!

I have several charts I've been working on since January 2019 (almost 3 years now), and when my growth enters a definite plateau (in likely about another year of constant reading), I'll probably share those charts. It's always encouraging for beginners who ask, "What will progress look like? How long will it take for me to achieve a goal of X, Y, or Z?" It's also cool for advanced learners who compare each other's experience and "war stories."

Yesssss!  "War stories" are indeed very fun to share. I would love to see what you've got in that chart now if you're up for it.

 

脑髓地狱 is more than 500,000 words long. When the book falls to 1000 生词 I will be a VERY happy camper.


This graphic represents my experience since September 2019 (June and July 2021 are missing, because I spent that time watching Chinese YouTube videos instead of reading). What's recorded is the average number of words I had to look up in the dictionary per page, for each book. My current book, Sanmao's "Stories of the Sahara," will probably sink beneath the 1-word-per-page mark (barely), but I'm not finished with it yet. Basically, my current experience is that most books rarely go over 2 words a page or below 1 word a page. I'm squarely in that "1-2" zone.

 

Edit: You'll likely have to click on it to magnify it.

[Image: chart of average dictionary lookups per page, by book]


And, just for kicks, below are charts for my character recognition (every month, since January 2019), followed by the total number of SRS flashcards beyond HSK6, since September 2019, when I started reading books (so my actual flashcard library is 5000 words larger than what this chart shows):

 

[Image: character recognition chart]

 

[Image: SRS flashcard count chart]

 

 

Character recognition is plateauing, I think, but word acquisition (the second chart) really hasn't slowed down yet. I assume it will soon.


On 9/9/2021 at 8:28 PM, Woodford said:

below are charts for my character recognition

What exactly do you mean by character recognition and how are you quantifying it?

 

Vocabulary acquisition is so tricky to quantify because it is easy to know roughly how much vocabulary you are absorbing but nigh impossible to figure out how much is fading from your memory. I am curious why you believe vocabulary acquisition will level off?  My calculations suggest that a passive vocabulary in excess of 40,000 words is necessary for flawless comprehension of an average novel aimed at adults -- meaning each word is understood on its own terms, no context needed.

 

It is interesting that we seem to have similarly-sized vocabularies, but very different approaches to vocab acquisition. Hey, man -- the race is on!


On 9/9/2021 at 9:46 PM, 黄有光 said:

What exactly do you mean by character recognition and how are you quantifying it?


I began reading books last December by pasting the text into a Word document. Then, while reading, I changed the font for every character I couldn't read to one that included the pinyin on top of the character. After each chapter I used search and replace to leave only the readable characters, copied them to an Excel sheet, and ran "remove duplicates" to arrive at a list of unique readable characters. I also did readable character counts over the whole book, as well as character frequencies and the percentage of readable characters in the book or chapter.

I did that for a couple of months, and it gave me a thrill to see my readable-character percentage go up from somewhere around 80-85% to about 90-95% quite quickly. But I got tired of doing it and haven't kept it up. Maybe I'll pick it up again at some point and compare against my earlier results.
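
The Word-and-Excel routine above could be approximated in a few lines of Python. This is only a sketch under some assumptions: the set of unreadable characters is maintained by hand, only the basic CJK block (U+4E00 to U+9FFF) is counted as "characters", and the function name and sample text are invented for the example.

```python
def readable_stats(text, unknown_chars):
    """Return (sorted unique readable characters, percent of readable characters)."""
    # Keep only CJK ideographs from the basic block; skip punctuation, digits, Latin.
    chars = [c for c in text if "\u4e00" <= c <= "\u9fff"]
    readable = [c for c in chars if c not in unknown_chars]
    unique_readable = sorted(set(readable))  # the "remove duplicates" step
    pct = 100.0 * len(readable) / len(chars) if chars else 0.0
    return unique_readable, pct


# Hypothetical chapter text and hand-marked unreadable characters.
chapter = "我喜欢章鱼，章鱼很好。"
cannot_read = {"章", "鱼"}

unique_readable, pct = readable_stats(chapter, cannot_read)
print(len(unique_readable), round(pct, 1))
```

Running it per chapter and again over the whole book would give both the unique-character list and the percentages described above.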


On 9/9/2021 at 1:46 PM, 黄有光 said:

What exactly do you mean by character recognition and how are you quantifying it?

 

I've used a combination of two online tests, and I take an average of them. Then each data point is a 3-month average of that average (to make the line smoother). Most tests give me very similar results, so I'm assured of the accuracy. One time, I dumped my entire Pleco database into an Excel sheet and counted all the unique characters, and it also (roughly) matched what the tests were saying. I'm testing at around 4500 characters, but I think I'm actually at around 5000 by now.
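
As a rough illustration of that smoothing: average the two test scores for each month, then average the last three monthly values. The sketch below assumes a trailing window (so the first two months use fewer points), which may not match exactly how the chart was built, and the scores are invented.

```python
def smoothed_scores(test_a, test_b, window=3):
    """Average two tests per month, then apply a trailing moving average."""
    monthly = [(a + b) / 2 for a, b in zip(test_a, test_b)]
    smoothed = []
    for i in range(len(monthly)):
        # Trailing window: this month plus up to (window - 1) earlier months.
        chunk = monthly[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up monthly results from two hypothetical character tests.
test_a = [4000, 4200, 4400, 4600]
test_b = [4200, 4400, 4600, 4800]

print(smoothed_scores(test_a, test_b))
```

With these numbers the monthly averages are 4100, 4300, 4500 and 4700, and the smoothed line rises more gently than either raw test.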

 

On 9/9/2021 at 1:46 PM, 黄有光 said:

Vocabulary acquisition is so tricky to quantify because it is easy to know roughly how much vocabulary you are absorbing but nigh impossible to figure out how much is fading from your memory. I am curious why you believe vocabulary acquisition will level off?

 

I've participated in some discussions on this site as to what exactly constitutes "knowing a word" (and you've likely seen that discussion), so now I'm careful to say, "This isn't my total vocabulary size, but just the number of SRS flashcards I have." If those numbers are strictly measuring a particular figure, it's precisely that one--my Pleco SRS flashcard database size. I don't add every new word to my flashcards, especially when I can guess what it means via context, it's the name of a person or place, etc. So it probably falls far short of my actual vocabulary size (and I have no idea what that number would be).

 

I actually don't use CTA to keep track of my vocabulary, but I did plug in the books I've already read, and it estimated my vocab size as a little over 30,000 words, if I remember correctly.

 

As far as fading from memory is concerned, that will totally happen! But I religiously maintain my SRS review for all 18,000 flashcards I have. So any words I'm forgetting will get brought up for review until I acquire them again. I think things will plateau, because my number of unknown words per book is shrinking. I do admit that that plateau won't be a perfect plateau. It's impossible to learn every word in a language. I'm a native speaker of English, but I'll never know all the English words.


@Woodford Ah, that explains it then. Your vocabulary is more than double mine, it would seem. No wonder you are reading books at a much higher level than me!

 

I agree counting one's total vocabulary is a fruitless task, but in theory we can at least set general guidelines and arrive at an approximation based on those. The boundaries are fuzzy, but not nonexistent. For example it would be ludicrous to say that my vocabulary is <5000 words at this point. And it would be similarly WTF to claim that I knew 100,000 words.

 

In my case, I define "knowing" a word as "being able to understand it in context". Which is a pretty low bar, I'll admit -- but currently the only one that matters to me. I am crunching passive vocabulary to get to a point where I can really engage with native media. The reason being that in my experience with German, once you can do that your familiarity with the language will explode. Once I can do that, I'll probably reevaluate what I consider acceptable vis-a-vis "knowing" a word.


On 9/9/2021 at 2:39 PM, 黄有光 said:

In my case, I define "knowing" a word as "being able to understand it in context". Which is a pretty low bar, I'll admit -- but currently the only one that matters to me.

 

Yeah, I agree with you, and I think some of these discussions can get a bit too philosophical for my taste! If I see the word 章鱼, and think "octopus," then I know it (at least well enough for my present purpose of developing basic literacy). I know that some vocabulary is more abstract than that, but dictionaries generally do a good job of listing the various possible definitions, and at least one of those definitions makes a lot of sense in the context where I find the word. Admittedly, there's a very small handful of words whose dictionary entries leave me thinking, "Huh? What???" And virtually the only way for me to learn them is to keep encountering them in books and see how they're used.


Yeah, that is one of the principal ways I triage vocabulary. In my quest to learn every word in a given book, one of the rare cases where I will skip a word is if I look it up and can't make heads or tails of its supposed meaning. Or even if understanding its meaning becomes laborious, I'll triage it.


The war stories are very interesting.  You 2 have such different approaches to getting up the reading curve than I do.

 

I haven't tried to learn every word in a book since my first one, 活着.  I decided many words are just not common enough to bother trying to memorize.  As bootstrapping, I drilled on the HSK 6 vocab list, and also learned vocab as part of learning characters.  Each time I look up a character, I try to associate 2 or 3 common words that include the character, so I know how it would be used. 

 

I'm a big fan of frequency lists, so I try to note the frequency of words & phrases that I learn so I know if it's something that's supposed to be common or rare.  I rely on a principle from linguistics that common words are irregular (so have to be memorized), but rare words are often regular / follow a formula (so can be guessed).

 

I do keep track of exactly how many characters I know though, and I drill them semi-regularly.  I'm up to 4141 right now.  I only add new characters if they're seen multiple times in a book + they're below a certain threshold on one of the frequency lists.
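
That selection rule can be sketched roughly as follows. The assumptions here are mine: "seen multiple times" is taken as at least two occurrences, "below a threshold" is read as a frequency rank no higher than some cutoff (lower rank meaning more common), and the ranks in the example are invented rather than taken from any real frequency list.

```python
from collections import Counter


def candidate_characters(book_text, known_chars, freq_rank, min_count=2, max_rank=5000):
    """Pick unknown characters that repeat in the book and are common enough to learn."""
    # Count only CJK ideographs from the basic block.
    counts = Counter(c for c in book_text if "\u4e00" <= c <= "\u9fff")
    return [
        c
        for c, n in counts.most_common()
        if c not in known_chars
        and n >= min_count
        # Characters missing from the list are treated as rarer than the cutoff.
        and freq_rank.get(c, max_rank + 1) <= max_rank
    ]


# Hypothetical book snippet, known characters, and made-up frequency ranks.
book = "章鱼章鱼蟑螂我我"
known = {"我"}
ranks = {"章": 1500, "鱼": 900, "蟑": 5200, "螂": 5300}

print(candidate_characters(book, known, ranks))
```

Both `min_count` and `max_rank` are knobs to tune; raising the cutoff lets rarer characters through, lowering it keeps the drill list short.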

 

I also familiarized myself with the PSC vocab list (13k list for Chinese broadcasters, which I sorted based on frequency) & the top 1000 chengyu phrases (sorted by frequency), although I didn't drill them.  I don't necessarily remember them all, but just knowing certain characters are meant to go together helps me, plus the chengyu are often stories so if you've heard of it, you can guess the meaning a lot better.


When I read a novel, I usually note down words or chengyu phrases that seem interesting to me.  Usually something I've seen multiple times, something that I can't guess from the meaning of the individual parts, or something that seems to be elegant.  I seem to get about 200 from each novel (except Dark Forest had more due to tech terms).  It might even be a word I already "know" but I just like the way they use it.

 

I keep separate files for each author, because I find authors tend to re-use the same words elegantly over and over again.  It's their literary "style".  As I learn it, I find I read the second half of books much faster than the first half.


One thing I've gotten much better at is tolerating ambiguities in reading.  I remember seeing this passage representing 95% recognition, before I started:

 

https://www.hackingchinese.com/introduction-extensive-reading-chinese-learners/

 

Quote

In the morning, you start again. You shower, get dressed, and walk pocklent. You move slowly, half- awake. Then, suddenly, you stop. Something is different. The streets are fossit. Really fossit. There are no people. No cars. Nothing. “Where is dowargle?” you ask yourself. Suddenly, there is a loud quapen—a police car. It speeds by and almost hits you. It crashes into a store across the street! Then, another police car farfoofles. The police officer sees you. “Off the street!” he shouts. “Go home, lock your door!” “What? Why?” you shout back. But it’s too late. He is gone.

 

I thought it was gibberish then.  Now I read it and think:

 

Really Fossit!   No Dowargle at all! 

 

A few Farfoofles won't hurt me.


On 9/9/2021 at 10:31 PM, 黄有光 said:

That is a lot of work that Chinese Text Analyzer would automate for you

 

Sadly it only gives me the total and unique character counts but does not break it down to known and unknown characters.

Its statistics do a better job with vocabulary though.

EDIT: Thanks for bringing the Text Analyzer up! I had tried it at some point, but one big part of my learning process is "toggling" pinyin on for unknown characters and then toggling it off while reading when I feel I don't need it anymore for a given character. There is an obvious problem here with the 多音字 but it couldn't be helped until now. I just gave Text Analyzer another go and was able to create a Lua script that adds the pinyin after each unknown word. I can't toggle it on or off as easily as I'd hope, but it's close enough. :)


On 9/9/2021 at 3:31 PM, phills said:

A few Farfoofles won't hurt me.


That's a really fun example! I foresee that, someday, I'll have to be at peace with that kind of comprehension. Honestly, if I'm reading a complicated article or piece of literature in my native language of English, it's much like that (but maybe more like 99.9% comprehension, rather than 95%). Even English has its Farfoofles for me!

I can feel my vocabulary acquisition evolving. Firstly, I could comprehend the story without a dictionary, if I really wanted to. That wasn't true for my first several books. Also, I've learned to categorize and then ignore certain words. Do I see obscure words with the 花 character at the end? It's a kind of flower, usually. So I ignore it. I don't even have a definite knowledge of flowers in English (morning glory, rhododendron, chrysanthemum, etc.). Sure, there are much more familiar flowers (rose, daisy, dandelion, etc.), but odds are, those have already been covered at this point, and I have cards for them. Moreover, Asia has its own indigenous plant species, so a Chinese "rose" won't match my idea of "rose" in North America. I've just given up on studying all of these words via flashcards. A good guide is, "Do I not know what this word means in English? If not, then why worry about it in Chinese?" But the funny thing is that I've made some exceptions, and my English vocabulary has actually grown through my study of Chinese.

The same dynamic exists for different kinds of stones, animal subspecies, chemicals, etc. I'm feeling better and better about just passing them by without trying to memorize them.

The wonderful thing is that the more you grow in experience, the less Farfoofly the "Farfoofles" become. The context and characters will, more often than not, give me a good idea of what the farfoofle might mean. :)

Oh, yes--I dream of that time, perhaps soon, when brute-force vocab memorization will trickle down to just a few minutes a day.


Exactly!  After 活着, I knew more farming words in Chinese than in English.  Awl v a hoe v a rake?  I mean I kind of know the difference in English, but not really.  I doubt I'd be able to reliably separate the types if you presented me with a randomly chosen farm instrument. 

 

Plus one of my stretch goals is to try to read some of the 4 Classics.  A few days ago, I was happy to farfoofly my way through this chapter of the Water Margin, from the U Michigan text sampler

 

http://www-personal.umich.edu/~dporter/sampler/shuihu.html

 

Even 5 books ago (after 家), when I got the "hang" of reading, the stuff there stone-walled me.  And that's one of the easiest samples in there...

 

This time I made it through!  So I thought Wu Song killed a great worm (lizard/dragon?) by punching it to death... but apparently it was a tiger, and 虫 must include the meaning of a pest.  But I understood the gist of the story without having to look anything up, and got in a flow reading it.

 

I farfoofled my way through Shakespeare too.  I remember reading A Midsummer Night's Dream long ago, and I'm pretty sure I understood less than 95% of the words in there.  Just needed to awaken my "pert and nimble spirits of mirth" to get into that flow.
