Chinese-forums.com
Learn Chinese in China

gkung

A complete guide to Computer-aided Chinese Reading

gkung

Read Chinese Without Knowing Chinese is the first publication to advocate the idea of reading text in a foreign language without knowing or learning the language.

Chinese written language is extremely difficult to learn. Its text does not consist of alphabetic letters, but is made up of characters built from strokes written in virtual rectangular boxes. One needs to memorize 3,000 characters and over 10,000 commonly used words to read newspapers. This takes native Chinese children 3 to 6 years to learn. For foreigners who do not use Chinese on a daily basis, the time needed is substantially longer.

This book takes another approach. It shows that using the latest translation and OCR (optical character recognition) technologies, with a few other tools and know-how, it is completely feasible for someone with no knowledge of Chinese to read and understand text written in Chinese. Using the techniques described, one can read not only electronic text from websites and emails, but text on printed materials such as magazines, books and newspapers, or even from posters and street signs.

This book provides detailed, step-by-step instructions anyone can follow in each stage of the process. It contains real-life examples demonstrating reading from magazines, newspapers and street signs. There is an accompanying website (www.GeorgeKung.com) providing updates, sample chapters, additional resources and other information.


liuzhou
It shows that using the latest translation and OCR (optical character recognition) technologies, with a few other tools and know-how, it is completely feasible for someone with no knowledge of Chinese to read and understand text written in Chinese.

Nonsense.

Computer translation is in its infancy and is highly unlikely to ever work. OCR might help you identify a character (but often not). However, that doesn't lead to understanding the meaning in context.

With translation technologies, at best, you might get the drift, but often it will be totally wrong.

Try translating any 成语 using a computer translation - you are almost guaranteed to fail to come up with anything meaningful.

Good luck with your advertisement.

Price: US $44.95

Haha!

After finishing this book, you can even help your Chinese friends read Chinese.

I am sure they will be very grateful!

renzhe

We have several Master-level students in our department writing theses on character recognition in general, and Chinese character recognition in particular. The focus is on written manuscripts, but some of the problems with distinguishing characters are apparent. State of the art software still makes very silly mistakes.

This basically looks like OCR + a translation engine like Google's, and both of those are very faulty. It's got some use for people who don't know about OCR and translation engines, but the statement that you can read Chinese without knowing Chinese is very exaggerated. You can read anything if you stick it into an online translator, yet it's not a very pleasant experience. You end up basically guessing what the main idea might be, which is far removed from actually understanding (reading) the language.

Some other very exaggerated, basically incorrect statements:

This takes native Chinese children 3 to 6 years to learn. For foreigners who do not use Chinese on a daily basis, the time needed is substantially longer.
There are two Chinese character sets—simplified and traditional. Most native Chinese can only read text from one of them.

gkung

"Computer translation is in its infancy and is highly unlikely to ever work."

This statement may have been true 5 years ago; however, the technology has improved greatly in the last few years.

"OCR might help you identify a character (but often not). However, that doesn't lead to understanding the meaning in context."

OCR is for extracting characters. Translation software is used to understand the meaning of the text.

"Try translating any 成语 using a computer translation - you are almost guaranteed to fail to come up with anything meaningful."

Actually "成语" are very easy for translation software to handle. They basically come down to adding the explanations to the dictionary. Many translation programs are very good at these.
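
The dictionary approach mentioned above can be sketched roughly as follows. This is only an illustration in Python, not how any particular translation program works: the `IDIOMS` glossary, its two entries, and the `gloss_idioms` function are all made up for the example. It scans the text with a greedy longest match and attaches the stored explanation whenever a known 成语 is found.

```python
# Hypothetical idiom glossary; the entries are illustrative only and are
# not taken from any real translation product.
IDIOMS = {
    "画蛇添足": "to spoil something by adding what is superfluous "
                "(literally: to draw legs on a snake)",
    "马马虎虎": "so-so; careless",
}


def gloss_idioms(text, idioms=IDIOMS, max_len=4):
    """Split text into (segment, gloss) pairs.

    Known idioms get their stored explanation as the gloss; everything
    else is passed through with a gloss of None, to be handled by the
    ordinary translation step.
    """
    out = []
    plain = []  # characters collected since the last idiom
    i = 0
    while i < len(text):
        match = None
        # Greedy longest match: try the longest candidate first.
        for n in range(min(max_len, len(text) - i), 1, -1):
            candidate = text[i:i + n]
            if candidate in idioms:
                match = candidate
                break
        if match:
            if plain:
                out.append(("".join(plain), None))
                plain = []
            out.append((match, idioms[match]))
            i += len(match)
        else:
            plain.append(text[i])
            i += 1
    if plain:
        out.append(("".join(plain), None))
    return out


if __name__ == "__main__":
    for segment, gloss in gloss_idioms("他这样做是画蛇添足。"):
        print(segment, "->", gloss)
```

A lookup like this only covers idioms that are already in the glossary and that appear in their standard form, so it says nothing about how well the rest of the sentence gets translated.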

mitcho

From the website:

It takes Chinese children 3 to 6 years to learn to read Chinese. You can match them in the time it takes to finish this book!

This banner is ambiguous: one interpretation would be that it takes three to six years to read the book! :P

roddy

Even if it is feasible, I think charging almost fifty dollars to tell people to buy an OCR and some translation software is probably a bit ambitious.

gkung

"Some other very exaggerated, basically incorrect statements:

Quote:

This takes native Chinese children 3 to 6 years to learn. For foreigners who do not use Chinese on a daily basis, the time needed is substantially longer. "

A Chinese third-grader is not able to read a newspaper (or other general document) without the help of Zhuyin or Pinyin, even though he or she may have already learned some characters before starting school. This is a simple fact, nothing exaggerated.

gkung

"Some other very exaggerated, basically incorrect statements:

Quote:

There are two Chinese character sets—simplified and traditional. Most native Chinese can only read text from one of them."

From data I obtained by interviewing people: without any learning, the average comprehension rate for text written in the other set is less than 75%. A few interviewees were able to understand more than 90% of the text written in the other set, which I considered "able to read text from both sets," but they achieved that only after reading at least one novel (statistically more than 800 pages of books or documents) in the other set. I do not believe most native Chinese have read that many pages in the other set, even though I don't have data to support that.

You said this statement is incorrect, so do you believe most native Chinese are able to read text written in both sets?

Senzhi

It's these kinds of things that make my students of English not study: they use translation software to write their essays.

They basically research the information in Chinese on the internet, copy/paste it in their translation software ... and then nicely copy it on paper.

And they do not understand why I force them to do their homework again. They seriously claim I can't see the difference between a computer-generated translation and a translation from the brain, made with the help of a good dictionary.

Translation software and learning. Haha, what a joke! :roll:

renzhe
A Chinese third-grader is not able to read a newspaper (or other general document) without the help of Zhuyin or Pinyin, even though he or she may have already learned some characters before starting school. This is a simple fact, nothing exaggerated.

The way you worded it, it suggests that "learning 3000 characters and 10,000 words" takes a native speaker child 3-6 years and "substantially longer" if you're a foreigner. That is preposterous.

An American third-grader can't read most American newspapers either. This is because third-graders still wet their bed, and newspapers aren't targeted towards them. Children do NOT learn this vocabulary. By that logic, it would take substantially more than 3-6 years to learn English too.

You said this statement is incorrect, so do you believe most native Chinese are able to read text written in both sets?

Yes, I believe that the majority of literate native speakers will struggle somewhat when confronted with written language in the other set for the very first time, but after a short period of adaptation, will have only a mild discomfort while reading the other set. This is what I've observed with the native speakers I know. Here, on this forum, both sets are commonly used, and the only people who struggle with this are beginning and intermediate-level learners, not native speakers.

I can't comment on your numbers because I don't know how you did your tests, on which sample, the difficulty of the text, how objective their self-assessment was, the educational background, etc. I just don't feel it's representative of "most native speakers". Again, I find the wording misleading. A reader would guess that there are two completely different sets of characters, when the fact is that only a few hundred (from the 3000 you mention) are sufficiently different to cause any problem, and those can be figured out from context most of the time anyway.

Claiming that a person with no Chinese knowledge who reads a book can "help your Chinese friends read Chinese" is absolutely ridiculous. You're basically describing how to use an OCR program and a computer translator.

gkung

"This is because third-graders still wet their bed..."

Talk about exaggeration.

".., and newspapers aren't targeted towards them."

I am just using newspapers as an example. Actually, third graders need the phonetic annotations for other reading too, including stories that are targeted at them. The logic is simple: phonetic annotation is used because they do not recognize a character, not just because the content is too deep.

gkung

"It's these kinds of things that make my students of English not study: they use translation software to write their essays.

They basically research the information in Chinese on the internet, copy/paste it in their translation software ... and then nicely copy it on paper."

It's about READING - not WRITING.

"Translation software and learning. Haha, what a joke!"

Actually, it's about "NOT learning". However, I don't feel associating translation software with learning is a joke. You can view it as a tool, just like a dictionary. It just so happens that your students have a different way of using it that's not approved by you.

roddy

Plus, not everyone WANTS to learn to read Chinese - odd though it may seem from the ecosystem of this particular forum. Some people just need to know what something written in Chinese means, and for perfectly valid reasons haven't the time or inclination to learn how to read it themselves.

HerrPetersen

Do I understand correctly that the book does not include any software? So it basically gives you a number of web-links and tells you how to use them? Would you mind giving a rough sketch of which websites you are explaining?

gkung

"...So it basically gives you a number of web-links and tells you how to use them? Would you mind giving a rough sketch of which websites you are explaining?"

It's more than that. It includes hands-on examples of reading magazines, newspapers, street signs ... so hopefully people will not think it's a joke. The most valuable thing it offers is a description of techniques for dealing with issues that come up in real life. For example, text generated by translation software may read very differently from regular writing. What do you do when the translated result is nonsensical or incomprehensible? How about when the output from the recognition program (OCR) contains incorrect characters? Only by putting all the little pieces together will the whole reading process work.

For more information you can visit: http://www.GeorgeKung.com

Senzhi
It's about READING - not WRITING.

In my humble opinion ... it's about understanding what you write ... or read! :roll:

renzhe
Some people just need to know what something written in Chinese means, and for perfectly valid reasons haven't the time or inclination to learn how to read it themselves.

I have no problems with this approach, and I have no problems with Mr. Kung's book. I'm sure that there are people who will find it useful.

I just found many of the statements far too bombastic. It doesn't take substantially more than 6 years to learn 3000 characters (took me a year and a half, as a hobby, in Europe), third graders have trouble in any language, not just Chinese, and you can't use OCR software to help native speakers learn their own language. I'm assured by my Chinese friends that reading traditional/simplified characters is no issue at all for literate native speakers, and I've never met a Chinese person who claimed otherwise.

gkung

This has really strayed far from my topic, but I just need to say this for the record:

"I just found many of the statements far too bombastic. It doesn't take substantially more than 6 years to learn 3000 characters (took me a year and a half, as a hobby, in Europe)"

I couldn't say it's impossible, but this is very out of the ordinary. I am talking about a regular curriculum for average people here. I did not invent this number (3 to 6 years), and it really takes that long. Maybe you can ask your Chinese friends how long it took them to learn Chinese.

"you can't use OCR software to help native speakers learn their own language."

I have never said that OCR is used to help native speakers learn the language. It is for people who DON'T know Chinese to use.

"I'm assured by my Chinese friends that reading traditional/simplified characters is no issue at all for literate native speakers"

As I said, it takes some time for a native speaker to learn characters from the other set, and my data show that people will probably need to read about an 800-page novel to become very comfortable reading characters from the other set. I don't have a very unbiased way of measuring this.

imron
Maybe you can ask your Chinese friends how long it took them to learn Chinese.
This post, made just last week in an unrelated thread, might provide some insight.

renzhe
I couldn't say it's impossible, but this is very out of the ordinary.

It is a bit fast (and closer to two years), but then again, it's less than an hour per day of studying. With immersion and more time dedicated to it, it could be faster. There are people out there who learn this in several months, given specialised books with mnemonics (Heisig system, etc.)

Anyway, reading Chinese is not easy, and learning a language takes years. You describe a way to use computer translation software on Chinese texts, and that's fine. I just found many of the claims bombastic, like the quote "After finishing this book, you can even help your Chinese friends read Chinese."
