(Not) finding phonetics that have Outlier entries in Pleco

October 21, 2019 at 08:13 AM

G: How about a refund?

M: Sure, if we can. Here's how.
G: I don't want your stupid money.

Seriously, I'm lost for words.

6 minutes ago, vellocet said:

Dude, you're being a complete dick

Ah, there they are.

October 21, 2019 at 08:20 AM

Quote

Dude, you're being a complete dick about a software function that doesn't work the way you think it should work.

Indeed - it doesn't work at all!

Quote

I bought it. I think we're lucky to have anything like it at all, and complaining it isn't as feature-laden as you'd like it is really rich. It shouldn't even exist

Dude, you're overegging it. Outlier may one day be feature-laden heavenly convenience and indeed quite reasonable value, but for now one often has to "fall back" on other resources, which usually provide pretty much the same information in most cases. I appreciated learning that the sheep was really a headdress though, that really was mindblowing. Not as mindblowing as a fusion approach I saw with the headdress fashioned from bits of natty fleece, mind.

Quote

You really have to make a distinction between professional polished products and works of narrow interest created by enthusiastic amateurs.

Hmm, and which one applies to Outlier by way of Pleco?

October 21, 2019 at 08:30 AM

Aw c'mon Roddy, implementing tech stuff is obviously very time-consuming work, so it'd be churlish to demand any stupid money back. I just need to have more patience and faith is all.

October 21, 2019 at 09:59 AM

1 hour ago, Gharial said:

Hmm, and which one applies to Outlier by way of Pleco?

The work of narrow interest created by enthusiastic amateurs. Thinking otherwise is at the root of the frustration you feel.

October 21, 2019 at 11:19 AM

Thanks for the lesson on managing expectations, but I think Pleco (advertising etc) needs it more than me LOL. Doubtless Outlier will respond and at length to your other sentence. Shall we leave it at that?

October 21, 2019 at 12:34 PM

@vellocet First of all, I'd like to say thanks for piping in. But, since this is a public forum and since our livelihood quite literally depends on public opinion of our competency, I'd like to address the term "enthusiastic amateurs." You seem to be referring to our business sense (in which case, it may be a fair assessment), but for anyone else who may be reading this thread, I'd like to clarify a bit.

I left a pretty cushy job as an electrical engineer to go to Taiwan to spend a year (and no small amount of money) at ICLP in order to get my Chinese up to a level that I could survive in a Chinese academic environment. After that, I spent 6 years in a PhD program for Teaching Chinese as a Foreign Language (TCSL) at National Taiwan Normal University and was the first westerner to pass the PhD qualifying exams (two 6-hour exams, all in handwritten Chinese with nothing but pen and paper). I memorized 15 books over a 4.5-month period to do this. It was over 20,000 supermemo flashcards. I had to be able to answer any question about linguistics (first exam) and any question about Chinese language pedagogy (second exam). The linguistic areas I studied for the first exam include typology and universals, semantics, transformational grammar, case grammar, functional grammar, contrastive analysis, Chinese syntax, pragmatics, phonology and Chinese morphology.

During my time in the TCSL department, I became interested in Chinese paleography and Old Chinese phonology. I took 18 hours of classes in the Chinese department on paleography and historical linguistics, including classes by paleographers 季旭昇 and 杜忠誥. I had the good fortune to do two 2-week courses with Dr. William Baxter (specialist in Old Chinese phonology) and 陳劍 (one of China's top paleographers), one in 2007 and one in 2009. After talking with 季旭昇, Dr. Baxter and Dr. Dirk Meyer (of Oxford University), I left TCSL and moved over to the Chinese department to pursue more academically rigorous studies in paleography. I was only allowed to transfer 9 credit hours, so had to do another year plus of courses including more historical phonology, paleography and excavated texts. I also participated in 季旭昇's weekly study group on excavated texts for over 2 years.

To summarize: not including the time I spent learning Chinese and classical Chinese to a level that I could even start preparing for this project, I've spent at least 7 years full-time preparing to do this project. My PhD proposal (i.e., for my dissertation) was accepted 22 Oct 2015. I put off writing it up until a month ago, because I was spending 6 days a week on this project. I'm now only putting 1 day a week toward my dissertation, which has to be written entirely in Chinese, also so I can put time into this project.

John lived in Taiwan for 3 years before moving to Japan. He learned Mandarin really quickly. Within two years of being in Taiwan, he applied to and was accepted into the master's program in the Chinese department, also at NTNU. He did a year in that program, focusing on paleography and Chinese calligraphy, and was very successful. He has a real solid basis in paleography, better than what you get from any of the popular books available in English (and Chinese for that matter).

The two of us earn our living from Outlier and have done so for years now. So, if we have a say in this, we prefer the term "underfunded professionals."

October 21, 2019 at 10:33 PM

9 hours ago, Ash@Outlier said:

It was over 20,000 supermemo flashcards.

Using supermemo long enough to build 20,000 flashcards is the most difficult part of what you described, doubly so if it was one of the versions that didn't have good support for Chinese characters/unicode (which if I recall was not until 2011). Amazing software, with one of the worst user interfaces I've ever seen.

October 21, 2019 at 10:41 PM

Well, not really. I had made a few GUIs for making Supermemo cards back when I was an engineer. I had ones specifically designed for doing Chinese characters, and some for just normal cloze questions or true-false questions. So, I just typed questions and answers into my GUI and it made all the cards I needed. So, generating the cards themselves was fairly trivial. I didn't use the desktop version of Supermemo because of the issue you mention. I just used the mobile version on a PalmOS device. I had bought a few extra of those back in the day because I wore out both sets of buttons doing my flashcards for Chinese on my original Palm device. I had Pleco on it already. I did try out desktop Supermemo, but that was back in like 1999 or 2000. It handled Big5 ok! : ) As far as amazing, yeah, it was. Pretty much everyone that does SRS uses some version of the Supermemo algorithm. The UI was even less intuitive than microsoft (and I didn't even think that was possible).

October 21, 2019 at 11:38 PM

52 minutes ago, Ash@Outlier said:

The UI was even less intuitive than microsoft

The fact that you perceived it easier to write your own GUIs rather than use the one they made speaks volumes. I actually installed one of the latest Supermemo releases the other week just to see if they'd made any improvements to the UI - and nope, it's still just as I remembered it.

October 22, 2019 at 07:39 AM

On 10/20/2019 at 3:40 AM, mikelove said:

I’m not sure how you would have liked us to integrate it better; it’s a character dictionary and we let you look up entries in it by character.

I think this reply makes complete sense for Pleco which is basically a dictionary (that I once bought a Palm for). From Pleco's point of view, Outlier provides some extra info about certain characters, which is nice, especially because people are willing to pay for it.

But Outlier marketed the dictionary as some kind of revolutionary learning tool that could teach you characters in a way not available before. People may have assumed that you could bounce around from character to character, drill down to components, learn about characters that way. I don't think it's entirely unreasonable to be a little miffed that you can't.

Maybe that's the problem though. The tool most learners need is a version of Wenlin, with good GUI, for phones, with perhaps updated etymology, that can integrate with Pleco but not be limited by Pleco's dictionariness.

Perhaps I'm biased cos I love Wenlin, which in addition to being a dictionary is indeed a self-learner's tool as well. But it's not cheap, it is not on-phone, and there's a touch of the SuperMemo stubbornness when it comes to GUI. Maybe people hoped Outlier would be to Wenlin what Anki is to SM.

October 22, 2019 at 08:06 AM

Our dictionary is indeed "a kind of revolutionary tool that can teach you characters in a way that has not been available before."

1. It has a unified way of explaining all characters based on 4 types of functional components (that has not been available before)

2. It's based on the latest paleographic research and quotes its references (also not available before)

3. It's also based on the latest research in Old Chinese phonology (also not available before)

Wenlin is great. I use it a lot myself and have since probably 2000. But, the character explanations are largely based on Wieger, Karlgren and the Shuowen. At best, those resources are 100 years out of date and take nothing from what has been uncovered by advances in paleography and research into Old Chinese phonology over the last 50 years.

If I'm wrong about this, please point out a resource that does the above (or even part of it).

The system-level data has already been put together. However, it's in the queue right now behind other things, like getting the Expert data back on track. Basically, we just need to edit that data and have Mike Love incorporate it into Pleco. And just as a side note, Mike Love is super effective with getting things done (the nonsensical, baseless claims made in this thread to the contrary notwithstanding).

Also, the issue raised here isn't that the OP disagreed with/wasn't happy with what we are doing. It's the fact that he immediately started in with disrespect / unreasonableness / sarcasm. There's nothing wrong at all with expressing your discontent or disagreement with something. But, just because you bought a product, it doesn't give you the right to denigrate people. And from a merely pragmatic viewpoint, doing so isn't likely to further your cause.

October 22, 2019 at 08:21 AM

Jeez louise, stop patting yourself on the back so much Ash, you'll end up a real hunchback. Outlier's sloooow workrate obviously tied Pleco's hands on this, simple as, you pretty much admitted it. To try to have it any other way is what is most nonsensical, but you just can't see it. You should take a much-needed break from this one-sided "debate" and get back to writing those entries you owe us in that never-forthcoming miracle update.

October 22, 2019 at 08:34 AM

Free forums-mandated posting-holiday for Gharial. It's very rare I look at the edit history of someone's posts and see they've been repeatedly coming back to make them more rude.

October 22, 2019 at 10:53 AM

2 hours ago, Ash@Outlier said:

Our dictionary is indeed "a kind of revolutionary tool that can teach you characters in a way that has not been available before."

No it's not. It's a dictionary which breaks down some Chinese characters into some components.

Nothing very revolutionary about that, except that it's no dinner party.

Other dictionaries exist. Other schemes of categorisation exist. Other character-histories exist. Yours may well be superior - and some entries are superb - but I don't see revolution here (and presumably the great majority of entries will be "x gives the sound and y hints at the meaning"). OK - I accept the slick and attractive digitisation and consequent accessibility of such advanced-level paleography may be revolutionary for advanced-level scholars (insofar as it's digitised and accessible on a phone), but I can't imagine there are enough of them to fund a kickstarter.

I'm also unclear about these four functional components (one of which appears to describe a non-functioning function). Outlier tells me that for 达, the 辶 is a form component, and that the 大 is both a form component and a sound component. Does this explain why the character looks like it does now? Or could one or more of these components be accidental, or inessential? If a character has a component whose sound coincidentally now fits with current pronunciation of that character, would Outlier classify said component as a "sound component", even though at the time the character became established it was not a "sound component"?

October 22, 2019 at 11:19 AM

2 hours ago, roddy said:

coming back to make them more rude.

A few threads recently make me feel like I’m on reddit.

October 22, 2019 at 01:40 PM

Quote

@realmayo:

If you're really interested in the answer, here's an explanation, though it may take a few minutes of your time.

BACKGROUND

Chinese characters are one of the main reasons for the high rates of attrition among students learning Chinese. The dropout rate for Chinese at American universities is 4x the rate for Spanish, French, German, etc. Why is that? Because most dictionaries for people learning Chinese sidestep the issue of characters altogether, or they do provide some information, but they don't explain the character's structure. That leaves the student to fend for themselves. Most people don't get past it. I lived in Taiwan for 13 years and was in graduate school for most of that time. I know a lot of people that speak fluent Chinese, but are constantly forgetting how to write characters. The reason for that is that they just tried to memorize them without understanding their structure. If you've looked at any research into memory, you'll know that the brain hates rote memorization. It also hates trying to memorize things it doesn't understand.

Yet, Chinese characters as a system have very noticeable patterns of sound and meaning expression. However, due to changes between the Old Chinese period and now, due to sound changes, changes in syllable structure and character corruption, these patterns are not as clear as they once were. Even to native speakers. For instance, I've not met a native speaker yet who sees that 監 jiān is the sound component for 藍 lán and 籃 lán, even though 藍 and 籃 have the exact same sound and the semantic components 竹 & 艹 even fall within the list of traditional radicals (i.e., more likely to be identified as meaning bearing components by a native speaker). The reason is that they can't reconcile the pronunciation differences between 監 and 藍, 籃. Also, because they don't have the habit of trying to understand character structure.

Foreign learners of Chinese that have the tenacity to make it through all of that and learn a few thousand characters start to notice some patterns, but they still don't notice a good portion of the available sound patterns, because they aren't easy to see if you're not familiar with the sound changes. They also can't see a lot of the meaning patterns because they are led astray by the traditional notion of what radicals are. So, how do these survivors do it? Either through vast amounts of repetition, or via memory techniques. Once the memory techniques fall away, they're just left with their degraded sense of sound and meaning expression. Degraded here meaning that they never actually got a clear view into these patterns to begin with.

People have tried to make dictionaries that elucidate these patterns, like Wieger and Karlgren, but they were hampered by the information available to them at the time. Others have tried making memory stories, like Heisig. The good part about that type of method is that it adds meaning where there was no meaning. Even though the meaning isn't what the character was originally trying to convey, it gives a type of meaning to the structure of the character, which in turn helps the brain to remember. The bad thing about this type of method, when based on an incorrect understanding of a character's structure (i.e., by breaking it into non-functional parts or assigning a meaning to a part that was designed to give a sound), is that it hides the sound and meaning patterns that are inherently there.

Traditional Chinese dictionaries try to tell you what "type" a character is and thereby to add meaning. The problem is that the categories are not clear and they are often in conflict with one another. So much so that a lot of paleographers ditch the traditional system altogether OR they have to modify it in a way that gets pretty complex pretty quickly. For instance, if there is a component in a character that has both sound and meaning, it could be a 聲符兼表意的形聲字 or a 意符兼表音的會意字. There actually is a difference, but no way is a student of Chinese going to make any sense of it. This has the further problem that in order to understand an abstract category, you have to first understand a lot of concrete examples of that category. In other words, the system doesn't make sense until you learn a lot of characters.

OUR SYSTEM AND HOW IT'S DIFFERENT

The system we use in the dictionary is based on the idea of functional components. Instead of worrying about what type of character a given character is, we show you how the character breaks down into components that have a function within that character. By looking at these correct breakdowns (and when I say "correct," I mean "coherent with the design and evolution of that character" -- at least up to the level that modern scholarship can tell us, and modern scholarship gets us really far), from day 1, the student is shown clearly how each character functions. They start noticing sound and meaning patterns much, much faster than people left to fend for themselves or those who cover up the functional parts with memory stories. Memory stories when divorced from a correct understanding of the working parts of a character cannot lead to predictive ability. That is, the ability to predict the likely sound and meaning of characters you haven't learned yet. Learning via our dictionary will get you this ability faster than any other method available because it is a natural consequence of learning characters by understanding their working parts. Here are unedited quotes from people who took our class on how to learn characters (but the same benefit can be derived from the dictionary itself -- they both use the same system):

"The course is amazing! I was a HSK3 level reader, but could barely write. Using your course has not only helped me to learn to write, but has boosted my reading comprehension. I guessed the meaning and sound of 症 today, having never seen the character before." Robin O'Connor

"After a month of not reviewing, I was able to write (not just recognize) 283 of the 300 characters in my queue. Before I took the Outlier course, it would have been 0."
Edsko De Vries

"Today, I came across the character 崤 and nailed the reading, tone, and meaning in context and felt like a god!" Jonathan Coveney

Edsko had been learning characters for 6 years and still had trouble remembering them until he took our class. He's actually the one who wrote code to pull the system level data out of our database. I mentioned earlier in this thread that the data is already available. We just need to edit it. Edsko did this for free because he wanted to help us out after the huge breakthrough he had. He isn't just a programmer, but a consultant that actually teaches programming to other programmers, so his time is very valuable.

TO SUMMARIZE:
Because our dictionary is based on solid scholarship, it presents character breakdowns which are more consistent internally and more consistent with the actual origin and evolution of the Chinese script, thereby allowing users to understand each character's structure. It allows them to see the overall sound and meaning patterns within Chinese characters on a system level more quickly and more clearly than any other method available to English speakers today. It also does this quickly enough that the user has this information when they need it: while they are learning and not simply after they've learned a few thousand characters. Not only does using this method quickly lead to predictive ability, as is evident in the quotes above, but understanding those same sound and meaning patterns also allows for long-term recall that isn't dependent upon a memory story, but rather on the sound and meaning of the word you are trying to write (or decipher in reading). In other words, it allows you to use the sound and meaning of the word you are trying to write (or decipher) as clues. There simply is no other resource available today that does this. As such, I stand by my claim.

October 22, 2019 at 02:51 PM

Ash, all these things have been done before, though you probably do most or all of them better. If someone updated the paleography of Wenlin and then put Wenlin on a phone, would there be much difference between that and Outlier?

I mean, there's nothing new about splitting characters into their components. You must see that anyone taking any kind of beginners course and not then guessing the pronunciation and meaning of 症 would be entitled to their money back?

Armed with Wenlin, anyone can teach themselves a functioning understanding of how Chinese characters work and how they break down. The fact that your system is based on the latest research is irrelevant to most learners if it doesn't speed up their learning time. You are providing more accuracy but at the cost of having to remember which of four functions is being activated for each component in a character.

Your four functions: do they accurately describe a character's genesis and development, or are they arbitrary ways of fitting into your four functions? I asked:

Outlier tells me that for 达, the 辶 is a form component, and that the 大 is both a form component and a sound component. Does this explain why the character looks like it does now? Or could one or more of these components be accidental, or inessential? If a character has a component whose sound coincidentally now fits with current pronunciation of that character, would Outlier classify said component as a "sound component", even though at the time the character became established it was not a "sound component"?

October 22, 2019 at 03:13 PM

1 hour ago, Ash@Outlier said:

I've not met a native speaker yet who sees that 監 jiān is the sound component for 藍 lán and 籃 lán

They seem to manage OK with Chinese characters though! But seriously, if that's true, why do they think that 藍 and 籃 are pronounced the same?

October 22, 2019 at 03:55 PM

No, they haven't been done before. Who did them? Give examples.

You're confusing data and software. We do data. Pleco does the software end. Wenlin is mostly a piece of software, though they do have data. Their data is very, very outdated (when it comes to character explanations). The software is very useful. I use it and have for many years. I'm a big fan. But the character explanations are not systematic and are based on outdated information. The reason that is important is because it comes at a cost in accuracy, which lowers your understanding of a given character. We can and will add system level data. It will be different from Wenlin in the sense that you can see how a component acts as a semantic component, sound component, etc.
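For anyone wondering what a "system level" view of this kind could look like, here is a minimal Python sketch. It is only an illustration: the record layout, the `ENTRIES` list and the `index_by_function` helper are all made up for this example and say nothing about how Outlier's database or Pleco actually store things. The only facts taken from this thread are the component roles mentioned for 藍, 籃 and 达.

```python
from collections import defaultdict

# Hypothetical records: (character, component, function of that component).
# Roles for 藍, 籃 and 达 are the ones mentioned in this thread; the tuple
# layout and the function labels are assumptions made for illustration.
ENTRIES = [
    ("藍", "艹", "meaning"),
    ("藍", "監", "sound"),
    ("籃", "竹", "meaning"),
    ("籃", "監", "sound"),
    ("达", "辶", "form"),
    ("达", "大", "form"),
    ("达", "大", "sound"),
]


def index_by_function(entries):
    """Group characters by (function, component) so that one lookup answers
    'which characters use this component in this role?'."""
    index = defaultdict(lambda: defaultdict(list))
    for char, component, function in entries:
        index[function][component].append(char)
    return index


index = index_by_function(ENTRIES)

# Characters in which 監 acts as the sound component.
print(index["sound"]["監"])
# Characters in which 大 acts as a form component.
print(index["form"]["大"])
```

The point of keying on the function first is that the same component can show up under more than one role (大 in 达 above), which a flat "list of components" view would hide.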

No, there is nothing new to splitting characters into components. But the vast majority of data available doesn't do it in a way that is coherent with the functional parts within a character and most have no concept of character corruption. The ones that do try to break them down in a way that is coherent with their evolution generally do so based on outdated information and without a clear enough understanding of the issues at hand. If I'm wrong here, please give examples of someone doing it based on current research and correct understanding.

Quote

You must see that anyone taking any kind of beginners course and not then guessing the pronunciation and meaning of 症 would be entitled to their money back?

Not true at all. I've talked to many people learning Chinese. If you had gone to the Mandarin Training Center at NTNU in Taipei between the years of 2006 and 2014, you would have seen me sitting outside there. I've talked to tons of learners about these topics over the years and no, that is not a given. One guy that I had several conversations with wasn't even aware that characters had sound components at all. And he could speak decent Chinese and had read several entire books in Chinese.

Quote

Armed with Wenlin, anyone can teach themselves a functioning understanding of how Chinese characters work and how they break down. The fact that your system is based on the latest research is irrelevant to most learners if it doesn't speed up their learning time. You are providing more accuracy but at the cost of having to remember which of four functions is being activated for each component in a character.

No you can't. I know. I've done it. Wenlin is very useful, and breaks down sound-meaning characters correctly, that is true. I agree that it's irrelevant if it doesn't speed up learning time, but it does. The number one rule of effective memory is that you understand the object of learning. That means that advances in understanding directly correlate to better learning. Remembering which of four functions a component is doing is trivial if you are familiar with the sound and meaning of that component. It's also trivial because it's not arbitrary. It's based on the sound and meaning of the character itself, so it's a node in a web, rather than some extra, unrelated piece of information to be learned. It also allows you to recall character forms after having temporarily forgotten, because you can reason through it. Trying to remember by rote is far more costly, you can't reason your way back to the form based on rote memorization and it generally doesn't work, and hence the high attrition rates in Chinese programs.

I spent 6 years in Taiwan in a PhD program for teaching Chinese as a foreign language. One of the biggest topics when I was there was "how do we teach characters?" There are all kinds of things they're trying to do to solve the problem, like using IMEs instead of writing, ditching learning characters entirely, etc. If things were as simple as you make them out to be, this would not be the case nor would things like Heisig's book exist.

Quote

Your four functions: do they accurately describe a character's genesis and development, or are they arbitrary ways of fitting into your four functions? I asked:

Outlier tells me that for 达, the 辶 is a form component, and that the 大 is both a form component and a sound component. Does this explain why the character looks like it does now? Or could one or more of these components be accidental, or inessential? If a character has a component whose sound coincidentally now fits with current pronunciation of that character, would Outlier classify said component as a "sound component", even though at the time the character became established it was not a "sound component"?

They describe them as accurately as modern scholarship allows. Not all characters have the same amount of information available on their origin and evolution. Obviously, the ones with more information will allow for a higher degree of accuracy.

When doing an analysis, the entire history of the character is taken into account. Designation of component type comes from the earliest form of the character (or in some cases, the earliest form that led to the modern form; characters often do not have linear evolution, but several forms exist in parallel).

In 达, no, the components are not accidental or inessential. In fact, we do tell you when a component may be inessential. For instance 無 vs. 舞. Originally, they were the same character. The only difference is that feet were added to 舞. Both forms existed in parallel (i.e., the addition of feet didn't change anything about the sound or meaning expression). Eventually, 無 was borrowed via sound loan to mean "not." 舞 retained the original meaning "to dance."

The sound component is determined by a character's ancient pronunciation, not its modern pronunciation. For instance, looking at the modern pronunciation, you may be tempted to think that 尤 yóu gives the sound in 就 jiù, but it doesn't. We've toyed with the idea of "effective sound components," which is basically what you are saying here. That is, you can use them in Mandarin to remember the sound of a character, but that wasn't their original function. So far, we haven't made use of this idea, though we may in the future. There is one more scenario related to this and that is the result of soundification or phoneticization, which is when a component gets replaced by another component due to graphical and sound similarities. Take 到 for example. The 刂dāo was originally 人, but got changed into 刂 because of both the similarity of those components during Warring States and because of their similar sounds. In the Essentials edition, we simply say that 刂 is phonetic, because trying to explain the actual origin would most likely detract from rather than enhance learning. In the Expert edition, it will be explained. I've actually already written that entry.

October 22, 2019 at 03:59 PM

Quote

They seem to manage OK with Chinese characters though! But seriously, if that's true, why do they think that 藍 and 籃 are pronounced the same?

I'm not saying that they don't. But as a non-native speaker, using native speaker methods of learning characters is way below optimal (it's actually suboptimal for native speakers themselves, but that's a whole other thread or two). We have a youtube video explaining why if you're interested.

I'm not sure what you mean. 藍 and 籃 are pronounced the same. They're pronounced lán. Or are you asking how they came to be pronounced lán when 監 is pronounced jiān?

Posters, in page order: roddy, Gharial, Gharial, vellocet, Gharial, Ash@Outlier, imron, Ash@Outlier, imron, realmayo, Ash@Outlier, Gharial, roddy, realmayo, ChTTay, Ash@Outlier, realmayo, realmayo, Ash@Outlier, Ash@Outlier