Chinese-forums.com

Are new characters still being created?

edelweis

I know there are thousands of old characters that are not in use anymore, so probably the answer is no. Why create a new character when there are thousands of old ones to choose from?

Still, I'd like to know if there is a record of the most recent character created?

Do we even have an estimate of the age of some characters?

(I mean, apart from the traditional to simplified process and variants of the same character.)

jbradfor

New ones are being created for new elements, for example, and I think for some chemical compounds.

And of course old obsolete ones are being "repurposed", e.g. 囧.

imron

I think as computers become more and more dominant in terms of usage (compared to paper), it will get harder and harder to create totally new characters. The hassle involved in updating the various encoding standards, making sure that common fonts all have the character added, and so on will present a large obstacle to all but a handful of characters for specialist vocabulary in niche fields.

edelweis

@jbradfor: thanks, I was not aware of the new elements warranting new characters.

@imron: that's my feeling too... are all the existing characters already in the standards anyway? Didn't some people have to change their surnames because they use un-typeable characters?

Jose

I agree with imron that it is pretty hard to coin new characters these days because encoding standards have a fixed inventory of characters. However, things could change in the future if a way of representing characters based on their internal structure were devised. In the current encoding schemes, a character like 淋 is represented by a numeric value completely unrelated to 林, which is also numerically unrelated to 木, but there have been some attempts to design component-based encoding systems where 林 would be represented as (木 木). With such a system, 淋 would be something like (氵 (木 木)), and it would then be possible to coin a new "lin" character with the hand radical (扌林) by encoding it as (扌 (木 木)). I think such a system would be a more natural representation of how Chinese characters actually work than the current sequences of numeric values, and maybe in a post-Unicode world in the far future this idea will catch on.

I can't remember where I first read about this kind of encoding scheme. After some googling, I've found this old paper from 1996 that explains such an approach: http://seba.ulyssis.org/thesis/papers/yeung.cpol97.pdf
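
The component-based idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scheme from the linked paper: the nested-tuple representation and the helper names `notation` and `flatten` are hypothetical, and the sketch ignores layout (left-right versus top-bottom), which a real system would have to record.

```python
# Hypothetical illustration: a character is either a primitive component
# (a plain string) or a tuple of sub-components, mirroring the (木 木) notation.

def notation(char):
    """Render a nested description in the parenthesised notation used above."""
    if isinstance(char, str):
        return char
    return "(" + " ".join(notation(sub) for sub in char) + ")"


def flatten(char):
    """Return the primitive components of a description, left to right."""
    if isinstance(char, str):
        return [char]
    parts = []
    for sub in char:
        parts.extend(flatten(sub))
    return parts


LIN_FOREST = ("木", "木")         # 林
LIN_DRENCH = ("氵", LIN_FOREST)   # 淋
NEW_LIN = ("扌", LIN_FOREST)      # a coined character: hand radical + 林

print(notation(LIN_DRENCH))
print(notation(NEW_LIN))
print(flatten(NEW_LIN))
```

The point of the sketch is that the coined character needs no new entry in any inventory: it is described entirely by components that already exist.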

Silent

Imron may be right that in some respects the computer makes it harder to create new characters. On the other hand, in a way the computer makes it easier too. It depends, of course, on what you consider a new character.

In the past, new characters were created by 'random' people, either through writing errors or through conscious assembly to express something new or to simplify something that was previously expressed by more difficult or multiple characters. After creation it was a matter of chance whether a character would catch on. Now language is more standardised, and if an official committee decides that a new character is needed it will be implemented in the systems, if only to meet the standards. A simple example is the euro sign. After it was introduced, many existing systems were patched to support it and standards were adapted.

Now, in the computer age only a very limited number of people, those in charge of the standards and the owners of the main OS's, have to be convinced to introduce a new character. A 'simple official decree' will do.

As long as a language is alive and used by people on a daily basis, the language will develop. Changes are gradual and the speed of change may vary, but in the end it will affect every part of the language.

imron
Jose wrote:

In the current encoding schemes, a character like 淋 is represented by a numeric value completely unrelated to 林, which is also numerically unrelated to 木, but there have been some attempts to design component-based encoding systems where 林 would be represented as (木 木). With such a system, 淋 would be something like (氵 (木 木)), and it would then be possible to coin a new "lin" character with the hand radical (扌林) by encoding it as (扌 (木 木)).

The thing is, it's a two-pronged problem. It's not just the encoding, it's also the font. If the font doesn't have the character then it still won't display. Also, such an encoding system would be inefficient and impractical for storing characters. Currently Unicode uses a maximum of 4 bytes for any character, but it typically comes down to 3 bytes for a common Chinese character in UTF-8, or 2 bytes in UTF-16. An encoding system such as the one you mentioned could easily double or triple that for some characters.
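
The byte counts are easy to check in Python. This is a minimal sketch: the helper `byte_sizes` is a made-up name, and the string "氵木木" is only a stand-in for a component-based encoding of 淋 (an assumption for illustration, not an actual encoding scheme).

```python
def byte_sizes(text):
    """Return the encoded size of text in bytes for UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" is used so that no byte order mark is counted.
        "utf-16": len(text.encode("utf-16-le")),
    }


single = "淋"              # one common character in the BMP
components = "氵木木"       # hypothetical component sequence standing in for 淋
outside_bmp = "\U00020000"  # a character outside the BMP (CJK Extension B)

for label, text in [("single", single),
                    ("components", components),
                    ("outside BMP", outside_bmp)]:
    print(label, byte_sizes(text))
```

With three components, the stand-in sequence takes three times the space of the single code point, before counting any markers a real scheme would need to record the layout.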

Actually, for individual purposes, it's quite easy to create your own character with Unicode. There is a range of several thousand code points set aside for private use in the BMP, and over a hundred thousand outside the BMP, so you can just decide you're going to use code point E001 for your new character and then create a font that has a picture of the character you want for code point E001. Anyone with your font installed who gets Unicode text containing your new code point will see that code point as your new character. So, for personal or limited use, such a system makes it quite easy to create new characters. For widespread use, however, it would take more work to get widespread agreement and widespread usage of the new font. New characters do keep getting added to the Unicode standard (although these are not 'new' characters; rather, they are rare characters that didn't make it into earlier versions of the standard and are slowly being added).
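
As a rough check of those private-use figures, here is a small Python sketch. The range boundaries are the Private Use Areas defined by Unicode; the names `PRIVATE_USE_RANGES` and `is_private_use` are just illustrative.

```python
import unicodedata

# Private Use Areas defined by Unicode (inclusive ranges).
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),      # in the BMP
    (0xF0000, 0xFFFFD),    # plane 15
    (0x100000, 0x10FFFD),  # plane 16
]


def is_private_use(code_point):
    """True if the code point has the general category 'Co' (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"


bmp_count = PRIVATE_USE_RANGES[0][1] - PRIVATE_USE_RANGES[0][0] + 1
outside_bmp_count = sum(hi - lo + 1 for lo, hi in PRIVATE_USE_RANGES[1:])

print("private use in BMP:", bmp_count)
print("private use outside BMP:", outside_bmp_count)
print("U+E001 is private use:", is_private_use(0xE001))
print("淋 is private use:", is_private_use(ord("淋")))
```

Note that the code point alone carries no meaning here: what U+E001 looks like depends entirely on which font the reader has installed.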

Mark Yong
imron wrote:

It's not just the encoding, it's also the font. If the font doesn't have the character then it still won't display.

That’s pretty much the problem I used to face regularly when trying to input rare and non-standard characters (the latter normally being ones found in non-Mandarin dialects). To fix that problem, I have installed three of (what I have found to be) the most comprehensive sets of fonts, i.e.

1. HanNom (Sets A and B)

2. SimSun Founder Extended

3. TW-Kai

The TW-Kai font set is not as comprehensive as (1) and (2), but I like it for its aesthetic appearance - it is the font normally used in Chinese wedding invitation cards. (1) alone probably contains all the characters in (2) and (3) plus more.

However, that still does not solve the problem of how to input the characters (having the font means you can display it, but inputting it is a totally different matter). So, I normally end up going to the Unihan website, and searching for the character (by radical and residual stroke count). That normally does the trick for me.

Actually, I did once have to create a character manually. A friend of mine asked for assistance in drafting the text for his wedding invitation card. The snag was that his given name has a rare character: 木+强 (it’s actually a variant of 弶). Generating a new code for it would not have worked, as the draft softcopy would eventually have had to go to the printing company, who would undoubtedly use the TW-Kai font (it does not have the character; I checked). So, I had to create it manually as a standalone JPEG by merging 木 with 强, plus some sideways ‘compressing’ to get the proportions right.

imron
Mark Yong wrote:

However, that still does not solve the problem of how to input the characters

Most modern IMEs allow you to define an input sequence that maps to a given code point. For example, if you invent a new character, give it the code point E001, and want it to be pronounced 'xin', then you can go to the settings of your IME and configure it so that E001 is one of the choices for the input sequence xin. Shape-based IMEs also allow this functionality.
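
Conceptually, such a user-defined mapping is just a lookup table from input sequences to candidate characters. The sketch below is a toy model, not the configuration format of any real IME; `user_dictionary`, `add_mapping`, and `candidates` are hypothetical names, and the two existing candidates for xin are only sample data.

```python
from collections import defaultdict

# Hypothetical user dictionary: input sequence -> ordered list of candidates.
user_dictionary = defaultdict(list)


def add_mapping(dictionary, input_sequence, code_point):
    """Offer the character at code_point as a candidate for input_sequence."""
    character = chr(code_point)
    if character not in dictionary[input_sequence]:
        dictionary[input_sequence].append(character)


def candidates(dictionary, input_sequence):
    """Return the candidates for input_sequence, or an empty list if none."""
    return list(dictionary.get(input_sequence, []))


# Sample data: two existing characters read "xin".
user_dictionary["xin"].extend(["心", "新"])

# Register the private-use code point E001 as a further choice for "xin".
add_mapping(user_dictionary, "xin", 0xE001)

print([f"U+{ord(c):04X}" for c in candidates(user_dictionary, "xin")])
```

The new entry is appended after the existing candidates, so it shows up as one more choice rather than replacing anything.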

Takeshi

Sorry for bumping this thread, but some new characters created here in Guangzhou are simplified-looking versions of HK's Cantonese characters. I have never seen any Guangzhou native write Cantonese by hand, and most people use extremely ad-hoc characters when writing on a computer or cellphone, but some of my Cantonese textbooks use these characters. Most cannot be typed on a computer.

Looking through my textbook (今日粤语), here are the first couple of characters that pop out:

口+个 (From 嗰)

扌+罗 (From 攞)

口+系 (From 喺)

Shelley

I understand that the characters for concave and convex were "created" recently.

凸 tu convex

凹 ao concave

These look like the concepts being described. I like them for their simplicity and the information they give.

Shelley

I don't have any references, only what my first Chinese teacher told me. I did say in my original post that I wasn't sure. I said "I understand"...

It does seem to be older, but I think the point is that they were created specifically for two concepts. Most characters seem to have developed and been adapted; these were created from scratch for these concepts. I am not sure of any others that are like this... but I am probably wrong :)

Thanks for the info.

skylee

This word stands for "lift" (i.e. elevator): 𨋢 (if you can't see it, it is 車 + 立 combined as one character). In HK the pronunciation is lip1 (Cantonese). I think it is quite good, combining the meaning of standing in a car and also the similarity in pronunciation.
