Chinese websites


The above passage comes from TheChairmansBao.com, and unfortunately they don't have a user forum where one can ask questions.


This is partly a tech question and partly a translation question.


What I understand is that through the end of 2018, there were 21.24 million .cn domain names, making up 56% of all the domain names in China.


The part I don't understand is what the 1.72 million refers to.  Obviously that's a much smaller number than the 21.24 million, so it's referring to something different from and more specialized than .cn domain names.  I can't figure this out.




Question: I've seen Chinese characters expressed numerically as a percent sign plus numbers (in Chinese Wikipedia articles and filenames). Would using the numerical equivalent for 中国 work in a modern browser, or do you actually need to type the characters for a .中国 site to display?


No.  Edit: Actually, yes.  See edit below for details.


In simplified terms, you can think of a URL as split into a host, a path, and optionally a query (see here for the exact representation of the different parts of a URL).


Also in simplified terms, the host part tells your browser where to look on the internet to find the server you are interested in, and the path and query parts are then instructions for that host to return the resource (in most cases a website) that you are interested in.
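
As a rough illustration of that split, here is a minimal Python sketch using the standard library's urlsplit. The URL is a made-up example (an assumption, not one from the article), picked only because it has all three parts.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen only to show the three parts.
url = "http://www.example.com/some/path?key=value"
parts = urlsplit(url)

host = parts.hostname   # tells the browser where to look on the internet
path = parts.path       # tells that host which resource to return
query = parts.query     # optional extra instructions for the host

print(host)
print(path)
print(query)
```

The host is the only part the browser has to resolve itself; the path and query are simply passed along to whatever server answers.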


Finding the host and giving the host instructions to return a resource are handled through different mechanisms, and have different rules for what is or isn't allowed.


The way the browser finds the host is through something called a DNS lookup.  This maps a host name to an IP address, e.g. you type in www.baidu.com, the browser does a DNS lookup and gets told the IP address of www.baidu.com, and it then queries the server at that address for the data (try putting that IP address in your browser - it should load the Baidu homepage).
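
If you want to see the lookup step on its own, a small Python sketch along these lines should do it. The function name resolve is my own choice for illustration, and this asks your operating system's resolver rather than reproducing exactly what a browser does.

```python
import socket

def resolve(host_name: str) -> str:
    """Return one IPv4 address for host_name via the system's resolver."""
    # gethostbyname performs the name-to-address lookup; if it is given
    # an IPv4 address already, it returns it unchanged.
    return socket.gethostbyname(host_name)
```

Calling resolve("www.baidu.com") needs a network connection, and the address you get back can differ by location and over time, so no particular result is shown here.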


DNS queries are handled by a DNS server, and for historical reasons, DNS entries typically only contain characters a through z, A through Z, digits 0 through 9, and the hyphen.


To solve the problem of using Chinese characters in a host name/domain name when host names are limited to the characters a-z, A-Z, 0-9 and the hyphen, smart people came up with the Punycode encoding system.


So when you type 中央电视台.中国 into the address bar of your browser, the browser converts this to Punycode (xn--fiq53l90e917afrv.xn--fiqs8s) and then does a DNS lookup on that Punycode-encoded name, and that lookup returns the IP address of the host.
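
You can try the conversion yourself. This is a minimal sketch using Python's built-in "idna" codec, which implements an older version of the IDNA rules than current browsers use, so treat it as an approximation of what the browser does rather than the exact same code path.

```python
# The domain used as the example above.
domain = "中央电视台.中国"

# The "idna" codec Punycode-encodes each dot-separated label
# and adds the "xn--" prefix to it.
ascii_form = domain.encode("idna")

# Decoding reverses the conversion and gives back the Chinese text.
round_trip = ascii_form.decode("idna")

print(ascii_form)
print(round_trip)
```

The result is plain ASCII, which is what makes it acceptable as a DNS name.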


This is why percent encoding won't work for typing the domain - because the domain/host needs to be in Punycode.  You could however avoid typing the Chinese and directly type the Punycode of any Chinese character domain and it would take you to the same place (just like you can use the IP address to take you to Baidu).


The path and query parts of the URL, on the other hand, are handled by the web server, which is far more relaxed about the type of things it will accept, but it is still very much ASCII oriented.


Originally, URL encoding (or percent encoding) was used to escape special characters in the path and query sections of the URL, e.g. characters like /, ? and &, which have special meanings.


There was no requirement to convert non-ASCII text to a specific format, however in order to avoid potential encoding problems and ambiguities (e.g. the browser encoding non-ASCII text in GB18030 and the web server expecting UTF-8), it is recommended that all non-ASCII text be converted to UTF-8 and the resulting UTF-8 bytes 'URL-encoded' with percent signs.  This is why you see % signs for Chinese characters that appear in the path and query segments of the URL.
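
To make the ambiguity concrete, here is a short Python sketch using urllib.parse.quote. The strings are just examples I picked; the point is that the same two characters give different percent sequences depending on which byte encoding is applied first.

```python
from urllib.parse import quote, unquote

text = "中国"

# quote() encodes to UTF-8 by default, then writes each byte as %XX.
utf8_form = quote(text)

# The same text encoded as GB18030 first gives different bytes.
gb18030_form = quote(text, encoding="gb18030")

print(utf8_form)      # %E4%B8%AD%E5%9B%BD
print(gb18030_form)

# unquote() assumes UTF-8 by default, so only the first form round-trips.
print(unquote(utf8_form))

# Reserved characters are escaped the same way when meant literally.
print(quote("a/b?c&d", safe=""))   # a%2Fb%3Fc%26d
```

A server expecting UTF-8 would misread the GB18030 form, which is the problem the recommendation avoids.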


Edit:  Depending on the browser, you may be able to use the percent-encoded version of a host name in the address bar.  The browser will convert it to Chinese, then convert that to Punycode, and then get the IP address from that.  Not all browsers will do this.  In my tests just now, it worked for Firefox, but not Safari unless I first put http:// in front of it.  Not sure what Chrome does.
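
The two conversion steps can be sketched in Python as follows. This only mimics the sequence described (percent encoding to Chinese, then Chinese to Punycode) with standard library calls; it is not what any particular browser actually runs.

```python
from urllib.parse import unquote

# Percent-encoded UTF-8 for the two characters of the .中国 suffix.
typed = "%E4%B8%AD%E5%9B%BD"

# Step 1: percent encoding back to Chinese characters.
as_chinese = unquote(typed)

# Step 2: Chinese characters to the Punycode form used for the DNS lookup.
as_punycode = as_chinese.encode("idna")

print(as_chinese)    # 中国
print(as_punycode)   # b'xn--fiqs8s'
```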



As the article says "中文域名正在逐渐被人们熟悉" 


You know, when someone doesn't understand something, it is neither helpful nor appropriate to just quote back to them what they didn't understand, as if the answer were obviously right there.  If it were obvious to me, I would not have asked.


That was not one of the parts of the article you mentioned you had misunderstood, so I assumed you had understood it, especially as your second post seemed to imply that you had resolved the original source of your confusion.


Anyway, it means something along the lines of "People are gradually becoming more aware that domain names can use Chinese characters" and my comment wasn't meant as anything other than an ironic observation that you had never seen a " .中国 " URL, and now you are aware that they exist - which was the point of the last sentence.



4 hours ago, Moshen said:

when someone doesn't understand something, it is neither helpful nor appropriate to just quote back to them what they didn't understand, as if the answer were obviously right there.  If it were obvious to me, I would not have asked.


Speaking for myself: many users on these forums read and speak Chinese at a high level. My tendency is to assume the people I interact with here understand Chinese, especially if they quote Chinese texts in their posts, unless they specifically state otherwise or I know from previous experience that they don’t.


I would sooner overestimate the Chinese-language abilities of my fellow forum members than the reverse. 🙂


9 hours ago, Moshen said:

My dictionaries translate 域名 as "domain name," not as "Chinese characters." 

And that is correct.


The sentence I quoted and translated was talking about "中文域名", which literally translated would be "Chinese domain name". There is ambiguity in that English translation, because the original is talking about Chinese-language domain names (i.e. domain names that use Chinese characters), whereas in English "Chinese domain name" could also refer to domain names based in China, e.g. https://sina.com.cn, which is a Chinese domain name but clearly not a 中文 domain name.


Therefore, to remove ambiguity, "中文域名" could be translated as "domain names that use Chinese characters", which is a clear and unambiguous translation of the intent of the source phrase, rather than a direct translation of the individual words.
