
Transliteration tool for Tang Chinese, Cantonese, Hanja, Kanji (On/Kun), Hán tự


jonsl01


Hello !

I first posted here 3 years ago about ChineseTransliterator, a tool I made in my spare time for transliterating Chinese, which back then was based on v5.1 of the Unicode Chinese character database (Unihan). I finally got around to updating it to the 6.1 database just last week. Mandarin is also supported, but there are plenty of Mandarin tools available, and this being the Non-Mandarin forum, I think some people might be interested in the following feature(s):

1) As the title suggests, it supports Tang Chinese (http://www.unicode.org/reports/tr38/#kTang), Cantonese (jyutping), Hanja, Kanji (On/Kun), Hán tự Vietnamese, as well as Mandarin.
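For anyone curious how readings like these can be pulled out of the Unihan data, here is a rough Python sketch. It is not the tool's actual code (the tool is a .Net program); it only assumes the documented Unihan text layout of tab-separated lines (code point, field name, value). The function name `parse_unihan` and the sample lines are made up for illustration and are not copied from the real database.

```python
from collections import defaultdict

# Unihan field names that correspond to the readings listed above.
FIELDS = {"kTang", "kCantonese", "kKorean", "kJapaneseOn",
          "kJapaneseKun", "kVietnamese", "kMandarin", "kDefinition"}


def parse_unihan(lines):
    """Return {character: {field: [values]}} from Unihan-style lines."""
    table = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        codepoint, field, value = line.split("\t", 2)
        if field not in FIELDS:
            continue  # ignore fields this sketch does not need
        char = chr(int(codepoint[2:], 16))  # "U+4E2D" -> "中"
        # kDefinition is free text; reading fields are space-separated lists.
        table[char][field] = [value] if field == "kDefinition" else value.split()
    return dict(table)


# Illustrative lines only (hypothetical sample, not the real file contents).
SAMPLE = [
    "# sample data for the sketch",
    "U+4E2D\tkCantonese\tzung1 zung3",
    "U+4E2D\tkDefinition\tcentral; center, middle",
    "U+4E2D\tkTotalStrokes\t4",
]

if __name__ == "__main__":
    print(parse_unihan(SAMPLE))
```

Keeping every value of a field as a list is what makes a "pick one of several pronunciations" menu possible later on.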

For precision (and 'cause it was easier to make lol), it converts character by character rather than recognising multisyllable words. Where a character has more than one pronunciation, you can double-click on the result to bring up a little context menu and make your final selection. If you are unfamiliar with a character, you can also double-click on it to get the English gloss.
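The character-by-character approach can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's source: the `READINGS` table is hypothetical sample data, the `choices` argument stands in for the context-menu selection, and defaulting to the first listed reading is my assumption.

```python
# Hypothetical sample data: each character maps to all of its listed readings.
READINGS = {
    "行": ["hang4", "hong4", "hang6"],
    "人": ["jan4"],
}


def transliterate(text, readings, choices=None):
    """Romanise text one character at a time.

    choices maps a character position to the index of the reading the
    user picked; positions without a choice fall back to the first reading.
    """
    choices = choices or {}
    out = []
    for i, ch in enumerate(text):
        options = readings.get(ch)
        if not options:
            out.append(ch)  # unknown characters pass through unchanged
            continue
        out.append(options[choices.get(i, 0)])
    return " ".join(out)


if __name__ == "__main__":
    print(transliterate("行人", READINGS))          # default readings
    print(transliterate("行人", READINGS, {0: 1}))  # user picks 2nd reading of 行
```

Because no word segmentation is attempted, the output never depends on a guessed word boundary; the trade-off is that the user has to resolve multi-reading characters by hand.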

If you're bored, you might like to paste in a large slab of text, tick all the romanisation options, and see the simultaneous transliteration into up to 5 languages/dialects/topolects.

If you tried Version 1 and found it didn't place the tone marks on the correct vowel (for pinyin), that issue has been addressed.
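For reference, the usual pinyin placement rule is: "a" or "e" takes the mark if present; in "ou" the "o" takes it; otherwise the last vowel does. A small Python sketch of that rule follows. It shows the commonly taught convention, not necessarily how ChineseTransliterator implements it, and the function names are made up for the example.

```python
import unicodedata

# Combining diacritics for tones 1-4 (macron, acute, caron, grave).
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiouü"


def mark_index(syllable):
    """Index of the vowel that carries the tone mark, or -1 if none."""
    lower = syllable.lower()
    for v in "ae":                      # rule 1: a or e always wins
        if v in lower:
            return lower.index(v)
    if "ou" in lower:                   # rule 2: in "ou", mark the o
        return lower.index("ou")
    for i in range(len(lower) - 1, -1, -1):
        if lower[i] in VOWELS:          # rule 3: otherwise the last vowel
            return i
    return -1


def numbered_to_marked(syllable):
    """Convert e.g. 'hao3' to the tone-marked form; tone 5/0 stays unmarked."""
    if not syllable or not syllable[-1].isdigit():
        return syllable
    tone = int(syllable[-1])
    base = syllable[:-1].replace("v", "ü").replace("u:", "ü")
    i = mark_index(base)
    if tone not in TONE_MARKS or i < 0:
        return base
    marked = base[:i + 1] + TONE_MARKS[tone] + base[i + 1:]
    return unicodedata.normalize("NFC", marked)  # compose into single letters


if __name__ == "__main__":
    for s in ["hao3", "gui4", "liu2", "gou3", "lv4", "ma5"]:
        print(s, "->", numbered_to_marked(s))
```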

If interested, please proceed to http://download.cnet.com/ChineseTransliterator/3000-2279_4-10964539.html to download a copy. It runs on any Windows PC with .Net 2.0 capability and requires no installation (simply unzip and run from anywhere, even a USB stick).

Thanks :-)


Sorry if anyone got scared by the term .Net 2.0. It is an application framework by Microsoft which is very mature, and most people should have it on their Windows PCs already. If you don't, it is downloadable from Microsoft here: http://www.microsoft.com/download/en/details.aspx?id=19

.Net 2.0 doesn't mean you need to connect to the internet. In fact, this tool is designed to be used completely offline.

Viruses and spyware are a major problem these days. Download.com does checks before allowing a file to be published.

OK, if there are still no takers after this, I can only assume you guys are happy with what you've got already, which is a credit to the developers who've come before me. No worries at all.

Kind regards,


Technically, your program doesn't do transliteration.

I don't know what converting characters to their pronunciation in Romanization is called, but that's essentially what your program does.

Transliteration is a different thing. It's taking a word in one language and using the existing words in another language to pronounce the word from the first language.

It's kind of difficult to explain.

For example, the English surname Roosevelt would be "luo si fu" and Clinton would be "ke lin dun". There's a street in Taiwan named after Roosevelt, called "luo si fu".

There are style-guide-type books for Chinese journalists that list tons of foreign names, to help in transliterating them into Chinese.

English, Spanish, French, German, Italian, Norwegian, Russian, etc.

For Japanese, they'd just use the Kanji (Chinese characters) so no problem there.

Same with Korean, with the original Hanja (Chinese characters).

At one time, there was a bit of controversy with Obama's surname when he first ran for president of the US.

I guess it was so rare that it wasn't included in those guide books.

Initially they came up with two ways of writing it. I don't know offhand which won out.

I might download your program just to check it out.

jonsl01 wrote:

Viruses and spyware are a major problem these days. Download.com does checks before allowing a file to be published.

Could you point me to where it says that download.com has vetted the programs they have for download?

I once downloaded a program from SourceForge, and when I ran a virus scan it gave me an alert. I don't know whether it was a false positive or what, but I certainly didn't install that program.


Hello Kobo-Daishi,

Thanks for the explanation. I was wondering three years ago why no one had come up with something like this under what I thought was a relatively catchy name lol. It was a mistake on my part, but as this is something of a hobby project, it doesn't concern me that much. Name aside, I am sure the functionality is not difficult for anyone to understand: you paste in some Chinese/Hanzi text, and it spits out one or more forms of romanisation.

On the point of download.com: they seem to have taken out their little tag line to the effect of "virus and spyware free", probably for liability reasons (we all know that malware scanners aren't perfect). I would hazard a guess that they still scan any published software, as this is not difficult to do and entirely necessary to protect their reputation as a relatively safe place to download. My program had to go through an approval process before it was published, and the description was even edited slightly, so it was probably a manual review. For what it is worth (lol), I didn't insert any malicious payload into my program.

SourceForge is for open source, where security against malware traditionally comes from code review by volunteers. If it is not a very popular collaborative project, it is entirely possible for someone to slip a surprise or two in there.

Hope you gave the program a try. I am happy to hear any feedback.

Kind regards,

PS.

If anyone knows where I can get a database containing Hokkien, Hakka, Shanghainese, Yale Cantonese (or any other dialect/topolect) romanisations, please post it here or PM me. I am most keen to extend support to more romanisations. Thanks :)


http://humanum.arts....Lexis/lexi-can/ for modern consensus dictionary-standardised Cantonese (romanisation is selectable). Although I personally use http://www.cantonese....uk/dictionary/, which includes words.

http://210.240.194.9...chil/Taihoa.asp for Taiwanese Minnan words (not sure if it's Ministry of Education standard, in POJ); http://210.240.194.9...jitian/tgjt.asp for a character-based dictionary.

http://tatoeba.org/e...ghainese_to_ipa converts (some) 汉字 into IPA; not sure which Shanghainese register it is (old-style, new style) or how strict the IPA is (whether it does the tone sandhi across sentences or across 3-syllable or 2-syllable words or not).

http://www.shanghaidialect.com: same caveats here.

http://www.chinalang...eID=Hakka/Query for Hakka. No idea if it's Meixian or not...
