
automating tone marks over Chinese characters


trevelyan


This cool feature owes its existence to mixing beer with Roddy's superior CSS skills. You can see it in action here, here, and here. Everything is CSS-powered, so it works slightly better in Firefox than in IE, but it looks good enough in both. :)

http://www.adsotrans.com/tone.html

Check it out, and let me know of any missing features. Spacing between words? Control over font sizes? Automatic tone adjustment (一个, 不要)? If there's demand I can implement these sorts of things. Squeaky wheel and grease, etc.

Two issues:

(1) No automatic tone adjustment yet. So two third tones are... two third tones. And don't expect 不 or 一 to change based on their context yet.

(2) The pinyin for many entries is computer generated, so there may be cases where characters are displayed incorrectly. This just means that the database is guessing. Corrections can be made at the usual place and will take effect immediately.
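For readers wondering what the automatic tone adjustment in issue (1) would involve, here is a minimal sketch in Python. It is not Adso's code. It assumes the input is a list of (character, numbered pinyin) pairs in citation tones, and the function names `split_tone` and `apply_sandhi` are made up for illustration. The rules are simplified: a third tone before another third tone becomes second, 不 becomes bu2 before a fourth tone, and 一 becomes yi2 before a fourth tone and yi4 before a first, second, or third tone. Real speech groups longer third-tone runs by phrasing, and 一 keeps its first tone in ordinals and numbers, which this sketch ignores.

```python
def split_tone(syllable):
    """Split 'hao3' into ('hao', 3); a missing digit is treated as neutral tone 5."""
    if syllable and syllable[-1].isdigit():
        return syllable[:-1], int(syllable[-1])
    return syllable, 5


def apply_sandhi(pairs):
    """Return new (character, pinyin) pairs with simplified sandhi applied.

    `pairs` holds citation tones, e.g. [("你", "ni3"), ("好", "hao3")].
    Each decision looks only at the citation tone of the next syllable.
    """
    result = []
    for i, (char, syllable) in enumerate(pairs):
        base, tone = split_tone(syllable)
        next_tone = split_tone(pairs[i + 1][1])[1] if i + 1 < len(pairs) else None
        new_tone = tone

        if char == "不" and tone == 4 and next_tone == 4:
            new_tone = 2                  # 不 + 4th tone -> bu2
        elif char == "一" and tone == 1 and next_tone is not None:
            if next_tone == 4:
                new_tone = 2              # 一 + 4th tone -> yi2
            elif next_tone in (1, 2, 3):
                new_tone = 4              # 一 + 1st/2nd/3rd tone -> yi4
        elif tone == 3 and next_tone == 3:
            new_tone = 2                  # 3rd + 3rd -> 2nd + 3rd

        # Leave the syllable untouched unless a rule actually fired.
        result.append((char, syllable if new_tone == tone else f"{base}{new_tone}"))
    return result
```

With this, `apply_sandhi([("一", "yi1"), ("个", "ge4")])` gives yi2 ge4, and 不要 comes out as bu2 yao4.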


  • 1 month later...

Hey, do you know of any free desktop software that does what your software does on the web, i.e. converts Chinese text to HTML with either mouseover or integrated pinyin? I'd like to post some English translations, and it'd be nice if I could have them side-by-side with the Chinese original and pinyin.
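As a rough idea of the mouseover variant, here is a minimal Python sketch. It is not how Adsotrans does it. It assumes a tiny hypothetical character-to-pinyin table (`SAMPLE_PINYIN`) and a made-up function name (`annotate`), and it simply wraps each known character in a `<span>` whose `title` attribute the browser shows on hover.

```python
from html import escape

# Hypothetical mini-dictionary; a real tool would query a full database.
SAMPLE_PINYIN = {"你": "nǐ", "好": "hǎo", "中": "zhōng", "文": "wén"}


def annotate(text, pinyin=SAMPLE_PINYIN):
    """Wrap each known character in a <span> whose title shows its pinyin on hover."""
    parts = []
    for char in text:
        reading = pinyin.get(char)
        if reading is None:
            parts.append(escape(char))  # unknown characters pass through, escaped
        else:
            parts.append(f'<span title="{escape(reading)}">{escape(char)}</span>')
    return "".join(parts)
```

Looking up one character at a time ignores word boundaries and characters with more than one reading, so a real converter would segment the text into words first.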


You could use Adsotrans.com to produce the annotated content, and then save the webpage (File > Save Page As).

If you then wanted to post it here or on any other forum, though, you'd run into problems, as the HTML wouldn't be allowed. What I would like to do is integrate Adsotrans with this site, so that each post would have an 'Annotate with Adsotrans' link or something.

I'm holding off on this until the new release of phpBB though.


I did manage to post an adsotrans-enhanced version of a Wang Xiaobo article I translated, using the method Roddy just described. The process is slow, however. There seems to be a character limit for each run of adsotrans on the web (maybe 100?). Great application nevertheless.

Bilingual with partial mouseover pinyin, shown in browser status bar (courtesy of Adsotrans.com):

http://csua.berkeley.edu/~mrl/WangXiaoBo/WhyDoIWrite.adso.HTM

Without adsotrans enhancement:

http://csua.berkeley.edu/~mrl/WangXiaoBo/WhyDoIWrite.HTM


Hey, looks like adsotrans has competition. It works great with a China Times web page I looked at.

Here's the above Wang XiaoBo article processed through Popjisyo.com.

http://www.popjisyo.com/WebHint/Portal_e.aspx

POPjisyo provides a web based pop-up dictionary for Japanese, Chinese, Korean and other languages.

This service enables surfing the web with pop-up hints to aid understanding of foreign language documents and web sites.

Enter a URL or text you want to translate below, then simply move your mouse over the words you want to lookup!

(All this, without leaving the comfort of your browser or installing anything... and for free.)


Technically, anyone can download Adso and get it running on their own computer... the source code and database are available for download. The process is not challenging under Unix, but is probably too difficult for Windows users who have not compiled software before. If there is demand, perhaps we'll be able to make a Windows binary available sometime in the future.

http://www.adsotrans.com/readme.html

Unless anyone is in a hurry though, I would hold off until after the Chinese New Year. The version currently available for download is a month old and uses an older version of the database. I'll be uploading the newer version when I get back from Tangshan after the New Year holiday, probably around the 12th.

The Popjisyo site looks interesting. There is also a similar service at http://www.rikai.com for those interested. The big difference is that popjisyo and rikai use the public domain CEDICT dictionary (26,000 words), don't differentiate between parts of speech and offer no way for users to add content to the dictionary. Adso does all three. We currently have about 130,000 defined words, attempt to offer selective definitions and accept user submissions as well. And anyone can download the software and database, tinker with the code, etc.

Gato --> our server doesn't currently have a limit on the content it accepts, but it is located in Beijing, so requests may be timing out because of lag. It's also possible that part of your translation was being dropped because it used an uncommon word encoded in Guobiao but not yet in the database. That is a known error we should fix soon. If you can send me a copy of whatever text you were trying to process, I can take a look.


trevelyan,

It's a great project :clap and I'm looking forward to the day when it can be downloaded and installed in Windows for ordinary users (like me).

"our server doesn't currently have a limit on the content it accepts, but is located in Beijing so there may be servers timing-out because of lag."

Just a brief try with the content of the first post in this thread:

http://www.chinese-forums.com/viewtopic.php?t=4249

I noticed that Adsotrans didn't react at all until I had deleted a significant part of the content. So, there must be a limit set somewhere there. This shouldn't be much of a problem as long as there is an indication somewhere for the user to know that the limit has / has not been reached.

Cheers,


http://www.rikai.com works well. It's very similar to http://www.popjisyo.com, which according to the author came later. I see that the author supports making source code freely available, but it seems that Rikai itself is not open source.

A better and user-updateable dictionary and open source would make Adso more attractive than Rikai, everything else being equal.

Have you thought about using SQLite instead of MySQL? Or maybe just isolating the database functions enough so that another database can be used if need be? SQLite apparently is much easier to install and maintain, and therefore might be a better choice for a desktop user. http://www.sqlite.org/cvstrac/wiki

At first I thought, why not just host it on a remote server? But if you had it on your desktop, you could use it for texts on your own disk and for times when you don't have a connection to the net. So that might still be worthwhile, though there are already programs like Wakan that do something similar: http://wakan.manga.cz/ But it's not open source.


Hashiri Kata,

Thanks for pointing out the issues you've had... the first was chinese-forums giving our server 403 error messages whenever we requested a URL from it. This was a problem with the program we relied on to fetch remote HTML. It is fixed.

The limit on the size of the textarea seems to be a problem with Internet Explorer. I'll take a look later tonight to try and find a workaround. It's not much of an answer, but you can always try submitting content using another browser for now. Firefox doesn't seem to have a problem.

Gato,

My impression is that Popjisyo and Rikai are better for beginners, since they offer a list of all possible definitions for the words flagged. Both seem heavily dependent on CEDICT and Unihan. Our project keeps a separate license, but anyone contributing data is welcome to contribute it under the CEDICT license (contributors are marked in the database). Some of the content we have contributed:

http://www.adsotrans.com/cedict.txt

The database licenses are very similar, so I don't think there is any conflict between the projects. The only thing CEDICT explicitly allows that we don't is commercial web services using the content.

SQLite is a great idea... our database calls are isolated to a special class, so if it is easy to dump our data into an SQLite database, we should be able to put out a version using it. Installing MySQL is a huge hurdle... so thanks for the link.
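To illustrate what keeping the database calls in one class buys you, here is a minimal Python sketch using the standard `sqlite3` module. It is not Adso's actual class or schema: the class name `DictionaryStore`, the `entries` table, and its columns are all assumptions made up for the example. The point is that the rest of the program only calls `add` and `lookup`, so swapping the backend means rewriting this one class.

```python
import sqlite3


class DictionaryStore:
    """Hypothetical wrapper that keeps every SQL statement in one place."""

    def __init__(self, path=":memory:"):
        # SQLite needs no server: the database is a single file (or memory).
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS entries ("
            " word TEXT NOT NULL, pinyin TEXT NOT NULL, definition TEXT NOT NULL,"
            " PRIMARY KEY (word, pinyin))"
        )

    def add(self, word, pinyin, definition):
        """Insert one entry, overwriting an existing (word, pinyin) pair."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO entries (word, pinyin, definition)"
                " VALUES (?, ?, ?)",
                (word, pinyin, definition),
            )

    def lookup(self, word):
        """Return a list of (pinyin, definition) tuples for `word`."""
        rows = self.conn.execute(
            "SELECT pinyin, definition FROM entries WHERE word = ? ORDER BY pinyin",
            (word,),
        )
        return rows.fetchall()

    def close(self):
        self.conn.close()
```

Usage would look like `store = DictionaryStore("adso.db")`, then `store.add("你好", "ni3 hao3", "hello")` and `store.lookup("你好")`; the file name here is also just a placeholder.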
