Chinese-Forums

Firefox Plugin: Chinese text annotation


trevelyan


Happy New Year,

Just wanted to spread the word about our new Firefox plugin that does on-click Chinese text annotation. Using it is as simple as pressing 'a' (adsotate) and then clicking on Chinese text anywhere in the browser. A small window will pop up (or refresh) with the sentence annotated newsinchinese-style: mouse over a word for pinyin and English.

The script handles traditional and simplified Chinese, and anyone inclined to hack on it should feel free to modify its behavior. Feedback is more than welcome.

INSTALLATION:

(1) Install or Upgrade to Firefox v. 1.5

http://www.mozilla.com

(2) Download Greasemonkey and restart Firefox:

http://greasemonkey.mozdev.org/changes/0.6.4.html

(3) Visit the following page and click on the "Install" button:

http://www.adsotrans.com/downloads/adso.user.js

USAGE:

(1) Visit any page with Chinese text.

(2) Press lowercase 'a'

(3) Click on any Chinese text.

A couple of more detailed notes:

(1) You don't need to press 'a' every time you annotate Chinese text, but you do need to press it again if you have pressed any other key in the meantime. Hitting any key but 'a' deactivates the service so that the plugin doesn't start popping up unexpectedly when people are typing and clicking.

(2) There are occasional places where the JavaScript seems to have trouble picking up the text, or where I find myself needing to click twice. This is especially the case with links and seems to be a Greasemonkey issue.

(3) Annotation doesn't work yet for webpages encoded in Big5.
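For anyone curious how the 'a' toggle in note (1) could work, here is a minimal sketch. It is not taken from adso.user.js; the function and variable names are made up for illustration.

```javascript
// Hypothetical sketch of the 'a' toggle; names are illustrative, not from adso.user.js.
function createAnnotationToggle() {
  let armed = false; // true only while the last key pressed was lowercase 'a'
  return {
    // Feed every key press here. Lowercase 'a' arms; anything else disarms.
    onKey(ch) {
      armed = ch === "a";
      return armed;
    },
    // Ask this on every click; annotate only when it returns true.
    shouldAnnotate() {
      return armed;
    },
  };
}

const toggle = createAnnotationToggle();
toggle.onKey("a");
console.log(toggle.shouldAnnotate()); // true: armed, and stays armed across clicks
toggle.onKey("x");
console.log(toggle.shouldAnnotate()); // false: typing anything else disarms
```

In a real page the two methods would be called from key press and click listeners. Because the comparison is against lowercase 'a' only, an uppercase 'A' (caps lock on) would leave annotation switched off.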


Hi,

I've downloaded everything you mentioned and followed your instructions. However, it didn't work when I pressed lowercase 'a'; nothing popped up. Is it because I am using the traditional character system, Big5?

Thanks!:)


(3) Visit the following page and click on the "Install" button:

http://www.adsotrans.com/downloads/adso.user.js

When I open this page, it opens as plain text in Firefox and there's no Install button.

I just see the code of the page. Should I enable something? I'm using Firefox 1.5.

// Adso GreaseMonkey Script
// version 0.1 BETA!
// 2005-04-22
// Copyright © 2005, David Lancashire
// Released under the GPL license
// http://www.gnu.org/copyleft/gpl.html
...

EDIT:

I got it to work, thanks for the links!

I went one directory up to http://www.adsotrans.com/downloads, right-clicked the link to the script, and selected Install User Script...


Semantic,

Can you see the Greasemonkey icon in the lower right-hand corner of your browser? If so, things should work as long as the script is installed.

Try visiting Sina.com or Xinhua and selecting one of their articles. Then try pressing 'a' and annotating some text on screen. Make sure your caps lock is off. If this doesn't work can you send me a private message through Chinese-Forums here with your OS, version of Firefox, and the URL of the page that won't allow the popup, ideally with the text that you're clicking on as well?

These days you should only run into Big5 if you're reading older pages from Taiwan. I sometimes have trouble getting the popup to appear when clicking on links, and think this is probably an issue with JavaScript in Firefox. I will try to iron out any bugs as they're found, though.


Semantic,

Glad it works for you now. There's no way I know of offhand to segment Chinese text using JavaScript (at least the implementation we have to work with), and there would be issues with doing so across multiple encodings. This makes it difficult to grab variable-length sentences. I'd be happy to include any improvements others come up with. And if most people would find it useful to have more text processed by default, we can do that. Anyone else have thoughts on this?

Regardless of what others want, you can customize the script for yourself by saving the file http://www.adsotrans.com/downloads/adso.user.js locally and editing the lines below. Open the revised version in Firefox and reinstall:
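As a starting point for anyone who wants to experiment with grabbing whole sentences, here is a rough sketch that skips real word segmentation and just cuts on sentence-ending punctuation around the clicked position. This is not how adso.user.js works; `extractSentence` and the punctuation list are assumptions for illustration, and the list is certainly incomplete.

```javascript
// Assumed, incomplete list of sentence-ending punctuation (full-width and ASCII).
const SENTENCE_ENDERS = "。！？!?";

// Hypothetical helper: return the sentence that contains the character at `offset`.
function extractSentence(text, offset) {
  // Walk left until the previous character ends a sentence (or we hit the start).
  let start = offset;
  while (start > 0 && !SENTENCE_ENDERS.includes(text[start - 1])) {
    start--;
  }
  // Walk right until we reach a sentence ender (or the end of the text).
  let end = offset;
  while (end < text.length && !SENTENCE_ENDERS.includes(text[end])) {
    end++;
  }
  if (end < text.length) {
    end++; // keep the closing punctuation mark
  }
  return text.slice(start, end);
}

const sample = "如果您反对日本成为联合国安理会常任理事国,请在此签名。谢谢!";
console.log(extractSentence(sample, sample.indexOf("签")));
console.log(extractSentence(sample, sample.indexOf("谢")));
```

The indexing here is by UTF-16 code unit, which is fine for common Chinese characters but would need more care for characters outside the Basic Multilingual Plane.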

// Customize the Popup
var width_of_popup = 500;
var height_of_popup = 150;
var surrounding_text_length = 30;

Note that you'll want a bit of spacing to appear below the text so that the popups pop downwards rather than upwards into the menu-bar.
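To give a feel for what these three settings might control, here is a small sketch. It assumes that `surrounding_text_length` is the number of characters kept on each side of the click and that the width and height feed a `window.open()` features string; neither assumption is confirmed by the script itself, and `surroundingText` and `popupFeatures` are hypothetical helpers.

```javascript
// Values copied from the customization block in the post above.
var width_of_popup = 500;
var height_of_popup = 150;
var surrounding_text_length = 30;

// Hypothetical helper: keep up to `radius` characters on each side of the clicked character.
function surroundingText(text, offset, radius) {
  var start = Math.max(0, offset - radius);
  var end = Math.min(text.length, offset + radius + 1);
  return text.slice(start, end);
}

// Hypothetical helper: build a features string of the kind window.open() accepts.
function popupFeatures(width, height) {
  return "width=" + width + ",height=" + height;
}

console.log(surroundingText("abcdefghij", 5, 2)); // defgh
console.log(popupFeatures(width_of_popup, height_of_popup)); // width=500,height=150
```

Under this reading, raising `surrounding_text_length` sends more text per click, and the popup height would need to grow with it so the extra lines fit.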


Yeah, thanks, that's a great tool!

A couple of questions:

1) Will there eventually be Big5 capability? Most of my Taiwan sites, like yahoo.tw, use Big5.

2) Does anyone know any traditional character sites (non-Big5) for news, sports, movies, etc.?

Thanks again!


This is very cool. I found a small problem, though. Reading the first paragraph of http://www.voanews.com/chinese/w2006-01-02-voa8.cfm:

北韩誓言要在新的一年里坚持其“军事第一”的政策,只字不提为解除朝核项目而举行的进展缓慢的六方会谈。另外北韩还表示要发展农业,振兴经济。

and clicking on it, the 为 doesn't make it to the popup window, even if I click right on it.


An excellent tool (once I realised that I needed to be running Firefox not IE to do the installation ... doh!?)

Will I still be able to use it if I have no internet connection? (only because I want to show it to other people but I don't think they use Firefox).

Any plans to (1) make it include more than one sentence? (2) not require a separate pop-up window?


Hann -- a bit of progress on Big5, although I'm still having problems passing Big5 content to the server via the GET method, so I'm not holding my breath....

Beirne -- VOA is blocked from Beijing so I'm not in a good position to replicate the problem. The 为 seems to annotate fine in the text posted on this website, though, so I don't think it's an issue with the server.... If you still have the problem, can you forward me the actual webpage as an attachment via email?
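For readers wondering what passing Chinese text via GET involves on the JavaScript side, here is a small sketch. The endpoint URL and the `text` parameter name are placeholders, not the real Adso server interface. The one thing that is standard behavior is that `encodeURIComponent` always percent-encodes a string as UTF-8 bytes, whatever encoding the page itself was served in; whether that is related to the Big5 trouble mentioned above is not something this thread establishes.

```javascript
// Hypothetical helper: append the clicked text to a GET URL.
// "text" is a made-up parameter name; the real server may expect something else.
function buildAnnotateUrl(baseUrl, text) {
  return baseUrl + "?text=" + encodeURIComponent(text);
}

// example.com stands in for the real annotation server.
console.log(buildAnnotateUrl("http://example.com/annotate", "为"));
// http://example.com/annotate?text=%E4%B8%BA
```

A server that expects Big5 bytes in the query string would therefore have to convert from UTF-8 on its side, or the script would need its own Big5 encoder.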

BFC_Peter -- there's no way to use this offline right now. I'd happily make Adso available for desktop use if someone wanted to develop a GUI that could handle the text-under-mouse recognition and display features, and link into the C++ source code for the backend processing.

It should be possible to make JavaScript display the results in an on-screen popup rather than open a new window. I'm not a JavaScript/CSS expert, so it would speed up development if someone familiar with JavaScript and DHTML would take a shot at setting up the basic functionality and pass along some working code.
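To get the ball rolling on an in-page popup, here is a rough sketch, offered as a starting point rather than working plugin code. `popupPosition` and `showInlinePopup` are hypothetical names, the 10-pixel margin is arbitrary, and the box is filled with plain text only; real annotated output would need its own markup.

```javascript
// Hypothetical helper: place the box below the click when it fits, otherwise above it.
function popupPosition(clickX, clickY, boxWidth, boxHeight, viewportWidth, viewportHeight) {
  const margin = 10; // arbitrary gap between the click and the box
  const left = Math.max(0, Math.min(clickX, viewportWidth - boxWidth));
  const fitsBelow = clickY + margin + boxHeight <= viewportHeight;
  const top = fitsBelow ? clickY + margin : Math.max(0, clickY - margin - boxHeight);
  return { left: left, top: top };
}

// Browser-only part: create a fixed-position box instead of opening a new window.
// clickX and clickY are viewport coordinates, such as event.clientX and event.clientY.
function showInlinePopup(text, clickX, clickY) {
  const width = 500;
  const height = 150;
  const pos = popupPosition(clickX, clickY, width, height, window.innerWidth, window.innerHeight);
  const box = document.createElement("div");
  box.style.position = "fixed";
  box.style.left = pos.left + "px";
  box.style.top = pos.top + "px";
  box.style.width = width + "px";
  box.style.height = height + "px";
  box.style.overflow = "auto";
  box.style.background = "#ffffe0";
  box.style.border = "1px solid #999";
  box.textContent = text; // plain text only in this sketch
  document.body.appendChild(box);
  return box;
}

console.log(popupPosition(100, 100, 500, 150, 1024, 768)); // { left: 100, top: 110 }
console.log(popupPosition(900, 700, 500, 150, 1024, 768)); // { left: 524, top: 540 }
```

Keeping the position arithmetic in its own function makes it easy to check outside a browser; only `showInlinePopup` touches the DOM.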

You can change the amount of text displayed by editing the Greasemonkey script "adso.user.js". The lines to edit are below. The current values should annotate the equivalent of about three lines:

// Customize the Popup
var width_of_popup = 500;
var height_of_popup = 157;
var surrounding_text_length = 46;


Beirne -- VOA is blocked from Beijing so I'm not in a good position to replicate the problem. The 为 seems to annotate fine in the text posted on this website, though, so I don't think it's an issue with the server.... If you still have the problem, can you forward me the actual webpage as an attachment via email?

I wondered if there would be a problem with VOA. The problem shows up for me in the posted text too, though. I should add that I'm doing this on Firefox from a Mac running OS X, so this may be harder to test and resolve. I did think of something else to try, though. I copied the text from the adsotrans window and pasted it into Wenlin. The 为 showed up fine there, so the problem is a display issue rather than a parsing one.

I tried a page from within China, http://dailynews.sina.com.cn/comment/. The second line from the top of the page, which I have included below, has 为. I can reproduce the problem reading the text off of this posting so I guess the page isn't really necessary but there for reference. I have attached a screen print of the popup window. Both 为 and 联 are missing from the display, but if I wave the cursor over 合 it includes 联 as part of the word it defines. 成为 behaves similarly, only the 为 is missing from both the display and the tooltip. It is there if I look at the page source, though.

I then thought of one more test. I went to the adsotrans web page and tried pasting the line there. When pasting it into the text box on the first page I found that 为 and 联 are both replaced by whitespace in the box. When I adsotate it, though, they show up fine on that page, both in the regular display and the tooltip.

如果您反对日本成为联合国安理会常任理事国,请在此签名。谢谢!

I also checked the character encoding setting in Firefox. It is set to UTF-8. I tried Simplified Chinese but got gibberish, which I assume is expected.

I understand that the Mac isn't real common, but if you have anything you want me to check to help out I'd be happy to do so.
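One way to double-check that a character such as 为 really is present in a string, separately from whether the browser manages to draw it, is to print the string's code points. The sketch below is just a diagnostic idea with a made-up helper name, `codePoints`, and is not part of the Adso script.

```javascript
// Hypothetical diagnostic helper: list each character of a string as U+XXXX.
function codePoints(text) {
  return Array.from(text).map(function (ch) {
    return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  });
}

// If U+4E3A (为) appears in this list but not on screen, the text data is intact
// and the problem lies in rendering or fonts.
console.log(codePoints("成为").join(" "));
```

Running this on the text pulled from the popup would show whether the missing characters were dropped before display or only fail to render.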



Thanks for the pic, Beirne. I don't have a Mac, so debugging it myself or trying to figure out a workaround using other JavaScript is a bit hard.

And I'm honestly not sure there is anything we can do here, since it looks like this is an issue with Firefox compatibility across different OSes. I'll pass word along to the Firefox project, though, in the hope they can fix it, along with some sample code that works under Windows and produces this error on the Mac.


This is very cool. I found a small problem, though. Reading the first paragraph of http://www.voanews.com/chinese/w2006-01-02-voa8.cfm:

北韩誓言要在新的一年里坚持其“军事第一”的政策,只字不提为解除朝核项目而举行的进展缓慢的六方会谈。另外北韩还表示要发展农业,振兴经济。

and clicking on it, the 为 doesn't make it to the popup window, even if I click right on it.

It works for me, Beirne, including the 为 character. I have Windows XP, Firefox 1.5.

Cool features, thanks guys! :)

