Learning Chinese with the Word Sketch Engine


smithsgj

ironlady,

pricing is tough and will never make everyone happy. For some potential commercial users, there is a good match between what the SkE offers and what they want, so the price seems low. Where the match is less good, it seems high. Also, Taiwan is a middle-income, middle-cost country, and it would be appealing to vary the price according to the country, though the system could be exploited. It is something I am currently looking into for third-world countries.

Adam


Definitely true. Just hoping to give you some feedback from one user.

I'm not quite sure why you're bringing up Taiwan -- I'm in the US, actually. But I'm sure lots of learners and others in Taiwan might be interested in the tool. Maybe you should contact some of the Chinese language schools directly and see if they might be interested in offering it or in coming up with ways to use it in the classroom or as an adjunct to classroom instruction. "Get 'em while they're young" and all that... :mrgreen:


I haven't looked at the Word Sketch Engine, but there's a Chinese corpus searchable online here for free. And it's large. Here are the number of hits for some common characters:

我: 5687

被: 1283

Interested to hear what the Word Sketch Engine has that this online concordancer doesn't...

The corpus itself is under a free licence. But this is for 'non-profit-making research' - I'm not sure if this would cover studying Chinese. The links to various distributors don't seem to allow you to download it for free :roll: But there does seem to be a simple way to get it: just leave the search field in the online concordancer blank & it appears to return the whole thing...

With a suitable bit of software to strip out extraneous HTML, and a Chinese-compatible concordancing engine, we could all have a hefty concordance for offline use for free. (At least until they figure out what's going on and change the way the online concordancer works...)
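A rough sketch of the two pieces mentioned above, in Python with only the standard library. It is not tied to any particular concordancer: the sample HTML string, the names `TextExtractor`, `strip_html` and `kwic`, and the character-based context window are all assumptions made for illustration, and it should only be pointed at text the licence actually lets you use.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_html(html):
    """Return the visible text of an HTML string with the tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def kwic(text, keyword, width=10):
    """Return (left, keyword, right) tuples for every hit of keyword.

    Context is counted in characters, not words, so no segmentation
    of the Chinese text is needed for this simple view.
    """
    hits = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        hits.append((text[max(0, start - width):start], keyword, text[end:end + width]))
        start = text.find(keyword, end)
    return hits


# Hypothetical sample page, standing in for saved concordancer output.
sample = "<p>我被他打了。</p><script>var x;</script><p>我很好。</p>"
for left, key, right in kwic(strip_html(sample), "我", width=3):
    print(f"{left:>3} [{key}] {right}")
```

Working on characters rather than words sidesteps segmentation, which is the hard part of a real Chinese concordancer; anything word-based would need a segmenter on top of this.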


  • 2 weeks later...

Onebir, I gave that link on the previous page, along with the Academia Sinica one.

What's the difference? Well, SkE isn't a corpus, it's a corpus query tool. The Gigaword corpus (which is what SkE currently accesses in its Chinese implementation) contains a billion Chinese characters, or roughly 750 million words.

I looked at the LCMC, but couldn't immediately find a size noted. So I took a look at SkE, and found that 我 tokens total 333037, against the 5687 hits quoted above. So I assume that LCMC is relatively small.
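The arithmetic behind that guess, as a quick sketch. It uses only figures quoted in this thread (5687 and 333037 hits for 我, and roughly 750 million words for Gigaword); the variable names are made up, and the ratio is only a loose indicator of relative size, since how often 我 turns up depends on the kind of text in each corpus.

```python
# Hit counts for 我 as quoted in this thread.
hits_online_concordancer = 5687
hits_ske_gigaword = 333037

# Approximate Gigaword size in words, as stated above.
gigaword_words = 750_000_000

# How many times more 我 tokens SkE finds than the online concordancer.
ratio = hits_ske_gigaword / hits_online_concordancer

# Frequency of 我 per million words in Gigaword.
per_million = hits_ske_gigaword / (gigaword_words / 1_000_000)

print(f"ratio of hits: {ratio:.1f}")
print(f"我 per million words in Gigaword: {per_million:.0f}")
```

So SkE returns something like 59 times as many hits for this one character, which is a hint about scale and not a measurement of it.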

The Academia Sinica corpus, like LCMC, is fairly small. These corpora do, however, have the "balanced" advantage, whereas Gigaword contains only newswire texts. Tagging and segmentation are probably more reliable too: that makes sense for a smaller-scale corpus.

What SkE tries to offer is short summaries of word usage based on the totality of the corpus data, plus the ability to click on links and call up related concordance data. I haven't seen a tool that can quite do this elsewhere, even if some of the grammatical relations aren't quite right.

And of course it should be open source. But so should all software, and Adamk probably agrees with that in principle.

Research is funded. Funding pays for software licenses. Adamk and I aren't computer software magnates, but I get my funding from NSC and some of that goes towards our academic license. You guys want to use it? Use it! The password's up top!


  • 1 month later...

For those who were kind enough to complete the online pre-test a few weeks back:

If you didn't get our email, would you mind going to

http://myweb.scu.edu.tw/~mralice/SimplifiedlPostTest.htm

or

http://myweb.scu.edu.tw/~mralice/TraditionalPostTest.htm

to do the second part of the questionnaire?

Thanks very much!

