Chinese-forums.com
Free graded reader resource

IronMandarin

Hello everyone,

Inspired by the Inkstone app, which I found very useful (and free), I have dedicated some of my time here in China to creating a graded reader website, IronMandarin.

 

What you can do with this website:

  • read a text
  • choose a text by category or by HSK level
  • see how closely a text really matches a given HSK level
  • save words in your 'known' list, to give you an idea of where YOU stand relative to a given text (this is your personal profile)
  • save words in a 'to learn' list; I plan to add an SRS (Spaced Repetition Software) functionality, or maybe just allow export in a convenient format for Anki import?
  • see the most frequent unknown words in a text (so once you add 的, 一, 个... to your personal list, they no longer appear in the frequency listing)
  • switch the character set from simplified to traditional
  • analyze your own text:
    • without logging in, you can analyze a text, for example an email or an article; this should serve you better than Google Translate
    • after logging in, you can publish a text so it is saved on the website. It can be public or private (so you don't end up sharing, say, personal emails)
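
To make the 'known list' and 'most frequent unknown words' features above concrete, here is a minimal sketch of how such numbers could be computed. It is not the website's actual code: the function names, the tiny word lists, and the assumption that the text is already segmented into words are all illustrative.

```python
from collections import Counter

def coverage(tokens, known):
    """Return the fraction of tokens (counting repeats) found in `known`."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in known)
    return hits / len(tokens)

def top_unknown(tokens, known, n=5):
    """Return the n most frequent tokens that are not in `known`."""
    counts = Counter(t for t in tokens if t not in known)
    return counts.most_common(n)

# Hypothetical, already-segmented text and word lists (illustrative only).
tokens = ["我", "的", "朋友", "喜欢", "一", "个", "城市", "我", "喜欢", "成都"]
hsk_words = {"我", "的", "朋友", "喜欢", "一", "个"}  # stand-in for an HSK list
my_known = hsk_words | {"城市"}                      # personal 'known' list

hsk_cov = coverage(tokens, hsk_words)    # how well the text fits the level
my_cov = coverage(tokens, my_known)      # how well the text fits this reader
unknown = top_unknown(tokens, my_known)  # words worth learning first
```

The same `coverage` function serves both comparisons: pass an HSK word list to see how well a text matches a level, or a personal list to see where one reader stands.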

 

It is not fully developed, as I also study and work, but more functionalities are planned, such as frequency lists over one category or a set of articles, so you can focus on specific vocabulary, which I find especially useful for HSK 6.

 

I publish texts from different sources, and a few Chinese teachers I know here in Chengdu help me. You can also participate by publishing some texts.

 

The segmentation is automated, based on Jieba, but it makes a lot of mistakes, and I currently spend quite some time reviewing its output. Maybe I'll have a look at the code to patch a few common mistakes (for example, it doesn't split numbers, or a number and its measure word).
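
One possible shape for such a patch, as a sketch rather than anything implemented on the site, is a post-processing pass over the segmenter's output that splits a number from a trailing measure word. The numeral and measure-word sets below are assumptions and deliberately short, and a hard-coded rule like this can mis-split real words, so it would still need review.

```python
import re

# Assumed, deliberately small sets; a real list would be much longer.
NUMERALS = "0-9零一二两三四五六七八九十百千万亿"
MEASURE_WORDS = "个只本张条位件次"

# A run of numerals, optionally followed by exactly one measure word.
_NUM_MW = re.compile(rf"^([{NUMERALS}]+)([{MEASURE_WORDS}])?$")

def split_numbers(tokens):
    """Split tokens such as '三个' into ['三', '个']; leave others as-is."""
    out = []
    for tok in tokens:
        m = _NUM_MW.match(tok)
        if m and m.group(2):
            # Number followed by a measure word: emit them separately.
            out.extend([m.group(1), m.group(2)])
        else:
            out.append(tok)
    return out

fixed = split_numbers(["我", "有", "三个", "朋友"])
```

Tokens made only of numeral characters, such as 万一, are left untouched because no measure word follows them.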

 

I have had some questions about monetization before. I dedicate quite some time to this project, and I will need to pay the writers more if they spend more time on it.

I will set up a Patreon page pretty soon; hopefully it will be enough to make this project sustainable.

The website also offers the possibility of tailored analysis and advice to help a reader progress, based on their current word list. This and some premium content could form a premium membership in the future if a Patreon is not enough.

The website is not free merely to gain traction before switching to a full price higher than a real newspaper's. The core functionalities are intended to remain free, on a donation or freemium model.

 

Hopefully this will help all of us Chinese learners :)

Let me know if you have any suggestions!


imron
5 hours ago, IronMandarin said:

maybe I'll have a look at the code to patch a few common mistakes

Jieba is a statistics-based segmenter. It's not so much the code you need to patch as the probabilities used in the statistical model. You could probably hard-code a bunch of different exceptions, but the whole point of using a statistical model is to avoid the need to hard-code exceptions in the first place.
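
To illustrate why the probabilities matter more than the code, here is a toy segmenter in the spirit of a unigram word-frequency model. It is a sketch, not Jieba's implementation: the function name, the `max_len` limit, and the word counts are all invented for the example.

```python
import math

def segment(text, freq, max_len=4):
    """Pick the split of `text` with the highest product of word probabilities."""
    total = sum(freq.values()) or 1
    n = len(text)
    # best[i] = (log-probability of the best split of text[i:],
    #            end index of the first word in that split)
    best = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        candidates = []
        for j in range(i + 1, min(n, i + max_len) + 1):
            count = freq.get(text[i:j], 0)
            if j == i + 1:
                # Single characters always count at least once,
                # so some path through the text always exists.
                count = max(count, 1)
            if count > 0:
                candidates.append((math.log(count / total) + best[j][0], j))
        best[i] = max(candidates)
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j])
        i = j
    return words

# Invented counts: with '三个' fairly frequent, it is kept as one word.
freq = {"三": 100, "个": 100, "人": 100, "三个": 50}
before = segment("三个人", freq)

# Same code, lower count for '三个': the number and measure word now split.
tuned = dict(freq, **{"三个": 5})
after = segment("三个人", tuned)
```

Only the data changed between the two calls. In Jieba itself, comparable adjustments can be made through `jieba.add_word`, `jieba.suggest_freq`, or a user dictionary loaded with `jieba.load_userdict`.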

 

For what it's worth, I'm currently working on a statistical segmenter for Chinese Text Analyser that uses the Jieba data files for probabilities (the current version of CTA uses a first longest match algorithm, which is fast but even more inaccurate than Jieba).
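
For comparison, a longest-match segmenter can be sketched in a few lines. This assumes "first longest match" means greedy left-to-right matching against a dictionary; it is not CTA's actual code, and the dictionary and `max_len` value are made up.

```python
def longest_match(text, dictionary, max_len=4):
    """Greedy left-to-right segmentation: always take the longest dictionary word."""
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words

# Made-up dictionary for a classic ambiguous sentence.
dictionary = {"研究", "研究生", "生命", "起源"}
result = longest_match("研究生命起源", dictionary)
```

Here the greedy choice takes 研究生 first and leaves 命 stranded, where the intended reading is 研究 / 生命 / 起源. The algorithm never looks ahead or weighs alternatives, which is why it is fast but error-prone.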

IronMandarin

OK, thanks for the information, I'll look into that. It was not my priority, but it could be useful to dig a bit into the technique.

imron

Let me know if you have any questions about it, or can't figure out why it does something in a given way.  I've been going over it in detail recently so have a good idea of how most of it works.
