Converting the CC-CEDICT dictionary to Excel

Hi everyone,


CC-CEDICT is a Chinese-English dictionary that you can look words up in. It is available as a text file here (https://www.mdbg.net/chindict/chindict.php?page=cedict).


Unfortunately, I do not know how to parse the text file into nice columns in Excel. Does anyone know how to do this?


I am aware that Excel has a Text to Columns feature, but it doesn't seem advanced enough for the file structure used by CC-CEDICT.






Yeah, it should be TSV, but when I import it into Excel it looks like it's delimited by spaces, which is problematic. You could probably write a function that splits each line at the point where the pinyin starts (the pinyin is enclosed in square brackets). Let me know if you need help.
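
For what it's worth, here is a minimal sketch of that kind of function in Python. It assumes each entry looks like `Traditional Simplified [pinyin] /definition/.../`; the name `split_entry` and the sample line are only illustrations, not anything taken from the CC-CEDICT tools.

```python
def split_entry(line):
    """Split one dictionary line into (traditional, simplified, pinyin, definition).

    Assumes the layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
    """
    # Everything before the first " [" holds the two headwords.
    headwords, _, rest = line.strip().partition(" [")
    # The pinyin runs up to the first "] "; the remainder is the definition.
    pinyin, _, definition = rest.partition("] ")
    traditional, _, simplified = headwords.partition(" ")
    # Drop the leading and trailing slash; inner slashes still separate glosses.
    return traditional, simplified, pinyin, definition.strip("/")


# Illustrative line in the assumed format.
sample = "中國 中国 [Zhong1 guo2] /China/Middle Kingdom/"
print(split_entry(sample))
```

Splitting on the first " [" rather than on every space matters because both the pinyin and the definitions contain spaces of their own.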


It's not tsv.  The format is specified here.


Do you have access to an editor that handles regular expressions?  If not, download Notepad++.


Then open the CC-CEDICT file.


Then choose Search -> Replace (Ctrl+H).


Set the 'Search Mode' to 'Regular expression'.


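In the 'Find what' field type: ^([^ ]+) ([^ ]+) (\[[^\]]*\]) (.*)$

(probably best to copy/paste this from this post).


This is a regular expression that matches 4 fields: Traditional, Simplified, Pinyin, Definition. The pinyin group stops at the first closing bracket, so any bracketed pinyin that appears inside a definition stays in the Definition field.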


In the 'Replace with' field type: \1\t\2\t\3\t\4


This replaces each matching line with the individual fields separated by a tab character.


Then hit Replace All and wait 10-20 seconds, and you should be good to go.  Just save the file and import it directly into Excel.
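
If you would rather script the conversion than do it in an editor, the same idea can be sketched in Python. This is only a rough sketch under a few assumptions: the file is UTF-8, comment lines start with `#`, and every entry has the four fields described above. The names `parse_line` and `convert` and the file names in the usage note are made up for illustration.

```python
import csv
import re

# Four fields: Traditional, Simplified, [pinyin], /definition/.
# The pinyin group stops at the first "]" so that bracketed pinyin
# inside a definition is left in the definition field.
ENTRY = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/\s*$")


def parse_line(line):
    """Return (traditional, simplified, pinyin, definition), or None to skip."""
    if line.startswith("#"):  # assumed comment/header marker
        return None
    match = ENTRY.match(line)
    return match.groups() if match else None


def convert(src_path, dst_path):
    """Write the dictionary entries from src_path to dst_path, tab-separated."""
    # "utf-8-sig" writes a byte order mark, which usually helps Excel
    # detect the encoding; plain "utf-8" works if you set it on import.
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t")
        writer.writerow(["Traditional", "Simplified", "Pinyin", "Definition"])
        for line in src:
            row = parse_line(line)
            if row:
                writer.writerow(row)
```

You would call it as something like `convert("cedict_ts.u8", "cedict.tsv")`, with the first name replaced by whatever the unpacked download is called on your machine, and then import the result into Excel as a tab-delimited file.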

