Chinese-Forums

Ideas wanted: How would you improve this study process for new vocab&syntax acquisition?


Tamu


Flickserve, where do you source Chinese .srt files from?

Google, and then looking around various sites to see if a Chinese version is available. Unfortunately, my Chinese is not yet good enough to search on Chinese websites.

*Edit* I just tried it again.

I think it was this site.

http://www.yifysubtitles.com/movie/brave-2012-1080p

Yup. I have Brave (also bought the DVD in China LOL). I downloaded the Chinese srt and quickly examined it. It matches the audio almost perfectly. It might need an adjustment of half a second or so, but I would need to check with a trial run through subs2srs. There are some slight differences, e.g. the audio says 不是我的错 where the subtitle reads 不能怪我.

I have another Chinese movie that I ripped, 重返20岁. I just tried "重返20岁 subtitles" in Google and found http://subhd.com/a/315908

My Chinese must have improved because I was able to download the .rar and unzip it. I opened up the .srt file and it's definitely a >95% match, and the timings are very accurate.


If there is a problem opening the .srt file in Notepad, try steps 1-7 of these instructions:

https://forum.videolan.org/viewtopic.php?t=80401#p316916

 

PROBLEM:
Chinese subtitles don't display properly in VLC, or funny fonts come out, and no matter what settings you change in VLC it won't work.

BACKGROUND:
The truth is that my computer (Windows 7) is configured to display and input Simplified or Traditional Chinese.
So I can actually type Chinese in a blank Notepad file that I created and view it correctly. YES. TRUE.
But the .srt files downloaded from shooter.cn won't display correctly in Notepad, and show funny fonts in most cases.

MY SOLUTION using MICROSOFT WORD 2010:
0. Make sure you back up the .srt file.
1. Right-click on the Chinese .srt file > Open with > Microsoft Word.
2. Choose an encoding that displays the Chinese correctly in the preview window, and click OK.
(In my case CHINESE TRADITIONAL/SIMPLIFIED auto-correct)
3. Right-click again on the Chinese .srt file > Open with > Notepad.
(Now you should have a Word window displaying the subtitles in readable Chinese, and a Notepad window displaying the subtitles in unreadable funny fonts.)
4. Select All (CTRL+A), then Copy (CTRL+C) the content in Word.
5. Click the Notepad window, Select All (CTRL+A), then Paste (CTRL+V).
(You should now be able to view the subtitles in readable Chinese.)
6. Go to the open Word window and close the program without saving.
7. Go to the open Notepad window, choose SAVE AS, and select [UTF-8] in the encoding drop-down (it should be next to the 'SAVE' button at the bottom). SAVE and CLOSE.

8. Open VLC and go to TOOLS > PREFERENCES > SUBTITLE & OSD.
- Change the Default Encoding to [Universal (UTF-8)].
- Change the Font to one that is meant for Chinese, e.g. for Traditional Chinese I use [SimHei].
9. Still in the TOOLS > PREFERENCES window, click the ALL selection in the bottom-left corner.
The menu on the left changes to show more options.
- Double-click "Video" > click 'Subtitle/OSD'.
- On the right side, change the "Text rendering module" to [FREETYPE2 FONT RENDERER].
10. Click SAVE and close VLC for the changes to take effect.

NOW re-open VLC and the chinese subtitle should work.
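For anyone who would rather script the re-encoding than go through Word and Notepad, here is a minimal Python sketch of the same idea as steps 1-7: read the file in its original Chinese encoding and save it again as UTF-8. The function names and the list of candidate encodings are my own assumptions, not anything from the VLC post. Note that a wrong guess can sometimes decode without raising an error, so check the result by eye and keep a backup, as in step 0.

```python
from pathlib import Path

# Assumption: the file is either UTF-8 already or in a common Chinese legacy
# encoding. The order matters, because the first candidate that decodes wins.
CANDIDATE_ENCODINGS = ("utf-8-sig", "gb18030", "big5")


def decode_subtitle_bytes(raw, candidates=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


def convert_srt_to_utf8(source, target):
    """Read `source` in a guessed encoding and write `target` as UTF-8."""
    text, encoding = decode_subtitle_bytes(Path(source).read_bytes())
    # Writing bytes keeps the original line endings untouched.
    Path(target).write_bytes(text.encode("utf-8"))
    return encoding
```

Calling convert_srt_to_utf8 with a hypothetical "movie.chs.srt" as the source and "movie.utf8.srt" as the target would leave the original file in place and return the encoding that was guessed.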

 


Blu-ray version of "Beijing Meets Seattle II: Book of Love (Finding Mr. Right 2)": the Chinese .srt matches up. I also found some English subtitles with almost exactly the same timings (over 2000 sentences in the film).

(First two minutes of Tang Wei getting thrown out is in Cantonese).


  • 6 months later...
  • 4 years later...

I see that this is a very old thread, but I assume I'm not the only person who reads old threads and learns from them. Here's something that eliminates the most laborious part of the overall approach in this thread. I hope it makes that approach accessible to many more people.

 

 

I use Excel to make the timing of an English SRT file match the timing of a Chinese SRT file exactly. (If you don't have Excel, you can download and install the free open-source OpenOffice software at https://www.openoffice.org/. It's as fantastic as Microsoft Office for everything I do and a lot more.) There's still some work involved, but it's a tiny fraction of the work of editing the text files directly.

 

 

I’ve attached an example Excel file for you to start with.

 

 

Step 1

In a text editor (like Notepad in Windows or TextEdit in MacOS), open the Chinese SRT file. Select all and copy. Paste it into column A, row 1 of the Excel spreadsheet. It automatically gets pasted into as many rows in column A as it needs.

 

 

Step 2

In the text editor, open the English SRT file. Select all and copy. Paste it into column B, row 1 of the Excel spreadsheet. It automatically gets pasted into as many rows in column B as it needs. Then, the Chinese subtitles in column A are side-by-side with the English subtitles in column B.

 

 

Edit 4/6/22: Then, I search for = (equals sign) and replace all with ' (single quote). For subtitles that start with - (a single dash), Excel thinks it's a formula, so it adds = before the -. To get those subtitles to show up in Excel, ' needs to be put in front of the -. That turns the contents of the cell into text, which is what we want.

 

 

Now, let me take a moment to describe the subtitle format in an SRT file. Two sequential subtitles have the format below. Each subtitle is at least 4 rows, including the blank line at the end. The first row is the sequential subtitle number. The second row is the start and end timing. (FYI: The start and end timings each have the format hours:minutes:seconds,milliseconds.) The third row is the subtitle text. Sometimes the subtitle text is more than one row. Again, the last row is a blank line, before the next sequential subtitle starts.

 

 

1
00:00:01,000 --> 00:00:03,000
Subtitle 1 text

2
00:00:03,000 --> 00:00:05,000
Subtitle 2 text
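
If you prefer to work on the SRT structure in code instead of a spreadsheet, the format described above can be read with a short Python sketch like the one below. The Cue class and the parse_srt function are names I made up for illustration. It assumes well-formed blocks (number row, timing row, one or more text rows) separated by blank lines.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    number: str   # sequential subtitle number, kept as text
    start: str    # e.g. "00:00:01,000"
    end: str      # e.g. "00:00:03,000"
    text: list    # one entry per row of subtitle text


TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(content):
    """Split SRT content into cues; blocks without a timing row are skipped."""
    cues = []
    normalized = content.replace("\r\n", "\n").lstrip("\ufeff").strip()
    # Subtitles are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 2:
            continue
        match = TIMING.match(lines[1].strip())
        if not match:
            continue
        cues.append(Cue(lines[0].strip(), match.group(1), match.group(2), lines[2:]))
    return cues
```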

 

 

Step 3a

Make sure each Chinese subtitle and its corresponding English subtitle are in the same set of rows. For example, there are times when a Chinese subtitle may have one row of text and its corresponding English subtitle may have two rows of text. Then, a blank line needs to be added after the Chinese subtitle. Do this by inserting a cell in column A and choosing the option to shift the remaining cells in column A down, so the next Chinese subtitle is in the same set of rows as its corresponding English subtitle. Repeat as necessary until all of the subtitles are perfectly lined up.

 

 

Step 3b

The number of Chinese subtitles may differ from the number of English subtitles for a variety of reasons, such as different people, even within the same company, creating the Chinese subtitles vs. the English subtitles. The differences should be very minor. So, if there are extra English subtitles, then you can just delete them. If there are missing English subtitles, then you can just ignore them.

 

 

This brings up the question of what to do with subtitle numbering. As far as I can tell, they don’t actually need to be sequential. What matters is the timing. So, you shouldn’t need to do anything with subtitle numbering.

 

 

Step 4

In column C, row 1, I put in the formula below. After you have all of the Chinese and English subtitles perfectly lined up, copy and paste the cell C1 into all of the rows of column C, all the way through the last subtitle. Excel will automatically change the row number in the formula to the applicable row number.

 

 

=IF(B1="","",IF(MID(B1,3,1)=":",A1,B1))

 

 

Here’s what the formula is doing. First, it checks for a blank line in the English subtitle. If it’s a blank line, then it keeps it. Second, it checks for timing in the English subtitle. If it’s timing, then it puts in the timing for the Chinese subtitle, instead of the English subtitle. For the rest of the English subtitle, it keeps it.
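
For anyone who wants to do the same thing outside Excel, here is a minimal Python sketch of what the formula does, row by row. The function names are mine. It assumes the two lists of rows have already been lined up as in Step 3a, and it has the same limitation as the formula: an English text row whose third character happens to be a colon would be treated as a timing row.

```python
def retime_row(chinese_row, english_row):
    """Mirror of =IF(B1="","",IF(MID(B1,3,1)=":",A1,B1)) for one pair of rows."""
    if english_row == "":
        return ""                      # blank line stays blank
    if english_row[2:3] == ":":        # MID(B1,3,1) is the third character
        return chinese_row             # timing row: take the Chinese timing
    return english_row                 # number or text row: keep the English


def retime_rows(chinese_rows, english_rows):
    """Apply retime_row to every pair of aligned rows (columns A and B)."""
    # zip stops at the shorter list, so both lists should have the same length.
    return [retime_row(zh, en) for zh, en in zip(chinese_rows, english_rows)]
```

The returned list corresponds to column C, so joining it with newlines gives the content of the retimed English SRT file.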

 

 

Step 5

Select and copy all of the subtitles in column C. Paste it in a new text file. Save. After saving, you may need to adjust the file name, so it ends with .srt. (Notice that there’s a dot before the srt.) You now have an English SRT file that has the exact same timing as the Chinese SRT file.

 

 

 

 

Here are a few things that may or may not come up. I haven’t used subs2srs yet. I don’t know when I’ll get around to using it, because I’m learning Chinese as a side hobby.

 

 

As far as I can tell, it’s okay to have extra blank lines, e.g., back-to-back blank lines at the end of a subtitle. If subs2srs doesn’t like it, then you’ll have to manually delete the extra blank lines in either the text editor or Excel. There shouldn’t be too many instances of this.
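
If the extra blank lines do turn out to be a problem, they don't have to be deleted by hand. A minimal sketch in Python, assuming the subtitle rows are already in a list of strings (the function name is hypothetical):

```python
def collapse_blank_lines(lines):
    """Drop leading blank lines and reduce runs of blank lines to a single one."""
    result = []
    for line in lines:
        is_blank = line.strip() == ""
        if is_blank and (not result or result[-1] == ""):
            continue                   # skip a leading or repeated blank line
        result.append("" if is_blank else line)
    return result
```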

 

 

What if a Chinese subtitle is in one row and its corresponding English subtitle is in two rows? Does subs2srs put both rows of the English subtitle together to equal the Chinese subtitle? Or, does it only think the first row belongs to the Chinese subtitle? I assume it's smart enough to do the former. But, if it isn't, then you'll have to manually combine the two rows of English subtitles. There probably aren't too many instances of this. Or, you can just ignore it and live with it.

 

 

Higher quality English subtitles are important. I personally like English subtitles from Viki to go along with Chinese subtitles from iQiyi, WeTV/Tencent, MGTV (MangoTV) or Netflix. Subtitles from Viki, iQiyi, WeTV and MGTV can easily be downloaded using the website https://downsub.com/. This type of mixing-and-matching involves a lot more manipulations in Excel. But, I think it’s worth it, because of the much greater accuracy of the English subtitles from Viki.

 

 

Happy learning Chinese! Best of luck with trying the overall approach in this thread!

sub2.xls


  • 3 weeks later...

 

For getting the srt into a nice list, I have been using subs2srs. When you process an srt (with audio), it gives a .tsv file with the data already arranged in lines, without needing a formula. I then import this into Excel, which gives the option of separating the columns by TAB. It's then fairly straightforward to manually check the corresponding Chinese and English sentences within Excel.

 

For me, I use this .tsv file to import into Anki. I found the easiest way is to cut and paste the data from Excel back into the original .tsv file for the import into Anki. Now, that does mean each line has to match the timing in your downloaded video. I found that for films, the timing of the Chinese srt is more accurate than that of the English srt.

 

There's another program called subtitle editor which lets you make micro-adjustments to srt timing if you already have the srt. I have only used it very briefly, to recheck the timings of an srt which was off by 30 seconds. Fortunately, all I had to do was enter +30 secs into the program and it changed all the timings accordingly.
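
The same kind of whole-file time shift can also be sketched in a few lines of Python, in case you want to do it without installing anything else. This is only an illustration with names I chose myself. It shifts every timestamp on the timing rows by a fixed number of milliseconds and clamps at zero.

```python
import re

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_ms):
    """Return one hh:mm:ss,mmm timestamp moved by offset_ms (never below zero)."""
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    total = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis + offset_ms
    total = max(0, total)
    hours, rest = divmod(total, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt(content, offset_ms):
    """Shift the start and end time on every timing row of SRT content."""
    shifted = []
    for line in content.splitlines():
        if "-->" in line:              # only touch timing rows, not subtitle text
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, offset_ms), line)
        shifted.append(line)
    return "\n".join(shifted)
```

For a file that is 30 seconds early, the offset would be 30000.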


When I first read your post, I was completely lost, lol. Then, I looked around in my subs2srs folder. Now, I get it, lol. Thanks for the tips about TSV files and Excel. I'll incorporate them into my process somehow. My process is more involved, because I'm creating my own English subtitle files both for watching Chinese TV dramas and for learning Chinese. I'm also manipulating Chinese subtitle files to match. So, I need both good SRT files and good TSV files. For example, I can use Excel to turn a TSV file back into Chinese and English SRT files.

 

Since I can run stuff in Windows now, yes, I’ll try out Subtitle Editor in the near future. Viki uses it too. I’ve been using Aegisub, because it’s available for Macs. I really like it, so we’ll see which one I like better. It’s easy to time shift everything in Aegisub too.


Flickserve – Is the subtitle software you mentioned the one at the link below? The internet has many with very similar names. The one below seems to be the most popular of those.

 

https://www.nikse.dk/subtitleedit

 

After playing around with the one above for a couple of hours… Wow! The many lists on the internet for best subtitle software don’t come close to describing how much better Subtitle Edit is than Aegisub. I was under the impression that Aegisub was one of the best and that there wasn’t much difference between it and any of the other best free ones. Here are some of the advantages of Subtitle Edit over Aegisub.

 

·       Highlights a duration less than the minimum specified (e.g., <0.833 sec per Netflix).

 

·       Highlights a duration greater than the maximum specified (e.g., >7 sec per Netflix).

 

·       Highlights overlapping subtitles without having to select one of them to see a highlight.

 

·       Highlights a number of rows greater than the maximum specified (e.g., >2 in one subtitle per many big companies).

 

·       Here’s where Subtitle Edit really starts to take off in my eyes. It extends end times of subtitles based on a choice of options. One of the options is characters per second (CPS)! In my subtitling approach, I extend times based on CPS, which allows me to keep accurate translations (instead of making them concise), while having subtitles be comfortably readable.

 

Viki is the only one of the big companies that does this. But, they extend end times by adding a second or so. So every episode still has quite a few subtitles with CPSs that are too high to read comfortably. This is what causes people to hit the rewind button. Using CPS directly to extend end times is a great way to meet more of this concept. (There is more that can be done too.) I’m already loving Subtitle Edit at this point!

 

·       It has a tool called “Fix common errors…” It’s a long, impressive list of things it can fix.

 

·       It can set/adjust a subtitle’s start time by clicking on the audio waveform vs. typing in a time.

 

·       It has the ability to use VLC Media Player (amongst others) to play videos in it. VLC can run problematic videos that other well-known software can’t. Beyond this, Aegisub is well known to sometimes have problems running videos. Having said that, I’ve never had a problem running the audio portion of a video in Aegisub. But, sometimes, there are subtitles in which the video is needed to select timing, such as scenes with signs or text messages.
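
To make the CPS idea from the list above concrete, here is a rough Python sketch of extending an end time until the subtitle drops to a target characters-per-second rate, without running into the next subtitle. This is my own illustration, not how Subtitle Edit implements it. The default of 17 CPS and the 100 ms gap are placeholder assumptions, and times are plain milliseconds.

```python
import math


def chars_per_second(start_ms, end_ms, text):
    """Reading speed of one subtitle; line breaks are not counted as characters."""
    duration_ms = end_ms - start_ms
    if duration_ms <= 0:
        return float("inf")
    return len(text.replace("\n", "")) * 1000 / duration_ms


def extend_end_for_cps(start_ms, end_ms, text, next_start_ms=None,
                       max_cps=17.0, min_gap_ms=100):
    """Return an end time long enough for max_cps, capped before the next cue."""
    char_count = len(text.replace("\n", ""))
    needed_ms = math.ceil(char_count * 1000 / max_cps)
    new_end = max(end_ms, start_ms + needed_ms)    # never shorten a subtitle
    if next_start_ms is not None:
        latest_allowed = max(end_ms, next_start_ms - min_gap_ms)
        new_end = min(new_end, latest_allowed)     # avoid overlapping the next cue
    return new_end
```

When the next subtitle starts too soon, the result can still be above the target CPS, which is the situation where other fixes are needed.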

 

I was previously doing some of the above in a spreadsheet I made. With Subtitle Edit, I don’t need that spreadsheet anymore, which saves me a noticeable amount of time. Subtitle Edit does everything in my spreadsheet and a bunch more. Aegisub is great, but Subtitle Edit is one of those things that make you think, “Wow. They think of everything” (kind of like Apple :D). I feel a donation coming on… :) I just have to decide how much after I work with the software a little more.


Yes. Sounds  good! I just updated it.

 

I haven’t really gone into the ins and outs of its functionality since I haven’t been creating flashcards for quite a while and also not focussed on improving for the past few years. 

 

Recently, I have been creating a big bank of flashcards and also want to tidy up some of the cards I made when I was less experienced. Hence my coming across it again on my computer.


On 3/23/2022 at 2:55 AM, Flickserve said:

When you process an srt (with audio), it gives a .tsv file with the data already arranged in lines, without needing a formula. I then import this into Excel, which gives the option of separating the columns by TAB.

 

I just tried this again with a film that I did before. My Excel is an outdated version, and I have to set the character set to UTF-8. The bigger problem is that some of the columns are imported incorrectly, which is rather annoying.

 

An alternative is to import the .tsv file into Google Sheets. That actually took fewer steps, as every TAB in a line was automatically split into a column.
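
If neither spreadsheet behaves, the tab-separated file can also be read directly. Below is a minimal Python sketch using the standard csv module. The function names are mine, and the column layout of your .tsv depends on your subs2srs settings, so treat the example row in the note below as hypothetical.

```python
import csv
import io


def read_tsv_text(text):
    """Split tab-separated text into rows of columns, leaving quotes untouched."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader]


def read_tsv_file(path):
    """Read a UTF-8 .tsv file (with or without a BOM) into rows of columns."""
    with open(path, encoding="utf-8-sig", newline="") as handle:
        return read_tsv_text(handle.read())
```

A row such as "0001<TAB>你好<TAB>Hello" comes back as three columns, which is the same split that the spreadsheet import does on every TAB.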

 

The reason I was messing around with this is that I was looking at The Incredibles, which I converted a while back. I could only find the .srt in Traditional Chinese and I wanted a Simplified Chinese column in the .tsv file (or vice versa if you want a column of traditional characters).

 

Here is the workflow for converting lists of more than 3000 characters:

 

1. Import the .tsv file into Google Sheets
2. Copy the column of Traditional Chinese
3. Paste the Traditional Chinese into another srt file
4. Upload the srt file for conversion into Simplified Chinese at https://www.purpleculture.net/traditional-simplified-converter/
5. You get a download of an .srt file with the Simplified Chinese
6. Copy and paste that into another column in your Google Sheet
7. Copy and paste the whole Google Sheet back into the .tsv
8. Import the .tsv into Anki
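
The column juggling in the steps above could also be sketched in Python. The sketch below only shows the plumbing: it appends a converted column to each row and writes the rows back out as tab-separated text. The actual Traditional-to-Simplified conversion is passed in as a function, and the two-character mapping here is a toy stand-in I made up for the example, not a real converter.

```python
def add_converted_column(rows, source_index, convert):
    """Return new rows with convert(row[source_index]) appended as a last column."""
    return [row + [convert(row[source_index])] for row in rows]


def rows_to_tsv(rows):
    """Join rows back into tab-separated text, one row per line."""
    return "".join("\t".join(row) + "\n" for row in rows)


# Toy stand-in for a real converter: it only knows two characters.
TOY_TABLE = str.maketrans({"們": "们", "這": "这"})


def toy_traditional_to_simplified(text):
    """Hypothetical converter used only to demonstrate the workflow."""
    return text.translate(TOY_TABLE)
```

With a real converter in place of the toy one, the output of rows_to_tsv is what would be saved as the .tsv for the Anki import.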
