Chinese-forums.com
Learn Chinese in China

extract hard subtitle in Chinese clip


hoshi

Having read these threads:
https://www.chinese-forums.com/forums/topic/23286-how-to-create-your-own-transcript-from-a-chinese-video/#comment-190835
https://www.chinese-forums.com/forums/topic/57214-instantly-extract-chinese-subtitles-physically-embedded-from-videos-to-text-file/#comment-443761


Extracting hard subtitles from a Chinese movie file is certainly a thorn in the side.

 

VideoSubFinder is a free program that allows you to autodetect a video frame by frame and extract hardcoded subtitles to a series of image grabs with text based on text mining algorithms for further OCR process. Closely follow the steps below.

Download and install VideoSubFinder from https://sourceforge.net/projects/videosubfinder/.
This program requires "Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019" to be installed on the PC.

Run "VideoSubFinderWXW.exe", click on "File" and select "Open Video (OpenCV)" to import the video file with the embedded hardsubs.


After importing the video file, drag the slider along the progress bar to find a frame that shows a subtitle (in this case Chinese). Then frame the area where the subtitle appears as precisely as possible, to exclude the rest of the picture.

Press the "Run Search" button to autodetect the hardsubs.

When the process is finished, switch to the OCR tab and click on "Create Cleared TXT Images". When it is done, there will be a series of large cleared images with text in the TXTImages folder of the VideoSubFinder root directory.

Do not close the program. Go on to the next step.

At this point, we need image-to-text recognition software for the OCR step.

Use the free command-line OCR engine Tesseract. Be sure to add its install folder to the PATH environment variable on the PC.

Use the command below at the command prompt.

for %i in (C:\your_dir\Release_x64\TXTImages\*.jpeg) do tesseract -l chi_sim --oem 2 --tessdata-dir C:\your_dir\AppData\Local\Programs\Tesseract-OCR\tessdata --psm 6 %i "%~dpni"

In a .bat file, use %% instead of %.
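For anyone who prefers a script to the one-line loop, here is a minimal Python sketch of the same batch step. It assumes tesseract is on the PATH and reuses the placeholder folders from the command above. The function names build_tesseract_command and ocr_folder are made up for this example.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, tessdata_dir, lang="chi_sim"):
    """Build the tesseract call for one image, with the options used above."""
    image_path = Path(image_path)
    # Output base = image path without its extension; tesseract appends
    # ".txt" itself, which mirrors "%~dpni" in the batch loop.
    output_base = image_path.with_suffix("")
    return [
        "tesseract", str(image_path), str(output_base),
        "-l", lang, "--oem", "2", "--psm", "6",
        "--tessdata-dir", str(tessdata_dir),
    ]


def ocr_folder(image_dir, tessdata_dir):
    """Run tesseract on every .jpeg in image_dir (tesseract must be installed)."""
    for image_path in sorted(Path(image_dir).glob("*.jpeg")):
        command = build_tesseract_command(image_path, tessdata_dir)
        subprocess.run(command, check=True)
```

To run it, call ocr_folder with your own TXTImages folder and tessdata folder, for example ocr_folder(r"C:\your_dir\Release_x64\TXTImages", r"C:\your_dir\AppData\Local\Programs\Tesseract-OCR\tessdata").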


The conversion takes a while, depending on the length of the video, so let’s sip a cup of coffee.

The converted .txt files now appear in the same folder as the images. Move all of them to C:\your_dir\Release_x64\TXTResults.
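The move can also be scripted. The sketch below is only an illustration: the function name move_txt_results is made up, and the two folders are whatever your own TXTImages and TXTResults paths are.

```python
import shutil
from pathlib import Path


def move_txt_results(images_dir, results_dir):
    """Move every .txt file from images_dir to results_dir; return their names."""
    images_dir, results_dir = Path(images_dir), Path(results_dir)
    # Create the destination folder if it does not exist yet.
    results_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for txt_path in sorted(images_dir.glob("*.txt")):
        shutil.move(str(txt_path), str(results_dir / txt_path.name))
        moved.append(txt_path.name)
    return moved
```

The images themselves stay where they are; only the .txt files are moved.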

Now go back to VideoSubFinder and hit the "Create Sub From TXT Results" button to generate an .ass (recommended) subtitle file. Rename the subtitle file and save it.

There will be some garbled text in the file, so edit accordingly. This method is far from perfect, but it is usable for relatively short videos.

As an aside, the next time you use this app, make sure there are no files left in the three folders ILAImages, ISAImages and RGBImages, so that it doesn’t pick up images from the previous run.
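Emptying those three folders by hand is easy to forget, so here is a small Python sketch that does it. The function name clear_leftover_images is made up, and vsf_root is assumed to be the VideoSubFinder folder that contains the three subfolders. It deletes files, so check the path before running it.

```python
from pathlib import Path

# The three folders named above.
LEFTOVER_FOLDERS = ("ILAImages", "ISAImages", "RGBImages")


def clear_leftover_images(vsf_root, folders=LEFTOVER_FOLDERS):
    """Delete the files (not the folders) left by a previous run; return the count."""
    removed = 0
    for name in folders:
        folder = Path(vsf_root) / name
        if not folder.is_dir():
            continue  # Nothing to clean if the folder is missing.
        for item in folder.iterdir():
            if item.is_file():
                item.unlink()
                removed += 1
    return removed
```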

Use the subtitle file in another app, like subs2srs, to study.

That’s it.

 

 

videosubfiner_pic.jpg

result_pic.jpg


Interesting but Windoze only.  Anyone know of the Mac equivalent method?

I noticed you show a clip from 开端 Reset drama 😎.

Also wanted to point out that there's no need to go to such trouble for this specific drama, you can just download the full transcript with timings from YouTube.

 


  • 2 months later...

For people who aren’t comfortable with the command-line interface, the free Subtitle Edit software has an OCR (optical character recognition) that can process images into text. It also uses the open-source tesseract OCR.

 

https://www.videohelp.com/software/VideoSubFinder/reviews


  • 1 month later...

InpaintDelogo, VideoSubFinder, Subtitle Edit and OCR

 

All the software used in this post is free. This post discusses turning hard-embedded subtitles in a video file into soft-embedded subtitles in a subtitle file. I’m mainly interested in doing this for hard-embedded Chinese subtitles, although I’m a little interested in doing this for hard-embedded English subtitles too. I’ve stumbled across something that seems to be significantly better than the VideoSubFinder (VSF) software (https://sourceforge.net/projects/videosubfinder/). It’s the InpaintDelogo (IPDL) script (https://github.com/Purfview/IPDL).

 

IPDL was originally intended for removing a logo from an image/video, as if it were never there. It can also remove hard-embedded subtitles, as if they were never there. It’s amazing. It does this in part by reconstructing removed or damaged portions of images, which is called inpainting, hence the name InpaintDelogo. Later, IPDL added the capability to create an image file for each hard-embedded subtitle in a video. (Timing is important for subtitles too. It’s built into image file names, meaning it isn’t part of the files themselves.) This is the capability that is similar to VSF.
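Since the timing lives in the file names, it can be read back with a few lines of Python. This is only a sketch: the pattern is inferred from the names of the images attached further down in this post (for example 00_00_10_280__00_00_11_280 for IPDL, and a similar start with extra digits after it for VSF), not from the IPDL or VSF documentation. The function name parse_timing is made up.

```python
import re

# Assumed pattern at the start of the name: H_MM_SS_mmm__H_MM_SS_mmm
# (start time, double underscore, end time). Anything after it is ignored.
TIMING_PATTERN = re.compile(
    r"^(\d+)_(\d{2})_(\d{2})_(\d{3})__(\d+)_(\d{2})_(\d{2})_(\d{3})"
)


def parse_timing(filename):
    """Return (start_ms, end_ms) read from an image file name, or None."""
    match = TIMING_PATTERN.match(filename)
    if match is None:
        return None
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
    start_ms = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
    end_ms = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
    return start_ms, end_ms
```

Working in milliseconds keeps the arithmetic in whole numbers, which is convenient when the times are later written into a subtitle file.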

 

IPDL can apparently produce significantly more accurate images of subtitles than VSF. In turn, optical-character-recognition (OCR) software can extract significantly more accurate subtitles from these images. What I’ve read and seen has been convincing, so I wanted to try it myself.

 

After IPDL creates images of subtitles, separate OCR software is needed to extract subtitles from the images. It’s the same with VSF. The Subtitle Edit software (https://www.nikse.dk/subtitleedit) has many built-in OCRs to choose from. Subtitle Edit also creates the subtitle file, so it’s convenient to use. (Regarding how subtitle files are used to help learn Chinese, see other threads on this great website.)

 

Installing InpaintDelogo

 

I’ve been a little intimidated by trying to figure out how to install software like IPDL. But, it wasn’t as difficult as I feared. IPDL can be installed in MacOS, but I installed it in 64-bit Windows 11. So, that’s what I’m describing in detail.

 

AviSynth+

 

IPDL is a script that runs in the AviSynth+ software. AviSynth+ is a powerful tool for editing and processing videos. Go to https://github.com/AviSynth/AviSynthPlus. On the right side of the page, click on the “latest” button. On the next page, scroll to the bottom of the page. Click on the file for your computer and operating system to download it. I picked AviSynthPlus_3.7.2_20220317_vcredist.exe. After it’s downloaded, double-click on it to install it. One of the many folders created is the folder below for 64-bit Windows.

 

C:\Program Files (x86)\AviSynth+\plugins64+

 

Next, a long series of files from AvsInpaint to IPDL needs to be downloaded and placed in the plugins folder above. The steps for each are similar. But, for someone new to this, like I am, it’s sometimes unclear what to do. So, I thought it would be better if I wrote out the steps for each file.

 

AvsInpaint

 

AvsInpaint is a plugin for AviSynth+. Go to https://github.com/pinterf/AvsInpaint. On the right side of the page, click on the “latest” button. On the next page, click on the file AvsInpaint-v1.3.7z to download it. Then, double click it to uncompress it. (If it doesn’t uncompress, search the internet and download and install free software that uncompresses *.7z files, e.g., https://www.7-zip.org/download.html.) In the subfolder x64, there’s a file named AvsInPaint.dll. Move it to the plugins folder.

 

MaskTools2

 

Go to https://github.com/pinterf/masktools. On the right side of the page, click on the “latest” button. On the next page, click on the file masktools2_v2.2.30.7z to download it. Uncompress it. In the subfolder x64, there’s a file named masktools2.dll. Move it to the plugins folder.

 

RgTools

 

Go to https://github.com/pinterf/RgTools. On the right side of the page, click on the “latest” button. On the next page, click on the file RgTools-v1.2.7z to download it. Uncompress it. In the subfolder x64, there’s a file named RgTools.dll. Move it to the plugins folder.

 

GRunT

 

Go to https://github.com/pinterf/GRunT. On the right side of the page, click on the “latest” button. On the next page, click on the file GRunT-v1.02.7z to download it. Uncompress it. In the subfolder x64, there’s a file named grunt.dll. Move it to the plugins folder.

 

RequestLinear

 

RequestLinear is part of the TIVTC plugin. Go to https://github.com/pinterf/TIVTC. On the right side of the page, click on the “latest” button. On the next page, click on the files TDeint-v1.8.7z and TIVTC-v1.0.26.7z to download them. Uncompress them. In the subfolders x64, there are files named TDeint.dll and TIVTC.dll. Move them to the plugins folder.

 

ClipBlend

 

Go to http://avisynth.nl/index.php/ClipBlend. On the right side of the page, next to the word “Download,” click on the file “ClipBlend_25_26_x86_x64_dll_v1-01_20181127.zip” to download it. Uncompress it. In the subfolder Avisynth+_x64, there’s a file named ClipBlend_x64.dll. Move it to the plugins folder.

 

RT_Stats

 

Go to http://avisynth.nl/index.php/RT_Stats. On the right side of the page, next to the word “Download,” click on the file “RT_Stats_25_26_x86_x64_dll_v2.00Beta12_20181125.7z” to download it. Uncompress it. In the subfolder Avisynth26_x64, there’s a file named RT_Stats_x64.dll. Move it to the plugins folder.

 

FrameSel

 

Go to http://avisynth.nl/index.php/FrameSel. On the right side of the page, next to the word “Download,” click on the file “FrameSel_x86_x64_dll_v2-20_20180420.zip” to download it. Uncompress it. In the subfolder Avisynth+_x64, there’s a file named FrameSel_x64.dll. Move it to the plugins folder.

 

GrainFactory3

 

Go to http://avisynth.nl/index.php/GrainFactory3. The following instructions are different than all the previous ones. On the right side of the page, next to the word “Download,” right click on the file GrainFactory3.avsi. Select “Save Link As...” In the window that pops up, click the Save button. Move this file to the plugins folder.

 

vsTEdgeMask

 

Go to https://github.com/Asd-g/AviSynth-vsTEdgeMask. On the right side of the page, click on the “latest” button. On the next page, click on the file vsTEdgeMask-1.0.1.7z to download it. Uncompress it. In the subfolder x64/Releases, there’s a file named vsTEdgeMask.dll. Move it to the plugins folder.

 

InpaintDelogo

 

Go to https://github.com/Purfview/IPDL. The following instructions are different than all the previous ones. Click on the green “Code” button, and select “Download ZIP.” Uncompress it. Move the file InpaintDelogo.avsi to the plugins folder.

 

LSMASHSource

 

Go to http://avisynth.nl/index.php/LSMASHSource. On the right side of the page, next to the word “Download,” click on the link “L-SMASH-Works.” On the next page, click on the file L-SMASH-Works-20220505.7z to download it. Uncompress it. In the subfolder x64, there’s a file named LSMASHSource.dll. Move it to the plugins folder.

 

AvsPmod

 

Last, it’s easier to use AvsPmod to run IPDL. Go to https://github.com/gispos/AvsPmod/releases. Click the 64-bit file AvsPmod_v2.7.1.8_.Windows_x86-64.zip to download it. Uncompress it. Open the folder AvsPmod_v2.7.1.8_.Windows_x86-64. Then, open the folder AvsPmod. Right click on the file AvsPmod.exe, and select Run as administrator. Finish the installation. Optional: Create a desktop shortcut of AvsPmod.exe by right clicking it, selecting “Show more options,” selecting “Create shortcut,” and moving the shortcut to the desktop.

 

Okay, so, that wasn’t too bad, was it?

 

Running InpaintDelogo and VideoSubFinder

 

For my first experiment, I chose a high-quality 4k video with both hard-embedded simplified-Chinese subtitles and soft-embedded simplified-Chinese subtitles in a subtitle file. The video is episode 1 of You Are My Glory. The subtitle file is from WeTV, which should match the hard-embedded subtitles exactly. I purposely chose a video that already had a subtitle file, because I wanted something electronic to more readily make comparisons with (vs. my slow-moving eyeballs, lol).

 

Before generating images of subtitles from a video, the location of the subtitles in the video needs to be narrowed in on. The script below is an example for doing this. In the example, Example.mp4 is the name of the video, and it is located in the folder C:\Users\MTH\Documents\Extract. The four numbers are the numbers of pixels from the left, top, right and bottom of the video, respectively. The four numbers 100,100,-100,-100 are randomly selected, just as a starting point.

 

LWLibavVideoSource("C:\Users\MTH\Documents\Extract\Example.mp4")
InpaintLoc(Loc="100,100,-100,-100")

 

To run the script above, open AvsPmod and copy-and-paste the script into it. Change the first line of the script to match your setup, including the name of your video and its folder path. Then, hit F5 on your keyboard. An image with a yellow-ish highlighted area appears. The video that also appears needs to be advanced to show a subtitle. Chinese subtitles are usually on one line. (English subtitles are typically up to two lines, but may be more.)

 

I played around with the top and bottom numbers, while hitting F5 each time, until the highlighted area narrowed in on the subtitle. The top should be 10 to 16 pixels above the subtitle, and the bottom should be 10 to 16 pixels below the subtitle. I settled on the following four numbers: 100,1470,-100,-28.
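To see what a set of four numbers means in absolute pixels, here is a small Python sketch. It follows the description above (pixels from the left, top, right and bottom, with the right and bottom values written as negative numbers). The frame size in the example is hypothetical, so use your own video’s width and height. The function name loc_to_rectangle is made up, and positive right/bottom values are deliberately not handled.

```python
def loc_to_rectangle(loc, frame_width, frame_height):
    """Convert "left,top,right,bottom" into (x0, y0, x1, y1) pixel coordinates.

    Assumes right and bottom are zero or negative, i.e. measured inward
    from the right and bottom edges, as in the examples in this post.
    """
    left, top, right, bottom = (int(part) for part in loc.split(","))
    if right > 0 or bottom > 0:
        raise ValueError("this sketch only handles right/bottom <= 0")
    x0, y0 = left, top
    x1, y1 = frame_width + right, frame_height + bottom
    if not (0 <= x0 < x1 <= frame_width and 0 <= y0 < y1 <= frame_height):
        raise ValueError("Loc does not describe a region inside the frame")
    return x0, y0, x1, y1


# Hypothetical 3840x2160 frame with the starting-point numbers.
print(loc_to_rectangle("100,100,-100,-100", 3840, 2160))  # (100, 100, 3740, 2060)
```

The height of the highlighted strip is y1 minus y0, which is a quick way to check that the mask is not much taller than the subtitle plus the 10 to 16 pixel margins.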

 

Location.thumb.png.9f02157155209ae38a18db1f23898385.png

 

The highlighted area is also referred to as the mask. A dynamic mask is used to make images of a series of subtitles in a video. The DynMask3 option in IPDL seems to be applicable to many Chinese TV dramas, because of the way the subtitles are formatted. I used the simple script below. I was actually able to get better results with this simple script than more complicated ones that I tried. It may be because I started with such a high-quality video that it didn’t need a lot of extra work.

 

LWLibavVideoSource("C:\Users\MTH\Documents\Extract\Example.mp4")
InpaintDelogo (Loc="100,1470,-100,-28",
\ DynMask=3, DynTune=200, Dyn3Seq=8,
\ Extract=1, Show=4,
\ ImgDir="C:\Users\MTH\Documents\Extract\Images")

 

In the script above, the first line is the same as the one used in the script further above. The four location numbers are the ones that were determined using the script further above. In Windows, create a new folder for the images that will be generated. Then, in the script above, make sure the path matches the new folder. The rest of the script is essentially default settings. (Dyn3Seq is the minimum number of frames for a subtitle. I reduced it from the default value to the minimum allowable value to capture shorter-duration subtitles. This doesn’t affect the quality of the images.)

 

For more information about the options used above and other options, see the links below. The manual is actually built into the IPDL script file InpaintDelogo.avsi. It can be opened and read in Notepad. Or, it can be read at the first link below. The manual shows default values and provides suggestions for other values.

 

https://github.com/Purfview/IPDL/blob/main/IPDL.avsi

https://forum.doom9.org/showthread.php?p=1883832

 

Finally, to generate images of subtitles, open AvsPmod and copy-and-paste the script above into it. Hit F5 to run the script. It took 21 minutes to process the 41-minute video. The resulting images of subtitles were in the images folder.

 

For VSF, download it from https://sourceforge.net/projects/videosubfinder/. Install it. (It’s easy.) For instructions to run it, complete with pictures, see https://www.videoconverterfactory.com/tips/extract-hardcoded-subtitles.html. I ran VSF on the same video that I ran IPDL on. I didn’t pay attention to how long it took to run, but it was probably around an hour.

 

IPDL generated 767 subtitle images. VSF generated 700 subtitle images. The subtitle file from WeTV has 716 subtitles.

 

Subtitle Edit and OCR

 

It’s OCR time. Go to https://www.nikse.dk/subtitleedit to download the Subtitle Edit software. Install it. (It’s easy.) In the File menu, select Import, and then Images… A window pops up. Drag-and-drop the images of the subtitles from IPDL into it. Click the “OK” button. Another window pops up, which provides many options for OCRs. The following is based on what I’ve read related to IPDL and VSF. I haven’t tried any of it myself before. (For English, nOCR or Binary Image Compare may even be 100% accurate or close to it with images from IPDL.)

 

Tesseract 5 for Chinese subtitles

Binary Image Compare for English subtitles in a regular-size font

nOCR for English subtitles in a big font

Tesseract 5 for English subtitles in lower-quality images, like standard definition

 

Back to Subtitle Edit, in the window that popped up, select the OCR method, e.g., Tesseract 5. Next to “Language,” click “…” to download a language. There are a lot of choices, including Chinese simplified, chi_sim_vert, Chinese Traditional, chi_tra_vert and English. After downloading the language, select it. Then, select the Engine mode, e.g., Tesseract +LSTM. Change other settings, as desired.

 

Click the “Start OCR” button in the lower middle portion of the window (not the “OK” button in the lower right portion of the window). When the OCR is done, click the “OK” button in the lower right portion of the window. I didn’t keep track of the time, but I think it took around 20 minutes to process. It seems that the better the images, the faster the OCR works through them. If the OCR is crawling, it’s a really bad sign that the results will be bad.

 

At this point, Subtitle Edit has created the subtitle file from subtitle images from IPDL. I repeated the steps for subtitle images from VSF. (For VSF images, I ran into a problem with Subtitle Edit. Neither the Tesseract 5 OCR nor the Tesseract 3 OCR processed the VSF images properly. VSF images are black-on-white, whereas IPDL images are white-on-black. It for sure should have worked with the Tesseract 3 OCR. But, I had to invert the VSF images. I’ll describe how I did this later. Then, Subtitle Edit and Tesseract 5 worked.)

 

The combination of IPDL and Tesseract 5 matched 322 of 716 subtitles exactly, which is 45%. The combination of VSF and Tesseract 5 matched 85 of 716 subtitles exactly, which is 12%. Most of the rest of the subtitles were fairly close with IPDL/Tesseract 5 being significantly more accurate than VSF/Tesseract 5.

 

IPDL/Tesseract 5 completely missed 10 of 716 subtitles, which is about 1%. 8 were due to IPDL, and 2 were due to Tesseract. VSF/Tesseract 5 completely missed 37 of 716 subtitles, which is about 5%. 17 were due to VSF, and 20 were due to Tesseract. (These numbers exclude subtitles shorter than 0.83 sec, since subtitles should never be shorter than that. 0.83 sec is Netflix’s minimum; Viki’s minimum is 1 sec. I should have picked a better example, but it is interesting that both IPDL and VSF picked up a large number of subtitles that were shorter than 0.83 sec.)
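The 0.83 sec cut-off can be applied mechanically once the start and end times of each subtitle are known. The sketch below assumes the times are already available as (start, end) pairs in milliseconds. The names split_by_duration and MIN_DURATION_MS are made up for this example.

```python
MIN_DURATION_MS = 830  # 0.83 sec, the minimum quoted above


def split_by_duration(timings, min_duration_ms=MIN_DURATION_MS):
    """Split (start_ms, end_ms) pairs into (kept, too_short) lists."""
    kept, too_short = [], []
    for start_ms, end_ms in timings:
        if end_ms - start_ms >= min_duration_ms:
            kept.append((start_ms, end_ms))
        else:
            too_short.append((start_ms, end_ms))
    return kept, too_short
```

Passing min_duration_ms=1000 would apply the stricter 1 sec minimum instead.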

 

For the first subtitle, here’s a snip of the video:

 

1757486805_Subtitle1.thumb.png.cbed26ff0db83e6e4ceb52843bcb8ab2.png

 

Here’s the subtitle from the WeTV subtitle file: 各位专家领导. Of course, it’s the same.

 

Below is the IPDL image of the subtitle. It looks pristine, doesn’t it? But, what do I know?

 

00_00_10_280__00_00_11_280.thumb.png.ce31aaec0e90871a670d77cdaf7ee72e.png

 

Here’s what the Tesseract 5 OCR extracted: 各位专完领导. The fourth character is off.
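Exact matches and near misses like this one can be told apart in code. The sketch below is not how I counted the numbers above. It assumes the reference and OCR subtitles have already been lined up in pairs, counts the exact matches, and uses the ratio from Python’s standard difflib module as a rough 0-to-1 measure of how close the rest are. The function name compare_subtitles is made up.

```python
from difflib import SequenceMatcher


def compare_subtitles(pairs):
    """Count exact matches in (reference, ocr_text) pairs and score the rest.

    Returns (exact_count, total, similarities), where similarities holds a
    0-to-1 difflib ratio for every pair that did not match exactly.
    """
    exact = 0
    similarities = []
    for reference, ocr_text in pairs:
        if reference == ocr_text:
            exact += 1
        else:
            matcher = SequenceMatcher(None, reference, ocr_text)
            similarities.append(matcher.ratio())
    return exact, len(pairs), similarities


pairs = [
    ("各位专家领导", "各位专家领导"),  # exact match
    ("各位专家领导", "各位专完领导"),  # one character off
]
exact, total, similarities = compare_subtitles(pairs)
print(f"{exact} of {total} exact ({exact / total:.0%})")  # 1 of 2 exact (50%)
```

A ratio close to 1 means only a character or two is off, which is the kind of subtitle that is quick to fix by hand.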

 

Below is the VSF image of the subtitle. Notice the bits of junk in it that can affect how well an OCR works. But, it still looks pretty darn good to my unsophisticated eye.

 

0_00_10_280__0_00_11_279_0145500000038400163403840.thumb.jpeg.3d1a249ba5f7669eda239bab8c4c0773.jpeg

 

Below is the inverted version of the VSF image.

 

0_00_10_280__0_00_11_279_0145500000038400163403840.thumb.jpeg.e60e2d2aa45482fa207ca28aa73af9dc.jpeg

 

Here’s what the Tesseract 5 OCR extracted: 各位妃冤颈导. The third, fourth and fifth characters are off.

 

Out of curiosity, I tried quite a few free online OCRs. The two below were the best for the one subtitle I tried. Online OCRs like these wouldn’t be practical to use for roughly a thousand subtitles per 45-minute episode. But, they show that an OCR can extract accurate subtitles from the IPDL and VSF images.

 

Online OCR                           IPDL Image      VSF Image           Video Snip
https://www.ocr.best                 各位专家领导    | | 各位专家领导    各位专家领导
https://convertio.co/ocr/chinese/    各位专家领导    各位专家领导        Nothing

 

Summary

 

I’ve only tried this one high-quality video so far. I essentially used default settings in IPDL and VSF. Someone who knows about this stuff may be able to get better images by playing around with settings, especially for VSF. The IPDL images already look great. But, if one were really interested, IPDL has countless settings to play around with.

 

For hundreds of subtitles, I couldn’t understand why the Tesseract 5 OCR didn’t extract them exactly. Does anyone know what the best OCR for Chinese is and one that can batch process on the order of a thousand images? Is it ABBYY FineReader? Is there more than one that is highly accurate, like one that costs less than ABBYY FineReader?

 

Sidebar: Inverting VSF Images

 

To invert VSF images from black-on-white to white-on-black, I had to write my first Python script ever! I haven’t touched any kind of programming in decades, and it wasn’t Python. So, it was a little intimidating. There are a lot of firsts for me with this post.

 

VSF uses OpenCV for image processing. So, I thought it was fair to use OpenCV to invert VSF images. I wanted to make an apples-to-apples-type comparison between IPDL and VSF, so I didn’t want to unknowingly handicap VSF. OpenCV is used in Python, which is a programming language. I had to download and install Python and OpenCV. I won’t go into detailed instructions, unless someone wants me to. Here’s my script (from a text file that I named InvertImages.py) for anyone who might want to use it.

 

# This script inverts a batch of images in a folder.
# It puts the inverted images in a separate folder.

# OpenCV has the ability to invert images.
# Import OpenCV (after pip install opencv-python).
import cv2

# Import miscellaneous operating system interfaces,
# which include building paths and creating folders.
import os

# Define the folder with the images to be inverted.
directory = r'C:\Users\MTH\Documents\Images'

# Define the folder for inverted images, and create it if needed.
inverted_directory = os.path.join(directory, 'Inverted')
os.makedirs(inverted_directory, exist_ok=True)

# Loop through the files in the folder.
for filename in os.listdir(directory):

    # Skip anything that is not a .jpeg image.
    if not filename.endswith(".jpeg"):
        continue

    # Read the image to be inverted.
    image = cv2.imread(os.path.join(directory, filename))

    # Skip files that OpenCV could not read.
    if image is None:
        continue

    # Invert the image (black-on-white becomes white-on-black).
    image_not = cv2.bitwise_not(image)

    # Write the inverted image under the same filename.
    cv2.imwrite(os.path.join(inverted_directory, filename), image_not)

 

This is my first time trying pictures in a post (by cutting-and-pasting from Word). So, I hope it works out.

 

 

Edited by MTH123
Okay, so my picture attempt failed. I've attached the Word file that includes pictures. Update: Pictures added to post.

I decided to check out the OCR in ABBYY FineReader, because it has a 7-day free trial. I also tried the OCR in Google Docs, which is free. The table further below shows the results, including those of the Tesseract 5 OCR (used via the Subtitle Edit software). It shows the number of subtitles that match the WeTV subtitle file exactly. The WeTV subtitle file has 716 subtitles.

 

For the results of Google Docs and ABBYY FineReader, I also manually removed obvious junk in the subtitles, just to see how much it would help. I didn’t bother doing this with the results of Tesseract 5. By the way, I like to clean up Chinese subtitles in an Excel spreadsheet (mainly because that’s where I also translate them). I delete junk subtitles, combine multiple lines that should be one line, etc. Of course, this can be done in Subtitle Edit, too.

 

Number of Subtitles that Match Exactly

OCR                                         InpaintDelogo    VideoSubFinder
Google Docs + Junk Manually Removed         612 (85%)        565 (79%)
Google Docs                                 582 (81%)        546 (76%)
ABBYY FineReader + Junk Manually Removed    568 (79%)        572 (80%)
ABBYY FineReader                            518 (72%)        555 (78%)
Tesseract 5                                 322 (45%)        85 (12%)

 

For the one video that I tried, both Google Docs and ABBYY FineReader are far better than Tesseract 5.  The combination of InpaintDelogo and Google Docs produced the best results. 85% of the subtitles matched the WeTV subtitle file exactly. For the 15% that didn’t match exactly, most of them are fairly close to right. But, it would take quite a bit of manual work to get them to 100% right.

 

On the other hand, my goal is to learn more of what people are actually saying in Chinese TV dramas. 80% to 85% is good enough for me for at least the foreseeable future.

 

 

Sidebar: Google Docs

 

See the link below for how to OCR files using Google Docs in Google Drive.

 

https://business.tutsplus.com/tutorials/how-to-ocr-documents-for-free-in-google-drive--cms-20460

 

It did not look like it could do batch processing. So, I dragged-and-dropped the images of the subtitles into a Word document. Then, I printed to pdf. I put the pdf through the OCR. The result was a text file of subtitles with no timing.

 

For timing, any of the subtitle files in the original post could be used. I put these two files in separate columns in an Excel spreadsheet. I manually line up the subtitles. Then, in a third column, I use some simple formulas to make what will become a new subtitle file. To me, the amount of work isn’t too bad, especially compared with translating the subtitles and many laborious aspects of learning Chinese.
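The spreadsheet step can also be done in a few lines of Python. The sketch below assumes the timings and the OCR text have already been lined up, and it writes the common SRT layout (a counter, a "start --> end" line, then the text). SRT is my choice for the example, not something the tools above require, and the function names are made up.

```python
def format_srt_time(ms):
    """Format a time in milliseconds as HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def build_srt(entries):
    """Build SRT text from a list of (start_ms, end_ms, text) entries."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(entries, start=1):
        timing = f"{format_srt_time(start_ms)} --> {format_srt_time(end_ms)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # A blank line separates consecutive subtitles.
    return "\n".join(blocks)


print(build_srt([(10280, 11280, "各位专家领导")]))
```

Saving the returned text with UTF-8 encoding, for example with open("new.srt", "w", encoding="utf-8"), keeps the Chinese characters intact.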


I decided to see what it would take to fix the remaining 15% of the subtitles. It turns out that Google Docs could handle almost all of it, which saves a lot of manual transcribing.

 

·       For the remaining 15% of the subtitles, Google Docs does better with the individual subtitles than with all 716 subtitles in one PDF document. So, most of the images of these subtitles could simply be reprocessed through Google Docs one by one.

 

·       Google Docs had some trouble with quite a few images of subtitles. Here’s an example. The first picture below is an IPDL image. Google Docs produced the following: 为什么肯教我数学呢. The second picture below is a VSF image. Google Docs produced junk. The third picture below is a screen shot of the video. It turns out that Google Docs could extract the subtitle perfectly from the screen shot. It can actually handle color images well, unlike ABBYY FineReader and Tesseract. This is really helpful for manually fixing subtitles.

 

 

00_30_22_800__00_30_24_800.thumb.png.d3cf257fc9dff50a9dbb982280f29620.png

 

 

0_30_22_800__0_30_24_799_0145500000038400163403840.thumb.jpeg.8539562a91cbbe1010e428386ddd9d24.jpeg

 

 

333569740_ScreenShot.thumb.png.463abab2e6ee8bc0f6ad2ed60b63fad3.png

 

·       There were two subtitles that were on a very-light-colored background. All software struggled with separating out the white subtitles from the background, including IPDL, VSF and OCRs. This was understandable. They could be manually transcribed (if you know all the words, which I don't).

 

·       Google Docs does not process single characters well. So, I took screen shots of them in the video, dragged-and-dropped them into a Word document, and put them next to another randomly selected character. Then, I took a screen shot of that and processed it through Google Docs.

 

·       For the subtitles that IPDL/VSF missed, I took screen shots of the video and processed the color images through Google Docs.

 

·       There were around eight characters (in 716 subtitles) that Google Docs got wrong. I have no idea why. Maybe it doesn’t know them well enough (yet?).

 

·       All three OCRs (Tesseract 5, ABBYY FineReader and Google Docs) struggled with 晶晶 (a name). Sometimes they would get it right, and sometimes they would make a bunch of boxes.

 

The Bottom Line

It's pretty exciting! Google Docs can be 99% accurate (in terms of numbers of subtitles), when used in certain ways on a high-quality video! This is more accurate than IPDL and VSF.

 

Raw Color Images

 

Since Google Docs processes raw color images so well, it’s good to have them handy. For VSF, the raw color images were already created when the black-and-white images were created. They’re in the RGBImages folder, next to the TXTImages folder with the black-and-white images. For IPDL, in Windows, next to the Images folder, create a folder called Images_raw. Then, run the script below (with Extract=2) instead of the previous version of the script (with Extract=1) to create raw color images, as well as black-and-white images.

 

LWLibavVideoSource("C:\Users\MTH\Documents\Extract\Example.mp4")
InpaintDelogo (Loc="100,1470,-100,-28",
\ DynMask=3, DynTune=200, Dyn3Seq=8,
\ Extract=2, Show=4,
\ ImgDir="C:\Users\MTH\Documents\Extract\Images")

 

Files Created by Google Docs

 

I forgot to mention this in the previous post. Google Docs creates a text file or Word file of OCRd results. It frequently puts more than one subtitle on a line. In Word, copy the little dot thing between two subtitles into the find part of the find-and-replace-all function. Type ^p into the replace part. Then, hit replace all to put subtitles on their own lines. Google Docs also includes formatting and font colors I don’t understand. Save the document as a text file to get rid of all the unusual formatting.
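The same find-and-replace can be done in Python on the saved text file. I don’t know exactly which character the little dot is, so the sketch below takes the separator as a parameter. The "·" in the example is only a stand-in, so copy the real character out of your own file. The function name split_subtitles is made up.

```python
def split_subtitles(text, separator):
    """Put each subtitle on its own line by splitting on the separator."""
    lines = []
    for line in text.splitlines():
        for part in line.split(separator):
            part = part.strip()
            if part:  # Drop empty pieces left by stray separators.
                lines.append(part)
    return "\n".join(lines)


# "·" is a stand-in for whatever character separates subtitles in your file.
print(split_subtitles("各位专家领导 · 为什么肯教我数学呢", "·"))
```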

 

If anyone knows more about this stuff than the bit I've learned recently, please share!

 

 

Edited by MTH123
Still figuring out how to put pictures in right.