Tumblr, used all over the country, is about to be overhauled, and you need a downloader now
What is Tumblr? Baidu's encyclopedia says this:
"Founded in 2007, Tumblr (/ˈtʌmblər/, Chinese: Tang Bole) is currently the world's largest light blogging site and the originator of light blogging sites. Tumblr (Tang Bole) is a new form of media between traditional blogs and microblogs that emphasizes expression, socializing, and personalized settings, making it one of the most popular social networking sites among young people today. Yahoo's board of directors decided on May 19, 2013 to acquire Tumblr for $1.1 billion."
Tumblr offers a huge amount of images, short videos, and short text content, with so much high-quality material that it was once a picture lover's paradise. Gradually, though, domestic users lost access to it, and here's why.
Because Tumblr is so open, a group of people in the country used it to spread obscene content, much of it involving people's private secrets: nude loans, selfies, leaked stolen content, various debauched groups, ads for pornographic services... No need to say more, you know.
The country restricts access to Tumblr, so even the normal content is inaccessible. I therefore modified a Python script for downloading Tumblr resources, building on the work of predecessors and optimizing and improving it. Tumblr is being overhauled right now, so many of your favorite things, or those secrets, are about to disappear, and it's necessary to post this tool once more.
This post will provide you with three ways to use the downloader.
1) Understand the source code and write your own. I will explain the principle clearly; once you have read it, you can modify the code however you like and download on demand. This is the most flexible way.
2) Install the Python environment and run the script directly.
3) For the lazy: take the exe file, edit the configuration file, and just run it.
We'll go through these in reverse order.
1.1 exe file
In the first version, I provided the exe file in exchange for a reward payment. This time, because of the new Knowledge Planet circle, the exe file is a benefit available only to circle members, along with hundreds of premium Tumblr accounts whose resources are worth downloading.
The exe file is used as follows.
First edit the site.txt file and enter the names of the blogs you want to download resources from. Separate multiple blogs with ASCII (half-width) commas, with no spaces and without the .tumblr.com suffix. For example, suppose you want to download the pictures and videos from the blog at https://nixcraft.tumblr.com/.
Then just fill in "nixcraft", and site.txt will contain only that name.
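If you'd rather not strip the URL by hand, the rule above can be sketched in a few lines of Python. This is only an illustration: the helper names `blog_name` and `site_line` are hypothetical and are not part of the downloader.

```python
from urllib.parse import urlparse

SUFFIX = ".tumblr.com"


def blog_name(value):
    """Return the bare blog name for a Tumblr URL, host name, or plain name."""
    value = value.strip()
    # urlparse only recognizes the host when a scheme or "//" prefix is present.
    host = urlparse(value if "://" in value else "//" + value).hostname or ""
    if host.endswith(SUFFIX):
        host = host[: -len(SUFFIX)]
    return host


def site_line(values):
    """Join blog names with commas and no spaces, as site.txt expects."""
    return ",".join(blog_name(v) for v in values)
```

For example, `site_line(["https://nixcraft.tumblr.com/", "vogue"])` gives the text `nixcraft,vogue`, which can be pasted into site.txt as is.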
Configuring Proxies
If you cannot access Tumblr directly, or are not using a VPN, you will need to configure a proxy.
Refer to `./proxies_sample1.json` and `./proxies_sample2.json` for the format of the file.
Then write your proxy information in JSON format to `./proxies.json`.
If `./proxies.json` is empty, no proxy will be used during the download.
If you are using Shadowsocks in global mode as your proxy, your `./proxies.json` file can be written as follows:
```json
{
"http": "socks5://127.0.0.1:1080",
"https": "socks5://127.0.0.1:1080"
}
```
The last step is to launch the exe to start the download.
To get the exe file, check the brief description of the Knowledge Planet circle in the image below and scan the code to pay and join; there is a zip in the pinned post.
Downloading and saving of site pictures/videos
When the program runs, by default it creates a folder with the same name as the Tumblr blog under the current path, and the photos and videos are placed in that folder.
The program does not re-download images and videos that have already been downloaded, so you don't have to worry about duplicates. At the same time, running it multiple times can help you recover pictures and videos that were missed or deleted.
Also, before each rerun you can delete the small video files that may not have downloaded successfully because of network problems, and they will be downloaded again.
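Deleting the suspiciously small video files by hand gets tedious, so here is a minimal sketch of how that cleanup could be scripted. The 10 KB threshold, the list of video extensions, and the function name `remove_small_videos` are all assumptions made for illustration; pick values that suit your own downloads.

```python
import os

MIN_BYTES = 10 * 1024  # assumed threshold; anything smaller is treated as incomplete
VIDEO_EXTENSIONS = (".mp4", ".mov")  # assumed set of video file extensions


def remove_small_videos(folder, min_bytes=MIN_BYTES):
    """Delete video files smaller than min_bytes and return their names."""
    removed = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # skip sub-folders
        if not name.lower().endswith(VIDEO_EXTENSIONS):
            continue  # leave pictures and other files alone
        if os.path.getsize(path) < min_bytes:
            os.remove(path)
            removed.append(name)
    return removed
```

Run it on a blog's folder, for example `remove_small_videos("nixcraft")`, and then start the downloader again so the deleted files are fetched once more.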
The second method: download the source code and run it (free, easy)
This way is actually quite simple as well.
## Environment installation
### First install a Python environment; the latest version is fine
#### Download the code and install dependencies
```bash
$ git clone https://github.com/xuanhun/tumblr-crawler.git
$ cd tumblr-crawler
$ pip install -r requirements.txt
```
You're done, skip to the next section to configure and run it.
## Configure and run
There are two ways to specify the site you want to download, either by editing `sites.txt` or by specifying command line parameters.
### First method: edit sites.txt file (recommended)
Find a text editor, open the file `sites.txt`, and enter the Tumblr sites you want to download, separated by commas, with no spaces and no `.tumblr.com` suffix. For example, if you want to download _vogue.tumblr.com_ and _gucci.tumblr.com_, the file looks like this:
```
vogue,gucci
```
Then save the file and either double-click `tumblr-photo-video-ripper.py` or run `python tumblr-photo-video-ripper.py` in the terminal.
### Second method: use command line arguments (only for users who can use the OS terminal)
If you are familiar with the command line on Windows or Unix systems, you can specify the sites to download as a command-line argument at runtime:
```bash
python tumblr-photo-video-ripper.py site1,site2
```
Site names are comma separated, no spaces, and do not need the '.tumblr.com' suffix.
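To make the two ways of specifying sites concrete, here is a rough sketch of how a script could pick up the site list. It is not taken from `tumblr-photo-video-ripper.py`: the function names `parse_sites` and `resolve_sites` are hypothetical, and giving the command-line argument priority over `sites.txt` is an assumption.

```python
def parse_sites(text):
    """Split a comma-separated list of blog names, ignoring blanks."""
    return [name.strip() for name in text.split(",") if name.strip()]


def resolve_sites(argv, sites_file="sites.txt"):
    """Use the names given on the command line, else read them from the file."""
    if len(argv) > 1:
        # argv[0] is the script name; argv[1] is the "site1,site2" argument.
        return parse_sites(argv[1])
    with open(sites_file, encoding="utf-8") as handle:
        return parse_sites(handle.read())
```

Inside a script you would call it as `resolve_sites(sys.argv)` after `import sys`.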
### Downloading and saving of site images/videos
When the program runs, by default it creates a folder with the same name as the Tumblr blog under the current path, and the photos and videos are placed in that folder.
The script does not re-download images and videos that have already been downloaded, so you don't have to worry about duplicates. Running it multiple times can also help you recover pictures and videos that were missed or deleted.
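The "no duplicate downloads" behavior can be pictured roughly like this. It is a minimal sketch, assuming each file is saved under the last path segment of its media URL inside the blog's folder; the real script may name files differently, and `target_path` and `pending_downloads` are hypothetical names.

```python
import os
from urllib.parse import urlparse


def target_path(site, media_url, root="."):
    """Local path for media_url inside the folder named after the blog."""
    filename = os.path.basename(urlparse(media_url).path)
    return os.path.join(root, site, filename)


def pending_downloads(site, media_urls, root="."):
    """Return the (url, path) pairs whose file is not on disk yet."""
    os.makedirs(os.path.join(root, site), exist_ok=True)
    pending = []
    for url in media_urls:
        path = target_path(site, url, root)
        if not os.path.exists(path):  # already downloaded files are skipped
            pending.append((url, path))
    return pending
```

Because the check is only "does the file exist", a file you deleted, or one that was missed on an earlier run, shows up as pending again on the next run.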
### Use of proxies (optional)
If you cannot access Tumblr directly, or are not using a VPN, you will need to configure a proxy.
Refer to `./proxies_sample1.json` and `./proxies_sample2.json` for the format of the file.
Then write your proxy information in JSON format to `./proxies.json`.
You can visit <http://jsonlint.com/> to make sure your formatting is correct.
If `./proxies.json` is empty, no proxy will be used during the download.
If you are using Shadowsocks in global mode as your proxy, your `./proxies.json` file can be written as follows:
```json
{
"http": "socks5://127.0.0.1:1080",
"https": "socks5://127.0.0.1:1080"
}
```
Then re-run the download command.
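If you modify the code yourself, the proxy rule described in this section (empty file means no proxy) can be sketched as follows. The function name `load_proxies` is hypothetical, and treating a missing file the same as an empty one is an assumption.

```python
import json
import os


def load_proxies(path="./proxies.json"):
    """Return the proxy mapping from path, or None if the file is missing or empty."""
    if not os.path.isfile(path):
        return None
    with open(path, encoding="utf-8") as handle:
        text = handle.read().strip()
    if not text:
        return None  # empty file: download without a proxy
    proxies = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(proxies, dict):
        raise ValueError("proxies.json must contain a JSON object")
    return proxies
```

A mapping in this shape is what the `proxies=` argument of the `requests` library accepts, if that is the HTTP client you use; for `socks5://` entries `requests` additionally needs its SOCKS extra (`pip install requests[socks]`).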
The third method: learn the principles and do whatever you want (most rewarding, most freedom)
Just open the link below. The basics I explained before haven't changed; download the latest code and modify it.