ProxyPool

A highly available, easy-to-deploy, stable, and easy-to-extend asynchronous IP proxy pool with proxy validation.

Project address

Hogan-TR/ProxyPool

Features

  • High availability: Instead of the conventional design in which a single database handles all data, the project keeps two data pools, chaos and stable. The chaos pool cleans newly collected raw data, and the stable pool backs the interface that serves proxies. This improves robustness: new, unverified data entering the database no longer degrades proxy quality, which lowers the chance of failed web requests.
  • Easy deployment: Docker container deployment lets you bring the service up quickly and lowers the learning cost of running a proxy pool.
  • Long-term stability: The scoring mechanism is freely configurable. The initial score, minimum allowed score, maximum score, and migration conditions of the chaos pool and the stable pool can each be set in the config file, which keeps computing resource usage under control while maintaining proxy quality.
  • Easy to extend: Because the rules for scraping open proxy sources on the Internet go out of date quickly, the crawler module lets you use the provided request function to write your own crawling rules for specific websites, each returning a generator of proxy data.
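The two-pool scoring idea and the generator-style crawling rule described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the project's actual code: the constant names, their values, the +1/-1 scoring steps, and the names crawl_example and TwoStagePool are all assumptions made for this example (the real project stores its data in Redis and reads its settings from the config file).

```python
import re
from typing import Dict, Iterator

# Illustrative values only; the project's real settings live in its config file.
INITIAL_SCORE = 5   # score a proxy receives when it enters a pool
MIN_SCORE = 0       # a proxy whose score falls to this value is discarded
MAX_SCORE = 10      # scores are capped at this value
PROMOTE_SCORE = 8   # chaos-pool score required to migrate to the stable pool

PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})\b")


def crawl_example(page_text: str) -> Iterator[str]:
    """Hypothetical crawling rule: yield 'ip:port' strings found in page text."""
    for ip, port in PROXY_RE.findall(page_text):
        yield f"{ip}:{port}"


class TwoStagePool:
    """In-memory stand-in for the chaos and stable pools."""

    def __init__(self) -> None:
        self.chaos: Dict[str, int] = {}
        self.stable: Dict[str, int] = {}

    def add(self, proxy: str) -> None:
        """New, unverified proxies always start in the chaos pool."""
        if proxy not in self.chaos and proxy not in self.stable:
            self.chaos[proxy] = INITIAL_SCORE

    def report(self, proxy: str, ok: bool) -> None:
        """Update a proxy's score after one validation attempt."""
        pool = self.stable if proxy in self.stable else self.chaos
        if proxy not in pool:
            return
        score = min(MAX_SCORE, pool[proxy] + 1) if ok else pool[proxy] - 1
        if score <= MIN_SCORE:
            del pool[proxy]  # too unreliable: drop it
        elif pool is self.chaos and score >= PROMOTE_SCORE:
            del self.chaos[proxy]  # migrate chaos -> stable
            self.stable[proxy] = INITIAL_SCORE
        else:
            pool[proxy] = score
```

In this sketch only the stable dictionary would ever be exposed to callers, so freshly crawled proxies cannot affect what the API serves until they have passed enough checks.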

Running

Docker deployment

  1. Docker installation

  2. Docker-compose installation (installing via pip is recommended)

  3. Download the current repository code locally

    git clone https://github.com/Hogan-TR/ProxyPool.git
  4. Modify the configuration in ./proxypool/config.py

    REDIS_HOST = "127.0.0.1"   # replace the contents of REDIS_HOST with the internal IP of the current machine
    
    # Other configurations can be modified as required
  5. Modify the configuration in docker-compose.yml

    Modify the port mapping under ports in main to change the local port on which the API is exposed. The default port is 5000

  6. Run the docker-compose up command to start the proxy pool
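Step 4 asks for the internal IP of the current machine, presumably because 127.0.0.1 inside a container refers to the container itself rather than the host. If you are unsure which address that is, the following standard-library sketch is one way to guess it. The function name guess_internal_ip is hypothetical and not part of the project, and on machines with several network interfaces the result may not be the address you want, so treat the output as a suggestion to verify.

```python
import socket


def guess_internal_ip(fallback: str = "127.0.0.1") -> str:
    """Return the local address the OS would use for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a route.
        # 192.0.2.1 is a documentation-only address (TEST-NET-1).
        sock.connect(("192.0.2.1", 80))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, no network): fall back to loopback.
        return fallback
    finally:
        sock.close()


if __name__ == "__main__":
    # Print a line that can be pasted into proxypool/config.py.
    print(f'REDIS_HOST = "{guess_internal_ip()}"')
```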

Native deployment (Unix-like systems only)

  1. Prerequisites: a Python 3 + Redis environment

  2. Download the current repository code locally

    git clone https://github.com/Hogan-TR/ProxyPool.git
  3. Create a virtual environment and install dependencies

    cd ProxyPool
    python3 -m venv venv
    source ./venv/bin/activate
    pip install -r requirements.txt
  4. Modify the Redis parameters REDIS_HOST, REDIS_PORT, and REDIS_PASSWORD in proxypool/config.py to match your Redis configuration

  5. Run the sudo python run.py command to start the proxy pool
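Before running step 5, it can help to confirm that the Redis address entered in step 4 accepts connections at all. The sketch below uses only the standard library. The function name redis_reachable is hypothetical, and the host and port shown are placeholder values (6379 is Redis's default port), not necessarily what your config.py contains. It checks TCP reachability only and does not verify REDIS_PASSWORD.

```python
import socket

# Placeholder values; use the ones from your proxypool/config.py.
REDIS_HOST = "127.0.0.1"
REDIS_PORT = 6379  # Redis's default port


def redis_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    state = "reachable" if redis_reachable(REDIS_HOST, REDIS_PORT) else "NOT reachable"
    print(f"Redis at {REDIS_HOST}:{REDIS_PORT} is {state}")
```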

Note: after the proxy pool is started for the first time, it needs about 10 minutes to crawl and clean data before it can provide high-quality proxies

Implementation diagram

Finally: everyone is welcome to clone and test the project and to open issues. If you like it, consider giving it a star 😝