Write your own PHP proxy IP crawler project step by step (2)

Posted by tomhilton on Mon, 30 Dec 2019 16:18:04 +0100

In this chapter, we officially start our crawler project. First, we need to know which websites offer free proxy IPs. Popular ones at present include Xici proxy (Xici Daili) and Fast proxy (Kuaidaili). Here we take Xici proxy as an example.

Official website of Xici proxy: http://www.xicidaili.com

The site lists free IP addresses and port numbers. Our task is to crawl these IP addresses and port numbers, check whether they are usable, and store them.
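As a rough illustration of the extraction step, here is a minimal sketch that pulls IP and port pairs out of an HTML table with a regular expression. The sample markup, the addresses (taken from documentation ranges) and the function name extract_proxies are assumptions made for illustration; the real page layout may differ, and availability checking and storage are left out.

```php
<?php
// Hypothetical sample markup; the real page layout may differ.
$html = <<<HTML
<table>
  <tr><td>203.0.113.10</td><td>8080</td><td>HTTP</td></tr>
  <tr><td>198.51.100.7</td><td>3128</td><td>HTTPS</td></tr>
</table>
HTML;

/**
 * Extract "ip:port" strings from adjacent <td> cells.
 *
 * @param string $html Page source to scan
 * @return array List of "ip:port" strings
 */
function extract_proxies($html)
{
    // An IPv4-looking cell immediately followed by a numeric cell
    $pattern = '#<td>\s*(\d{1,3}(?:\.\d{1,3}){3})\s*</td>\s*<td>\s*(\d{1,5})\s*</td>#';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $proxies = [];
    foreach ($matches as $m) {
        $port = (int) $m[2];
        // Keep only syntactically valid IPv4 addresses and port numbers
        if (filter_var($m[1], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) && $port >= 1 && $port <= 65535) {
            $proxies[] = $m[1] . ':' . $port;
        }
    }
    return $proxies;
}

print_r(extract_proxies($html));
```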

First of all, we need to write an entry file, which we call run.php. Its content is roughly as follows:

use ProxyPool\core\ProxyPool;

$proxy = new ProxyPool();
$proxy->run();

This instantiates ProxyPool and calls its run method. Because the code relies on a namespace and a use statement, we need an autoloader, which loads the corresponding file automatically based on the class name.

The code, saved as autoloader.php, is as follows:

<?php
namespace AutoLoad;

class autoloader
{
    /**
     * Load a class file based on its fully qualified name.
     *
     * @param string $name Fully qualified class name, e.g. ProxyPool\core\ProxyPool
     * @return boolean True if the file was found and the class is now defined
     */
    public static function load_namespace($name)
    {
        // Use the platform's directory separator so this works on Windows and Linux
        $class_path = str_replace('\\', DIRECTORY_SEPARATOR, $name);

        // Strip the leading "ProxyPool" and look for the file relative to this directory
        $class_file = __DIR__ . substr($class_path, strlen('ProxyPool')) . '.php';

        // If it is not there, look relative to the parent directory instead
        if (!is_file($class_file))
        {
            $class_file = __DIR__ . DIRECTORY_SEPARATOR . '..' . DIRECTORY_SEPARATOR . "$class_path.php";
        }

        if (is_file($class_file))
        {
            require_once($class_file);
            if (class_exists($name, false))
            {
                return true;
            }
        }
        return false;
    }
}
// Register the loader with SPL
spl_autoload_register('\AutoLoad\autoloader::load_namespace');
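
To see which files the loader above will try for a given class, the lookup can be reproduced as a small standalone function. This is only a sketch: candidate_paths, the $sep parameter (a stand-in for DIRECTORY_SEPARATOR so the output is the same on every platform) and the /var/www/ProxyPool directory are hypothetical.

```php
<?php
/**
 * Return the candidate file paths for a class name, in lookup order.
 *
 * @param string $name     Fully qualified class name
 * @param string $base_dir Stand-in for __DIR__ of the autoloader file (assumed)
 * @param string $sep      Directory separator to use
 * @return array Two paths: the primary candidate, then the fallback
 */
function candidate_paths($name, $base_dir, $sep = '/')
{
    $class_path = str_replace('\\', $sep, $name);
    return [
        // First try: autoloader directory, with the "ProxyPool" prefix stripped
        $base_dir . substr($class_path, strlen('ProxyPool')) . '.php',
        // Fallback: full namespace path under the parent directory
        $base_dir . $sep . '..' . $sep . $class_path . '.php',
    ];
}

print_r(candidate_paths('ProxyPool\core\ProxyPool', '/var/www/ProxyPool'));
```

With the autoloader stored in /var/www/ProxyPool, the first candidate is /var/www/ProxyPool/core/ProxyPool.php and the fallback is /var/www/ProxyPool/../ProxyPool/core/ProxyPool.php.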

Then we go back and modify run.php so that it loads the autoloader first:

<?php
require_once __DIR__ . '/autoloader.php';

use ProxyPool\core\ProxyPool;

$proxy = new ProxyPool();
$proxy->run();

This way, we can use our own classes directly through their namespaces, without requiring each file by hand.
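
For the lookup to succeed, the class file has to sit where the autoloader expects it and declare the matching namespace. The ProxyPool class itself is not shown in this chapter, so the following is only a hypothetical placeholder, assumed to live in core/ProxyPool.php next to autoloader.php; its run method just prints a message.

```php
<?php
// Hypothetical file: core/ProxyPool.php (a placeholder, not the author's implementation)
namespace ProxyPool\core;

class ProxyPool
{
    /**
     * Entry point called from run.php. The real crawl, check and store
     * logic is not part of this chapter, so this only prints a message.
     */
    public function run()
    {
        echo "ProxyPool started\n";
    }
}
```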
