Original address: copyfuture.com/blogs-detai…

This article explains how to quickly determine whether an IPV4 address is an IP address in mainland China.

The number of IPV4 addresses in China is now about 340 million.

Github repository address, source code and IP address are in it. Github.com/chenhaoxian…

The latest date is November 30, 2019 for all IP ranges within China.

There will be continuous updates and maintenance

Welcome everyone star

The simplest way, consume a lot of memory, rich method.

Store all 340 million IP addresses in memory in Set.

In terms of 32 bytes per IP address, it takes about 10 GIGABytes.

I won’t go into this method.

Split the IP into four segments. A, B, C, D

It can save 3/4 of the space compared with method 1, but it still needs a lot of memory.

The storage is stored in a tree structure. When a matches, B matches, and then C matches

The IP address is in the set only when the ABCD segment matches completely

IPV4 address range in CIDR format

CIDR and Address Block calculation IP address: IP address ::= {< network prefix >, < host number >}/ Number of network prefixes (slash notation)

In CIDR notation, any IP address is equivalent to a CIDR address block, which implements route aggregation

With this approach, the blocks of IPV4 addresses represented in CIDR format are in the tens of thousands, and the storage is sufficient.

You can determine whether the IP is in the IP set by storing the IPV4 start and end of the CIDR and comparing the actual IP to whether it is in the CIDR block

The details will not be explained. Personally, I think it is more troublesome to use this way, because there is a more convenient method

Is the best way I know so far to determine if an IP is in some IP range, if you have a better way, feel free to comment

We can know that an IP address can be stored by 32 bytes, and 32 bytes can be converted into long digits. According to the idea of method 3, if the IP address is searched in a range way, the storage space is very small and the efficiency is very high.

So you can store blocks of CIDR IP addresses by storing the starting IP number followed by the range number (or the starting and ending number)

Method four core classes:

package com.uifuture.chinaipfilter;
import com.uifuture.chinaipfilter.util.FileUtils;
import org.apache.commons.net.util.SubnetUtils;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;


public class ChinaIPRecognizer {
    
    private static final long RANGE_SIZE = 10000000L;
    
    private static Map<Integer, List<ChinaRecord>> recordMap = new HashMap<>();
    private static final String CHINA_IP_PATH = "china_ip/ip_long.txt";

    final static class ChinaRecord {
        
        public long start;
        
        public int count;

        private ChinaRecord(long start, int count) {
            this.start = start;
            this.count = count;
        }

        
        public boolean contains(long ipValue) {
            return ipValue >= start && ipValue <= start + count;
        }
    }

    static {
        List<ChinaRecord> list = new ArrayList<>();
        
        Set<String> ipSet = FileUtils.readSetFromResourceFile(CHINA_IP_PATH);
        for (String ip : ipSet) {
            String[] ipS = ip.split(",");
            long ipLong = Long.parseLong(ipS[0]);
            int count = Integer.parseInt(ipS[1]);
            ChinaRecord chinaRecord = new ChinaRecord(ipLong,count);
            list.add(chinaRecord);
        }

        list.forEach(r -> {
            int key1 = (int) (r.start / RANGE_SIZE);
            int key2 = (int) ((r.start + r.count) / RANGE_SIZE);
            List<ChinaRecord> key1List = recordMap.getOrDefault(key1, new ArrayList<>());
            key1List.add(r);
            recordMap.put(key1, key1List);
            if (key2 > key1) {
                List<ChinaRecord> key2List = recordMap.getOrDefault(key2, new ArrayList<>());
                key2List.add(r);
                recordMap.put(key2, key2List);
            }
        });
    }

    
    public static boolean isCNIP(String ip) {
        if (ip == null || ip.trim().isEmpty()) {
            return false;
        }

        if (isValidIpV4Address(ip)) {
            long value = ipToLong(ip);
            int key = (int) (value / RANGE_SIZE);
            if (recordMap.containsKey(key)) {
                List<ChinaRecord> list = recordMap.get(key);
                return list.stream().anyMatch((ChinaRecord r) -> r.contains(value));
            }
        }

        return false;
    }

    
    public static boolean isValidIpV4Address(String value) {

        int periods = 0;
        int i;
        int length = value.length();

        if (length > 15) {
            return false;
        }
        char c;
        StringBuilder word = new StringBuilder();
        for (i = 0; i < length; i++) {
            c = value.charAt(i);
            if (c == '.') {
                periods++;
                if (periods > 3) {
                    return false;
                }
                if (word.length() == 0) {
                    return false;
                }
                if (Integer.parseInt(word.toString()) > 255) {
                    return false;
                }
                word.delete(0, word.length());
            } else if (!Character.isDigit(c)) {
                return false;
            } else {
                if (word.length() > 2) {
                    return false;
                }
                word.append(c);
            }
        }

        if (word.length() == 0 || Integer.parseInt(word.toString()) > 255) {
            return false;
        }

        return periods == 3;
    }

    
    public static long ipToLong(String ipAddress) {
        String[] addrArray = ipAddress.split("\\.");

        long num = 0;
        for (int i = 0; i < addrArray.length; i++) {
            int power = 3 - i;
            
            
            
            
            num += ((Integer.parseInt(addrArray[i]) % 256 * Math.pow(256, power)));
        }
        return num;
    }

    
    public static String longToIp(long i) {
        return ((i >> 24) & 0xFF) + "." + ((i >> 16) & 0xFF) + "." + ((i >> 8) & 0xFF) + "." + (i & 0xFF);
    }

    
    public static String[] analysisCidrIp(String ip){
        SubnetUtils utils = new SubnetUtils(ip);
        return utils.getInfo().getAllAddresses();
    }

    
    public static void main(String[] args) {
        
        Set<String> ipSet = FileUtils.readSetFromResourceFile("china_ip/cidr.txt");
        Set<String> saveIp = new HashSet<>();
        for (String ip : ipSet) {
            
            String[] ips = analysisCidrIp(ip);
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            for (String s : ips) {
                long ipL = ipToLong(s);
                if(ipL>max){
                    max=ipL;
                }
                if(ipL<min){
                    min=ipL;
                }
            }
            if(max==Long.MIN_VALUE && min==Long.MAX_VALUE){
                continue;
            }
            if(max==Long.MIN_VALUE){
                max = min;
            }
            String save =min+","+(max-min);
            saveIp.add(save);
            System.out.println(save);
        }
        FileUtils.saveSetToResourceFile(saveIp,"china_ip/ip_long.txt");
    }
}
Copy the code

Original address: copyfuture.com/blogs-detai…