Communication across a network is carried by TCP, and the exchange between a browser and a server follows HTTP on top of it. The following is therefore a TCP server written in Python that parses the requests a browser sends over HTTP and answers them. The server first returns a standard HTML page. The browser then sends further HTTP requests for the links in that HTML to get the corresponding images, videos, Flash, JavaScript scripts, CSS and other resources. The server parses each of these requests and returns the matching file, so that the browser can finally display a complete page.
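The first thing the server has to do with each request is read its first line. As a minimal sketch (the helper name `parse_request_line` and the shortened sample request are illustrative assumptions, not part of the server below), the request line can be split into its three parts like this:

```python
def parse_request_line(raw_request: bytes):
    """Return (method, path, version) from the first line of an HTTP request."""
    # iso-8859-1 maps every byte to a character, so decoding never fails.
    text = raw_request.decode("iso-8859-1")
    # The request line ends at the first \r\n.
    request_line = text.split("\r\n", 1)[0]
    method, path, version = request_line.split(" ", 2)
    return method, path, version


# Shortened, hypothetical request similar to what a browser sends.
sample = (
    b"GET /images/qt-logo.png HTTP/1.1\r\n"
    b"Host: 127.0.0.1:7890\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)
print(parse_request_line(sample))  # ('GET', '/images/qt-logo.png', 'HTTP/1.1')
```

The server in the next section extracts the same path with a regular expression; splitting on spaces is simply another way to get at it.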

1. Code for the web server: parse the user's request and return the corresponding result

#coding=utf-8
import socket
import re

def handle_client(client_socket):
    "Serve one client"
    recv_data = client_socket.recv(1024).decode('gbk', errors="ignore")  # ignore bytes that cannot be decoded
    # Although the client can send further requests based on the links in the returned HTML,
    # the server must parse each request, find the requested file, read it,
    # and send it to the client in order to answer correctly. A raw request looks like this:
    # b'GET /images/qt-logo.png HTTP/1.1\r\n
    #   Host: 127.0.0.1:7890\r\n
    #   Connection: keep-alive\r\n
    #   User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36\r\n
    #   Accept: image/webp,image/apng,image/*,*/*;q=0.8\r\n
    #   Referer: http://127.0.0.1:7890/\r\n
    #   Accept-Encoding: gzip, deflate, br\r\n
    #   Accept-Language: zh-CN,zh;q=0.9\r\n\r\n'
    request_header_lines = recv_data.splitlines()  # 1. Split the received HTTP data into lines; the separator is \r\n
    if not request_header_lines:  # the client connected but sent nothing
        client_socket.close()
        return
    for line in request_header_lines:  # print is for test use
        print(line)

    http_request_line = request_header_lines[0]  # 2. Get the first line of the request, for example: GET /images/qt-logo.png HTTP/1.1
    get_file_name = re.match(r"[^/]+(/[^ ]*)", http_request_line).group(1)  # 3. Extract the requested path, for example /images/qt-logo.png
    print("file name is ===>%s" % get_file_name)  # for test

    # If the request does not specify a page, serve the default page index.html.
    # Such a request looks like: GET / HTTP/1.1
    if get_file_name == "/":
        get_file_name = DOCUMENTS_ROOT + "/index.html"
    else:
        get_file_name = DOCUMENTS_ROOT + get_file_name

    print("file name is ===2>%s" % get_file_name) #for test

    try:
        f = open(get_file_name, "rb")
    except IOError:  # the file does not exist, so answer 404 Not Found
        # 404 means the requested page is not available
        response_headers = "HTTP/1.1 404 Not Found\r\n"
        response_headers += "\r\n"
        response_body = "====sorry ,file not found====".encode('gbk')  # bytes, so it can be sent like file data
    else:  # note the use of else here: return 200 OK if the file was found
        response_headers = "HTTP/1.1 200 OK\r\n"
        response_headers += "\r\n"
        response_body = f.read()  # if the file is too large, it can be read in a loop
        f.close()

    finally:
        # This is where the response is sent to the client.
        # The header is built as a string, while the file is read in binary mode,
        # so the two cannot simply be concatenated; they are sent separately.
        # Send the response header first
        client_socket.sendall(response_headers.encode('gbk'))  # note: if testing on Linux, change to utf-8
        # Then send the body
        client_socket.sendall(response_body)
        client_socket.close()

def main():
    "Main entry point of the program"
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow port 7788 to be bound again immediately the next time the program is run
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_socket.bind(("", 7788))
    server_socket.listen(128)
    while True:
        client_socket, client_addr = server_socket.accept()
        handle_client(client_socket)


# Server configuration
DOCUMENTS_ROOT = "./html"  # The pre-written web page files are stored in the html folder under the current path, so a fixed root path is defined here.

if __name__ == "__main__":
    main()
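The comment in the code notes that a large file can be read in a loop instead of with a single `f.read()`. One way to do that is sketched below. The function name `send_file_in_chunks` and the default chunk size of 4096 bytes are assumptions made for illustration, not part of the server above.

```python
def send_file_in_chunks(client_socket, file_path, chunk_size=4096):
    """Send a file over a socket piece by piece; return the number of bytes sent."""
    total = 0
    with open(file_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            # sendall keeps sending until the whole chunk has been written
            client_socket.sendall(chunk)
            total += len(chunk)
    return total
```

In the server, this call would take the place of reading the whole body into `response_body`, and it would come after the response header has been sent. Only one chunk is held in memory at a time.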

2. Test display and analysis

Start the program above, open a web browser, and access the server. Note that the earlier version of our server returned a fixed page to the browser and did not parse the requested data. Now that the request is parsed, an address that does not correspond to a file on the server, such as 192.168.1.1:7788/aaaa/bbbb/ccc, gets a 404 NOT FOUND response.
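The choice between 200 and 404 can be isolated in a small helper. The sketch below is one possible arrangement. The names `resolve_path` and `status_line_for` are hypothetical, and the check that rejects paths leaving the document root (for example `/../secret`) is an extra precaution that the server above does not perform.

```python
import os


def resolve_path(document_root, request_path):
    """Map a request path to a file under document_root, or return None."""
    if request_path == "/":
        request_path = "/index.html"  # default page
    root = os.path.realpath(document_root)
    candidate = os.path.realpath(os.path.join(root, request_path.lstrip("/")))
    # Assumption: refuse anything that resolves outside the document root.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate if os.path.isfile(candidate) else None


def status_line_for(document_root, request_path):
    """Return the HTTP status line matching the request path."""
    found = resolve_path(document_root, request_path)
    return "HTTP/1.1 200 OK\r\n" if found else "HTTP/1.1 404 Not Found\r\n"
```

With an `html` folder that contains `index.html`, `status_line_for("./html", "/")` yields the 200 status line, while `status_line_for("./html", "/aaaa/bbbb/ccc")` yields the 404 one.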

2.1 Incorrect Access Address

2.2 Send requests to the server normally

At this point we need to request files that are actually deployed on the server. For example, since the server is configured to serve index.html by default, accessing http://192.168.1.1:7788/index.html or http://192.168.1.1:7788 directly from the browser gives the same result, as shown below. Note that the images on the page have loaded properly. The client follows the various links inside the HTML, sends further HTTP requests to the server for the corresponding pictures, videos, Flash, JavaScript scripts, CSS and other resources, and finally displays a complete page.
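To see which extra requests a page will trigger, the links can be collected from the HTML in roughly the way a browser finds them. The sketch below uses the standard library `html.parser`. The class name `ResourceCollector` and the sample page are assumptions for illustration, and a real browser handles many more tags and attributes.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect URLs a browser would fetch after loading a page (simplified)."""

    # tag name -> attribute that holds the resource URL
    URL_ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.resources.append(value)


# Hypothetical page that references one stylesheet and two images.
page = """
<html><head><link rel="stylesheet" href="classic.css"></head>
<body>
<img src="images/qt-logo.png">
<img src="images/trolltech-logo.png">
</body></html>
"""

collector = ResourceCollector()
collector.feed(page)
print(collector.resources)
# ['classic.css', 'images/qt-logo.png', 'images/trolltech-logo.png']
```

Each collected URL corresponds to one additional GET request that the server has to parse and answer.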

2.3 The browser's requests and the parsing results, as displayed on the server:

You can see that the page was requested only once, but because of the URLs embedded in the page, the browser then sent several more requests and received the correct response to each.

GET / HTTP/1.1
Host: 192.168.1.1:7788
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9

file name is ===>/
file name is ===2>./html/index.html
GET /classic.css HTTP/1.1
Host: 192.168.1.1:7788
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36
Accept: text/css,*/*;q=0.1
Referer: http://192.168.1.1:7788/
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9

file name is ===>/classic.css
file name is ===2>./html/classic.css
GET /images/qt-logo.png HTTP/1.1
Host: 192.168.1.1:7788
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36
Accept: image/webp,image/apng,image/*,*/*;q=0.8
Referer: http://192.168.1.1:7788/
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9

file name is ===>/images/qt-logo.png
file name is ===2>./html/images/qt-logo.png
GET /images/trolltech-logo.png HTTP/1.1
Host: 192.168.1.1:7788
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36
Accept: image/webp,image/apng,image/*,*/*;q=0.8
Referer: http://192.168.1.1:7788/
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9

file name is ===>/images/trolltech-logo.png
file name is ===2>./html/images/trolltech-logo.png