
The Internet is all around us today, like air. It’s changing our lives in countless ways, but the core technology of the Internet has changed very little.

With open source culture thriving, there are many excellent open source Web frameworks that make development easier. At the same time, we feel we can never stop chasing new frameworks. In fact, however much they change, the essentials stay the same. Once we understand the core technology of a Web framework, a new framework's fundamentals will look familiar, and we only need to focus on two questions: what are its distinguishing features, and which techniques does it use to solve which pain points? That makes new technologies easier to accept and understand, and far less exhausting.

If you only know how to use Web frameworks, have you opened a framework's source code countless times, wanting to learn and improve, without knowing where to start?

Today we will peel back the layers and strip away the complexity: with a single file we will implement a mini Web framework and use it to explain the core technology clearly. The source code is open source.

GitHub address: github.com/521xueweiha…

Online: hellogithub.com/onefile/

If you find this work helpful, please give it a ✨Star and share it so that more people can benefit.

Without further ado, let’s begin our journey of improvement today.

1. The principles

If we started from the OSI seven-layer network model, I'm sure fewer than 30 percent of readers would make it to the end!

So today we'll go straight to the top layer, the HTTP application layer that Web frameworks deal with most, and touch on TCP/IP only briefly when we talk about sockets. Along the way I will deliberately leave out technical details that the explanation does not need and cut anything unrelated to this installment's topic: one file, one technical point! Please read on with confidence.

First of all, let’s recall the normal process of browsing the website.

If we compare browsing the Internet to attending a lesson in a classroom, the teacher is the server and the students are the clients. A student with a question first raises a hand (requests a TCP connection). The teacher notices and agrees to take the question (the connection is established), and the student stands up and asks it (sends the request). If the teacher has promised to award class-performance points for asking questions, the students need an efficient way of asking (the request format), namely:

  • First, state the student ID
  • Then ask the question

After receiving the question, the teacher can answer it immediately without having to ask for the student ID (returns the response). The answer format (response format) is as follows:

  • First, answer the question
  • Then award points according to the student ID!

With an agreed question format (the protocol), the teacher no longer has to ask for the student ID every time, which is both efficient and rigorous. Finally, once the question is answered, the teacher asks the student to sit down (closes the connection).

In fact, our communication flow on the Internet is similar:

The difference is that machines enforce the rules more strictly. People write software that follows a protocol so that programs can communicate with each other, and this network protocol is called HTTP.

What our Web framework has to do is handle the process above: establish the connection, receive the request, parse the request, process the request, and return the response.

That's all for the principles. Just remember that communicating on the Web takes two big steps: establishing the connection (so that communication is possible) and handling the request.

A framework takes care of the things that have to be done in most cases, so the Web framework we write will handle two things:

  • Processing connections (Sockets)
  • Processing requests

Keep in mind that a connection and a request are two different things.

To establish a connection and start communicating, you need a socket. A socket can be understood as two virtual books (file handles), one in the hands of each side of the communication. Each book can be both read and written: whatever one side writes into its book (processing requests), the other side can read.
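To make the "two books" picture concrete, here is a minimal sketch that uses `socket.socketpair()` to get two already-connected sockets and wraps each one in a file object with `makefile()`. The function name `exchange` is made up for illustration; it is not part of the framework.

```python
import socket


def exchange(message: bytes) -> bytes:
    """Write `message` into one end of a connected socket pair and read it from the other."""
    left, right = socket.socketpair()  # two sockets that are already connected to each other
    try:
        # Wrap each socket in a file object: the "book" held by each side
        with left.makefile('wb') as writer, right.makefile('rb') as reader:
            writer.write(message + b'\n')
            writer.flush()  # push the buffered bytes onto the connection
            return reader.readline().rstrip(b'\n')
    finally:
        left.close()
        right.close()


print(exchange(b'HelloGitHub'))
```

Whatever is written through one file object can be read through the other, which is the property the framework relies on later when it reads requests and writes responses.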

I'll break the Web framework into two parts, and all the code will be implemented in Python 3, which is easy to understand.

2. Writing a Web framework

The code plus comments comes to 457 lines in total, so rest assured that it is simple to understand.

2.1 Handling Connections (HTTPServer)

A quick word about sockets. At the programming-language level, a socket is a library for handling connections and network communication. At the operating-system level, it is essentially a communication endpoint provided by the system. A computer can keep many lines of communication open at once, so behind each listening port number there is a separate socket, and these sockets are independent and do not interfere with each other. That is why we specify a port number when starting a service.
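As a small illustration of "one socket per port", the sketch below opens two listening sockets and lets the operating system choose a free port for each by binding to port 0. The helper name `open_listener` is hypothetical.

```python
import socket


def open_listener(port: int = 0) -> socket.socket:
    """Return a TCP socket listening on localhost; port 0 lets the OS pick a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(('127.0.0.1', port))
    s.listen(1)
    return s


a = open_listener()
b = open_listener()
port_a = a.getsockname()[1]  # the port the OS actually assigned
port_b = b.getsockname()[1]
print(port_a != port_b)  # two independent listeners, two different ports
a.close()
b.close()
```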

Finally, a server is just a better-equipped computer that is always on, and a client is a browser, phone, or computer. Both have sockets, provided at the operating-system level.

To write a Web framework that handles connections, you need to understand the steps and processes of socket processing.

The following shows the server.py and client.py code based on sockets, respectively.

# coding: utf-8
# Server-side code (server.py)
import socket

print("I'm the server!")
HOST = ''  # empty string: listen on all available interfaces
PORT = 50007
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # create a TCP socket object
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # let the port be reused right after a restart
s.bind((HOST, PORT))  # bind the address
s.listen(1)  # 1 is the maximum number of pending connections the operating system will queue
print('Listening on port:', PORT)
while True:
    conn, _ = s.accept()  # passively accept a connection from a TCP client
    data = conn.recv(1024)  # 1024 is the buffer size: the most bytes to read in one call
    print('Received:', repr(data))
    conn.sendall(b'Hi, ' + data)  # send data back to the client
    conn.close()

Because HTTP is built on the relatively reliable TCP protocol, TCP Socket objects are created here.

# coding: utf-8
# Client code (client.py)
import socket

print("I'm the client!")
HOST = 'localhost'  # address of the server
PORT = 50007  # port of the server to connect to
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
print("Send 'HelloGitHub'")
s.sendall(b'HelloGitHub')  # send 'HelloGitHub' to the server
data = s.recv(1024)
s.close()
print('Received', repr(data))  # print the data received from the server

The running effect is as follows:

With the code above, the flow of establishing communication over a socket is easier to understand:

  1. socket: creates a socket
  2. bind: binds the port number
  3. listen: starts listening
  4. accept: accepts a connection
  5. recv: receives data
  6. close: closes the connection
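The six steps can also be exercised in a single script. This is a minimal sketch that runs the server side in a background thread and binds to port 0 so it never collides with a port already in use; the names `serve_once` and `round_trip` are invented for illustration.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: accept -> recv -> sendall -> close, for a single client."""
    conn, _ = listener.accept()
    with conn:  # closes the connection when the block ends
        data = conn.recv(1024)
        conn.sendall(b'Hi, ' + data)


def round_trip(message: bytes) -> bytes:
    # Server side: socket -> bind -> listen
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(('127.0.0.1', 0))  # port 0: let the OS choose a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()

    # Client side: connect -> sendall -> recv until the server closes -> close
    chunks = []
    with socket.create_connection(('127.0.0.1', port)) as client:
        client.sendall(message)
        while True:
            chunk = client.recv(1024)
            if not chunk:  # empty bytes means the server closed the connection
                break
            chunks.append(chunk)

    worker.join()
    listener.close()
    return b''.join(chunks)


print(round_trip(b'HelloGitHub'))
```

The client keeps calling `recv` until it gets an empty result, because a single `recv` call is not guaranteed to return the whole reply.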

From this we get the HTTPServer class, the part of the Web framework that handles connections. Its __init__ method creates the socket, binds the port (server_bind), and starts listening on it (server_activate).

# Handles connections for data communication
import socket


class HTTPServer(object):
    def __init__(self, server_address, RequestHandlerClass):
        self.server_address = server_address  # server address
        self.RequestHandlerClass = RequestHandlerClass  # class that handles requests

        # Create a TCP socket
        self.socket = socket.socket(socket.AF_INET,
                                    socket.SOCK_STREAM)
        # Bind the socket to the port
        self.server_bind()
        # Start listening on the port
        self.server_activate()

As you can see from the RequestHandlerClass parameter passed in, processing the request is handled separately from establishing the connection.
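The snippet above calls server_bind and server_activate without showing them. Below is a minimal sketch of what the two methods might look like, modelled on the methods of the same name in the standard library's `socketserver.TCPServer`. The class name `MiniServer`, the backlog value, and `server_close` are assumptions made for this example, not the article's actual code.

```python
import socket


class MiniServer:
    """Hypothetical stand-in for HTTPServer that shows only the bind and listen steps."""

    request_queue_size = 5  # assumed backlog: how many pending connections the OS may queue

    def __init__(self, server_address):
        self.server_address = server_address
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.server_bind()
        self.server_activate()

    def server_bind(self):
        # Allow a quick restart on the same port, then bind the address
        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind(self.server_address)
        # Record the address actually bound (useful when port 0 was requested)
        self.server_address = self.socket.getsockname()

    def server_activate(self):
        # Start listening for incoming connections
        self.socket.listen(self.request_queue_size)

    def server_close(self):
        self.socket.close()


server = MiniServer(('127.0.0.1', 0))
bound_port = server.server_address[1]
print('listening on port', bound_port)
server.server_close()
```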

The serve_forever method of the HTTPServer class starts the service: it loops, accepts each incoming connection, and hands it to RequestHandlerClass.

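def serve_forever(self, poll_interval=0.5):
    # Needs `import selectors` at the top of the file
    with selectors.DefaultSelector() as selector:
        # Watch the listening socket for incoming connections
        selector.register(self.socket, selectors.EVENT_READ)
        while True:
            ready = selector.select(poll_interval)
            # When a client connection is waiting, move on to the next step
            if ready:
                # The listening socket is readable, so a client connection can be accepted
                request, client_address = self.socket.accept()
                try:
                    # RequestHandlerClass handles the request, separately from the connection handling
                    self.RequestHandlerClass(request, client_address, self)
                finally:
                    # Close the client connection (not the listening socket)
                    request.close()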

Wait, is that really all the code HTTPServer needs to handle connections and set up HTTP communication? Yes! Isn't that easy?

The RequestHandlerClass parameter in this code is the class that handles requests. The corresponding HTTPRequestHandler, which handles HTTP requests, is described in detail next.

2.2 Processing Requests (HTTPRequestHandler)

Remember how the socket introduced above lets the two ends communicate? Through two readable and writable “virtual books”.

In addition, to keep communication efficient and rigorous, there has to be an agreed “communication format”.

So, there are only three steps to processing the request:

  1. setup: initializes the two books

    • File handle for reading the request (rfile)
    • File handle for writing the response (wfile)
  2. handle: reads and parses the request, processes it, constructs the response, and writes it

  3. finish: returns the response, destroys both books to release resources, then dusts itself off and waits for the next request

Corresponding code:

# Handles requests
class HTTPRequestHandler(object):
    def __init__(self, request, client_address, server):
        self.request = request  # the incoming request (a socket)
        # 1. Initialize the two books
        self.setup()
        try:
            # 2. Read, parse, and process the request; construct the response
            self.handle()
        finally:
            # 3. Return the response and release resources
            self.finish()

    def setup(self):
        self.rfile = self.request.makefile('rb', -1)  # the book for reading the request
        self.wfile = self.request.makefile('wb', 0)  # the book for writing the response

    def handle(self):
        # Parse the request according to the HTTP protocol
        # Run the concrete processing logic, i.e. the business logic
        # Construct the response and write it to the book
        pass

    def finish(self):
        # Return the response
        self.wfile.flush()
        # Close the request and response handles to release resources
        self.wfile.close()
        self.rfile.close()

That is the overall flow of processing a request. Next, I'll explain how handle parses the HTTP request and constructs the HTTP response, and how the framework is kept separate from the specific business code (the processing logic).

Before parsing HTTP, we need to look at an actual HTTP request. When I open the homepage of hellogithub.com, the browser sends the following HTTP request:

The HTTP request format can be summarized as follows:

{HTTP method} {PATH} {HTTP version}\r\n
{header field name}:{field value}\r\n
...
\r\n
{request body}
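To see the format in action, here is a minimal sketch that splits a raw request into its parts using plain string handling. The function name `parse_request` and the sample header values are made up for illustration, and edge cases such as repeated or folded headers are ignored.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into (method, path, version, headers, body)."""
    # The blank line (\r\n\r\n) separates the head from the body
    head, _, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('iso-8859-1').split('\r\n')
    # First line: {HTTP method} {PATH} {HTTP version}
    method, path, version = lines[0].split()
    # Remaining lines: {header field name}:{field value}
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body


raw = (b'GET / HTTP/1.1\r\n'
       b'Host: hellogithub.com\r\n'
       b'Accept: text/html\r\n'
       b'\r\n')
print(parse_request(raw))
```

Header names are lowercased here because HTTP header field names are case-insensitive, which makes lookups simpler.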

Given the request format, handle can now parse the request.

def handle(self):
    # -- Start parsing -- #
    self.raw_requestline = self.rfile.readline(65537)  # read the first line of the request, i.e. the request line
    requestline = str(self.raw_requestline, 'iso-8859-1')  # decode the bytes
    requestline = requestline.rstrip('\r\n')  # strip the trailing line break
    # e.g. GET / HTTP/1.1
    self.command, self.path, self.request_version = requestline.split()
    # Split the string on spaces: ("GET", "/", "HTTP/1.1")
    # command is the HTTP method and path is the request path
    # request_version is the HTTP version; different versions have different parsing rules
    self.headers = self.parse_headers()  # parsing the headers is also string handling; the library has utility functions for it, which are more involved and omitted here
    # -- Business logic -- #
    # do_{HTTP method} is the corresponding handler function
    mname = ('do_' + self.command).lower()
    method = getattr(self, mname)
    # Call the corresponding handler
    method()
    # -- Return the response -- #
    self.wfile.flush()

def do_get(self):
    # Dispatch on the path
    if self.path == '/':
        content = 'Hello, HelloGitHub!'  # placeholder page content; the original snippet leaves it undefined
        body = content.encode('utf-8')
        self.send_response(200)  # status code
        # Add the response headers
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))  # length in bytes
        self.end_headers()  # writes the blank line '\r\n' that ends the headers
        self.wfile.write(body)  # write the response body, i.e. the page content

def send_response(self, code, message=None):
    """Response format:

    {HTTP version} {status code} {status phrase}\r\n
    {header field name}:{field value}\r\n
    ...
    \r\n
    {response body}
    """
    if message is None:
        message = HTTPStatus(code).phrase  # needs `from http import HTTPStatus` at the top of the file
    # Write the status line
    self.wfile.write(("%s %d %s\r\n" % ("HTTP/1.1", code, message)).encode('iso-8859-1'))
    # Add the response headers
    self.send_header('Server', "HG/Python ")
    self.send_header('Date', self.date_time_string())

That is the core code for handling a request and returning a response in handle. Now that HTTPRequestHandler is complete, let's see how to run it.
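The response format can be checked in isolation too. The sketch below builds a complete response as bytes, with the status phrase looked up from the standard library's `http.HTTPStatus`. The function name `build_response` is hypothetical and only illustrates the layout; the framework itself writes the pieces to wfile one at a time.

```python
from http import HTTPStatus


def build_response(code: int, body: str,
                   content_type: str = 'text/html; charset=utf-8') -> bytes:
    """Assemble status line, headers, blank line, and body into one byte string."""
    payload = body.encode('utf-8')
    # {HTTP version} {status code} {status phrase}
    status_line = f'HTTP/1.1 {code} {HTTPStatus(code).phrase}\r\n'
    # {header field name}: {field value}; Content-Length counts bytes, not characters
    headers = (f'Content-Type: {content_type}\r\n'
               f'Content-Length: {len(payload)}\r\n')
    # A blank line ends the headers, then the body follows
    return (status_line + headers + '\r\n').encode('iso-8859-1') + payload


print(build_response(200, 'Hello'))
```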

2.3 Running

class RequestHandler(HTTPRequestHandler):
    # Handle GET requests
    def do_get(self):
        # Each path maps to a specific handler method
        if self.path == '/':
            self.handle_index()
        elif self.path.startswith('/favicon'):
            self.handle_favicon()
        else:
            self.send_error(404)

if __name__ == '__main__':
    server = HTTPServer(('', 8080), RequestHandler)
    # Start the service
    server.serve_forever()

Here, RequestHandler inherits from the framework's HTTPRequestHandler and overrides its do_get method, which separates the business code from the framework. This keeps the framework flexible and decoupled.
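The separation rests on looking up a method by name with getattr. The sketch below strips that idea down to its core, with no networking at all. The class names, the `dispatch` method, and the returned strings are invented for illustration.

```python
class BaseHandler:
    """Hypothetical framework side: knows how to dispatch, not what to do."""

    def dispatch(self, command: str, path: str) -> str:
        # 'GET' -> 'do_get', 'POST' -> 'do_post', ...
        method = getattr(self, ('do_' + command).lower(), None)
        if method is None:
            return '501 Unsupported method'
        return method(path)


class AppHandler(BaseHandler):
    """Hypothetical business side: only defines what each HTTP method does."""

    def do_get(self, path: str) -> str:
        return '200 index' if path == '/' else '404 not found'


handler = AppHandler()
print(handler.dispatch('GET', '/'))
print(handler.dispatch('POST', '/'))
```

Supporting a new HTTP method then only requires adding a do_ method to the subclass; the framework class stays untouched.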

The service then starts up as expected, with the following effect:

The Web framework code in this article has been simplified for ease of reading. To get the full working code, go to GitHub:

github.com/521xueweiha…

This framework does not offer the rich feature set of a full Web framework. The goal is to implement a mini Web framework with the simplest possible code, so that readers who do not yet understand the basic structure of a Web framework have something to explore.

If this article has piqued your interest in Web frameworks and you want to dig into more complete frameworks that are suitable for production and whose code and structure are just as concise, here is my recommended learning path:

  1. Python3 HTTPServer, BaseHTTPRequestHandler

  2. Bottle: an open source, single-file Web framework with no third-party dependencies that is actively maintained and can be used in production:

    • Address: github.com/bottlepy/bo…
  3. werkzeug -> flask

  4. starlette -> uvicorn -> fastapi
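For the first step on this path, here is a minimal sketch that uses the standard library's `http.server.HTTPServer` and `BaseHTTPRequestHandler` to serve exactly one request and fetch it back from the same script. The names `HelloHandler` and `fetch_once` and the response text are made up for this example. Note that the standard library dispatches to `do_GET` with the method name in upper case.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'Hello from http.server'
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain; charset=utf-8')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default request logging for this demo


def fetch_once() -> bytes:
    server = HTTPServer(('127.0.0.1', 0), HelloHandler)  # port 0: the OS picks a free port
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)  # serve exactly one request
    worker.start()

    conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
    conn.request('GET', '/')
    data = conn.getresponse().read()
    conn.close()

    worker.join()
    server.server_close()
    return data


print(fetch_once())
```

Reading the source of these two classes alongside this article is a natural next step, since they follow the same split between connection handling and request handling.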

Sometimes we read framework source code not to write a new framework, but to learn from and get closer to those who came before us.

Finally

New technologies never stop appearing. Mastering the core principles lets you absorb new knowledge faster and get straight to the point when troubleshooting.

This kind of article explains one technical point at a time: it tries to describe the principle through simple text and concise code, leaves out incidental technical details to focus on a single technology, and ends with complete, runnable open source code. Is it to your taste? This article is my first attempt at a new series, and pointers and criticism are welcome.

If you like this kind of article, please give it a like as a little encouragement, and leave a comment with suggestions or requests for topics.

OneFile looks forward to your participation; click to contribute.

Don’t think about what you do for open source, just be clear about what you do for yourself.