httpbin, a project by Kenneth Reitz (the author of Requests), is an HTTP protocol demo site built with Flask. Studying it offers two small gains:

  1. Learn how to make a website using Flask
  2. Learn some details about the HTTP protocol

Before we start, readers who are not yet familiar with Flask may want to read the earlier articles in this series:

  • Flask source code reading (part 1)
  • Flask source code reading (part 2)
  • Flask source code reading (part 3)

Httpbin project structure

We use httpbin v0.7.0. The project structure is as follows:

Module            Function
templates         Template files
core              Feature implementation
filters           Some decorator implementations
helpers           Some helper classes
structures        Data structure implementation
utils             Some utility classes
Dockerfile        Docker image file
test_httpbin.py   Unit test cases

Using httpbin

httpbin can be tried directly at httpbin.org. The site interactively demonstrates many uses of HTTP, for example a GET request:

  • Data is requested with the HTTP GET method
  • Setting accept: application/json in the request header asks for JSON output
  • The status code, headers, and body of the response are displayed

We can also observe the same exchange with curl in a terminal:

curl -v -X GET "https://httpbin.org/get" -H "accept: application/json"
...
< HTTP/2 200
< date: Sun, 09 Jan 2022 12:34:55 GMT
< content-type: application/json
< content-length: 269
< server: gunicorn/19.9.0
< access-control-allow-origin: *
< access-control-allow-credentials: true
<
{
  "args": {},
  "headers": {
    "Accept": "application/json",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.64.1",
    "X-Amzn-Trace-Id": "Root=1-61dad66f-2405a8151152a4664c258b05"
  },
  "origin": "111.201.135.46",
  "url": "https://httpbin.org/get"
}

The -v flag makes curl print the details of the request and response exchange.

Comparing the two shows that the curl output matches the data presented on the website. The httpbin site also demonstrates many other HTTP methods that you can try for yourself.

How httpbin is implemented

Httpbin deployment

The Dockerfile describes how httpbin is deployed and run with gunicorn:

# Set the environment variable
ENV WEB_CONCURRENCY=4

# Add the project code
ADD . /httpbin

# Install dependencies
RUN apk add -U ca-certificates libffi libstdc++ && \
    apk add --virtual build-deps build-base libffi-dev && \
    # Pip
    pip install --no-cache-dir gunicorn /httpbin && \
    # Cleaning up
    apk del build-deps && \
    rm -rf /var/cache/apk/*

# Expose the port
EXPOSE 8080

# Start the service
CMD ["gunicorn", "-b", "0.0.0.0:8080", "httpbin:app"]

gunicorn launches httpbin:app, the app object provided by the core module of the httpbin package:

...
# Find the correct template folder when running from a different location
tmpl_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'templates')

app = Flask(__name__, template_folder=tmpl_dir)
...

  • When the app is created, Flask's template folder is set to the templates directory, which sits in the same directory as the core module.

Implementation of the GET API

The /get API returns the url, args, headers, and origin of the request, and serializes the result as JSON:

@app.route('/get', methods=('GET',))
def view_get():
    """Returns GET Data."""

    return jsonify(get_dict('url', 'args', 'headers', 'origin'))

This jsonify wraps the jsonify function that Flask provides and only appends a trailing newline to the default output:

from flask import jsonify as flask_jsonify

def jsonify(*args, **kwargs):
    response = flask_jsonify(*args, **kwargs)
    if not response.data.endswith(b'\n'):
        response.data += b'\n'
    return response
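The effect of this wrapper can be tried without Flask. The following is a minimal sketch, not httpbin's code: to_json_body is a hypothetical stand-in for the body Flask would produce, and ensure_trailing_newline mirrors the endswith check above.

```python
import json


def to_json_body(payload):
    """Serialize payload to JSON bytes (hypothetical stand-in for a Flask response body)."""
    return json.dumps(payload).encode("utf-8")


def ensure_trailing_newline(body):
    """Append a newline only if the body does not already end with one."""
    if not body.endswith(b"\n"):
        body += b"\n"
    return body


body = ensure_trailing_newline(to_json_body({"origin": "127.0.0.1"}))
print(body)
```

A trailing newline is mostly a convenience for terminal users: after curl prints the body, the shell prompt starts on its own line.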

Flask's request object is a context-local proxy bound to the request being handled in the current thread, so get_dict does not need the request passed in as a parameter:

def get_dict(*keys, **extras):
    """Returns request dict of given keys."""

    _keys = ('url', 'args', 'form', 'data', 'origin', 'headers', 'files', 'json', 'method')

    assert all(map(_keys.__contains__, keys))
    ...
    d = dict(
        url=get_url(request),
        args=semiflatten(request.args),
        form=form,
        data=json_safe(data),
        origin=request.headers.get('X-Forwarded-For', request.remote_addr),
        headers=get_headers(),
        files=get_files(),
        json=_json,
        method=request.method,
    )

    out_d = dict()

    # Keep only the requested keys
    for key in keys:
        out_d[key] = d.get(key)

    out_d.update(extras)

    return out_d
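The selection step at the end of get_dict can be illustrated in isolation. This is a minimal sketch under assumptions: pick_keys and request_data are hypothetical names, and the dictionary stands in for the values that httpbin reads from flask.request.

```python
def pick_keys(available, *keys, **extras):
    """Return only the requested keys from `available`, plus any extras."""
    unknown = [key for key in keys if key not in available]
    if unknown:
        raise KeyError(f"unsupported keys: {unknown}")
    selected = {key: available[key] for key in keys}
    selected.update(extras)
    return selected


# Hypothetical request data; in httpbin these values come from flask.request.
request_data = {
    "url": "http://localhost/get",
    "args": {},
    "origin": "127.0.0.1",
    "method": "GET",
}

result = pick_keys(request_data, "url", "origin", gzipped=True)
print(result)
```

The extras mechanism is what lets a view such as /gzip add a flag like gzipped=True on top of the standard request fields.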

The following command demonstrates the args field. The query string name=game404&age=18 is automatically converted into the args dictionary:

curl -X GET "https://httpbin.org/get?name=game404&age=18"
{
  "args": {
    "age": "18",
    "name": "game404"
  },
  "headers": {
    "Accept": "*/*",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.64.1",
    "X-Amzn-Trace-Id": "Root=1-61dadb92-7bd4d2a3130e8df54f2ebeb4"
  },
  "origin": "111.201.135.46",
  "url": "https://httpbin.org/get?name=game404&age=18"
}

HTTP is a text-based protocol and a query string carries no type information, so the age parameter arrives as the string "18", not as a number.
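The same behaviour can be reproduced with the standard library. The sketch below is an illustration, not httpbin's semiflatten: flatten_query is a hypothetical helper that unwraps single-value parameters and keeps repeated parameters as lists.

```python
from urllib.parse import parse_qs, urlsplit


def flatten_query(url):
    """Parse the query string; unwrap single-item lists, keep repeated keys as lists."""
    parsed = parse_qs(urlsplit(url).query)
    return {key: values[0] if len(values) == 1 else values
            for key, values in parsed.items()}


args = flatten_query("https://httpbin.org/get?name=game404&age=18")
print(args)             # every value is a string, including age
age = int(args["age"])  # the application converts explicitly when it needs a number
```

If an endpoint needs numbers, the conversion (and the error handling for bad input) belongs to the application, not to the protocol.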

httpbin also implements two Flask request hooks that act like middleware. One of them is after_request, which handles cross-origin (CORS) concerns once the view has produced a response by adding two CORS headers to it:

@app.after_request
def set_cors_headers(response):
    response.headers['Access-Control-Allow-Origin'] = request.headers.get('Origin', '*')
    response.headers['Access-Control-Allow-Credentials'] = 'true'
    ...
    return response
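To make the header logic easier to see, here is a minimal sketch that uses plain dictionaries instead of Flask objects. cors_headers is a hypothetical helper, and the request headers are made-up inputs.

```python
def cors_headers(request_headers):
    """Build the two CORS response headers from the incoming request headers."""
    return {
        # Echo the caller's Origin if present, otherwise allow any origin.
        "Access-Control-Allow-Origin": request_headers.get("Origin", "*"),
        "Access-Control-Allow-Credentials": "true",
    }


from_browser = cors_headers({"Origin": "https://stackoverflow.com"})
from_curl = cors_headers({})
print(from_browser)
print(from_curl)
```

Echoing the Origin header instead of always answering with * matters for browsers: a credentialed cross-origin request is rejected when the allowed origin is the wildcard.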

You can verify this in the Chrome browser console:

var xmlHttp = new XMLHttpRequest();
xmlHttp.open("GET", "https://httpbin.org/get", false);
xmlHttp.send(null);
xmlHttp.status
200
xmlHttp.responseText;
'{\n "args": {}, \n "headers": {\n "Accept": "*/*", \n "Accept-Encoding": "gzip, deflate, br", \n "Accept-Language": "en,zh;q=0.9,zh-TW;q=0.8,zh-CN;q=0.7", \n "Host": "httpbin.org", \n "Origin": "https://stackoverflow.com", \n "Referer": "https://stackoverflow.com/", \n "Sec-Ch-Ua": "\\" Not A;Brand\\";v=\\"99\\", \\"Chromium\\";v=\\"96\\", \\"Google Chrome\\";v=\\"96\\"", \n "Sec-Ch-Ua-Mobile": "?0", \n "Sec-Ch-Ua-Platform": "\\"macOS\\"", \n "Sec-Fetch-Dest": "empty", \n "Sec-Fetch-Mode": "cors", \n "Sec-Fetch-Site": "cross-site", \n "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", \n "X-Amzn-Trace-Id": "Root=1-61dadc7c-70a2cde54a07ab3a6df28d5c"\n }, \n "origin": "111.201.135.46", \n "url": "https://httpbin.org/get"\n}\n'

SEO

SEO support mattered a great deal in the Web 2.0 era, because search engines can bring in a lot of traffic for free. robots.txt is a convention between websites and search engine crawlers, and httpbin provides a simple implementation:

@app.route('/robots.txt')
def view_robots_page():
    """Simple Html Page"""

    response = make_response()
    response.data = ROBOT_TXT
    response.content_type = "text/plain"
    return response

  • robots.txt is served as plain text

The content of robots.txt tells crawlers not to access the /deny path:

ROBOT_TXT = """User-agent: *
Disallow: /deny
"""

If you ignore robots.txt and visit /deny anyway, httpbin gets angry. Try it and see for yourself.

Kenneth Reitz's design here is quite playful. The code even attaches an F-word message to status code 402, which shows the author's lively side.
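To see how a well-behaved crawler would read the robots.txt rules above, we can feed the same text to Python's standard urllib.robotparser. This is a small sketch; "example-bot" is a hypothetical crawler name.

```python
from urllib.robotparser import RobotFileParser

ROBOT_TXT = """User-agent: *
Disallow: /deny
"""

parser = RobotFileParser()
parser.parse(ROBOT_TXT.splitlines())

# "example-bot" is a hypothetical crawler name.
allowed_get = parser.can_fetch("example-bot", "https://httpbin.org/get")
allowed_deny = parser.can_fetch("example-bot", "https://httpbin.org/deny")
print(allowed_get, allowed_deny)
```

Note that robots.txt is only a convention: the server does not enforce it, which is why /deny is still reachable for anyone who chooses to ignore the rule.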

Compression

httpbin demonstrates HTTP compression with three algorithms: gzip, deflate, and Brotli. Here is the gzip implementation:

@app.route('/gzip')
@filters.gzip
def view_gzip_encoded_content():
    """Returns GZip-Encoded Data."""

    return jsonify(get_dict(
        'origin', 'headers', method=request.method, gzipped=True))

The gzip filter is implemented as a decorator:

import gzip as gzip2
from io import BytesIO

from decorator import decorator
from flask import Response

@decorator
def gzip(f, *args, **kwargs):
    """GZip Flask Response Decorator."""

    data = f(*args, **kwargs)

    if isinstance(data, Response):
        content = data.data
    else:
        content = data

    gzip_buffer = BytesIO()
    gzip_file = gzip2.GzipFile(
        mode='wb',
        compresslevel=4,
        fileobj=gzip_buffer
    )
    gzip_file.write(content)
    gzip_file.close()

    gzip_data = gzip_buffer.getvalue()

    if isinstance(data, Response):
        data.data = gzip_data
        data.headers['Content-Encoding'] = 'gzip'
        data.headers['Content-Length'] = str(len(data.data))

        return data

    return gzip_data

  • The data is compressed with gzip
  • After compression, two response headers are updated: Content-Encoding and Content-Length

Notably, the gzip decorator here is built with the decorator library. Unlike an ordinary hand-written decorator, the library bills itself as "decorators for humans". Its core feature is that no nested function layers are needed: the first argument of the decorating function is the wrapped function, and args and kwargs are the arguments passed to that wrapped function at call time.
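For comparison, this is roughly what the same idea looks like as an ordinary nested decorator written with functools.wraps from the standard library. It is a simplified sketch that only handles functions returning bytes, not Flask Response objects; gzip_bytes and hello are hypothetical names.

```python
import functools
import gzip
from io import BytesIO


def gzip_bytes(func):
    """Plain nested decorator: compress the bytes returned by func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        content = func(*args, **kwargs)
        buffer = BytesIO()
        with gzip.GzipFile(mode="wb", compresslevel=4, fileobj=buffer) as gz:
            gz.write(content)
        return buffer.getvalue()

    return wrapper


@gzip_bytes
def hello():
    """Return a small payload (hypothetical example function)."""
    return b'{"gzipped": true}\n'


compressed = hello()
print(gzip.decompress(compressed))
```

The extra wrapper layer and the explicit functools.wraps call are exactly the boilerplate that the decorator library hides.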

Basic Auth

httpbin also provides an implementation of HTTP Basic authentication. With Basic authentication the browser itself shows a prompt for the user name and password, and the user can continue once the credentials are accepted:

Here’s the code:

@app.route('/basic-auth/<user>/<passwd>')
def basic_auth(user='user', passwd='passwd'):
    """Prompts the user for authorization using HTTP Basic Auth."""

    if not check_basic_auth(user, passwd):
        return status_code(401)

    return jsonify(authenticated=True, user=user)

...

def check_basic_auth(user, passwd):
    """Checks user authentication using HTTP Basic Auth."""

    auth = request.authorization
    # Compare the supplied credentials with the ones in the URL
    return auth and auth.username == user and auth.password == passwd

Using curl makes it easier to track the process:

curl -v -X GET "https://httpbin.org/basic-auth/game_404/123456" -H "accept: application/json"
...
< HTTP/2 401
< date: Sun, 09 Jan 2022 13:33:00 GMT
< content-length: 0
< server: gunicorn/19.9.0
< www-authenticate: Basic realm="Fake Realm"
< access-control-allow-origin: *
< access-control-allow-credentials: true

You can see that the first request receives a 401 together with a www-authenticate header, which comes from the status code map:

code_map = {
    ...
    401: dict(headers={'WWW-Authenticate': 'Basic realm="Fake Realm"'}),
    ...
}    

When a browser receives this response, it automatically shows a dialog asking for the user name and password. Once the user enters valid credentials, authentication succeeds. The application does not have to implement this dialog itself.
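What the browser sends after the dialog is an Authorization header containing the Base64-encoded "user:password" pair. The sketch below shows both directions using only the standard library; basic_auth_header and parse_basic_auth are hypothetical helpers, not part of httpbin or Flask.

```python
import base64


def basic_auth_header(user, passwd):
    """Build the Authorization header value a client sends for Basic auth."""
    token = base64.b64encode(f"{user}:{passwd}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def parse_basic_auth(header):
    """Return (user, passwd) from an Authorization header, or None if it is not Basic."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic" or not token:
        return None
    user, _, passwd = base64.b64decode(token).decode("utf-8").partition(":")
    return user, passwd


header = basic_auth_header("game_404", "123456")
print(header)
print(parse_basic_auth(header))
```

Base64 is an encoding, not encryption, so Basic authentication should only be used over HTTPS. With curl, the -u user:password option sends this header for you.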

Streaming

Streaming sends an HTTP response piece by piece, which is useful for cases such as file downloads. The implementation is as follows:

@app.route('/stream/<int:n>')
def stream_n_messages(n):
    """Stream n JSON messages"""
    response = get_dict('url', 'args', 'headers', 'origin')
    n = min(n, 100)

    def generate_stream():
        for i in range(n):
            response['id'] = i
            # Emit one JSON document per line
            yield json.dumps(response) + '\n'

    return Response(generate_stream(), headers={
        "Content-Type": "application/json",
    })

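In a test we can see that a single response is received in multiple chunks, so a large payload can be delivered incrementally instead of being built in memory first. Resuming an interrupted download additionally depends on HTTP range requests, which streaming alone does not provide.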

can't parse JSON. Raw result:
{"url": "https://httpbin.org/stream/3", "args": {}, "headers": {"Host": "httpbin.org", "X-Amzn-Trace-Id": "Root=1-61dae496-15998ef6666f82c444ca483c", "Sec-Ch-Ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"96\", \"Google Chrome\";v=\"96\"", "Accept": "application/json", "Sec-Ch-Ua-Mobile": "?0", "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", "Sec-Ch-Ua-Platform": "\"macOS\"", "Sec-Fetch-Site": "same-origin", "Sec-Fetch-Mode": "cors", "Sec-Fetch-Dest": "empty", "Referer": "https://httpbin.org/", "Accept-Encoding": "gzip, deflate, br", "Accept-Language": "en,zh;q=0.9,zh-TW;q=0.8,zh-CN;q=0.7"}, "origin": "111.201.135.46", "id": 0}
{"url": "https://httpbin.org/stream/3", "args": {}, "headers": {"Host": "httpbin.org", "X-Amzn-Trace-Id": "Root=1-61dae496-15998ef6666f82c444ca483c", "Sec-Ch-Ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"96\", \"Google Chrome\";v=\"96\"", "Accept": "application/json", "Sec-Ch-Ua-Mobile": "?0", "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", "Sec-Ch-Ua-Platform": "\"macOS\"", "Sec-Fetch-Site": "same-origin", "Sec-Fetch-Mode": "cors", "Sec-Fetch-Dest": "empty", "Referer": "https://httpbin.org/", "Accept-Encoding": "gzip, deflate, br", "Accept-Language": "en,zh;q=0.9,zh-TW;q=0.8,zh-CN;q=0.7"}, "origin": "111.201.135.46", "id": 1}
{"url": "https://httpbin.org/stream/3", "args": {}, "headers": {"Host": "httpbin.org", "X-Amzn-Trace-Id": "Root=1-61dae496-15998ef6666f82c444ca483c", "Sec-Ch-Ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"96\", \"Google Chrome\";v=\"96\"", "Accept": "application/json", "Sec-Ch-Ua-Mobile": "?0", "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", "Sec-Ch-Ua-Platform": "\"macOS\"", "Sec-Fetch-Site": "same-origin", "Sec-Fetch-Mode": "cors", "Sec-Fetch-Dest": "empty", "Referer": "https://httpbin.org/", "Accept-Encoding": "gzip, deflate, br", "Accept-Language": "en,zh;q=0.9,zh-TW;q=0.8,zh-CN;q=0.7"}, "origin": "111.201.135.46", "id": 2}
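The "can't parse JSON" message in that output has a simple explanation: the body is several JSON documents separated by newlines, not one document. The sketch below reproduces this with a generator; the payload is a hypothetical stand-in for the request details httpbin returns.

```python
import json


def generate_stream(base, n):
    """Yield n newline-terminated JSON documents, each with its own id."""
    for i in range(n):
        message = dict(base, id=i)
        yield json.dumps(message) + "\n"


# Hypothetical payload standing in for the request details httpbin returns.
body = "".join(generate_stream({"url": "https://httpbin.org/stream/3"}, 3))

try:
    json.loads(body)  # the whole body is not a single JSON document
    parsed_whole = True
except json.JSONDecodeError:
    parsed_whole = False

# Parsing line by line works, which is how a streaming client would consume it.
messages = [json.loads(line) for line in body.splitlines()]
ids = [message["id"] for message in messages]
print(parsed_whole, ids)
```

A client that reads the response incrementally can handle each line as soon as it arrives instead of waiting for the complete body.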

httpbin contains other HTTP examples that you can explore for yourself; this article does not go through all of them.

Unit testing

The unit test for the /get API shows how to test an HTTP interface with unittest:

import json
import unittest

import httpbin


class HttpbinTestCase(unittest.TestCase):
    """Httpbin tests"""

    def setUp(self):
        httpbin.app.debug = True
        self.app = httpbin.app.test_client()
        
    def test_get(self):
        response = self.app.get('/get', headers={'User-Agent': 'test'})
        self.assertEqual(response.status_code, 200)
        data = json.loads(response.data.decode('utf-8'))
        self.assertEqual(data['args'], {})
        self.assertEqual(data['headers']['Host'], 'localhost')
        self.assertEqual(data['headers']['Content-Length'], '0')
        self.assertEqual(data['headers']['User-Agent'], 'test')
        # self.assertEqual(data['origin'], None)
        self.assertEqual(data['url'], 'http://localhost/get')
        self.assertTrue(response.data.endswith(b'\n'))

  • In the setUp method, httpbin.app.test_client() returns a test client that simulates the service
  • self.app.get('/get', headers={'User-Agent': 'test'}) simulates a request
  • The response object behaves like a real HTTP response

This style of unit testing is efficient because no HTTP server has to be started. A similar approach works in the Django framework.
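The reason no server is needed is that a WSGI application is just a callable, so a test can invoke it directly. The following is a minimal sketch of that idea using only the standard library. tiny_app and call_app are hypothetical; Flask's test client is far more complete.

```python
import json
import unittest
from wsgiref.util import setup_testing_defaults


def tiny_app(environ, start_response):
    """Hypothetical WSGI app that echoes the request method as JSON."""
    body = (json.dumps({"method": environ["REQUEST_METHOD"]}) + "\n").encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, path):
    """Invoke a WSGI app directly, with no socket or server involved."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD=GET and other keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    data = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], data


class TinyAppTestCase(unittest.TestCase):
    def test_get(self):
        status, headers, data = call_app(tiny_app, "/get")
        self.assertEqual(status, "200 OK")
        self.assertEqual(headers["Content-Type"], "application/json")
        self.assertEqual(json.loads(data), {"method": "GET"})
        self.assertTrue(data.endswith(b"\n"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TinyAppTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the call never touches the network, tests like this run quickly and do not depend on a free port.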

Summary

In this article we read the source code of the httpbin website, which is built on the Flask framework, and looked at some implementation details of the HTTP protocol. I hope it helps you get a better grasp of HTTP.

Tip

utils provides a neat weighted random selection algorithm:

def weighted_choice(choices):
    """Returns a value from choices chosen by weighted random selection

    choices should be a list of (value, weight) tuples.

    eg. weighted_choice([('val1', 5), ('val2', 0.3), ('val3', 1)])

    """
    values, weights = zip(*choices)
    total = 0
    cum_weights = []
    for w in weights:
        total += w
        cum_weights.append(total)
    # Pick a random float in the total weight range
    x = random.uniform(0, total)
    # Binary search for the interval that contains x
    i = bisect.bisect(cum_weights, x)
    return values[i]
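To see why the binary search gives a weighted result, it helps to fix the random number and look at the intervals. The sketch below is an illustration rather than httpbin's code: pick is a hypothetical helper that takes x as a parameter so the outcome is deterministic.

```python
import bisect
import itertools
import random
from collections import Counter


def pick(choices, x):
    """Return the value whose cumulative-weight interval contains x."""
    values, weights = zip(*choices)
    cum_weights = list(itertools.accumulate(weights))
    return values[bisect.bisect(cum_weights, x)]


choices = [("val1", 5), ("val2", 0.3), ("val3", 1)]
# Cumulative weights are roughly [5, 5.3, 6.3], so the intervals are
# [0, 5) -> val1, [5, 5.3) -> val2, [5.3, 6.3) -> val3.
fixed = (pick(choices, 2.0), pick(choices, 5.1), pick(choices, 6.0))
print(fixed)

# Sampling many times should favour val1, since it owns the widest interval.
total = sum(weight for _, weight in choices)
rng = random.Random(0)
counts = Counter(pick(choices, rng.uniform(0, total)) for _ in range(10_000))
print(counts.most_common())
```

Each value owns a slice of the number line proportional to its weight, so a uniformly random point lands in a slice with probability weight / total.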

Reference links

  • httpbin.org/
  • en.wikipedia.org/wiki/Robots…