Update: added video tutorials – WEB system testing, the HTTP protocol in PHP, OkHttp framework analysis and application, Getting started with the Requests library, Introduction to interface testing basics, JMeter HTTP interface performance testing. Updated on 2017-04-03

To make it easier for interested readers to maintain these HTTP resources together, I have put them on GitHub. Further recommendations are welcome, thank you!

B/S structure definition

The Browser/Server structure, or B/S structure, differs from the C/S (Client/Server) structure in that the client needs no special software, only a browser. The browser interacts with the database through the web server, so it works conveniently across platforms. The server side can run on high-performance machines with large databases such as Oracle, Sybase, and Informix. The B/S structure simplifies the client's work; it emerged with the rise of Internet technology as an improvement on C/S. The trade-off is that the server carries more of the load and needs higher performance. – Wikipedia

URI (Uniform Resource Identifier)

In computing, a Uniform Resource Identifier (URI) is a string that identifies the name of an Internet resource. This identifier lets users interact with resources on a network (typically the World Wide Web) over specific protocols. The most common form of URI is the Uniform Resource Locator (URL), informally known as a web address. A rarer form is the Uniform Resource Name (URN), which complements URLs by identifying resources within a particular namespace. – Wikipedia

URI grammar

A URI consists of a scheme name (such as “http”, “ftp”, “mailto”, or “file”), a colon, and scheme-specific content. Each scheme defines the syntax and semantics of its own content, but all schemes must follow the general rules of URI grammar, which reserve certain special characters for special purposes. – Wikipedia

Examples of URIs and their components are shown below:

                    authority               path
        ┌───────────────┴───────────────┐┌───┴────┐
  abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1
  └┬┘   └───────┬───────┘ └────┬────┘ └┬┘           └─────────┬─────────┘ └──┬──┘
scheme  user information   hostname  port             query parameters   fragment
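As a rough illustration of the components labelled above, the following sketch splits a URI of the same shape using Python's standard `urllib.parse` module. The helper name `split_uri` and the host `example.com` are placeholders chosen for this example only.

```python
from urllib.parse import urlsplit, parse_qs


def split_uri(uri):
    """Split a URI into the components labelled in the diagram."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "hostname": parts.hostname,
        "port": parts.port,            # parsed as an int, or None if absent
        "path": parts.path,
        "query": parse_qs(parts.query),  # each key maps to a list of values
        "fragment": parts.fragment,
    }


components = split_uri(
    "abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1"
)
for name, value in components.items():
    print(f"{name}: {value}")
```

The user information, host name, and port together form the authority; `urlsplit` exposes them as separate attributes so they do not have to be cut apart by hand.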

MIME

Multipurpose Internet Mail Extensions (MIME) is a way of associating files that have a given extension with the application that opens them: when such a file is accessed, the browser automatically opens it with the specified application. It is mostly used to specify client-defined file names and how certain media files are opened. – Baidu Encyclopedia

The file format

Each MIME type has two parts separated by a slash: a top-level type that gives the broad category of the data (such as text, audio, or image), followed by a subtype that gives the specific format.

Common MIME types are:

Resource                          Extension   MIME type
Hypertext Markup Language text    .html       text/html
XML document                      .xml        text/xml
Plain text                        .txt        text/plain
PNG image                         .png        image/png
PDF document                      .pdf        application/pdf
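The two-part structure of a MIME type can be sketched as a small parser. This is a simplified illustration, not a complete implementation of the media type grammar; the function name `parse_mime_type` is made up for this example, and it also handles the optional `; key=value` parameters that appear in headers such as Content-Type.

```python
def parse_mime_type(value):
    """Split 'type/subtype; key=value' into (type, subtype, params)."""
    media_type, *raw_params = [piece.strip() for piece in value.split(";")]
    main_type, _, subtype = media_type.partition("/")
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing semicolon
        key, _, val = raw.partition("=")
        params[key.strip().lower()] = val.strip().strip('"')
    return main_type.lower(), subtype.lower(), params


print(parse_mime_type("text/html; charset=utf-8"))
print(parse_mime_type("image/png"))
```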

Learn more about MIME types – Internet media types

The HTTP protocol

HyperText Transfer Protocol (HTTP) is one of the most widely used network protocols on the Internet. HTTP was originally designed to provide a way to publish and receive HTML pages. Resources requested over HTTP or HTTPS are identified by Uniform Resource Identifiers (URIs). — Wikipedia

The HTTP protocol is based on requests and responses: the client sends a request and the server returns a response, as shown in the following figure:

Main features of HTTP

  • Simple and fast: to make a request, the client only needs to supply the request path and the request method, and can then send it through a browser or by other means
  • Flexible: HTTP lets clients and servers transfer data objects of any type and format
  • Connectionless: each connection handles only one request. The server closes the connection after it has processed the client's request and received the client's acknowledgement, which saves transmission time. (Today most servers support keep-alive, which uses persistent connections to avoid reconnecting for every request)
  • Stateless: the protocol keeps no memory of earlier transactions, so the server does not know the client's state. After the client sends an HTTP request, the server returns the data for that request and records nothing afterwards. (The cookie mechanism maintains sessions and works around statelessness)

HTTP request packet

An HTTP request message consists of four parts: the request line, the request headers, a blank line, and the request body (request data), as shown in the following figure:

Sample Request message

GET / HTTP/1.1
Host: www.baidu.com
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch, br
Accept-Language: zh-CN,zh;q=0.8,en;q=0.6,id;q=0.4
Cookie: PSTM=1490844191; BIDUPSID=2145FF54639208435F60E1E165379255; BAIDUID=CFA344942EE2E0EE081D8B13B5C847F9:FG=1;

The request line

The request line consists of the request method, the URL, and the HTTP protocol version, separated by spaces.

GET / HTTP/1.1
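A minimal sketch of splitting a request line into its three space-separated parts might look like this. The function name `parse_request_line` is an assumption for illustration; real servers apply stricter validation.

```python
def parse_request_line(line):
    """Split an HTTP request line into (method, target, version)."""
    # Exactly two spaces are expected; anything else fails the unpacking.
    method, target, version = line.rstrip("\r\n").split(" ")
    if not version.startswith("HTTP/"):
        raise ValueError(f"not an HTTP version: {version!r}")
    return method, target, version


print(parse_request_line("GET / HTTP/1.1\r\n"))
```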

Request header

Request headers are key-value pairs, one per line, with the name and value separated by a colon (:). They give the server information about the client's request. Typical request headers are:

  • User-Agent: information about the user agent – Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit
  • Accept: the content types the client can handle – text/html, application/xhtml+xml, application/xml
  • Accept-Language: the natural languages the client accepts – zh-CN,zh;q=0.8,en;q=0.6,id;q=0.4
  • Accept-Encoding: the compression encodings the client accepts – gzip, deflate, sdch, br
  • Host: the requested host name, which lets multiple domain names share one IP address (virtual hosting) – www.baidu.com
  • Connection: the connection mode
    • close: tells the web server or proxy server to close the connection after responding to this request
    • keep-alive: tells the web server or proxy server to keep the connection open after responding to this request, ready for subsequent requests
  • Cookie: an extension field holding data stored on the client, sent to servers under the same domain – PSTM=1490844191; BIDUPSID=2145FF54639208435F60E1E165379255;
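To make the "name, colon, value" layout concrete, here is a small sketch that turns header lines into a dictionary. Header names are case-insensitive, so the hypothetical helper `parse_headers` lower-cases them; it ignores repeated headers and other edge cases.

```python
def parse_headers(lines):
    """Turn 'Name: value' lines into a dict keyed by lower-cased name."""
    headers = {}
    for line in lines:
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        headers[name.strip().lower()] = value.strip()
    return headers


sample = [
    "Host: www.baidu.com",
    "Connection: keep-alive",
    "Accept-Encoding: gzip, deflate, sdch, br",
]
headers = parse_headers(sample)
print(headers["host"])
print(headers["accept-encoding"].split(", "))
```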

A blank line

The last request header is followed by an empty line: a carriage return and line feed (CRLF) on their own tell the server that no more headers follow.

Request body

Request data is not used with the GET method; it is used with the POST method. The request headers most commonly associated with request data are Content-Type and Content-Length.
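Putting the four parts together, the sketch below assembles a raw POST request by hand. It is only meant to show where the blank line goes and how Content-Length relates to the body; the function name, the `/users` path, and the form fields are invented for this example.

```python
def build_post_request(host, path, body, content_type):
    """Assemble a raw HTTP/1.1 POST request as bytes."""
    payload = body.encode("utf-8")
    head = [
        f"POST {path} HTTP/1.1",               # request line
        f"Host: {host}",                       # request headers
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",     # length of the body in bytes
    ]
    # CRLF ends each line; an extra CRLF (the blank line) ends the headers.
    return "\r\n".join(head).encode("ascii") + b"\r\n\r\n" + payload


raw = build_post_request(
    "www.example.com",
    "/users",
    "name=alice&age=30",
    "application/x-www-form-urlencoded",
)
print(raw.decode("utf-8"))
```

Content-Length counts bytes rather than characters, which is why the body is encoded before its length is taken.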

HTTP response packet

The HTTP response packet consists of the status line, response header, blank line, and response body, as shown in the following figure:

Response Packet Example

HTTP/1.1 200 OK
Server: bfe/1.0.8.18
Date: Thu, 30 Mar 2017 12:28:00 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Cache-Control: private
Expires: Thu, 30 Mar 2017 12:27:43 GMT
Set-Cookie: BDSVRTM=0; path=/

The status line

The status line has the format HTTP-Version Status-Code Reason-Phrase CRLF

  • HTTP-Version – the HTTP version
  • Status-Code – the status code
  • Reason-Phrase – a short description of the status code
  • CRLF – carriage return and line feed
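The status line format above can be sketched as a parser. Because the reason phrase may itself contain spaces ("Not Found"), the line is split at most twice. The function name `parse_status_line` is an assumption for illustration.

```python
def parse_status_line(line):
    """Split a status line into (version, status_code, reason_phrase)."""
    # maxsplit=2 keeps multi-word reason phrases in one piece.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```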

Response headers

Response headers are key-value pairs, one per line, with the name and value separated by a colon (:). They let the server pass additional information that cannot be placed in the status line, describing the server and giving further details about the resource identified by the Request-URI. Typical response headers are:

  • Server: information about the software the origin server uses to handle the request
  • Date: the date and time on the server
  • Content-Type: the type of the returned resource (MIME)
  • Connection: the connection mode
    • close: the connection has been closed
    • keep-alive: the connection is kept open, waiting for further requests on it
  • Cache-Control: cache control
  • Expires: the time at which the response expires
  • Set-Cookie: sets cookie information

A blank line

The last response header is followed by an empty line, and the carriage return and newline characters are sent to inform the browser that there are no more response headers below.
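The role of the blank line can be shown with a short sketch that separates a raw response into status line, headers, and body at the first empty line. The helper `split_response` and the sample bytes are made up for this example; chunked encoding and repeated headers are deliberately left out.

```python
def split_response(raw):
    """Split raw response bytes into (status_line, headers, body)."""
    # The first CRLF CRLF sequence is the blank line that ends the headers.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
    b"<html></html>"
)
status_line, headers, body = split_response(raw)
print(status_line)
print(headers["content-type"])
print(body)
```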

Response body

The response body is the content that the server returns to the browser. The following is a fragment of the response body of the Baidu home page:


      
<!--STATUS OK-->
<html>
<head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <link rel="icon" sizes="any" mask href="//www.baidu.com/img/baidu.svg">
    <title>百度一下，你就知道</title>
</head>
<body>...</body>
</html>

HTTP Methods

HTTP request methods include GET, POST, HEAD, PUT, DELETE, OPTIONS, TRACE, CONNECT, and PATCH.

Commonly used request methods:

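  • GET – retrieves a resource; parameters are passed in the URL. HTTP itself sets no length limit, but browsers and servers do (older versions of Internet Explorer, for example, cap URLs at about 2 KB)
    • http://www.example.com/users – get all users
  • POST – submits a resource in the HTTP body. HTTP sets no size limit here either; the server configuration does (PHP's post_max_size, for example, defaults to 8M)
    • http://www.example.com/users/a-unique-id – create a user
  • PUT – updates a resource
    • http://www.example.com/users/a-unique-id – update a user
  • DELETE – deletes a resource
    • http://www.example.com/users/a-unique-id – delete a user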
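As a rough sketch, the four operations above could be expressed with the standard library's `urllib.request.Request`, which accepts a `method` argument. The requests are only constructed here, never sent, and the form bodies (`name=alice`, `name=bob`) are invented for illustration.

```python
from urllib.request import Request

BASE = "http://www.example.com/users"


def build_requests(user_id):
    """Create (but do not send) one request per method in the list above."""
    return [
        Request(BASE, method="GET"),                                   # get all users
        Request(f"{BASE}/{user_id}", data=b"name=alice", method="POST"),  # create
        Request(f"{BASE}/{user_id}", data=b"name=bob", method="PUT"),     # update
        Request(f"{BASE}/{user_id}", method="DELETE"),                 # delete
    ]


for req in build_requests("a-unique-id"):
    print(req.get_method(), req.full_url)
```

Passing a request to `urllib.request.urlopen` would actually send it; that step is omitted because www.example.com does not host this API.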

HTTP Status Code

The status code consists of three digits. The first number defines the category of the response and has five possible values:

  • 1xx: Informational – the request has been received and processing continues
  • 2xx: Success – the request has been successfully received, understood, and accepted
  • 3xx: Redirection – further action must be taken to complete the request
  • 4xx: Client error – the request has a syntax error or cannot be fulfilled
  • 5xx: Server error – the server failed to fulfill a valid request
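Since the first digit determines the class, classifying a status code is a matter of integer division. The sketch below combines that with the reason phrases shipped in Python's standard `http.HTTPStatus` enum; the `CLASSES` table and the `describe` helper are names made up for this example.

```python
from http import HTTPStatus

CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return (class name, reason phrase) for a status code."""
    category = CLASSES[code // 100]  # the first digit selects the class
    # HTTPStatus raises ValueError for codes it does not know.
    return category, HTTPStatus(code).phrase


for code in (200, 301, 404, 503):
    print(code, *describe(code))
```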

Common status codes and status descriptions are described as follows:

  • 200 OK: the client request succeeded
  • 204 No Content: there is no new document; the browser should keep displaying the current one
  • 206 Partial Content: the client sent a GET request with a Range header and the server fulfilled it
  • 301 Moved Permanently: the requested page has moved permanently to a new URL
  • 302 Found: the requested page has moved temporarily to a new URL
  • 304 Not Modified: the client has a cached document and made a conditional request; the server tells the client that the cached copy can still be used
  • 400 Bad Request: the client request has a syntax error and the server cannot understand it
  • 401 Unauthorized: the request is not authorized; this status code must be used together with the WWW-Authenticate header field
  • 403 Forbidden: access to the requested page is forbidden
  • 404 Not Found: the requested resource does not exist
  • 500 Internal Server Error: an unexpected error occurred on the server
  • 503 Service Unavailable: the server cannot handle the request right now because it is temporarily overloaded or down; it may recover after a while

HTTP Cookie

  • Jianshu – A comprehensive interpretation of HTTP cookies
  • Jianshu – The cookie mechanism
  • Ruan Yifeng – JavaScript standard reference tutorial – Cookie
  • Segmentfault – Talk about cookies
  • Zhihu – What is the difference between cookie and session?

HTTP Cache

  • Alloyteam – Web Caching mechanism series
  • Imweb Front-end Community – HTTP Cache control summary
  • A trilogy on applying HTTP caching to front-end performance optimization
  • Thoroughly understand the HTTP caching mechanism – a three-factor decomposition method based on cache strategy
  • Master HTTP Caching – Everything from request to response (Part 1)
  • Master HTTP Caching — Everything from request to response (Part 2)

HTTP CORS (Cross-Origin Resource Sharing)

  • MDN – The same origin policy of the browser
  • MDN – HTTP CORS
  • Ruan Yifeng – Cross-domain resource sharing CORS details

HTTPS

  • Imweb Front-end Community – A popular-science primer on HTTPS
  • Maybe it's easier to understand HTTPS this way
  • An in-depth analysis of the HTTPS protocol
  • Ruan Yifeng – HTTPS Upgrade Guide
  • Full-site HTTPS is here
  • An inside look at HTTPS sites
  • Jianshu – Nine questions to take you from getting started with HTTPS to knowing it well
  • Before you work out why HTTPS is secure, get to know these things
  • Why is HTTPS secure
  • HTTPS free certificate application tutorial
  • Juejin – All about upgrading to HTTPS
  • Jianshu – Using HTTPS on Android
  • Zhihu column – HTTPS in iOS development
  • Jianshu – HTTPS in iOS development: trusting SSL certificates and self-signed certificates

HTTP/2

  • Segmentfault – Analysis of new HTTP/2 features
  • Jianshu – The future is here: HTTP/2
  • Juejin – All about HTTP 2.0
  • HTTP/2 development history
  • Juejin – HTTP/2 performance optimization guide for web developers
  • Front-end development and HTTP/2: a recommendation
  • Jianshu – Upgrade your website to HTTP/2
  • Jianshu – HTTP/2 traffic debugging
  • Gitbook – http2 explained

HTTP security

  • Getting started with Web security – Books and advice
  • Github – Practical security tips for developers
  • Segmentfault – Aliju Security 2016 Annual Report
  • Segmentfault – Description of web permission verification methods
  • HTTP identification, authentication, and security – HTTP: The Definitive Guide series
  • A brief analysis of the basic techniques behind HTTP authentication
  • SegmentFault Technology Weekly Vol.12 – Web Security Guide (Part 1)
  • SegmentFault Technology Weekly Vol.13 – Web Security Guide (Part 2)
  • Jianshu – Web front-end attack and defense: get careless and you get caught
  • How can I defend against common Web attacks
  • Jianshu – Web security: SQL injection attack techniques and defenses

HTTP interview

  • What do you understand about HTTP?
  • Segmentfault – Classic front-end interview question: what happens between entering a URL and the page finishing loading?
  • Alloyteam – Things you should know about HTTP, HTTP/2, SPDY, and HTTPS
  • Github.io – The HTTP protocol (written-test and interview knowledge) – very comprehensive
  • Juejin – Interview – Web HTTP
  • CSDN – HTTP must-knows – a summary of frequently tested points
  • TCP/IP: an introduction to HTTP and HTTPS
  • How to talk gracefully about HTTP/1.0, 1.1, and 2.0 in an interview
  • Cat brother network programming series: BAT interview questions explained
  • Segmentfault – The difference between GET and POST requests
  • Open Source China – 99% of people misunderstand the difference between GET and POST in HTTP

HTTP crawler articles

Resources

  • Zhihu – A collection of crawler learning resources
  • Jianshu – 2016: my crawler summary
  • Jianshu – An article on the current state of crawler technology
  • Monitoring my girlfriend's Weibo mood

Node.js

  • Juejin – A Promise-based Node.js crawler that periodically sends information to a specified mailbox

Java

  • Java Spring + MyBatis integration: crawling funny GIFs from Toutiao
  • Juejin – Refactoring: grabbing download links for all the 2016 movies on YouTube
  • Open Source China – A SpringBoot + SpringMVC + MybatisPlus integration exercise: an image crawler, with an illustrated walkthrough

PHP

  • Segmentfault – PHP crawler: Zhihu user data crawling and analysis

Python

  • Jobbole – A list of Python crawler tools
  • Jianshu – Python crawler libraries: using Beautiful Soup
  • Notes on Pyspider web crawler practice
  • Jianshu – Building a lightweight crawler framework in 500 lines of Python
  • Crawling property listings with a Python crawler, step by step
  • Simulated login for Python crawlers
  • How to pick a good baby name with a Python crawler
  • Jianshu – Python crawler: crawling comics with a Python framework
  • Jianshu – Grabbing beauty pictures with Python
  • Jianshu – Python crawler: crawling street-photo galleries from Toutiao
  • Juejin – Python crawler: converting Liao Xuefeng's tutorial into a PDF e-book

HTTP resources

Articles

  • Ruan Yifeng – HTTP protocol introduction
  • HTTP protocol details
  • Take a look at the structure of HTTP
  • Learn Restful HTTP API design from Github
  • Imweb Front-end Community – HTTP1.1 with front-end performance
  • When it comes to Web security, 99% of websites ignore this

Videos

  • Baidu Chuanke – WEB system testing – HTTP protocol details
  • imooc – The HTTP protocol in PHP
  • Jerry Qu – Archive of HTTP series articles (thanks to the expert who recommended it)

HTTP packet capture tools and tutorials

Tools

  • Browser Developer Tools – the developer tools built into browsers
  • Fiddler for Windows – Fiddler is an HTTP debugging proxy
  • Charles for Mac – an HTTP debugging proxy for Mac
  • Fiddler-addons – Address of the Fiddler plug-in
  • Wireshark – Wireshark is a network packet analysis software that captures network packets and displays the most detailed network packet information
  • Mitmproxy – an interactive command line packet capture tool

Tutorials

  • Cnblogs – An incomplete guide to Chrome developer tools
  • Fiddler CertMaker for iOS and Android – HTTPS certificate generation plugin
  • Jianshu – HTTPS and capturing HTTPS with Fiddler – capturing an app's HTTPS traffic
  • A summary of Fiddler-related knowledge
  • Jianshu – Fiddler tutorial
  • Wireshark basics and the TCP three-way handshake
  • TMQ – Learn HTTPS with Wireshark
  • Mitmproxy (HTTPS proxy)

Chrome HTTP plug-in

  • Proxy SwitchyOmega – Easily and quickly manage and switch between multiple Proxy Settings
  • CORS Toggle – Allows cross-domain requests
  • Postman – Super powerful HTTP Client

HTTP platform repository

Browser

  • jQuery
  • jquery-pjax
  • zepto
  • fetch
  • axios
  • mockjs

Node.js

  • request – Request is designed to be the simplest way possible to make http calls
  • axios – Promise based HTTP client for the browser and node.js
  • http-proxy – It is an HTTP programmable proxying library that supports websockets
  • superagent – SuperAgent is a small progressive client-side HTTP request library
  • morgan – HTTP request logger middleware for node.js

Java

  • HttpClient

Android

  • okhttp
    • imooc – OkHttp framework analysis and application

Python

  • urllib
  • urllib2
  • httplib
  • Requests
    • imooc – Getting started with the Requests library

If you only use Python 3.x, you can forget the rest and just remember that there is a library called urllib.

Library names available in Python 2.x: urllib, urllib2, urllib3, httplib, httplib2, requests

Library names available in Python 3.x: urllib, urllib3, httplib2, requests

urllib3 provides thread-safe connection pooling and file POST support, and has little to do with urllib and urllib2. requests describes itself as “HTTP for Humans”.

For details, please refer to urllib, urllib2, httplib, httplib2
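As a small illustration of Python 3's built-in urllib, the sketch below starts a throwaway local server and fetches a page from it, so it runs without Internet access. The handler class, the "hello" body, and the helper `fetch_once` are all invented for this example; a real client would simply open the URL it needs.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_once():
    """Start a local server, fetch '/' with urllib, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # An empty ProxyHandler bypasses proxies configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        url = f"http://127.0.0.1:{server.server_port}/"
        with opener.open(url, timeout=5) as response:
            return response.status, response.headers["Content-Type"], response.read()
    finally:
        server.shutdown()
        server.server_close()


print(fetch_once())
```

The same call pattern (open a URL, read the status, headers, and body) is what libraries such as requests wrap in a friendlier interface.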

HTTP stress testing

Tools

  • LoadRunner
  • SoapUI
  • JMeter
  • http_load
  • webbench
  • ab
  • siege

Tutorials

  • 51Testing – Automated performance testing tools
  • Cnblogs – JMeter tutorial: simple stress tests
  • imooc – Introduction to interface testing basics
  • imooc – HTTP interface performance testing with JMeter
  • An introduction to several web server stress-testing tools

HTTP proxy server

Products

  • Nginx
  • Squid
  • Privoxy
  • Varnish
  • Polipo
  • Tinyproxy
  • HAProxy
  • ATS

Articles

  • Zhihu – Why is a reverse proxy called a reverse proxy?
  • 51CTO – Forward, reverse, and transparent proxies in diagrams
  • Anker – Summary of forward and reverse proxies
  • Top five open source Web proxy servers
  • Reverse proxy server comparison (Nginx, ATS, Squid, etc.)
  • Jianshu – Apache vs Nginx: a comparison based on practical experience
  • Understanding hosts, servers, proxy servers, and reverse proxy servers (personal notes)
  • Jianshu – Large site architecture series: load balancing in detail
  • Setting up a proxy server with Squid
  • Jianshu – Nginx in practice: notes
  • Jianshu – Nginx proxy cache principles and best practices

HTTP books

  • Douban Reading – HTTP: The Definitive Guide
  • Douban Reading – Illustrated HTTP

References

  • HTTP request packets and HTTP response packets and their working principles
  • Baidu Encyclopedia – HTTP