This article is from NetEase Cloud community

Author: Wang Tao


The optional parameters of requests are introduced one by one:

params Generates a URL of the form url?key=value.
Example 1:

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get("http://httpbin.org/get", params=payload)
>>> print(r.url)
http://httpbin.org/get?key2=value2&key1=value1

Example 2:

>>> param = 'httpparams'
>>> r = requests.get("http://httpbin.org/get", params=param)
>>> print(r.url)
http://httpbin.org/get?httpparams
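Under the hood, a dict passed as params is serialized into a URL query string. A minimal standard-library sketch of the same encoding (an illustration of the mechanism, not requests' actual internals):

```python
from urllib.parse import urlencode

# Encode a dict the way `params` ends up appended to the URL.
payload = {'key1': 'value1', 'key2': 'value2'}
query = urlencode(payload)
url = "http://httpbin.org/get?" + query
print(url)  # http://httpbin.org/get?key1=value1&key2=value2
```

The key order in the final URL simply follows the dict's insertion order.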
data Supports dictionaries, lists of tuples, and strings; used with POST to simulate an HTML form submission.
Example 1:

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.post("http://httpbin.org/post", data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key1": "value1",
    "key2": "value2"
  },
  ...
}

Example 2:

>>> payload = (('key1', 'value1'), ('key1', 'value2'))
>>> r = requests.post('http://httpbin.org/post', data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key1": [
      "value1",
      "value2"
    ]
  },
  ...
}
json Used with POST to send JSON data to the server; many Ajax requests pass JSON.
Example 1:

r = requests.post(url, json={"key": "value"})

A packet capture shows the request header Content-Type: application/json.
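Passing json=... is roughly equivalent to serializing the payload yourself and setting the Content-Type header by hand. A sketch of that equivalence (approximate, not requests' exact internals):

```python
import json

payload = {"key": "value"}
body = json.dumps(payload)

# requests.post(url, json=payload) amounts approximately to:
# requests.post(url, data=body,
#               headers={"Content-Type": "application/json"})
print(body)  # {"key": "value"}
```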

headers A dict of custom HTTP headers, merged with the request's own default headers to form the final request headers.

Note: All header values must be String, byteString, or Unicode.
Example 1:

r = requests.post(url, headers={"user-agent": "test"})




cookies Lets you read the cookies in the response and send custom cookies to the server. With a custom cookie jar you can also restrict attributes such as the valid domain and path.
Example 1: Get a cookie from the response

>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)
>>> r.cookies['example_cookie_name']

Example 2: Sending cookies to the server

>>> jar = requests.cookies.RequestsCookieJar()
>>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
>>> url = ' '
>>> r = requests.get(url, cookies=jar)
>>> r.text
'{"cookies": {"tasty_cookie": "yum"}}'
files Uploads a multipart-encoded file.
Example 1: Upload a file

>>> url = 'http://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)

Example 2: Explicitly set the file name, file type, and request header

>>> url = 'http://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'),
...                   'application/vnd.ms-excel', {'Expires': '0'})}
>>> r = requests.post(url, files=files)
auth Supports several kinds of authentication: HTTPBasicAuth, HTTPDigestAuth, and HTTPProxyAuth.
Example 1:

>>> from requests.auth import HTTPDigestAuth
>>> url = 'http://httpbin.org/digest-auth/auth/user/pass'
>>> requests.get(url, auth=HTTPDigestAuth('user', 'pass'))

After packet capture, the HTTP header is as follows:

GET http://httpbin.org/digest-auth/auth/user/pass HTTP/1.1
Host: httpbin.org
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.12.4
Authorization: Basic dXNlcjpwYXNz
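The dXNlcjpwYXNz value in the captured Authorization header is simply the base64 encoding of user:pass. A standard-library sketch of how a Basic auth header is built:

```python
import base64

# Basic auth: base64-encode "username:password" and prefix with "Basic ".
credentials = base64.b64encode(b"user:pass").decode("ascii")
header = "Authorization: Basic " + credentials
print(header)  # Authorization: Basic dXNlcjpwYXNz
```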
timeout Either a float (a single timeout for the socket connection) or a tuple (connect timeout, read timeout). Note that it is a socket-level timeout, not a limit on the total download time.
Example 1:

>>> requests.get('http://github.com', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80):
    Request timed out. (timeout=0.001)
allow_redirects Applies to GET, OPTIONS, POST, PUT, PATCH, and DELETE requests; redirect handling can be disabled with the allow_redirects parameter.

Note: In some crawling scenarios you need requests not to follow redirects, so you set it to False.

The default is True.
Example 1: Disable automatic redirection on a 3xx response:

>>> r = requests.get('http://github.com', allow_redirects=False)
>>> r.status_code
301
>>> r.history
[]
proxies Configures proxy information; set HTTP and HTTPS proxies as required.
Example 1:

>>> proxies = {
...     "http": "http://127.0.0.1:8888",
...     "https": "https://127.0.0.1:8888",
... }
>>> r = requests.get(url, headers=headers, proxies=proxies)
stream Set stream=True when you want a streamed response. Note: the response must be closed explicitly, otherwise the connection will not be released.
Example 1: Download a Baidu image:

import requests
from contextlib import closing

def download_image_improve():
    url = (' '
           'image&quality=80&size=b9999_10000'
           '&sec=1504068152047&di=8b53bf6b8e5deb64c8ac726e260091aa&imgtype=0'
           '&src=http%3A%2F%2Fpic.baike.soso.com%2Fp%2F'
           '20140415%2Fbki-20140415104220-671149140.jpg')
    with closing(requests.get(url, stream=True, verify=False)) as response:
        # open an empty PNG file in binary-write ('wb') mode,
        # just like creating an empty txt file
        with open('selenium1.png', 'wb') as file:
            # write the body to selenium1.png chunk by chunk
            for data in response.iter_content(chunk_size=1024):
                file.write(data)
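The chunk-by-chunk write pattern can be exercised without the network. A sketch that simulates a streamed body with any iterable of byte chunks (the helper name write_chunks is hypothetical, not a requests API):

```python
import io

def write_chunks(chunks, fileobj):
    """Write an iterable of byte chunks to a binary file object."""
    total = 0
    for data in chunks:
        fileobj.write(data)
        total += len(data)
    return total

# Simulate a streamed body split into 1024-byte chunks.
body = b"x" * 2500
chunks = (body[i:i + 1024] for i in range(0, len(body), 1024))
out = io.BytesIO()
written = write_chunks(chunks, out)
print(written)  # 2500
```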
verify Defaults to True, which verifies the server certificate. If given a string, it is treated as the path to a CA_BUNDLE file, i.e. the certificate path. If you cannot find a usable certificate, borrow the PEM converted from Fiddler as in Example 2, or install the certifi package (a curated bundle of trusted root certificates for requests).
Example 1: Disable certificate verification (recommended for simplicity):

r = requests.get(url, verify=False)

Example 2: Use the PEM certificate converted from Fiddler's root certificate to access Amazon. FiddlerRoot.zip download address:

nos.netease.com/knowledge/2b99aacb-e9bf-42f7-8edf-0f8ca0326533?download=FiddlerRoot.zip

Note: The certificate can be in cer or PEM format; try installing the cer certificate first.

You can also install Fiddler yourself, trust the Fiddler root certificate, export it in cer format, and then convert it to PEM, e.g.:

openssl x509 -inform der -in FiddlerRoot.cer -out FiddlerRoot.pem

headers = {
    "Host": "www.amazon.com",
    "Connection": "keep-alive",
    "Cache-Control": "max-age=0",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/68.0.3440.106 Safari/537.36"),
    "Accept": ("text/html,application/xhtml+xml,"
               "application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8"),
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8"
}
print requests.get('https://www.amazon.com/',
                   verify=r"FiddlerRoot.pem", headers=headers).content
cert Either a string (the path to the SSL client certificate .pem file) or a tuple ('cert', 'key'). Whereas verify concerns the server certificate, cert supplies the client certificate, used for HTTPS mutual (two-way) authentication. I have not tested or used this parameter yet; look into it if you are interested.

Key functions and parameters in Tornado

Tornado Non-blocking HttpClient

Tornado has two non-blocking implementations of HTTPClient: SimpleAsyncHTTPClient and CurlAsyncHTTPClient. Both are subclasses of AsyncHTTPClient; you can choose between them with the AsyncHTTPClient.configure method, or instantiate either subclass directly. The default is SimpleAsyncHTTPClient, which already meets most users' needs, but we chose CurlAsyncHTTPClient for its advantages:

  • CurlAsyncHTTPClient supports more features, such as proxy settings and binding a specific outgoing network interface.

  • CurlAsyncHTTPClient can also reach sites that are not strictly HTTP-compliant.

  • CurlAsyncHTTPClient is faster.

  • Before Tornado 2.0, CurlAsyncHTTPClient was the default.
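Selecting the curl-based client via configure can be sketched as follows (a configuration fragment, not run here; it requires the pycurl package to be installed):

```python
from tornado.httpclient import AsyncHTTPClient

# Switch the default AsyncHTTPClient implementation to the
# curl-based client before any client instances are created.
AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")
```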

2 Introduction to Tornado's key functions and parameters

Sample code (similar to the previous):

@gen.coroutine
def fetch_url(url):
    """Fetch a url."""
    try:
        c = CurlAsyncHTTPClient()  # define an HTTPClient
        req = HTTPRequest(url=url)  # define a request
        response = yield c.fetch(req)  # issue the request
        print response.body
        IOLoop.current().stop()  # stop the ioloop thread
    except:
        print traceback.format_exc()

As you can see, this HTTPClient is also very easy to use: create an HTTPRequest and call HTTPClient's fetch method to issue the request. Let's look at HTTPRequest's definition and see which key parameters we need to know about.

class HTTPRequest(object):
    """HTTP client request object."""

    # Default values for HTTPRequest parameters.
    # Merged with the values on the request object by AsyncHTTPClient
    # implementations.
    _DEFAULTS = dict(
        connect_timeout=20.0,
        request_timeout=20.0,
        follow_redirects=True,
        max_redirects=5,
        decompress_response=True,
        proxy_password='',
        allow_nonstandard_methods=False,
        validate_cert=True)

    def __init__(self, url, method="GET", headers=None, body=None,
                 auth_username=None, auth_password=None, auth_mode=None,
                 connect_timeout=None, request_timeout=None,
                 if_modified_since=None, follow_redirects=None,
                 max_redirects=None, user_agent=None, use_gzip=None,
                 network_interface=None, streaming_callback=None,
                 header_callback=None, prepare_curl_callback=None,
                 proxy_host=None, proxy_port=None, proxy_username=None,
                 proxy_password=None, proxy_auth_mode=None,
                 allow_nonstandard_methods=None, validate_cert=None,
                 ca_certs=None, allow_ipv6=None, client_key=None,
                 client_cert=None, body_producer=None,
                 expect_100_continue=False, decompress_response=None,
                 ssl_options=None):
        r"""All parameters except ``url`` are optional.Copy the code




For more information about NetEase’s r&d, product and operation experience, please visit NetEase Cloud Community.
