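Official documentation: https://2.python-requests.org//en/master/
At work I needed to implement a feature that uploads an attachment to an interface. The interface is specified as follows:
Submit the attachment with an HTTP POST in multipart/form-data format; URL: http://test.com/flow/upload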
Field list:
    md5:      // MD5 hash of "<random value>_<current timestamp>"
    filesize: // file size
    file:     // file content (must include the file name)
Return value:
    {
        "success": true,
        "uploadName": "tmp.xml",
        "uploadPath": "uploads\/201311\/758e875fb7c7a508feef6b5036119b9f"
    }
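As a minimal sketch of how the two non-file fields could be computed in Python 3: the helper name build_upload_fields and the use of random.randint for the "random value" are my own assumptions, since the interface only says the md5 field is the MD5 hash of "random value_timestamp".

```python
import hashlib
import os
import random
import tempfile
import time


def build_upload_fields(path):
    """Return the non-file form fields for the upload (hypothetical helper)."""
    # "<random value>_<current timestamp>", as described in the field list.
    token = "%d_%d" % (random.randint(0, 999999), int(time.time()))
    return {
        "md5": hashlib.md5(token.encode("utf-8")).hexdigest(),
        # Size in bytes on disk, independent of any text encoding.
        "filesize": os.path.getsize(path),
    }


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(b"<root/>")
fields = build_upload_fields(tmp.name)
os.remove(tmp.name)
print(fields)
```

Using os.path.getsize avoids reading the whole file just to measure it, which matters for large attachments.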
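Because I mainly use Python at work and the project already uses the requests library in places, I planned to implement this with requests. I expected it to be a trivial feature, but it ended up taking a lot of time: the example in the official requests documentation only covers uploading a file and does not pass any additional parameters: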
https://2.python-requests.org//en/master/user/quickstart/#post-a-multipart-encoded-file
    >>> url = 'https://httpbin.org/post'
    >>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
    >>> r = requests.post(url, files=files)
    >>> r.text
    {
      ...
      "files": {
        "file": "<censored...binary...data>"
      },
      ...
    }
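When additional parameters have to be sent along with the file, two arguments of requests come into play: data and files. Pass the file to upload in files and the other parameters in data; requests merges the two into a single multipart body and sends it to the server.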
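One way to see this merge without contacting any server is to build the request and prepare it locally. The sketch below is only an illustration: the field values and file content are made up, and the URL is the one from the interface description.

```python
import requests

# Hypothetical values; nothing is sent over the network.
url = "http://test.com/flow/upload"
data = {"md5": "0123456789abcdef0123456789abcdef", "filesize": 7}
files = {"file": ("tmp.xml", b"<root/>")}

# Build and prepare the request without sending it.
prepared = requests.Request("POST", url, data=data, files=files).prepare()

# The Content-Type is multipart/form-data with a generated boundary.
print(prepared.headers["Content-Type"])
# The body holds one part per data field, followed by the file part.
print(prepared.body.decode("utf-8"))
```

The printed body shows the md5 and filesize fields as ordinary form-data parts and the file as a part that also carries a filename.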
The final implementation looks like this:
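    import hashlib
    import time

    import requests

    # file_name, request_url and MyLogger are defined elsewhere in the project.
    # Read in binary mode so that len(content) is the size in bytes.
    with open(file_name, 'rb') as f:
        content = f.read()
    request_data = {
        # MD5 of "<random value>_<timestamp>"; the random value is fixed to 0 here.
        'md5': hashlib.md5(('%d_%d' % (0, int(time.time()))).encode('utf-8')).hexdigest(),
        'filesize': len(content),
    }
    # Reuse the bytes already read instead of leaving a second file handle open.
    files = {'file': (file_name, content)}
    MyLogger().getlogger().info('url:%s' % (request_url))
    resp = requests.post(request_url, data=request_data, files=files)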
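Although the final code looks simple, it took me a lot of effort to confirm that this approach works, and I ended up reading the requests source code along the way. The rest of this post records that process.
First, find the implementation of the post function, in requests/api.py: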
    def post(url, data=None, json=None, **kwargs):
        r"""Sends a POST request.

        :param url: URL for the new :class:`Request` object.
        :param data: (optional) Dictionary, list of tuples, bytes, or file-like
            object to send in the body of the :class:`Request`.
        :param json: (optional) json data to send in the body of the :class:`Request`.
        :param \*\*kwargs: Optional arguments that ``request`` takes.
        :return: :class:`Response <Response>` object
        :rtype: requests.Response
        """

        return request('post', url, data=data, json=json, **kwargs)
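Since files is not a named parameter of post, it travels through **kwargs. A small sketch to confirm this, replacing requests.api.request with a mock so that nothing is actually sent (the URL and field values are illustrative only):

```python
from unittest import mock

import requests

# Patch the module-level request function that post() delegates to.
with mock.patch("requests.api.request") as fake_request:
    requests.post(
        "http://test.com/flow/upload",
        data={"filesize": 7},
        files={"file": ("tmp.xml", b"<root/>")},
    )

args, kwargs = fake_request.call_args
print(args)            # positional arguments that post() passed on
print(sorted(kwargs))  # keyword arguments, including files from **kwargs
```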
post simply calls the request function, so let's follow it. It is also in requests/api.py:
    def request(method, url, **kwargs):
        """Constructs and sends a :class:`Request <Request>`.

        :param method: method for the new :class:`Request` object: ``GET``, ``OPTIONS``, ``HEAD``, ``POST``, ``PUT``, ``PATCH``, or ``DELETE``.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary, list of tuples or bytes to send
            in the query string for the :class:`Request`.
        :param data: (optional) Dictionary, list of tuples, bytes, or file-like
            object to send in the body of the :class:`Request`.
        :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
        :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
            ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
            or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
            defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
            to add for the file.
        :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) How many seconds to wait for the server to send data
            before giving up, as a float, or a :ref:`(connect timeout, read
            timeout) <timeouts>` tuple.
        :type timeout: float or tuple
        :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
        :type allow_redirects: bool
        :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
        :param verify: (optional) Either a boolean, in which case it controls whether we verify
                the server's TLS certificate, or a string, in which case it must be a path
                to a CA bundle to use. Defaults to ``True``.
        :param stream: (optional) if ``False``, the response content will be immediately downloaded.
        :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
        :return: :class:`Response <Response>` object
        :rtype: requests.Response

        Usage::

          >>> import requests
          >>> req = requests.request('GET', 'https://httpbin.org/get')
          <Response [200]>
        """

        # By using the 'with' statement we are sure the session is closed, thus we
        # avoid leaving sockets open which can trigger a ResourceWarning in some
        # cases, and look like a memory leak in others.
        with sessions.Session() as session:
            return session.request(method=method, url=url, **kwargs)
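The docstring above describes three file-tuple forms. As a rough illustration of how each one changes the file part of the body, the sketch below prepares one request per form without sending it. The file name, content and the Expires header are example values, and the content is passed as bytes rather than an open file object to keep the sketch self-contained.

```python
import requests

url = "http://test.com/flow/upload"  # URL from the interface description
content = b"<root/>"                 # hypothetical file content

variants = {
    "2-tuple": ("tmp.xml", content),
    "3-tuple": ("tmp.xml", content, "text/xml"),
    "4-tuple": ("tmp.xml", content, "text/xml", {"Expires": "0"}),
}

bodies = {}
for label, file_tuple in variants.items():
    # Prepare locally; nothing is sent over the network.
    prepared = requests.Request("POST", url, files={"file": file_tuple}).prepare()
    bodies[label] = prepared.body
    print(label)
    print(prepared.body.decode("utf-8"))
```

The 3-tuple adds a Content-Type header to the file part, and the 4-tuple additionally adds the custom headers.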
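The request function has a long docstring. From it we can already see that the files parameter is used to send files, but it still does not say what to do when parameters and a file must be sent together. Let's continue into the Session.request method, in requests/sessions.py: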
    def request(self, method, url,
            params=None, data=None, headers=None, cookies=None, files=None,
            auth=None, timeout=None, allow_redirects=True, proxies=None,
            hooks=None, stream=None, verify=None, cert=None, json=None):
        """Constructs a :class:`Request <Request>`, prepares it and sends it.
        Returns :class:`Response <Response>` object.

        :param method: method for the new :class:`Request` object.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary or bytes to be sent in the query
            string for the :class:`Request`.
        :param data: (optional) Dictionary, list of tuples, bytes, or file-like
            object to send in the body of the :class:`Request`.
        :param json: (optional) json to send in the body of the
            :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the
            :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the
            :class:`Request`.
        :param files: (optional) Dictionary of ``'filename': file-like-objects``
            for multipart encoding upload.
        :param auth: (optional) Auth tuple or callable to enable
            Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple
        :param allow_redirects: (optional) Set to True by default.
        :type allow_redirects: bool
        :param proxies: (optional) Dictionary mapping protocol or protocol and
            hostname to the URL of the proxy.
        :param stream: (optional) whether to immediately download the response
            content. Defaults to ``False``.
        :param verify: (optional) Either a boolean, in which case it controls whether we verify
            the server's TLS certificate, or a string, in which case it must be a path
            to a CA bundle to use. Defaults to ``True``.
        :param cert: (optional) if String, path to ssl client cert file (.pem).
            If Tuple, ('cert', 'key') pair.
        :rtype: requests.Response
        """
        # Create the Request.
        req = Request(
            method=method.upper(),
            url=url,
            headers=headers,
            files=files,
            data=data or {},
            json=json,
            params=params or {},
            auth=auth,
            cookies=cookies,
            hooks=hooks,
        )
        prep = self.prepare_request(req)

        proxies = proxies or {}

        settings = self.merge_environment_settings(
            prep.url, proxies, stream, verify, cert
        )

        # Send the request.
        send_kwargs = {
            'timeout': timeout,
            'allow_redirects': allow_redirects,
        }
        send_kwargs.update(settings)
        resp = self.send(prep, **send_kwargs)

        return resp
Session.request first builds and prepares the request, and its last step calls send, which presumably sends it. So the part to follow is prepare_request, also in requests/sessions.py:
    def prepare_request(self, request):
        """Constructs a :class:`PreparedRequest <PreparedRequest>` for
        transmission and returns it. The :class:`PreparedRequest` has settings
        merged from the :class:`Request <Request>` instance and those of the
        :class:`Session`.

        :param request: :class:`Request` instance to prepare with this
            session's settings.
        :rtype: requests.PreparedRequest
        """
        cookies = request.cookies or {}

        # Bootstrap CookieJar.
        if not isinstance(cookies, cookielib.CookieJar):
            cookies = cookiejar_from_dict(cookies)

        # Merge with session cookies
        merged_cookies = merge_cookies(
            merge_cookies(RequestsCookieJar(), self.cookies), cookies)

        # Set environment's basic authentication if not explicitly set.
        auth = request.auth
        if self.trust_env and not auth and not self.auth:
            auth = get_netrc_auth(request.url)

        p = PreparedRequest()
        p.prepare(
            method=request.method.upper(),
            url=request.url,
            files=request.files,
            data=request.data,
            json=request.json,
            headers=merge_setting(request.headers, self.headers, dict_class=CaseInsensitiveDict),
            params=merge_setting(request.params, self.params),
            auth=merge_setting(auth, self.auth),
            cookies=merged_cookies,
            hooks=merge_hooks(request.hooks, self.hooks),
        )
        return p
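The same steps can be reproduced by hand, which is handy for inspecting a request before it goes out. This is only a sketch under assumed values: the X-Client header and the field values are made up, and send is deliberately never called.

```python
import requests

session = requests.Session()
# Session-level default header (hypothetical name and value).
session.headers["X-Client"] = "upload-demo"

req = requests.Request(
    "post",
    "http://test.com/flow/upload",
    data={"filesize": 7},
    files={"file": ("tmp.xml", b"<root/>")},
)
# Same call that Session.request makes internally.
prepared = session.prepare_request(req)

print(prepared.method)                   # upper-cased by prepare_request
print(prepared.headers["X-Client"])      # merged in from the session headers
print(prepared.headers["Content-Type"])  # set while the body was prepared
session.close()
```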
prepare_request creates a PreparedRequest object and calls its prepare method. Following prepare, in requests/models.py:
    def prepare(self,
            method=None, url=None, headers=None, files=None, data=None,
            params=None, auth=None, cookies=None, hooks=None, json=None):
        """Prepares the entire request with the given parameters."""

        self.prepare_method(method)
        self.prepare_url(url, params)
        self.prepare_headers(headers)
        self.prepare_cookies(cookies)
        self.prepare_body(data, files, json)
        self.prepare_auth(auth, url)

        # Note that prepare_auth must be last to enable authentication schemes
        # such as OAuth to work on a fully prepared request.

        # This MUST go after prepare_auth. Authenticators could add a hook
        self.prepare_hooks(hooks)
prepare calls a number of prepare_xx methods. The only one that matters here is the one that handles data, files and json, so let's follow prepare_body, in requests/models.py:
    def prepare_body(self, data, files, json=None):
        """Prepares the given HTTP body data."""

        # Check if file, fo, generator, iterator.
        # If not, run through normal process.

        # Nottin' on you.
        body = None
        content_type = None

        if not data and json is not None:
            # urllib3 requires a bytes-like body. Python 2's json.dumps
            # provides this natively, but Python 3 gives a Unicode string.
            content_type = 'application/json'
            body = complexjson.dumps(json)
            if not isinstance(body, bytes):
                body = body.encode('utf-8')

        is_stream = all([
            hasattr(data, '__iter__'),
            not isinstance(data, (basestring, list, tuple, Mapping))
        ])

        try:
            length = super_len(data)
        except (TypeError, AttributeError, UnsupportedOperation):
            length = None

        if is_stream:
            body = data

            if getattr(body, 'tell', None) is not None:
                # Record the current file position before reading.
                # This will allow us to rewind a file in the event
                # of a redirect.
                try:
                    self._body_position = body.tell()
                except (IOError, OSError):
                    # This differentiates from None, allowing us to catch
                    # a failed `tell()` later when trying to rewind the body
                    self._body_position = object()

            if files:
                raise NotImplementedError('Streamed bodies and files are mutually exclusive.')

            if length:
                self.headers['Content-Length'] = builtin_str(length)
            else:
                self.headers['Transfer-Encoding'] = 'chunked'
        else:
            # Multi-part file uploads.
            if files:
                (body, content_type) = self._encode_files(files, data)
            else:
                if data:
                    body = self._encode_params(data)
                    if isinstance(data, basestring) or hasattr(data, 'read'):
                        content_type = None
                    else:
                        content_type = 'application/x-www-form-urlencoded'

            self.prepare_content_length(body)

            # Add content-type if it wasn't explicitly provided.
            if content_type and ('content-type' not in self.headers):
                self.headers['Content-Type'] = content_type

        self.body = body
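The branches of prepare_body can be observed from the outside through the Content-Type header of a prepared request. The following is a small sketch, with the helper name content_type_for and all field values being my own illustrative choices; no request is sent.

```python
import requests

url = "http://test.com/flow/upload"  # URL from the interface description


def content_type_for(**kwargs):
    """Prepare a POST request without sending it and return its Content-Type."""
    prepared = requests.Request("POST", url, **kwargs).prepare()
    return prepared.headers.get("Content-Type")


json_only = content_type_for(json={"filesize": 7})
data_only = content_type_for(data={"filesize": 7})
data_and_files = content_type_for(data={"filesize": 7}, files={"file": b"<root/>"})
# With both data and json given, the json branch is skipped ("if not data and ...").
data_and_json = content_type_for(data={"filesize": 7}, json={"ignored": True})

for name, value in [
    ("json only", json_only),
    ("data only", data_only),
    ("data + files", data_and_files),
    ("data + json", data_and_json),
]:
    print(name, "->", value)
```

Only the combination with files goes through _encode_files and produces a multipart/form-data body.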
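prepare_body is fairly long. The part to focus on is the "if files:" branch under the "# Multi-part file uploads." comment, which calls the _encode_files method with both files and data. Let's follow that method: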
    def _encode_files(files, data):
        """Build the body for a multipart/form-data request.

        Will successfully encode files when passed as a dict or a list of
        tuples. Order is retained if data is a list of tuples but arbitrary
        if parameters are supplied as a dict.
        The tuples may be 2-tuples (filename, fileobj), 3-tuples (filename, fileobj, contentype)
        or 4-tuples (filename, fileobj, contentype, custom_headers).
        """
        if (not files):
            raise ValueError("Files must be provided.")
        elif isinstance(data, basestring):
            raise ValueError("Data must not be a string.")

        new_fields = []
        fields = to_key_val_list(data or {})
        files = to_key_val_list(files or {})

        for field, val in fields:
            if isinstance(val, basestring) or not hasattr(val, '__iter__'):
                val = [val]
            for v in val:
                if v is not None:
                    # Don't call str() on bytestrings: in Py3 it all goes wrong.
                    if not isinstance(v, bytes):
                        v = str(v)

                    new_fields.append(
                        (field.decode('utf-8') if isinstance(field, bytes) else field,
                         v.encode('utf-8') if isinstance(v, str) else v))

        for (k, v) in files:
            # support for explicit filename
            ft = None
            fh = None
            if isinstance(v, (tuple, list)):
                if len(v) == 2:
                    fn, fp = v
                elif len(v) == 3:
                    fn, fp, ft = v
                else:
                    fn, fp, ft, fh = v
            else:
                fn = guess_filename(v) or k
                fp = v

            if isinstance(fp, (str, bytes, bytearray)):
                fdata = fp
            elif hasattr(fp, 'read'):
                fdata = fp.read()
            elif fp is None:
                continue
            else:
                fdata = fp

            rf = RequestField(name=k, data=fdata, filename=fn, headers=fh)
            rf.make_multipart(content_type=ft)
            new_fields.append(rf)

        body, content_type = encode_multipart_formdata(new_fields)

        return body, content_type
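OK, that is the end of the trail. After reading this code carefully, the roles of the data and files arguments of requests.post become clear: _encode_files first appends every entry of data to new_fields as a plain form field, then appends each file as a RequestField, and finally encodes the combined list into a single multipart/form-data body for the POST request.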
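The final step can be tried in isolation with the two urllib3 helpers that _encode_files relies on. This is a simplified sketch with made-up field values, not a copy of what requests does for every input type:

```python
from urllib3.fields import RequestField
from urllib3.filepost import encode_multipart_formdata

# Ordinary form fields end up as (name, bytes) tuples.
new_fields = [
    ("md5", b"0123456789abcdef0123456789abcdef"),  # hypothetical value
    ("filesize", b"7"),
]

# Each file becomes a RequestField carrying its filename and content type.
file_part = RequestField(name="file", data=b"<root/>", filename="tmp.xml")
file_part.make_multipart(content_type="text/xml")
new_fields.append(file_part)

# One call encodes plain fields and file parts into a single body.
body, content_type = encode_multipart_formdata(new_fields)
print(content_type)
print(body.decode("utf-8"))
```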
Original link: https://www.cnblogs.com/lit10050528/p/11285600.html