Problem
Overview
requests 2.21.0
requests-toolbelt 0.9.1
Uploading a file with Python requests fails with the following error:
OverflowError: string longer than 2147483647 bytes
Problem code
import requests

data = {}
with open("bigfile", "rb") as f:
    # PUBLISH_URL is the upload endpoint, defined elsewhere
    r = requests.post(PUBLISH_URL, data=data, files={"xxx": f})

Traceback
Traceback (most recent call last):
  File "test.py", line 52, in <module>
    main()
  File "test.py", line 49, in main
    publish()
  File "test.py", line 41, in publish
    r = requests.post(PUBLISH_URL, data=cfg, files={file_key: ("./test.apk", f)})
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 116, in post
    return request("post", url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python2.7/httplib.py", line 1057, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1097, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1053, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 897, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 873, in send
    self.sock.sendall(data)
  File "/usr/lib/python2.7/ssl.py", line 743, in sendall
    v = self.send(data[count:])
  File "/usr/lib/python2.7/ssl.py", line 709, in send
    v = self._sslobj.write(data)
OverflowError: string longer than 2147483647 bytes

Analysis
requests reads the entire file object into memory
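I did not expect the requests implementation to be this blunt: it simply calls read() on the file object. That matches what happens in practice, since memory usage climbs quickly while a large file is being sent. The code is as follows: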
requests/models.py
@staticmethod
def _encode_files(files, data):
    """Build the body for a multipart/form-data request.

    Will successfully encode files when passed as a dict or a list of
    tuples. Order is retained if data is a list of tuples but arbitrary
    if parameters are supplied as a dict.
    The tuples may be 2-tuples (filename, fileobj), 3-tuples (filename, fileobj, contentype)
    or 4-tuples (filename, fileobj, contentype, custom_headers).
    """
    if (not files):
        raise ValueError("Files must be provided.")
    elif isinstance(data, basestring):
        raise ValueError("Data must not be a string.")

    new_fields = []
    fields = to_key_val_list(data or {})
    files = to_key_val_list(files or {})

    for field, val in fields:
        if isinstance(val, basestring) or not hasattr(val, "__iter__"):
            val = [val]
        for v in val:
            if v is not None:
                # Don't call str() on bytestrings: in Py3 it all goes wrong.
                if not isinstance(v, bytes):
                    v = str(v)

                new_fields.append(
                    (field.decode("utf-8") if isinstance(field, bytes) else field,
                     v.encode("utf-8") if isinstance(v, str) else v))

    for (k, v) in files:
        # support for explicit filename
        ft = None
        fh = None
        if isinstance(v, (tuple, list)):
            if len(v) == 2:
                fn, fp = v
            elif len(v) == 3:
                fn, fp, ft = v
            else:
                fn, fp, ft, fh = v
        else:
            fn = guess_filename(v) or k
            fp = v

        if isinstance(fp, (str, bytes, bytearray)):
            fdata = fp
        elif hasattr(fp, "read"):
            fdata = fp.read()  # the whole file is read into memory here
        elif fp is None:
            continue
        else:
            fdata = fp

        rf = RequestField(name=k, data=fdata, filename=fn, headers=fh)
        rf.make_multipart(content_type=ft)
        new_fields.append(rf)

    body, content_type = encode_multipart_formdata(new_fields)

    return body, content_type

The official documentation recommends requests-toolbelt
https://2.python-requests.org...
In the event you are posting a very large file as a multipart/form-data request, you may want to stream the request. By default, requests does not support this, but there is a separate package which does - requests-toolbelt. You should read the toolbelt's documentation for more details about how to use it.

Using requests-toolbelt
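import requests
from requests_toolbelt import MultipartEncoder

data = {}
with open("bigfile", "rb") as f:
    data["xxx"] = ("filename", f)
    # The encoder streams the file instead of reading it all at once
    m = MultipartEncoder(fields=data)
    # The request must be sent while the file is still open
    r = requests.post(PUBLISH_URL, data=m, headers={"Content-Type": m.content_type})

Summary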
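requests handles file uploads bluntly: it reads the entire file into memory and then hands it to the SSL layer in one piece, and an SSL write larger than 2 GB (2147483647 bytes) raises the error. The official documentation recommends requests-toolbelt for uploading large files.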
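The 2147483647 in the error message equals 2**31 - 1, the largest signed 32-bit integer. The following is a minimal sketch of how one might check for this before uploading, and of how to observe that the multipart body is built as a single in-memory bytes object. The helper names (`exceeds_single_write_limit`, `file_needs_streaming`, `multipart_body_size`) and the placeholder URL are assumptions made for illustration. The request is only prepared, never sent, and the last helper assumes requests is installed.

```python
import os

# Largest signed 32-bit integer; the same number appears in the OverflowError.
MAX_SINGLE_WRITE = 2**31 - 1


def exceeds_single_write_limit(size_in_bytes):
    """Return True if a body of this size is larger than the limit."""
    return size_in_bytes > MAX_SINGLE_WRITE


def file_needs_streaming(path):
    """Rough pre-flight check based on the file size alone.

    The real multipart body is slightly larger than the file (boundaries
    and part headers), so this is a lower bound, not an exact test.
    """
    return exceeds_single_write_limit(os.path.getsize(path))


def multipart_body_size(path, field="xxx"):
    """Prepare (but do not send) a multipart request and return len(body).

    The prepared body is one bytes object held in memory, which is why
    memory usage grows with the size of the file.
    """
    import requests  # imported lazily; assumes requests is installed

    with open(path, "rb") as f:
        # example.invalid is a placeholder; prepare() opens no connection.
        prepared = requests.Request(
            "POST", "https://example.invalid/upload", files={field: f}
        ).prepare()
    return len(prepared.body)
```

Calling `file_needs_streaming("bigfile")` before the upload gives a cheap hint about whether the streaming encoder from requests-toolbelt is worth using.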
Chunked upload is of course another option, provided the server supports it.
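One possible reading of chunked upload is sketched below: a generator yields the file piece by piece, so only one chunk is in memory at a time. requests sends a generator passed as `data=` with chunked transfer encoding. The function names and the 1 MiB default chunk size are assumptions for illustration, and the body goes out as raw bytes rather than multipart/form-data, so this only works if the server accepts that form.

```python
def read_in_chunks(path, chunk_size=1024 * 1024):
    """Yield the contents of the file as successive byte chunks."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def upload_streamed(url, path):
    """Send the file as a raw chunk-encoded body (not multipart/form-data).

    Hypothetical helper: the server behind `url` must accept this format.
    """
    import requests  # imported lazily; assumes requests is installed

    return requests.post(url, data=read_in_chunks(path))
```

The same generator could also feed an application-level scheme in which each chunk is uploaded in its own request, if the server offers such an interface.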
When reposting, please cite the address of this article: https://www.ucloud.cn/yun/43650.html