Pycon ZA
October 08, 2020

"High-bandwidth HTTP downloads: unpeeling the onion" by Bruce Merry

The MeerKAT radio telescope produces massive volumes of data. We provide a data access library for scientists to retrieve the data, but our initial implementation using boto had disappointing performance when used on a high-speed (25 Gb/s) network. On investigation, we found that boto wraps requests wraps urllib3 wraps http.client, and these wrapping layers introduce a lot of overheads that limit bandwidth. I'll walk through all the steps involved in getting data from the socket into a final response, show how this reduces throughput, and describe our solution to achieve bandwidths of multiple gigabytes per second.

Transcript

  3. www.ska.ac.za What do astronomers care about HTTP?

    Figure: MeerKAT storage cluster.
    PB of storage accessed through S3-compatible interface.
    Python library (katdal) presents high-level interface.
    Aim for Gb/s ( GB/s).
  4. Rules of the benchmarking game

    • Fetch a GB file from localhost.
    • Return the content as a bytes.
    • Use a single thread.
    • No TLS, no content encoding, no transfer encoding.
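
    The rules above could be turned into a small timing harness along the following lines. This is only a sketch, not the speaker's benchmark code: the name measure_throughput, the repeats parameter and the best-of-N policy are assumptions made for illustration.

```python
import time
from typing import Callable


def measure_throughput(loader: Callable[[str], bytes], url: str, repeats: int = 3) -> float:
    """Return the best observed throughput of loader(url) in MB/s (1 MB = 10**6 bytes).

    Hypothetical helper: `loader` is any single-threaded function that fetches
    `url` and returns the complete body as one bytes object.
    """
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        body = loader(url)
        elapsed = time.perf_counter() - start
        # Enforce the "return the content as bytes" rule.
        if not isinstance(body, bytes):
            raise TypeError("loader must return bytes")
        if elapsed > 0:
            best = max(best, len(body) / elapsed / 1e6)
    return best
```

    Any function that takes a URL and returns the body as bytes can be passed as the loader, for example lambda url: requests.get(url).content against a file served from localhost.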
  6. Why so slow? Will PyPy save the day? No.

    Chart: throughput of requests in MB/s.
        Python 3.6.12   501
        PyPy 7.3.1      647
  7. Inefficient memory model

    • bytes is immutable: have to copy to change
    • bytearray is zero-initialized
    • BytesIO.getvalue() makes a copy
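
    The first two points, and the difference between a view and a copy, can be checked with a few lines of standard-library Python. This is a minimal illustration with made-up variable names; it does not cover BytesIO.

```python
# bytes is immutable: in-place modification is rejected, so any "change"
# means building a new object.
data = b"abc"
try:
    data[0] = 0x41
    mutated = True
except TypeError:
    mutated = False

# bytearray(n) returns n zero bytes, so the memory is written once
# before any real data is stored in it.
buf = bytearray(8)
all_zero = buf == b"\x00" * 8

# A memoryview slice shares storage with its source (no copy)...
view = memoryview(buf)[:4]
buf[0] = 0xFF
shares_memory = view[0] == 0xFF

# ...whereas tobytes() produces an independent copy.
snapshot = view.tobytes()
buf[0] = 0x00
is_copy = snapshot[0] == 0xFF
```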
  8. Test code

        import http.client
        import urllib.parse

        import requests
        import urllib3

        def load_requests(url: str) -> bytes:
            return requests.get(url).content

        def load_urllib3(url: str) -> bytes:
            return urllib3.PoolManager().request('GET', url).data

        def load_httpclient(url: str) -> bytes:
            parts = urllib.parse.urlparse(url)
            conn = http.client.HTTPConnection(parts.netloc)
            conn.request('GET', parts.path)
            resp = conn.getresponse()
            return resp.read(resp.length)

        def load_socket_read(url: str) -> bytes:
            # Code is much too long for a slide
            ...
  9. Results

    Chart: throughput in MB/s on Python 3.6.12.
        requests       501
        urllib3       1371
        httpclient     896
        socket-read   3219
  11. Requests: unnecessary chunking

        CONTENT_CHUNK_SIZE = 10 * 1024
        ...
        self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''

    • 10 kiB is too small to amortize overheads
    • bytes.join involves a copy
  12. Requests: unnecessary chunking

    Chart: throughput of requests in MB/s by chunk size.
        Chunk size   Python 3.6.12   PyPy 7.3.1
        10 kiB                 501          647
        1 MiB                 1037          772
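
    To get a feel for how many pieces have to be joined at each chunk size, the read-and-join pattern can be mirrored on an in-memory stream. This is a sketch only: read_joined is a hypothetical helper, and the 4 MiB payload is an arbitrary stand-in for a response body, not the file used in the talk.

```python
import io


def read_joined(fp, chunk_size: int):
    """Read fp to EOF in chunk_size pieces, then join them into one bytes object.

    Hypothetical helper mirroring the b''.join(iter_content(...)) pattern.
    Returns (data, number_of_chunks).
    """
    chunks = []
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    # The join allocates the final object and copies every chunk into it.
    return b"".join(chunks), len(chunks)


payload = bytes(4 * 1024 * 1024)  # 4 MiB of zeros as a stand-in body
small, n_small = read_joined(io.BytesIO(payload), 10 * 1024)
large, n_large = read_joined(io.BytesIO(payload), 1024 * 1024)
print(n_small, n_large)  # prints "410 4"
```

    Both calls return the same data; the smaller chunk size simply pays the per-chunk overhead about a hundred times more often.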
  13. Requests: an alternative

    We can bypass Response.content:

        with requests.get(url, stream=True) as resp:
            return resp.raw.read()
  14. Requests: an alternative

    Chart: throughput in MB/s on Python 3.6.12.
        requests          501
        requests-stream  1318
        urllib3          1371
  16. What’s up with http.client?

    Let’s look inside HTTPResponse.read:

        if amt is not None:
            # Amount is given, implement using readinto
            b = bytearray(amt)
            n = self.readinto(b)
            return memoryview(b)[:n].tobytes()

    • Allocate some memory, and zero-fill it.
    • Read the data into that memory.
    • Make a copy of it.
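
    One way to skip the last of those three steps is for the caller to own the buffer and fill it with readinto() directly. The following is a sketch under that assumption, not the approach taken in the talk: read_exact_into is a hypothetical helper, and it returns a bytearray rather than bytes.

```python
import io


def read_exact_into(fp, size: int) -> bytearray:
    """Fill a caller-owned buffer of `size` bytes from fp using readinto().

    Hypothetical helper: fp is any binary file-like object that provides
    readinto(). The buffer is still zero-filled on allocation, but the data
    is never copied out of it afterwards.
    """
    buf = bytearray(size)
    view = memoryview(buf)
    filled = 0
    while filled < size:
        # Slicing a memoryview does not copy, so the data lands directly in buf.
        n = fp.readinto(view[filled:])
        if not n:
            raise EOFError(f"expected {size} bytes, got {filled}")
        filled += n
    return buf


body = read_exact_into(io.BytesIO(b"0123456789"), 10)
```

    http.client.HTTPResponse also provides readinto(), so a response object could be passed as fp with its length as size.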
  18. What if we don’t specify an amount?

    Then it’s implemented via _safe_read instead:

        def _safe_read(self, amt):
            s = []
            while amt > 0:
                chunk = self.fp.read(min(amt, MAXAMOUNT))
                if not chunk:
                    raise IncompleteRead(b''.join(s), amt)
                s.append(chunk)
                amt -= len(chunk)
            return b"".join(s)

    At least MAXAMOUNT = 1048576 (1 MiB)
  19. What if we don’t specify an amount?

    Chart: throughput in MB/s on Python 3.6.12 (httpclient-na: no amount passed to read).
        httpclient      896
        httpclient-na  1334
  20. It’ll be better, one day

    Chart: throughput in MB/s by Python version.
                       3.6.12   3.8.2   master
        httpclient        896     892     3258
        httpclient-na    1334    3233     3236
  21. More results

    Chart: throughput in MB/s of requests, requests-stream, urllib3 and httpclient-na on Python 3.6.12, 3.8.2, master and PyPy 7.3.1.
  22. Other libraries

    Chart: throughput in MB/s.
                  Python 3.6.12   PyPy 7.3.1
        httpx               709          711
        tornado             868          481
        aiohttp             996         1090
  25. So what do we do about Python 3.6?

    Let’s relax the rules and return a numpy array.

        def load_requests_np(url: str) -> np.ndarray:
            with requests.get(url, stream=True) as resp:
                data = np.empty(int(resp.headers['Content-length']), np.uint8)
                resp.raw.readinto(memoryview(data))
            return data

    This gets us 6 MB/s. This time it’s urllib3:

        def readinto(self, b):
            temp = self.read(len(b))
            if len(temp) == 0:
                return 0
            else:
                b[: len(temp)] = temp
                return len(temp)
  26. Summary

    People who write HTTP libraries don’t optimize for throughput. But sometimes you can do something about it.
  27. www.ska.ac.za SARAO, a business unit of the National Research Foundation.

    The South African Radio Astronomy Observatory (SARAO) spearheads South Africa’s activities in the Square Kilometre Array Radio Telescope, commonly known as the SKA, in engineering, science and construction. SARAO is a National Facility managed by the National Research Foundation and incorporates radio astronomy instruments and programmes such as the MeerKAT and KAT-7 telescopes in the Karoo, the Hartebeesthoek Radio Astronomy Observatory (HartRAO) in Gauteng, the African Very Long Baseline Interferometry (AVN) programme in nine African countries as well as the associated human capital development and commercialisation endeavours.

    Contact information
    Bruce Merry, Senior Science Processing Developer
    Email: [email protected]