How to Use In-Memory Streams
HayaoSuzuki
August 29, 2020
PyCon JP 2020
Transcript
How to Use In-Memory Streams
Hayao Suzuki
PyCon JP 2020
August 29, 2020

About This Talk
The materials are available on GitHub
› https://github.com/HayaoSuzuki/pyconjp2020
Twitter hashtag
› #pyconjp_1
PyCon JP Fellow Slack
› #jp-2020-track-1

Who am I?
Name: Hayao Suzuki
Twitter: @CardinalXaro
Work: Python Programmer at iRidge, Inc.

Who am I?
Technical Reviewer
› Effective Python, 2nd Edition (O'Reilly Japan)
› The Japanese edition of Programming Quantum Computers (O'Reilly Japan)
A list is available at https://xaro.hatenablog.jp/.

Who am I?
Selected Talks
› A Modernization of Legacy Django Based Applications (DjangoCongress JP 2018)
› Symbolic Computation with SymPy (PyCon JP 2018)
› Elementary Number Theory with Python (PyCon mini Hiroshima 2019)
› Do You Know the cmath Module? (PyCon mini Shizuoka 2020)
A list is available at https://xaro.hatenablog.jp/.
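
Today's Goal
The problem we want to solve
› Fetch GB-sized data over the internet and convert it into a CSV file
› Implement it as an addition to an existing system built in the cloud
› Run it on a recurring schedule
Cloud services are billed by usage, so we want the processing to finish as quickly as possible!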
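
Today's Goal
Processing flow
› Fetch GB-sized data over the internet
› Convert the GB-sized data into a CSV file
› Compress the CSV file into a ZIP archive
› Upload the ZIP-compressed data to cloud storage
Analysis
› The data is large
› The data transformation itself is simple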
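
Today's Goal
Where is the bottleneck?
› ZIP compression is not that expensive
› The data transformation is simple
› The bottleneck is likely to be in I/O
We want to make the I/O as fast as we possibly can!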

Today's Theme
In-Memory Streams

Stream?
What is a stream in the first place?
A stream is a file object.

File Object?
What is a file object?
› An object that has methods such as read() and write()
› It can exchange data with a file on disk, with storage located elsewhere, or with an I/O device
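
As a rough illustration of that definition, the sketch below sends the same write() call to two different file objects: the standard output stream and a file on disk. The helper name write_greeting and the file name greeting.txt are made up for this example.

```python
import os
import sys
import tempfile


def write_greeting(stream):
    """Write one line to any object that provides a write() method."""
    stream.write("hello, file object\n")


# sys.stdout is a file object connected to an I/O device (the terminal).
write_greeting(sys.stdout)

# A file on disk, opened with open(), is also a file object.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "greeting.txt")
    with open(path, "w", encoding="utf-8") as f:
        write_greeting(f)
    with open(path, encoding="utf-8") as f:
        content = f.read()
```

The function never checks what kind of object it receives. It only relies on write() being there, which is what makes it possible to swap in an in-memory stream later.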

File Object?
Kinds of file objects
› Raw binary files
› Buffered binary files
› Text files
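
One way to see these three kinds is to check which class the built-in open() returns for different arguments. The sketch below is only an illustration: the temporary file and the variable names are assumptions for the example.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "myfile.bin")
    with open(path, "wb") as f:
        f.write(b"\x00\x01\x02")

    # Text mode returns a text file object.
    with open(path, "r", encoding="utf-8") as text_file:
        text_type = type(text_file)
    # Binary mode returns a buffered binary file object.
    with open(path, "rb") as buffered_file:
        buffered_type = type(buffered_file)
    # Binary mode with buffering disabled returns a raw binary file object.
    with open(path, "rb", buffering=0) as raw_file:
        raw_type = type(raw_file)

print(text_type.__name__, buffered_type.__name__, raw_type.__name__)
```

In CPython these are io.TextIOWrapper, io.BufferedReader and io.FileIO respectively.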

Usage
Text file
    f = open("myfile.txt", "r")
Buffered binary file
    f = open("myfile.jpg", "rb")

Behind the open Function
What does open do?
It calls the operating system's system call API.
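
Behind the open Function
Example: writing data out as CSV
    import csv

    # events is assumed to be a list of dicts with the keys listed below.
    # newline="" is what the csv documentation recommends for file objects.
    with open("events.csv", "w", newline="") as csv_file:
        fieldnames = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(events)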

Behind the open Function
Example: Windows
› CreateFile (obtain access to the file)
› QueryAllInformationFile (get file information)
› WriteFile (write to the file)
› CloseFile (close the file)
Observed with Process Monitor.

Behind the open Function
Example: Ubuntu on WSL
› openat (open the file)
› fstat (get file information)
› ioctl (device control)
› lseek (seek within the file)
› write (write to the file)
› close (close the file)
Observed with strace.
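
To get a feel for these calls, the os module exposes thin wrappers around several of them. The sketch below is not what open() does internally, and the exact sequence of system calls differs by platform and Python version. It only walks through a similar open, stat, seek, write, close sequence by hand, using a temporary file chosen for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "events.csv")

    # O_BINARY exists only on Windows; it prevents newline translation there.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)

    # Open (and create) the file; the result is an integer file descriptor.
    fd = os.open(path, flags, 0o644)
    try:
        info = os.fstat(fd)            # get file information
        os.lseek(fd, 0, os.SEEK_SET)   # seek to the start of the file
        written = os.write(fd, b"title,started_at,ended_at\n")
    finally:
        os.close(fd)                   # close the file

    with open(path, "rb") as f:
        data = f.read()
```

Each step here crosses into the operating system, which is the overhead the rest of the talk tries to avoid.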
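
Who Gets the Last Laugh?
Where should the final output go?
› Saving the file locally is not the goal
› We want to put the file somewhere external, such as AWS S3
We do not want to write the file to a local device!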

Today's Theme
In-Memory Streams

In-Memory Streams
What is an in-memory stream?
› It lets you treat a str or bytes object like a file object
› It is readable and writable, and supports random access
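
The two properties above can be sketched with io.StringIO and io.BytesIO. The sample strings and variable names are arbitrary and chosen only for illustration.

```python
import io

# Text stream: start from an existing str, then read and overwrite in place.
text_stream = io.StringIO("hello world")
text_stream.seek(6)                   # jump to an arbitrary position
tail = text_stream.read()             # read from there to the end
text_stream.seek(0)
text_stream.write("HELLO")            # overwrite the first five characters
text_value = text_stream.getvalue()   # the whole buffer as a str

# Binary stream: start empty, write bytes, then read from the middle.
byte_stream = io.BytesIO()
byte_stream.write(b"abcdef")
position = byte_stream.tell()         # current position after writing
byte_stream.seek(2)
middle = byte_stream.read(2)
```

seek(), tell(), read() and write() behave as they do on a file opened from disk, but everything stays in memory.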

StringIO
StringIO is an in-memory stream for text files.
Example: handling CSV with StringIO
    import csv
    import io

    # events is assumed to be a list of dicts with the keys listed below.
    with io.StringIO() as csv_file:
        fieldnames = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(events)
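
A point that is easy to miss is that the contents of a StringIO are gone once it is closed, so the text has to be taken out with getvalue() inside the with block. The sketch below shows one way to do that and then parses the text back. The event record is made-up sample data.

```python
import csv
import io

# Made-up sample data for illustration.
events = [
    {"title": "PyCon JP 2020", "started_at": "2020-08-28", "ended_at": "2020-08-29"},
]

with io.StringIO() as csv_file:
    fieldnames = ["title", "started_at", "ended_at"]
    writer = csv.DictWriter(csv_file, fieldnames)
    writer.writeheader()
    writer.writerows(events)
    # getvalue() must be called before the stream is closed.
    csv_text = csv_file.getvalue()

# Wrap the text in a new StringIO to read it back with the csv module.
rows = list(csv.DictReader(io.StringIO(csv_text, newline="")))
```

Calling getvalue() after the with block would raise ValueError, because the stream is already closed at that point.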
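
BytesIO
BytesIO is an in-memory stream for buffered binary files.
Example: handling a PNG with BytesIO
    import io

    # png_bytes is assumed to hold the raw bytes of a PNG image.
    with io.BytesIO(png_bytes) as f:
        png_header = f.read(8)  # the PNG signature is 8 bytes long
        print(png_header)  # b'\x89PNG\r\n\x1a\n'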
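
Because png_bytes is not defined on the slide, here is a self-contained sketch of the same idea. The helper looks_like_png and the placeholder data are assumptions for the example; the placeholder bytes are not a valid image, they only begin with the PNG signature.

```python
import io

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(stream):
    """Return True if the binary stream starts with the PNG signature."""
    header = stream.read(len(PNG_SIGNATURE))
    stream.seek(0)  # rewind so the caller can read from the beginning again
    return header == PNG_SIGNATURE


# Placeholder data: the signature followed by filler bytes.
png_bytes = PNG_SIGNATURE + b"\x00" * 16

with io.BytesIO(png_bytes) as f:
    is_png = looks_like_png(f)

with io.BytesIO(b"GIF89a") as f:
    is_other_png = looks_like_png(f)
```

The helper would work unchanged on a file opened with open(path, "rb"), since it only uses read() and seek().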

Review: Today's Goal
Processing flow
› Fetch GB-sized data over the internet
› Convert the GB-sized data into a CSV file
› Compress the CSV file into a ZIP archive
› Upload the ZIP-compressed data to cloud storage
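
Fetching Data over the Internet
Example: calling the Connpass API
    import json
    import urllib.request

    # url is assumed to be the API endpoint to query.
    with urllib.request.urlopen(url) as response:
        events = json.load(response)["events"]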
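
json.load() only needs an object with a read() method, so an in-memory stream can stand in for the HTTP response, for example when trying the code without network access. The payload below is invented for illustration and is not the real response format of any API.

```python
import io
import json

# Invented payload with an "events" key, encoded as an HTTP body would be.
fake_body = json.dumps(
    {
        "events": [
            {
                "title": "Sample event",
                "started_at": "2020-08-29T10:00:00+09:00",
                "ended_at": "2020-08-29T11:00:00+09:00",
            }
        ]
    }
).encode("utf-8")

# BytesIO plays the role of the response object returned by urlopen().
with io.BytesIO(fake_body) as response:
    events = json.load(response)["events"]
```

The parsing line is identical to the one used with urllib.request.urlopen(); only the source of the bytes changes.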

Transforming the Data
Example: turning the API result into CSV
    import csv
    import io

    with io.StringIO() as ts:
        header = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(ts, fieldnames=header)
        writer.writeheader()
        writer.writerows(events)
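
Compressing and Uploading the Data
Example: compressing to ZIP and uploading to AWS S3
    import io
    import zipfile

    # Assumes ts is still open and s3 is an S3 client such as boto3.client("s3").
    with io.BytesIO() as bs:
        with zipfile.ZipFile(bs, "w") as zf:
            zf.writestr("events.csv", ts.getvalue())
        bs.seek(0)  # rewinding the stream is the key point
        s3.upload_fileobj(bs, "bucket", "events.zip")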
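
The CSV and ZIP steps can be tried end to end without any cloud account. In the sketch below, upload_fileobj is a hypothetical stand-in that stores the stream contents in a dict instead of calling S3, and the event record is made-up sample data. Note that zipfile.ZipFile stores entries uncompressed by default, so the sketch passes compression=zipfile.ZIP_DEFLATED to compress them.

```python
import csv
import io
import zipfile

# Made-up sample data for illustration.
events = [
    {"title": "Sample event", "started_at": "2020-08-29", "ended_at": "2020-08-29"},
]

storage = {}


def upload_fileobj(fileobj, bucket, key):
    """Hypothetical stand-in for an uploader: keep the bytes in a dict."""
    storage[(bucket, key)] = fileobj.read()


with io.StringIO() as ts:
    writer = csv.DictWriter(ts, fieldnames=["title", "started_at", "ended_at"])
    writer.writeheader()
    writer.writerows(events)
    csv_text = ts.getvalue()  # take the text out before ts is closed

with io.BytesIO() as bs:
    with zipfile.ZipFile(bs, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("events.csv", csv_text)
    # After writing, the position is at the end of the stream.
    # Without this rewind, read() would return b"".
    bs.seek(0)
    upload_fileobj(bs, "bucket", "events.zip")

# Check the result by opening the stored bytes as a ZIP archive again.
with zipfile.ZipFile(io.BytesIO(storage[("bucket", "events.zip")])) as zf:
    names = zf.namelist()
    restored = zf.read("events.csv").decode("utf-8")
```

No file is created on the local disk at any step: the CSV lives in a str, the archive lives in a BytesIO, and the uploader reads straight from memory.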
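
Conclusion
› The io module provides in-memory streams.
› They let you treat a str or bytes object like a file object.
› Unlike a regular open(), reading and writing them does not involve file I/O system calls.
› They are best suited to situations where you want to reduce disk I/O, or where disk I/O is not possible.
Please add the io module to your toolbox!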