Getting Started with FastAPI: Touring Yamanashi's Hot Springs

2021.7.15
信玄パイ lightning talk (LT) event: みんなの FastAPI LT ("Everyone's FastAPI LT")

Yuuki Shimizu

July 15, 2021

Transcript

  1. Who are you?
    Yuuki Shimizu
    • Android / iOS programmer
    • Born and raised in Yamanashi (currently working away from home in Tokyo, fifth year)
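  2. Introduction
    Drawing on what I learned from 信玄パイ's FastAPI study session held last month, 「FastAPI でハジメル Python」 ("Starting Python with FastAPI"), I tried building an API and will present it here.
    (I could not attend on the day because of overtime, so I studied on my own. Thank you for sharing the materials!)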
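  3. Yamanashi is full of hot springs!
    • A wide variety of spring types
      ◦ "Of the 10 types of spring quality that exist, Yamanashi has 9, which puts it among the top in the country."
    • Beautiful scenery
      ◦ "While soaking in a hot spring you can enjoy all kinds of views: Mt. Fuji dyed red by the sun in the early morning or at dusk, the beautiful moment when the Kofu Basin shifts from sunset to night view, and valleys rich in nature."
    Source: やまなし立ち寄り百名湯
    I often stop by these when I go back to my hometown.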
  4. chezou/tabula-py
    • A library that converts tables inside a PDF file into pandas DataFrame objects
      ◦ It can also convert them to CSV, TSV, or JSON files
    • It is not an OCR tool
    • Requires Java 8 or later
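
The file-conversion path mentioned above can be sketched as follows. This is a minimal sketch, assuming tabula-py and a Java runtime are installed; the wrapper name `pdf_tables_to_csv` and the output file name in the usage note are hypothetical, and `lattice=True` simply mirrors the option used in main.py.

```python
def pdf_tables_to_csv(pdf_path, csv_path):
    """Write every table found in the PDF to a single CSV file."""
    # Imported lazily so this file can be loaded without tabula-py installed.
    import tabula

    # convert_into writes the extracted tables straight to disk
    # instead of returning DataFrame objects.
    tabula.convert_into(
        pdf_path,
        csv_path,
        output_format="csv",  # "tsv" and "json" are also accepted
        pages="all",
        lattice=True,
    )
```

Calling `pdf_tables_to_csv("h3012011.pdf", "onsen.csv")` would then produce a CSV next to the script, provided the PDF is present.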
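  5. main.py - ① Reading the PDF
    get_data reads the PDF and returns a DataFrame object.
    check_columns compares the column names of tables that span multiple pages and decides whether they are the same table (it is called from get_data).

        import pandas as pd
        import tabula


        def check_columns(df, previous_df):
            # True when both DataFrames have exactly the same column names
            difference1 = set(df.keys()) - set(previous_df.keys())
            difference2 = set(previous_df.keys()) - set(df.keys())
            return len(difference1) == 0 and len(difference2) == 0


        def get_data(pdf_path):
            previous_df = pd.DataFrame()
            # read_pdf returns one DataFrame per table found in the PDF
            dfs = tabula.read_pdf(pdf_path, lattice=True, pages='all')
            for df in dfs:
                # Join tables that continue across multiple pages
                if check_columns(df, previous_df):
                    df = pd.concat([previous_df, df])
                # A table with different columns replaces what was collected so far
                previous_df = df
            return previous_df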
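
The page-merging rule can be tried without a PDF or Java. The sketch below assumes small in-memory DataFrames standing in for what `tabula.read_pdf` returns; the helper names `same_columns` and `merge_pages`, the `施設名` column, and the sample rows are made up for illustration. It also swaps the two set differences for a set-equality check and passes `ignore_index=True` so the merged rows are renumbered, neither of which is in the original main.py.

```python
import pandas as pd


def same_columns(df, previous_df):
    # Equivalent to both set differences being empty.
    return set(df.columns) == set(previous_df.columns)


def merge_pages(dfs):
    merged = pd.DataFrame()
    for df in dfs:
        if same_columns(df, merged):
            # Same header as the previous page: treat it as a continuation.
            df = pd.concat([merged, df], ignore_index=True)
        # Otherwise the new table replaces what was collected so far.
        merged = df
    return merged


# Made-up sample pages (not taken from the real PDF).
page1 = pd.DataFrame({"市町村名": ["甲府市"], "施設名": ["Onsen A"]})
page2 = pd.DataFrame({"市町村名": ["笛吹市", "北杜市"], "施設名": ["Onsen B", "Onsen C"]})

merged = merge_pages([page1, page2])
```

On the first iteration the empty DataFrame has no columns, so the first page is taken as-is; later pages with a matching header are appended to it.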
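  6. main.py - ② Creating the API
    • [GET] / : API that returns all records
    • [GET] /area/{area} : API that returns only the records for the specified municipality

        import json

        from fastapi import FastAPI

        # get_data is the function defined in part ① of main.py
        app = FastAPI()
        pdf_path = "h3012011.pdf"


        @app.get("/")
        def read_root():
            # The PDF is parsed again on every request
            data = get_data(pdf_path)
            json_data = data.to_json(orient='records')
            return json.loads(json_data)


        @app.get("/area/{area}")
        def read_item(area: str):
            data = get_data(pdf_path)
            # Keep only rows whose 市町村名 (municipality name) matches the path parameter
            df_mask = data['市町村名'] == area
            data = data[df_mask]
            json_data = data.to_json(orient='records')
            return json.loads(json_data)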
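
The filtering and JSON conversion inside `/area/{area}` can be checked without starting a server. This is a minimal sketch under assumptions: the function name `records_for_area`, the `name` column, and the sample rows are hypothetical; only the `市町村名` column comes from the slides. It also shows one side effect of the `to_json` and `json.loads` round trip, namely that missing values come back as `None`, which serializes cleanly as JSON `null`.

```python
import json

import pandas as pd


def records_for_area(df, area):
    # Boolean mask on the municipality column, as in read_item.
    mask = df["市町村名"] == area
    # to_json writes missing values as null; json.loads turns that into None.
    return json.loads(df[mask].to_json(orient="records"))


# Made-up sample data (not taken from the real PDF).
sample = pd.DataFrame(
    {
        "市町村名": ["甲府市", "笛吹市", "甲府市"],
        "name": ["Onsen A", "Onsen B", None],
    }
)

kofu = records_for_area(sample, "甲府市")
missing = records_for_area(sample, "no such area")
```

An unknown municipality yields an empty list rather than an error, which matches what the endpoint would return.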
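  7. I used Docker
    • Docker host (VPS)
      ◦ Nginx reverse proxy: onsen.yamanashi.dev:443
      ◦ App container: FastAPI at localhost:45280, running main.py with the PDF
    • The PDF comes from the Yamanashi Prefecture website
    • Image: tiangolo/uvicorn-gunicorn-fastapi:python3.8-alpine3.10 as the base, with openjdk11 installed on top
    • To avoid putting load on the prefecture's site, the PDF is copied into the container in advance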
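  8. Impressions and summary: "Getting Started with FastAPI: Touring Yamanashi's Hot Springs"
    • FastAPI looks like a good fit for quickly building an API from open data
      ◦ Combined with Tabula, even PDF files can be served through FastAPI
      ◦ Converting the PDF file takes time, so some extra measure such as saving intermediate data may be needed
    • I would also like to try building an API for voting on favorite hot spring facilities, and authentication
    • It was great to learn about many hot spring facilities I have not visited yet!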
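
One way to save intermediate data, as suggested in the summary, is to write the parsed records to a JSON file once and reuse it on later calls. This is a minimal sketch, not the author's implementation: `load_records`, the cache file name, and `fake_parse` (a stand-in for the slow tabula conversion) are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_records(pdf_path, cache_path, parse):
    """Return parsed records, reusing a JSON cache file when one exists."""
    cache = Path(cache_path)
    if cache.exists():
        return json.loads(cache.read_text(encoding="utf-8"))
    records = parse(pdf_path)
    cache.write_text(json.dumps(records, ensure_ascii=False), encoding="utf-8")
    return records


calls = []


def fake_parse(pdf_path):
    # Stand-in for the slow PDF conversion; records each invocation.
    calls.append(pdf_path)
    return [{"市町村名": "甲府市"}]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "onsen.json"
    first = load_records("h3012011.pdf", cache_file, fake_parse)   # parses and writes the cache
    second = load_records("h3012011.pdf", cache_file, fake_parse)  # served from the cache file
```

In the real service the `parse` argument could be a function that calls `get_data` and converts the result with `to_json(orient='records')`. The cache would need to be deleted whenever the source PDF is updated.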