How switching the MySQL-to-BigQuery sync to incremental updates made it 4x faster / Sync from MySQL to BigQuery become 4x faster by incremental updating

This is a lightning talk (LT) from Embulk Meetup Tokyo #3.

Takehiro Shiozaki

May 16, 2017

Transcript

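  1. Introduction to the data analysis platform
     Master data: RDS (MySQL) → data transfer server → GCS → BigQuery
     The pipeline had turned into a "secret sauce" that was starting to go bad:
     ・mysqldump, tr, sed, etc. were combined in complicated ways
     ・The settings for splitting tables before transfer were hardcoded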
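  2. Fetch only the data changed since the previous sync
     • Add a where clause to the mysql input plugin settings
     • The rows that need to be fetched drop to [figure missing in the transcript] or less

     in:
       type: mysql
       host: example.com
       user: user_name
       password: ********
       database: db_name
       table: items
       select: "*"
       where: "updated_at > '2017-05-15 00:00:00'" # add this line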
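The slide hardcodes the timestamp in the where clause. As a minimal sketch of how that value might be generated on each run, the helper below formats a stored "previous sync" time into the same clause. The function name `build_where_clause` and the idea of passing the previous sync time in as a `datetime` are assumptions for illustration; the talk does not say how the timestamp is tracked.

```python
from datetime import datetime


def build_where_clause(last_synced_at: datetime, column: str = "updated_at") -> str:
    """Return a where clause that selects rows changed after the previous sync.

    Hypothetical helper: the slide only shows the finished clause, not how
    the timestamp is produced.
    """
    # Same 'YYYY-MM-DD HH:MM:SS' literal format as the clause on the slide.
    timestamp = last_synced_at.strftime("%Y-%m-%d %H:%M:%S")
    return f"{column} > '{timestamp}'"


if __name__ == "__main__":
    print(build_where_clause(datetime(2017, 5, 15, 0, 0, 0)))
    # updated_at > '2017-05-15 00:00:00'
```

The resulting string could then be written into the `where:` key of the config before the sync runs. This only picks up changes if every update to a row also bumps `updated_at`.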
  3. Merge the tables with SQL
     • Combine the two tables with union all, then partition by the primary key and select only the newest row for each key
     • Write the result of this SQL back to the original table

     select * from (
       select
         *,
         row_number() over (partition by id order by updated_at desc) as rn
       from (
         select * from tmp.items
         union all
         select * from mysql.items
       )
     )
     where rn = 1
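To make the merge step concrete, here is a small sketch that runs the same union all plus row_number() pattern on toy data. It uses Python's built-in sqlite3 module as a stand-in for BigQuery, so the table names `mysql_items` and `tmp_items`, the sample rows, and the `merge_items` function are all assumptions for illustration. It also assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3


def merge_items(conn: sqlite3.Connection) -> list:
    """Keep only the newest row per id across the old copy and the new delta."""
    merge_sql = """
        select id, name, updated_at from (
            select *,
                   row_number() over (partition by id order by updated_at desc) as rn
            from (
                select * from tmp_items      -- rows fetched since the previous sync
                union all
                select * from mysql_items    -- previously synced copy
            )
        )
        where rn = 1
        order by id
    """
    return conn.execute(merge_sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table mysql_items (id integer, name text, updated_at text);
    create table tmp_items (id integer, name text, updated_at text);
    insert into mysql_items values
        (1, 'old apple', '2017-05-14 10:00:00'),
        (2, 'banana',    '2017-05-14 11:00:00');
    insert into tmp_items values
        (1, 'new apple', '2017-05-15 09:00:00'),
        (3, 'cherry',    '2017-05-15 09:30:00');
""")

for row in merge_items(conn):
    print(row)
# (1, 'new apple', '2017-05-15 09:00:00')
# (2, 'banana', '2017-05-14 11:00:00')
# (3, 'cherry', '2017-05-15 09:30:00')
```

Row 1 is replaced by its updated version, row 2 is carried over unchanged, and row 3 is newly added. Unlike the query on the slide, this sketch lists the columns explicitly in the outer select so the helper column `rn` is not written back. Rows deleted in MySQL are not removed by this pattern.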
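  4. Results
     RDS (MySQL) → data transfer server → GCS → BigQuery
     Three stage timings on the diagram: ? min → ? min, ? min → ? min, ? min → ? min
     4x faster overall: about ? h ? min → ? min
     Table merge: ? min → ? min
     (The numeric values are missing from the transcript; only the units survive.)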