Upgrade to Pro — share decks privately, control downloads, hide ads and more …

BigData-JAWS #9: Data Analytics Infra for Crowd...

BigData-JAWS #9: Data Analytics Infra for CrowdWorks

Masayuki Morita

October 11, 2017
Tweet

More Decks by Masayuki Morita

Other Decks in Technology

Transcript

  1. ࠓ೔࿩͢͜ͱ w σʔλ෼ੳج൫ͷγεςϜߏ੒ w .Z42-3FETIJGU΁ͷσʔλಉظ w ߦಈϩάͷσʔλಉظ w όονूܭ w

    σʔλ෼ੳ w 3FETIJGUӡ༻5JQT w ϏϡʔΛར༻ͨ͠σʔλࢀরݖݶͷ؅ཧ w 3FETIJGUͷ؂ࢹ
  2. σʔλ෼ੳج൫ 3FETIJGU SFQMJDB .Z42- "QQT ,JOFTJT4USFBNT -BNCEB 4 'JSFIPTF CSJDPMBHF

    αʔϏεຊମ .Z42-σʔλಉظ ߦಈϩάಉظ όονूܭ σʔλ෼ੳ
  3. σʔλ෼ੳج൫ 3FETIJGU SFQMJDB .Z42- "QQT ,JOFTJT4USFBNT -BNCEB 4 'JSFIPTF CSJDPMBHF

    αʔϏεຊମ .Z42-σʔλಉظ ߦಈϩάಉظ όονूܭ σʔλ෼ੳ
  4. σʔλ෼ੳج൫ 3FETIJGU SFQMJDB .Z42- "QQT ,JOFTJT4USFBNT -BNCEB 4 'JSFIPTF CSJDPMBHF

    αʔϏεຊମ .Z42-σʔλಉظ ߦಈϩάಉظ όονूܭ σʔλ෼ੳ
  5. ߦಈϩάͷಉظ w ΍Γ͍ͨ͜ͱ w ಛఆϖʔδͷӾཡͳͲߦಈϩάΛɺϝΠϯͷ%#ʹॻ͔ͣʹ෼ੳ༻ͷ3FETIJGU ʹඇಉظʹऔΓࠐΈ͍ͨ w ࣮૷ w ΞϓϦέʔγϣϯͰॻ͍ͨϩάϑΝΠϧΛqVFOUQMVHJOLJOFTJTͰ,JOFTJT

    4USFBNTʹॻ͖ࠐΈ w -BNCEBͰϩάϑΝΠϧͷόϦσʔγϣϯͯ͠ɺਖ਼ৗͳϨίʔυͷΈ,JOFTJT 'JSFIPTF΁࿈ܞͭͭ͠ɺੜϩάΛ4ʹόοΫΞοϓ w ,JOFTJT'JSFIPTFͰ3FETIJGUʹ$01: w ΞϓϦέʔγϣϯ͔ΒϩάΛ௚઀'JSFIPTFʹૹΒͣʹ4UFBNT-BNCEBΛڬΜ Ͱ͍Δͷ͸ɺϩάͷόϦσʔγϣϯ͍ͨ͠ͷͱɺ'JSFIPTF͸ετϦʔϜ͋ͨ ΓѼઌͷςʔϒϧ͕ͭͰɺߏ੒ͷॊೈੑ͕͋Μ·Γͳ͍ͷͰɻ
  6. ,JOFTJT'JSFIPTFͱ͸ w ετϦʔϛϯάσʔλΛ3FETIJGU4&MBTUJD4FBSDIʹอଘͯ͘͠ΕΔϚωʔδ υαʔϏε w ࣗલͰఆظతʹ3FETIJGUʹ$01:ίϚϯυୟ͍ͨΓ͠ͳͯ͘Α͍ w ࠷ۙ౦ژϦʔδϣϯʹདྷͨ w ಺෦తʹ͸ɺ4USFBNT-BNCEB43FETIJGUΛ΍ͬͯΔΑ͏Ͱɺ"84

    ͕؅ཧ͢Δ'JSFIPTFͷ*1ΞυϨεΛ3FETIJGUͷ4FDVSJUZ(SPVQͰڐՄͯ͋͛͠Δ ඞཁ͕͋Δͷʹ஫ҙ w ౦ژϦʔδϣϯͷ*1ΞυϨε͸ެࣜυΩϡϝϯτࢀর w IUUQEPDTBXTBNB[PODPNpSFIPTFMBUFTUEFWDPOUSPMMJOHBDDFTTIUNM
  7. σʔλ෼ੳج൫ 3FETIJGU SFQMJDB .Z42- "QQT ,JOFTJT4USFBNT -BNCEB 4 'JSFIPTF CSJDPMBHF

    αʔϏεຊମ .Z42-σʔλಉظ ߦಈϩάಉظ όονूܭ σʔλ෼ੳ
  8. CSJDPMBHFͷδϣϒఆٛ  w TRMϑΝΠϧʹूܭΫΤϦΛఆٛ w ूܭΫΤϦʹ΋ม਺͕ຒΊࠐΊΔͷͰ࢖͍ճ͕͠Ͱ͖Δ INSERT INTO $dest_table SELECT

    date_trunc('${term}', jst_visited_on)::date AS ${term}, v.user_id AS user_id, .. FROM daily_active_user_visits AS v GROUP BY ${term}, v.user_id ;
  9. CSJDPMBHFͷδϣϒఆٛ  w KPCϑΝΠϧʹδϣϒΛఆٛ w ಉ͡42-ʹม਺Λઃఆͯ͠࢖͍ճ͠ w SFCVJMESFOBNF͸ςʔϒϧͷDSFBUFTXBQܕͷચ͍ସ͑ class: rebuild-rename

    sql-file: active_users.sql table-def: active_users.ct dest-table: active_users_daily term: day class: rebuild-rename sql-file: active_users.sql table-def: active_users.ct dest-table: active_users_monthly term: month
  10. CSJDPMBHFͷδϣϒఆٛ  $ bundle exec bricolage --dry-run --job active_users_daily.job \timing

    on \set ON_ERROR_STOP false drop table active_users_daily_old cascade; .... -- active_users.ct CREATE TABLE active_users_daily_wk ( day date not null, .... -- active_users.sql INSERT INTO active_users_daily_wk SELECT date_trunc('day', jst_visited_on)::date AS day, .... ESZSVOͰੜ੒͞ΕΔ42-จ͕֬ೝͰ͖Δ
  11. σʔλ෼ੳج൫ 3FETIJGU SFQMJDB .Z42- "QQT ,JOFTJT4USFBNT -BNCEB 4 'JSFIPTF CSJDPMBHF

    αʔϏεຊମ .Z42-σʔλಉظ ߦಈϩάಉظ όονूܭ σʔλ෼ੳ
  12. ϏϡʔΛར༻ͨ͠ σʔλࢀরݖݶͷ؅ཧ w ΍Γ͍ͨ͜ͱ w ಛఆͷςʔϒϧΧϥϜ͸ݟ͑ΔϢʔβάϧʔϓΛ੍ݶ͍ͨ͠ w ೔ʑαʔϏε։ൃͷ%#ϚΠάϨʔγϣϯͰςʔϒϧΧϥϜ͕૿͑ͨΓ͢Δ ͷͰɺͲΕ͕ݟ͑ͯΑ͍͔͸αʔϏεຊମͷίʔυϕʔεͰ؅ཧ͍ͨ͠ w

    ࣮૷ w ςʔϒϧͷ࣮ମ͸ݟͤͣʹɺϏϡʔ͚ͩݟͤΔΑ͏ݖݶௐ੔ w ର৅ͷςʔϒϧΧϥϜ͸ΞϓϦέʔγϣϯ಺ͰΞϊςʔγϣϯͯ͠ɺϝλσʔ λςʔϒϧΛ࡞੒͠3FETIJGUʹಉظ w ϝλσʔλςʔϒϧ͔ΒϏϡʔΛੜ੒͢Δ಺੡πʔϧΛ࡞੒
  13. ςʔϒϧͷ࣮ମΛݟͤͣʹ Ϗϡʔ͚ͩݟͤΔ w VONBTLFEεΩʔϚʹςʔϒϧͷ࣮ମɺNBTLFEεΩʔϚʹϏϡʔΛ࡞੒ w VONBTLFE@WJFXFSTͱNBTLFE@WJFXFSTͷάϧʔϓΛ࡞ͬͯݖݶௐ੔ w ϋϚΓϙΠϯτ w (3"/54&-&$50/"--5"#-&4ʙ͸طʹଘࡏ͢ΔϏϡʔͷΈ༗ޮͰɺޙ͔Β૿

    ͑ͨϏϡʔ΋ࣗಈͰݟ͑ΔΑ͏ʹ͢Δʹ͸"-5&3%&'"6-513*7*-&(&4ʙ͕ඞཁ w NBTLFE@WJFXFST͸࣮ςʔϒϧͷଘࡏ͢ΔVONBTLFEεΩʔϚ΁ͷ64"(&͕ඞཁ w ۩ମతͳ42-͸एׯ൥ࡶͳͷͰɺ࣮૷ৄࡉ͸2JJUBʹॻ͍ͨ w 3FETIJGUͰςʔϒϧͷ࣮ମΛݟͤͣʹϏϡʔ͚ͩݟͤͯΞΫηε੍ݶ͍ͨ͠  IUUQTRJJUBDPNNJOBNJKPZPJUFNTDCB
  14. 3FETIJGUͷ؂ࢹ w ΍Γ͍ͨ͜ͱ w 3FETIJGUͷ҆ఆՔಇͷͨΊɺՄ༻ੑɺੑೳʹؔ࿈͢ΔϝτϦΫεΛ؂ࢹ͍ͨ͠ w 'MZ%BUB4ZODͷಉظ͕஗Ԇͯ͠ͳ͍͔؂ࢹ͍ͨ͠ w ࣮૷ w

    $MPVE8BUDI͔Β%BUBEPHͷެࣜΠϯςάϨʔγϣϯͰϝτϦΫε౤͛ͯ؂ࢹ w 3FETIJGUͷγεςϜϏϡʔʹఆظతʹ42-࣮ߦͨ݁͠ՌΛ%BUBEPHʹϝτϦΫε ౤͛ͯ؂ࢹ w 'MZ%BUBͷಉظݩ%#ʹγεςϜλΠϜελϯϓςʔϒϧΛ࡞ͬͯDSPOͰ෼ʹ ճ61%"5&ͯ͠ɺఆظతʹ3FETIJGUͰݱࡏ࣌ࠁͱൺֱͯ͠஗ԆΛ%BUBEPHͰ؂ࢹ w ఆظతʹ42-Λ࣮ߦͯ͠%BUBEPHʹ౤͚͛ͭΔDZRMEPHͱ͍͏πʔϧΛ࡞ͬͨ
  15. $MPVE8BUDIͷ؂ࢹ߲໨ w ݟͨ΄͏͕Αͦ͞͏ͳ΋ͷ w $166UJMJ[BUJPO$16࢖༻཰ w 1FSDFOUBHF%JTL4QBDF6TFEσΟεΫ࢖༻཰ w )FBMUI4UBUVTϔϧενΣοΫ w

    .BJOUFOBODF.PEFϝϯςঢ়ଶ͔Ͳ͏͔ w ͦͷଞऔಘͰ͖Δ߲໨ w IUUQEPDTBXTBNB[PODPN"NB[PO$MPVE8BUDIMBUFTU NPOJUPSJOHSTNFUSJDTDPMMFDUFEIUNM
  16. γεςϜϏϡʔͷ؂ࢹ߲໨ w ݟͨ΄͏͕Αͦ͞͏ͳ΋ͷ w 457@8-.@26&3:@45"5&Ωϡʔ͝ͱͷΫΤϦ࣮ߦ࣌ؒ w 457@8-.@4&37*$&@$-"44@45"5&Ωϡʔ͝ͱͷΫΤϦ࣮ߦ਺ w 45-@26&3:Ϣʔβ͝ͱͷΫΤϦ࣮ߦ਺ɺࣦഊ਺ɺ࣮ߦ࣌ؒ w

    45-@"-&35@&7&/5@-0(ύϑΥʔϚϯεܯࠂͷΠϕϯτϩά w ͦͷଞऔಘͰ͖Δ߲໨ w IUUQEPDTBXTBNB[PODPNSFETIJGUMBUFTUEHDN@DIBQ@TZTUFN UBCMFTIUNM
  17. ݱࡏͷΩϡʔ͝ͱͷΫΤϦ࣮ߦ࣌ؒ SELECT TRANSLATE(TRIM(TRANSLATE(c.condition, '():', '')), ' ', '_') AS name,

    s.state, COUNT(*) AS count, NVL(MAX(s.queue_time), 0) AS max_queue_time, NVL(MAX(s.exec_time), 0) AS max_exec_time FROM stv_wlm_classification_config c, stv_wlm_query_state s WHERE c.action_service_class = s.service_class GROUP BY s.state, c.condition; name | state | count | max_queue_time | max_exec_time --------------------+------------------+-------+----------------+--------------- user_group_hoge | Running | 1 | 1129521 | 393674 user_group_hoge | QueuedWaiting | 1 | 390360 | 0 querytype_any | Running | 1 | 0 | 137126
  18. ݱࡏͷΩϡʔ͝ͱͷΫΤϦ࣮ߦ਺ SELECT TRANSLATE(TRIM(TRANSLATE(c.condition, '():', '')), ' ', '_') AS name,

    s.num_queued_queries AS queued, s.num_executing_queries AS executing, s.num_executed_queries AS executed FROM stv_wlm_classification_config c, stv_wlm_service_class_state s WHERE c.action_service_class = s.service_class; name | queued | executing | executed --------------------------------------+--------+-----------+---------- querytype_any | 0 | 2 | 142534 user_group_hoge | 0 | 1 | 1365805
  19. ௚ۙ෼ͷϢʔβ͝ͱͷΫΤϦ࣮ߦ࣌ؒ SELECT TRIM(u.usename) AS username, (CASE q.aborted WHEN 0 THEN

    'no' WHEN 1 THEN 'yes' END) AS aborted, COUNT(*) AS count, NVL(MIN(DATEDIFF(us, q.starttime, q.endtime)), 0) AS min, NVL(MAX(DATEDIFF(us, q.starttime, q.endtime)), 0) AS max, NVL(AVG(DATEDIFF(us, q.starttime, q.endtime)), 0) AS avg, NVL(MEDIAN(DATEDIFF(us, q.starttime, q.endtime))::integer, 0) AS median FROM stl_query q, pg_user u WHERE q.userid = u.usesysid AND q.endtime > (SYSDATE - interval '5m') GROUP BY q.userid, q.aborted, u.usename; username | aborted | count | min | max | avg | median -------------------------+---------+-------+----------+----------+----------+---------- hoge | no | 1 | 12725416 | 12725416 | 12725416 | 12725416 hoge | yes | 1 | 9176396 | 9176396 | 9176396 | 9176396
  20. DZRMEPH w IUUQTHJUIVCDPNDSPXEXPSLTDZRMEPH w ఆظతʹ42-Λ࣮ߦͯ͠%BUBEPHʹϝτϦΫεΛ౤͚͛ͭΔ؂ࢹ πʔϧ w ࣮૷͸(P w σʔλιʔε͸1PTUHSFT

    3FETIJGUؚΉ ɺ.Z42-ʹରԠ w ෳ਺ͷ42-ΛҟͳΔִؒͰ࣮ߦՄೳ w ؂ࢹΫΤϦͰ%#ෛՙΛ͔͚ͳ͍Α͏ʹಉ࣮࣌ߦΫΤϦ͸ͭʹ੍ޚ