Introduction To Data Analytics In AWS (AWSにおけるデータ分析入門)
hedgehog051
October 06, 2021
Transcript
Introduction to Data Analytics on AWS
Relic Inc. (株式会社Relic), Kan Kumada
Self-introduction
• Kan Kumada
• […] infrastructure engineer
• Joined Relic Inc. in […]
I want to do data analysis…
I want to bring intelligence to the business…
Interest in data analysis keeps growing
But before that
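What does data analysis actually involve?
• Collect data such as results and track records (e.g., how much was sold, how much traffic there was)
• Derive some insight from the collected data (e.g., what is selling, when, and to whom)
• Act on the insights gained (e.g., adjust prices, adjust how items are displayed, widen the target audience)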
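The collect, insight, action loop on this slide can be sketched in a few lines of plain Python. This is only an illustration: the sales records, field names, and the wording of the suggested action are all hypothetical and do not come from the talk.

```python
from collections import Counter

# Hypothetical sales records standing in for the "collected" data.
SALES = [
    {"item": "coffee", "buyer": "student", "amount": 300},
    {"item": "coffee", "buyer": "office worker", "amount": 300},
    {"item": "tea", "buyer": "student", "amount": 250},
    {"item": "coffee", "buyer": "student", "amount": 300},
]


def collect():
    """Step 1: gather results (here, a hard-coded sample)."""
    return list(SALES)


def derive_insight(records):
    """Step 2: find what sells best, and who buys it most often."""
    by_item = Counter(r["item"] for r in records)
    top_item, _ = by_item.most_common(1)[0]
    buyers = Counter(r["buyer"] for r in records if r["item"] == top_item)
    top_buyer, _ = buyers.most_common(1)[0]
    return {"top_item": top_item, "top_buyer": top_buyer}


def decide_action(insight):
    """Step 3: turn the insight into a concrete action."""
    return (
        f"Feature {insight['top_item']} more prominently "
        f"for {insight['top_buyer']} customers"
    )


insight = derive_insight(collect())
action = decide_action(insight)
```

Which insight is worth computing depends entirely on the action you intend to take, which is the point the next slide makes.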
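It is important to be clear about what you want to achieve by analyzing data
Data analytics services on AWS
I see. I still don't get it.
Building a data analytics platform
• Collect data such as results and track records → how to collect it, and where to put it
• Derive some insight from the collected data → process it so it is easy to analyze, analyze it, visualize it
A rough classification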
Collect: Amazon Kinesis, Amazon Kinesis Video Streams, Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, Amazon Managed Streaming for Apache Kafka, AWS Data Pipeline, AWS Data Exchange
Store: Amazon Redshift, AWS Lake Formation, Amazon S3
Process: Amazon EMR, AWS Glue, AWS Glue Elastic Views, AWS Glue DataBrew, Amazon Kinesis Data Analytics
Analyze: Amazon EMR, Amazon Athena, Amazon Kinesis Data Analytics, Amazon Redshift, Amazon QuickSight, Amazon OpenSearch Service
Visualize: Amazon QuickSight, Amazon OpenSearch Service (formerly Amazon Elasticsearch Service)
There is a lot here, but I feel the direction is starting to come into view
A quick look at each
Collect
Real-time streaming
• Amazon Kinesis: umbrella name for the Kinesis services
• Amazon Kinesis Video Streams: capture, process, and store streaming video
• Amazon Kinesis Data Streams: capture, process, and store stream data
• Amazon Kinesis Data Firehose: load stream data into AWS data stores
• Amazon Managed Streaming for Apache Kafka: managed Apache Kafka for sending and receiving stream data
Other
• AWS Data Pipeline: scheduled data movement and transformation
• AWS Data Exchange: subscriptions to third-party data, such as article data provided by Reuters
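Store
• Amazon Redshift: data warehouse. A warehouse that gathers and organizes large volumes of "structured data" from systems
• AWS Lake Formation: builds a data lake, which keeps raw data whose use has not yet been decided
• Amazon S3: object storage for "structured data", "unstructured data", and so on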
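Process and analyze
• Amazon EMR: big data frameworks. Combines related OSS to run ETL, stream processing, and analysis on large volumes of data
• AWS Glue: serverless ETL (extract/transform/load)
• AWS Glue Elastic Views (preview): builds materialized views by accessing multiple data stores, then combining and copying the data
• AWS Glue DataBrew: no-code data cleanup and normalization
• Amazon Athena: runs ad hoc queries against S3
• Amazon Kinesis Data Analytics: transforms and analyzes streaming data
• Amazon Redshift: data warehouse. Runs complex SQL queries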
Visualize
• Amazon QuickSight: serverless BI tool and visualization
• Amazon OpenSearch Service & Kibana: real-time data search and visualization
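When to use which (the ones that look most important)
Amazon Kinesis Video Streams
• You have devices or applications that generate video data
• You want to stream live or recorded media to browsers and smartphones over HLS
• You want real-time two-way media streaming or streaming to web browsers
• You want to feed video data to Rekognition Video (video recognition) or SageMaker (ML)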
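Amazon Kinesis Data Streams
• You want to collect log and event data generated by servers and devices in real time and at high speed
• You want to collect data with a latency of one second or less
• You want to process streaming data with Lambda
• You want to forward streaming data to EC2
• You want to forward streaming data to Kinesis Data Analytics for real-time analysis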
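As a rough sketch of the first use case, the function below sends one log event to a stream. It assumes a boto3 Kinesis client (`boto3.client("kinesis")`) and uses its `put_record` call; the stream name `app-logs`, the event fields, and the `RecordingClient` stand-in are hypothetical and exist only so the example can run without AWS access.

```python
import json


def send_event(kinesis_client, stream_name, event, partition_key):
    """Send one JSON event to a Kinesis data stream.

    kinesis_client is assumed to be a boto3 Kinesis client; any object
    with a compatible put_record method works.
    """
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),  # payload is sent as bytes
        PartitionKey=partition_key,  # records with the same key go to the same shard
    )


class RecordingClient:
    """Hypothetical stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def put_record(self, **kwargs):
        self.calls.append(kwargs)
        return {"ShardId": "shardId-000000000000", "SequenceNumber": "0"}


client = RecordingClient()
response = send_event(
    client, "app-logs", {"level": "INFO", "msg": "login"}, partition_key="server-1"
)
```

With real credentials, passing `boto3.client("kinesis")` instead of `RecordingClient()` should send the same request to an existing stream.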
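Amazon Kinesis Data Firehose
• You want to deliver stream data to S3, Redshift, or OpenSearch Service
• You want to deliver data to those data stores in near real time (within 60 seconds)
• You want to deliver data to service providers such as Datadog, New Relic, and MongoDB
• You want to convert data to Apache Parquet or Apache ORC before delivering it to a data store
• You want to deliver to data stores without developing an application or managing infrastructure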
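The "near real time" delay comes from buffering: Firehose collects incoming records and delivers a batch once a size limit or a time interval is reached, whichever happens first. The class below is a toy model of that idea in plain Python, not the Firehose implementation or its API; the class name, parameters, and byte limits are all hypothetical.

```python
class BufferedDelivery:
    """Toy model of size-or-interval buffering (not the Firehose implementation)."""

    def __init__(self, max_bytes, max_seconds, deliver):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.deliver = deliver      # callback that receives a list of records
        self.records = []
        self.size = 0
        self.started_at = None      # arrival time of the oldest buffered record

    def put(self, record, now):
        """Buffer one record (bytes) that arrived at time `now` (seconds)."""
        if self.started_at is None:
            self.started_at = now
        self.records.append(record)
        self.size += len(record)
        self.flush_if_due(now)

    def flush_if_due(self, now):
        """Deliver the batch if it is large enough or old enough."""
        if not self.records:
            return
        too_big = self.size >= self.max_bytes
        too_old = now - self.started_at >= self.max_seconds
        if too_big or too_old:
            self.deliver(self.records)
            self.records, self.size, self.started_at = [], 0, None


batches = []
buf = BufferedDelivery(max_bytes=10, max_seconds=60, deliver=batches.append)
buf.put(b"aaaa", now=0)       # 4 bytes buffered
buf.put(b"bbbbbbb", now=5)    # 11 bytes in total, so the batch is flushed by size
buf.put(b"cc", now=10)        # 2 bytes buffered
buf.flush_if_due(now=70)      # 60 seconds since the oldest record, flushed by age
```

If sub-second latency matters more than managed delivery, the Kinesis Data Streams use cases above are the closer fit.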
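Amazon Kinesis Data Analytics
• You want to query streaming data in real time with standard SQL
• You want to analyze streaming data in real time with sub-second latency
• You want to use Apache Flink for streaming ETL integrated with various AWS services
• You want to build analytics applications in SQL, Java, Scala, or Python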
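A typical real-time query over a stream is a windowed aggregation, for example counting page views per fixed time window. The sketch below shows the idea of a tumbling (fixed, non-overlapping) window in plain Python; it does not use Kinesis Data Analytics or Apache Flink, and the event tuples are hypothetical.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, key) over fixed, non-overlapping windows.

    events is an iterable of (timestamp_in_seconds, key) tuples.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical page-view events: (seconds since start, page).
events = [(0, "/home"), (3, "/home"), (9, "/cart"), (12, "/home"), (19, "/cart")]
result = tumbling_window_counts(events, window_seconds=10)
```

A streaming engine does the same grouping continuously as records arrive, instead of over a finished list.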
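AWS Data Pipeline
• Non-real-time
• You want to periodically move data between AWS storage and compute services and on-premises systems
• You want to apply simple transformations while moving the data
• You want to move data, for example from RDS to DynamoDB
Amazon Redshift
• You want to analyze structured and semi-structured data
• You want to run complex SQL queries on large-scale (petabyte-class) data
• You want to analyze large data sets in bulk, without continuous writes or updates
• You want to use Redshift Spectrum to run SQL queries on data in S3
• You want to save query results to S3 and use them from other AWS services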
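Amazon Athena
• Your data is in S3 and you want to run simple ad hoc queries
• You want to query files in formats such as CSV, JSON, ORC, and Parquet
• You want to run queries serverlessly
• No ETL required
• You want to output query results as CSV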
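An ad hoc query of this kind might look like the sketch below. It assumes a boto3 Athena client (`boto3.client("athena")`) and uses its `start_query_execution` and `get_query_execution` calls; the database name, table, S3 output location, and the `FakeAthena` stand-in are hypothetical and are there only so the example runs without AWS access.

```python
import time


def run_athena_query(athena_client, sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query and wait until it reaches a terminal state.

    Athena writes the result file to output_location in S3.
    """
    started = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)


class FakeAthena:
    """Hypothetical stand-in that walks through the usual query states."""

    def __init__(self):
        self.states = ["QUEUED", "RUNNING", "SUCCEEDED"]
        self.started = None

    def start_query_execution(self, **kwargs):
        self.started = kwargs
        return {"QueryExecutionId": "query-1"}

    def get_query_execution(self, QueryExecutionId):
        return {"QueryExecution": {"Status": {"State": self.states.pop(0)}}}


fake = FakeAthena()
query_id, state = run_athena_query(
    fake,
    "SELECT status, COUNT(*) FROM access_logs GROUP BY status",
    database="weblogs",
    output_location="s3://example-bucket/athena-results/",
    poll_seconds=0,
)
```

Polling is used here for simplicity; a production setup would likely add a timeout and error handling for the FAILED state.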
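AWS Lake Formation
• You want to build a data lake easily
• You want to keep raw data in one place, regardless of volume, for future analysis
• You want to keep the raw data even after it has been processed
• Different departments in the organization each want to use the data for their own analysis
Amazon EMR
• You want to process data with OSS that you can customize flexibly
• You want to run ETL (extract/transform/load) on large data sets
• You want to do ML with Apache Spark MLlib, TensorFlow, or Apache MXNet
• You want to analyze clickstream data in S3 with Apache Spark or Apache Hive
• You want real-time streaming with Apache Flink and Apache Spark Streaming
AWS Glue
• You want serverless ETL (extract/transform/load) at medium scale
• You want to run ETL on data in Redshift, S3, RDS, DynamoDB, and so on
• You want to crawl data sources periodically, update the Data Catalog, and transform the data automatically
Amazon OpenSearch Service & Kibana
• You want to set up an OpenSearch cluster easily and analyze application log data
• You want to make applications, websites, and data lake catalogs searchable
• You want to collect infrastructure logs and metrics and visualize them in real time
• You want to visualize stream data in real time
Amazon QuickSight
• You want a serverless BI tool
• You want to visualize data from various data sources (S3, RDS, Athena, Redshift, OpenSearch, CSV, JSON, and so on)
• You want periodic reports, such as charts, rather than real-time views
• You want to analyze data using a variety of charts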
QuickSight visualization examples: https://aws.amazon.com/jp/quicksight/gallery/
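Points to consider when selecting services
Summary
The purpose of the analysis affects the granularity of collection, analysis, and visualization, so be clear about what the analysis is for
Thank you very much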