hedgehog051
October 06, 2021
AWSにおけるデータ分析入門 / Introduction To Data Analytics In AWS
Transcript
Introduction to Data Analytics on AWS
Relic Inc., Kumada
Self-introduction
• Kumada (Kan Kumada)
• [...] Infrastructure engineer
• Joined Relic Inc. in [...]
I want to do data analysis...
I want to bring intelligence to the business...
Momentum for data analysis keeps growing
But before that
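What does data analysis actually involve?
• Collect data such as results and performance records
  (e.g., how much was sold, how much traffic there was)
• Derive some kind of insight from the collected data
  (e.g., what is selling, when, and to whom)
• Take action based on the insights gained
  (e.g., adjust prices, adjust what is displayed, widen the target audience)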
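The collect, insight, action loop on this slide can be sketched in a few lines of Python. The sales records, product names, and the "feature the top seller" rule are invented for illustration and do not come from the talk.

```python
from collections import Counter

# Step 1: collected data (hypothetical sales records)
sales = [
    {"product": "mug", "segment": "student", "amount": 800},
    {"product": "mug", "segment": "office", "amount": 800},
    {"product": "t-shirt", "segment": "student", "amount": 2500},
    {"product": "mug", "segment": "student", "amount": 800},
]

# Step 2: insight, i.e. what is selling and to whom
units_by_product = Counter(row["product"] for row in sales)
units_by_segment = Counter(row["segment"] for row in sales)
top_product, top_units = units_by_product.most_common(1)[0]
top_segment, _ = units_by_segment.most_common(1)[0]

# Step 3: action, here just a placeholder decision derived from the insight
action = f"feature '{top_product}' more prominently for the '{top_segment}' segment"
print(top_product, top_units, "->", action)
```

Which aggregation to run in step 2 depends entirely on the action you hope to take in step 3, which is the point the following slide makes.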
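The key is to be clear about what you want to achieve by analyzing data
Data analytics services on AWS
I see. I have no idea.
When building a data analytics platform
• Collect data such as results and performance records
  → How to collect it, and where to gather it
• Derive some kind of insight from the collected data
  → Process it into an analysis-friendly form, analyze it, visualize it
A rough classification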
Collection: Amazon Kinesis, Amazon Kinesis Video Streams, Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, Amazon Managed Streaming for Apache Kafka, AWS Data Pipeline, AWS Data Exchange
Storage: Amazon Redshift, AWS Lake Formation, Amazon S3
Processing: Amazon EMR, AWS Glue, AWS Glue Elastic Views, AWS Glue DataBrew, Amazon Kinesis Data Analytics
Analysis: Amazon EMR, Amazon Athena, Amazon Kinesis Data Analytics, Amazon Redshift, Amazon QuickSight, Amazon OpenSearch Service
Visualization: Amazon Elasticsearch Service, Amazon QuickSight, Amazon OpenSearch Service
There is a lot to choose from
Still, I feel like the direction is becoming a little clearer
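The classification above is essentially a lookup table, and writing it as one makes it easy to see which services span more than one stage. This is only a sketch: the stage keys and the `stages_for` helper are names chosen here, and the service lists simply restate the slide.

```python
# Stage -> services, as listed in the rough classification
SERVICES_BY_STAGE = {
    "collection": [
        "Amazon Kinesis Video Streams",
        "Amazon Kinesis Data Streams",
        "Amazon Kinesis Data Firehose",
        "Amazon Managed Streaming for Apache Kafka",
        "AWS Data Pipeline",
        "AWS Data Exchange",
    ],
    "storage": ["Amazon Redshift", "AWS Lake Formation", "Amazon S3"],
    "processing": [
        "Amazon EMR",
        "AWS Glue",
        "AWS Glue Elastic Views",
        "AWS Glue DataBrew",
        "Amazon Kinesis Data Analytics",
    ],
    "analysis": [
        "Amazon EMR",
        "Amazon Athena",
        "Amazon Kinesis Data Analytics",
        "Amazon Redshift",
        "Amazon QuickSight",
        "Amazon OpenSearch Service",
    ],
    "visualization": ["Amazon QuickSight", "Amazon OpenSearch Service"],
}


def stages_for(service):
    """Return every stage in which the given service appears."""
    return [stage for stage, names in SERVICES_BY_STAGE.items() if service in names]


print(stages_for("Amazon EMR"))
print(stages_for("Amazon Redshift"))
```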
A quick look at each one
Collection
Real-time streaming
・Amazon Kinesis: umbrella name for the Kinesis services
・Amazon Kinesis Video Streams: capture, process, and store streaming video
・Amazon Kinesis Data Streams: capture, process, and store streaming data
・Amazon Kinesis Data Firehose: load streaming data into AWS data stores
・Amazon Managed Streaming for Apache Kafka: managed Apache Kafka for sending and receiving streaming data
Other
・AWS Data Pipeline: move and transform data on a schedule
・AWS Data Exchange: subscribe to third-party data, such as article data provided by Reuters
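Storage
・Amazon Redshift: data warehouse, a repository that gathers and manages huge volumes of "structured data" from systems
・AWS Lake Formation: builds a data lake, which keeps raw data whose use has not yet been decided
・Amazon S3: object storage that holds "structured data", "unstructured data", and more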
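Processing and analysis
・Amazon EMR: big data framework that combines related OSS to run ETL, stream processing, and analysis on large volumes of data
・AWS Glue: serverless ETL (extract/transform/load)
・AWS Glue Elastic Views (preview): builds materialized views by accessing multiple data stores and combining and copying their data
・AWS Glue DataBrew: no-code data cleanup and normalization
・Amazon Athena: runs ad hoc queries against S3
・Amazon Kinesis Data Analytics: transforms and analyzes streaming data
・Amazon Redshift: data warehouse that runs complex SQL queries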
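Visualization
・Amazon QuickSight: serverless BI tool and visualization
・Amazon OpenSearch Service & Kibana: real-time data search and visualization
When to use what (the services that look most important)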
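Amazon Kinesis Video Streams
・You have devices or applications that generate video data
・You want to stream live or recorded video to browsers and smartphones over HLS
・You want real-time two-way media streaming or web browser streaming
・You want to feed video data to Rekognition Video (video recognition) or SageMaker (ML)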
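Amazon Kinesis Data Streams
・You want to collect logs and event data generated by servers and devices quickly and in real time
・You want to collect data with a latency of one second or less
・You want to process streaming data with Lambda
・You want to forward streaming data to EC2
・You want to forward streaming data to Kinesis Data Analytics for real-time analysis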
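As a rough illustration of the first use case, the sketch below builds the arguments for the Kinesis Data Streams `PutRecord` call, which boto3 exposes as `put_record(StreamName=..., Data=..., PartitionKey=...)`. The stream name `app-events`, the event fields, and the choice of `device_id` as partition key are assumptions made here, not details from the talk.

```python
import json


def build_put_record_args(stream_name, event, key_field="device_id"):
    """Build keyword arguments for the Kinesis Data Streams PutRecord call."""
    return {
        "StreamName": stream_name,
        # Kinesis stores the payload as opaque bytes
        "Data": json.dumps(event).encode("utf-8"),
        # Records sharing a partition key go to the same shard, in order
        "PartitionKey": str(event[key_field]),
    }


def send_event(kinesis_client, stream_name, event):
    """Send one event; kinesis_client is expected to be boto3.client("kinesis")."""
    return kinesis_client.put_record(**build_put_record_args(stream_name, event))


# Hypothetical event from a device
args = build_put_record_args("app-events", {"device_id": 42, "status": "ok"})
print(args["PartitionKey"], args["Data"])
```

boto3 is deliberately not imported so the snippet runs without AWS credentials; in practice you would pass `boto3.client("kinesis")` to `send_event`.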
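Amazon Kinesis Data Firehose
・You want to deliver streaming data to S3, Redshift, or OpenSearch Service
・You want to deliver data to those data stores in near real time (within 60 seconds)
・You want to deliver data to service providers such as Datadog, New Relic, and MongoDB
・You want to convert data to Apache Parquet or Apache ORC before delivering it to the data store
・You want to deliver to data stores without developing an application or managing infrastructure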
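On the producer side, delivery to Firehose usually means calling `PutRecordBatch`, which accepts at most 500 records per call. The sketch below splits events into batches of that size. The delivery stream name and event shape are hypothetical, and newline-delimited JSON is just one common encoding, not something the talk prescribes.

```python
import json

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_firehose_batches(events, batch_size=MAX_BATCH):
    """Encode events as newline-delimited JSON and split them into batches."""
    records = [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def deliver(firehose_client, stream_name, events):
    """Send all events; firehose_client is expected to be boto3.client("firehose")."""
    for batch in to_firehose_batches(events):
        firehose_client.put_record_batch(DeliveryStreamName=stream_name, Records=batch)


# 1200 hypothetical events become three batches
batches = to_firehose_batches([{"n": i} for i in range(1200)])
print([len(b) for b in batches])
```

A production sender would also inspect `FailedPutCount` in each response and retry the failed records; that is omitted here for brevity.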
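Amazon Kinesis Data Analytics
・You want to query streaming data in real time with standard SQL
・You want to analyze streaming data in real time with sub-second latency
・You want to use Apache Flink to integrate with various AWS services for streaming ETL
・You want to build analysis applications in SQL, Java, Scala, or Python
AWS Data Pipeline
・Your workload is not real time
・You want to move data periodically between AWS storage and compute services and on-premises systems
・You want to apply simple transformations while moving data
・You want to move data from, for example, RDS to DynamoDB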
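Amazon Redshift
・You want to analyze structured and semi-structured data
・You want to run complex SQL queries against large-scale (petabyte) data
・You have no continuous writes or updates and want to analyze large-scale data in bulk
・You want to use Redshift Spectrum to run SQL queries against data in S3
・You want to save query results to S3 for use by other AWS services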
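Amazon Athena
・Your data is in S3 and you want to run simple ad hoc queries
・You want to query files in formats such as CSV, JSON, ORC, and Parquet
・You want to run queries serverlessly
・You do not need ETL
・You want to output query results as CSV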
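A minimal sketch of an ad hoc Athena query, assuming the boto3 `start_query_execution` call: Athena writes the result of a SELECT as a CSV file under the given S3 output location. The database `weblogs`, the table `access_logs`, and the bucket path are placeholders invented for this example.

```python
def build_athena_request(sql, database, output_location):
    """Build keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # SELECT results are written as CSV under this S3 prefix
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(athena_client, sql, database, output_location):
    """Start a query; athena_client is expected to be boto3.client("athena")."""
    response = athena_client.start_query_execution(
        **build_athena_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]


# Hypothetical table and bucket names
SQL = "SELECT path, COUNT(*) AS hits FROM access_logs GROUP BY path ORDER BY hits DESC LIMIT 10"
request = build_athena_request(SQL, "weblogs", "s3://example-bucket/athena-results/")
print(request["QueryExecutionContext"], request["ResultConfiguration"])
```

The call is asynchronous: the returned ID is what you would poll with `get_query_execution` before reading the CSV from S3.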
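AWS Lake Formation
・You want to build a data lake easily
・You want to keep raw data in one place, whatever its scale, for future analysis
・You want to retain the raw data even after processing it
・Various departments in your organization each want to use the data for their own analysis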
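Amazon EMR
・You want to process data with OSS that you can customize flexibly
・You want to run ETL (extract/transform/load) on large datasets
・You want to do ML with Apache Spark MLlib, TensorFlow, or Apache MXNet
・You want to analyze clickstream data in S3 with Apache Spark or Apache Hive
・You want real-time streaming with Apache Flink and Apache Spark Streaming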
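AWS Glue
・You want serverless ETL (extract/transform/load) at medium scale
・You want to run ETL on data in Redshift, S3, RDS, DynamoDB, and so on
・You want to crawl data sources periodically, update the Data Catalog, and transform data automatically
Amazon OpenSearch Service & Kibana
・You want to set up an OpenSearch cluster easily and analyze application log data
・You want to make applications, websites, and data lake catalogs searchable
・You want to collect infrastructure logs and metrics and visualize them in real time
・You want to visualize streaming data in real time
Amazon QuickSight
・You want a serverless BI tool
・You want to visualize data from a variety of sources
  (S3, RDS, Athena, Redshift, OpenSearch, CSV, JSON, and so on)
・You want periodic reports of graphs and data rather than real-time views
・You want to analyze data using a variety of charts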
QuickSight visualization examples
https://aws.amazon.com/jp/quicksight/gallery/
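Points to consider when choosing services
Summary
Summary: your goal affects the granularity of collection, analysis, and visualization, so be clear about what the analysis is for
Thank you very much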