Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Streaming Ingestion & Processing at Flipkart
Search
Siddhartha Reddy
May 15, 2015
Technology
0
390
Streaming Ingestion & Processing at Flipkart
Presented at the Bangalore Hadoop Meetup held on 15th May 2015.
Siddhartha Reddy
May 15, 2015
Tweet
Share
More Decks by Siddhartha Reddy
See All by Siddhartha Reddy
Future Patterns in Data Ecosystem
sids
1
190
CAP Theorem: You don’t need CP, you don’t want AP, and you can’t have CA
sids
6
11k
Other Decks in Technology
See All in Technology
会社にデータエンジニアがいることでできるようになること
10xinc
9
1.6k
マイクロモビリティシェアサービスを支える プラットフォームアーキテクチャ
grimoh
1
240
Product Management Conference -AI時代に進化するPdM-
kojima111
0
220
mruby(PicoRuby)で ファミコン音楽を奏でる
kishima
1
240
コスト削減の基本の「キ」~ コスト消費3大リソースへの対策 ~
smt7174
2
130
制約理論(ToC)入門
recruitengineers
PRO
3
310
RAID6 を楔形文字で組んで現代人を怖がらせましょう(実装編)
mimifuwa
1
310
自社製CMSからmicroCMSへのリプレースがプロダクトグロースを加速させた話
nextbeatdev
0
140
開発と脆弱性と脆弱性診断についての話
su3158
1
1.1k
VPC Latticeのサービスエンドポイント機能を使用した複数VPCアクセス
duelist2020jp
0
240
JOAI発表資料 @ 関東kaggler会
joai_committee
1
330
ソフトウェア エンジニアとしての 姿勢と心構え
recruitengineers
PRO
4
760
Featured
See All Featured
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
161
15k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
31
2.2k
Documentation Writing (for coders)
carmenintech
73
5k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
36
2.5k
The Cult of Friendly URLs
andyhume
79
6.5k
Code Review Best Practice
trishagee
70
19k
Music & Morning Musume
bryan
46
6.7k
The Power of CSS Pseudo Elements
geoffreycrofte
77
5.9k
RailsConf 2023
tenderlove
30
1.2k
YesSQL, Process and Tooling at Scale
rocio
173
14k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
18
1.1k
Fantastic passwords and where to find them - at NoRuKo
philnash
51
3.4k
Transcript
Streaming Ingestion & Processing at Flipkart Siddhartha Reddy @sids
Flipkart Data Platform (an oversimplified view)
Streaming Ingestion
Choices • push, not pull • schemas & validations
Streaming Ingestion v1.0
None
• Push 㱺 accountability (with source teams) • good call!
• Schemas 㱺 contracts for consumers • can make assumptions that are assured to be true • Insufficient tooling 㱺 too many “ingestion frameworks” • adopt some frameworks & offer as tools! • Synchronous error handling 㱺 complexity • accept all data
Streaming Ingestion v2.0
Stream Processing
An Example
Streaming Joins: Example It works! But… how do we deal
with lookup failures?
Streaming Joins: Handling Failures
None
None
Streaming Joins: Bootstrapping With a little help from MR friends
Streaming Joins: But… The example that doesn’t really work correctly
Streaming Joins
In summary • Streaming Ingestion: push, schemas & validation, HTTP
service, local daemon, change data capture • Streaming Joins: indexing, lookup tables, map-joins, retry queue, batch re-driver sid@flipkart.com