Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Data streams processing with PHP and STORM
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Mariusz Gil
April 20, 2013
Programming
750
5
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
Data streams processing with PHP and STORM
Mariusz Gil
April 20, 2013
More Decks by Mariusz Gil
See All by Mariusz Gil
Aspect Oriented Programming
mariuszgil
1
340
Designing and implementing GraphQL API
mariuszgil
1
110
Discovering unknown with EventStorming ConFoo
mariuszgil
0
320
Game of Developer Life... Deconstructed
mariuszgil
1
200
Back to forgotten roots
mariuszgil
1
430
Go micro with microservices
mariuszgil
5
710
Machine Learning for the rescue
mariuszgil
0
450
Discovering graph structures
mariuszgil
3
560
Introduction to Aerospike with PHP
mariuszgil
8
870
Other Decks in Programming
See All in Programming
運用エージェントは "作る" から "育てる" へ - 記憶と自己進化の3層設計パターン / self-evolving-agents-three-layer-agent-design
gawa
12
3.6k
Spec Driven Development | AI Summit Lisbon
danielsogl
PRO
0
170
肥大化するレガシーコードに立ち向かうためのインターフェース分離と依存の逆転 / JJUG CCC 2026 Spring
hirokunimaeta
0
530
AI 時代のソフトウェア設計の学び方
masuda220
PRO
29
12k
コンテキストの使い捨てをやめる — ビジネスルール駆動開発と miko —
ioki
0
180
生成AI時代にこそ効くGo | Why Go Works in the Age of Generative AI
mom0tomo
8
3.2k
IBM Bobを活用したレガシーアプリの最新化
oniak3ibm
PRO
1
190
Lessons from Spec-Driven Development
simas
PRO
0
150
Swiftのレキシカルスコープ管理
kntkymt
0
220
ユニットテストの先へ:テスト技法で要求・仕様を整理するJava開発実践 / Beyond_Unit_Testing_Practical_Java_Development_Techniques_for_Organizing_Requirements_and_Specifications
shimashima35
0
380
正しくソフトウェアを作る、前提を疑うための認知の視点 / doubt-premise
minodriven
20
6.4k
3Dシーンの圧縮
fadis
1
690
Featured
See All Featured
YesSQL, Process and Tooling at Scale
rocio
174
15k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.7k
Google's AI Overviews - The New Search
badams
0
1k
Design of three-dimensional binary manipulators for pick-and-place task avoiding obstacles (IECON2024)
konakalab
0
450
What Being in a Rock Band Can Teach Us About Real World SEO
427marketing
0
250
Ruling the World: When Life Gets Gamed
codingconduct
0
250
Git: the NoSQL Database
bkeepers
PRO
432
67k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
4.1k
What the history of the web can teach us about the future of AI
inesmontani
PRO
1
610
How to build a perfect <img>
jonoalderson
1
5.6k
Data-driven link building: lessons from a $708K investment (BrightonSEO talk)
szymonslowik
1
1.1k
Collaborative Software Design: How to facilitate domain modelling decisions
baasie
1
250
Transcript
PROCESSING t he php way of... STORM DAta STREAMS Mariusz
Gil
about me
#php #scalability #nosql #performance #hadoop #hive #pig #bigdata #mahout #datamining
#storm https://music.twitter.com/_login/background.jpg
batch #1 batch #2 batch #3 t he P r
obl em
t he S t or y
STORM DISTRIBUTED REALTIME COMPUTATION SYSTEM
scalable no data lost fault tolerant extremely robust language agnostic
efficient messaging local or distributed
terms and architecture
Spouts Bolts Stream Topologies (val1, val2) (val3, val4) (val5, val6)
unbounded sequence of tuples tuple tuple tuple tuple tuple tuple tuple
Spouts Bolts Stream Topologies (val1, val2) (val3, val4) (val5, val6)
source of streams tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple
Spouts Bolts Stream Topologies (val1, val2) (val3, val4) (val5, val6)
process input streams and produce new streams tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple tuple
Spouts Bolts Stream Topologies (val1, val2) (val3, val4) (val5, val6)
network of spouts and bolts TextSpout SplitSentenceBolt WordCountBolt [sentence] [word] [word, count]
None
storm-kestrel storm-kafka storm-amqp-spout storm-jms storm-pubsub storm-beanstalkd mapr-spout
shuffle grouping fields grouping all grouping global grouping direct grouping
local or shuffle grouping
ZooKeepers Supervisors Nimbus
fast CLUSTER STATE IS STORED LOCALLY OR IN ZOOKEEPERS fail
code examples
https://github.com/nathanmarz/storm
https://github.com/maltoe/storm-install
https://github.com/nathanmarz/storm-starter/
https://github.com/lazyshot/storm-php
public class DoubleAndTripleBolt extends BaseRichBolt { private OutputCollectorBase _collector; @Override
public void prepare(Map conf, TopologyContext context, OutputCollectorBase collector) { _collector = collector; } @Override public void execute(Tuple input) { int val = input.getInteger(0); _collector.emit(input, new Values(val*2, val*3)); _collector.ack(input); } @Override public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields("double", "triple")); } } Java example / bolt
public static class ExclamationBolt implements IRichBolt { OutputCollector _collector; public
void prepare(Map conf, TopologyContext context, OutputCollector collector) { _collector = collector; } public void execute(Tuple tuple) { _collector.emit(tuple, new Values(tuple.getString(0) + "!!!")); _collector.ack(tuple); } public void cleanup() { } public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields("word")); } public Map getComponentConfiguration() { return null; } } Java example / bolt
TopologyBuilder builder = new TopologyBuilder(); builder.setSpout("words", new TestWordSpout(), 10); builder.setBolt("exclaim1",
new ExclamationBolt(), 3) .shuffleGrouping("words"); builder.setBolt("exclaim2", new ExclamationBolt(), 2) .shuffleGrouping("exclaim1"); Java example / topology ... words exclaim1 exclaim2
zkServer.sh start bin/storm nimbus bin/storm supervisor bin/storm ui #optional storm
jar all-my-code.jar backtype.storm.MyTopology arg1 arg2 Java example / run
PHP example / spout PHP example / spout require_once('storm.php'); class
RandomSentenceSpout extends ShellSpout { ! protected $sentences = array( ! ! "the cow jumped over the moon", ! ! "an apple a day keeps the doctor away", ! ! "four score and seven years ago", ! ! "snow white and the seven dwarfs", ! ); ! protected function nextTuple() ! { ! ! sleep(.1); ! ! $sentence = $this->sentences[ rand(0, count($this->sentences) -1)];! ! ! $this->emit(array($sentence)); ! } ! protected function ack($tuple_id) ! { ! ! return; ! } ! protected function fail($tuple_id) ! { ! ! return; ! }! } $SentenceSpout = new RandomSentenceSpout(); $SentenceSpout->run();
PHP example / bolt require_once('storm.php'); class SplitSentenceBolt extends BasicBolt {
! public function process(Tuple $tuple) ! { ! ! $words = explode(" ", $tuple->values[0]); ! ! foreach($words as $word) ! ! { ! ! ! $this->emit(array($word)); ! ! } ! } } $splitsentence = new SplitSentenceBolt(); $splitsentence->run();
/** * This topology demonstrates Storm's stream groupings and multilang
capabilities. */ public class WordCountPHPTopology { public static class SplitSentence extends ShellBolt implements IRichBolt { public SplitSentence() { super("php", "splitsentence.php"); } @Override public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields("word")); } @Override public Map<String, Object> getComponentConfiguration() { return null; } } // ... } MultiLang example / Topology, Bolt
{"command": "next"} {"command": "ack", "id": "1231231"} {"command": "fail", "id": "1231231"}
NonJVMSpout NonJVMBolt {"command": "sync"} { ! "command": "emit", ! "id": "1231231", ! "stream": "1", ! "task": 9, ! "tuple": ["field1", 2, 3] } { ! "id": "-6955786537413359385", ! "comp": "1", ! "stream": "1", ! "task": 9, ! "tuple": ["snow white and dwarfs", "field2", 3] } { ! "command": "emit", ! "anchors": ["1231231", "-234234234"], ! "stream": "1", ! "task": 9, ! "tuple": ["field1", 2, 3] } https://github.com/nathanmarz/storm/wiki/Multilang-protocol
demo
use cases
stream processing
continous query computation
RPC distributed arguments results [request-id, arguments] [request-id, results]
realtime analytics personalization search revenue optimization monitoring
content search realtime analytics generating feeds integrated with elastic search,
Hbase,hadoop and hdfs
realtime scoring moments generation integration with kafka queues and hdfs
storage
thanks! feel free to contact with me email:
[email protected]
twitter:
@mariuszgil