The Web graph • Road Network Real-world graphs are dynamic • Rapidly evolving • Streams of events/messages • Calls, interactions, mobile proximity • Relationships decay/become stale over time 6
0 5 10 15 20 25 30 time elapsed (days) 0.4 0.5 0.6 0.7 0.8 ratio of cuts Interaction graph from 1 month of mobile calls data (CDR) Inactive links expire in 1 week partition quality: ratio of edges cut between partitions 10
0 2 4 6 8 10 throughput (queries per hour) Maximum clique performance week Hash partitioning Adaptive partitioning 0 5 10 15 20 25 30 time elapsed (days) 0.4 0.5 0.6 0.7 0.8 ratio of cuts Quality of partitioning Dataset: 1 month of mobile calls 21 million unique nodes 7% Addition 4% Deletion each week Sliding window of one week Algorithm: Maximum clique computation 13
Average Tweets per sec 0 1 2 3 4 5 Superstep time (s) 0 2 4 6 8 10 12 14 16 18 20 22 24 Time (h) Tweets per second Hash superstep time Adaptive superstep time Dataset: one week of tweets published from London in 2012 Algorithm: TunkRank (User influence metric) 14
• Adaptive partitioning / repartitioning might be needed • >50% performance improvement on dynamic graphs • Partitioning overhead should be considered • Smarter partition strategies might not be practical • Migrations /repartitions might not be worth it • BSP aids system optimisations • Message aggregation (migration & computation) Lessons learnt 15