
MoRe Than Monitoring

Monitorama Keynote, Boston, March 28 2013

Dr. Neil Gunther

March 28, 2013

Transcript

  1. MoRe Than Monitoring (#monitoring++)

    Neil Gunther, Performance Dynamics (SM). Monitorama Keynote, Boston, March 28 2013. © 2013 Performance Dynamics · MoRe Than Monitoring · March 30, 2013 · 1 / 47
  2. Let’s Get Calibrated about Data: Outline

    1. Let’s Get Calibrated about Data; 2. Potted History of Monitoring; 3. Performance Visualization Basics; 4. Monitored Data are Time Series; 5. Performance Visualization in R; 6. Possible Hacks
  3. Let’s Get Calibrated about Data: Guerrilla Mantra: All data is wrong by definition

    Measurement is a process, not math. All data contains measurement errors. How big are they and can you tolerate them? Treating data as divine is a sin.
  5. Let’s Get Calibrated about Data: Guerrilla Mantra: VAMOOS your data doubts

    Visualize, Analyze, Modelize, Over and Over until Satisfied
  7. Let’s Get Calibrated about Data: Guerrilla Mantra: There are only 3 performance metrics

    1. Time, e.g., cpu_ticks
    2. Rate (inverse time), e.g., httpGets/s
    3. Number or count, e.g., RSS
  9. Let’s Get Calibrated about Data: Watch Out for Patterns

    I mean that in a bad way. Your brain can’t help itself.
  10. Potted History of Monitoring: Outline

    1. Let’s Get Calibrated about Data; 2. Potted History of Monitoring; 3. Performance Visualization Basics; 4. Monitored Data are Time Series; 5. Performance Visualization in R; 6. Possible Hacks
  11. Potted History of Monitoring: Old Adage: “Nothing New in Computer Science”

    Mainframes didn’t need real-time monitoring. Batch processing.
  12. Potted History of Monitoring: How You Programmed It
  13. Potted History of Monitoring: Later ... the interface improved
  14. Potted History of Monitoring: CTSS (Compatible Time-Sharing System)

    Developed in 1961 at MIT on IBM 7094. Compatible meant compatibility with the standard IBM batch processing O/S.
  15. Potted History of Monitoring: Multics Instrumentation c.1965

    Multics was a multiuser O/S following CTSS time-share.

    The Implementation: “a rough measure of response time for a time-sharing console user, an exponential average of the number of users in the highest priority scheduling queue is continuously maintained. An integrator, L, initially zero, is updated periodically by the formula L ← L × m + N_q where N_q is the measured length of the scheduling queue at the instant of update, and m is an exponential damping constant”

    This equation is an iterative form of an exponentially damped moving average. In modern terminology, it’s a data smoother.

    The Lesson: “experience with Multics, and earlier with CTSS, shows that building permanent instrumentation into key supervisor modules is well worth the effort, since the cost of maintaining well-organized instrumentation is low, and the payoff is very high.”
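
The quoted update rule can be sketched in a few lines of Python. This is only an illustration of the formula L ← L × m + N_q, not the Multics code: the function name, the damping constant m = 0.5, and the constant queue length of 2 are all hypothetical. Scaling by (1 − m) to read L as an average is likewise an assumption of this sketch.

```python
def multics_integrator(queue_lengths, m):
    """Apply L <- L*m + Nq once per sample, starting from L = 0."""
    level = 0.0
    history = []
    for nq in queue_lengths:
        level = level * m + nq  # exponentially damped running sum
        history.append(level)
    return history


m = 0.5              # hypothetical damping constant, 0 < m < 1
samples = [2] * 50   # hypothetical: queue length stuck at 2

history = multics_integrator(samples, m)

# With a constant input the integrator settles at Nq / (1 - m),
# so multiplying by (1 - m) recovers the queue length itself.
smoothed = [(1.0 - m) * level for level in history]
```

With a steady queue the raw integrator climbs 2, 3, 3.5, ... toward 4, and the rescaled series climbs toward the true queue length of 2, which is the smoothing behavior the slide describes.
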
  16. Potted History of Monitoring: You know this better as ... Linux load average

        58 extern unsigned long avenrun[ ]; /* Load averages */
        59
        60 #define FSHIFT 11 /* nr of bits of precision */
        61 #define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
        62 #define LOAD_FREQ (5*HZ) /* 5 sec intervals */
        63 #define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-pt */
        64 #define EXP_5 2014 /* 1/exp(5sec/5min) */
        65 #define EXP_15 2037 /* 1/exp(5sec/15min) */
        66
        67 #define CALC_LOAD(load,exp,n) \
        68 load *= exp; \
        69 load += n*(FIXED_1-exp); \
        70 load >>= FSHIFT;

    Lines 67–70 are identical to the 1965 Multics formula. See Chap. 4 of my Perl::PDQ book for the details.

    UNIX load average
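
To get a feel for what lines 67–70 do, here is a rough Python port of the macro using the constants shown on the slide. The `simulate` helper and the choice to pre-scale the task count by FIXED_1 are assumptions of this sketch, not something the slide states.

```python
FSHIFT = 11             # bits of fixed-point precision (from the slide)
FIXED_1 = 1 << FSHIFT   # 1.0 in fixed point, i.e. 2048
EXP_1 = 1884            # 1/exp(5sec/1min) in fixed point (from the slide)


def calc_load(load, exp, n):
    """Integer port of CALC_LOAD: damp the old value, blend in the new sample."""
    load *= exp
    load += n * (FIXED_1 - exp)
    return load >> FSHIFT


def simulate(active_tasks, steps, exp=EXP_1):
    """Hypothetical helper: hold the task count fixed and update `steps` times."""
    load = 0
    n = active_tasks * FIXED_1  # assumption: sample is pre-scaled to fixed point
    for _ in range(steps):
        load = calc_load(load, exp, n)
    return load / FIXED_1       # back to a floating-point load average


one_minute = simulate(active_tasks=1, steps=12)    # 12 updates x 5 s = 1 min
long_run = simulate(active_tasks=1, steps=1000)
```

After one minute of a single busy task the 1-minute value is only about 63% of the way to 1.0 (roughly 1 − 1/e), and because of integer truncation it creeps up toward 1.0 without ever quite reaching it.
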
  22. Potted History of Monitoring: Unix at Bell Labs c.1970

    CTSS begat Multics begat Unics begat Unix. Get it?
  23. Potted History of Monitoring: Then Came Screens

    Note the mouse in her right hand.
  24. Potted History of Monitoring: Unix top: A Legacy App

    Green ASCII characters on black background
  25. Potted History of Monitoring: Desktop GUI c.1995

    Lots of colored spaghetti
  26. Potted History of Monitoring: Static Charts on the Web c.2000

    Load average over a 24 hr period with 1, 5, 15 min LAs as green, blue, red TS (which is completely redundant, BTW). As informative as watching a ticker chart on Wall Street.
  27. Potted History of Monitoring: Browser-based Dashboards

    Interminable strip charts are not good for your brain.
  28. Performance Visualization Basics: Outline

    1. Let’s Get Calibrated about Data; 2. Potted History of Monitoring; 3. Performance Visualization Basics; 4. Monitored Data are Time Series; 5. Performance Visualization in R; 6. Possible Hacks
  29. Performance Visualization Basics: The Central Challenge

    Find the best cognitive impedance match between the digital computer and the neural computer.
  30. Performance Visualization Basics: Cognitive Circuitry is Largely Unknown

    PerfViz is an N-dimensional problem.
    Brain is trapped in (3 + 1) dimensions.
    No 5-fold rotational symmetry.
    Physicists have all the fun with SciViz.
    Time dimension becomes animation sequence.
  31. Performance Visualization Basics: Your Brain is Easily Fooled

    All cognition is computation.
    Your brain is a differential analyzer.
    Difference errors produce perceptual illusions.
  32. Monitored Data are Time Series: Outline

    1. Let’s Get Calibrated about Data; 2. Potted History of Monitoring; 3. Performance Visualization Basics; 4. Monitored Data are Time Series; 5. Performance Visualization in R; 6. Possible Hacks
  33. Monitored Data are Time Series: Gothic graphs can hurt your brain

    (Bad Z value)
  34. Monitored Data are Time Series: There’s a Whole Science of Color
  35. Monitored Data are Time Series: Pastel Colors on White

    [Figure: Sandy Bridge 16 VPU Throughput. LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, test4.AllOff]
  36. Monitored Data are Time Series: Pastel Colors on Black

    [Figure: Sandy Bridge 16 VPU Throughput. LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, test4.AllOff]
  37. Monitored Data are Time Series: Pastel Colors on Neutral Gray

    [Figure: Sandy Bridge 16 VPU Throughput. LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, test4.AllOff]
  38. Monitored Data are Time Series: Coordinated Colors on Neutral Gray

    [Figure: Sandy Bridge 16 VPU Throughput. LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, test4.AllOff]
  39. Monitored Data are Time Series: Time Series Can Reveal Data Correlations

    [Figure: four stacked time series for server.p.65, 2012-05-03 to 2012-05-04. Panels: CPU%, Mem%, ioWait%, LdAvg-1 vs. Time]
  40. Monitored Data are Time Series: But Data Doesn’t Tell All: Monitored Server Consumption

    [Figure: Monitored Server Consumption. Capacity (U%) vs. Time (m:s). Legend: Server saturation, Uavg data, Umax data]
  41. Monitored Data are Time Series: Beyond Data: Effective Server Consumption

    [Figure: Lookahead Server Consumption. Capacity (U%) vs. Time (m:s). Legend: Effective max consumption, Server saturation, Uavg data, Umax data, Ueff predicted]
  42. Performance Visualization in R: Outline

    1. Let’s Get Calibrated about Data; 2. Potted History of Monitoring; 3. Performance Visualization Basics; 4. Monitored Data are Time Series; 5. Performance Visualization in R; 6. Possible Hacks
  43. Performance Visualization in R: Choose Your Cognitive Z in R

    [Figure: scatterplot matrix of mpg, disp, drat, wt, alongside a 3D Scatterplot with axes wt, disp, mpg]
  44. Performance Visualization in R: Enhanced Plots in R

    [Figure: four panels of Xp vs. p. Panel titles: Raw bench data; Data smoother; USL fit; USL fit + CI bands]
  45. Performance Visualization in R: Chernoff Faces in R

    Example (using R)

        library(TeachingDemos)
        faces2(matrix(runif(18*10), nrow=12), main='Random Faces')
  46. Performance Visualization in R: Kiviat and Radar Charts in R

    [Figure: Correlation Radar. Radial plot of labeled correlations (Alp12Mn, AvrROE, DivToP, GrowAPS, ...), radial axis marked -0.5, 0, 0.5, 1]

    Example (using R)

        require(plotrix)
        corelations <- c(1:97)
        corelation.names <- names(corelations) <- c("Alp12Mn", "AvrROE", "DivToP", "GrowAPS", "GrowAsst", "GrowBPS", "GrowCFPS", ...
        corelations <- c(0.223, 0.1884, -0.131, 0.1287, 0.0307, ...
        par(ps=6)
        radial.plot(corelations, labels=corelation.names, rp.type="p", main="Correlation Radar", radial.lim=c(-1,1), line.col="blue")
  47. Performance Visualization in R: Treemaps in R

    [Figure: two treemaps titled “GDAT: Top 100 Websites”, one labeled by site category and one by site name, color scale from -8e+09 to 8e+09]

    Example (using R)

        library(portfolio)
        bbc <- read.csv("nielsen100-2010.csv")
        map.market(id=seq(1:100), area=bbc$uniqueAudience, group=bbc$categoryBBC, color=bbc$totalVisits, main="GDAT: Top 100 Websites")

    There is another treemap pkg on CRAN.
  48. Performance Visualization in R: Heatmap of Multiple Servers in Time
  49. Performance Visualization in R: Barry in 2D

    [Figure: triangle plots with vertices p1, p2, p3, marking the points (p1=1/3, p2=1/3, p3=1/3) and (p1=0.6, p2=0.1, p3=0.3)]

    Barycentric coordinate system for %CPU = %user + %sys + %idle
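
One way to place a %user/%sys/%idle sample inside such a triangle is sketched below in Python. The vertex layout (p1 at the origin, p2 at (1, 0), p3 at the apex of a unit equilateral triangle) and the function name are assumptions made for illustration; the slide does not say how the figure was drawn.

```python
import math

# Assumed vertex positions of a unit equilateral triangle.
P1 = (0.0, 0.0)
P2 = (1.0, 0.0)
P3 = (0.5, math.sqrt(3.0) / 2.0)


def barycentric_to_xy(user, sys_, idle):
    """Map three non-negative components to a 2D point inside the triangle."""
    total = user + sys_ + idle
    if total <= 0:
        raise ValueError("components must sum to a positive value")
    # Normalize so the three weights sum to 1 (p1 + p2 + p3 = 1).
    p1, p2, p3 = user / total, sys_ / total, idle / total
    x = p1 * P1[0] + p2 * P2[0] + p3 * P3[0]
    y = p1 * P1[1] + p2 * P2[1] + p3 * P3[1]
    return x, y


center = barycentric_to_xy(1, 1, 1)      # p1 = p2 = p3 = 1/3
skewed = barycentric_to_xy(60, 10, 30)   # p1=0.6, p2=0.1, p3=0.3
```

Equal shares land on the centroid of the triangle, while a sample dominated by one component is pulled toward that component's vertex, so a whole day of CPU samples becomes a cloud of points whose position shows the user/sys/idle mix at a glance.
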
  50. Performance Visualization in R: Barry in 3D: Tukey-like Rotations

    Tukey trumps Tufte.

    Barycentric coordinate system for %BW = %unicast + %multicast + %broadcast + %idle
  51. Possible Hacks: Outline

    1. Let’s Get Calibrated about Data; 2. Potted History of Monitoring; 3. Performance Visualization Basics; 4. Monitored Data are Time Series; 5. Performance Visualization in R; 6. Possible Hacks
  52. Possible Hacks: Interactive and Streaming in R

    R derives from S at Bell Labs (home of Unix) c.1975, 1980, 1988.

    R scripting language console interface:

        > (x^(k-1)*exp(-x/s))/(gamma(k)*s^k)

    cf. Mathematica document paradigm: $\frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\,\theta^{k}}$

    No fonts, no symbolic computation.

    More recent focus is on enabling:
      Better IDE integration, e.g., RStudio
      Browser-based interaction, e.g., Shiny
      Streaming data acquisition, e.g., R plus Hadoop, but ...
        R interpreter is single-threaded
        Needs a full app stack between data and R engine
        Revolution Analytics is in this space

    Plenty of room for innovative development.
  53. Possible Hacks: Some Ideas for Tomorrow

    1. Lots of opportunities
    2. Coupling simple statistical analysis to monitored data
    3. Display the errors in monitored data
    4. Replace the black background in Graphite
    5. Apply ColorBrewer to Graphite
    6. Apply effective capacity consumption to your monitored data
    7. Replacing strip charts with animation

    WARNING: Common sense is the pitfall of all performance analysis.
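
Ideas 2 and 3 can start very small. The Python sketch below attaches a standard error to a window of monitored samples, so a dashboard could show "value ± error" rather than a bare number. The sample values and function name are hypothetical, and the standard error of the mean is just one simple choice of error measure, not one the slides prescribe.

```python
import math
import statistics


def mean_with_standard_error(samples):
    """Return (mean, standard error of the mean) for a window of samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate an error")
    mean = statistics.mean(samples)
    # Sample standard deviation divided by sqrt(n).
    sem = statistics.stdev(samples) / math.sqrt(n)
    return mean, sem


cpu_busy_pct = [10.0, 12.0, 11.0, 13.0, 14.0]  # hypothetical monitored window
mean, sem = mean_with_standard_error(cpu_busy_pct)
print(f"CPU busy: {mean:.1f} ± {sem:.1f} %")
```

Plotting the mean with a band of one or two standard errors around it is one way to show how much of a wiggle in a strip chart is within the measurement noise.
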
  54. Possible Hacks: Modelizing GitHub Growth

    Since I didn’t discuss the modeling part of VAMOOS ...

    Donnie Berkholz of redmonk.com wrote on his Jan 21, 2013 blog that GitHub will reach:
      4 million users near Aug 2013
      5 million users near Dec 2013

    That’s based on a log-linear model. I claim it’s a log-log model and therefore:
      4 million users around Oct 2013
      5 million users around Apr 2014
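
The difference between the two model families can be sketched with a plain least-squares fit in Python. The data below are synthetic (generated from a made-up power law), not GitHub's user counts, and the function names are hypothetical. The point is only to show why the two fits extrapolate to different dates.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def time_to_reach_log_linear(ts, ys, target):
    """Exponential growth: ln(y) is linear in t."""
    slope, intercept = fit_line(ts, [math.log(y) for y in ys])
    return (math.log(target) - intercept) / slope


def time_to_reach_log_log(ts, ys, target):
    """Power-law growth: ln(y) is linear in ln(t)."""
    slope, intercept = fit_line([math.log(t) for t in ts],
                                [math.log(y) for y in ys])
    return math.exp((math.log(target) - intercept) / slope)


# Synthetic data only: y = 2 * t^1.5 for t = 1..10 (arbitrary units).
months = list(range(1, 11))
users = [2.0 * t ** 1.5 for t in months]

power_slope, _ = fit_line([math.log(t) for t in months],
                          [math.log(y) for y in users])
t_exponential = time_to_reach_log_linear(months, users, target=1000.0)
t_power_law = time_to_reach_log_log(months, users, target=1000.0)
```

On this synthetic series the log-log fit recovers the exponent 1.5, and the exponential (log-linear) fit predicts the target is reached much sooner than the power law does. That is the same direction as the slide's claim: a log-log model pushes the milestone dates later.
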
  55. Possible Hacks: Performance Dynamics Company

    Castro Valley, California
    www.perfdynamics.com
    perfdynamics.blogspot.com
    twitter.com/DrQz
    Facebook
    [email protected]
    OFF: +1-510-537-5758