Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Oracle Meets Fractals and Learns the Power of P...

Dr. Neil Gunther
December 12, 2012
2

Oracle Meets Fractals and Learns the Power of Power Laws

Talk presented at Northern California Oracle User Group (NoCOUG) meeting in 2012.

Dr. Neil Gunther

December 12, 2012
Tweet

Transcript

  1. Oracle Meets Fractals and Learns the Power of Power Laws

    Dr. Neil Gunther Performance Dynamics NoCOUG Winter Conference 2012 Thursday, February 23 @ 1 pm SM c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 1 / 44
  2. Fractals Outline 1 Fractals What is a Fractal? How It

    Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 1 / 44
  3. Fractals What is a Fractal? Outline 1 Fractals What is

    a Fractal? How It Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 1 / 44
  4. Fractals What is a Fractal? Fractals in Space c 2012

    Performance Dynamics Oracle Meets Fractals February 21, 2012 2 / 44
  5. Fractals What is a Fractal? Fractals are Described by Power

    Laws Zipf’s law 5e+02 2e+03 5e+03 2e+04 5e+04 2e+05 Zipf's Law for UK English Words Ranked Words (W) Frequency (F) the it at would much us love lay eye dare Created by NJG Sat Oct 24 11:15:45 2009 F = W!D D = 1.1311 c = 981293.8 SQL accesses 1 2 5 10 20 50 100 100 200 300 400 500 Log-Log SQL Times [1:100] Index etimesA Internet packets DRAFT log10(d) log10(r/s) 0 1 2 3 4 5 0 1 2 3 4 Website visitors !"#$%&'()*')+,'-./'0+-/1+/-' ' $ !"$ "$ !"$ #$ !"$ $$ !"$ "$ !"$ #$ !"$ $$ %&'()*+,-+&%./&)+0.1.2,*1$ *3%4+,-+1.2)$ 567+0.1.2,*1+2,+1.2)1$ -.2+8.29+ !$ +:+!$ c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 3 / 44
  6. Fractals How It Works Outline 1 Fractals What is a

    Fractal? How It Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 3 / 44
  7. Fractals How It Works Calibrating a Circumference Approximating circumference by

    regular polygons with successively shorter sides. Polygon represents measurement device or ruler Double-log plot of the estimated circumference (y-axis) vs. the length of the polygon side (x-axis). As the sides get shorter the perimeter of the polygon approaches the actual circumference. 1.00 0.50 0.20 0.30 0.15 1.50 0.70 Ln S 6.1 6.15 6.2 6.25 Ln P Estimated perimeter vs. length of polygon sides Euclidean geometry of the circle Greeks knew (irrational) ratio of diameter D to circumference C: π = C/D Successive measurements converge to fixed value: C Speed of convergence is clearly seen on logarithmic axes. c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 4 / 44
  8. Fractals How It Works Crinkly Coastlines What about highly irregular

    shapes like coastline of Britain? Greeks would never consider such imperfect non-Euclidean geometry. Successively smaller ruler size (S) produces longer coastline estimate (L)! Why? Smaller ruler gets into more coastal nooks and crannies. c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 5 / 44
  9. Fractals How It Works Borders and Wars Plot border length

    (L) against ruler size (S) on log-log axes1. Why do border lengths fall on straight lines in log-log plot? Any crazy country shape is then characterized by a single number: its slope! Reason remained obscure until Mandelbrot resurrected it as geometry of fractals2 1 Lewis F. Richardson (1961) “The problem of contiguity: An appendix to Statistic of Deadly Quarrels.” 2 B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman, New York (1983) c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 6 / 44
  10. Fractals How It Works The Power of Power Laws Straight

    lines on log-log plot have form: Y = mX + c But Y ≡ ln(L) and X ≡ ln(S) with negative slope: ln(L) = −α ln(S) + ln(k) = ln(k S−α) Taking antilogs of both sides reveals general power law form: L = k S−α L = k Sα (1) Reverse this logic If your data looks “linear” on a log-log plot Assume it signals presence of a power law like (1) Find the slope to characterize it Exponent α is the “power” in power law c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 7 / 44
  11. Fractals How It Works The Shape of Power Laws Α

    1.0 Α 1.5 Α 2.0 Α 3.0 20 40 60 80 100 0.01 0.02 0.03 0.04 General shape of power law eqn.(1) is a hyperbola Blue curve is an exponential function Other curves are power laws with increasing α exponents (slopes) Big powers Large α power laws are indistinguishable from an exponential c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 8 / 44
  12. Fractals How It Works The Tale is in the Tail

    k CDF 0 10 20 30 40 50 x 0.1 0.2 0.3 0.4 0.5 PDF Power laws differ from standard statistical distributions Power laws carry most of the information in their tail Fatter tail corresponds to stronger correlations than “normal” Mass of tail measured by cumulative distribution function (CDF) Log-log fitting On a log-log plot we are trying to fit right-hand side data, not left side c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 9 / 44
  13. Fractals Internet Traffic Outline 1 Fractals What is a Fractal?

    How It Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 9 / 44
  14. Fractals Internet Traffic Internet Congestion Internet performance collapsed c.1986 Bellcore

    c.1990 impact of ISDN broadband Packet tracing measurements at Bellcore Surprise: large packet trains Surprise: Service times file-size dependent Surprise: Packet arrivals not always Poisson Surprise: Queueing models break down How to do CaP for future Internet? Why is it happening? Part of the answer is power laws Netflix uses 33% of USA Internet BW c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 10 / 44
  15. Fractals Internet Traffic Strangeness in the Interpipes DRAFT 0 100

    200 300 400 500 600 700 800 900 1000 0 20000 40000 60000 (a) Time Unit = 100 Seconds Packets/Time Unit 0 100 200 300 400 500 600 700 800 900 1000 0 2000 4000 6000 (b) Time Unit = 10 Seconds Packets/Time Unit 0 100 200 300 400 500 600 700 800 900 1000 0 200 400 600 800 (c) Time Unit = 1 Second Packets/Time Unit 0 100 200 300 400 500 600 700 800 900 1000 0 20 40 60 80 100 (d) Time Unit = 0.1 Second Packets/Time Unit Packets/Time Unit 5 10 15 Read bottom to top, left to right Variance persists over 5 decades of time c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 11 / 44
  16. Fractals Internet Traffic Traces on Log-Log Axes DRAFT log10(d) log10(r/s)

    0 1 2 3 4 5 0 1 2 3 4 Figure 4.2.1 (b). Pox plot of R/S for sequence AUG89.MB. The plot tightly clusters around a straight line whose asymptotic slope clearly lies between the slopes 0.5 (lower dotted line) and 1.0 (upper dotted line) and is readily estimated (using the "brushed" points) to be about 0.79. 39 Y = (max − min)/std dev (“rescaled range”) X = sample size (in trace file c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 12 / 44
  17. Fractals Internet Traffic Fractals in Time 10 5 5 10

    0.05 0.10 0.15 0.20 0.25 Bellcore packet data shows fractal behavior in time (5 decades) Diffusion model of Brownian motion Solution is a normal distribution which evolves in time (t) ∂t f(x, t) = σ2∂2 x f(x, t) f(x, t) = 1 √ 4πσ2t exp „ −(x − µ)2 4σ2t « L2 = 4σ2t Diffusion length L = √ 4σ2t L ∼ t 1 2 Generalization L ∼ t 1 2 → tH (Brownian → Levy) What happens if H = 1 2 becomes 1 2 < H < 1? c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 13 / 44
  18. Fractals Internet Traffic Router Occupancy 0.0 0.2 0.4 0.6 0.8

    1.0 0 2 4 6 8 10 Router utilization Buffer occupancy M/M/1 (H = 0.50) H = 0.75 H = 0.90 Q: queue length or buffer occupancy ρ = λS: router utilization H: power law exponent (Hurst parameter) Q = ρ 1 2(1−H) (1 − ρ) 1 1−H H = 0.5 is identical to M/M/1 queue H = 0.9 Internet empirical Hurst exponent Buffer overflow can occur at lower loads Router model x<-c(1:100) rho<-x/100 qlen<-function(r,H){rˆ(1/(2*(1-H))) / ((1-r)ˆ(H/(1-H)))} plot(rho,qlen(rho,0.5),type="l",xlab="Router utilization",ylab="Buffer occupancy",ylim=c(0,10)) lines(rho,qlen(rho,0.75),col="blue") lines(rho,qlen(rho,0.90),col="red") c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 14 / 44
  19. Applications Outline 1 Fractals What is a Fractal? How It

    Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 14 / 44
  20. Applications Word Fractals Outline 1 Fractals What is a Fractal?

    How It Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 14 / 44
  21. Applications Word Fractals Data Source — Corpus (Body of Words)

    Data (already ranked) is 1000 most common wordforms in UK English based on 29 works of literature by 18 authors (4.6 × 106 words) Wordform: english word Abs: absolute frequency (total number of occurrences) r: range (number of texts in which the word occurs) mod: modified frequency as defined by Rosengren (1972) Read data file > dir<-setwd("˜/../GDAT Scripts/Power Laws/") > td<-read.table("zipf1000.txt",header=TRUE) > head(td) Rank Wordform Abs r mod 1 1 the 225300 29 223066.9 2 2 and 157486 29 156214.4 3 3 to 134478 29 134044.8 4 4 of 126523 29 125510.2 5 5 a 100200 29 99871.2 6 6 I 91584 29 86645.5 c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 15 / 44
  22. Applications Word Fractals Linear Visualization 0 50000 100000 150000 200000

    Ranked 1000 UK English Words Ranked words (W) Frequency of occurrence (F) the their us love voice true state eye stand worth service neck land art Linear plot of data > plot(td$Rank, td$Abs, type="p", main="Ranked 1000 UK English Words", xlab="Ranked words (W)", ylab="Frequency of occurrence (F)", xaxt="n",log="xy",cex=0.5) > ticks.at<-seq(min(td$Rank), max(td$Rank),10) > ticks.lab<-as.character(td$Wordform[ticks.at]) > Axis(td$Rank, at=ticks.at, las=1, side=1,labels=ticks.lab, cex.axis=0.75) c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 16 / 44
  23. Applications Word Fractals Double-Log Visualization 5e+02 2e+03 5e+03 2e+04 5e+04

    2e+05 Ranked 1000 UK English Words Ranked words (W) Frequency of occurrence (F) the it at would much us love lay eye dare Data on log-log axes > plot(td$Rank, td$Abs, type="p", main="Ranked 1000 UK English Words", xlab="Ranked words (W)", ylab="Frequency of occurrence (F)", xaxt="n",log="xy",cex=0.5) > ticks.at<-seq(min(td$Rank), max(td$Rank),10) > ticks.lab<-as.character(td$Wordform[ticks.at]) > Axis(td$Rank, at=ticks.at, las=1, side=1,labels=ticks.lab, cex.axis=0.75) c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 17 / 44
  24. Applications Word Fractals Regression Fit 5e+02 2e+03 5e+03 2e+04 5e+04

    2e+05 Ranked 1000 UK English Words Ranked words (W) Frequency of occurrence (F) the it at would much us love lay eye dare Regression fit to logarithmic data # regression model of Y=log(y) and X=log(x) > z.fit <- lm(log(td$Abs) ˜ log(td$Rank)) # Must transform back to log scaled coords in plot > ly <- exp( (coef(z.fit)[2])*log(td$Rank) + coef(z.fit)[1] ) > lines(td$Rank, ly, col="blue",lty="solid",lwd=2) c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 18 / 44
  25. Applications Word Fractals Summary of Regression Statistics Regression summary >

    summary(z.fit) Call: lm(formula = log(td$Abs) ˜ log(td$Rank)) Residuals: Min 1Q Median 3Q Max -1.47144 -0.06902 -0.01477 0.05757 0.81894 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 13.796627 0.024210 569.9 <2e-16 *** log(td$Rank) -1.131084 0.004039 -280.0 <2e-16 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Residual standard error: 0.1258 on 998 degrees of freedom Multiple R-squared: 0.9874, Adjusted R-squared: 0.9874 F-statistic: 7.841e+04 on 1 and 998 DF, p-value: < 2.2e-16 c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 19 / 44
  26. Applications Fractal Query Times Outline 1 Fractals What is a

    Fractal? How It Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 19 / 44
  27. Applications Fractal Query Times Interpreting Time Histogram SQL Queries Elapse

    time Frequency 0 100 200 300 400 0 50 100 150 Histogram of measured SQL query times x-axis is elapsed time in seconds y-axis is number of queries with that time What distribution profile is it? Exponential, log-normal,... Can’t tell by just staring at it c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 20 / 44
  28. Applications Fractal Query Times Data Source Original question: Craig Shallahamer’s

    blog Attempted solution: Dave Abercrombie’s blog My solution: My blog Read data file > dir<-setwd("˜/Desktop/GDAT Dev 2011/GDAT Scripts/Power Laws/") > orad<-read.table("orasql-data.txt",header=FALSE) # Add column names > colnames(orad) <- c( "SQLid", "sample", "execns", "dkReads", "buffGets", "CPUtime", "Elapstime" ) > head(orad) SQLid sample execns dkReads buffGets CPUtime Elapstime 1 8qtkxy0g5d1p3,2282376281 1 1 0 3 0.100 0.100 2 8qtkxy0g5d1p3,2282376281 2 1 0 3 0.106 0.106 3 8qtkxy0g5d1p3,2282376281 3 1 0 3 0.101 0.101 4 8qtkxy0g5d1p3,2282376281 4 1 0 3 0.098 0.098 5 8qtkxy0g5d1p3,2282376281 5 1 6 118 0.000 33.575 6 8qtkxy0g5d1p3,2282376281 6 1 8 137 10.000 31.004 c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 21 / 44
  29. Applications Fractal Query Times Visualize Raw Data 0 100 200

    300 400 500 0 100 200 300 400 Index orad$Elapstime Linear plot of unranked data > plot(orad$Elapstime, type="h") Like Zipf’s law, data must be ranked by frequency of occurrence c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 22 / 44
  30. Applications Fractal Query Times Visualize Ranked Data 0 100 200

    300 400 500 0 100 200 300 400 Ranked SQL Times Index otr Linear plot of ranked data > otr <- sort(orad$Elapstime, decreasing=TRUE) > plot(otr,type="h",main="Ranked SQL Times") c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 23 / 44
  31. Applications Fractal Query Times Double-Log Visualization 1 2 5 10

    20 50 100 200 500 0.1 0.5 1.0 5.0 50.0 500.0 Log-Log SQL Times Index otr Log-log plot of ranked data > plot(otr,log="xy",main="Log-Log SQL Times") Clearly this profile is not power law overall But the first 100 queries do appear to be power law c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 24 / 44
  32. Applications Fractal Query Times Data Regions This suggests breaking data

    across 3 regions as follows: Windowed plots # Define data windows of ranked data etA<-otr[1:100] etB<-otr[100:270] # gap... etC<-otr[420:500] plot(etA,type="p",log="xy",main="Log-Log of SQL-A Times") plot(etB,type="p",log="y", main="Log-Lin of SQL-B Times") plot(etC,type="p",log="y", main="Log-Lin of SQL-C Times") 1 2 5 10 20 50 100 100 200 300 400 500 Log-Log of SQL-A Times Index etA 0 50 100 150 30 40 50 60 70 80 Log-Lin of SQL-B Times Index etB 0 20 40 60 80 0.090 0.095 0.100 0.105 0.110 Log-Lin of SQL-C Times Index etC c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 25 / 44
  33. Applications Fractal Query Times Data Region A Fit 1 2

    5 10 20 50 100 100 200 300 400 500 Log-Log SQL A-Times Index etA Regression analysis for Window A > xA<-seq(1:length(etA)) > zA.fit<-lm(log(etA) ˜ log(xA)) > EyA<-exp(coef(zA.fit)[2]*log(xA) + coef(zA.fit)[1]) > plot(etA,log="xy",main="Log-Log SQL A-Times") > lines(xA,EyA,col="blue",lwd=2) c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 26 / 44
  34. Applications Fractal Query Times Data Region B Fit 0 50

    100 150 30 40 50 60 70 80 Log-Lin SQL B-Times Index etB Regression analysis for Window B > xB<-seq(1:length(etB)) > zB.fit<-lm(log(etB) ˜ xB) > EyB<-exp(coef(zB.fit)[2]*xB + coef(zB.fit)[1]) > plot(etB,log="y",main="Log-Lin SQL B-Times") > lines(xB,EyB,col="blue",lwd=2) c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 27 / 44
  35. Applications Fractal Query Times Data Region C Fit 0 20

    40 60 80 0.090 0.095 0.100 0.105 0.110 Log-Lin SQL C-Times Index etC Regression analysis for Window C > xC<-seq(1:length(etC)) > zC.fit<-lm(log(etC) ˜ xC) > EyC<-exp(coef(zC.fit)[2]*xC + coef(zC.fit)[1]) > plot(etC,log="y",main="Log-Lin SQL C-Times") > lines(xC,EyC,col="blue",lwd=2) c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 28 / 44
  36. Applications Fractal Query Times Regression Models 0 20 40 60

    80 100 0 200 400 600 800 x y yA ∼ x−0.4632 power law yB ∼ e−0.0074x exponential decay yC ∼ e−0.0028x exponential decay Regression coefficients > coef(zA.fit) (Intercept) log(xA) 6.6055308 -0.4632485 > coef(zB.fit) (Intercept) xB 4.416070310 -0.007438368 > coef(zC.fit) (Intercept) xC -2.198802043 -0.002782828 c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 29 / 44
  37. Applications Fractal Query Times Slope Analysis 1 2 5 10

    20 50 100 100 200 300 400 500 Log-Log SQL A-Times Index etA From coef(zA.fit) know log(xA) = -0.4632485 Empirical slope γ = 0.46 to two significant decimal digits About half Zipfian slope γ = 1.0 ± 0.5 Correlations are stronger than for Zipf Hypothesis Shorter query times (window A) may be associated with dictionary lookups or other structured data. That structure provides correlations. Longer queries in windows B and C are not structured (ad hoc?) and are therefore more randomized. The lack of strong correlations shows up as different exponential decay rates. c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 30 / 44
  38. Applications Fractal Time Series Outline 1 Fractals What is a

    Fractal? How It Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 30 / 44
  39. Applications Fractal Time Series The Perfect Storm A Power Law

    Storm c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 31 / 44
  40. Applications Fractal Time Series Before the Storm All businesses are

    required to register with the Australian Tax Office (ATO) for an Australian Business Number (ABN) to claim an income tax refund. The ABN was introduced in Y2K. Data from website hosting initial ABN registrations. Period covers March 27 to September 19, 2000 Post-advertising traffic 1 March to 30 May , 2000 Deadline spike on 31 May, 2000 Smaller traffic peaks from 1 June to 30 June, 2000 Post deadline period from 1 July to 19 Sept, 2000 Full details can be found in my CMG-A paper cmga-p10167.pdf included in your GDAT class materials. c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 32 / 44
  41. Applications Fractal Time Series Full Data Profile 11 3 2000

    21 4 2000 21 5 2000 15 6 2000 10 7 2000 4 8 2000 29 8 2000 0 200000 400000 600000 800000 1. 106 ORA Connections Question: Could the “11th hour” spike have been predicted? Answer: Yes, but quite involved. How: Using a power law. What else!? c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 33 / 44
  42. Applications Fractal Time Series Log-Linear Plot 11 3 2000 21

    4 2000 21 5 2000 1 104 2 104 5 104 1 105 2 105 5 105 1 106 2 106 ORA Connections y-axis is the number of Oracle RDBMS connections Here, the y-axis is log scaled Peak growth preceding spike looks linear on semi-log plot x-axis index (not shown) is “days from the start of data window” time series index range t = 0 to t = 38 days c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 34 / 44
  43. Applications Fractal Time Series Semi-Log Regression on Peaks 11 3

    2000 21 4 2000 1 104 2 104 5 104 1 105 2 105 5 105 1 106 ORA Connections Linear peak-growth on semi-log axes Curve must be an exponential function Use Exp as regression model ˆ y(t) = A exp(Bt) Model parameters Origin: A = 114128 Curvature: B= 0.0175 Doubling period: T2 = ln(2) B ∼ 6 months c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 35 / 44
  44. Applications Fractal Time Series Trend Overview 11 3 2000 21

    4 2000 21 5 2000 15 6 2000 10 7 2000 0 200000 400000 600000 800000 1. 106 ORA Connections Revert to linear axes to review the trend Exponential forecast up to the crosshairs looks valid But significantly underestimates onset of the “11th hour” peak As well as rapid drop off on RHS of the peak c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 36 / 44
  45. Applications Fractal Time Series Power Law Fit 20 40 60

    80 100 120 400000 600000 800000 1. 106 Power law has a critical point tc Equation: ˆ y(t) = k|t − tc|−γ See far LHS curve with tc = 40 (blue) Estimate ˆ y(t) → ∞ at t = tc Translate ˆ y(t) rightward until lower part of curve matches Exp function (red) Critical point also moves to tc = 61 (31 May, 2000) Critical point New element is the appearance of a critical point at tc Power law goes infinite and spikes at tc = 61 with γ = 0.6421 c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 37 / 44
  46. Applications Fractal Time Series Comparison of Models Exp growth Power

    law 11 3 2000 21 4 2000 21 5 2000 15 6 2000 10 7 2000 0 200000 400000 600000 800000 1. 106 ORA Connections Exponential trend is consistent with data through April 2000 Completely underestimates onset of the “11th hour” spike Completely overestimates decay of traffic load beyond spike Data is already exceeding Exp model during April-May period Power law model predicts all these effects quite well Critical point is inclusion of critical point tc Use Exp model to baseline a power law model c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 38 / 44
  47. Applications Fractal Time Series Look-ahead Tools Could we have seen

    the spike coming without knowing tc ? 2 weeks from data start 0.30 0.35 0.40 0.45 9.24 9.26 9.28 9.30 log(scale) log(SDF) SDF H = 0.714 1.5 months from data start 0.4 0.6 0.8 1.0 9.0 9.2 9.4 9.6 9.8 10.0 log(scale) log(SDF) SDF H = 0.687 Estimate H = 1 2 (1 − β) from the slope β of ln[S(f)] vs ln[f] in frequency domain: H ∈ [1 2 , 1) persistent autocorrelations (increase/decrease typically followed by increase/decrease) H = 1 2 statistically independent random fluctuations H ∈ (0, 1 2 ] antipersistent autocorrelations (increase/decrease typically followed by decrease/increase) c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 39 / 44
  48. Conclusions Outline 1 Fractals What is a Fractal? How It

    Works Internet Traffic 2 Applications Word Fractals Fractal Query Times Fractal Time Series 3 Conclusions c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 39 / 44
  49. Conclusions Review Power laws are ubiquitous (but hidden) Power laws

    are not like standard statistical distributions Power laws have fatter tails than standard dsns Fat tail carries the bulk of the information (tale is in the tail) Power laws are easy to demonstrate with log-log plot Looked at 3 examples: Zipf, ORA SQL, and ORA ABN traffic spike Trick is to explain persistent strong correlations May need more data but that’s exactly how it should be c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 40 / 44
  50. Conclusions Wanna Learn More? Chapter 10 Internet Planning Bellcore traces

    Fractals and Self-Similarity Short-range Dependence Long-range Dependence (LRD) Ethernet Packetization LRD and Flicker Noise Guerrilla training classes c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 41 / 44
  51. Conclusions Why You Should Care Power laws are ubiquitous c

    2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 42 / 44
  52. Conclusions Why You Should Care Power laws are ubiquitous Hard

    to see them in raw performance data c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 42 / 44
  53. Conclusions Why You Should Care Power laws are ubiquitous Hard

    to see them in raw performance data Must transform your data to see them c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 42 / 44
  54. Conclusions Why You Should Care Power laws are ubiquitous Hard

    to see them in raw performance data Must transform your data to see them Ranked data appears linear on double-log axes c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 42 / 44
  55. Conclusions Why You Should Care Power laws are ubiquitous Hard

    to see them in raw performance data Must transform your data to see them Ranked data appears linear on double-log axes More persistent response degradation than usual c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 42 / 44
  56. Conclusions Why You Should Care Power laws are ubiquitous Hard

    to see them in raw performance data Must transform your data to see them Ranked data appears linear on double-log axes More persistent response degradation than usual Can seriously impact overall database performance c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 42 / 44
  57. Conclusions References B. Mandelbrot, The Fractal Geometry of Nature, W.

    H. Freeman, 1983 L. Liebovitch, Fractals and Chaos Simplified for the Life Sciences, Oxford Uni. Press, 1998 K. Park and W. Willinger, Self-Similar Network Traffic and Performance Evaluation, John Wiley, 2000 N. J. Gunther, Guerrilla Capacity Planning, Springer, 2007 www.perfdynamics.com/iBook/gcap.html The R Project Tools for Statistical Computing www.r-project.org Performance Dynamics Educational Services Local and on-site training www.perfdynamics.com/Classes/schedule.html c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 43 / 44
  58. Conclusions Thank you for attending! Performance Dynamics Company Castro Valley,

    California www.perfdynamics.com perfdynamics.blogspot.com twitter.com/DrQz facebook.com/Performance-Dynamics-Company [email protected] +1-510-537-5758 c 2012 Performance Dynamics Oracle Meets Fractals February 21, 2012 44 / 44