main goal: accurate detection of changes Online - data must be processed quickly “on the fly” before new data arrives - main goal: the quickest detection of a change after it has occured There are two types of change point analysis ...
average, lower & upper control limit - Focus on point-wise error rate - Lower & upper limit is determined based on standard deviation Example of control chart (source)
observations (ȳ) 2. Calculate residuals: difference between y i and ȳ 3. Set cumulative sum of residuals at 0: S 0 = 0 4. Calculate cumulative sum of residuals: S i = S i-1 + ε i Example of CUSUM plot (source)
0 ## occurrence when sample residual difference < observed residual difference for i in np.arange(0,N): _sample = pd.DataFrame( np.random.choice(df[0], size = df.shape[0], replace = False) ) _sample = calculate_cusum_residuals(_sample) _sample_resid_diff = calculate_residual_difference(_sample) if _sample_resid_diff < resid_diff: X += 1 confidence_level = 100 * X / N print("Confidence level: {:.2f}%".format(confidence_level)) CUSUM: how confident are we that the change exists?
into 2 segments - segment 1 = {1, …, m} - segment 2 = {m+1, …, n} 2. Calculate average value of each segment: X ̄ 1 and X ̄ 2 3. Calculate mean squared error of observation in each segment 4. Value of m which minimizes the MSE is the best estimator of the last point before the change occured n n n
set number of sample ## set t = time, from 0 to length of observations samples = 5000 ## number of iteration t = np.arange(0, len(z)) ## array of observation positions (time) with pm.Model() as model: ## define uniform priors for the mean values mu_a = pm.Uniform('mu_a', 0, 10) mu_b = pm.Uniform('mu_b', 0, 10) sigma = pm.HalfCauchy('sigma', np.std(z)) tau = pm.DiscreteUniform('tau', t.min(), t.max()) ## define stochastic variable mu mu = pm.math.switch(tau >= t, mu_a, mu_b) observation = pm.Normal('observation', mu, sigma, observed = z) trace = pm.sample(samples, step = pm.NUTS()) burned_trace = trace[1000:]
changepoint detection algorithms. useR! Tutorial 2017 Kass-Hout, T. (2010). Change point analysis. Slideshare. Bellei, C. (2016). Changepoint Detection. Part I - A Frequentist Approach. [Blog] Bellei, C. (2017). Changepoint Detection. Part II - A Bayesian Approach. [Blog] Davidson-Pilon, C. (2015). Chapter 1 - Introduction - PyMC3. Probabilistic Programming and Bayesian Methods for Hackers. Slide template by Slidesgo 38