How to find a transiting exoplanet

A colloquium about noise.

Dan Foreman-Mackey

May 16, 2017
Transcript

  1. How to find a transiting exoplanet: data-driven discovery in the astronomical time domain
    Dan Foreman-Mackey, Sagan Fellow / University of Washington / @exoplaneteer / dfm.io / github.com/dfm
  2. Noise models and some more noise models
    Dan Foreman-Mackey, Sagan Fellow / University of Washington / @exoplaneteer / dfm.io / github.com/dfm
  3. Exoplanets by detection method: transit 2712; radial velocity 692; direct imaging 52; microlensing 40; timing 25
    Data Source: The Open Exoplanet Catalogue
  4. [Figure: confirmed exoplanets versus year, 2000 to 2015, by method: transit, RV, microlensing, direct imaging, timing.]
    Data Source: The Open Exoplanet Catalogue
  5. [Figure: planet radius [R⊕] versus orbital period [days], log axes.]
    Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017
  6. Burke et al. (2015), The Astrophysical Journal, 809:8 (19pp), 2015 August 10. Paper excerpt (partially cropped):
    "… the data, rises toward small planets … and has a break near the edge of the parameter space. Given the low numbers of observed planet candidates in the smallest planet bins, the full posterior allowed behavior (1σ orange region; 3σ … Figure 6) … the occurrence rates in the smallest Rp bins. (b) The more complicated model ensures the ability to adapt to variations in the PLDF in the sensitivity analysis of Section 6.2. (c) Previous work on Kepler planet occurrence rates indicated a break in the planet population for 2.0 ≲ Rp ≲ 2.8 R⊕ (Fressin et al. 2013; Petigura et al. 2013a, 2013b; Silburt et al. 2015). (d) Finally, extending this work to a larger parameter space and for alternative target selection samples, such as the Kepler M dwarf sample where a sharp break at Rp ∼ 2.5 R⊕ is observed (Dressing & Charbonneau 2013; Burke et al. 2015), the double power law in Rp is strongly (BIC > 10) warranted.
    Symptomatic of the weak evidence for a broken power law model over the 0.75 ≤ Rp ≤ 2.5 R⊕ range, Rbrk is not constrained within the prior Rp limits of the parameter space. When Rbrk is near the lower and upper Rp limits, α1 and α2 also become poorly constrained, respectively. To provide a more meaningful constraint on the average power law behavior for Rp in the double power law PLDF model, we introduce αavg, which we set to αavg = α1 if Rbrk ≥ Rmid and αavg = α2 otherwise, where Rmid is the midpoint between the upper and lower limits of Rp. We find αavg = −1.54 ± 0.5 and β = −0.68 ± 0.17 for our baseline result. We use αavg as a summary statistic for the model parameters only to enable a simpler comparison of our results to independent analyses of planet occurrence rates and to approximate the behavior for the power law Rp dependence if we had used the simpler single power law model. The results for a single power law model in both Rp and Porb are equivalent to the results for the double …"
    Figure 7. Same as Figure 6, but marginalized over 0.75 < Rp < 2.5 R⊕ and bins of dPorb = 31.25 days.
    Figure 8. Shows the underlying planet occurrence rate model. Marginalized over 50 < Porb < 300 days and bins of dRp = 0.25 R⊕ planet occurrence rates for the model parameters that maximize the likelihood (white dash line). Posterior distribution for the underlying planet occurrence rate for the median (blue solid line), 1σ region (orange region), and 3σ region (blue region). An approximate PLDF based upon results from Petigura et al. (2013a) for comparison (dash dot line).
    Figure 9. Same as Figure 8, but marginalized over 0.75 < Rp < 2.5 R⊕ and bins of dPorb = 31.25 days.
  7. [Figure: transit light curve, relative brightness [ppm] versus time since transit [days].]
  8. …but this is the real world. A few problems: (1) Timing; (2) Geometry; (3) Spacecraft motion; (4) Intrinsic brightness variation
  9. …but this is the real world. A few problems: (1) Timing; (2) Geometry; (3) Spacecraft motion; (4) Intrinsic brightness variation. Annotation: transit probability
  10. …but this is the real world. A few problems: (1) Timing; (2) Geometry; (3) Spacecraft motion; (4) Intrinsic brightness variation. Annotations: transit probability; noise!
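
To put rough numbers on the "Geometry" problem and the "transit probability" annotation, here is a minimal back-of-the-envelope sketch. It is not from the talk: it assumes a circular orbit, ignores limb darkening, and uses an Earth-Sun analog as an illustrative case. The function names are hypothetical.

```python
R_SUN_M = 6.957e8       # nominal solar radius [m]
R_EARTH_M = 6.371e6     # mean Earth radius [m]
AU_M = 1.495978707e11   # astronomical unit [m]


def transit_depth_ppm(planet_radius_m, star_radius_m):
    """Fractional dimming (Rp / R*)^2 in ppm, ignoring limb darkening."""
    return 1e6 * (planet_radius_m / star_radius_m) ** 2


def transit_probability(star_radius_m, semimajor_axis_m):
    """Geometric probability ~ R* / a for a randomly oriented circular orbit."""
    return star_radius_m / semimajor_axis_m


# Illustrative case (an assumption, not a value from the slides): Earth around the Sun.
depth = transit_depth_ppm(R_EARTH_M, R_SUN_M)
prob = transit_probability(R_SUN_M, AU_M)
print(f"depth = {depth:.0f} ppm, transit probability = {100 * prob:.2f}%")
```

Under these assumptions the signal is under 100 ppm and fewer than one in a hundred such systems are aligned to transit, which is why both the geometry and the noise matter.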
  11. 190,000 stars for 4 years at 30-minute cadence with 10^-3 pixel pointing precision. Credit: NASA
  12. [Figure: planet radius [R⊕] versus orbital period [days], log axes.]
    Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017
  13. Ingredients: (1) Systematic target selection & catalog of stellar properties; (2) Systematic catalog of planets; (3) Quantified completeness & reliability; (4) False positive rates & other effects (e.g. multiplicity)
  14. [Figure: planet radius [R⊕] versus orbital period [days], log axes.]
    Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017
  15. Burke, et al. (2015), Figure 1: Fractional completeness model for the host to Kepler-22b (KIC: 10593626) in the Q1-Q16 pipeline run using the analytic model described in Section 2. (Surrounding paper text is cropped.)
  16. We need: (1) Fully automated methods for planet discovery; (2) Rigorous methods for population inference
  17. observation = planet + star + spacecraft + detector
    planet: PHYSICS; star, spacecraft, detector: DATA-DRIVEN MODELS (Gaussian Process)
  18. How to find a transiting exoplanet: (1) Fit & remove data-driven noise model; (2) Matched filter grid search for candidate signals; (3) Vet candidates to remove false alarms
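
Step 2 of the recipe above can be sketched as follows. This is a simplified illustration, not the pipeline from the talk: it assumes the noise model from step 1 has already been removed so that the residual noise is white with known standard deviation, and it uses a box-shaped template of fixed duration. The function name `box_search` and the simulated signal parameters are hypothetical.

```python
import numpy as np


def box_search(time, flux, sigma, periods, duration):
    """Grid search for periodic box-shaped dips.

    Returns (snr, period, t0) for the best trial, assuming independent
    Gaussian noise with standard deviation sigma on every flux point.
    """
    best = (-np.inf, None, None)
    for period in periods:
        # Trial mid-transit times, spaced by half a duration over one period.
        for t0 in np.arange(0.0, period, 0.5 * duration):
            # Time from the nearest trial transit, in [-period/2, period/2).
            phase = (time - t0 + 0.5 * period) % period - 0.5 * period
            in_transit = np.abs(phase) < 0.5 * duration
            n_in = int(in_transit.sum())
            n_out = time.size - n_in
            if n_in == 0 or n_out == 0:
                continue
            # Least-squares depth for a box template and its uncertainty.
            depth = flux[~in_transit].mean() - flux[in_transit].mean()
            depth_err = sigma * np.sqrt(1.0 / n_in + 1.0 / n_out)
            snr = depth / depth_err
            if snr > best[0]:
                best = (snr, period, t0)
    return best


# Simulated light curve (all values are illustrative assumptions).
rng = np.random.default_rng(42)
time = np.arange(0.0, 90.0, 0.02)                 # days, roughly 30-minute cadence
sigma = 100e-6                                    # white noise level
true_period, true_t0, true_depth, duration = 7.3, 2.1, 500e-6, 0.2
phase = (time - true_t0 + 0.5 * true_period) % true_period - 0.5 * true_period
flux = 1.0 + sigma * rng.standard_normal(time.size)
flux[np.abs(phase) < 0.5 * duration] -= true_depth

periods = np.arange(5.0, 10.0, 0.02)
snr, period, t0 = box_search(time, flux, sigma, periods, duration)
print(f"best period = {period:.2f} d, t0 = {t0:.2f} d, S/N = {snr:.1f}")
```

With correlated noise the per-trial statistic would instead be computed under the fitted noise model; the white-noise version is only the simplest instance of the idea.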
  19. Medium data; big questions… (approximately…)
    (1) Kepler: 190,000 stars; 60,000 obs. per star
    (2) K2: 250,000 stars; 4,000 obs. per star
    (3) TESS: 500,000 stars; 20,000 obs. per star
  20. Scaling of Gaussian Processes
    O(N^3): Cholesky factorization
    O(N log^2 N): Approximate methods. Ambikasaran, DFM, et al. (2016); arXiv:1403.6015
  21. Scaling of Gaussian Processes
    O(N^3): Cholesky factorization
    O(N log^2 N): Approximate methods. Ambikasaran, DFM, et al. (2016); arXiv:1403.6015
    O(N): Exploiting structure of specific 1D kernels. DFM, et al. (submitted); arXiv:1703.09710
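
For reference, the O(N^3) baseline in this comparison is the standard Gaussian log-likelihood evaluated through a Cholesky factorization. The sketch below is a generic NumPy illustration, not code from the talk or from celerite. The squared-exponential kernel and all parameter values are assumptions chosen only to make the example runnable.

```python
import numpy as np


def squared_exp_cov(t, amp, ell, jitter):
    """Illustrative covariance: squared-exponential kernel plus white noise."""
    dt = t[:, None] - t[None, :]
    cov = amp * np.exp(-0.5 * dt**2 / ell**2)
    cov[np.diag_indices_from(cov)] += jitter**2
    return cov


def gp_log_likelihood(resid, cov):
    """ln N(resid | 0, cov) via Cholesky; the factorization costs O(N^3)."""
    chol = np.linalg.cholesky(cov)            # cov = L L^T
    # Solve L z = resid so that resid^T cov^-1 resid = z^T z.
    # (A general solver is used here; a triangular solver would be cheaper.)
    z = np.linalg.solve(chol, resid)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (z @ z + log_det + resid.size * np.log(2.0 * np.pi))


rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 10.0, 50))
cov = squared_exp_cov(t, amp=1.0, ell=1.0, jitter=0.1)
resid = rng.multivariate_normal(np.zeros(t.size), cov)
log_like = gp_log_likelihood(resid, cov)
print(f"log-likelihood = {log_like:.3f}")
```

Every likelihood evaluation repeats the factorization, so the cubic cost is paid at each step of a fit, which is what motivates the faster methods listed on the slide.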
  22. [Figure: computational cost [seconds] versus number of data points [N], log axes, with curves labeled 1, 2, 4, 8, 16, 32, 64, 128, 256 and the labels "direct" and "O(N)"; a second panel is cropped.]
    DFM, et al. (submitted); arXiv:1703.09710; github.com/dfm/celerite
  23. In collaboration with… Tim Morton (Princeton), David Hogg (NYU), Eric Agol (UW), Bernhard Schölkopf (MPIS)
    DFM, et al. (2016); arXiv:1607.08237
  24. [Figure: planet radius [R⊕] versus orbital period [days], 1 to 100 days, log axes.] Data Source: The NASA Exoplanet Archive
  25. [Figure: planet radius [R⊕] versus orbital period [days], 1 to 100 days, log axes.] Data Source: The NASA Exoplanet Archive
  26. [Figure: planet radius [R⊕] versus orbital period [days], 1 to 10000 days, log axes.] Data Source: The NASA Exoplanet Archive
  27. [Figure: planet radius [R⊕] versus orbital period [days], 1 to 10000 days, log axes.] Data Source: The NASA Exoplanet Archive
  28. (1) Systematic target selection & catalog of stellar properties; (2) Systematic catalog of planets; (3) Quantified completeness & reliability; (4) False positive rates & other effects (e.g. multiplicity)
  29. [Figure: planet radius [R⊕] versus orbital period [days], 1 to 10000 days, log axes.] Data Source: The NASA Exoplanet Archive
  30. [Figure: planet radius [R⊕] versus orbital period [days], 1 to 10000 days, log axes.] DFM et al. (2016); arXiv:1607.08237. Data Source: The NASA Exoplanet Archive
  31. How to find a transiting exoplanet: (1) Fit & remove data-driven noise model; (2) Matched filter grid search for candidate signals; (3) Vet candidates to remove false alarms
  32. observation = planet + star + spacecraft + detector
    planet: PHYSICS; star: GAUSSIAN PROCESS; spacecraft: CAUSAL MODEL (PCA); detector: PHOTON NOISE
  33. How to find a transiting exoplanet: (1) Fit & remove data-driven noise model; (2) Matched filter grid search for candidate signals; (3) Vet candidates to remove false alarms
  34. [Figure: four example signals, each plotted against hours since event: (a) variability, KIC 7220674; (b) step, KIC 8631697; (c) box, KIC 5521451; (d) transit, KIC 8505215.]
    DFM, et al. (2016)
  35. Foreman-Mackey, Hogg, Morton, et al., Figure 3: Sections of PDC light curve centered on each candidate (black) with the posterior-median transit model over-plotted (orange). Candidates with two transits are folded on the posterior-median …
    Panels: 10321319, 10287723, 8505215, 6551440, 8738735, 8800954, 10187159, 3218908, 4754460, 8410697, 10842718, 11709124, 3239945, 8426957, 9306307, 10602068
    DFM, et al. (2016)
  36. (1) Systematic target selection & catalog of stellar properties; (2) Systematic catalog of planets; (3) Quantified completeness & reliability; (4) False positive rates & other effects (e.g. multiplicity)
  37. [Figure: grid over period [years] (ticks 3, 5, 10, 20) and RP/RJ (ticks 0.2, 0.5, 1.0, 2.0), color scale 0.0 to 0.6, with per-cell values:]
    0.048 0.211 0.499 0.669 0.727 0.710 0.635
    0.046 0.194 0.468 0.616 0.657 0.630 0.569
    0.043 0.193 0.460 0.605 0.623 0.591 0.520
    0.038 0.174 0.433 0.529 0.529 0.492 0.427
    DFM, et al. (2016)
  38. Occurrence rate: 2.00 ± 0.72 planets per G/K-dwarf in the range 2 – 25 years, 0.1 – 1 RJ
    DFM, et al. (2016)
  39. [Figure: log10 g versus log10 Teff, two panels: Kepler and K2.]
    Data Source: The NASA Exoplanet Archive; 5/13/2017
  40. [Figure: light curve of EPIC 201374602 (Kp = 11.5 mag), relative brightness [ppm] versus time [BJD - 2456808]. Raw: 301 ppm; residuals: 35 ppm.]
  41. Led by… Rodrigo Luger & Ethan Kruse
    Luger, et al. (2016, 2017). Image credit: Flickr user Aamir Choudhry (CC BY-NC-SA)
  42. observation = planet + star + spacecraft + detector
    planet: PHYSICS; star: GAUSSIAN PROCESS; spacecraft: CAUSAL MODEL + PIXEL-LEVEL DECORRELATION; detector: PHOTON NOISE
    Inspired by: Vanderburg & Johnson (2014); Crossfield, et al. (2015); Aigrain, et al. (2015); DFM, et al. (2015); Deming, et al. (2015); + more
  43. Pixel-level decorrelation (PLD): if background is correctly subtracted, and astrophysical signal is multiplicative, then the fractional astrophysical contribution is equal in all pixels.
        \hat{p}_n(t) = \frac{p_n(t)}{\sum_{k=1}^{N} p_k(t)}
    Here p_n(t) is the pixel time series, the sum in the denominator is the estimator for the astrophysical signal, and \hat{p}_n(t) is the estimator for the instrumental signal.
    Deming, et al. (2015); Luger, et al. (2016, 2017)
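
The PLD idea on this slide can be illustrated with a toy first-order fit. This is a minimal sketch under stated assumptions, not the EVEREST implementation: the four-pixel aperture, the sawtooth pointing drift, the pixel sensitivities, and the function name `pld_detrend` are all made up for the example.

```python
import numpy as np


def pld_detrend(pixels):
    """First-order PLD on an array of shape (n_times, n_pixels)."""
    total = pixels.sum(axis=1)
    # Fractional pixel fluxes: a multiplicative astrophysical signal cancels.
    basis = pixels / total[:, None]
    # Linear least squares for the instrumental part of the total flux.
    weights, *_ = np.linalg.lstsq(basis, total, rcond=None)
    instrumental = basis @ weights
    return total - instrumental + np.median(total)


# Toy data (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
t = np.arange(0.0, 20.0, 0.02)
x = 0.05 * ((t / 0.25) % 1.0 - 0.5)            # sawtooth pointing drift
sens = np.array([1.00, 0.90, 1.10, 0.95])      # per-pixel sensitivities
base = np.array([0.40, 0.30, 0.20, 0.10])      # light fractions at x = 0
slope = np.array([1.0, -1.0, 0.5, -0.5])       # light redistribution with x
frac = base + slope * x[:, None]
astro = np.ones_like(t)
astro[np.abs(t - 10.0) < 0.1] -= 1e-3          # a single 1000 ppm dip
pixels = 1e5 * astro[:, None] * sens * frac
pixels += rng.normal(0.0, 1.0, pixels.shape)   # white noise per pixel

raw = pixels.sum(axis=1)
detrended = pld_detrend(pixels)
print(f"scatter: raw = {raw.std():.1f}, detrended = {detrended.std():.1f}")
```

Because the fractional pixel fluxes sum to one at every time, the basis already contains a constant term. In practice the astrophysical signal would be modeled jointly (for example with a Gaussian process) rather than left in the residuals as it is here.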
  44. Pixel-level decorrelation (PLD). [Figure illustrating the division (÷, =) in the PLD estimator.]
    Figure credit: Rodrigo Luger; Deming, et al. (2015); Luger, et al. (2016, 2017)
  45. observation = planet + star + spacecraft + detector
    planet: PHYSICS; star: GAUSSIAN PROCESS; spacecraft: CAUSAL MODEL + PIXEL-LEVEL DECORRELATION; detector: PHOTON NOISE
  46. EVEREST: observation = planet + star + spacecraft + detector
    planet: PHYSICS; star: GAUSSIAN PROCESS; spacecraft: CAUSAL MODEL + PIXEL-LEVEL DECORRELATION; detector: PHOTON NOISE
  47. EVEREST: observation = planet + star + spacecraft + detector
    planet: PHYSICS; star: GAUSSIAN PROCESS; spacecraft: CAUSAL MODEL + PIXEL-LEVEL DECORRELATION; detector: PHOTON NOISE
  48. Fig. 3 (caption cropped): Cross-validation procedure for first order PLD o… 03150 (WASP-47 e), a campaign 3 planet host. Show… …ter v in the validation set (red) and the scatter in the … (blue) as a function of …, the prior amplitude for …
    Luger, Kruse, DFM, et al. (2017)
  49. Paper excerpt (cropped): "… Kp = 15; for campaigns 3, 4, and 8, EVEREST recovers the Kepler precision dow… of (variable) giant stars, leading to a higher average CDPP, while campaign 7 … change in the orientation of the spacecraft and excess jitter."
    Fig. 20: The same as Figure 19, but comparing the CDPP of all K2 stars to that of Kepler. EVEREST 2.0 recovers the original Kepler photometric precision down to at least Kp = 14, and past …
    Luger, Kruse, DFM, et al. (2017)
  50. EVEREST 2.0. [Cropped excerpt of a paper page; the text is not recoverable.]
    Luger, Kruse, DFM, et al. (2017)
  51. [Figure: planet radius [R⊕] versus orbital period [days], log axes.] Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017
  52. [Figure: planet radius [R⊕] versus orbital period [days], log axes.] Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017. Kruse, et al. (in prep)
  53. [Figure: planet radius [R⊕] versus orbital period [days], log axes.] 800 candidates; 500 new. Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017. Kruse, et al. (in prep)
  54. [Figure 1, panels a to d: K2 long cadence data; relative brightness versus Barycentric Julian Date − 2,457,700 [day]; relative brightness versus time from mid-transit [day] for transits 1 to 4 and the folded lightcurve; orbital separation [AU]; planets 1b to 1h marked.]
    Caption (cropped): "a, b: Long cadence K2 light curve detrended with EVEREST and with stellar variability removed. Data points are in black, and our highest likelihood transit model for all seven planets …"
    TRAPPIST-1h: Luger, Sestovic, Kruse, et al. (2017); arXiv:1703.04166 (embargoed)
  55. Led by… Ruth Angus (Columbia)
    In collaboration with… Suzanne Aigrain (Oxford), Vinesh Rajpaul (Oxford), Eric Agol (UW), Sivaram Ambikasaran (Indian Inst. of Sci.)
    Angus, et al. (submitted); DFM, et al. (submitted)
  56. [Figure: rotation period (days) versus age (Gyr), log axes, showing Coma Berenices, Praesepe, Hyades, NGC 6811, NGC 6819, the Sun, asteroseismic targets, and M67 (Esselstein, in prep).]
    Figure credit: Ruth Angus
  57. Angus, et al. (submitted); github.com/RuthAngus/GProtation. Paper excerpt (left column cropped):
    "… summit of the Mauna Loa volcano in Hawaii (data from Keeling and Whorf 2004) using a kernel which is the product of a periodic and a SE kernel: the QP kernel. This kernel is defined as
        k_{i,j} = A \exp\left[ -\frac{(x_i - x_j)^2}{2 l^2} - \Gamma^2 \sin^2\left( \frac{\pi (x_i - x_j)}{P} \right) \right] + \sigma^2 \delta_{ij}.    (2)
    It is the product of the SE kernel function, which describes the overall covariance decay, and an exponentiated, squared, sinusoidal kernel function that describes the periodic covariance structure. P can be interpreted as the rotation period of the star, and Γ controls the amplitude of the sin² term. If Γ is very large, only points almost exactly one period away are tightly correlated and points that are slightly more or less than one period away are very loosely correlated. If Γ is small, points separated by one period are tightly correlated, and points separated by slightly more or less are still highly correlated, although less so. In other words, large values of Γ lead to periodic variations with increasingly complex harmonic content. This kernel function allows two data points that are separated in time by one rotation period to be tightly correlated, while also allowing points separated by half a period to be weakly correlated. The additional parameter σ captures white noise by adding a term to the diagonal of the covariance matrix. This can be interpreted to represent underestimation of observational uncertainties (if the uncertainties reported on the data are too small, it will be non-zero) or it can capture any remaining "jitter," or residuals not captured by the effective GP model. We use this QP kernel function (Equation 2) to produce the GP model that fits the Kepler light curve …"
    [Figure panels: Kepler light curve, relative flux [ppt] versus time [days]; power spectrum S(ω) versus ω [days^-1]; kernel k(τ); rotation period [days].]
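
The quasi-periodic (QP) kernel in Equation 2 can be written out directly. The snippet below is a minimal NumPy sketch for illustration, not the GProtation implementation. The function name `qp_covariance` and the parameter values are assumptions. Γ is written as `gamma` and l as `ell`.

```python
import numpy as np


def qp_covariance(x, amp, ell, gamma, period, sigma):
    """Covariance matrix for the QP kernel with a white-noise diagonal term."""
    dx = x[:, None] - x[None, :]
    decay = -dx**2 / (2.0 * ell**2)                       # SE envelope
    periodic = -gamma**2 * np.sin(np.pi * dx / period)**2  # periodic structure
    cov = amp * np.exp(decay + periodic)
    cov[np.diag_indices_from(cov)] += sigma**2            # sigma^2 * delta_ij
    return cov


# Illustrative values only: a 4-day rotation period sampled every 0.1 days.
x = np.arange(0.0, 10.0, 0.1)
cov = qp_covariance(x, amp=1.0, ell=10.0, gamma=1.0, period=4.0, sigma=0.1)

# Points one period apart (lag 4.0) are more correlated than points half a
# period apart (lag 2.0), as described in the excerpt.
print(f"k(P) = {cov[0, 40]:.3f}, k(P/2) = {cov[0, 20]:.3f}")
```

Increasing `gamma` in this sketch suppresses the correlation at half-period lags further, matching the description of large Γ in the excerpt.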
  58. [Figure: ln(Recovered Period) versus ln(Injected Period), with a ln(Amplitude) scale.]
    Angus, et al. (submitted); github.com/RuthAngus/GProtation
  59. Summary: Build data-driven noise models and… (1) Find exoplanets; (2) Learn about stars
    Dan Foreman-Mackey, Sagan Fellow / University of Washington / @exoplaneteer / dfm.io / github.com/dfm