
MCQMC 2024 QMC Tutorial


A 70-minute tutorial given at MCQMC 2024 in Waterloo on August 18, 2024

Fred J. Hickernell

August 19, 2024



Transcript

  1. Quasi-Monte Carlo Methods: What, Why, and How?
     Fred J. Hickernell, Illinois Institute of Technology, August 18, 2024
     Thanks to the organizers and the US National Science Foundation #2316011
  2–7. My aim for the next 75 minutes: you will
     • Understand what quasi-Monte Carlo (qMC) methods are
     • Try qMC in place of simple (or IID) Monte Carlo
     • Use qMC properly
     • Feel free to interrupt and ask questions
     • Join our friendly qMC research community
     Try the computations at https://tinyurl.com/QMCTutorialNotebook
  8–14. Overview
     μ := 𝔼[f(X)] (expectation, with X ∼ 𝒰([0,1]^d)) = ∫_{[0,1]^d} f(x) dx (integral) ≈ (1/n) ∑_{i=0}^{n−1} f(x_i) =: μ̂_n (sample mean)
     • Where this arises in practice
     • Constructing low discrepancy (LD) points x_0, x_1, …
     • Discrepancy (quality) measures for x_0, x_1, …
     • Choosing n so that |μ − μ̂_n| ≤ ε
     • Making our original problem look like the above
     • Ongoing research
  15–16. Where does this arise in practice?
     f(X) = option payoff  →  μ = option price
     f(X) = underground water pressure with random rock porosity  →  μ = average water pressure
     f(X) = pixel intensity from a random ray  →  μ = average pixel intensity
     d may be dozens or hundreds
  18–22. How to choose x_0, x_1, …?
     μ := 𝔼[f(X)] = ∫_{[0,1]^d} f(x) dx ≈ (1/n) ∑_{i=0}^{n−1} f(x_i) =: μ̂_n, with X ∼ 𝒰([0,1]^d)
     Grids do not fill space well and are hard to extend
  23. How to choose x_0, x_1, …?
     IID points fill space better than grids
     [Figure: pairwise projections of 64 independent and identically distributed (IID) points, d = 6]
  24. How to choose x_0, x_1, …?
     LD points fill space even better!
     [Figure: pairwise projections of 64 low discrepancy (LD) points, d = 6]
  25. Quasi-Monte Carlo (qMC) methods use low discrepancy (LD) or evenly spread sequences instead of grids or IID sequences to solve problems more efficiently
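
To see the practical difference, here is a minimal sketch (not part of the tutorial notebook; the product integrand and seeds are arbitrary illustration choices) that estimates a d = 6 integral whose true value is 1, using IID points and scrambled Sobol' points from scipy.stats.qmc.

```python
import numpy as np
from scipy.stats import qmc

d = 6
f = lambda x: np.prod(1 + (x - 0.5) / np.arange(1, d + 1), axis=1)  # true integral is 1

rng = np.random.default_rng(7)
for m in (8, 12, 16):                                   # n = 2^m points
    n = 2 ** m
    x_iid = rng.random((n, d))                          # IID uniform points
    x_ld = qmc.Sobol(d, scramble=True, seed=m).random_base2(m)  # scrambled Sobol' (LD) points
    print(f"n = {n:6d}   IID error = {abs(f(x_iid).mean() - 1):.2e}"
          f"   LD error = {abs(f(x_ld).mean() - 1):.2e}")
```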
  26–28. Integration lattices [DKP22]
     In general, x_i = i h/n (mod 1), i = 0, …, n−1
     Group structure: x_i + x_j (mod 1) = x_{(i+j) mod n}
     Good generating vectors h are chosen by computer search
     [Figure: pairwise projections of the lattice x_i = i(1, 11)/16 (mod 1), i = 0, …, 15]
  29–31. Shifted integration lattices
     In general, x_i = i h/n + Δ (mod 1), i = 0, …, n−1, with Δ ∼ 𝒰[0,1]^d
     A shifted lattice is a coset
     Random shifts make μ̂_n unbiased and shift points away from the boundary
     [Figure: pairwise projections of the shifted lattice x_i = i(1, 11)/16 + Δ (mod 1), i = 0, …, 15, with x_0 = Δ]
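
A randomly shifted rank-1 lattice takes only a few lines of NumPy. This sketch reuses the generating vector h = (1, 11) and n = 16 from the figure; both are illustrative choices, and any generating vector found by computer search can be substituted.

```python
import numpy as np

def shifted_lattice(n, h, rng=None):
    """Rank-1 lattice x_i = i*h/n + Delta (mod 1), i = 0, ..., n-1."""
    h = np.asarray(h, dtype=float)
    shift = rng.random(h.size) if rng is not None else np.zeros(h.size)  # Delta ~ U[0,1)^d
    return (np.arange(n)[:, None] * h / n + shift) % 1.0

x = shifted_lattice(16, (1, 11), rng=np.random.default_rng(0))
print(x[:4])   # the first point x_0 equals the shift Delta
```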
  32–33. Extensible shifted lattice sequences
     van der Corput sequence: for i = (⋯ i_2 i_1 i_0)_2 = i_0 + i_1·2 + i_2·4 + ⋯ + i_m·2^m + ⋯ ∈ ℕ_0,
       x_i = (0.i_0 i_1 i_2 ⋯)_2 = i_0/2 + i_1/4 + i_2/8 + ⋯ + i_m/2^{m+1} + ⋯ ∈ [0,1)
     Lattice reordered by the van der Corput sequence: x_i = (i_0/2 + i_1/4 + i_2/8 + ⋯) h + Δ (mod 1), i = 0, …, 2^m − 1, m ∈ ℕ_0
     [Figure: the first 16 points of a van der Corput sequence]
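
A minimal sketch of the base-2 van der Corput (radical inverse) ordering used above; the bit-by-bit loop is one straightforward way to compute it. The extensible shifted lattice is then (van_der_corput(i) · h + Δ) mod 1.

```python
import numpy as np

def van_der_corput(n_points):
    """x_i = (0.i_0 i_1 i_2 ...)_2 for i = 0, ..., n_points-1 (base-2 radical inverse)."""
    x = np.zeros(n_points)
    for i in range(n_points):
        f, i_rem = 0.5, i
        while i_rem > 0:
            x[i] += f * (i_rem & 1)   # add the next binary digit of i, scaled by 2^{-(m+1)}
            i_rem >>= 1
            f /= 2
    return x

print(van_der_corput(8))   # 0, 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8
```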
  34–36. Digital nets and sequences [DP10]
     Digital net with shift: x_i = i_0 z_0 ⊕ i_1 z_1 ⊕ i_2 z_2 ⊕ ⋯ ⊕ Δ, i = 0, 1, …, 2^m − 1, m ∈ ℕ_0,
       where ⊕ is bitwise addition, e.g., (0.011)_2 ⊕ (0.101)_2 = (0.110)_2, and the z_m ∈ [0,1)^d are chosen by number theory or computer search
     {x_i}_{i=0}^{2^m−1} is a coset for every m ∈ ℕ_0: every tile has the same number of points
     Digital nets are extensible as sequences
     [Figures: tilings of the unit square with the same number of points per tile; pairwise projections of digital net points]
  37–38. Scrambled, shifted digital sequences
     Digital sequence with shift: x_i = i_0 z_0 ⊕ i_1 z_1 ⊕ i_2 z_2 ⊕ ⋯ ⊕ Δ, i = 0, 1, …, 2^m − 1, m ∈ ℕ_0,
       where ⊕ is bitwise addition, the z_m ∈ [0,1)^d are random (scrambled), and Δ ∼ 𝒰[0,1]^d
     Every tile has the same number of points; how small a tile can be depends on d
     Random scrambles and shifts make μ̂_n unbiased and move the first point away from the origin
     [Figure: tilings of the unit square with the same number of points per tile]
  39. • Use n = 2^m, preferred for lattices and nets
     • Do not drop or skip points
     • Use randomized LD sequences to avoid points on the boundaries and to make your answers unbiased; you may also gain in convergence rate
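
Following this advice with an off-the-shelf generator might look like the sketch below; scipy.stats.qmc is used for illustration (QMCPy provides analogous lattice and digital net generators), and the integrand is a placeholder.

```python
import numpy as np
from scipy.stats import qmc

d, m = 4, 10                                   # n = 2**10 points in 4 dimensions
sobol = qmc.Sobol(d, scramble=True, seed=42)   # randomized (scrambled) LD sequence
x = sobol.random_base2(m)                      # exactly 2**m points, none skipped or dropped
f = lambda x: np.exp(-np.sum(x**2, axis=1))    # example integrand (arbitrary)
mu_hat = f(x).mean()
print(f"n = {len(x)}, estimate = {mu_hat:.6f}")
```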
  41–43. Discrepancy measures the quality of x_0, x_1, … [H00]
     |μ − μ̂_n| ≤ discrepancy({x_i}_{i=0}^{n−1}) · variation(f), and the bound is tight;
       the discrepancy is a norm of the error functional and the variation is a semi-norm of f
     If f is in a Hilbert space ℋ with reproducing kernel K, then
       discrepancy²({x_i}_{i=0}^{n−1}) = ∫_{[0,1]^d × [0,1]^d} K(t, x) dt dx − (2/n) ∑_{i=0}^{n−1} ∫_{[0,1]^d} K(t, x_i) dt + (1/n²) ∑_{i,j=0}^{n−1} K(x_i, x_j)
       variation(f) = inf_{c ∈ ℝ} ‖f − c‖_ℋ
     Quasi-Monte Carlo (qMC) methods use low discrepancy (LD) points
  44–46. Ex.: centered discrepancy
     K(t, x) = ∏_{ℓ=1}^d [1 + ½(|t_ℓ − 1/2| + |x_ℓ − 1/2| − |t_ℓ − x_ℓ|)]
     Any submatrix is positive definite
     [Figure: heat map of the kernel for d = 1]
     discrepancy²({x_i}_{i=0}^{n−1}) = (13/12)^d − (2/n) ∑_{i=0}^{n−1} ∏_{ℓ=1}^d [1 + ½(|x_{iℓ} − 1/2| − |x_{iℓ} − 1/2|²)]
       + (1/n²) ∑_{i,j=0}^{n−1} ∏_{ℓ=1}^d [1 + ½(|x_{iℓ} − 1/2| + |x_{jℓ} − 1/2| − |x_{iℓ} − x_{jℓ}|)]
       (a closed form costing 𝒪(d n²) operations)
     variation²(f) = ∫_{[0,1]} |∂f(x_1, 1/2)/∂x_1|² dx_1 + ⋯ + ∫_{[0,1]²} |∂²f(x_1, x_2, 1/2)/∂x_1 ∂x_2|² dx_1 dx_2 + ⋯ + ∫_{[0,1]^d} |∂^d f(x)/∂x_1 ⋯ ∂x_d|² dx
       (hard to compute)
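
The closed form above translates directly into a vectorized 𝒪(d n²) computation. This sketch assumes the absolute-value form of the kernel written above; scipy.stats.qmc.discrepancy(x, method='CD') also computes a centered discrepancy, though its convention (e.g., squared or not) should be checked before comparing values.

```python
import numpy as np

def centered_discrepancy_sq(x):
    """Squared centered L2 discrepancy of points x with shape (n, d)."""
    n, d = x.shape
    a = np.abs(x - 0.5)
    term1 = (13.0 / 12.0) ** d
    term2 = (2.0 / n) * np.prod(1 + 0.5 * a - 0.5 * a**2, axis=1).sum()
    # pairwise products over the d coordinates, giving an (n, n) matrix
    pair = np.prod(1 + 0.5 * (a[:, None, :] + a[None, :, :]
                              - np.abs(x[:, None, :] - x[None, :, :])), axis=2)
    term3 = pair.sum() / n**2
    return term1 - term2 + term3

x = np.random.default_rng(0).random((64, 3))
print(np.sqrt(centered_discrepancy_sq(x)))
```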
  47. Centered discrepancy for digital sequences
     • 𝒪(n^{−1+δ}) decay in theory for all d
     • 𝒪(n^{−1+δ}) decay practically for small d
     • 𝒪(n^{−1/2}) decay practically for large d
     • Discrepancy increases with d
     |μ − μ̂_n| ≤ discrepancy({x_i}_{i=0}^{n−1}) · variation(f)
  48–51. Why discrepancy increases with d
     discrepancy(∅) = sup_{‖f‖_ℋ ≤ 1} ∫_{[0,1]^d} f(x) dx = √(∫_{[0,1]^{2d}} K(t, x) dt dx) = (13/12)^{d/2}
     • Integration becomes harder as d increases
     • Dividing the discrepancy by discrepancy(∅) helps a bit
     • LD still (usually) beats IID, although not in convergence order
  52. A little data can be worse than no data
     sup_{‖f‖_ℋ ≤ 1} |μ|² = discrepancy²(∅) = (13/12)^d
       may be < (13/12)^d − 2 ∏_{ℓ=1}^d [1 + ½(|x_{0ℓ} − 1/2| − |x_{0ℓ} − 1/2|²)] + ∏_{ℓ=1}^d [1 + |x_{0ℓ} − 1/2|]
       = discrepancy²({x_0}) = sup_{‖f‖_ℋ ≤ 1} |μ − μ̂_1|²
  53–54. Shrinkage estimators?
     sup_{‖f‖_ℋ ≤ 1} |μ|² = discrepancy²(∅) = (13/12)^d
       must be > (13/12)^d − 2α ∏_{ℓ=1}^d [1 + ½(|x_{0ℓ} − 1/2| − |x_{0ℓ} − 1/2|²)] + α² ∏_{ℓ=1}^d [1 + |x_{0ℓ} − 1/2|]
       = sup_{‖f‖_ℋ ≤ 1} |μ − α μ̂_1|²  for the optimal α
     Is this the right K and corresponding discrepancy?
  55. Centered discrepancy with coordinate weights γ_1 ≥ γ_2 ≥ ⋯ > 0
     K(t, x) = ∏_{ℓ=1}^d [1 + (γ_ℓ²/2)(|t_ℓ − 1/2| + |x_ℓ − 1/2| − |t_ℓ − x_ℓ|)]
     discrepancy²({x_i}_{i=0}^{n−1}) = ∏_{ℓ=1}^d (1 + γ_ℓ²/12) − (2/n) ∑_{i=0}^{n−1} ∏_{ℓ=1}^d [1 + (γ_ℓ²/2)(|x_{iℓ} − 1/2| − |x_{iℓ} − 1/2|²)]
       + (1/n²) ∑_{i,j=0}^{n−1} ∏_{ℓ=1}^d [1 + (γ_ℓ²/2)(|x_{iℓ} − 1/2| + |x_{jℓ} − 1/2| − |x_{iℓ} − x_{jℓ}|)]
     variation²(f) = ∫_{[0,1]} |∂f(x_1, 1/2)/(γ_1 ∂x_1)|² dx_1 + ⋯ + ∫_{[0,1]²} |∂²f(x_1, x_2, 1/2)/(γ_1 γ_2 ∂x_1 ∂x_2)|² dx_1 dx_2 + ⋯ + ∫_{[0,1]^d} |∂^d f(x)/(γ_1 ⋯ γ_d ∂x_1 ⋯ ∂x_d)|² dx
  56–57. Decaying coordinate weights make integration tractable [NW10]
     |μ − μ̂_n| ≤ discrepancy({x_i}_{i=0}^{n−1}) · variation(f)
     • Problems are tractable if the work required to solve them to error ≤ ε grows slower than exponentially in ε^{−1}
     • Tractability requires shrinking {f : variation(f) ≤ 1}
     • Hopefully, the f encountered in practice have moderate variation(f) even with decaying coordinate weights
  58. Decaying coordinate weights may overcome the curse of dimensionality, but you may need to formulate your problem so that most of the variation lies in the lower-indexed coordinates
  59–60. How to find low discrepancy sequences
     • Computing discrepancy({x_i}_{i=0}^{n−1}) requires 𝒪(d n²) operations, but …
     • Computing 𝔼[discrepancy²({x_i^{shift}}_{i=0}^{n−1})] for shifted lattices and nets requires only 𝒪(d n) operations, because
       𝔼[discrepancy²({x_i^{shift}}_{i=0}^{n−1}; K)] = discrepancy²({x_i}_{i=0}^{n−1}; K̃) = (1/n) ∑_{i=0}^{n−1} K̃(x_i, x_0) − ∫_{[0,1]^d} K̃(x, x_0) dx,
       where K̃ is a (digitally) shift-invariant version of K, e.g., for lattices K̃(t, x) = ∫_{[0,1]^d} K(t + Δ mod 1, x + Δ mod 1) dΔ
  61. When working with lattices and digital sequences, if possible use

    shift-invariant kernels for greater computational efficiency
  62–64. Is worst-case error analysis too pessimistic?
     • Spaces of smoother f, with correspondingly smoother reproducing kernels K, may allow faster decay of the error; see lattices for periodic f with higher order smoothness and higher order nets for f with higher order smoothness
     • For randomized {x_i}_{i=0}^{n−1},
       𝔼_{{x_i}} sup_{variation(f) ≤ 1} |μ − μ̂_n|² = 𝔼_{{x_i}} discrepancy²({x_i}_{i=0}^{n−1}) ≥ sup_{variation(f) ≤ 1} 𝔼_{{x_i}} |μ − μ̂_n|²;
       swapping the order may lead to an extra 𝒪(n^{−1/2})
     • [H18] surveys worst case, randomized, and Bayesian error analyses
  66–69. How to choose n to get the desired accuracy?
     Want |μ − μ̂_n| ≤ ε, where |μ − μ̂_n| ≤ discrepancy({x_i}_{i=0}^{n−1}) · variation(f)
     • discrepancy({x_i}_{i=0}^{n−1}) is 𝒪(n^{−1}) for low discrepancy points versus 𝒪(n^{−1/2}) for IID
     • There is an explicit formula for the discrepancy, but the variation is hard to compute in practice
     • One method is random replications plus the Student's t confidence interval [LENOT24]
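
A minimal sketch of the replication idea in the last bullet (an illustration of the approach in [LENOT24], not the paper's exact procedure): generate R independent randomizations of an LD sequence, compute one estimate per replicate, and form a Student's t interval from the R values.

```python
import numpy as np
from scipy import stats
from scipy.stats import qmc

d, m, R = 3, 12, 15                      # R independent randomizations of n = 2**m points
f = lambda x: np.prod(x, axis=1) * 8     # example integrand; true mean is 1
estimates = np.array([
    f(qmc.Sobol(d, scramble=True, seed=r).random_base2(m)).mean()
    for r in range(R)
])
center, se = estimates.mean(), estimates.std(ddof=1) / np.sqrt(R)
half_width = stats.t.ppf(0.975, df=R - 1) * se
print(f"95% CI: {center:.5f} +/- {half_width:.5f}")
```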
  70–75. QMCPy has deterministic stopping rules
     Want |μ − μ̂_n| ≤ ε
     • For lattices, f̂(k) := ∫_{[0,1]^d} f(x) 𝖾^{−2π√−1 k′x} dx, f(x) = ∑_{k ∈ ℤ^d} f̂(k) 𝖾^{2π√−1 k′x}, and μ = f̂(0)
     • μ − μ̂_n is due to aliasing (𝖾^{2π√−1 k′x_i} constant for all i)
     • Approximate the f̂(k) by a one-dimensional FFT of {f(x_i)}_{i=0}^{n−1}
     • If the f̂(k) decay in a reasonable manner, their FFT approximations can be used to provide a rigorous data-driven bound on μ − μ̂_n
     • Similarly for digital nets
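
The FFT bullet can be illustrated as follows: evaluate f on a rank-1 lattice in its natural ordering and apply np.fft.fft; the coefficient magnitudes are the data a deterministic stopping rule monitors. This sketch shows only the transform, not QMCPy's actual bounding algorithm, and the generating vector and integrand are arbitrary choices.

```python
import numpy as np

n, d = 2**10, 2
h = np.array([1, 433])                      # arbitrary generating vector for illustration
x = (np.arange(n)[:, None] * h / n) % 1.0   # rank-1 lattice in natural order i = 0, ..., n-1
f = lambda x: np.prod(1 + np.sin(2 * np.pi * x), axis=1)   # smooth periodic integrand
y = np.fft.fft(f(x)) / n                    # aliased Fourier coefficients of f
print("mu_hat =", y[0].real)                # y[0] is the sample mean, approximately f_hat(0) = mu
print("largest aliased coefficients:", np.sort(np.abs(y[1:]))[-3:])
```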
  76. QMCPy has Bayesian stopping rules
     Want |μ − μ̂_n| ≤ ε
     • If f is assumed to be a Gaussian stochastic process with covariance kernel K, where the hyperparameters are properly tuned, then one may construct a Bayesian credible interval for the error
     • If K is chosen to be shift invariant and lattice/digital sequences are used, then the 𝒪(n³) computation normally required is reduced to 𝒪(n log n)
  78–79. Variable transformations
     μ := ∫_Ω g(z) λ(z) dz = (something wonderful) ⋯ = 𝔼[f(X)] = ∫_{[0,1]^d} f(x) dx ≈ (1/n) ∑_{i=0}^{n−1} f(x_i) =: μ̂_n, with X ∼ 𝒰([0,1]^d)
     If Z = Ψ(X) for X ∼ 𝒰[0,1]^d, then f(x) = g(Ψ(x)) λ(Ψ(x)) |∂Ψ/∂x|, and
     • Ψ is not unique
     • We want variation(f) to be as small as possible (an art)
     • Often one chooses λ(Ψ(x)) |∂Ψ/∂x| = 1, but this is not necessary
  80. Ex.: option pricing [G04]
     μ := 𝔼[payoff(Brownian motion(t_1, …, t_d))] = ∫_{ℝ^d} payoff(z) exp(−z′Σ^{−1}z/2) / √((2π)^d |Σ|) dz
     Z = 𝖠 (Φ^{−1}(X_1), …, Φ^{−1}(X_d))′ = Ψ(X), with 𝖠𝖠′ = Σ
     Often, choosing 𝖠 by principal component analysis (PCA) gives faster convergence than by Cholesky
     [Figure: execution time versus error tolerance ε for IID with PCA, Sobol' with Cholesky, and Sobol' with PCA]
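
A minimal sketch of this transformation: build the Brownian motion covariance Σ, map scrambled Sobol' points through Φ^{−1}, and multiply by a matrix 𝖠 with 𝖠𝖠′ = Σ chosen either by Cholesky or by PCA (eigendecomposition). The payoff and parameters are placeholders, not those behind the timing figure.

```python
import numpy as np
from scipy.stats import norm, qmc

d, T = 16, 1.0
t = np.linspace(T / d, T, d)
Sigma = np.minimum.outer(t, t)                   # Cov(B(t_i), B(t_j)) = min(t_i, t_j)

A_chol = np.linalg.cholesky(Sigma)               # A A' = Sigma via Cholesky
eigval, eigvec = np.linalg.eigh(Sigma)
A_pca = eigvec[:, ::-1] * np.sqrt(eigval[::-1])  # A A' = Sigma, columns by decreasing variance

x = qmc.Sobol(d, scramble=True, seed=1).random_base2(12)
z = norm.ppf(x)                                  # standard normals via the inverse CDF
paths_pca = z @ A_pca.T                          # Brownian paths, PCA construction
paths_chol = z @ A_chol.T                        # Brownian paths, Cholesky construction
payoff = lambda b: np.maximum(b[:, -1], 0.0)     # placeholder payoff depending on B(T)
print(payoff(paths_pca).mean(), payoff(paths_chol).mean())
```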
  81. Variation reduction
     μ := ∫_Ω g(z) λ(z) dz = ⋯ = 𝔼[f(X)] = ∫_{[0,1]^d} f(x) dx ≈ (1/n) ∑_{i=0}^{n−1} f(x_i) =: μ̂_n
     • Variable transforms are akin to importance sampling
     • May be difficult for Bayesian posterior mean problems
     • Control variates may also be used
     • Acceptance-rejection sampling does not work well
  82–85. Multilevel methods reduce computational cost [G13]
     • The cost to evaluate f(x_i) is typically 𝒪(d), so the cost to obtain |μ − μ̂_n| ≤ ε is typically 𝒪(d ε^{−1−δ})
     • If one can approximate f by lower dimensional approximations f_s : [0,1]^s → ℝ, then
       μ = 𝔼[f_{s_1}(X_{1:s_1})] + 𝔼[f_{s_2}(X_{1:s_2}) − f_{s_1}(X_{1:s_1})] + ⋯ + 𝔼[f(X_{1:d}) − f_{s_{L−1}}(X_{1:s_{L−1}})] =: μ^{(1)} + μ^{(2)} + ⋯ + μ^{(L)}
     • Balance the 𝒪(n_l s_l) cost of approximating each μ^{(l)} well; the total cost to obtain |μ − μ̂_n| ≤ ε may be as small as 𝒪(ε^{−1−δ}) as d, ε^{−1} → ∞
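
A two-level instance of this idea as a minimal sketch with a made-up integrand: the cheap low-dimensional approximation f_s is sampled heavily, while the correction f − f_s, which has small variation, is sampled lightly.

```python
import numpy as np
from scipy.stats import qmc

d, s = 64, 4
weights = 1.0 / np.arange(1, d + 1) ** 2
f = lambda x: np.cos(x @ weights)                     # full d-dimensional integrand
f_s = lambda x: np.cos(x[:, :s] @ weights[:s])        # cheap approximation using the first s coordinates

x_coarse = qmc.Sobol(s, scramble=True, seed=0).random_base2(14)   # many cheap evaluations
x_fine = qmc.Sobol(d, scramble=True, seed=1).random_base2(8)      # few expensive evaluations

mu1 = f_s(x_coarse).mean()                            # estimates E[f_s(X_{1:s})]
mu2 = (f(x_fine) - f_s(x_fine)).mean()                # estimates E[f(X) - f_s(X_{1:s})]
print("two-level estimate:", mu1 + mu2)
```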
  86–88. Conditional Monte Carlo for density estimation [LEPBA22, GKS23]
     Estimate the probability density ϱ of Y = f(X), where X ∼ 𝒰[0,1]^d
     If one can identify g such that y = f(x) ⟺ x_1 = g(y; x_{2:d}), then
       ϱ(y) = ∫_{[0,1]^{d−1}} g′(y; x) dx
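
A minimal sketch for the simple case Y = X_1 + X_2, so that g(y; x_2) = y − x_2 and ∂g/∂y = 1; the indicator restricting g(y; x_2) to [0, 1] is added here to keep the formula valid near the boundary and is not spelled out on the slide.

```python
import numpy as np
from scipy.stats import qmc

# Density of Y = X1 + X2, X ~ U[0,1]^2.  Here y = f(x)  <=>  x1 = g(y; x2) = y - x2,
# so dg/dy = 1; the indicator keeps g(y; x2) inside [0, 1], where the equivalence holds.
x2 = qmc.Sobol(1, scramble=True, seed=3).random_base2(12).ravel()

def density(y):
    valid = (y - x2 >= 0) & (y - x2 <= 1)
    return np.mean(1.0 * valid)            # average of dg/dy over admissible x2

for y in (0.25, 1.0, 1.75):
    print(y, round(density(y), 3))         # triangular density: about 0.25, 1.0, 0.25
```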
  90. Ongoing research
     • Construction of LD sequences (Takahashi Goda)
     • Error estimation (Art Owen) and stopping rules, especially for multilevel methods
     • Multi-fidelity models and Bayesian approaches (Chris Oates)
     • Connections with machine learning (Frances Kuo)
     • Applications beyond integration
     • Good software: QMCPy [QMCPy] and others
  91. My aim for the past 75 minutes: you will
     • Understand what quasi-Monte Carlo (qMC) methods are
     • Try qMC in place of simple (or IID) Monte Carlo
     • Use qMC properly
     • Feel free to interrupt and ask questions
     • Join our friendly qMC research community
  92. MCM2025Chicago.org, July 28 – August 1
     Plenary speakers: Nicholas Chopin (ENSAE), Peter Glynn (Stanford U), Roshan Joseph (Georgia Tech), Christiane Lemieux (U Waterloo), Matt Pharr (NVIDIA), Veronika Rockova (U Chicago), Uros Seljak (U California, Berkeley), Michaela Szölgyenyi (U Klagenfurt)
  93–95. References
     [QMCPy] S.-C. T. Choi, F. J. Hickernell, R. Jagadeeswaran, M. McCourt, and A. Sorokin, QMCPy: A quasi-Monte Carlo Python library (versions 1–1.5), 2024.
     [DKP22] J. Dick, P. Kritzer, and F. Pillichshammer, Lattice rules: Numerical integration, approximation, and discrepancy, Springer Series in Computational Mathematics, Springer Cham, 2022.
     [DP10] J. Dick and F. Pillichshammer, Digital nets and sequences: Discrepancy theory and quasi-Monte Carlo integration, Cambridge University Press, Cambridge, 2010.
     [GKS23] A. D. Gilbert, F. Y. Kuo, and I. H. Sloan, Analysis of preintegration followed by quasi-Monte Carlo integration for distribution functions and densities, SIAM J. Numer. Anal. 61 (2023), 135–166.
     [G13] M. Giles, Multilevel Monte Carlo methods, Monte Carlo and Quasi-Monte Carlo Methods 2012 (J. Dick, F. Y. Kuo, G. W. Peters, and I. H. Sloan, eds.), Springer Proceedings in Mathematics and Statistics, vol. 65, Springer-Verlag, Berlin, 2013.
     [G04] P. Glasserman, Monte Carlo methods in financial engineering, Applications of Mathematics, vol. 53, Springer-Verlag, New York, 2004.
     [H00] F. J. Hickernell, What affects the accuracy of quasi-Monte Carlo quadrature?, Monte Carlo and Quasi-Monte Carlo Methods 1998 (H. Niederreiter and J. Spanier, eds.), Springer-Verlag, Berlin, 2000, pp. 16–55.
     [H18] F. J. Hickernell, The trio identity for quasi-Monte Carlo error analysis, Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Stanford, USA, August 2016 (P. Glynn and A. Owen, eds.), Springer Proceedings in Mathematics and Statistics, Springer-Verlag, Berlin, 2018, pp. 3–27.
     [LEPBA22] P. L'Ecuyer, F. Puchhammer, and A. Ben Abdellah, Monte Carlo and quasi-Monte Carlo density estimation via conditioning, INFORMS J. Comput. 34 (2022), no. 3, 1729–1748.
     [NW10] E. Novak and H. Woźniakowski, Tractability of multivariate problems, Volume II: Standard information for functionals, EMS Tracts in Mathematics, no. 12, European Mathematical Society, Zürich, 2010.
     [LENOT24] P. L'Ecuyer, M. K. Nakayama, A. B. Owen, and B. Tuffin, Confidence intervals for randomized quasi-Monte Carlo estimators, WSC '23: Proceedings of the Winter Simulation Conference, 2024, pp. 445–456.
     [SPDOHHH24] A. G. Sorokin, A. Pachalieva, D. O'Malley, J. M. Hyman, F. J. Hickernell, and N. W. Hengartner, Computationally efficient and error aware surrogate construction for numerical solutions of subsurface flow through porous media, 2024+, arXiv:2310.13765.