We Built for Predictability; The Workloads Didn’t Care

Michael Stahnke, VP of Stuff at Flox (flox.dev), @stahnma

@stahnma Edu

3 If I can define desired state, I can operate in desired state

4 Desired State is a Myth

5 Desired State is a Myth*
* Save for small values of desired and state

6 This is about science

7 We built an entire industry on the "Fixed Point"
If the manifest is correct, the system is correct.

8 The science might be math

9 f(x) = f(f(x))

10 Idempotency

11 f(x, y) = f(y, x)

12 Commutative

13 y = f(x)

14 Hermeticity

15 Convergence

16 Eventual Consistency

17 Declarative

18 Dry Run / Plan Mode

19 We’ve always been at war with entropy

20 Some got obsessed with it

21 Chaos Engineering develops as a field

22 [Diagram: Known Inputs, The System, Known Outputs]

23 [Diagram: Known Inputs, The System, Known Outputs, plus Users (?)]

24 [Diagram: Known Inputs, The System, Known Outputs, plus Chaos (?) and Users]

25 A sufficiently large selection of users is indistinguishable from chaos engineering

26 So we took the most difficult part of computers, replicated it, and made it super easy to use at levels never imagined.

27 Kernighan's Law: Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.


30 f(x) = 🤷

31 Now we must think in bets, probabilities, or distributions


34 Some LLM> I'd like the download area to show the version you're downloading. Right now, it's not clear at all.

37 Your assessment of simple or complex may not match the tools’ abilities.

38 Who would win?
Add a version badge to a website
Rewrite a production Perl application running for more than 20 years in Go

39 Who would win?
Add a version badge to a website: took about 6 hours of iteration, a Terraform change, multiple approach changes, and edits to something like 10 files. Tried multiple models, etc.
Rewrite a 20-year-old production Perl application in Go: one-shot (about 10 minutes) got more than 80% of behavior correct, and the happy path was 100%.


41 f(x) = 🤷

42 Scientific method: isolate (control for) variables.

43 Make the set of things you know as large as possible

44 The Three Pillars of Certainty
Idempotency: Safety in repetition. Apply the same operation multiple times with the same result.
Commutativity: Order shouldn't matter. The "Holy Grail" of multi-node configuration management.
Hermeticity: Total isolation from the host. No external dependencies or side effects.
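
The first two pillars can be checked mechanically. Below is a minimal Python sketch, assuming system state is modeled as a plain dictionary; the operations `ensure_package`, `ensure_service`, and `append_line` are hypothetical stand-ins, not any real tool's API. Commutativity is checked here as "order of application does not matter", which is the operational reading of f(x, y) = f(y, x). Hermeticity is a property of the environment and is not something this sketch can test.

```python
from typing import Callable, Dict

State = Dict[str, str]
Operation = Callable[[State], State]

def ensure_package(state: State) -> State:
    """Hypothetical declarative operation: nginx must be installed."""
    return {**state, "package:nginx": "installed"}

def ensure_service(state: State) -> State:
    """Hypothetical declarative operation: nginx must be running."""
    return {**state, "service:nginx": "running"}

def append_line(state: State) -> State:
    """Hypothetical imperative operation: append a line on every run."""
    return {**state, "motd": state.get("motd", "") + "hello\n"}

def is_idempotent(f: Operation, x: State) -> bool:
    # f(x) == f(f(x)): applying the operation again changes nothing.
    return f(x) == f(f(x))

def commute(f: Operation, g: Operation, x: State) -> bool:
    # Applying f then g yields the same state as g then f.
    return g(f(x)) == f(g(x))

start: State = {}
print(is_idempotent(ensure_package, start))            # True
print(is_idempotent(append_line, start))               # False
print(commute(ensure_package, ensure_service, start))  # True
```

The `append_line` case is the classic counterexample: it reads sensibly as a single step, but every repeat run drifts the state further.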

45 The Anchor in the Storm
Deterministic Foundations
Isolate the variables

46 The Intrusion of Probability
The shift from "Static" to "Agentic"
The environment is reproducible; the application behavior is not
Deterministic foundations meeting probabilistic workloads

47 Boundary Enforcement

48 Troubleshooting the "It Depends"
Observability Dashboard: ⚠ Boundary Violation / ℹ Behavior within bounds
Observability over Enforcement
From "Is the file there?" to "Is the behavior within bounds?"

49 Take inspiration from good (real) SRE.

50 Make the set of things you know as large as possible

51 It’s smaller than you think.

52 We transition from "State Enforcement" to "Boundary Enforcement."

53 If you think the answer is sandboxing… ask yourself: does all the sand stay in the sandbox?

54 Since we can't guarantee f(x) = y, we must prove y stayed within acceptable bounds.

55 For 20 years, we wanted ‘==’ for everything. Now we must embrace ‘∈’ (member of)

56 y ∈ Ω_success
y: the output of your probabilistic workload
∈: is a member of
Ω_success: the "Sample Space", or the set of all acceptable/safe outcomes
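
In code, the membership test y ∈ Ω_success is a predicate over outputs rather than an equality against one expected value. The following is a minimal Python sketch, assuming (purely for illustration) that the workload should return a JSON object with a short `version` string; the function name `in_omega_success` and the acceptance rules are hypothetical.

```python
import json

def in_omega_success(y: str) -> bool:
    """Return True if the raw output y belongs to the acceptable set."""
    try:
        doc = json.loads(y)
    except json.JSONDecodeError:
        return False  # not JSON at all: outside the set
    if not isinstance(doc, dict):
        return False
    version = doc.get("version")
    # Many different strings are acceptable; we only bound their shape.
    return isinstance(version, str) and 0 < len(version) <= 20

outputs = [
    '{"version": "1.4.2"}',
    '{"version": "v1.4.2"}',
    "Sure! The version is 1.4.2",
]
print([in_omega_success(y) for y in outputs])  # [True, True, False]
```

The first two outputs differ, so an `==` check would reject one of them, yet both sit inside the acceptable set.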

57 Alternatively: a boundary violation is ‖ y − ŷ ‖ > ε
y: what actually happened
ŷ: what you wanted to happen
ε: the largest acceptable distance between them (error budget)
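
A minimal Python sketch of the distance form, assuming outputs can be represented as numeric vectors and that Euclidean distance is a meaningful norm for them (both are assumptions; the slide does not specify a norm). The function name `out_of_bounds` and the sample values are illustrative only.

```python
import math
from typing import Sequence

def out_of_bounds(y: Sequence[float], y_hat: Sequence[float], epsilon: float) -> bool:
    """True when the observed output is farther than epsilon from the target."""
    # math.dist computes the Euclidean distance between two points.
    return math.dist(y, y_hat) > epsilon

y_hat = (1.0, 0.0)  # what we wanted to happen
y = (1.3, 0.4)      # what actually happened (distance is about 0.5)

print(out_of_bounds(y, y_hat, epsilon=0.6))  # False: inside the error budget
print(out_of_bounds(y, y_hat, epsilon=0.4))  # True: boundary violation
```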

58 In a probabilistic world, a "failed" run isn't necessarily a configuration error. It might just be the tail end of a distribution curve.


60 Workload Evaluation Criteria
Standard Health Check: Is the process running? Is the port open? (Deterministic)
WEC: Is the output within the expected "Confidence Interval"? (Probabilistic)
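
One way to read the probabilistic check is as a tolerance band learned from known-good runs. The Python sketch below assumes each run produces a single numeric quality score and that a band of mean ± 3 standard deviations is an acceptable stand-in for the "Confidence Interval" on the slide; the baseline numbers and function names are made up for illustration.

```python
import statistics
from typing import Sequence, Tuple

def expected_interval(baseline: Sequence[float], z: float = 3.0) -> Tuple[float, float]:
    """Band of mean +/- z sample standard deviations from baseline runs."""
    mean = statistics.mean(baseline)
    spread = statistics.stdev(baseline)
    return mean - z * spread, mean + z * spread

def within_expected(value: float, baseline: Sequence[float], z: float = 3.0) -> bool:
    low, high = expected_interval(baseline, z)
    return low <= value <= high

# Hypothetical quality scores collected from known-good runs.
baseline_scores = [0.90, 0.92, 0.88, 0.91, 0.89, 0.93, 0.90, 0.91]

print(within_expected(0.90, baseline_scores))  # True
print(within_expected(0.60, baseline_scores))  # False
```

A deterministic health check would pass in both cases, since the process is up and the port is open either way.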

61 Workload Evaluation Criteria
Build Statistical Alarms
Don't alert on a single P(Fail). Alert when the Shape of Success changes (e.g., the mean response time drifts or the hallucination rate spikes).
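
A statistical alarm of this kind can be sketched as a one-sided test on a failure proportion. The Python below assumes a known baseline failure rate and uses a normal approximation, which is only reasonable for a large enough window; hence the `min_runs` guard. The function name, thresholds, and sample counts are all hypothetical.

```python
import math

def failure_rate_shifted(
    baseline_rate: float,
    recent_failures: int,
    recent_runs: int,
    z_threshold: float = 3.0,
    min_runs: int = 50,
) -> bool:
    """Alert only when the recent failure rate is implausibly high under the baseline."""
    if recent_runs < min_runs:
        return False  # too little data: a single failure is not a signal
    observed = recent_failures / recent_runs
    # Standard error of a proportion under the baseline rate.
    std_err = math.sqrt(baseline_rate * (1 - baseline_rate) / recent_runs)
    if std_err == 0:
        return observed > baseline_rate
    z = (observed - baseline_rate) / std_err
    return z > z_threshold

print(failure_rate_shifted(0.05, recent_failures=1, recent_runs=1))     # False
print(failure_rate_shifted(0.05, recent_failures=12, recent_runs=200))  # False
print(failure_rate_shifted(0.05, recent_failures=30, recent_runs=200))  # True
```

Twelve failures in 200 runs is ordinary noise around a 5% baseline; thirty in 200 means the shape has changed.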

62 Chaos is a property we work with.
Is it correct vs. Is it in bounds?


64 Is it correct vs. Is it in bounds?
You should look at the output distribution and test for it.

Shift from Unit Testing our infra to Statistical Testing our workloads.

If the LLM returns a hallucination, your Puppet run didn't fail. Your boundary did.

67 The Anchor Principle
If the Workload is a variable, the Environment must be a constant.
You cannot reliably debug a probabilistic application on a mutable substrate.


69 The new CI/CD Pipeline
Step 1: Provision a Hermetic Environment.
Step 2: Run the Probabilistic Workload n times.
Step 3: Measure the Probability of Success (P).
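
Steps 2 and 3 can be sketched in a few lines of Python. Everything here is an assumption for illustration: `workload` is a simulated stand-in that succeeds about 90% of the time, the seeded `random.Random` exists only to make the sketch repeatable, and the Wilson score interval is one common way to put a lower bound on P from n runs. Step 1 (the hermetic environment) happens outside this code.

```python
import math
import random
from typing import Callable

def workload(rng: random.Random) -> bool:
    """Hypothetical stand-in for a probabilistic workload; True means success."""
    return rng.random() < 0.9

def measure_success(run: Callable[[random.Random], bool], n: int, rng: random.Random) -> float:
    """Step 2 and 3: run the workload n times and return the observed success rate."""
    successes = sum(1 for _ in range(n) if run(rng))
    return successes / n

def wilson_lower_bound(p_hat: float, n: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion."""
    denom = 1 + z * z / n
    centre = p_hat + z * z / (2 * n)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (centre - margin) / denom

rng = random.Random(42)
p = measure_success(workload, 500, rng)
print(f"observed P = {p:.3f}, lower bound = {wilson_lower_bound(p, 500):.3f}")
```

Gating on the lower bound rather than on the observed P keeps a lucky run of n trials from passing a workload whose true success rate is below the bar.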

70 The Deployment Guide
We don't ship because "the build passed." We ship because "the success distribution in this environment remains stable."
Your infrastructure tool’s job is to ensure Environment consistency so the Success Score is valid.

71 Summary of Enforcement
Pin the Substrate (recommended): Use Hermetic tools (I may suggest Flox) to lock the environment.
Define the Curve: Establish a baseline for Normal Randomness.
Monitor the Shape: Alert on shifts in probability, not just binary failures.

72 Closing
Determinism was the childhood. Probability is our adulthood.
We must build the hermetic foundations and work to make randomness safe.

Final Thoughts
When the code has unknown quality and risk levels, we need to treat it as hostile.

Final Thoughts
Control points need to be created and developed.

This is our time.

Slides: This deck (Speakerdeck)

77 We have a Flox workshop on Wednesday!

flox.dev
