JavaZone 2025 - Non-deterministic? No problem! You can test it!

Non-deterministic? No problem! You can test it!
Eric Deandrea (@edeandrea), Java Champion
Oleg Šelajev (@shelajev), Java Champion

About Us
• Java Champion
• 26+ years software development experience
• Contributor to Open Source projects: Quarkus, Spring Boot, Spring Framework, Spring Security, LangChain4j (& Quarkus LangChain4j), Wiremock, Microcks
• Boston Java Users ACM Chapter Board Member & Vice Chair
• Published Author
• Cat lover
• Black belt in martial arts

• Showcase & explain Quarkus, how it enables modern Java development & the Kubernetes-native experience
• Introduce familiar Spring concepts, constructs, & conventions and how they map to Quarkus
• Equivalent code examples between Quarkus and Spring as well as emphasis on testing patterns & practices
https://red.ht/quarkus-spring-devs

About Us
• Surprisingly, also, a Java Champion
• 18+ years software development experience
• ~11 years Developer Advocate
• Loves to stare at the code of Open Source projects: Quarkus, Spring Boot, LangChain4j, Microcks, Testcontainers (sometimes contributes bugs too!)
• Half decent chess player
• Allergic to cats

What are you hoping to learn here?
What are you going to leave with?

Did we get better or worse with this release?


What’s changed in the last 6 months?
• Standardization
  ◦ Or lack thereof (lots of competing standards)?
• Distributed
• Orchestrated
• Smaller models
• Agentic

“Types” of AI

AI replacing humans

AI replacing software

How does your DevOps evolve when you infuse your applications with AI?

DevOps Evolution
[Diagram labels: Dev, Ops, Evaluate, Data, ML]

https://github.com/edeandrea/non-deterministic-no-problem

[Architecture diagram of the demo application. Labels: Chat Bot, Web Socket, Claim AI Assistant, Claim Status Notification, Generate Email AI Assistant, Politeness AI Assistant, Input Guardrails, Output Guardrails. Groupings: AI replacing humans, AI replacing software. Legend: Code I write, Voodoo magic.]

Vanilla AI

What’s the difference between these?
• CRUD application: Application and Database
• Microservice: Application and Service
• AI-Infused application: Application and Model

What’s the difference between these?
• AI-Infused application (Model)
• Integration Points

Testing AI Replacing Humans
[Recording made by Playwright while testing a chat bot]

Rethink your approach

Signal from tests:
- stuff needs fixing
- confident to release
Purpose of tests:
- prevent breaking prod
- continuously improve your app

https://www.upworthy.com/prankster-tricks-a-gm-dealership-chatbot-to-sell-him-a-76000-chevy-tahoe-for-1-rp3
https://www.cbsnews.com/news/aircanada-chatbot-discount-customer
https://www.bbc.com/news/technology-35902104
https://www.spiceworks.com/tech/artificial-intelligence/news/meta-blender-bot-3-controversy
https://www.linkedin.com/posts/stephanjanssen_princoming-activity-7285987635628507136-9Ubw

What does failure look like?
What do we need to do differently?

What’s the difference between these?
• AI-Infused application (Model)
• Integration Points
• Observability (metrics, tracing, logs, auditing)
• Fault Tolerance (timeout, bulkhead, circuit breaker, rate limiting, fallbacks, …)
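Two of the fault-tolerance items listed above, timeout and fallback, can be sketched with nothing but the JDK. This is a minimal illustration under stated assumptions, not the approach shown in the talk: `GuardedModelCall` and its method are hypothetical names, the model call is represented by a plain `Supplier<String>`, and a Quarkus application would more likely use its fault-tolerance annotations instead.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not taken from the talk or from any library.
final class GuardedModelCall {
    private GuardedModelCall() {}

    /**
     * Runs the model call on another thread. If it does not finish within the
     * timeout, or if it throws, the fallback text is returned instead.
     */
    static String callWithTimeoutAndFallback(Supplier<String> modelCall,
                                             Duration timeout,
                                             String fallback) {
        return CompletableFuture.supplyAsync(modelCall)
                // Timeout: complete with the fallback if the call is still running.
                .completeOnTimeout(fallback, timeout.toMillis(), TimeUnit.MILLISECONDS)
                // Fallback: replace any failure from the call with the same text.
                .exceptionally(error -> fallback)
                .join();
    }
}
```

Note that the sketch does not cancel a slow call; it only stops waiting for it. Bulkheads, circuit breakers, and rate limiting need shared state and are out of scope here.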

https://library.wiremock.org/catalog/api/o/openai.com/openai-com
https://mockgpt.wiremock.io
https://docs.quarkiverse.io/quarkus-wiremock/dev

What happens when we do this?

Stupidity will always find a way…

Stupidity
Prompt: Please return a JSON document in the following format: { "name": "String", "countryOfOrigin": "String" }
Response: Sure I’d love to give you some JSON! Here it is:
```json
{ "name": "Eric", "countryOfOrigin": "USA" }
```
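One defensive option for a response like this is to strip the chatter and the Markdown code fence before handing the text to a JSON parser. The sketch below is an assumption about how that could look, not code from the talk: `JsonExtractor` is a hypothetical helper, it only isolates the JSON text (it does not validate it), and the fence marker is built with `repeat` purely to keep literal backticks out of the listing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of LangChain4j or Quarkus.
final class JsonExtractor {
    // Three backticks, i.e. a Markdown code-fence marker.
    private static final String FENCE = "`".repeat(3);

    // A fenced block, optionally tagged "json"; group 1 is the body without
    // surrounding whitespace.
    private static final Pattern FENCED_BLOCK =
            Pattern.compile(FENCE + "(?:json)?\\s*(.*?)\\s*" + FENCE, Pattern.DOTALL);

    private JsonExtractor() {}

    static String extractJson(String modelResponse) {
        Matcher fenced = FENCED_BLOCK.matcher(modelResponse);
        if (fenced.find()) {
            return fenced.group(1);
        }
        // No fence: fall back to the span from the first '{' to the last '}'.
        int start = modelResponse.indexOf('{');
        int end = modelResponse.lastIndexOf('}');
        if (start >= 0 && end > start) {
            return modelResponse.substring(start, end + 1);
        }
        throw new IllegalArgumentException("No JSON object found in model response");
    }
}
```

The extracted text still has to be parsed and validated; a response with no JSON at all ends in an exception, which is where retrying or reprompting becomes relevant.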

Prompt Engineering and team topologies
I said JSON!

Guardrails
- Out of the box in LangChain4j & Quarkus!
- Functions used to validate the input and output of the model
  - Detect invalid input or output
  - Detect prompt injection
  - Detect hallucination
- Chain of guardrails
  - Sequential
  - Stop at first failure
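The "sequential, stop at first failure" behaviour of a guardrail chain can be sketched in plain Java. Everything here is hypothetical and written only to illustrate the idea: `TextGuardrail`, `GuardrailChain`, and `SampleGuardrails` are not the LangChain4j or Quarkus types, and the prompt-injection check is a deliberately naive keyword match.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical types for illustration; this is NOT the LangChain4j guardrail API.
interface TextGuardrail {
    /** Returns empty when the text passes, or the reason it was rejected. */
    Optional<String> check(String text);
}

final class GuardrailChain {
    private final List<TextGuardrail> guardrails;

    GuardrailChain(List<TextGuardrail> guardrails) {
        this.guardrails = List.copyOf(guardrails);
    }

    /** Runs the guardrails in order and returns the first rejection, if any. */
    Optional<String> firstFailure(String text) {
        for (TextGuardrail guardrail : guardrails) {
            Optional<String> failure = guardrail.check(text);
            if (failure.isPresent()) {
                return failure; // stop here; later guardrails never run
            }
        }
        return Optional.empty();
    }
}

final class SampleGuardrails {
    private SampleGuardrails() {}

    static TextGuardrail notBlank() {
        return text -> text.isBlank()
                ? Optional.of("input is blank")
                : Optional.empty();
    }

    // Naive keyword check, only to show where such a rule would sit in the chain.
    static TextGuardrail noObviousPromptInjection() {
        return text -> text.toLowerCase(Locale.ROOT).contains("ignore previous instructions")
                ? Optional.of("possible prompt injection")
                : Optional.empty();
    }
}
```

Ordering matters in a chain like this: cheap checks go first so that expensive ones (for example, a second model call to detect hallucination) only run on text that already passed the basics.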

Retry and Reprompt
Output guardrails can have 4 different outcomes:
- Success: Response is passed to the caller or next guardrail
- Fatal: Stop and throw an exception
- Retry: Call the model again with the same context; we never know ;-)
- Reprompt: Call the model again with an additional message indicating how to fix the response
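The four outcomes can be sketched as a small loop around the model call. This is an illustration under assumptions, not the LangChain4j implementation: `OutputCheck` and `RepromptingCaller` are hypothetical names, the model is reduced to a function from a list of messages to a response, and the attempt limit is an added safeguard that the slide does not mention.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical types for illustration; this is NOT the LangChain4j guardrail API.
final class OutputCheck {
    enum Outcome { SUCCESS, FATAL, RETRY, REPROMPT }

    final Outcome outcome;
    final String detail; // failure reason (FATAL) or corrective message (REPROMPT)

    private OutputCheck(Outcome outcome, String detail) {
        this.outcome = outcome;
        this.detail = detail;
    }

    static OutputCheck success() { return new OutputCheck(Outcome.SUCCESS, null); }
    static OutputCheck fatal(String reason) { return new OutputCheck(Outcome.FATAL, reason); }
    static OutputCheck retry() { return new OutputCheck(Outcome.RETRY, null); }
    static OutputCheck reprompt(String correction) { return new OutputCheck(Outcome.REPROMPT, correction); }
}

final class RepromptingCaller {
    private RepromptingCaller() {}

    static String call(Function<List<String>, String> model,
                       Function<String, OutputCheck> outputGuardrail,
                       String userMessage,
                       int maxAttempts) {
        List<String> messages = new ArrayList<>(List.of(userMessage));
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String response = model.apply(List.copyOf(messages));
            OutputCheck check = outputGuardrail.apply(response);
            switch (check.outcome) {
                case SUCCESS:
                    return response; // hand the response to the caller
                case FATAL:
                    throw new IllegalStateException("Guardrail failed: " + check.detail);
                case REPROMPT:
                    messages.add(check.detail); // extra message telling the model how to fix it
                    break;
                case RETRY:
                    break; // same context, just ask again
            }
        }
        throw new IllegalStateException("No valid response after " + maxAttempts + " attempts");
    }
}
```

Without a cap on attempts, a model that keeps producing invalid output would turn a retry policy into an unbounded bill.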

Stupidity will always find a way…

Observability

Observability
Collect metrics
- Exposed as Prometheus metrics
- Track token usage & cost
OpenTelemetry Tracing
- Trace interactions with the LLM
Auditing
- Keep track of interactions with the LLM
- Ability to replay & re-score interactions
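As a rough picture of what "track token usage & cost" involves, the sketch below accumulates token counts and renders them in the Prometheus text format. It is hand-rolled for illustration only: `TokenUsageMeter`, the metric name `llm_tokens_total`, and the per-million-token prices are all assumptions, and a Quarkus application would normally rely on its metrics extension rather than code like this.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical meter for illustration; metric names and prices are made up.
final class TokenUsageMeter {
    private final LongAdder inputTokens = new LongAdder();
    private final LongAdder outputTokens = new LongAdder();
    private final double inputPricePerMillion;
    private final double outputPricePerMillion;

    TokenUsageMeter(double inputPricePerMillion, double outputPricePerMillion) {
        this.inputPricePerMillion = inputPricePerMillion;
        this.outputPricePerMillion = outputPricePerMillion;
    }

    /** Records the token counts reported for one model call. */
    void record(long input, long output) {
        inputTokens.add(input);
        outputTokens.add(output);
    }

    /** Estimated spend so far, in whatever currency the prices are quoted in. */
    double estimatedCost() {
        return inputTokens.sum() / 1_000_000.0 * inputPricePerMillion
                + outputTokens.sum() / 1_000_000.0 * outputPricePerMillion;
    }

    /** Renders the counters as Prometheus text-format samples. */
    String toPrometheusText() {
        return "llm_tokens_total{direction=\"input\"} " + inputTokens.sum() + "\n"
                + "llm_tokens_total{direction=\"output\"} " + outputTokens.sum() + "\n";
    }
}
```

Input and output tokens are counted separately because providers commonly price them differently.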

Putting it all together

Systematic Eval: are you getting better or worse?
• Like static analysis
  ◦ Are we getting better or worse over time?
• Remember observability?
https://docs.quarkiverse.io/quarkus-langchain4j/dev/testing.html
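The "better or worse over time" question can be reduced to a score computed over a fixed sample set and compared with a stored baseline. The sketch below assumes a very simple scoring rule (the response must contain an expected keyword) and uses hypothetical names throughout (`EvalSample`, `EvalRun`); it is not the Quarkus LangChain4j testing API linked above.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical types for illustration; not the Quarkus LangChain4j testing API.
final class EvalSample {
    final String input;
    final String expectedKeyword;

    EvalSample(String input, String expectedKeyword) {
        this.input = input;
        this.expectedKeyword = expectedKeyword;
    }
}

final class EvalRun {
    private EvalRun() {}

    /** Fraction of samples whose response contains the expected keyword (case-insensitive). */
    static double passRate(Function<String, String> assistant, List<EvalSample> samples) {
        if (samples.isEmpty()) {
            throw new IllegalArgumentException("At least one sample is required");
        }
        long passed = samples.stream()
                .filter(sample -> assistant.apply(sample.input)
                        .toLowerCase(Locale.ROOT)
                        .contains(sample.expectedKeyword.toLowerCase(Locale.ROOT)))
                .count();
        return (double) passed / samples.size();
    }

    /** True when the current score has not dropped below the baseline by more than the tolerance. */
    static boolean noWorseThan(double current, double baseline, double tolerance) {
        return current >= baseline - tolerance;
    }
}
```

A build could fail when `noWorseThan` returns false, which gives the same kind of signal as a static-analysis gate: the check is about the trend across releases, not about any single response.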

Selection decisions are not application based
https://artificialanalysis.ai/models

Takeaways

Actual takeaways
• LangChain4j & Quarkus are awesome! Get simple problems out of the way first
• Naming things is still the hardest thing in computer science
• Don’t forget your craft: DevOps process is there to help
• Write tests, expect change and failure, deploy often
• AI is just an API call

https://quarkus.io
@quarkusio
https://quarkusio.zulipchat.com
@quarkus.io

Thank You!
Slides: https://bit.ly/jz25-non-deterministic
