I presented this topic for the second time at the first Elastic APJ Virtual User Group of 2025. Here's the recording: https://www.youtube.com/watch?v=KeRolTavWqA&ab_channel=OfficialElasticCommunity
This is a slight revision of the slides I first presented at the Sydney Testers meetup, though probably a different reading of them. Notably, I added some anecdotes that are current as of the date of the meetup, such as DeepSeek R1 also being an OpenAI-compatible model. I also added a section on something to look forward to, or to ask me to present about: where model evaluations fit into automated tests.