Serve Ollama models on-premise, from any machine in your company's datacenter or even from the PC you use for playing video games, and orchestrate inference from your web application or Jupyter notebook as easily as using Redis, the official ollama-python/ollama-js packages, and oshepherd.
PyCon Austria 2025: https://pycon.pyug.at/talks/beyond-the-cloud-on-premise-orchestration-for-open-source-llms/
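As a rough sketch of what the client side can look like: because oshepherd exposes an Ollama-compatible API, a request can be built against the standard Ollama HTTP routes. The host URL and model name below are placeholders (substitute your oshepherd API server's address and a model your Ollama workers actually serve); in practice you would more likely point the official ollama-python client at the same host.

```python
# Sketch, using only the Python standard library. Assumptions: the host URL
# and model name are illustrative placeholders, not real endpoints.
import json
import urllib.request

OSHEPHERD_HOST = "http://oshepherd-api.internal:5001"  # assumed address


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a POST request against the Ollama-compatible /api/generate route."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{OSHEPHERD_HOST}/api/generate",
        data=payload.encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the request and return the model's completion text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

The same call through ollama-python would be `Client(host=OSHEPHERD_HOST).generate(model="llama3", prompt=...)`, which is usually the more convenient option.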