The Forgotten Operator

Talk given at the Southern California Linux Expo in March, 2023. Video: https://www.youtube.com/watch?v=H3YbsOCb4lc

Bryan Cantrill

March 17, 2023

Transcript

  1. In the beginning…
     • In the beginning, computers were so expensive that they were shared by necessity – leading to the rise of a (brief) utility computing movement
     • But with Moore’s Law, computing became denser, faster – and cheaper
     • With each successive turn – minicomputers, servers, workstations, personal computers – computing became cheaper and easier to own
     • By the 1990s computing was only on-premises
  2. The pain of on-premises compute
     • With the rise of the internet, compute needs exploded
     • All infrastructure was on-premises – it can’t just be spun up!
     • Physical infrastructure is capital and labor intensive
     • Adding insult to injury, it was all proprietary – hardware and software
     • Physical buildout was exceedingly painful
     • A confluence of trends began to give rise to an alternative…
  3. The rise of cloud computing
     • Several factors in the 2000s came into confluence:
       ◦ Internet ubiquity + protocol maturity
       ◦ Rise of open source software at all layers of the stack
       ◦ Dominance of x86 + “commodity” hardware
       ◦ Startup ice age + financial crisis (emphasis on opex over capex)
     • Added up to cloud computing: shared, elastic, API-driven infrastructure
  4. Myths of cloud computing
     • Cloud computing’s ubiquity in the 2010s gave rise to several myths…
     • Myth: Cloud computing is a low margin business
     • Reality: Cloud computing is a high margin business!
     • Myth: The economies of scale from operating a public cloud primarily accrue to purchasing power
     • Reality: Purchasing power is not unimportant – but the much greater dividend was the ability to invest in innovation!
  5. Cloud computing divide
     • Cloud computing operators – hyperscalers – invested relentlessly in innovation, yielding an increasing divide
     • This innovation drove down their own costs, allowing them to bolster their own positions and continue to innovate
     • On-premises infrastructure providers didn’t understand the cloud, and increasingly focussed on those customers that shared their confusion
     • All of this served to accelerate the demise of on-premises compute
     • So… is anyone left on-prem?
  6. The forgotten operator
     • There (emphatically!) remain good reasons to run on-prem!
     • If you are on-prem in 2023, the reasons are likely good
     • These include: risk management, regulatory compliance, latency, and (increasingly!) economics
     • This on-premises operator has been forgotten by everyone
       ◦ Vendors don’t understand their use case
       ◦ Fellow technologists act like they have never heard of the cloud!
  7. The pain of the forgotten operator
     • The forgotten operator is in an extraordinary amount of pain: the abstractions for on-premises compute remain vestigial
     • Power, cooling, BMC, BIOS, ToR switch, all date from the PC era!
     • And this is to say nothing of the software!
     • These systems operate at cross-purposes: they were never designed together – and to the contrary
     • But do we care?
  8. The mandate for rack-scale machines
     • Those repatriating onto on-premises infrastructure will (rightfully) expect API-driven elastic infrastructure
     • However, that’s not what they’re going to find
     • What they will find is, in fact, worse than they might remember
     • We believe that we must do better
     • We must design rack-scale machines that integrate hardware and software into a single, software-driven system!
  9. Reasons for optimism
     • There are a couple of interesting trends that give optimism…
       ◦ Hardware is easier than ever before – and increasingly open
       ◦ There have been tremendous software advances, e.g. Rust and P4
       ◦ Remote teams make it easier to ramp than ever before
     • Still, rack-scale design presents new challenges – and is a big build!
  10. Oxide Computer Company
      • We have built a true rack-scale machine, with integrated hardware and software, allowing one to easily deploy cloud computing on-premises
      • After a three year build (!), we are on the cusp of shipping our first product
      • We can now say it unequivocally: the future demands rack-scale design!
  11. Rescuing the forgotten operator
      • The public cloud will always play an important role – it’s not going away
      • But more and more operators will need to manage both on-prem and public cloud buildout
      • Those operators have the right to modernity, wherever they deploy!
      • To the forgotten operator: help is on the way!
      • Join us and learn more at https://oxide.computer – and check out our weekly Discord, “Oxide and Friends” (now also a podcast!)