times, mainly to deride others • Not clear what “serious” means in the context of an argument that equates a computer program with nuclear weapons • Or one that dismisses anyone who disagrees with this assessment as “just vibes” • Or one that puts the risk of human extinction at the (metaphorical!) hands of a computer program at 5% with zero methodology • So, a serious question: why treat this seriously at all?
isn’t new – and isn’t always poorly founded! • New technologies often have unintended consequences and externalities that merit consideration and discussion • But in those who believe in AI-based extinction risk, the fear itself is alarming – in part because of the actions that it would justify • The “AI pause” – if implemented – would be brazenly authoritarian • The accompanying rhetoric is often disturbingly violent
– when made concrete – hinge on: ◦ A computer program getting ahold of nuclear weapons ◦ A computer program making a novel bioweapon ◦ A computer program developing novel molecular nanotechnology • We are going to leave aside nuclear weapons, as indisputably serious people have been thinking about them since the dawn of the atomic age • But the latter two have something important in common…
we talk about the fear of a superintelligent AI actively killing not just some humans but all of them, we are talking about AI making weapons • Let us leave aside many questions about such scenarios (e.g., AI’s alignment, motivation, or means of production – and human adaptability, countermeasures, and resilience), and focus on one pillar… • It depends on AI applying the constraints of physical and mathematical reality to make new stuff – which is to say, engineering
threatened by a superintelligence engaged in engineering, it prompts an important question… • Is engineering an act of intelligence alone? • I can’t speak to building novel bioweapons or the significant challenges in reviving otherwise moribund molecular nanotechnology… • …but we do have a bunch of recent experience building something big and new that is surely simpler than these domains
be said: building a new computer + new network switch + high-speed backplane + all software from lowest levels of firmware to highest levels of control plane is hard and complicated • It is still, however, engineering, not science • Engineering is the act of learning from failure: even when building anew, there will be many occasions when the system does not, in fact, work! • It is worth exploring a tiny fraction of the failures that we endured in building, as they are instructive as to the nature of engineering…
following the documented power sequencing to the CPU (AMD Milan), it was refusing to come out of reset, simply reinitiating the power-on sequence after 1.25 seconds of inactivity • Natural assumption was that power was marginal – but the power looked good (and making it extraordinary didn’t change anything) • Went down any number of blind alleys, performing directed experiments with respect to non-connected pins that shouldn’t make any difference • These experiments weren’t easy!
several weeks of debugging, we discovered that our voltage regulator had a firmware bug: it adjusted voltage as requested by the CPU via SVI2 – but never sent a completion (VOTF Complete) • The CPU had no way of knowing that the power was in fact correct • AMD’s tool for verifying power (SDLE) did not check for this packet • Corrected regulator firmware resulted in the CPU coming out of reset!
could not get the Chelsio NIC to come out of reset • Extensive validation did not reveal any signal that was out of spec • Attempting to take a working add-in card (AIC) and destroy it revealed that one of the pinstrap resistors (to select the clock source) was incorrectly specified • We had a 1K ohm pull-down resistor, but this was in fact too weak – and a 499 ohm resistor was required to overcome an internal pull-up • Reworking with the correct resistor resulted in the NIC correctly starting!
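The pinstrap arithmetic above can be sketched as a simple voltage divider. This is a minimal illustration under stated assumptions, not the actual analysis: the supply voltage, the internal pull-up value, and the logic-low threshold below are hypothetical numbers chosen only to show the effect. The source gives just the two pull-down values (1K ohm and 499 ohm).

```python
# All three constants are HYPOTHETICAL, chosen for illustration only;
# the real part's supply, internal pull-up, and thresholds are not given.
VDD = 3.3                    # assumed supply voltage, in volts
R_INTERNAL_PULLUP = 2000.0   # assumed internal pull-up, in ohms
V_IL_MAX = 0.8               # assumed highest voltage still read as logic low


def strap_voltage(r_pulldown: float) -> float:
    """Voltage at the strap pin: a divider between the internal pull-up
    (to VDD) and the external pull-down (to ground)."""
    return VDD * r_pulldown / (R_INTERNAL_PULLUP + r_pulldown)


def reads_low(r_pulldown: float) -> bool:
    """True if the pull-down is strong enough for the pin to read as low."""
    return strap_voltage(r_pulldown) <= V_IL_MAX


for r in (1000.0, 499.0):
    print(f"{r:6.0f} ohm pull-down: {strap_voltage(r):.2f} V, "
          f"reads low: {reads_low(r)}")
```

With these assumed numbers, the weaker pull-down leaves the pin above the logic-low threshold, while the stronger one pulls it below. Every signal can look in spec while the strap still selects the wrong option.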
We have our own platform enablement layer (i.e., no BIOS); we are responsible for initializing devices at the lowest layer • With disconcerting frequency, some number of Chelsio NIC links did not train correctly for some of their lanes on boot • Decoding the Link Training and Status State Machine (LTSSM) on the CPU allowed us to better understand where it was failing, but not why • Discovered that a second PERST resulted in correct training – and moreover that this second PERST is present on legacy firmware!
a revision of our PCIe-to-U.2 passthrough card (Sharkfin), we had I2C connectivity – but no PCIe connectivity whatsoever • A previous version of this card had worked, but little had changed in the schematic and the layout – why were the new ones broken?! • Physical inspection revealed that one of the parts was simply wrong! • The wrong reel of parts had been loaded into a pick-and-place machine, and an inverter had been laid down instead of an AND gate (!) • Reworked ~1200 cards in ~96 hours!
OS boot images, sporadic (!) corruption was seen • Adding checksums to these images revealed corruption was rampant (!!) • Microprocessor was speculatively loading through a stowaway mapping from early boot, which was allocating in the TLB • If application address conflicted with address of stowaway mapping, kernel would incorrectly copy data from the wire to the wrong location • Eliminating stowaway mapping eliminated the corruption – but highlighted divergent perspectives on side-effects of speculative loads
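The checksum step above can be sketched as follows: compute a digest when the image is built, then recompute it on the receiving side. This is a minimal sketch, not the actual implementation. The source says only that checksums were added, so the choice of SHA-256 and the function names `image_digest` and `verify_image` are assumptions made for illustration.

```python
import hashlib


def image_digest(image: bytes) -> str:
    """Digest of an image's bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(image).hexdigest()


def verify_image(received: bytes, expected_digest: str) -> bool:
    """True only if the received bytes match the digest recorded at build time."""
    return image_digest(received) == expected_digest


# Hypothetical stand-in for a boot image.
original = bytes(range(256)) * 16
expected = image_digest(original)

# Simulate one byte landing incorrectly during the copy.
corrupted = bytearray(original)
corrupted[1234] ^= 0xFF

print("intact image verifies:   ", verify_image(original, expected))
print("corrupted image verifies:", verify_image(bytes(corrupted), expected))
```

A check like this turns sporadic, hard-to-attribute misbehavior into a detectable event on every transfer, which is how corruption that looked rare can turn out to be rampant.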
an existential risk for the artifact: without solving them, we wouldn’t have something that’s impaired – we would have nothing • Each revealed an emergent property, often at an interface boundary • The breakthrough was often something that “shouldn’t” have worked • Intelligence alone does not solve problems like this • In all cases, we summoned other elements of our character: our resilience, our teamwork, our rigor, our optimism, our curiosity
important to us that we have codified them – and use them very explicitly as a lens for hiring • To be clear, we are certainly seeking capable, intelligent people – but that intelligence is useless without these shared (human!) values • We may be more explicit about it than others, but many engineering teams are also implicitly hiring for shared values • Viz.: It is comical to think of an engineering team hiring based only on the results of a test – or any other linear measure of intelligence!
understand and resolve failure – so essential in designing and building – is hidden in the final artifact • This is the soul in Tracy Kidder’s Soul of a New Machine – and the perspiration in Edison’s proverbial 99% perspiration • Computer programs lack this humanity: they do not have willpower, desire, or drive – let alone the deeper human qualities required • Which doesn’t mean that AI can’t be useful to engineers, merely that it cannot engineer autonomously
due to AGI is de minimis – but we must not falsely dichotomize AI into posing existential risk or no risk whatsoever! • The risk that AI does pose may feel mundane – but it is much more how it will be abused (deliberately or accidentally) by existing structures • AI ethics is exceedingly important, especially when it is being used to inform decisions that affect people’s lives! • By acknowledging that AI is and will be an important tool, we can move beyond fear to focus on enforcing existing regulatory regimes
Eric Drexler debate on molecular nanotechnology • Lex Fridman interview with Marc Andreessen • Logan Bartlett interview with Eliezer Yudkowsky • Oxide and Friends podcast, especially Okay Doomer, Tales From the Bringup Lab and More Tales from the Bringup Lab