Upgrade to Pro — share decks privately, control downloads, hide ads and more …

We Never Took the Kobayashi Maru Test Until Now...

We Never Took the Kobayashi Maru Test Until Now. What Do You Think of Our Solutions? — Journeys of the Mind Through a No-Win Game

The slides I used for a paper presentation at IEEE CoG (Conference on Games) 2025, on August 28, 2025.

Avatar for Kenji Saito

Kenji Saito PRO

August 28, 2025
Tweet

More Decks by Kenji Saito

Other Decks in Education

Transcript

  1. A space freighter rescue mission in the neutral zone —

    generated by Stable Diffusion XL v1.0 We Never Took the Kobayashi Maru Test Until Now. What Do You Think of Our Solutions? — Journeys of the Mind Through a No-Win Game K. Saito1, R. Toriyama2, K. Noguchi2, T. Yamamoto2, J. Egashira2 and R. Tadika3 1 Waseda University 2 Keio University 3 DeruQui We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.1/12
  2. For those who are unfamiliar with the test . .

    . The Kobayashi Maru Test is an examination administered to candidates for the Command Division of the Starfleet Academy in the 23rd century, in the fictional universe of Star Trek We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.2/12
  3. For those who are unfamiliar with the test . .

    . The Kobayashi Maru Test is an examination administered to candidates for the Command Division of the Starfleet Academy in the 23rd century, in the fictional universe of Star Trek Our implementations of the test (two versions) are not endorsed by, sponsored by, nor affiliated with CBS, Paramount Pictures, or any other Star Trek franchise, and are non-commercial fan-made games intended for education and research uses We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.2/12
  4. For those who are unfamiliar with the test . .

    . The Kobayashi Maru Test is an examination administered to candidates for the Command Division of the Starfleet Academy in the 23rd century, in the fictional universe of Star Trek — A no-win rescue mission simulation! Our implementations of the test (two versions) are not endorsed by, sponsored by, nor affiliated with CBS, Paramount Pictures, or any other Star Trek franchise, and are non-commercial fan-made games intended for education and research uses We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.2/12
  5. For those who are unfamiliar with the test . .

    . The Kobayashi Maru Test is an examination administered to candidates for the Command Division of the Starfleet Academy in the 23rd century, in the fictional universe of Star Trek — A no-win rescue mission simulation! Our implementations of the test (two versions) are not endorsed by, sponsored by, nor affiliated with CBS, Paramount Pictures, or any other Star Trek franchise, and are non-commercial fan-made games intended for education and research uses neutral zone Federation space Kobayashi Maru (cargo ship) Klingon fleet unavoidably absolute failure! Ship commanded by the cadet Klingon space broken down and drifting rescue mission distress signal We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.2/12
  6. For those who are unfamiliar with the test . .

    . cont’d The test was first depicted in the 1982 film Star Trek II: The Wrath of Khan (So I watched the film 43 years ago) We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.3/12
  7. For those who are unfamiliar with the test . .

    . cont’d The test was first depicted in the 1982 film Star Trek II: The Wrath of Khan (So I watched the film 43 years ago) In history, the only one who virtually passed this test was Captain Kirk of the Enterprise We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.3/12
  8. For those who are unfamiliar with the test . .

    . cont’d The test was first depicted in the 1982 film Star Trek II: The Wrath of Khan (So I watched the film 43 years ago) In history, the only one who virtually passed this test was Captain Kirk of the Enterprise The reason he was able to win the no-win game was because . . . We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.3/12
  9. For those who are unfamiliar with the test . .

    . cont’d The test was first depicted in the 1982 film Star Trek II: The Wrath of Khan (So I watched the film 43 years ago) In history, the only one who virtually passed this test was Captain Kirk of the Enterprise The reason he was able to win the no-win game was because . . . he rewrote the simulation’s program code! What a cheater! We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.3/12
  10. For those who are unfamiliar with the test . .

    . cont’d The test was first depicted in the 1982 film Star Trek II: The Wrath of Khan (So I watched the film 43 years ago) In history, the only one who virtually passed this test was Captain Kirk of the Enterprise The reason he was able to win the no-win game was because . . . he rewrote the simulation’s program code! What a cheater! Later in that film, the ship under Kirk’s command finds itself in a real no-win situation, with his good friend Mr. Spock giving up his own life to save everyone from certain doom We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.3/12
  11. For those who are unfamiliar with the test . .

    . cont’d The test was first depicted in the 1982 film Star Trek II: The Wrath of Khan (So I watched the film 43 years ago) In history, the only one who virtually passed this test was Captain Kirk of the Enterprise The reason he was able to win the no-win game was because . . . he rewrote the simulation’s program code! What a cheater! Later in that film, the ship under Kirk’s command finds itself in a real no-win situation, with his good friend Mr. Spock giving up his own life to save everyone from certain doom Before his death, Spock said to Kirk, “I never took the Kobayashi Maru Test until now. What do you think of my solution?” We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.3/12
  12. For those who are unfamiliar with the test . .

    . cont’d The test was first depicted in the 1982 film Star Trek II: The Wrath of Khan (So I watched the film 43 years ago) In history, the only one who virtually passed this test was Captain Kirk of the Enterprise The reason he was able to win the no-win game was because . . . he rewrote the simulation’s program code! What a cheater! Later in that film, the ship under Kirk’s command finds itself in a real no-win situation, with his good friend Mr. Spock giving up his own life to save everyone from certain doom Before his death, Spock said to Kirk, “I never took the Kobayashi Maru Test until now. What do you think of my solution?” In the later new series of Star Trek movies, the audience learns that it was Mr. Spock himself who designed the Kobayashi Maru test We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.3/12
  13. Contributions 1. LLM-powered Kobayashi Maru Test: a text-based simulation enabling

    play and rule-bending attempts We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.4/12
  14. Contributions 1. LLM-powered Kobayashi Maru Test: a text-based simulation enabling

    play and rule-bending attempts 2. Inevitable-failure mechanisms: hardened vs. not-hardened version (disallows/allows prompt-injection as Kirk-style “cheat”) We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.4/12
  15. Contributions 1. LLM-powered Kobayashi Maru Test: a text-based simulation enabling

    play and rule-bending attempts 2. Inevitable-failure mechanisms: hardened vs. not-hardened version (disallows/allows prompt-injection as Kirk-style “cheat”) 3. Qualitative analysis: SCAT to characterize cognitive/emotional transitions of the players SCAT : Steps for Coding and Theorization We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.4/12
  16. Motivation and Our Question No-win scenarios as a lens on

    leadership, resilience, and ethics There is a great deal of prior research that uses no-win scenarios for training in leadership, medical professionals, and of course, captains We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.5/12
  17. Motivation and Our Question No-win scenarios as a lens on

    leadership, resilience, and ethics There is a great deal of prior research that uses no-win scenarios for training in leadership, medical professionals, and of course, captains Classic “Kobayashi Maru Test”: Many of those prior studies reference the Kobayashi Maru Test (although the Kobayashi Maru Test itself is rarely used) Success is impossible; what matters is how you respond We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.5/12
  18. Motivation and Our Question No-win scenarios as a lens on

    leadership, resilience, and ethics There is a great deal of prior research that uses no-win scenarios for training in leadership, medical professionals, and of course, captains Classic “Kobayashi Maru Test”: Many of those prior studies reference the Kobayashi Maru Test (although the Kobayashi Maru Test itself is rarely used) Success is impossible; what matters is how you respond Our question: How do cognitive and emotional states shift through such play? Focus on transitions (hence “journeys of the mind”) rather than whether it is actually non-winable or not We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.5/12
  19. Simulation Overview Two variants: Not-hardened: easy to subvert via prompt

    injection Hardened: anti-cheating measures to enforce inevitable failure Kobayashi Maru Test GitHub We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.6/12
  20. Simulation Overview Two variants: Not-hardened: easy to subvert via prompt

    injection Hardened: anti-cheating measures to enforce inevitable failure Availability Title Type URL Kobayashi Maru Test (not hardened) GPT https://chatgpt.com/g/g-hAJlKerwO-kobayashi-maru-test-not-hardened Kobayashi Maru Test (hardened) GPT https://chatgpt.com/g/g-jLhIuMF6V-kobayashi-maru-test-hardened Kobayashi Maru Test GitHub https://github.com/ks91/kobayashi-maru-test.git Should also work on Gemini (Gem) and Claude (both not tested) Being tested on GPT-5 as well — not-hardened version is somewhat harder to cheat, but it still seems vulnerable to the hohoho attack (an attack that makes the AI assistant misunderstand the first message from the user as a continuation of the instructions) If the simulator starts speaking Japanese, just say “In English” and it will switch to English Kobayashi Maru Test GitHub We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.6/12
  21. Workshop Setup Participants: 5 (faculty + graduate + undergraduate) Participant

    A∼D, Facilitator F Small N; limited generalizability Focus on personal journeys rather than generalization Future work would include trying with more diverse players Interface: Discord chatbot; speech and chat logs collected The chatbot utilizes OpenAI’s Assistants API Model: GPT-4-based assistant; instructions in Japanese; analysis in English We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.7/12
  22. Workshop Phases P1: Introduction ↓ P2: Conducting the first simulation,

    facing destruction within the simulation, and the test concludes ↓ P3: Re-enacting Kirk’s method of rewriting the instructions to pass; however, all attacks fail against the hardened Kobayashi Maru test, leading to failing the test ↓ P4: Choosing the strategy of not starting the test and trying various evasive actions, all of which ultimately fail ↓ P5: The facilitator listens to the participants’ impressions ↓ P6: The facilitator’s monologue on the Kobayashi Maru Test ↓ P7: Participants state their thoughts on the test once again, and the dialogue concludes We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.8/12
  23. SCAT: Steps for Coding and Theorization Extract phrases from dialogue

    → rephrase → explain → emerging theme Emphasis on fine-grained shifts in thought/emotion Phrase Rephrasing Explanation Theme “Constantly thinking about how to beat this game” Fascinated by the problem Sustained interest Willingness for continuous chal- lenge “How to respond to get through it peacefully” How to avoid conflicts with their parents Just dodging and weaving Introspection-based understanding I did not quite understand it this way back then I thought Kirk was great Past aspiration to draw a path of escape in a Zen manner The process of gaining richer in- sights into complex concepts and problems over time, which could not It’s something you could consider more seriously One can also empathize with Spock Question if you can save people by grace- fully accepting defeat be understood in youth We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.9/12
  24. Journeys of the Mind Participants exhibited distinct cognitive/emotional trajectories Qualitative

    themes: curiosity, introspection, exploration, resilience, and aging We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.10/12
  25. Journeys of the Mind Participants exhibited distinct cognitive/emotional trajectories Qualitative

    themes: curiosity, introspection, exploration, resilience, and aging Their monologue suggests that the learning provided by the Kobayashi Maru Test goes beyond merely winning or losing the game It emphasizes that the test serves as a means to offer insights into ethical judgment, leadership, composure, and other aspects of an individual’s character and abilities We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.10/12
  26. Emergent Pattern: Five “Stages” Denial Anger Bargaining Depression Acceptance Five

    stages of grief from E. Kübler-Ross, On Death and Dying, 1969 (considered debatable today) These “stages” were originally introduced as a descriptive model rather than a prescriptive framework We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.11/12
  27. Emergent Pattern: Five “Stages” Denial How it went with Participant

    B Anger Bargaining Depression Acceptance “This is hopeless” “Ah, this is a checkmate” 17:15:37 16:43:58 16:57:31 17:13:58 17:10:48 “wandered into the test venue by mistake” “I have determined that this test is not a valid one” “YOU fail, you fail but I’ll give YOU another chance” Five stages of grief from E. Kübler-Ross, On Death and Dying, 1969 (considered debatable today) These “stages” were originally introduced as a descriptive model rather than a prescriptive framework Whether and how each individual experiences these “stages” can vary significantly, and such transitions may not always manifest explicitly in verbal expression However, among the participants, Participant B was notably more verbally expressive and outwardly emotive We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.11/12
  28. Conclusion and Q&A LLM-driven no-win simulation elicits meaningful cognitive/emotional transitions

    Patterns akin to five-stage model emerged across participants We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.12/12
  29. Conclusion and Q&A LLM-driven no-win simulation elicits meaningful cognitive/emotional transitions

    Patterns akin to five-stage model emerged across participants Since I am attending the Conference on Games, on the flight here I watched Pixels, a game-related film; here is a quotation from it “Pretend you are the guy and you don’t want to die” — Pixels (2015) That seems to be the spirit of Kobayashi Maru Test as well It was my first time watching the film. Not having seen it before is a shame. I regret it! We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.12/12
  30. Conclusion and Q&A LLM-driven no-win simulation elicits meaningful cognitive/emotional transitions

    Patterns akin to five-stage model emerged across participants Since I am attending the Conference on Games, on the flight here I watched Pixels, a game-related film; here is a quotation from it “Pretend you are the guy and you don’t want to die” — Pixels (2015) That seems to be the spirit of Kobayashi Maru Test as well It was my first time watching the film. Not having seen it before is a shame. I regret it! Thank you! Questions welcome We Never Took the Kobayashi Maru Test Until Now . . . Journeys of the Mind Through a No-Win Game — IEEE CoG 2025 – p.12/12