James Ryan

Omnivore bricoleur system builder

Overview

I build abstractions, tools, and languages to help humans (and machines) author, control, and understand AI systems. While I specialize in art and entertainment, and in storytelling systems particularly, I've also worked in domains such as healthcare, applied linguistics, library science, and cybersecurity. Like many of my colleagues, I am deeply concerned about the dangers posed by ongoing advances in AI. I hope for a future in which humans can live meaning-filled lives, and I would like to do what I can to help secure such a future. Currently I am interested in investigating the degree to which tacit narrative modeling drives LLM behaviors, since this could make the narrative-systems toolset applicable to AI safety. In other words, can we control an LLM by manipulating (its modeling of) the story it is performing?

Approach

I was brought up in an omnivorous AI tradition that values a broad set of technical methods — symbolic, statistical, and neural — and especially larger configurations of techniques that expose levers for authorial control. Primarily I am interested in system building, which I view as a teleologically oriented act of bricolage: construct an assemblage of AI components, potentially from distinct paradigms, that allows a designer to make a system do their bidding. In recent years, I've been outfitting neural systems with symbolic control mechanisms, and increasingly I am interested in symbolic interpretability mechanisms, which I suspect will have structural similarities. More broadly, I'm fascinated by the history of computing, and by the wonder of computation as a physical phenomenon that our universe affords.

Initiatives
Sifty logo

2025–curr

I operate this boutique software studio specializing in author-centric AI systems for art and entertainment.

Aleator Press logo

2020–curr

I also run an indie press and art dealer specializing in computer-generated literature and early generative art.

SCMA logo

2020–2022

When I was on the faculty at Carleton College, I led a research group called the Studio for Computational Media Archaeology.

Selected Projects
Viv action tree

2022–curr

Story engine centered on a custom programming language.

Text from synthesized training data used for the Viv wizard

2026–curr

Viv-specialized coding assistant trained via fine-tuning LoRA adapters.

Image from the Catbird VR experience

2025

Story engine for a VR espionage game.

mPath AI notification

2024–2025

AI engine undergirding a platform that augments human life coaches.

Image from the VR experience in which Arnie was embedded

2024

AI-powered game master for a VR tabletop roleplaying game.

Image from a project that used Esper

2022–2024

Simulation engine with authorable domains.

Karavani chat transcript

2023–2024

Story engine for an augmented reality game played over WhatsApp.

Soloist example character

2023–2024

Engine for authorable RAG-augmented conversational characters.

MESSY-71 computer-generated story text

2020–2021

Reimplementation of an early story generator.

Wendit Tnce Inf page spread

2021–2022

Handmade letterpress booklet of computer-generated asemic poetry.

Program ERATO page spread

2020

Facsimile edition of an early volume of computer-generated poetry.

Program RETURNER page spread

2020

First published edition of an early work of computer-generated poetry.

Artificial Versifying hexameter table

2020–2021

First English translation of a Latin poetry generator from the 1670s.

SIENNA transcript

2018–2021

DARPA-funded chatbot that wastes the time of email scammers.

Sheldon County title graphic

2018–2019

Proof of concept for a computer-generated radio drama.

Hennepin simulation log

2018–2019

Simulation engine centering on character actions.

Expressionist grammar expansion trace

2015–2018

Authoring tool for natural language generation in expressive domains, like videogames.

Academical in-game scene

2017–2018

Interactive narrative game for teaching responsible conduct of research.

Bad News live performance

2015–2017, 2019, 2023

Work of immersive theater whose story and setting are uniquely generated by a computer simulation.

Talk of the Town simulation log

2015–2017

Simulation engine centering on character knowledge.

GameSpace galaxy visualization

2015–2017

Explorable 3D visualization of the videogame medium.

GameNet interface

2015

Tool for videogame discovery in the form of a hypertext network in which related games are linked.

GameSage interface

2015

Search engine that accepts an idea for a game and returns the most related existing games.

GameGlobs interface

2015

Interactive visualization of various clusterings of the videogame medium.

Islanders ship-exploration visualization

2013–2014

Simulationist roguelike text adventure.

VFClust clustering diagram

2012–2013

Clinical tool augmenting analysis of cognitive testing via techniques like latent semantic analysis.

Experience
Principal, Sifty
2025–curr
I operate Sifty, a software studio specializing in author-centric AI systems for art and entertainment.
Proprietor, Aleator Press
2020–curr
I run this indie press and art dealer specializing in computer-generated literature and early generative art.
Head of AI, mPath AI
2024–2025
Built the AI engine and monitoring dashboard for a platform for life coaching.
Narrative Systems Lead, Hexagram
2022–2024
Led a team of authors and engineers building next-generation story engines for videogames and other interactive media.
Visiting Assistant Professor, Carleton College
2020–2022
Full-time instructor in the Computer Science department at the perennial #1 school for undergraduate teaching in the US.
Research Scientist, BBN Technologies
2018–2021
Principal investigator leading a $7M multi-year DARPA project centering on a conversational AI system that engages with email scammers to waste their time.
AI Specialist, Spirit AI
2016–2017
Built the NLG capabilities for the startup's Character Engine product.
Research Assistant, Expressive Intelligence Studio
2013–2018
Conversational AI, social simulation, procedural narrative, discovery systems, and more. Lab directed by Michael Mateas and Noah Wardrip-Fruin.
Research Assistant, Natural Language and Dialogue Systems Lab
2013–2015
Conversational AI and narrative modeling. Lab directed by Marilyn Walker.
Research Assistant, NLP/IE Group
2010–2013
Data annotation and system building for clinical ML applications. Lab directed by Serguei Pakhomov and Genevieve Melton-Meaux.
Education
PhD, Computational Media, University of California, Santa Cruz
2013–2018
Thesis: Curating Simulated Storyworlds
MS, Computer Science, University of California, Santa Cruz
2013–2016
MS, Health Informatics (minor: Cognitive Science), University of Minnesota
2011–2013
Thesis: A System for Computerized Analysis of Verbal Fluency Tests
BA, Linguistics, University of Minnesota
2009–2011
AA, Liberal Arts, Normandale Community College
2006–2008
Publications
Lexical Production and Organisation in L2 EFL and L3 EFL Learners: A Distributional Semantic Analysis of Verbal Fluency

Almudena Fernández-Fontecha, Rosa M. Jiménez Catalán, and James Ryan

International Journal of Multilingualism 21(1), 2024

The Use of Lexical Retrieval Strategies by Creative Second Language Learners: A Computational Analysis of Clustering and Switching

Almudena Fernández-Fontecha and James Ryan

Studies in Second Language Learning and Teaching 13(3), 2023

Strategies for Investigating and Eliciting Information from Nuanced Attackers (SIENNA)

Brian Krisler et al.

technical report, 2022

A Quantified Analysis of Bad News for Story Sifting Interfaces

Ben Samuel et al.

14th International Conference on Interactive Digital Storytelling, 2021

Academical: A Choice-Based Interactive Storytelling Game for Enhancing Moral Reasoning, Knowledge, and Attitudes in Responsible Conduct of Research

Katherine M. Grasse et al.

Games and Narrative: Theory and Practice, 2021

Casual Creator Cursed Problems, or: How I Learned to Start Worrying and Love Designers

Adam Summerville et al.

AIIDE Workshops, 2021

English Versification for the Billion: Translating the Early Latin Poetry Generator “Artificial Versifying” (1677)

Kavita Berg, Hannah Koelling, and James Ryan

Annual Meeting of the Electronic Literature Organization, 2021

Getting Academical: A Choice-Based Interactive Storytelling Game for Teaching Responsible Conduct of Research

Edward F. Melcer et al.

15th International Conference on the Foundations of Digital Games, 2020

How to Tame Your Data: Data Augmentation for Dialog State Tracking

Adam Summerville et al.

2nd Workshop on Natural Language Processing for Conversational AI, 2020

Curating Simulated Storyworlds

James Ryan

PhD thesis, University of California, Santa Cruz, 2018

The 13th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment

Brian Magerko et al.

AI Magazine 39(2) (conference report), 2018

1st Workshop on the History of Expressive Systems

James Ryan and Mark Nelson

10th International Conference on Interactive Digital Storytelling (workshop), 2017

A Computational Analysis of Verbal Fluency in Schizophrenia

Jacqueline Roig et al.

16th International Conference on Schizophrenia Research, 2017

A Garden, A Forking Path: Interactive Branching Narrative in The Lady of May (1578)

James Ryan

1st Workshop on the History of Expressive Systems, 2017

Analyzing Expressionist Grammars by Reduction to Symbolic Visibly Pushdown Automata

Joseph C. Osborn, James Ryan, and Michael Mateas

10th Workshop on Intelligent Narrative Technologies, 2017

GameSpace: An Explorable Visualization of the Videogame Medium

James Ryan et al.

University of California, Santa Cruz (technical report), 2017

Grimes’ Fairy Tales: A 1960s Story Generator

James Ryan

10th International Conference on Interactive Digital Storytelling, 2017

Simulating Character Knowledge Phenomena in Talk of the Town

James Ryan and Michael Mateas

Game AI Pro 3 (book chapter), 2017

Translation of “La Simulación”

Joseph E. Grimes, Rogelio E. Cardona-Rivera, and James Ryan

translation, 2017

A Lightweight Videogame Dialogue Manager

James Ryan, Michael Mateas, and Noah Wardrip-Fruin

1st Joint International Conference of DiGRA and FDG, 2016

A Simple Method for Evolving Large Character Social Networks

James Ryan, Michael Mateas, and Noah Wardrip-Fruin

5th Workshop on Social Believability in Games, 2016

A Typology of Verbs Culled From 23,000 Videogame Walkthroughs

James Ryan and Sergiy Ravnyago

1st Joint International Conference of DiGRA and FDG, 2016

Bad News: An Experiment in Computationally Assisted Performance

Benjamin Samuel et al.

9th International Conference on Interactive Digital Storytelling, 2016

Bad News: A Game of Death and Communication

James Ryan, Adam Summerville, and Ben Samuel

34th Annual ACM Conference on Human Factors in Computing Systems, 2016

CFGs-2-NLU: Sequence-to-Sequence Learning for Mapping Utterances to Semantics and Pragmatics

Adam J. Summerville et al.

Technical Report UCSC-SOE-16-11, 2016

Characters Who Speak Their Minds: Dialogue Generation in Talk of the Town

James Ryan, Michael Mateas, and Noah Wardrip-Fruin

12th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2016

Computatrum Personae: Toward a Role-Based Taxonomy of (Computationally Assisted) Performance

Benjamin Samuel et al.

3rd Workshop on Experimental AI in Games, 2016

Diegetically Grounded Evolution of Gameworld Languages

James Ryan

7th Workshop on Procedural Content Generation, 2016

GameNet and GameSage: Videogame Discovery as Design Insight

James Ryan et al.

1st Joint International Conference of DiGRA and FDG, 2016

Expressionist: An Authoring Tool for In-Game Text Generation

James Ryan et al.

9th International Conference on Interactive Digital Storytelling, 2016

Generating American Small Towns for Narrative Applications

James Ryan

1st Workshop on Tutorials in Intelligent Narrative Technologies, 2016

Generative Character Conversations for Background Believability and Storytelling

James Ryan, Michael Mateas, and Noah Wardrip-Fruin

5th Workshop on Social Believability in Games, 2016

Juke Joint: A Demo

Tyler Brothers and James Ryan

3rd Workshop on Experimental AI in Games, 2016

Juke Joint: Characters Who Are Moved By Music

James Ryan et al.

3rd Workshop on Experimental AI in Games, 2016

Playable Experiences at AIIDE 2016

Alexander Zook et al.

12th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2016

Recognizing Coherent Narrative Blog Content

James Ryan and Reid Swanson

9th International Conference on Interactive Digital Storytelling, 2016

Translating Player Dialogue into Meaning Representations Using LSTMs

James Ryan et al.

16th International Conference on Intelligent Virtual Agents, 2016

White Matter Correlates of Semantic Fluency in Parkinson’s Disease

Megan Ichinose et al.

71st Annual Meeting of the Society for Biological Psychiatry, 2016

Augmented Exploration of Library Videogame Holdings by Techniques from Computational Linguistics

Glynn Edwards et al.

9th Annual Society of American Archivists Science, Technology, and Healthcare Roundtable, 2015

Bad News: A Computationally Assisted Live-Action Prototype to Guide Content Creation

James Ryan et al.

2nd Workshop on Experimental AI in Games, 2015

Domain Adaption of Parsing for Operative Notes

Yan Wang et al.

Journal of Biomedical Informatics, 2015

Generating Natural Language Retellings from Prom Week Play Traces

Christopher Antoun et al.

6th Workshop on Procedural Content Generation, 2015

Large-Scale Visualizations of Nearly 12,000 Digital Games

James Ryan et al.

10th International Conference on the Foundations of Digital Games, 2015

Open Design Challenges for Interactive Emergent Narrative

James Ryan, Michael Mateas, and Noah Wardrip-Fruin

8th International Conference on Interactive Digital Storytelling, 2015

People Tend to Like Related Games

James Ryan et al.

10th International Conference on the Foundations of Digital Games, 2015

Tools for Videogame Discovery Built Using Latent Semantic Analysis

James Ryan et al.

10th International Conference on the Foundations of Digital Games, 2015

Toward Characters Who Observe, Tell, Misremember, and Lie

James Ryan et al.

2nd Workshop on Experimental AI in Games, 2015

Toward Natural Language Generation by Humans

James Ryan et al.

8th Workshop on Intelligent Narrative Technologies and 4th Workshop on Social Believability in Games, 2015

What We Talk About When We Talk About Games: Bottom-Up Game Studies Using Natural Language Processing

James Ryan et al.

10th International Conference on the Foundations of Digital Games, 2015

A Sense Inventory for Clinical Abbreviations and Acronyms Created Using Clinical Notes and Medical Dictionary Resources

SungRim Moon et al.

Journal of the American Medical Informatics Association, 2014

Automating Direct Speech Variations in Stories and Games

Stephanie M. Lukin, James Ryan, and Marilyn A. Walker

3rd Workshop on Games and NLP, 2014

Combinatorial Dialogue Authoring

James Ryan et al.

7th International Conference on Interactive Digital Storytelling, 2014

Toward Recombinant Dialogue in Interactive Narrative

James Ryan, Marilyn Walker, and Noah Wardrip-Fruin

7th Workshop on Intelligent Narrative Technologies, 2014

A System for Computerized Analysis of Verbal Fluency Tests

James Ryan

Master’s thesis, University of Minnesota, 2013

Computerized Analysis of a Verbal Fluency Test

James Ryan et al.

51st Annual Meeting of the Association for Computational Linguistics, 2013

A Study of Actions in Operative Notes

Yan Wang et al.

American Medical Informatics Association Annual Symposium, 2012

Automated Non-Alphanumeric Symbol Resolution in Clinical Texts

SungRim Moon et al.

American Medical Informatics Association Annual Symposium, 2011

Selected Press
How the Mixed Reality Game 'Bad News' Brings Towns Like 'Twin Peaks' to Life

Rolling Stone, 2017

Find your next must-play game by flying through a virtual galaxy

New Scientist, 2017

An Artificial Intelligence is Generating an 'Infinite' Podcast

VICE, 2018

Boffin rediscovers 1960s attempt to write fiction with computers

The Register, 2017

Video games where people matter? The strange future of emotional AI

The Guardian, 2016

Cleverness isn't everything for a gaming artificial intelligence

New Scientist, 2016

Viv action tree

Viv

2022–curr

In emergent narrative, stories bubble up from a simulation of characters as autonomous agents in a world, making it a popular alternative to handcrafted videogame storylines. This was the topic of my PhD thesis, which introduced a concept I called story sifting: the task of automatically identifying storylines as they emerge from a simulation, including by the simulation itself — or by its agents from within that simulation. Imagine a game where characters understand and deal in the unwritten stories that happen to be happening around them.

Viv centers on a custom programming language that allows technical narrative designers to define constructs that drive both character simulation (actions and plans) and story sifting (queries and story templates) in a project with emergent narrative. The language is fully featured and Turing-complete, with a notation hardened over hundreds of hours of actual authoring use. The Viv compiler (published on PyPI) transforms Viv source code into a content bundle, over which the Viv runtime (published on npm) operates for action selection and story sifting in a running application.
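To give a feel for what story sifting means in practice, here is a minimal Python sketch. It is not Viv code and does not use the Viv compiler or runtime; the `Event` schema, the `sift_betrayals` function, and the "help then harm" pattern are all hypothetical stand-ins chosen for illustration. The idea is simply that a sifting query scans the chronicle of simulated events for a sequence that reads as a storyline.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    """One entry in a simulation's chronicle (hypothetical schema)."""
    time: int
    action: str
    actor: str
    target: str


def sift_betrayals(chronicle: List[Event]) -> Iterator[Tuple[Event, Event]]:
    """Yield (help, harm) pairs where an actor helps someone and later harms them."""
    ordered = sorted(chronicle, key=lambda e: e.time)
    for i, first in enumerate(ordered):
        if first.action != "help":
            continue
        for later in ordered[i + 1:]:
            same_pair = later.actor == first.actor and later.target == first.target
            if later.action == "harm" and same_pair:
                yield first, later
                break  # report only the earliest harm for each helpful act


chronicle = [
    Event(1, "help", "ana", "ben"),
    Event(2, "help", "ben", "cy"),
    Event(3, "harm", "cy", "ana"),
    Event(4, "harm", "ana", "ben"),
]

matches = list(sift_betrayals(chronicle))
for helped, harmed in matches:
    print(f"{helped.actor} helped {helped.target} at t={helped.time}, "
          f"then harmed them at t={harmed.time}")
```

A real sifting language would let an author declare such patterns rather than hand-code the loops, which is presumably where a dedicated notation earns its keep.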

Viv also ships editor plugins for JetBrains IDEs, VS Code, and Sublime Text, and a Claude Code plugin that turns Claude into a Viv expert.

Text from synthesized training data used for the Viv wizard

Viv Wizard

2026–curr

The Viv wizard is a forthcoming AI-powered authoring tool for creating, debugging, and refining Viv content. Concretely, it will take the form of a fine-tuned LLM trained as a series of LoRA adapters sequentially merged into a base model, Qwen2.5-Coder-7B.

It will be able to serve as a partner in all kinds of Viv work, through a command-line interface and potentially as a component in a Viv editor plugin. While it will likely not hold up to the Viv Claude Code plugin, it will be quite inexpensive and it could be run in commercial contexts with concerns around intellectual property. It’s also a fun project that is helping me to level up my skills when it comes to training LLMs.

This project is underway but not yet complete. The plan is for training to proceed in three distinct stages, each of which builds on the previous one:

  • Domain-adaptive pretraining.
    • The pipeline is implemented, a first training run has been conducted, and an evaluation module is partly written.
  • Supervised fine-tuning.
    • Not yet started.
  • Reinforcement learning (RLAIF, RLVR).
    • Not yet started.

After each stage, the LoRA adapter is merged into the base model, producing a standalone model for the next stage. The final artifact is a full custom model, rather than an adapter.
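The merge step can be illustrated with a toy example. The sketch below uses plain Python lists in place of real model weights and applies the standard LoRA merge, W' = W + (alpha / r) * B * A, once per stage. The matrices, stage labels, and `alpha` value are made up for illustration; this is not the actual training pipeline, which would more plausibly rely on a library such as Hugging Face PEFT.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Multiply two small dense matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight: Matrix, lora_b: Matrix, lora_a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B @ A for one weight matrix, leaving W untouched."""
    rank = len(lora_a)  # A has shape (r, d_in), B has shape (d_out, r)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy 2x2 "base model" weight and one rank-1 adapter (B, A) per training stage.
base = [[1.0, 0.0], [0.0, 1.0]]
stages = [
    ("domain-adaptive pretraining", [[1.0], [0.0]], [[0.0, 2.0]]),
    ("supervised fine-tuning", [[0.0], [1.0]], [[3.0, 0.0]]),
    ("reinforcement learning", [[1.0], [1.0]], [[0.5, 0.5]]),
]

weight = base
for name, lora_b, lora_a in stages:
    # Each stage's adapter is folded into the weights that the next stage starts from.
    weight = merge_lora(weight, lora_b, lora_a, alpha=1.0)
    print(f"after {name}: {weight}")
```

Because each merge produces an ordinary weight matrix, the final result carries no adapter at all, which matches the goal of shipping a standalone model.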

Image from the Catbird VR experience

Catbird

2025

Working for Hexagram, I built the story engine (and associated authoring apparatus) for a larger proof-of-concept VR experience, codenamed Catbird, which our team successfully delivered to a client.

The demo is structured into multiple episodes, each of which is made up of a series of one-on-one conversations with NPCs. These conversations play out in real time, and the player participates simply by speaking aloud. Ultimately this is an espionage game in which the player gathers information and then uses that information to sway character decision-making, all of which is achieved primarily through these conversational interactions.

While the interaction paradigm is open-ended, the larger experience is deeply authored by means of the story engine that I created, which operates as a beat system in the lineage of Façade (2005).

Using a custom YAML notation — in lieu of a proper DSL, because of time considerations — the team’s author defined the game’s episodes, scenes, and dramatic beats in terms of a series of events that trigger in certain contexts and execute effects when they occur. The primary form of trigger is defined in natural language, describing a potential situation or occurrence. Such triggers are evaluated by a task-specific LLM component that monitors ongoing gameplay to determine if these descriptions obtain, in which case the associated trigger fires, causing its effects to be executed. As such, the system works like a symbolic AI system, and in particular a production system, but with LLMs swapped in to evaluate conditions — a brittleness point for a purely symbolic approach when used in domains like conversation.
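The trigger-and-effect loop described above can be sketched as a tiny production system with a pluggable condition evaluator. Everything here is hypothetical: the `Rule` structure, the effect names, and the assumption that a trigger fires at most once are mine, and the keyword-matching stub merely stands in for the task-specific LLM component, which is not reproduced.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A condition evaluator takes (trigger description, transcript) and reports whether it holds.
Evaluator = Callable[[str, List[str]], bool]


@dataclass
class Rule:
    trigger: str          # natural-language description of a situation
    effects: List[str]    # names of effects to execute when the trigger fires
    fired: bool = False   # assumption: each trigger fires at most once


def make_stub_evaluator(keywords: Dict[str, List[str]]) -> Evaluator:
    """Build a keyword-matching stand-in for the LLM that judges triggers."""
    def evaluate(trigger: str, transcript: List[str]) -> bool:
        text = " ".join(transcript).lower()
        return any(word in text for word in keywords.get(trigger, []))
    return evaluate


def step(rules: List[Rule], transcript: List[str], holds: Evaluator) -> List[str]:
    """Run one match-and-fire cycle and return the effects that executed."""
    executed: List[str] = []
    for rule in rules:
        if not rule.fired and holds(rule.trigger, transcript):
            rule.fired = True
            executed.extend(rule.effects)
    return executed


rules = [
    Rule("The player mentions the stolen ledger", ["assign_task:deflect_suspicion"]),
    Rule("The player threatens the character", ["end_scene"]),
]
holds = make_stub_evaluator({
    "The player mentions the stolen ledger": ["ledger"],
    "The player threatens the character": ["or else"],
})

transcript = ["Player: Lovely weather tonight."]
first = step(rules, transcript, holds)
transcript.append("Player: I heard a ledger went missing.")
second = step(rules, transcript, holds)
third = step(rules, transcript, holds)
print(first, second, third)
```

Swapping `holds` for a function that prompts a language model would leave the rest of the loop unchanged, which is the appeal of keeping the symbolic scaffold and replacing only the condition check.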

Meanwhile, the primary form of effect is a task that is assigned to a character and is resolved when an associated completion trigger or termination trigger holds. The core LLM component is the dialogue generator, which is tasked with driving character improvisation so as to opportunistically pursue the assigned tasks while still maintaining believability.

This is the third beat system I’ve created, all of them proprietary, but I have plans for creating a fourth refinement for public release.

Collaborators

Hexagram team

mPath AI notification

mPath AI Engine

2024–2025

I spent a year working full-time as Head of AI at mPath AI, a startup that aimed to democratize life coaching by scaling up the capacities of expert human coaches through the use of generative AI. My job was to build our AI engine and also a monitoring dashboard.

While mPath clients would still meet periodically with their coach, in between sessions they could chat with their coach’s “AI extension” by using the mPath mobile app, which shipped on both iPhone and Android. Each coach’s extension was custom-built to capture what we called their “coachess”: their strategies, philosophies, life history, manner of speaking, and various other concerns that characterize how they coach. Concretely, this was structured data that was ingested by the AI engine at various points.

At the heart of the AI engine was the waypoint system, which operated over a library of authored waypoints to structure the coaching strategies undertaken by an AI extension. Using the metaphor of a coach–client relationship being a kind of journey through coaching space, a waypoint is a coaching microstrategy undertaken at a given point along the path. Each coach’s waypoint library was structured as a graph, where the nodes are individual waypoints and the edges connect waypoints. When the system enters a waypoint, an authored set of coaching goals is adopted, and a set of transition triggers becomes active. Each trigger is specified in natural language, describing a potential situation or occurrence, which is evaluated by a task-specific LLM component that monitors the ongoing chat and fires the event should the description obtain at some point. When a transition trigger fires, its associated target waypoint is entered. In this way, the system navigates through sequences of authored coaching microstrategies in a way that is reactive to the constantly shifting context of the coach–client chat.
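As a rough illustration of the waypoint graph, the following Python sketch models waypoints as nodes with goals and natural-language transition triggers. The class names, the example library, and the cue-phrase judge are all hypothetical; in particular, `stub_judge` is a crude stand-in for the LLM component that actually evaluated triggers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Waypoint:
    goals: List[str]
    # Maps a natural-language trigger description to the id of the target waypoint.
    transitions: Dict[str, str] = field(default_factory=dict)


class WaypointNavigator:
    def __init__(self, library: Dict[str, Waypoint], start: str,
                 holds: Callable[[str, List[str]], bool]):
        self.library = library
        self.current = start
        self.holds = holds

    @property
    def goals(self) -> List[str]:
        """Coaching goals adopted while the current waypoint is active."""
        return self.library[self.current].goals

    def observe(self, chat: List[str]) -> str:
        """Check the active triggers and enter the first target whose trigger holds."""
        for trigger, target in self.library[self.current].transitions.items():
            if self.holds(trigger, chat):
                self.current = target
                break
        return self.current


# Stand-in judge: looks for a cue phrase in the latest message (an LLM did this job).
CUES = {
    "The client names a concrete goal": "i want to",
    "The client expresses self-doubt": "i can't",
}


def stub_judge(trigger: str, chat: List[str]) -> bool:
    return CUES[trigger] in chat[-1].lower()


library = {
    "explore": Waypoint(["Ask open questions"], {
        "The client names a concrete goal": "plan",
        "The client expresses self-doubt": "reassure",
    }),
    "reassure": Waypoint(["Reflect strengths back"], {
        "The client names a concrete goal": "plan",
    }),
    "plan": Waypoint(["Break the goal into next steps"]),
}

nav = WaypointNavigator(library, "explore", stub_judge)
chat = ["Client: Honestly, I can't see myself leading a team."]
after_doubt = nav.observe(chat)
chat.append("Client: I want to apply for the lead role anyway.")
after_goal = nav.observe(chat)
print(after_doubt, after_goal, nav.goals)
```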

The AI engine that I built made heavy use of special-purpose LLM tasks that kicked in at various frequencies to maintain data capturing concerns like the client’s life history, goals and desires, and coaching preferences. My favorite subsystem was the check-in system, which allowed the AI coach to form volitions around reaching out to a client proactively. For example, if the coach and client discussed an upcoming job interview, the AI coach might reach out to the client an hour before it to encourage them, or an hour after it to see how it went.
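The scheduling side of the check-in idea can be sketched as follows. This is a simplified illustration under stated assumptions: the `CheckIn` record, the fixed message templates, and the choice to queue both a "before" and an "after" check-in are hypothetical, and the real system's decision about whether to reach out at all was made by LLM tasks that are not modeled here.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(order=True)
class CheckIn:
    send_at: datetime
    message: str


def plan_check_ins(event_name: str, event_time: datetime,
                   lead: timedelta = timedelta(hours=1),
                   lag: timedelta = timedelta(hours=1)) -> List[CheckIn]:
    """Propose one check-in shortly before an event and one shortly after it."""
    return [
        CheckIn(event_time - lead, f"Good luck with your {event_name}!"),
        CheckIn(event_time + lag, f"How did your {event_name} go?"),
    ]


def due(queue: List[CheckIn], now: datetime) -> List[CheckIn]:
    """Pop and return every queued check-in whose send time has arrived."""
    ready: List[CheckIn] = []
    while queue and queue[0].send_at <= now:
        ready.append(heapq.heappop(queue))
    return ready


interview = datetime(2025, 3, 4, 14, 0)
queue = plan_check_ins("job interview", interview)
heapq.heapify(queue)  # earliest check-in sits at the front of the queue

sent = due(queue, datetime(2025, 3, 4, 13, 30))
for check_in in sent:
    print(check_in.send_at, check_in.message)
```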

While a partner team built the mobile app, I was solely responsible for building and deploying our AI engine, which was an Express app backed by MongoDB and deployed on AWS as an ECS Fargate task behind an ALB. I also built a monitoring dashboard with the same backend stack and a React frontend.

Collaborators

mPath team

Image from the VR experience in which Arnie was embedded

Arnie

2024

Working for Hexagram, I built the story engine (and associated authoring apparatus) for a larger proof-of-concept VR experience that our team successfully delivered to a client.

Specifically, this experience was an augmented tabletop roleplaying game with an LLM-powered game master named Arnie. Like a conventional game master, Arnie is responsible for modeling the storyworld, tracking state, performing NPCs (with real-time conversation), and progressing the story, but it does so through task-specific LLM calls.

In lieu of the “Yes, and” aesthetic typical of LLM-mediated narrative experiences, we sought to produce a highly structured flow that felt more like a human-mediated tabletop experience. To support this aesthetic, I created a campaign authoring scheme that centers on a beat system in the lineage of Façade (2005).

Here, ‘beat’ refers to the notion of a dramatic beat, which in this story engine is structured as a set of goals that Arnie will pursue while the beat is active. Some of these goals pertain to NPC performances, while others would cause additional subsystems to be invoked, such as for running a skill check or introducing a new storyworld element. Additionally, each beat has a set of triggers associated with potential transitions into other beats, such that if a trigger fires the character will transition into its associated target beat. Critically, these triggers are defined in natural language and evaluated by an LLM that has access to all chat transcripts for a given player. As such, the system works like a symbolic AI system, and in particular a production system, but with LLMs swapped in to evaluate conditions — a brittleness point for a purely symbolic approach when used in domains like conversation.

Beyond the beat system, the Arnie AI engine comprises over a dozen task-specific LLM-powered modules, such as an entity generator, speaker selector, movement tracker, and more.

Collaborators

Hexagram team

Image from a project that used Esper

Esper

2022–2024

During my time as Narrative Systems Lead at Hexagram, my main project was Esper, a domain-agnostic simulation engine.

Esper works much like my earlier simulation engine Hennepin, except that rather than operating over a static authored domain (an American county), Esper ingests and simulates arbitrary authored domains. The impetus for this design constraint was twofold: we wanted to use the engine in multiple projects, each with its own domain, and we wanted non-technical writers on the team to author those domains, which necessitated an authoring interface friendlier than code.

Like Hennepin, Esper centers on concerns like businesses and character actions, but these concerns are defined in a spreadsheet notation tailored to the videogame writers on the team. Additionally, the engine features a real-time subsystem where characters meet their needs by pathfinding to smart objects — like businesses, other characters, and props — in the style of The Sims.
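For readers unfamiliar with the smart-object pattern, here is a minimal Python sketch of need-driven target selection. It is not Esper code: the scoring formula, the need scale, and the example objects are hypothetical, and straight-line distance stands in for real pathfinding.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Optional, Tuple


@dataclass
class SmartObject:
    name: str
    position: Tuple[float, float]
    satisfies: Dict[str, float]  # need -> amount restored on use


@dataclass
class Character:
    name: str
    position: Tuple[float, float]
    needs: Dict[str, float]  # need -> level in [0, 1]; lower means more urgent


def score(character: Character, obj: SmartObject) -> float:
    """Weigh how much relief an object offers against how far away it is."""
    relief = sum((1.0 - character.needs.get(need, 1.0)) * amount
                 for need, amount in obj.satisfies.items())
    return relief / (1.0 + dist(character.position, obj.position))


def choose_target(character: Character, objects: List[SmartObject]) -> Optional[SmartObject]:
    """Pick the best-scoring smart object, or None when there are no objects."""
    return max(objects, key=lambda obj: score(character, obj), default=None)


def use(character: Character, obj: SmartObject) -> None:
    """Move the character to the object and restore the needs it satisfies."""
    character.position = obj.position
    for need, amount in obj.satisfies.items():
        character.needs[need] = min(1.0, character.needs.get(need, 0.0) + amount)


objects = [
    SmartObject("diner", (4.0, 3.0), {"hunger": 0.8}),
    SmartObject("park bench", (1.0, 0.0), {"energy": 0.3}),
    SmartObject("vending machine", (0.0, 1.0), {"hunger": 0.2}),
]
rosa = Character("Rosa", (0.0, 0.0), {"hunger": 0.1, "energy": 0.7})

target = choose_target(rosa, objects)
if target is not None:
    use(rosa, target)
    print(f"{rosa.name} went to the {target.name}; needs are now {rosa.needs}")
```

In this toy run the distant diner beats the nearby vending machine because it relieves far more hunger, which is the kind of trade-off the pattern is meant to surface.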

Esper was used in multiple client projects that were successfully delivered.

This is the most recent in a family of four simulation engines that I have developed, and currently I am working on creating a production-grade fifth refinement for public release.

Collaborators

Hexagram team

Karavani chat transcript

Karavani

2023–2024

Working for Hexagram, I built the story engine (and associated authoring apparatus) for a larger proof-of-concept augmented reality game (ARG), codenamed Karavani, which our team successfully delivered to a client.

The experience begins when a player finds a phone number for one of the characters that our authoring team created, which the backend team deployed as WhatsApp bots. These numbers were distributed in various creative ways, across both physical and digital artifacts cleverly placed in the wild. Once a player engages with a character, the experience largely takes the form of a series of chat conversations that take place over WhatsApp.

While the interaction paradigm is open-ended, the larger experience is deeply authored by means of the story engine that I created, which operates as a beat system in the lineage of Façade (2005). Here, ‘beat’ refers to the notion of a dramatic beat, which in this story engine is structured as a set of goals that the character will pursue while the beat is active. Additionally, each beat has a set of triggers associated with potential transitions into other beats, such that if a trigger fires the character will transition into its associated target beat. Critically, these triggers are defined in natural language and evaluated by an LLM that has access to all chat transcripts for a given player. As such, the system works like a symbolic AI system, and in particular a production system, but with LLMs swapped in to evaluate conditions — a brittleness point for a purely symbolic approach when used in domains like conversation.

This was the first of three beat systems that I’ve created, all of them proprietary, but I have plans for creating a fourth refinement for public release.

Collaborators

Hexagram team

Soloist example character

Soloist

2023–2024

While I was working at Hexagram, we had a number of client projects that each entailed the development of a single conversational character who could serve as an expert in open-ended chat about a particular domain (e.g., an author’s collected works). At the time, LLM context windows were still quite cramped at ~4000 tokens, so a character’s full domain knowledge couldn’t fit into the prompt. This led us to retrieval-augmented generation (RAG), a technique whereby relevant textual material is retrieved on demand and inserted into the prompt for a given generation instance.

Soloist was a tool for authoring and running RAG-powered conversational characters. Using a simple spreadsheet notation, our clients defined concerns such as domain knowledge (in an FAQ scheme), character emotes and their triggers, and configuration parameters controlling conversational behaviors. The Soloist compiler then processed this material to create a bundle defining the character, including a vector DB containing vectorized domain knowledge.
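The retrieval step over FAQ-style domain knowledge can be sketched in a few lines of Python. This is an illustration only, not Soloist's implementation: the FAQ entries are invented, the prompt template is hypothetical, and bag-of-words cosine similarity stands in for the learned embeddings and vector DB that a production system would use.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words vector (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical FAQ-style domain knowledge for a character who is an author.
FAQ = [
    ("When was the first novel published?", "The first novel came out in 1987."),
    ("Where does the author live?", "The author lives in a small coastal town."),
    ("What inspired the trilogy?", "The trilogy was inspired by family letters."),
]

# "Compile" step: vectorize each entry once, ahead of any conversation.
INDEX = [(vectorize(question + " " + answer), question, answer) for question, answer in FAQ]


def retrieve(query: str, k: int = 1) -> List[Tuple[str, str]]:
    """Return the k FAQ entries most similar to the query."""
    query_vector = vectorize(query)
    ranked = sorted(INDEX, key=lambda entry: cosine(query_vector, entry[0]), reverse=True)
    return [(question, answer) for _, question, answer in ranked[:k]]


def build_prompt(query: str, k: int = 1) -> str:
    """Insert the retrieved material into the prompt for this generation instance."""
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in retrieve(query, k))
    return f"Relevant knowledge:\n{context}\n\nUser: {query}\nCharacter:"


prompt = build_prompt("What inspired you to write the trilogy?")
print(prompt)
```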

The Soloist runtime plugged into Hexagram’s cloud platform, Saga, which exposed various hooks for integrating a character into a project (web API, Unreal plugin, etc.).

Collaborators

Hexagram team

MESSY-71 computer-generated story text

MESSY-71

2020–2021

A half century ago, in the early months of 1971, the University of Wisconsin computer — a Burroughs B5500 mainframe — was on the verge of writing a novel when it disappeared quite suddenly from the campus. The night before, professor Sheldon Klein and a trio of students had submitted a first test run of their novel-writing program, which terminated early as a result of errors that they would not be able to rectify until the arrival of a new university computer. While they waited for a UNIVAC 1108 to be delivered, Klein and his students authored a technical report on their progress: the remarkably titled “A Program for Generating Reports on the Status and History of Stochastically Modifiable Semantic Models of Arbitrary Universes.”

Intriguingly, Klein’s technical report contains source code, annotated outputs, and even specifications for the domain-specific language underpinning the system, which he later called MESSY. Using the latter, I created a Python reimplementation of the language, which students Theresa Chen and Piper Welch — then members of my Studio for Computational Media Archaeology research group — used to author a murder-mystery story generator that extends the partial domain included in Klein’s original technical report.

Our reimplementation, called MESSY-71 to distinguish it from a later version of Klein’s system, is available on GitHub, where it might be particularly intriguing for pedagogical use.

Collaborators

Theresa Chen, Piper Welch

Wendit Tnce Inf page spread

Wendit Tnce Inf

2021–2022

Wendit Tnce Inf (2022) is a handmade, letterpress-printed book compiling asemic prose poems that were generated pixel by pixel by a suite of generative adversarial networks (GANs) trained by the author, poet Allison Parrish. My role was to design and craft the physical edition, which was published by my indie publishing concern Aleator Press in an edition of 56 copies.

Working at the level of words or characters, GANs can produce poetry that is indistinguishable from examples written by humans. But this doesn’t interest Parrish, who has instead trained her models at the pixel level. In this setting, the technology is not capable of reproducing English text, and in attempting to do so it generates words composed of peculiar letterforms that are eerily beautiful. The romanization ‘Wendit Tnce Inf’ is the result of subjecting an image of the book’s title page to a model for optical character recognition.

The book is letterpress-printed (from polymer plates) on fine paper with natural deckle edges, and bound in hand-sewn softcover wrappers with printed French flaps.

To create the physical edition, I learned to operate the letterpress at Carleton College, a 1930s Vandercook No. 3 Proof Press, and I thoroughly enjoyed the process. I also had to learn how to impose the book’s pages, and how to transform stacks of printed pages into saddle-stitched booklets. All of this was made possible by the generous tutelage of Conor McGrann, Digital Studio Arts Technician at Carleton.

Wendit Tnce Inf was exhibited at the HMCT Gallery in Pasadena, as part of Digital Witness: Algorithmic Spaces for Typography and Language, and it’s held in the Anne & Michael Spalter Digital Art Collection and Ragnar Digital Computer Art Collection.

Collaborators

Allison Parrish

Program ERATO page spread

Program ERATO

2020

Through my indie publishing concern Aleator Press, I created a facsimile edition of Louis T. Milic’s Program ERATO (1971), a scarce early volume of computer-generated poetry.

Though preceded at least by Jean A. Baudot’s La machine à écrire (1964), Manfred Krause and Götz F. Schaudt’s Computer-Lyrik (1967), and Alison Knowles and James Tenney’s A House of Dust (c. 1968), Program ERATO (1971) was likely the first volume of computer poetry published in the United States. Further contributing to its intrigue is Milic’s status as an important apologist for computer poetry, and the most prominent scholar and critic of the form writing in English in its first two decades.

In producing this facsimile edition, I took special care to accurately reproduce Milic’s booklet, with due attention to the construction, material, typography, layout, and style of the original edition that was created by The Cleveland State University Poetry Center in 1971. That edition is believed to have been released in a small run of 300 to 500 copies, and copies of the original booklet are difficult to obtain today. Hence this new edition, which was approved by the Louis T. Milic estate.

Copies of this book are held in special collections at UC Berkeley and the University of Mary Washington.

Program RETURNER page spread

Program RETURNER

2020

Through my indie publishing concern Aleator Press, I created this booklet imagining how a sequel to Louis T. Milic’s Program ERATO (1971), one of the earliest published volumes of computer poetry, might have looked. While Milic developed at least four poetry generators, only ERATO’s outputs were graced with a dedicated volume. This is especially curious in that Milic’s most ambitious undertaking was not ERATO, but rather a system called RETURNER — the subject of this speculative edition.

I took special care to make this booklet look and feel like a sequel to Program ERATO: the original edition that was published by The Cleveland State University Poetry Center in 1971, but also my own facsimile edition of 2020. Milic created the RETURNER program to synthesize stanzas similar in form to those of Alberta T. Turner’s poem “Return” (1968). Fascinatingly, RETURNER enchanted Turner to such a degree that she was compelled to create new poems that were inspired by its generated outputs. In her words, she “re-turned” the RETURNER content.

This booklet contains one hundred stanzas produced by RETURNER, along with “Return” and two of Turner’s post-computational explorations, “Hoeing Song” and “Season.” It also features a foreword by Milic and an afterword by Turner, each being distilled from their original writings on these projects. This edition has been approved by the Louis T. Milic estate, and Alberta T. Turner’s poems are reprinted here with the permission of her estate.

Copies of this book are held in special collections at UC Berkeley and the University of Mary Washington.

Artificial Versifying hexameter table

Artificial Versifying

2020–2021

Among the more remarkable antecedents to electronic literature is a little-known system for Latin poetry generation that was published in 1677 by its inventor John Peter, in a booklet titled Artificial Versifying.

To generate a line of verse, the user produces a random number by which a sequence of words may be retrieved from tables containing scrambled letters. Improbably, Peter’s strange invention was quite successful: the booklet appeared in three editions and the system was republished in various periodicals over the subsequent two centuries.

In an effort led by students Kaeden Berg and Henry Koelling, the Studio for Computational Media Archaeology (my research group at the time) produced the first translation into English of this remarkable work.

Critically, we translated the system as a whole, as opposed to individual outputs or isolated components. In our paper on the effort, we argued that the peculiar considerations inherent in the translation of electronic literature are already present in protocomputational works that are sufficiently procedural, such as John Peter’s Artificial Versifying.

Collaborators

Kaeden Berg, Henry Koelling

Papers

SIENNA transcript

SIENNA

2018–2021

After getting my PhD from UC Santa Cruz, I went to work at the storied research firm BBN Technologies, where I served as principal investigator on a project called SIENNA. This was a multiyear effort funded as part of DARPA’s Active Social Engineering Defense (ASED) program, which sought to develop technologies for detecting and combatting sophisticated cyberattacks, and in particular so-called spearphishing campaigns in the email domain. Broadly speaking, the program was interested in systems that could assume a target’s identity over email and then autonomously counterattack bad actors to waste their time and gather information about them.

Our team’s approach was to give the email attacker an engaging dramatic experience — one that would hint at greater prospects than they had initially anticipated, so that they would be willing to spend more time on the attack and take more risks.

To achieve this, we designed a content orientation centered around the notion of quests that our AI engine assigns to attackers, in the manner of the scambaiting community. A module in the style of a drama manager handles the selection of quests, with sequences generally growing more elaborate over time, as the attacker sinks cost into the target and receives enticing hints about potential bigger targets and more valuable information. To allow nontechnical domain experts to create quests, we created a GUI-based authoring tool.

While SIENNA was largely a symbolic AI system making use of a classical NLG pipeline, we did incorporate then-nascent LLMs for stylistic variation and flavor text, beginning with GPT-2 shortly after its initial public release.

Collaborators

BBN team

Papers

Sheldon County title graphic

Sheldon County

2018–2019

Sheldon County was a proof of concept for a computer-generated radio drama, which was the last project I undertook before finishing my PhD at UC Santa Cruz.

Concretely, Sheldon County was to be a podcast distributed like any other, except that each instance of the show would be created on demand for a particular listener. Most pertinently, the show would recount the goings-on of a fictional world whose history was simulated prior to generating any material for the actual radio dramas. (This was done using my Hennepin simulation engine.) So rather than generating the script for a radio drama from whole cloth, the underlying AI systems would simulate a world, sift through its history to identify interesting storylines, and then construct scripts that recount that material in a dramatically satisfying manner.

Unfortunately, I wasn’t able to complete the project in time, so it was released only as a proof of concept. Nonetheless, it captured the attention of a pre-LLM media landscape, where the idea of an infinite work of media was more evocative than horrifying. One episode was featured on BBC Radio, for instance, and the project was a semifinalist in Gimlet Media’s Casting Call podcast competition.

Papers

Press

Hennepin simulation log

Hennepin

2018–2019

Hennepin is a simulation engine that models an American county over the course of its history, with particular attention to the lives of its denizens.

Whereas my earlier simulation engine Talk of the Town modeled major life events and abstract character interactions, in Hennepin the fundamental unit of simulation is specific character actions. A character action casts one or more characters in its roles, such that the assembled cast of characters satisfies the action’s authored preconditions. When an action is performed, it changes the world according to its authored effects.
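
To make the action formalism concrete, here is a minimal sketch of casting characters into roles, checking preconditions, and applying effects. It is an illustration under stated assumptions rather than Hennepin's actual code: the `Action` class, the `lend_money` action, and the toy world state are all hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable, Dict, Iterator, List

World = Dict[str, dict]   # hypothetical world state: character name -> attributes
Cast = Dict[str, str]     # role name -> character name


@dataclass
class Action:
    name: str
    roles: List[str]
    precondition: Callable[[World, Cast], bool]
    effect: Callable[[World, Cast], None]

    def casts(self, world: World) -> Iterator[Cast]:
        """Yield each assignment of distinct characters to roles that satisfies the precondition."""
        for combo in permutations(sorted(world), len(self.roles)):
            cast = dict(zip(self.roles, combo))
            if self.precondition(world, cast):
                yield cast

    def perform(self, world: World, cast: Cast) -> None:
        """Change the world according to the authored effect."""
        self.effect(world, cast)


def can_lend(world, cast):
    lender, borrower = world[cast["lender"]], world[cast["borrower"]]
    return lender["money"] >= 5 and lender["location"] == borrower["location"]


def do_lend(world, cast):
    world[cast["lender"]]["money"] -= 5
    world[cast["borrower"]]["money"] += 5


lend_money = Action("lend-money", ["lender", "borrower"], can_lend, do_lend)

world = {
    "Ada": {"money": 10, "location": "farm"},
    "Ben": {"money": 0, "location": "farm"},
    "Cy": {"money": 20, "location": "town"},
}
```

In this toy world, the only valid cast is Ada as lender and Ben as borrower, since Cy is elsewhere and Ben has nothing to lend. Calling `lend_money.perform(world, cast)` with that cast moves five units of money from Ada to Ben.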

Other intriguing features include: a rich memory system where characters form and propagate knowledge about past actions, characters maintaining Bayesian models about other characters’ typical whereabouts, and actions causing the topology of the county to change, for instance by subdividing a farm or building a country road.

Many of the ideas underpinning Hennepin’s action formalism and knowledge system have since crystallized in Viv, a system centered on a domain-specific language for writing actions and other kinds of constructs.

This was the third in a family of four simulation engines that I have developed, and currently I am working on creating a production-grade fifth refinement for public release.

Expressionist grammar expansion trace

Expressionist

2015–2018

Expressionist is a GUI-based authoring tool for text generation in expressive domains, like videogames, that was created prior to the advent of LLMs.

Influenced by Kate Compton’s Tracery, Expressionist was intended for non-technical authors and utilizes generative grammars, which yield huge amounts of content with relatively little authoring effort.

Where this tool diverges from Tracery is in its tagging affordance: authors can attach tags to chunks of content, and whenever chunks are used to build larger units of content, the result comes packaged with the tags of all the components.

This ends up being powerful in two ways. First, it enables content understanding: generated content now comes packaged with author-defined tags about things that matter in the author’s game, which means that the computer can actually understand that content and do something accordingly (e.g., use its tags to update the game state). Second, it enables targeted generation: a game engine can request the kind of content it wants by specifying the tags that the content should come packaged with upon being generated.

As such, Expressionist allows authors to quickly define content bases comprising billions, trillions, or more outputs, each of which can be a) understood by the computer and b) furnished on demand (by requesting the meaning of the output).
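
The tagging idea can be sketched in a few lines. This is a toy illustration, not Expressionist's actual grammar format or generation algorithm: the `GRAMMAR` structure, the `[[symbol]]` notation, and the exhaustive search in `generate` are assumptions made for brevity (a real content base with billions of outputs could not be enumerated this way).

```python
import itertools
import re

# Hypothetical tagged grammar: each nonterminal maps to (template, tags) productions.
# [[name]] inside a template marks a nonterminal to expand.
GRAMMAR = {
    "greeting": [
        ("[[opener]], [[remark]]", set()),
    ],
    "opener": [
        ("Hello there", {"tone:friendly"}),
        ("What do you want", {"tone:hostile"}),
    ],
    "remark": [
        ("nice weather today.", {"topic:weather"}),
        ("I heard about the robbery.", {"topic:crime", "effect:player_learns_robbery"}),
    ],
}


def derivations(symbol):
    """Yield (text, tags) for every output derivable from a nonterminal."""
    for template, tags in GRAMMAR[symbol]:
        # re.split with a capture group alternates: literal, symbol, literal, ...
        parts = re.split(r"\[\[(\w+)\]\]", template)
        options = []
        for i, part in enumerate(parts):
            if i % 2 == 0:
                options.append([(part, set())])            # literal text carries no tags
            else:
                options.append(list(derivations(part)))    # expand the nonterminal
        for combo in itertools.product(*options):
            text = "".join(piece for piece, _ in combo)
            # The result comes packaged with the tags of all of its components.
            combined = set(tags).union(*(piece_tags for _, piece_tags in combo))
            yield text, combined


def generate(symbol, required_tags):
    """Targeted generation: return the first output carrying all requested tags."""
    for text, tags in derivations(symbol):
        if required_tags <= tags:
            return text, tags
    return None
```

A game engine could call `generate("greeting", {"tone:hostile", "topic:crime"})` to request a line, then read the returned tags (such as `effect:player_learns_robbery`) to update the game state.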

Collaborators

Tyler Brothers, Ethan Seither

Papers

Academical in-game scene

Academical

2017–2018

Academical is an interactive narrative game for teaching responsible conduct of research.

Funded internally by the UC Santa Cruz graduate division, the project aimed to replace existing webinar-style training materials with a more interactive experience that might prove to have better efficacy.

My role in the project was to lead development of an expanded graphical version of the game, starting from an initial text-based prototype produced by another graduate student. I assembled a team of talented student writers and artists, and together we spent an academic year building a polished proof of concept that we delivered to the graduate division. The game was created in Twine.

Later, other researchers at UC Santa Cruz conducted studies demonstrating that, relative to standard RCR training materials, Academical was more engaging and better at promoting moral reasoning and teaching certain RCR topics.

This game is no longer available to play online, from what I can tell.

Collaborators

Nic Junius, Dietrich 'Squinky' Squinkifer, Silvia Ordonez, Thovatey Tep, Janel Catajoy, Aislynn Cetera, Lisa Durand, Yani Mohamad Fauzi, Adesh Kumar, Trevor Holoch, Merita Lundstrom, Jacinda Ni, David Nguyen, Jinah Noh, Jared Ono, Tiffany Phan, Emily Rodriguez, Thomas Ruiz, Reshma Zachariah, and later many others who extended the project in various ways.

Papers

Bad News live performance

Bad News

2015–2017, 2019, 2023

Bad News is an award-winning installation work that combines deep social simulation and live improvisational acting into an emotionally charged interactive experience whose story and setting are uniquely generated by a computer program. Each 45-minute performance is an original work of immersive theatre, produced for an audience of one.

In the summer of 1979, a resident in a computer-generated American small town has died alone at home, and the player, a mortician’s assistant, is tasked with tracking down and notifying the next of kin.

To do this, the player navigates the richly simulated town to interact with its residents, who are each played live by a professional actor. Throughout gameplay, an unseen wizard listens in remotely to manage the unfolding experience via live coding and discreet communication with the actor.

Since its inception in 2015, Bad News has been mounted internationally, at venues including the San Francisco Museum of Modern Art, Slamdance Film Festival, and IndieCade, where it won the 2016 Audience Choice award. Writing about Bad News for Rolling Stone, Steven T. Wright remarked, “This marvel of procedural performance can only be played by a lucky few, and that’s a crying shame.”

Collaborators

Ben Samuel, Adam Summerville

Papers

Press

Talk of the Town simulation log

Talk of the Town

2015–2017

Talk of the Town is a simulation engine that models an American small town over the course of its history, with particular attention to the lives of its denizens.

As the simulation proceeds, characters live out abstracted lives centered on major life events and everyday routines. When a character spends time at the same location as another character, they may interact, in which case their respective affinities toward one another evolve. In this way, as time proceeds characters become embedded in rich social networks that feed back into the decision making about life events (e.g., whom to hire for a job opening) and routines (e.g., which bar to visit after work).

The system’s hallmark, however, is a rich modeling of character knowledge phenomena, particularly memory fallibility. Characters form mental models about other characters, and concerns such as propagation, misremembering, and forgetting are simulated.

This was the second in a family of four simulation engines that I have developed, with the first being the simulation undergirding Islanders. Currently, I am working on creating a production-grade fifth refinement for public release.

Collaborators

Adam Summerville

Papers

Press

GameSpace galaxy visualization

GameSpace

2015–2017

GameSpace is an interactive visualization of the videogame medium that takes the form of an explorable 3D galaxy comprising over 15,000 stars, each of which stands for an actual game that exists in the real world. Critically, stars are placed in the space such that more similar games are nearer to one another. For example, games from the same series might be positioned in star clusters, and clusters like these may themselves agglomerate to compose larger nebulae corresponding to notions like game genre or subgenre.

The model underpinning the tool was built by submitting over 15,000 Wikipedia articles about games to a series of techniques from natural language processing and machine learning, namely latent semantic analysis and multidimensional scaling.

GameSpace is no longer available on the web. It was produced as part of the Game Metadata and Citation Project (GAMECIP), an IMLS-funded initiative to improve library practices around videogames.

Collaborators

Eric Kaltman, Taylor Owen-Milner, Andrew Max Fisher, Michael Mateas, Noah Wardrip-Fruin

Papers

Press

GameNet interface

GameNet

2015

GameNet is a tool for videogame discovery in the form of a hypertext network in which related games are linked.

The user starts at a game of their choice and then explores the network by following links to related games or by jumping across the network through a link to a highly unrelated game. In addition to links to its most related and unrelated games, each game’s entry gives a text summary of it, extracted from Wikipedia, as well as links to YouTube gameplay videos and images from Google.

I created GameNet’s underlying model using a technique called latent semantic analysis (LSA), which at the time was a state-of-the-art approach to topic modeling. LSA produces vector representations for the documents in a corpus, which affords computing the relatedness between two documents by taking the cosine between their respective vectors. By assembling a corpus whose individual documents each describe a single videogame (about 12,000 Wikipedia articles about games), we created an LSA model that could compute the relatedness between the games themselves. In turn, this afforded the construction of a static network in which each game is connected to its fifty most related games and fifty least related games, which GameNet made explorable through a graphical interface.
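
The network construction described above can be sketched as follows, assuming the LSA vectors have already been computed. The four games and their three-dimensional vectors are made-up placeholders, and `build_network` is a hypothetical name; GameNet itself used fifty links in each direction rather than the `k=1` shown in the usage note.

```python
import math

# Hypothetical LSA vectors (real ones would have hundreds of dimensions).
VECTORS = {
    "Game A": [0.9, 0.1, 0.0],
    "Game A II": [0.8, 0.2, 0.1],
    "Game B": [0.1, 0.9, 0.2],
    "Game C": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine of the angle between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_network(vectors, k):
    """Link every game to its k most related and k least related games."""
    network = {}
    for game, vector in vectors.items():
        others = sorted(
            (other for other in vectors if other != game),
            key=lambda other: cosine(vector, vectors[other]),
            reverse=True,
        )
        network[game] = {
            "most_related": others[:k],
            "least_related": others[-k:][::-1],  # least related first
        }
    return network
```

With these placeholder vectors, `build_network(VECTORS, k=1)["Game A"]` links "Game A" to "Game A II" as its most related game and to "Game C" as its least related. Because the network is static, it can be computed once offline and served as plain data.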

GameNet is no longer available on the web. It was produced as part of the Game Metadata and Citation Project (GAMECIP), an IMLS-funded initiative to improve library practices around videogames.

Collaborators

Eric Kaltman, Michael Mateas, Noah Wardrip-Fruin

Papers

Press

GameSage interface

GameSage

2015

GameSage takes your idea for a game, described in unconstrained free text, and returns an explorable listing of the existing games that are most related to the abstract idea. It can also be used to recover forgotten game titles.

This alternative interface to GameNet uses the same underlying latent semantic analysis (LSA) model. LSA produces vector representations for the documents in a corpus, which affords computing the relatedness between two documents by taking the cosine between their respective vectors. By assembling a corpus whose individual documents each describe a single videogame — about 12,000 Wikipedia articles about games — we created an LSA model that could compute the relatedness between the games themselves.

One cool affordance of LSA is the method of folding in: take an entirely new document that is not included in the original corpus, and derive for it a vector in the semantic space encompassing the original corpus. GameSage is made possible by folding in the user’s idea for the game, as if it were a document describing an actual game, which allows us to compute the fifty most related and fifty least related games to the user’s idea. We then generate a dynamic GameNet entry for the user’s idea, which allows them to begin exploring from that entrypoint into the space.
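
Folding in amounts to projecting a new term vector d into the existing space as S^-1 U^T d, where U and S come from the truncated SVD of the original term-document matrix. The sketch below shows that projection on a deliberately tiny example whose SVD is exact by construction. The vocabulary, the two games, and all function names are hypothetical, and a real model would use weighted term counts and many more dimensions.

```python
import math

TERMS = ["space", "farming", "puzzle"]

# Hypothetical truncated SVD, A = U * diag(S) * V^T, of a toy term-document matrix
# whose columns are "Star Game" = [2, 0, 0] and "Farm Game" = [0, 1, 0].
U = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # terms x retained dimensions
S = [2.0, 1.0]                             # singular values
DOC_VECTORS = {"Star Game": [1.0, 0.0], "Farm Game": [0.0, 1.0]}  # rows of V


def term_counts(text):
    """Count occurrences of each vocabulary term in free text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in TERMS]


def fold_in(d):
    """Project a new term vector into the LSA space: d_hat = S^-1 U^T d."""
    return [
        sum(U[i][j] * d[i] for i in range(len(d))) / S[j]
        for j in range(len(S))
    ]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_games(idea):
    """Rank existing games by relatedness to a free-text game idea."""
    idea_vector = fold_in(term_counts(idea))
    return sorted(
        DOC_VECTORS,
        key=lambda game: cosine(idea_vector, DOC_VECTORS[game]),
        reverse=True,
    )
```

A useful sanity check is that folding in a document already in the corpus recovers its existing vector: `fold_in([2.0, 0.0, 0.0])` yields the "Star Game" row of V.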

GameSage is no longer available on the web. It was produced as part of the Game Metadata and Citation Project (GAMECIP), an IMLS-funded initiative to improve library practices around videogames.

Collaborators

Eric Kaltman, Michael Mateas, Noah Wardrip-Fruin

Papers

Press

GameGlobs interface

GameGlobs

2015

GameGlobs is a tool that visualizes various ways of partitioning 12,000 videogames according to how similarly they are described.

It was created using the same latent semantic analysis (LSA) model that undergirds GameNet. LSA produces vector representations for the documents in a corpus, which affords computing the relatedness between two documents by taking the cosine between their respective vectors. By assembling a corpus whose individual documents each describe a single videogame — about 12,000 Wikipedia articles about games — we created an LSA model that could compute the relatedness between the games themselves.

GameGlobs was built by submitting the LSA vectors in this model to the classic k-means algorithm, for several values of k, with the resulting clusterings being selectable by the user. It’s meant to operationalize the notion of game genre in an almost absurdly tidy manner: if there are only two genres — or if there are 20, or 50, or 5000 — what are the games that compose them?
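
For readers unfamiliar with k-means, here is a minimal pure-Python sketch of the classic algorithm (Lloyd's iteration) run for several values of k. It assumes Euclidean distance and random initialization from the data points, which may differ from the setup GameGlobs actually used, and the two-dimensional points are placeholders for real LSA vectors.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Return a cluster index for each point, using Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


# Placeholder vectors standing in for the LSA vectors of four games.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]

# One clustering per value of k, so that a user could switch between them.
clusterings = {k: kmeans(points, k) for k in (2, 3)}
```

Precomputing one clustering per k is what lets the user flip between "two genres" and "fifty genres" without rerunning the algorithm.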

GameGlobs is no longer available on the web. It was produced as part of the Game Metadata and Citation Project (GAMECIP), an IMLS-funded initiative to improve library practices around videogames.

Collaborators

Tim Hong, Eric Kaltman, Michael Mateas, Noah Wardrip-Fruin

Papers

Press

Islanders ship-exploration visualization

Islanders

2013–2014

Roughly inspired by the Polynesian settlement of the South Pacific, Islanders is a genealogically focused adventure game where the player lives out the life of a randomly chosen NPC. By this, I mean that the player lives out a life in the gameworld in the same way that NPCs in that world live out their lives (i.e., with the same affordances).

Gameplay is preceded by a world-generation procedure that begins with a large sea containing scattered island archipelagos and proceeds as follows: the world is uninhabited, save for a single ship holding several dozen characters, which is guided to land at a randomly chosen island on which they establish a settlement. From here, the general simulation proceeds at regular timesteps that each represent one year of game time. During a year, each character carries out a life in a low-fidelity simulation, and thereby the small initial character community grows and eventually disperses across the gameworld.

At the beginning of gameplay, the player is assigned to the womb of a randomly selected pregnant character, and gameplay commences when that character is born. This embeds the player into an actual NPC genealogy and into specific (low-fidelity) sociocultural contexts, which quite intentionally are not of the player’s choosing. Gameplay is mostly text-based, but there are also some light graphical elements.

Interesting features include genealogical tracking, character knowledge propagation, characters recalling the past, the evolution of abstract gameworld languages, and generated gameworld encyclopedias.

Islanders was privately distributed but never released publicly.

Papers

VFClust clustering diagram

VFClust

2012–2013

Neuropsychological tests of verbal fluency are commonly used as part of larger test batteries to study and assess cognitive impairment from neurological conditions such as Alzheimer’s and Parkinson’s diseases and traumatic brain injury. On these tests, the subject is asked to name as many words as they can in one minute beginning with a specified letter, for phonemic verbal fluency (PVF), or belonging to a specified category, for semantic verbal fluency (SVF).

The standard measure by which these tests are scored is the total number of satisfactory words produced, which tends to be lower in individuals with these conditions. However, prior studies have found that impairment from these conditions also affects clustering and switching behavior on these tests. Clustering refers to the contiguous grouping of phonetically similar or semantically related words in a test response, and switching denotes transitioning from one cluster to the next. But while scoring the number of satisfactory words produced is trivial, scoring for clustering and switching is laborious and requires operating from a predefined set of clusters.

For my master’s project in Health Informatics at the University of Minnesota, I created a software tool called VFClust, which generates clustering analyses for both PVF and SVF test responses. Its phonetic clustering analysis module uses phonetic representations for words to determine cluster spans, while the semantic clustering analysis module utilizes semantic relatedness scores generated by latent semantic analysis, a statistical method for determining distance in meaning between words.
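
The notions of clustering and switching can be made concrete with a small sketch. This is not VFClust's actual algorithm: it assumes a deliberately simple phonetic criterion (two adjacent words belong together if they share their first two letters), and the function names are hypothetical. A semantic analysis would swap in a relatedness test based on similarity scores instead.

```python
def cluster_spans(words, related):
    """Split a response into contiguous clusters of related adjacent words."""
    clusters = []
    for word in words:
        if clusters and related(clusters[-1][-1], word):
            clusters[-1].append(word)   # continue the current cluster
        else:
            clusters.append([word])     # a switch: start a new cluster
    return clusters


def same_first_two_letters(a, b):
    """Simplified phonetic criterion (an assumption made for this sketch)."""
    return a[:2].lower() == b[:2].lower()


def score(words, related):
    """Summarize a verbal fluency response."""
    clusters = cluster_spans(words, related)
    return {
        "total_words": len(words),
        "clusters": clusters,
        "switches": max(len(clusters) - 1, 0),
        "mean_cluster_size": len(words) / len(clusters) if clusters else 0.0,
    }


response = ["star", "stop", "stem", "sand", "salt", "sun"]
result = score(response, same_first_two_letters)
```

For this response, the sketch finds three clusters ("star, stop, stem", "sand, salt", and "sun") and therefore two switches.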

VFClust is published on PyPI and has been used in various published studies, some of which I’ve coauthored.

Collaborators

Serguei Pakhomov, Thomas Christie, Kyle Marek-Spartz

Papers

Sifty logo

Sifty

2025–curr

Recently I have been focusing my energy on Sifty, a boutique software studio building author-centric AI systems for art and entertainment, with a particular focus on storytelling technologies.

Our first product is Viv, an engine for emergent narrative in games and simulations.

Aleator Press logo

Aleator Press

2020–curr

I’m the proprietor of Aleator Press, an indie publishing concern and art dealer specializing in computer-generated literature and early generative art.

Our highlight release is Wendit Tnce Inf (2022), a letterpress book of computer-generated prose poems by Allison Parrish.

Aleator Press publications are held in special collections at UC Berkeley and the University of Mary Washington, and in the Anne & Michael Spalter Digital Art Collection and Ragnar Digital Computer Art Collection, the foremost private collections in this area.

As a dealer, we have handled early pieces by artists such as Harold Cohen.

SCMA logo

SCMA

2020–2022

While I was on the faculty at Carleton College, I led a research group called the Studio for Computational Media Archaeology.

Our focus was primarily on studying, reimplementing, and translating historical works in computational media and its antecedents, such as John Peter’s Artificial Versifying (1677), Alison Knowles and James Tenney’s A House of Dust (ca. 1968), and Sheldon Klein’s MESSY (ca. 1971).

Other student projects concerned work on deep social simulation and emergent narrative.

Members

Owen Barnett, Kaeden Berg, Aiden Chang, Theresa Chen, William Dudarov, Jade Kandel, Henry Koelling, Rie Kurita, Mimi Rapoport, Piper Welch

Papers