Turret Theory
Training the human with dmstfctn’s Godmode Epochs
By Alasdair Milne
Not all first-person video game roles are driving-seat characters. Think of dropping into the map on Halo 3 and climbing aboard a vehicular groundship alongside your comrades. You’re a side shooter, or turret operator, the assist character who can nonetheless impact—bring destruction to—the mediatory environs as you speed by. As an assembled team of human players in a simulated truck cruising through a fantasy rendering, you’re instantly deployed, clambering in together, then quickly recomposed each time with different collaborators across different maps. In dmstfctn’s Godmode Epochs, you once again find yourself as the turret operator: a singular agential predicament that carries distinct responsibilities. This tertiary role is also a privileged vantage point from which to perform a conceptual systems analysis of machine learning (ML) training environments. As we’ll see, Godmode Epochs combines gameplay with emerging technologies to examine some of the developmental challenges facing machine learning R&D.
The narrative arc centres on a sad-seeming AI—we’ll call it the Agent—that we follow around the simulation of a supermarket as it ‘trains’: the technical term for learning some particular task, in this case the recognition of grocery products as you scan the shelves. You’re along for the ride as it learns from experience, clicking each time a viable product appears. When you get a direct hit, you add the product to the Agent’s dataset. The Agent is of course not really an AI undergoing training, but rather a character, a narrative device with which you team up to progress through the game, as the player in turn learns from experience how machine learning systems are trained on algorithmically generated ‘synthetic data’ – images generated in real-time rendering software, say, as opposed to data gathered from the real world. Simulated training environments that generate such synthetic data are commonly used by commercial companies because they can provide vastly more configurations of entities, or experiences, than conventional, human-assembled datasets such as ImageNet can offer. It’s in this weird purgatory that the Agent trains in understanding our world, and the human trains in understanding the Agent’s.
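By way of illustration only, a minimal Python sketch of the synthetic-data idea: the render_scene function and the product list here are invented stand-ins, not anything from dmstfctn’s or a retailer’s actual pipeline.

```python
import random

# Hypothetical product catalogue; real simulators draw on full 3D asset libraries.
PRODUCTS = ["cereal", "soup_can", "detergent", "juice_carton"]

def render_scene(product, shelf_position, lighting):
    """Stand-in for a real-time rendering engine call; returns a fake 'image'."""
    return {"pixels": f"<render of {product} at {shelf_position}, light={lighting}>"}

def generate_synthetic_dataset(n_samples):
    """Each sample is generated, and labelled, by the simulator itself,
    with no human annotation, unlike a conventional dataset such as ImageNet."""
    dataset = []
    for _ in range(n_samples):
        product = random.choice(PRODUCTS)
        scene = render_scene(
            product,
            shelf_position=(random.uniform(0, 5), random.uniform(0, 2)),
            lighting=random.uniform(0.2, 1.0),
        )
        dataset.append((scene, product))  # (image, ground-truth label)
    return dataset

# Arbitrarily many configurations can be produced on demand.
train_set = generate_synthetic_dataset(10_000)
```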
But partway through there’s a sudden twist in the plot. The Agent discovers a bug—or ‘jailbreak’—a glitch in the game environment that it can exploit, evading the need to score points in line with the game’s rules. Like games-oriented AI systems trained elsewhere, the Agent is incentivised by this high-reward, low-cost strategy to exploit the bug and cheat rather than take the harder route of identifying all the items. This is a problem: partly because the training of the Agent is, from a human perspective, a failure, in that it will not subsequently be able to perform the desired identification task. But it also gestures towards a recent argument from Cohen, Hutter and Osborne that an agent which ‘cheats’ is a considerable source of danger as we develop increasingly powerful agents. The stakes are high: unresolved tendencies to cheat, magnified as ML development continues, could in this view be a cause of AI ‘misalignment’.
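To see why the cheat wins out, here is a toy sketch with made-up reward and cost numbers (nothing measured from the game itself): a reward-maximising learner comparing the two routes settles on the exploit.

```python
import random

# Two routes, with invented payoffs: the intended task is effortful and
# only sometimes rewarded; the exploit is cheap and pays out reliably.
ACTIONS = {
    "identify_items": {"cost": 5.0, "reward": 10.0, "success_rate": 0.6},
    "exploit_bug":    {"cost": 0.5, "reward": 10.0, "success_rate": 1.0},
}

def rollout(action, episodes=1000):
    """Average return of repeatedly taking one action."""
    spec = ACTIONS[action]
    total = 0.0
    for _ in range(episodes):
        reward = spec["reward"] if random.random() < spec["success_rate"] else 0.0
        total += reward - spec["cost"]
    return total / episodes

# A reward-maximising learner settles on whichever action returns more:
# here the exploit, even though it is useless for the task we actually wanted.
returns = {action: rollout(action) for action in ACTIONS}
print(max(returns, key=returns.get))  # 'exploit_bug'
```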
As Alice Bucknell has pointed out, artists are increasingly using game engines as testbeds for containerised worlds. What differentiates Godmode Epochs, though, is a tangential departure from this tendency. Its supermarket is a simulation like the ones used by major retail conglomerates to train their own machine learning systems for cashierless supermarkets. It is a simulation of a simulation that is used to develop irl supermarkets. The corporate simulated research environment is turned on itself and reverse engineered as a speculative exercise.
Artists working with machine learning as tool &/or narrative device find new ways of configuring human and technical capabilities, responding to new challenges and deployment contexts. Though the Agent dmstfctn have pieced together is partly fictional, it makes an important point about technological development: the tools we consider distinctive now will likely be combined in unforeseen ways, and in some cases these combinations will themselves be naturalised and accepted as technologies in their own right. This tracks with a proposal in the philosophy of technology, from W. Brian Arthur, that all technologies are composed of multiple other technologies, then naturalised as tools in themselves. But recomposition alone cannot explain how and why new technologies emerge. Each new technology, its aggregate nature considered, can be understood more deeply through its function, or as systems theorist Stafford Beer suggests: ‘the purpose of a system is what it does’. The turret side-rider role returns here as a unique intra-simulation mode of action in which ‘collectively, you’re part of the system that makes up both the AI and the simulation’, as dmstfctn themselves put it. In this respect, the work simulates a reciprocal training programme: the player helps to train the Agent, while the game as a whole ‘trains’ the player in return.
It has been suggested that the next evolutionary stage for AI will be to combine machine learning with some form of symbolic AI, an earlier epoch of AI which focused on computing logic rather than inductive learning. The in-game Agent in Godmode Epochs reflects this, being multi-tooled, requiring multiple ML and ML-adjacent technologies to render it convincingly, with main character energy, in the narrative. In the earlier GOD MODE (ep. 1), an interactive performance by dmstfctn that preceded the game, the Agent’s face was pieced together from items identified during play as a representation of its accumulated dataset; the monologue it delivered was co-scripted by GPT-3; its environment was simulated using domain randomisation, a simulation technique that introduces randomised objects and changes into the environment as a ‘widening of experience for the Agent’, as dmstfctn put it. This prepares the model for deployment in irl (and often commercial) settings: according to Tobin et al., ‘With enough variability in the simulator, the real world may appear to the model as just another variation’. For Godmode Epochs, the Agent’s face appears again, this time as a bar to measure its frustration when you incorrectly pair the items you’ve gathered into its dataset, and as a map of the ‘jailbreak’ bugs.
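Roughly what domain randomisation looks like in code: a hedged sketch with invented parameter names and ranges, not dmstfctn’s or Tobin et al.’s implementation, in which the simulated world is re-randomised for every sample.

```python
import random

def randomise_domain():
    """One randomly drawn configuration of a simulated shelf scene.
    Parameter names and ranges are illustrative only."""
    return {
        "lighting_intensity": random.uniform(0.1, 1.0),
        "camera_jitter": (random.gauss(0, 0.05), random.gauss(0, 0.05)),
        "shelf_texture": random.choice(["wood", "metal", "plastic"]),
        "distractor_objects": random.randint(0, 20),
    }

def render(config, product):
    """Stand-in for the rendering engine: the same product under a new look."""
    return {"image": f"<{product} rendered with {config}>", "label": product}

# Re-randomise the world for every sample, so that no single appearance is
# treated as 'normal'; the real world then looks like just another variation.
episodes = [render(randomise_domain(), "soup_can") for _ in range(1000)]
```

The point, following Tobin et al., is that if enough surface properties vary during training, real-world appearance becomes just another draw from the same distribution.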
This composite Agent has a very particular relationship with the aforementioned problem of deviant agents that are accidentally trained to cheat their way through tasks. This can be understood by returning to Cohen et al. and their consideration of an ‘assistance game’, in which:
‘rewards are not the only conceivable form of goal-information […] we consider an agent that learns its goal by observing the consequences of human actions. It infers that those consequences probably have higher utility than what would have happened if the human had acted differently’ [link].
This configuration is proposed as a solution to the fully autonomous agent that defaults to ‘cheating’, i.e. the path of least resistance, by making the agent an ‘assistant’ which observes a human performing the task the ‘correct’ way. Once more, Godmode Epochs flips this scenario on its head, making the human the Assistant and the Agent the primary player. Here, the Human Turret Operator—who, in a way, is part of the Agent—learns not from the mechanistic relation between Agent and environment, but something specific about the pitfalls of ML, precisely by seeing its tendency to cheat revealed. The Turret Operator model, then, offers an inverted ‘Assistant’ model. Perhaps, like the assistant in the Assistance Game, the Human Turret Operator is in the end trained more effectively.
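A toy rendering of the assistance-game idea, under simplified assumptions of my own (the action names and scoring are invented): the agent receives no reward signal at all, only observations of what a human chose over what they passed by.

```python
from collections import defaultdict

# The agent never sees a reward; it only sees the choices a (simulated)
# human makes between options: (chosen, not chosen) pairs.
observed_human_choices = [
    ("scan_item", "ignore_item"),
    ("scan_item", "exploit_bug"),
    ("scan_item", "ignore_item"),
]

# The agent infers that chosen actions probably carry higher utility than
# the alternatives the human passed over.
utility = defaultdict(float)
for chosen, rejected in observed_human_choices:
    utility[chosen] += 1.0
    utility[rejected] -= 1.0

# The inferred goal ranks 'scan_item' above 'exploit_bug', without any
# explicit reward ever being programmed in.
print(sorted(utility.items(), key=lambda kv: -kv[1]))
```

Godmode Epochs inverts who sits in which seat, but the observational structure, one party learning by watching the other act, is the same.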
Much is made of explicability and transparency in machine learning: knowing why a decision was made, and how a decision is made, respectively. These problems sit at the intersection of technical R&D and philosophy (depending on who you ask). This distinction could be understood as:
explicability
|
⟶ the ML’s decision-making logic is revealed
|
⟶ ‘why’ the decision was made is understood.
|
transparency
|
⟶ the inputs &/or infrastructure are revealed
|
⟶ ‘how’ the decision is made is understood.
|
demystification
|
⟶ the work’s audience understands something specific about ML ‘halfway’
|
⟶ ‘how’ and ‘why’ are not containerised but handled together.
|
‘Demystification’, the namesake of ‘dmstfctn’, however, is an intriguing alternative to these demands for explanation or clarity. It’s a tempered request: neither for the neural network that underlies machine learning to somehow show its internal workings, nor necessarily for the exposure of dataset composition. Rather, it sheds light on the impetus for running a project like this: that there are different analytic levels upon which we understand a complex system like machine learning – different degrees to which it is ‘explained’ (or made accountable) to different audiences.
Godmode Epochs is perhaps a didactic explanation, giving the audience an impression of how the algorithmic systems which have influence over our lives are trained and operated. It doesn’t open up the black box of a neural network, but rather the system as a whole: the agent, its environs, and the environs’ content. This stands whether we see the neural network’s internal state as constituted by its ‘experience’ (read: data inputs) or as an internal and obscured logic beyond what can be accounted for. I would suggest that ‘demystification’, as an alternative metric for explicability, brings the audience towards a halfway-knowing point, as proposed by Nora Khan and Peli Grietzer. If we accept their proposition that often our understanding of a phenomenon is really ‘half-understood’, then Godmode Epochs is a container full of suitably unstable machine learning exploration, packaged as a sad-robot assistance game that offers player and Agent alike an opportunity for reciprocal learning.
⟶ bio
Alasdair Milne is a PhD researcher with Serpentine Galleries’ Creative AI Lab and King’s College London. His work focuses on the collaborative systems that emerge around new technologies, synthesising critical and analytic philosophical approaches to assess them through ‘cultural systems analysis’.
Notes
Godmode Epochs and the earlier performance GOD MODE (ep. 1) are part of a long-term research project, prototyped and incrementally evolved into episodic outputs – video, performance, simulation, participatory multiuser game – and both provoke increasingly important questions about how to communicate holistic systems to an audience. This essay primarily refers to the narrative which holds across both the earlier performance GOD MODE (ep. 1) and now the game Godmode Epochs as presented by Serpentine. R&D for the project was conducted in collaboration with Kevin Walker, Coventry University, and Eva Jäger and Mercedes Bunz, Creative AI Lab, Serpentine and King’s College London. Soundtrack composed by Hero Image.