It's night in the watchtower and a distant orange glow begins to show through the trees. Delilah asks if you're there, watching your first fire on the job. She normally doesn't call this late. She's talkative tonight, telling you about a trick for cooling tequila she learned from her sister. At the end of her story, she adds, "You'd like it." A walkie-talkie pops up in your HUD, the sign that you can choose what to say next. You raise the talkie to your mouth and three options appear over your view: "I'm sure," "I've had a bad time with tequila," or "I would if I was with you." You scan through all of them while a timer slowly ticks down, and you briefly consider just letting it run out, staying silent this time. You're curious about what might happen if you flirt with her. You have a wife and Delilah is your boss, but what the heck, let's see where it leads, right? She takes it well and you flirt with her for the rest of the night.
This is one of the many moments of dialogue in Campo Santo's 2016 game Firewatch. It's typical, at this point, to see dialogue implemented this way in games: no more than five options to choose from, under the pressure of a timer. The system seems so commonplace now that there is little question over what it actually represents. Yes, the intention is to represent two people talking, but the mechanics of the system make it feel like a completely different interaction altogether. I'd like to explore what our videogame dialogue truly represents by examining it through the work of three authors: Alexander Galloway's "Gamic Actions, Four Moments" from Gaming: Essays on Algorithmic Culture, Jesper Juul's "Video Games and the Classic Model" from Half-Real, and McKenzie Wark's "America" from GAM3R 7H30RY. Through this examination, I hope to better understand what we represent when we use dialogue in games.
To begin, I'll look at Galloway's piece. In "Gamic Actions" he separates all actions in a game into four categories: diegetic operator acts, nondiegetic operator acts, diegetic machine acts, and nondiegetic machine acts. Operator acts are performed by the player and machine acts are performed by the computer. Diegetic acts are actions made within the narrative context of the game, such as moving a character or NPCs interacting with each other. Nondiegetic acts sit outside the fiction of the game world: things like menus or loading screens. When we try to classify dialogue using these terms, it doesn't fit easily into just one of them. At first glance, it's tempting to declare dialogue a diegetic operator act. It is, after all, so tightly woven with the narrative of a game. However, the way options are presented is just a menu, which is a nondiegetic operator act, and Galloway recognizes systems similar to this as an action "in which the act of configuration is the very site of gameplay" (Galloway 15), naming heavily systemic games like Warcraft III and Civilization III as examples. So, under this extension of the definition of the nondiegetic operator act, I would classify the type of dialogue found in Firewatch as such. But what implications does this have for the actual decision-making process? How does this classification change our representation of dialogue?
I believe that it puts an additional layer of abstraction between the player, the player character, and the NPC the player talks to. As Galloway says of the type of game that includes these sorts of nondiegetic operator acts: "These are games oriented around understanding and executing specific algorithms" (Galloway 15). In this light, you could cynically say that dialogue in games is no more than the execution of a desired path, and you'd be somewhat accurate. In Firewatch, this would mean that the goal of romancing Delilah matters much more than saying what you feel. It makes conversation goal-oriented rather than an organic exchange. This abstraction of narrative away from open-endedness and towards goals is not necessarily problematic; it is just part of the nature of how we represent dialogue in games. It's something to keep in mind as we move forward in our discussion.
Next, I'd like to discuss Jesper Juul's six components of the classic game model found in "Video Games and the Classic Model." Juul's six components are rules, variable outcomes (there are multiple end states of a game), valorization of outcomes (some end states are better than others), player effort, player attachment to outcomes, and negotiable consequences (playing the game may or may not have an effect on real life). Juul emphasizes that this is the classic model of games and that one can easily find more modern games that don't fit all the criteria listed. Certainly, the developers of Firewatch make no effort to valorize the outcomes of the game. Whether your relationship with Delilah is strong or weak, the same ending occurs. It would appear that Juul's classic model does not completely guide the systems of Firewatch, but to disregard the model so firmly would ignore its historical and cultural impact on the game.
Just the inclusion of "classic" in the title of the model denotes its influence on games. There is an expectation, on the part of both designers and players, that most games and the systems within those games are going to adhere to these six components. Firewatch and dialogue are no exceptions. In dialogue-driven videogames, players expect that their choices will lead to a different conclusion and, if not a different conclusion, at least a different path. The impact of this expectation upon the options presented in moments of dialogue cannot be overstated. Just look at the scene I presented at the beginning of this essay. Reading the first two possible responses to Delilah's insistence that you'd like drinking tequila, the difference between the consequences of your choice is not readily apparent. Upon reading the third option ("I would if I was with you"), the first two options ("I'm sure" and "I've had a bad time with tequila") are recontextualized. The third option is clearly the route to try to romance Delilah; the second is still a little flirty, but not entirely so; and the first is so bland that it's doubtful she'll react strongly. This analysis occurs not only because the expectation of different outcomes is present in players, but also because developers are aware of it and specifically design the options to telegraph the possible consequences.
This relationship of expectations can create a dialogue system that at some points feels artificial, as if you are choosing between options in a menu rather than talking to a person. I think that this contributes to dialogue becoming more goal-oriented, as I discussed earlier. When the consequences of a dialogue option are telegraphed to the player before choosing, it becomes easier to pick the option that most aligns with your goal. Just as with the nondiegetic nature of dialogue, this isn't necessarily a bad thing. In fact, when done right, this telegraphing of consequence helps the player understand the options. The problem arises when the telegraphing is done too overtly. As dialogue-driven games become more popular, especially with the rise in popularity of Telltale Games, I find myself questioning more and more what I'm actually doing when I choose an option. I get the feeling that I'm not actually talking to a person, or even a computer-controlled person; I'm just trying to manipulate a piece of software.
To tie up this conversation about goal-oriented, consequence-telegraphing dialogue systems, I'd like to introduce some ideas from the "America" section of McKenzie Wark's GAM3R 7H30RY. Wark uses an extended metaphor involving the game Civilization III to describe the progression of civilization and artistic representation. Essentially, Wark argues that as we have become more connected, different forms of art have best represented their respective periods, not only because their form is conducive to that representation, but because they are a product of the time they represent. For example, Wark says that the novel best represents the ages of trails, roads, and railways, while the movie sits between the telegraph and the broadcast. Television sits between the broadcast and the internet, and games sit between the internet and whatever comes next. This is, of course, a simplification of the argument, but what matters most for our conversation about dialogue is that games best represent the mechanics of today's society.
So, with that in mind, allow me to remind you of my previous arguments concerning the goal-oriented nature of dialogue in games. Because games where gameplay occurs through nondiegetic operator acts make us consider our ultimate goal before acting, dialogue becomes a goal-oriented system. Additionally, the telegraphing of consequence supports this goal-oriented behavior because it allows players to more easily direct themselves to their goals. When we consider Wark's idea that games represent the mechanics of our internet-driven society along with the idea that dialogue in games is goal-oriented, it's easy to see what our dialogue represents the majority of the time. Our games' systems of dialogue do not represent talking face to face with another person. No, the majority of the time they are actually mimicking what we do when we answer an email or send a text. When you talk face to face, your reply to something your acquaintance has said must come almost immediately. When you send an email, a text, or any other indirect form of communication, you are given the leisure of thinking over your options and your goals for what you want to get out of that interaction. Isn't that closer to what we're doing in our games?
If the common way dialogue is done in games mimics the way we talk online, what does this mean for the games? I'm puzzled in the case of Firewatch, where the narrative would lead you to believe that you are talking with your boss over the radio in the middle of the Wyoming wilderness in the 1980s. The mechanics of dialogue seem to contradict the setting and therefore degrade the believability of the game. This may very well be the case, but I don't see it as an issue. It makes the game's setting more palatable to a modern audience. Just as films about medieval times don't use medieval language, we are willing to concede historical accuracy in order to gain digestibility. However, unlike in films that modernize and Americanize the words spoken, in games there doesn't seem to be an option to build dialogue a different way. It would be acceptable for a filmmaker to create a movie in which the language spoken by characters was historically accurate, yet there exists no easy or common alternative for replacing today's dominant system of dialogue. The closest thing that comes to mind is the dialogue in Cardboard Computer's Kentucky Route Zero, which does not operate under the presumption of telegraphing consequence to the player at every step along the way. But one game is hardly a reusable and common system. It will take more refinement and thinking to create a viable alternative to our current system, and that system will undoubtedly have its own pros and cons, just like our current one. More divergences from the standard practices of dialogue will mean more representation for a variety of communications.