Celia Wagar has written frequently about a “state” representation of games in posts such as this general description of game state and this more in-depth discussion of gameplay depth. In the latter, she explicitly floats the idea of reducing the state space of a game in order to analyze its depth. Let’s expand upon this: how can we use a state space representation of a game as an evaluation tool for its mechanics?

What Are States?

In the first link above, Celia provides a rock-solid definition of what a game’s state is: take a save state (effectively a RAM dump) and observe the values for each variable resulting from the game’s execution. Indeed, the “state” of a system refers to the variables that characterize the system.

At a high level, for a continuous system, the state is meant to be the simplest representation of the system’s operation. The usual formulation is a relation between the current state and the following state, where the following state is an arithmetic combination of the current state’s values (each possibly multiplied by a relevant coefficient) and the external inputs to the machine. As explained in Modern Control Systems (page 157) by Richard Dorf and Robert Bishop:

The state of a system is a set of variables whose values, together with the input signals and the equations describing the dynamics, will provide the future state and output of the system.

This added context gives us the ability to make some necessary simplifications off the bat. From a mechanical-design perspective, we care solely about those variables that actively move relevant entities forward. If my game is designed around interactions between the player entity and various enemy entities, we can presume that variables related to extraneous things do not need to be included in the design state variables. Good examples of these include animation timers and other variables associated with non-interactables such as blades of grass, flickering lights, or HUD elements. Obviously what is relevant to a given game’s design depends on the rules dictating said design; a series like Splinter Cell includes lighting elements that do affect gameplay, and thus these variables would be included in our state variables.

The application domain above is primarily continuous systems that require analog controllers; they deal in the land of signal processors and motor actuators and other such things where you’d like the system to self-regulate. Our goal isn’t stability, and we operate in a discrete space. Thus, we naturally gravitate to the concept of “state machines.” Essentially, state machines specify a list of states that our system can reside in, with transitions between states representing the relation between the current state and future state. This structure is convenient and easily legible for discrete applications such as ours, as we can represent it as a “graph” of state nodes and transition edges. In simpler terms, we represent each state as a circle and each transition as an arrow between them.

In practical representations of state machines, there tend to be “guards” on each transition, which keep the system within a certain state until a condition is met that lets it transition out along a given edge. An example one can find in most systems textbooks is a traffic light. The simplest traffic light cycles through three states: green -> yellow -> red, and then back to green. In reality, a traffic light does not spend an equal amount of time in each state. A common way to solve this is to add a timer variable that constantly increments. Once the timer meets a certain threshold for the current state (let’s say 10s for green, or 3s for yellow), the machine transitions to the next state and the timer resets. When we begin reducing states in our model in the upcoming section, we’ll expand on how this idea relates to games.
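To make this concrete, here is a minimal sketch of the traffic light as a guarded state machine in Python. The durations are illustrative assumptions rather than real signal timings:

```python
from dataclasses import dataclass

# Illustrative durations (seconds) and the fixed transition order.
DURATIONS = {"green": 10, "yellow": 3, "red": 12}
NEXT = {"green": "yellow", "yellow": "red", "red": "green"}

@dataclass
class TrafficLight:
    state: str = "green"
    timer: int = 0  # seconds spent in the current state

    def tick(self) -> None:
        """Advance one second; only transition once the guard is met."""
        self.timer += 1
        if self.timer >= DURATIONS[self.state]:  # the guard on this edge
            self.state = NEXT[self.state]
            self.timer = 0  # the timer resets on every transition

light = TrafficLight()
for _ in range(15):
    light.tick()
print(light.state)  # "red": 10s of green, 3s of yellow, then 2s into red
```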

What Are We Trying to Examine?

Referring back to Celia Wagar’s “Depth Done Right” article cited in the introduction, she refers to depth as “…the sum of all the possible states, minus the redundant and irrelevant ones…” While this is a useful starting point, and she does proceed to list some ways to trim down our state list, there are some important elaborations to make upon this concept. For starters, her conception of depth is a holistic one, which by definition means it lacks the ability to judge specific mechanics or actions. It is also inherently judgmental in the sense that more depth is always considered better. The value judgment baked into this depth metric comes from its exclusion of “irrelevant” actions: it assumes a “meta” already exists that dictates the quality of given mechanics and approaches, and thus mechanics that are not “relevant” to effective gameplay are ignored. While this is useful for informal critique, where such a meta may already exist, our goal of evaluating the game using the state space itself precludes us from coming in with a pre-established notion of the game’s quality. Therefore, it is to our benefit to establish a series of new metrics that we can then use as an interpretive tool.

But first, we need to eliminate the redundant states. Let us consider the game modeled as a state machine as in the previous section. As mentioned prior, we can think of each state as a node in a graph with directional edges between them indicating where a state transition can occur. We also need to provide a scope for our analysis to reside in, as building a gigantic state space of an entire game from front to back would be unwieldy. Per my previous post, I will be using my compositional ontology to build our framework for modeling the game. The second section of said post goes into detail on what each level of the ontology is, which I will not repeat here.

The game’s system is represented in our model as the transitional rules of the game, explaining what states are valid and what transitions between states are valid. An example may be that the player may not jump while sliding, or that an enemy will be staggered when hit with a certain attack. Next, the game’s scenarios are where the overall state machines reside. I find it easier mentally to consider a single scenario such as a combat encounter or a puzzle as a state machine unto itself. Furthermore, I assign win states and fail states to each scenario, where the goal of the player starting from the initial state is to eventually reach a win state and to avoid a fail state. Finally, the game’s structure contains all of the scenarios (or individual state machines) inside of it. We won’t be considering the structure much, but it is useful for holistically thinking about the metrics I lay out below. I will go into more detail on this later. With this scope defined, we can begin considering what states may be redundant and how we can detect them.
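As a rough illustration of this scoping, here is a small Python sketch of a scenario as a state machine with an initial state, win states, and fail states; the states and actions are hypothetical placeholders rather than anything drawn from a real game:

```python
from dataclasses import dataclass, field

@dataclass
class Scenario:
    initial: str
    win_states: set[str] = field(default_factory=set)
    fail_states: set[str] = field(default_factory=set)
    # The system's transition rules: state -> {action: next state}.
    transitions: dict[str, dict[str, str]] = field(default_factory=dict)

    def step(self, state: str, action: str) -> str:
        """Apply one action; raises KeyError if the transition is invalid."""
        return self.transitions[state][action]

duel = Scenario(
    initial="neutral",
    win_states={"enemy_down"},
    fail_states={"player_down"},
    transitions={
        "neutral": {"attack": "enemy_staggered", "whiff": "punished"},
        "enemy_staggered": {"attack": "enemy_down"},
        "punished": {"eat_hit": "player_down"},
    },
)
print(duel.step("neutral", "attack"))  # enemy_staggered
```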

Symmetry Reduction

As Celia astutely notes in her above article, there are symmetric positions within games that fundamentally yield the same states. Her example of opposite sides within a fighting game being identical rings true, as creating a mirror reflection of the playfield produces the same options and outcomes for both players due to the way the game does not privilege or change options for players on one side or the other. Therefore, if we find elements of a given scenario that feature symmetry, we can truncate them down to a single set of states.

Symmetry reduction has been a well-researched topic in the world of “model checking.” This discipline consists of systems designers who, much like us, model their systems via state machines. Their goal is primarily to detect states that are unsafe or that violate specific “properties,” such as a state variable going over a threshold or two variables holding coinciding unsafe values at a particular time. For their purposes, when performing what’s called “reachability analysis” (executing the model exhaustively in order to find every single state it can theoretically enter), any way to reduce the number of states that need to be explored cuts down the execution time of the analysis.

As expounded in the link above, their formal conception of symmetry reduction has to do with permutations of states aligning across transitions. For two transitions to be symmetric, a permutation applied to the starting state (such as adding or subtracting one from a state variable) must also be applicable to the following state while preserving the transition. This gives us two main classes of redundant states we can eliminate:

  1. Let’s consider a simple (x,y) positioning state. If we can transition one unit at a time in any direction - such as from (0,0) to (0,1) - we can also find a parallel or equivalent transition via a permutation - such as from (1,0) to (1,1). Using this, we can choose a representative transition to examine rather than tediously examining every single micropath. A way to visualize this would be as two dirt paths side by side that both start at and lead to the same place. If we considered being in one or the other path as a state variable (an enumeration: path ‘A’ or path ‘B’), then we can conclude that we only need to study one of the paths. An alternative way to think about it is that we coalesce the two states into one.

  2. Imagine we are examining a Souls-esque inventory system where you can switch between weapons that are in a user-defined order. Not only are the equipped weapons a variable within the overall state (a vector of enumerated values over all available weapons in the game), but the order in which they are equipped is also a variable. However, with this symmetry rule, we can effectively abstract away the order. Consider an instance where I have a weapon equipped first and I attack with it, transitioning to a new state in the process. If I examine a separate permutation where the weapon is second in my equipped order and I attack with it, we can see that the following state is identical barring the mutated weapon order. Thus, it falls into the above description of symmetry, and we can simply view one particular permutation as the representative state (or rather, just abstractly think about a set of weapons being equipped without considering order at all).
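To make the second case concrete, here is a small sketch of symmetry reduction by canonical representatives: loadout states that differ only in equip order are mapped onto a single representative before counting. The state shape is an assumption for illustration:

```python
def canonical(state: tuple) -> tuple:
    """Collapse order-permuted loadouts into one representative state."""
    position, equipped = state
    return (position, frozenset(equipped))  # equip order abstracted away

raw_states = {
    ((0, 0), ("sword", "bow")),
    ((0, 0), ("bow", "sword")),  # same loadout, different equip order
    ((0, 1), ("sword", "bow")),
}
reduced = {canonical(s) for s in raw_states}
print(len(raw_states), "raw states ->", len(reduced), "representatives")  # 3 -> 2
```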

These two cases eliminate virtually all of our redundant states. However, applying this concept to more complex systems can become complicated, as totaling up many, many differing permutations into a single representative state/transition is very tedious in its own way. Therefore, let’s develop a useful heuristic that makes the process of finding these representatives more intuitive.

Reparameterizing Using Relative Values

The most prominent variable we’ve explored up to now is position, which is an absolute value. Within a world with a bounded space for an entity to occupy (or a plane, in the case of a 2D game), an entity resides at a position defined relative to the entirety of the space/plane it can move through. We consider this absolute in the sense that the value is defined in terms of the world itself.

However, it is useful for us to consider an entity’s position relative to another entity. Let’s change our position value to a Euclidean distance from some external entity. We can notice right off the bat that this new state variable instantly deals with symmetry. If the entity we’re examining approaches the external entity from the right versus the left, the distance value is the same even though the absolute positions are different (each being a negated permutation of the other). By doing this, we have instantly created an implementation of state reduction. Note that by making this change, we’re effectively inferring a value. If you wanted to actually reconstruct the scenario within the engine without the original, absolute values, then you would struggle with the ambiguity of the inferred value. For instance, while it is a useful state reduction to have approaches from both the right and the left instantly coalesced into a single value, it also makes it ambiguous which side of the external entity our observed entity is actually approaching from. However, for our goal of modeling the system in order to abstractly evaluate it, this ambiguity is not an issue.

This distance value is still within the unit dimensions of the game world, assuming that we’re measuring distance in some discrete unit of measurement and rounding the real-valued result of the Euclidean distance formula. We can coarsen this further by considering actionable ranges that carve our granular variable into coarse sections. Assume that our entity has two actions it can perform on the external entity. One can be performed anywhere from a distance of zero away to a distance of d1 away, while our other action can be performed from a distance of zero away to a distance of d2 away. Let’s assume d2 is larger. Using these range values, we can derive three actionable ranges for our distance variable: one where the distance is larger than d2 and no actions can be performed on the entity, one where the distance is between d1 and d2 and the second action can be performed, and one where the distance is between zero and d1 and both actions can be performed. In this way, we have shrunk our state model down to an extremely small enumeration of ranges compared to the vast number of states one would find when using absolute positions. One can think of these like rings around the external entity, each of which represents a different set of actions that our observed entity can take in relation to the external entity.
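Here is a minimal sketch of this reparameterization, assuming two hypothetical actions with reaches d1 = 2 and d2 = 5: absolute positions collapse into a rounded distance, which is then bucketed into the three actionable ranges:

```python
import math

D1, D2 = 2, 5  # hypothetical reaches of the two actions, with d2 > d1

def actionable_range(entity: tuple, external: tuple) -> str:
    dist = round(math.dist(entity, external))  # discrete distance unit
    if dist <= D1:
        return "both_actions"   # zero to d1: either action is available
    if dist <= D2:
        return "second_action"  # d1 to d2: only the longer-reach action
    return "out_of_range"       # beyond d2: no actions available

# Mirrored approaches coalesce into the same state automatically:
print(actionable_range((-3, 0), (0, 0)))  # second_action
print(actionable_range((3, 0), (0, 0)))   # second_action
```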

An astute reader may note that I did not consider movement actions above. If our observed entity can move a unit in any direction at any time, I could consider it an action in all of our determined actionable ranges. However, I would need to represent moving from one range to another as a separate state, with its own specific distance value. For example, if I’m at distance d2+1 and I can move into the d1-to-d2 range within one action, then I would need to add an extra range to denote that the entity is at a distance where a particular movement action gives it access to a new action in its next state, compared to a state at d2+100 where moving one unit does not change the list of available actions in the next state. But by creating that new range, I would then need to make another range at d2+2 indicating that moving one unit takes the entity into the d2+1 range, and so forth again and again, eventually plopping me back into just having a distance variable valued within the unit dimensions, exactly as at the beginning. This issue can be solved by using “guards” on our transitions, as mentioned previously. By retaining the distance variable while keeping the original actionable ranges as actual “separate” states, I can add more specific guards to each transition. For example, instead of having a movement action move me into a new state on its own, I can create a logical condition such as “if the entity is in the out-of-range state AND it is at a distance d2+1 from the external entity AND it uses a particular movement action, transition into the d1-to-d2 range’s state”. This maintains the validity of the separate states while using an acceptable abstraction to simplify our representational model.
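A sketch of what such a guard might look like, continuing the hypothetical ranges from the previous example:

```python
D2 = 5  # as in the previous sketch

def guarded_transition(state: str, distance: int, action: str) -> str:
    """Return the next range state; guards consult the retained distance."""
    if state == "out_of_range" and distance == D2 + 1 and action == "step_in":
        return "second_action"  # one step is enough to enter the d1-to-d2 ring
    if state == "out_of_range" and action == "step_in":
        return "out_of_range"   # still too far out; the range state is unchanged
    return state

print(guarded_transition("out_of_range", D2 + 1, "step_in"))    # second_action
print(guarded_transition("out_of_range", D2 + 100, "step_in"))  # out_of_range
```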

This reparameterization and use of actionable ranges can be applied to other aspects of the game as well. For example, if we consider a player fighting an enemy that has different states depending on whether it’s attacked from behind or in front, we can infer a “relative direction” variable that tracks whether our player is behind or in front of the enemy. If the player has one attack action, then we can consider two actionable ranges that together form a ring around the enemy, where being in front of the enemy and attacking starts from a separate state than being behind the enemy and attacking.

The same logic follows for a resource like health as well. As a reminder, we’ve set up specific “win states” and “fail states” for our game, so let’s assume that the enemy’s health reaching zero is a win state. If the enemy has five health points (HP) and we have two attack actions - one that deals 2HP worth of damage and another that deals 1HP worth of damage - we can visualize additional states in the form of ranges on a bar representing the enemy’s health. Within the middle or high range of the bar, using either attack does not reach the win state and presumably loops us back to the same state we were already in before we attacked. Thus, we have two ranges at the bottom of the bar: one at 2HP where the first action will transition to the win state, and one at 1HP where either action will transition to the win state. In both of these cases, we’re able to use this heuristic to model a game in such a way that it’s legible to us while also representing the game accurately at an abstract level.
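As a quick sketch of this health example, the following builds the transition edges for a hypothetical five-HP enemy with the two attacks described above; every HP value at or below 2 exposes at least one direct transition to the win state:

```python
ATTACKS = {"heavy": 2, "light": 1}  # damage per attack action

def build_hp_graph(max_hp: int = 5) -> list:
    """List (state, action, next state) edges; zero HP is the win state."""
    edges = []
    for hp in range(1, max_hp + 1):
        for name, dmg in ATTACKS.items():
            remaining = max(hp - dmg, 0)
            edges.append((f"hp{hp}", name, "win" if remaining == 0 else f"hp{remaining}"))
    return edges

for src, action, dst in build_hp_graph():
    print(src, f"--{action}-->", dst)
# From hp2, only the heavy attack reaches the win state; from hp1, either does.
```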

Metrics For Evaluating Games

With this framework for eliminating redundant states set in place, we can begin establishing some metrics by which to evaluate the mechanics that we create. First, let’s establish that our system can be elaborated in the following ways:

  • By adding new actions to the pool of available actions. An example of this would be to give a player with a sword attack a bow as well.
  • By restricting when said actions can be used. Following the above example, we can restrict the player’s action set at any given time by adding an equip action, where the player must sheathe/holster their equipped weapon and equip the other weapon in order to use it.
  • By increasing the number of state variables. At the system level, these would primarily take the form of resources that entities in play can expend and collect. In some instances with multiple AI entities on the field at once, there may also be other state variables that control how the AI entities interact with one another.
  • By adding new guards to the state transitions. These are effectively rules that dictate when particular actions change the state, including switching between different ranges or increasing/decreasing an available resource. This further elaborates upon the restrictions given in the second bullet point.
  • By adding random elements. I have plans for a further blog post that will cover this element in more detail, but for now we will leave it out of our scope.

Next, we establish that our scenarios are elaborated by the collection of entities that are contained within them. These include enemies, terrain, allies, and any sort of other interactable object that is relevant to the system state. Each scenario is also given an “initial state”, “win states” and “fail states”. Finally, each scenario implements the above rules and elements in order to create the state model that we’ll be observing for our metrics. As mentioned previously, these state models are graphs made up of states as nodes and directional edges as state transitions. Using this as our foundation, we’re primarily looking at prime paths through the graph that lead to either a win state or a fail state. These paths take the form of a list of states beginning with the initial state and listed in the order that the states are transitioned through. Note that, per the link, loops are not considered in prime paths, and neither are incomplete paths that never reach a finishing point.
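As a stand-in for full prime-path enumeration, here is a small depth-first search that lists every loop-free path from the initial state to a terminal (win or fail) state over a hypothetical toy graph:

```python
def terminal_paths(graph: dict, start: str, terminals: set) -> list:
    """Enumerate loop-free paths from start to any terminal state."""
    paths, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node in terminals:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # skip loops, per the prime-path restriction
                stack.append((nxt, path + [nxt]))
    return paths

toy_graph = {"start": ["a", "b"], "a": ["win"], "b": ["a", "fail"]}
for p in terminal_paths(toy_graph, "start", {"win", "fail"}):
    print(" -> ".join(p))
# Three complete paths: two reaching the win state, one reaching the fail state.
```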

With this pretext, we can begin looking at some different metrics for evaluating a game.

Complexity

Per Celia’s earlier definition, let us define the cardinality of the set of non-redundant states that we’ve determined for our state model to be our measure of complexity. Note that we have included “irrelevant” states; as mentioned earlier, determining these requires a prior evaluation of the mechanics, which defeats the purpose of the evaluation step we’re undertaking here. As this implies, highly complex games feature large numbers of non-redundant states, likely driven by many resources, many available actions, and/or high numbers of entities in a given scenario.

Alternatively, we may decide that we wish to avoid denoting systems with many states but rigid paths as “complex.” Therefore, we can use the total degree summed across all state nodes in our graph (equivalently, twice the number of transition edges) as a replacement metric. We may also wish to denote this as the interaction complexity, given that it measures the states by the number of ways they can interact with each other rather than by how many states there are in total.

However, we may consider a game with few states yet no restrictions on which state one can transition to as not suitably complex, in the sense that the player may have easy paths to a win state no matter where they are within the state model. Therefore, we can further derive flexibility as the ratio of interaction complexity to complexity, or the average degree across all state nodes. High flexibility may suit games that wish to deemphasize win states and emphasize player freedom, but it might hinder a game that seeks to provide a more structured challenge.
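Putting these three definitions together, here is a minimal sketch that computes them over an arbitrary toy transition graph:

```python
toy_graph = {
    "start": ["a", "b"],
    "a": ["win"],
    "b": ["a", "fail"],
    "win": [],
    "fail": [],
}

num_edges = sum(len(outs) for outs in toy_graph.values())
complexity = len(toy_graph)              # number of non-redundant states
interaction_complexity = 2 * num_edges   # total degree (in-edges + out-edges)
flexibility = interaction_complexity / complexity  # average degree

print(complexity, interaction_complexity, flexibility)  # 5 10 2.0
```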

A series that excels at meeting all of these various criteria is Monster Hunter. Monster Hunter games feature complex interactions between positioning, monster hitboxes, movesets, and damage types, creating an extremely large number of states in the process. This is especially notable given that the series generally features one-on-one combat (pesky adds in the older games aside). Using our previous actionable-range method, we find that Monster Hunter games retain a large number of states even after reduction, thanks to the many different hitbox ranges across each weapon’s moveset as well as the many different hurtboxes on each monster. Players are generally restricted in terms of movement when unsheathed, slowing their speed and restricting their evasion options (especially in older titles). However, interaction complexity is maintained through micropositioning, where subtle changes through restricted movement actions can yield substantial differences in the actionable ranges one resides in after the repositioning occurs. This example provides an illustration of what these terms may mean in practice, actual exhaustive analysis of the game aside.

Viability

For a given action, how often does it appear in the list of prime paths that reach a win state? What ratio is this to the total number of prime paths? Actions that are highly viable will appear very frequently in successful paths to the win state, while actions that are not highly viable may appear in winning paths only incidentally, alongside highly viable actions.
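A sketch of this ratio, assuming the winning prime paths have already been enumerated and simplifying each path down to the sequence of actions taken along it (the action names are hypothetical):

```python
winning_paths = [
    ["close_in", "light", "light", "heavy"],
    ["close_in", "heavy", "heavy"],
    ["snipe", "snipe", "snipe"],
]

def viability(action: str, paths: list) -> float:
    """Fraction of winning paths in which the action appears at all."""
    return sum(action in p for p in paths) / len(paths)

print(viability("heavy", winning_paths))  # ~0.67: in two of three paths
print(viability("snipe", winning_paths))  # ~0.33: in one of three paths
```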

In a game such as a puzzle, there may be only one viable path to a win state, and this may be seen as acceptable or even preferred. The same holds for genres such as racing games, where particularly difficult tracks with heavy time restrictions may leave few true paths to success (although these will often change given how swapping cars varies the dynamics of the system, represented by our transition guards). However, for many action and multiplayer games we seek to create multiple viable paths, sparing the player an exhaustive search for a viable path as well as giving them options to match their preferences in playstyle and their strengths and weaknesses, both mental and physical.

There are two separate considerations here.

  1. We may seek to have many significantly different options for the player to choose from in terms of a “macro-preference.” Having a wide variety of different weapons, loadouts, or build types would be examples of these. If all or many of these choices are viable, we can denote this as breadth, where options for a given enumerated state variable (such as the list of weapons) that are each equally or near-equally viable in terms of their ability to reach the win state are considered “broad”.
  2. We may also seek to have many subtly different options for the player to choose from in terms of a “micro-preference.” Having the ability to subtly alter a viable path and have it remain viable gives the player more room to express themselves in terms of movement, decision-making, and moment-to-moment game understanding. Therefore, we can denote this as expression, where an option or action that can be subtly altered in terms of timing, positioning, or some other state variable without violating its viability is considered “expressive.”

Resident Evil 4 (2005) provides examples of both of these parameters. In terms of its breadth, the game gives a vast array of disjoint weapons, where not all weapons can be held at once and thus must be explicitly selected by the player. Not only is each weapon class viable in terms of winning scenarios (though some like the knife and mine thrower are less viable in terms of total paths to victory, and would require a significantly skilled player to suss out a viable path in a challenge run containing only them), but there are also separate weapons within each class that offer further breadth for the player to experiment with. As for expression, the game not only provides a restricted-but-pliable control scheme, but also allows the player to shift their opponents into several different stagger states with subtly different effects depending on where they are shot, as I explain in my recent No Merchant review of the game. Different staggers can momentarily incapacitate the enemy, cause them to fall to the ground, or set them up for a melee strike depending on what the player’s preference in a given scenario is. Through this, a player can select an approach to each encounter based on their preferences not only on the macro-level of weapon selection but also in terms of individual moment-to-moment decision-making regarding how to stagger each opponent.

Determining the fine line between a macro-preference and a micro-preference is elusive and relies heavily on the designer or evaluator for proper discernment. Based on the above, I would suggest that timescale serves as an important tool for differentiating the two: a state variable that affects breadth would only be able to change once or a few times per scenario, whereas a state variable that affects expression would be able to change incredibly frequently by comparison. However, this depends greatly on the mechanics being studied as well as the perspective of the evaluator.

Efficiency

While an action or approach may be broadly viable, whether it can be performed quickly or frequently is a different issue. We can measure efficiency by looking at the average length of the subpaths starting from when an action first occurs and proceeding to a win state, across all viable paths for the action. Considerable interpretation on the part of the evaluator is required here. For one, certain coalesced states may need to be broken out again, or loops that the prime-path restriction ignored may need to be considered if an action must be performed multiple times before proceeding to the win state. There is also the question of where the subpath should begin on a path where an action is performed multiple times; for the sake of discussion, we will assume it begins at the first occurrence of the action within a given viable path.
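Under that first-occurrence assumption, a sketch of the measure might look like the following, where lower values mean fewer remaining steps to the win state and thus higher efficiency:

```python
def efficiency(action: str, winning_paths: list) -> float:
    """Average number of steps from the action's first use to the win state."""
    lengths = [len(p) - p.index(action) for p in winning_paths if action in p]
    return sum(lengths) / len(lengths) if lengths else float("inf")

winning_paths = [
    ["close_in", "light", "light", "heavy"],  # heavy first used on the last step
    ["close_in", "heavy", "heavy"],
]
print(efficiency("heavy", winning_paths))  # (1 + 2) / 2 = 1.5
print(efficiency("light", winning_paths))  # 3.0: only appears in the first path
```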

There are multiple reasons why an action would be more or less efficient. These tend to deal with how the action affects various state variables, resources in particular. If an action kills an opponent in five hits but requires an excess of a given resource that must be slowly recharged, it will inherently be less efficient than if the same action kills an opponent in five hits with no resource required. Where the sweet spot for efficiency lies is up to the designer. Actions that are highly efficient may become centralizing in their effect on the game if not regulated properly, while actions that are highly inefficient may see little use in practical play.

One way to balance this is with a classic gambling proposition. Suppose you have a highly efficient action that is also much less viable than another action that is highly viable but inefficient. The player then has a choice: they can roll the dice and try the highly efficient action in order to speed things up, or they can settle for the inefficient action that is guaranteed to work. On its face this is less interesting than it sounds, given that the player always has the ability to guarantee a win with the potentially boring inefficient action. However, Devil May Cry 3 plays with this concept. The Dante Must Die difficulty is notorious for beefy enemies that encourage jump-cancel spam, which is relatively safe thanks to Dante’s jumping invincibility frames yet very slow as a route to a win state. However, there is also the Devil Trigger Explosion (DTE), which is both highly viable and comparatively efficient but limited by a resource. Thus, the player is incentivized to use jump-cancels to build meter before strategically destroying an entire group of enemies at once with the DTE. This is not always successful in practice, but it does illustrate the essentials of how we might characterize mechanics within a game such as this.

These kinds of tradeoffs make more sense in a multiplayer game such as Street Fighter, where the addition of a stochastic opponent that one can model in terms of decision-making capabilities can make previously unviable strategies more viable depending on the foibles and habits of the opponent. These take the form of “reads,” where a less viable but more efficient action is undertaken in response to a particular action or string of actions by the opponent. Balancing also deals with efficiency in relation to viability: moves that are highly situational (read: less viable) may have increased efficiency through high damage output to compensate for their lack of use, whereas moves that are very safe, such as jabs, will often have less efficiency through low damage output. However, a jab into a specific combo creates higher efficiency through a particular viable path that has low expressive capability; referring back to the previous section, note that the expressiveness of a particular path allows the player to make subtle changes to the way they conduct the path without losing viability. With a fixed, strict bread-and-butter combo, the low expressiveness and the need for exacting inputs from the player executing the combo create a new tradeoff where higher efficiency is achieved, while the jab on its own can remain highly expressive as a way to bait the opponent with specific timings or to stuff out particular actions. Of course, not all fighting games fully remove expression from combos: anime fighters and others such as Marvel 3 use highly elaborate and customizable combos as a selling point. However, the timing for these often remains strict; one could argue that a game with a robust combo system with strict timings is actually a case of a game having option breadth but low expressiveness. This is a matter of interpretation, as many would refer to these combo systems as prime examples of highly expressive gameplay in common parlance outside the context of this article.

Future Work

To expand on this work in the future, I would eventually like to make another blog post illustrating a toy game that gradually increases in complexity as we add more to its state space, illustrating each of the metrics given above along the way. I also skipped over random elements, which are incredibly important considerations when it comes to viability and paths toward victory. I will eventually write a blog post that shows how such a stochastic system can be modeled via a Markov Decision Process in order to enhance our understanding of the metrics above with probabilistic interpretations of viability and the like. These will be written time permitting. I would also like to reconstruct this post with visual aids, as this is a subject that becomes much more accessible with the addition of images. This will likely require me moving to a personal blog format.