Human and AI Learning, Part 1

Humans and machines both learn, but the ways they do so are very different. Those differences provide interesting insights into quality and thus into the idea of testing for risks to quality. I’ve found that one way to conceptualize this is through the context of games. Even if you’re not a gamer, I think this context has a lot to teach. So let’s dig in!

[Image: AI testing with Elden Ring, part 1]

Because there’s a lot to cover here, I broke this up into two parts. In this post, I’d like to consider how a player learning to play a game is very similar to what an AI does when it’s training. Then I want to focus on some terms that I used in my previous post and apply them to this context.

In part two of the post, I’ll give a bare-bones sketch of an AI program I wrote that can play an instrumented version of the game Elden Ring in order to assess combat difficulty of the so-called “boss encounters” and help look at a quality we can call “fairness.”

Fair Models and Explainable AI

It’s widely recognized that a good machine learning model should be fair. Fairness is also one of the goals of Explainable AI (often called XAI). This has a nice parallel with the idea of “being fair” to players in a game while also explaining why you think you achieved fairness.

I’ll note here that Explainable AI is a set of methods and tools that make machine learning models understandable to humans by providing explanations of the results those models produce.

The existence of an algorithm does not, by itself, guarantee full explainability and full transparency of a given system in this context. Or, rather, we often don’t directly have the algorithmic steps that have been followed. AI systems learn the rules from data during the learning/training phase and construct the algorithms.

A growing industry focus, often called Responsible AI, is on making our systems more transparent and interpretable in order to build trust and confidence in their adoption. This is a very wide topic, but I’m hoping I can gently introduce the basis of the ideas in these posts.

Games Facilitate Learning

As many of my readers likely know, I’m a big fan of game examples in testing. As I said at one point, testing games is hard. I’ve also indicated that a certain idea of the breadth of testing can be seen best in a game context.

In my experience, game testing is one of the best ways to hone a series of instincts and intuitions around the testing craft, even if you’re not a gamer. I’ve seen this in my career. I’ve seen it as a contract game tester for various game studios. In fact, I took it so much to heart that I once wrote a game called Test Quest to use as a fun way to conduct interviews.

Here I want to look at how the thinking around AI can be helped by considering a particular gaming context. Now, technically, you could choose any game you wanted here. I’m going to choose to focus on the so-called “Soulsborne” or “Souls-like” titles. These are games that have their mechanics rooted in Dark Souls, which had its mechanics rooted in the earlier Demon’s Souls.

To make this as current as possible with a game that a lot of people played relatively recently, I’ll focus on Elden Ring. I have history with this game anyway, as you can read in my ludonarrative testing posts.

I realize not everyone may be a gamer. But I think games are intrinsically understandable, even if you don’t play them. My goal here is to help people, testers in particular, conceptualize AI around an activity that they might be familiar with already or at least one that they can relate to.

If nothing else, you can imagine this like any other scenario where you’ve had to step in and work on an application you’ve never seen before and you have to learn about it. In fact, consider that an AI would be in no better position than you: it’s entirely unfamiliar with the game and, like you, has to learn about it.

The Game Context

First, let’s get some of the basics of Elden Ring down just so everyone has the same level of understanding.

In Elden Ring, players of the game build a character and that character will have certain attributes, like strength, dexterity, endurance, intelligence, faith and so on.

[Image: Attributes for a player in Elden Ring]

The gameplay loop is essentially that players have the ability to level up their character’s attributes to become stronger and take on challenging enemies, including so-called “boss characters.” Basically very powerful enemies! The game features a leveling system where players earn “runes” by defeating all those enemies. Runes serve as a form of experience points within the game.

[Image: Elden Ring runes allow a player to level up attributes.]

By allocating runes to those attributes I just mentioned, players can customize their character’s build and playstyle. The leveling system lets players increase their survivability or their damage output, or gain access to various supporting elements like sorcery-based spells and faith-based incantations. As players progress through the game, they can choose how to distribute their runes strategically based on their preferred playstyle and the challenges they encounter.

[Image: Example of leveling up attributes.]

In the above image, based on the numbers shown in blue, I’m leveling up from level 9 to 13 and that’s being done by increasing my strength from 12 to 15 and my endurance from 13 to 14.
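The arithmetic there is simple enough to write down. Here’s a small sketch, assuming (as in the game) that each attribute point costs exactly one character level. The function name is my own invention for illustration.

```python
# Attribute values before and after the level-up described above.
before = {"strength": 12, "endurance": 13}
after = {"strength": 15, "endurance": 14}

def levels_gained(before: dict[str, int], after: dict[str, int]) -> int:
    """Each attribute point raised accounts for one character level."""
    return sum(after[name] - before[name] for name in before)

start_level = 9
end_level = start_level + levels_gained(before, after)
print(end_level)  # 13
```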

Given the combat nature of the game, it features a variety of weapons. Some of those weapons are said to scale with the attributes. So, for example, a greatsword may scale with the strength attribute, meaning the weapon will do more damage the higher my strength attribute is. A katana, on the other hand, will scale with dexterity. Those weapons can also be strengthened; thus the “greatsword” can become a “greatsword + 10”, which means it was strengthened ten times.

[Image: Heavy greatsword strengthened to 24.]

I have a Heavy Greatsword that’s been strengthened twenty-four times and you can see (from the “Attribute Scaling” section) that it scales with my strength (“Str”) in what’s called an “A” scaling.

You can also coat your weapon with certain things like fire or poison which allows your weapon to inflict that kind of damage on an enemy.

[Image: Fire grease to put on a weapon.]

Certain enemies might be particularly susceptible to certain types of damage while others may be entirely immune to it. You also have various things that you can craft, like sleep arrows or fire pots, all of which can be shot at or thrown at an enemy.

One more crucial element to understand about the game is that the enemies all have specific move sets and attack patterns. Essentially, they have a particular style of fighting that they will use against the player. A large part of the game is learning these move sets and attack patterns and figuring out how to counter them.

Everything I just said here that you would have to learn is also what an AI would have to learn if it were going to engage with the game.

Be The Player

Whether you’re an AI that has to learn or a human that has to learn, consider that you’re about to encounter some boss and you’re encountering it for the first time.

[Image: A character encountering a boss character in the game.]

So what are you going to do? What’s your plan? Given that you know next to nothing about this particular boss, what’s your strategy? How do you learn? How do you optimize what you learn? I can all but guarantee this is what you’re going to see a lot:

[Image: The famous 'You Died' message from the game.]

I barely made a dent in this guy (notice how little of his red bar is gone) and he mopped the floor with me. In fact, these Crucible Knight characters are so hard for me that they’re actually a large part of why I wrote the AI tester that I did, which we’ll get into in part two of this post. Essentially, though, I wanted to test the “fairness” of these characters.

And if you’re thinking: “But, Jeff, fairness is a subjective thing!” — well, you’re right! It is. Quality has objective and subjective components. And that’s an interesting thing to consider when you apply AI to any context where people will make determinations about their perception of quality.

How did that AI tool I wrote learn? Well, very similarly to how a human had to learn.

Applying Learning To Git Gud

For those who don’t know, “Git Gud” (read: “get good”) is the dismissive reply you often get from experienced “Soulsborne” players when you describe your woes to them.

In my previous post I mentioned the idea of iterative optimization algorithms and stochastic gradient descent. Let’s apply those terms broadly here.

As a player, you wander into a boss encounter. If you’ve never encountered this boss before, you likely have very little idea of what’s going to happen. The boss is going to have a particular weapon (sometimes more than one), a particular move set, particular abilities, and so on.

So you’ll probably go in and just try various things. You’ll try to see how much damage you can do with your weapon. You’ll see how fast the boss is and how they attack. You’ll probably die. Many times.

But each time you encounter the boss, you learn something. You see how the boss moves. You see how it uses its weapons. You learn what the boss might be vulnerable to. You see which of your attacks work and which don’t. You see how much you have to dodge roll or you might see how ineffective such rolling is, given the speed of the boss.

This is iterative optimization in action. Now think of the boss encounter as a hill you are climbing down. Getting to the bottom means you have no trouble with the boss. But getting to the bottom of the hill means taking steps of a certain size. Consider this:

[Image: A steep gradient descent]

So you’re facing a powerful boss in the game and your goal is to defeat the boss by finding the best strategy. The arrows in that visualization represent the guidance you receive to help you improve your approach.

The length of the arrows represents how much you should adjust your tactics, while the direction shows where you should focus your efforts. If the arrows are short and mostly pointing vertically, it means that you need to make small adjustments in your vertical movements, such as timing your jumps or dodges, or finding the right distance to attack.

The steeper the arrow, the more critical that adjustment is for your success. It indicates that small changes in your vertical movements can greatly impact your chances of winning the encounter.

Going with this “hill” metaphor, the objective of gradient descent is to reach the bottom of the hill because that corresponds to the minimum value of what’s called a “cost function.” The bottom of the hill represents the optimal solution or the point where the cost function is minimized.
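To make the hill a little more concrete, here’s a minimal sketch of gradient descent in Python. The cost function is entirely made up for illustration (a simple bowl whose bottom sits at 3), and the learning rate and step count are arbitrary choices of mine, not anything taken from the game or from a real model.

```python
def cost(x: float) -> float:
    """Hypothetical cost: a simple bowl whose lowest point is at x = 3."""
    return (x - 3.0) ** 2

def gradient(x: float) -> float:
    """Derivative of the cost; its sign tells you which way is uphill."""
    return 2.0 * (x - 3.0)

def gradient_descent(start: float, learning_rate: float = 0.1, steps: int = 100) -> float:
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)  # step downhill, against the gradient
    return x

bottom = gradient_descent(start=10.0)
print(round(bottom, 4))  # very close to 3.0, the bottom of the hill
```

The learning rate is the “step size” from the hill metaphor: too small and you crawl down the slope, too large and you can overshoot the bottom entirely.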

In the plot you see above, the curve represents the shape of the hill, and the arrows represent the path you would take to descend. Notice that the arrows are short and mostly pointing vertical. This indicates that the steepest change or gradient is in the vertical direction. In other words, the objective function or cost function is changing rapidly along the vertical axis.

But what does that mean? When we say that the objective function or cost function is changing rapidly along the vertical axis, it means that small changes in your vertical movements or actions in the boss encounter can result in significant changes in the difficulty or outcome of the battle.

But what does “vertical movements or actions” mean?

The vertical axis can represent various aspects of the encounter, such as timing your jumps, adjusting your positioning, or choosing the right moment to attack or defend. These vertical movements or actions directly affect your success in defeating the boss.

Okay, so if our goal is to descend the slope and if the arrows are what we use to guide our steps, what does it mean when the arrows are all pointing up?

If all the arrows in the visualization are pointing up, it indicates that the current steps or adjustments you’re making are moving you away from the optimal solution or descending direction. In the context of descending the slope in the gradient descent metaphor, this situation implies that you are moving in the wrong direction and need to adjust your strategy.

Put another way, the gradient itself points in the direction of steepest ascent, which is why gradient descent takes its steps in the opposite direction. When the arrows are pointing up, it means you’re following that uphill direction and being led away from the desired goal. It suggests that the current set of parameter values or actions you’re taking is actually increasing the objective function or cost function instead of decreasing it.
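A tiny sketch shows the difference the sign of the step makes. Again, the cost function here is a made-up example, and the starting point and step size are arbitrary.

```python
def cost(x: float) -> float:
    return x * x  # hypothetical cost, lowest at x = 0

def gradient(x: float) -> float:
    return 2.0 * x  # points uphill: positive when x is to the right of the bottom

x = 4.0
step = 0.1
uphill = x + step * gradient(x)    # following the gradient: the cost goes up
downhill = x - step * gradient(x)  # going against the gradient: the cost goes down

print(cost(uphill) > cost(x) > cost(downhill))  # True
```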

In the context of a boss encounter in Elden Ring, this would imply that your current tactics or actions are not effective in defeating the boss. It suggests that you need to reassess your approach, make different decisions, or adjust your timing and positioning to find a better strategy.

I repeated some elements of my explanation there to bring the points home but what you initially end up with is something like this:

[Image: A steep gradient learning.]

In this example, the x-axis represents the encounter number, starting from 1, and the y-axis represents the progress level. The progress level indicates the level of knowledge and skill the player has acquired in the boss fight, with 0 representing the least knowledge and 1 representing complete mastery.

Eventually, you level up more, you strengthen your weapons, you learn the patterns and, effectively, you get better. (Or “git gud,” if you prefer.) Your gradient descent gets less steep.

[Image: A less steep gradient descent.]

Here the gradient descent visualization shows multiple hills and arrows of varying lengths and orientations. And what this indicates is the presence of multiple local optima or suboptimal solutions within the optimization landscape.

Here “local optima” refers to points in the optimization landscape where the objective function reaches a relatively low value compared to its immediate neighbors but may not be the absolute lowest point in the entire landscape. The optimization landscape is a visual representation of how the objective function behaves across different parameter values, showing the peaks, valleys, and flat regions that impact the optimization process. In the context of Elden Ring, this landscape represents the effectiveness of different strategies or tactics in defeating the boss, with different points corresponding to different combat approaches and the resulting outcomes.
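Here’s a sketch of what getting caught in a local optimum looks like. I’m assuming a made-up cost function with two valleys, one deeper than the other. Where you end up depends entirely on where you start, much like a player who settles on a strategy that works “well enough” and never finds the better one.

```python
def cost(x: float) -> float:
    """Hypothetical landscape with two valleys; the left one is deeper."""
    return x ** 4 - 3.0 * x ** 2 + x

def gradient(x: float) -> float:
    return 4.0 * x ** 3 - 6.0 * x + 1.0

def descend(start: float, learning_rate: float = 0.01, steps: int = 2000) -> float:
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

left = descend(start=-2.0)   # settles near x = -1.30, the deepest valley
right = descend(start=2.0)   # settles near x = 1.13, a shallower local valley

print(cost(left) < cost(right))  # True: the right-hand start found only a local optimum
```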

When the arrows are longer and tilting horizontally, as you see in the above plot, it suggests that the steepest change or gradient is occurring more prominently along the horizontal axis. This implies that the optimization process involves adjusting the parameter values or actions primarily in the horizontal direction to find the optimal solution.

In the context of a boss encounter in Elden Ring, this scenario can be interpreted as the presence of multiple viable strategies or approaches to defeat the boss. Each hill represents a different strategy or set of tactics that may yield relatively good results, but they may not be the absolute best solution.

The longer arrows here suggest that making horizontal adjustments or decisions can significantly impact the outcome or difficulty of the encounter. It indicates that finding the right timing, positioning, or movement in the horizontal aspect of the battle is crucial for success.

So, wait. The terminology may be odd here. In the case of Elden Ring, when I talk about the “vertical aspect” and the “horizontal aspect”, how are those distinguished? Do I mean literal vertical and horizontal movement in the game? No, I don’t. Or, at least, not directly.

  • The vertical aspect refers to actions or movements that involve changes in vertical positioning or timing. It relates to vertical movements such as jumping, rolling, or dodging, as well as vertical attacks or defenses. Timing your movements accurately, adjusting your vertical positioning to avoid enemy attacks, and landing successful vertical strikes are examples of actions associated with the vertical aspect.
  • The horizontal aspect, on the other hand, pertains to actions or movements that involve changes in horizontal positioning or tactics. It includes actions such as strafing, circling around enemies, adjusting the distance between you and the boss, or choosing the appropriate angle of attack. The horizontal aspect focuses on positioning yourself effectively, maneuvering to exploit enemy vulnerabilities, and executing horizontal attacks or defenses.

Distinguishing between the vertical and horizontal aspects helps to categorize the different types of actions or movements you can perform in the game. By recognizing these aspects, you can analyze the role of each aspect in the boss encounter and identify specific strategies or adjustments to optimize your performance.

Eventually you end up with this:

[Image: A gradient that shows more learning.]

The plot shows how the progress level has significantly increased compared to the previous example. The player has become more adept at recognizing and countering the boss’s attacks, resulting in a faster progression towards mastery. The marker points (represented by red circles) are connected with lines to visualize the learning progress over the encounters.

This plot demonstrates the improvement the player has made in understanding the boss’s mechanics, learning effective strategies, and executing precise timing and maneuvers. It reflects the player’s growing expertise and ability to handle the challenges posed by the boss.

So what I just described here from the player perspective is pretty much identical to how it would have been described from an AI perspective.

Leveling is Like Training

When we talk about AI systems that train, what we’re really saying is something fundamentally simple: the more you do something, and the more useful and accurate feedback you get, the better you can progressively become at whatever it is you’re trying to do, assuming you have a defined outcome to aim for.

Maybe that’s diagnosing diseases. Or driving a car. Or recognizing text. Or discerning images. Or, maybe, just playing a game.

[Image: AI learning to play Elden Ring with a failure and success state.]

Drawing a comparison between leveling up in Elden Ring and the training process of an AI system is an interesting analogy. While they operate in different domains, there are some similarities worth exploring.

One is obviously incremental improvement. In Elden Ring, leveling up attributes gradually enhances the player’s abilities and makes them better equipped to face challenges like those boss characters. Similarly, AI systems go through an iterative training process where they learn from data and gradually improve their performance. By making adjustments, refining algorithms, and training on larger datasets, AI models become more accurate and proficient in their tasks over time.

Another area is adaptation and customization. Elden Ring allows players to tailor their character’s attributes to their preferred playstyle. Maybe you want to be a pure caster, for example. Or maybe you prefer a melee-focused build. Similarly, AI systems can be fine-tuned and customized for specific tasks or domains. By adjusting hyperparameters, training data, or model architecture, an AI system can be optimized to perform better on particular tasks, adapting to the specific requirements of the problem it aims to solve.

Yet another area is learning from experience. In Elden Ring, players gain experience by facing challenging enemies and bosses. They learn from their mistakes, study enemy patterns, and refine their strategies. Similarly, AI systems learn from training data and use it to improve their performance. By analyzing patterns and examples in the training data, AI models can extract valuable insights that enable them to make better predictions or generate more accurate outputs.

How about overcoming obstacles? Elden Ring is known for its difficult boss battles that require perseverance and skill. Players often encounter roadblocks and failures before finally triumphing over tough challenges. But how does that apply to an AI? Well, AI systems encounter obstacles during their training process. They may initially struggle to produce accurate outputs but gradually overcome these hurdles as they learn from mistakes and receive feedback from training data, becoming more adept at providing the desired predictions or outputs.

Feedback and evaluation are clearly critical to all of the above. In Elden Ring, players receive immediate feedback on their actions. They can gauge their progress through successful battles where specific strategies on the boss provide positive reinforcement of the approach. Similarly, AI systems require feedback and evaluation to improve. By providing labeled data or reinforcement signals, humans can guide the training process and help AI models correct errors, enhancing their performance over time.

Obviously the analogy here isn’t a perfect match, but I do think it highlights some shared concepts between leveling up in Elden Ring and the training of AI systems. Both involve incremental improvement, customization, learning from experience, overcoming challenges, and — perhaps most importantly — leveraging feedback to enhance performance.

Utilizing AI Terminology

The context I’ve provided you here lets us consider some terms from the AI domain and this will allow me to further reinforce some ideas we talked about in the previous posts.

For example, let’s look at fitness function and loss function.

In Elden Ring, the player’s progress is determined by their ability to defeat challenging enemies and bosses. This can be seen as a fitness function, where the player’s performance is evaluated based on their success in combat. Similarly, in iterative optimization algorithms, there’s a defined fitness function that quantifies the quality or performance of a particular solution or set of parameters. The algorithm aims to improve this fitness value over iterations.

The player’s performance can also be seen as a measure of loss or error. Similarly, in stochastic gradient descent, there’s a defined loss function that quantifies the discrepancy between predicted and actual values. The algorithm aims to minimize this loss by iteratively adjusting the model’s parameters.
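Here’s a sketch of the two ideas side by side. The `Encounter` record and the way I score it are hypothetical, just one plausible way to turn a fight into a number. The loss shown is the standard mean squared error.

```python
from dataclasses import dataclass

@dataclass
class Encounter:
    boss_health_left: float    # fraction of the boss's health bar remaining, 0.0 to 1.0
    player_health_left: float  # fraction of the player's health remaining, 0.0 to 1.0

def fitness(e: Encounter) -> float:
    """Higher is better: reward damage dealt and health kept."""
    return (1.0 - e.boss_health_left) + e.player_health_left

def mean_squared_error(predicted: list[float], actual: list[float]) -> float:
    """Lower is better: average squared gap between prediction and reality."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

first_try = Encounter(boss_health_left=0.9, player_health_left=0.0)
later_try = Encounter(boss_health_left=0.0, player_health_left=0.4)

print(fitness(later_try) > fitness(first_try))       # True
print(mean_squared_error([1.0, 2.0], [1.0, 4.0]))    # 2.0
```

The key difference is direction: you try to push a fitness value up, while you try to push a loss value down.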

Let’s consider parameters and specifically updating or adjusting parameters.

In Elden Ring, players have the ability to allocate runes to different attributes, adjusting their character’s parameters to suit their playstyle. For example, I know that if I want to be a caster, I really have to focus on my intelligence attribute. But if I want to carry around colossal weapons, I need to focus on my strength attribute.

Similarly, an iterative optimization algorithm adjusts the parameters or variables of a model or system in each iteration to optimize the fitness function we just talked about. This involves exploring different configurations or values and selecting those that improve the performance or output.
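One of the simplest iterative optimization algorithms is hill climbing: try a small change and keep it only if the fitness improves. Here’s a sketch applied to a character build. The fitness function is invented for illustration (it pretends the ideal build for some boss is 30 strength and 20 endurance), and nothing here reflects how the game actually scores anything.

```python
import random

def fitness(build: dict[str, int]) -> float:
    """Hypothetical score with a sweet spot at 30 strength and 20 endurance."""
    return -((build["strength"] - 30) ** 2) - (build["endurance"] - 20) ** 2

def hill_climb(start: dict[str, int], iterations: int = 2000, seed: int = 0) -> dict[str, int]:
    rng = random.Random(seed)
    best = dict(start)
    for _ in range(iterations):
        candidate = dict(best)
        attribute = rng.choice(sorted(candidate))
        candidate[attribute] += rng.choice([-1, 1])  # try a small change
        if fitness(candidate) > fitness(best):       # keep it only if it helps
            best = candidate
    return best

build = hill_climb({"strength": 12, "endurance": 13})
print(build)
```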

Likewise, in stochastic gradient descent, the model’s parameters are updated iteratively based on the gradient of the loss function that we just talked about with respect to those parameters. The updates aim to minimize the loss by moving in the direction that reduces the error.
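Here’s a minimal sketch of stochastic gradient descent fitting a straight line. The training data is made up (it follows y = 2x + 1 exactly), and the learning rate and step count are arbitrary values I picked so the example settles quickly.

```python
import random

# Hypothetical training data following y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]

def sgd(data, learning_rate=0.02, steps=5000, seed=0):
    rng = random.Random(seed)
    weight, bias = 0.0, 0.0
    for _ in range(steps):
        x, y = rng.choice(data)          # "stochastic": one random sample per step
        error = (weight * x + bias) - y  # prediction minus actual
        # Gradient of the per-sample loss 0.5 * error**2 with respect to each parameter.
        weight -= learning_rate * error * x
        bias -= learning_rate * error
    return weight, bias

weight, bias = sgd(data)
print(round(weight, 2), round(bias, 2))  # close to 2.0 and 1.0
```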

How about exploration and exploitation? This is a favorite topic of mine!

In Elden Ring, players often need to encounter various enemies, experiment with different strategies, and learn from their experiences to progress. Similarly, in an iterative optimization algorithm, there’s a balance between exploration and exploitation. The algorithm explores different configurations or solutions, seeking potentially better options, while also exploiting promising solutions found thus far to maximize the chances of optimization.

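Stochastic gradient descent has a looser version of this balance. Because each update is based on a randomly drawn sample (or small batch) from the dataset rather than the whole thing, the steps are noisy, and that noise acts as a kind of exploration. Following the gradient computed from each sample is the exploitation part: the algorithm uses the information it just gained to improve the model’s performance.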
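A classic way to illustrate the exploration and exploitation trade-off is the epsilon-greedy rule: most of the time use the strategy that has worked best so far, but occasionally try something at random. Here’s a sketch. The strategy names and win rates are invented, and the learner never sees those win rates directly, only wins and losses.

```python
import random

# Hypothetical win probabilities for three strategies against one boss.
TRUE_WIN_RATE = {"dodge_and_poke": 0.15, "shield_and_counter": 0.70, "ranged_spells": 0.35}

def epsilon_greedy(attempts: int = 5000, epsilon: float = 0.1, seed: int = 0) -> dict[str, int]:
    rng = random.Random(seed)
    wins = {name: 0 for name in TRUE_WIN_RATE}
    tries = {name: 0 for name in TRUE_WIN_RATE}
    for _ in range(attempts):
        if rng.random() < epsilon or not any(tries.values()):
            choice = rng.choice(list(TRUE_WIN_RATE))  # explore: try anything
        else:
            # exploit: pick the strategy with the best observed win rate so far
            choice = max(tries, key=lambda n: wins[n] / tries[n] if tries[n] else 0.0)
        tries[choice] += 1
        if rng.random() < TRUE_WIN_RATE[choice]:
            wins[choice] += 1
    return tries

tries = epsilon_greedy()
print(max(tries, key=tries.get))  # the strategy the learner leaned on most
```

With no exploration at all, the learner can lock onto the first strategy that happens to win once. With too much, it keeps wasting attempts on strategies it already knows are poor.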

And, of course, there’s learning from the feedback that we get. Elden Ring requires players to learn from their failures and adjust their approach based on the feedback they receive. Similarly, an iterative optimization algorithm utilizes feedback from the evaluation of different solutions to refine and improve subsequent iterations. The algorithm learns from the fitness values or feedback provided to guide the optimization process.

In similar fashion, stochastic gradient descent utilizes feedback from the loss function to update the model’s parameters. By examining the gradients, the algorithm learns from the discrepancies between predicted and actual values, guiding the parameter updates.

Finally, let’s talk about convergence and optimization.

In Elden Ring, players aim to improve their character’s attributes and skills to overcome challenges more effectively and progress through the game. Similarly, an iterative optimization algorithm strives to converge towards an optimal solution by iteratively improving the fitness value. The algorithm continues to refine and adjust the parameters until it reaches a satisfactory or optimal state based on the defined criteria.

Stochastic gradient descent, in a very similar way, attempts to optimize the model’s performance by iteratively updating the parameters. The algorithm continues to adjust the parameters until convergence is achieved, minimizing the loss function and improving predictions or outputs.
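In practice, “until convergence is achieved” needs a concrete stopping rule. One common choice, sketched here with a made-up cost function, is to stop once a step no longer improves the loss by more than some small tolerance. The tolerance and step cap are arbitrary values of mine.

```python
def cost(x: float) -> float:
    return (x - 3.0) ** 2  # hypothetical cost, lowest at x = 3

def gradient(x: float) -> float:
    return 2.0 * (x - 3.0)

def descend_until_converged(start, learning_rate=0.1, tolerance=1e-9, max_steps=10_000):
    x = start
    for step in range(1, max_steps + 1):
        new_x = x - learning_rate * gradient(x)
        improvement = cost(x) - cost(new_x)
        x = new_x
        if improvement < tolerance:  # the loss has stopped changing meaningfully
            return x, step
    return x, max_steps              # gave up without converging

x, steps_taken = descend_until_converged(start=10.0)
print(round(x, 3), steps_taken < 10_000)
```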

What I hope you can see here is yet another contextual example using the terms I provided in the previous post.

Representation Space

In that previous post I also mentioned something called a “continuous representation space.” Let’s look at that in our context here.

[Image: Stylized representation space.]

What would be the continuous representation space of a boss’s attacks in Elden Ring for training an AI system? Well, one possible approach is to create a multidimensional feature space capturing various aspects of the attacks. Here’s a conceptual example:

[Image: Dimensions of a boss encounter in an abstract space.]

Here the big block at the far left is all the input data, which, in this context, would be a full set of data about a combat scenario. That data is then further refined and broken down until, at the far right, we get some sort of output. That output would be the decision to take based on all that input data. But what is that data specifically? Here are some examples:

  • Timing: This dimension represents the timing of the attacks, including windups, durations, and recovery periods. It could be represented as a continuous value indicating the time intervals for different attack phases.
  • Animation: This dimension captures the animations associated with different attacks, including attack types (slashes, thrusts, etc.), attack trajectories, and attack speeds. It could be represented as a vector or a combination of relevant animation parameters.
  • Hitboxes: This dimension represents the hitboxes associated with each attack, indicating the areas in space where the boss’s attacks can hit the player or the player can hit the boss. It could be represented as a set of spatial coordinates defining the hitbox boundaries.
  • Damage and Effects: This dimension captures the damage potential and additional effects (e.g., bleeding, poison, fire) associated with each attack. It could be represented as a continuous value indicating the damage output and a categorical variable denoting the presence of specific effects.
  • Behavior Patterns: This dimension represents the boss’s behavior patterns or attack sequences, indicating any recurring patterns or combinations of attacks. It could be represented as a sequence of discrete actions or a sequence of corresponding features.

These dimensions form a continuous representation space that encompasses various aspects of the boss’s attacks. AI systems, just like humans, can then train on this representation to learn patterns, make predictions, or generate strategies.
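As a sketch of what one point in that space might look like, here’s a single attack flattened into a list of numbers. Every field name and value is hypothetical, since I’m not showing real game data here. The idea is just that continuous values go in as-is and a categorical value like the effect becomes a one-hot block.

```python
from dataclasses import dataclass

EFFECTS = ["none", "bleed", "poison", "fire"]  # illustrative effect categories

@dataclass
class BossAttack:
    windup_seconds: float
    active_seconds: float
    recovery_seconds: float
    attack_speed: float
    hitbox_radius: float
    damage: float
    effect: str

    def to_vector(self) -> list[float]:
        """Flatten the attack: continuous values as-is, the effect as one-hot."""
        one_hot = [1.0 if self.effect == name else 0.0 for name in EFFECTS]
        return [self.windup_seconds, self.active_seconds, self.recovery_seconds,
                self.attack_speed, self.hitbox_radius, self.damage, *one_hot]

# A made-up slash attack that causes bleed.
slash = BossAttack(0.8, 0.3, 1.2, 1.5, 2.0, 350.0, "bleed")
vector = slash.to_vector()
print(len(vector))  # 10
```

A behavior pattern would then be a sequence of these vectors, one per attack in the combo.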

Training methods such as reinforcement learning, supervised learning, or unsupervised learning can be applied depending on the specific objectives and available data. In the context of games, reinforcement learning is probably one of the more common techniques used.
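To give a flavor of reinforcement learning, here’s a toy tabular Q-learning sketch. It’s a drastic simplification I made up for illustration: the boss has only two states, the player has only two actions, and the rewards are invented.

```python
import random

STATES = ["boss_winding_up", "boss_recovering"]
ACTIONS = ["dodge", "attack"]

def reward(state: str, action: str) -> float:
    """Hypothetical rewards: attacking into a windup gets you hit,
    attacking during recovery lands a blow, dodging is neutral."""
    if action == "attack":
        return 1.0 if state == "boss_recovering" else -1.0
    return 0.0

def q_learning(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                        # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        r = reward(state, action)
        next_state = rng.choice(STATES)  # the boss's next move is random in this toy
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward reward plus discounted future value.
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
        state = next_state
    return q

q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
print(policy)
```

The learned policy is the one any Souls player arrives at eventually: dodge during the windup, attack during the recovery.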

So … How About Some Testing?

Whew, that’s a lot of theory! In part two of this post, I’ll show you how I put some of that theory into practice with an AI testing tool I wrote that was executed against Elden Ring for the purposes of discovering particular qualities about the boss encounters. We’ll see what the AI tester was good at finding but also what it was not so good at finding.

This article was written by Jeff Nyman
