Focused vs. Open Playtesting
How to know when playtesting should be more directed or more hands off
This month on the pod we turned our attention toward something that may seem to be ancillary to game design but is actually fundamental: playtesting. We talk about how, ideally, a kind of feedback loop develops between devs and players. The devs create something, hand it off to the player, who plays what they’ve been given and, in the process, reveals something about the design that the devs themselves either didn’t immediately see or only suspected might be the case. This then informs the next iteration of the design, which the devs again hand off to the players, who reveal something, and so forth until the game reaches a state where the devs are comfortable shipping it. Everyone gets a little something out of the process. The player gets to see something new and fresh as well as make a meaningful contribution to a game they want to love. The devs, likewise, develop a stronger rapport with their playerbase and get the kind of crucial feedback they need to make their game a success.
But I should be clear that when I say playtesting, I don’t really mean quality assurance (QA), which is generally done by professional testers who intentionally try to break various aspects of the game, so that bugs can be addressed before launch. That’s not to say playtesters never identify bugs or break things, as alpha and beta builds often have a feedback system integrated into the version of the game the players see, but that is not the fundamental purpose of playtesting. When you have someone playtest a game, you want, first and foremost, to see what their ordinary play experience looks like and the many forms that can take. You can track which parts of the game they subjectively assess as more exciting, and generate an interest curve to see the overall sweep, the highs and lows, of your game.
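To make that concrete, here’s a minimal sketch of what turning those subjective ratings into an interest curve might look like. Everything here is hypothetical: the segment names, the 1-to-5 excitement scale, and the tester ratings are stand-ins for whatever your own surveys or session notes actually produce.

```python
# Sketch: averaging per-segment excitement ratings into a rough interest curve.
# Segment names and ratings are made up; substitute your own playtest data.
from statistics import mean

# Each tester rates each segment of the session from 1 (bored) to 5 (thrilled).
ratings = {
    "tutorial":      [3, 2, 3, 3],
    "first_combat":  [4, 4, 5, 4],
    "hub_exploring": [2, 3, 2, 2],
    "boss_entrance": [5, 4, 5, 5],
    "boss_fight":    [4, 5, 4, 3],
}

print("Interest curve (average excitement per segment):")
for segment, scores in ratings.items():
    avg = mean(scores)
    bar = "#" * round(avg * 4)  # crude text plot: 4 characters per rating point
    print(f"{segment:>14} {avg:4.1f} {bar}")
```

Even a crude plot like this makes the highs and lows of a session easy to eyeball at a glance.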
As tempting as it may be to wrench subjective assessments into a rigorously data-driven analysis, sometimes it’s more appropriate to just sit back and see what a player does. In an open world game, where you hand your player an environment and a set of tools/mechanics, it might be to your detriment to direct the play experience in order to receive more concrete, specific feedback. It’s a forest-for-the-trees problem. You might find through directed feedback that players enjoy all the individual pieces of a game: the mechanics, the gameplay encounters, etc. Then you ship the game and suddenly discover all the reviews compliment those mechanics while taking you to task for creating something that never really coheres and that players quickly lose interest in. In this instance, it would have been better if you had also just let the player do their thing and gotten their general impressions afterwards.
Much of this is the purview of UX research, which I may get into in greater detail in a future post. For the time being, we’re going to focus on the relative benefits of directed and open playtesting, so you can know when to do which. Ideally, you should always be doing both. When you get directed feedback from a player as well as their overall vibe, you can actually integrate the two into a more holistic picture of what’s going on in your game. Subjective and objective measures don’t have to be at odds with one another.
Opening Up to Player Feedback
I want to begin this section with a bit of a plea. I wish this weren’t the case, but modern, mass-market game companies are not always open to what players have to say, especially when that feedback is negative. Your players are telling you the excessive microtransactions make the game feel like it’s pay-to-win, but the hedge fund that owns your company is saying you need to pump x millions from this game, or they’re going to shutter the studio. Your incentives are perverse, even in a situation where you might agree with what players are saying. Players, on the other hand, don’t always find the most generous way to communicate their feelings, and oftentimes they might know that they don’t like something but can’t really articulate why. The more the two parties fail to communicate—meaning the devs and the players… fuck the hedge fundamentalists—the worse the rapport between them and the less likely you as a designer are to get the kind of feedback you can actually do something with.
With this in mind, you have to understand that playtesting is as much a relationship as it is a means to evaluate gameplay… or, again, ideally it should be. This means long before you ever set a prototype or a beta build in front of your player, you need to be certain that you have the kind of rapport with them to produce the mutually beneficial outcomes that should drive playtesting. This is why, in the pod, we suggested you should always start with a trusted inner circle of knowledgeable testers. Prototypes can be in a rough state, and you need to know that the people who will be playing it care as much about you and your success as they do about their personal satisfaction in gameplay. As the game moves toward a more polished state, this circle can broaden, and you can start to garner a more diverse array of opinions.
This is also, then, a plea to look beyond players who are too similar to you or who might be considered the “typical” playerbase for the kind of game you’re making. I’m not saying that if you’re making a collectible trading card game you should ignore, say, Magic: The Gathering players, but part of the success of Hearthstone came directly from engaging a broader playerbase who were less tolerant of the abstruseness and walls of text you sometimes see in competitive card games. And growing your playerbase is essentially a win-win scenario, so I really don’t understand why you occasionally see studios pigeonholing themselves solely for the purpose of appealing to a particular demographic.
When it comes to the nuts and bolts of open playtesting, it’s pretty straightforward, at least initially. You hand the player a build, let them play, and mostly leave them alone. Mostly. First, as they play, you’ll want to closely observe what players are spending more or less time doing. Are they completely ignoring a mechanic or encounter? Why? Be sure to take copious notes as you go, so that later you can decipher why players were spending more or less time on particular aspects of gameplay. Is it because they seemed to be really into it, or was it because that aspect was confusing, and they had to spend an inordinate amount of time figuring it out? Resist the temptation to intervene, even where players appear to be getting frustrated. That frustration is an important indicator of areas where your game might need more attention.
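If you want something to check your notes against, it can also help to tally how long testers spend on each activity. Here’s a minimal sketch, assuming a hypothetical session log of (timestamp, activity) entries; whatever telemetry your build actually emits will look different, but the bookkeeping is the same idea.

```python
# Sketch: tallying time spent per activity from a hypothetical session log.
# Each entry marks the moment the tester switched to a new activity.
from collections import defaultdict

session_log = [
    (0,    "movement_tutorial"),
    (95,   "grapple_hook"),     # mechanic we suspect is confusing
    (410,  "combat_arena"),
    (700,  "grapple_hook"),     # they came back to it -- interest or confusion?
    (880,  "level_exit"),
]
session_end = 930  # seconds

time_spent = defaultdict(int)
for (start, activity), (next_start, _) in zip(session_log, session_log[1:] + [(session_end, None)]):
    time_spent[activity] += next_start - start

for activity, seconds in sorted(time_spent.items(), key=lambda kv: -kv[1]):
    print(f"{activity:>18}: {seconds // 60}m {seconds % 60}s")
```

A table like this won’t tell you whether the time spent was delight or confusion, which is exactly why the observation notes still matter.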
Second, you’ll want to survey players after the fact about their experience. Here you’ll want to resist the temptation to be too specific or ask questions that lead the player to a particular conclusion. Try to make things as open-ended as possible. A question like, “Were you really excited when the boss came crashing down from the sky?” implies they should have felt excited about it, and they might hide their true feelings so as not to buck expectations. You can ask a completely open-ended question like “What did you think about that moment in the game? How did it make you feel?”, or you could go with something a little more directed like, “What aspects of the boss entrance did you enjoy? What do you think could improve things?” Either way, you need to signal to players in your questions that whatever they think is an acceptable answer. And, more importantly, you actually have to believe that even negative criticism is good and have your behavior reflect that.
While open playtesting of this kind has the potential to indicate things you, as a dev, hadn’t previously considered, its drawback lies in how feedback can be contradictory or even vague. Vibes are rarely precise. Players might be able to articulate to you that something isn’t working, but they might not be able to say why they feel that way, which leaves you in the position of having to throw darts at a board in order to find a solution. Moreover, even when feedback doesn’t suffer from vagueness, you can end up with a set of responses that explicitly contradict one another. Then you’ll need to do some further work to figure out what accounts for the difference. This is where focused playtesting comes in.
Directing Gameplay for Great Success!
What distinguishes open from focused or directed playtesting is more or less when you ask the question. Does the player know what you’re looking for beforehand, or will they only find out afterwards? More specifically, though, in focused playtesting you’re asking questions before the player even sees a build, because you want them to focus on a particular aspect of the game and be aware of what they’re supposed to be looking for so that they can provide the precise feedback you need. In this, open and focused playtesting are not opposed to one another, but rather complementary. Open testing might reveal something you need to look at, while focused testing can provide an inroad into understanding why the thing that doesn’t work isn’t clicking with people.
Let’s say we’re making a boomer shooter meets bullet hell game called Martian Demon Hunter Eternal, and in our initial playtesting we discover our players are unanimously having problems getting through this one particular level. We could just tune everything down by spawning fewer demons or having them shoot less often, but at the same time it’s a bullet hell game. You’re supposed to feel at least a little pressed, so we want to see if there’s something specific in the level that’s hampering a player’s progression.
So, we tell them going in: this level is hard. We want you to complete it, but there’s a high likelihood you won’t. As you play, pay attention to what moments in the level you find especially difficult and make a mental note to report back later. Here’s a map of the level. When you get to one of these especially difficult moments, pause the game and mark it on the map. Or maybe we want to focus on the quantity of shots. Then we’d direct the player to note when the volume of incoming fire is just too overwhelming. Whatever we want to be looking at, we direct the player to focus on that thing, potentially to the detriment of other things going on in the level.
From this, we can generate a quasi-objective data set. The map is a good example of this. We’re asking the player to respond to subjective feelings of difficulty, but in such a way that when we combine all the players’ map markings, we can get a kind of heat map of where the difficulties arise. This heat map might draw our attention to one location in particular where nearly every player indicated that the volume of fire was just too much. Then we can tweak the number of shots just a little, say, to 75% of our original value, in only that location, hand the build back to the players, and now we find level completion rates much closer to what we would want or expect, even given our assumptions about the game’s overall intended level of difficulty.
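Here’s a minimal sketch of what that aggregation could look like in practice. I’m assuming, purely for illustration, that each marking comes back as an (x, y) position on the level map; binning those positions into a coarse grid gives you the heat map, and the 0.75 multiplier is the same kind of targeted tweak described above.

```python
# Sketch: binning testers' "too hard here" map markings into a coarse heat map.
# Coordinates and grid size are hypothetical; use whatever your level map provides.
from collections import Counter

CELL = 10  # each heat map cell covers a 10x10 patch of the level

# (x, y) positions where testers paused and marked the map as overwhelming.
markings = [
    (12, 34), (14, 31), (13, 36), (11, 33),  # cluster near one arena
    (52, 8),                                  # lone outlier
    (12, 38), (15, 30),
]

heat = Counter((x // CELL, y // CELL) for x, y in markings)
hotspot, count = heat.most_common(1)[0]
print(f"Hottest cell: {hotspot} with {count} markings out of {len(markings)}")

# Targeted tweak: scale down projectile volume only in the hotspot cell.
fire_rate_scale = {hotspot: 0.75}  # 75% of the original value, everywhere else untouched
print(f"Fire-rate overrides: {fire_rate_scale}")
```

In a real pipeline the override would feed back into the level’s encounter data rather than a print statement, but the shape of the analysis is the same.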
Where focused playtesting shines is precisely in these situations where what we need is a tweak here and there, not a complete overhaul of a level or mechanic. If there is a more fundamental concern in this level of Martian Demon Hunter Eternal, directed testing isn’t likely to find it. In fact, by telling players to focus on something, we may actually distract them from the symptoms of an underlying problem that needs to be rooted out entirely. Perhaps—and I’m just spitballing here—making a boomer shooter bullet hell wasn’t a great idea to begin with.
But much like the relationship between players and devs, in an ideal world with infinite time and money, you would do both open and focused playtesting in such a way that they inform and build upon one another. In the examples above, I laid out a unidirectional flow, where open testing gives you something like an overview of big-picture concerns, so that down the line you can focus in on specific things that have come to light within that broad framing. However, if we go back to our Martian Demon Hunter Eternal example, we might find through a series of testing and tweaking that, no matter what we change or how, we just can’t find the sweet spot between ridiculously difficult and completely trivial, because the issue isn’t a matter of simple balance. Going back to open testing can help us see the forest again.
Generally you want to be hands off with open testing, but one thing to keep in mind is that you can still limit the bounds of what the player has available to them. A single level, like the one above, can not only make for a good vertical slice of all the game’s systems, it can also provide you the opportunity to double dip a bit if you find that this level in particular just isn’t working for people. Through a more open, after-the-fact interview and survey process, you might discover that what players object to is a kind of spiky suddenness in difficulty. Instead of enemies shooting a bunch of crap at you out of nowhere, the level could ramp up over time, like Tetris, letting the player get into a groove or flow that warms them up for the level’s most difficult moments. This is not something you’ll typically see from focused testing, where you’re looking at specific components or mechanics rather than the overall structure.
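If you wanted to prototype that kind of ramp quickly, a sketch like the following is one way to think about it. The linear curve and the specific numbers are placeholders, not recommendations; the point is just that intensity climbs gradually instead of spiking.

```python
# Sketch: ramping incoming fire gradually instead of spiking it out of nowhere.
# The curve shape and numbers are placeholders; tune them against playtest data.

def shots_per_second(elapsed_s: float, base: float = 2.0, peak: float = 10.0,
                     ramp_duration_s: float = 120.0) -> float:
    """Linearly ramp incoming fire from `base` to `peak` over `ramp_duration_s`."""
    progress = min(elapsed_s / ramp_duration_s, 1.0)
    return base + (peak - base) * progress

# The level starts gentle, warms the player up, then holds at full intensity.
for t in (0, 30, 60, 90, 120, 180):
    print(f"t={t:>3}s -> {shots_per_second(t):.1f} shots/sec")
```

Whether the ramp should be linear, stepped, or tied to player performance is exactly the kind of question another round of open testing can answer.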
Conclusion
So, to sum up, all of this has been a long-winded way of saying the following three things:
Open playtesting is defined by waiting to assess player opinions until they’ve completed gameplay and is analytically oriented toward a more subjective, vibey, big-picture view of whatever slice of the game you happen to give them.
Focused playtesting is defined by directing the player before gameplay to pay attention to specific, typically more minute aspects of gameplay and is therefore oriented toward precise, objective-ish feedback related to fixes and tweaks.
Open and focused playtesting are not opposed to one another, but rather form a complementary whole, where those who do playtesting can go back and forth between the two, depending on what each process reveals that might be relevant to the other.
At the end of the day, though, we have to acknowledge that testing cannot be infinite but should be understood as a “good enough” process. No game has ever shipped in a perfect state, Nintendo seal or no, and it’s even possible to get stuck in a kind of endless testing-and-tweaking mentality, where you stop getting enough benefit no matter how much effort you put in. Initial testing will often identify high-value targets for reassessment; later rounds will turn up less and less as the game gets more refined. Make sure you’re doing a periodic gut check just to be certain more testing is really needed.
And… that’s it for this month! We’ll see you back here next time!