9 Types of bug fix that every game tester should know

When a bug is fixed, it’s returned to the test team to verify the fix and perform any additional testing that might be warranted given the details of the fix (from now onwards I’m going to refer to both activities simply as ‘retesting’, even though strictly only the former is retesting). The aim of testing beyond the bug repro steps is to identify knock-on bugs and regressions in other game areas. The scope, type and risk of code bug fixes varies wildly, from one-liner fixes through to complete code system restructures. There's also a wide range of content fixes which bring their own unique risks. Given this huge variation, how do testers assess whether each bug requires additional testing and decide on the focus and depth of that testing?

With thousands of bugs being fixed in a game project, it's critical that each and every tester is able to assess the scope and risk of bug fixes and take the appropriate action.

The volume of bug fixes means that senior QA members aren’t able to personally vet every bug that gets fixed, leaving the responsibility to each tester. Because of this distributed responsibility, training and documentation on effective bug retesting is a strict requirement to avoid missing severe knock-on bugs during retesting.

In this article I’ll start by discussing some bug lifecycle best practices that are a prerequisite to effective retesting and then list the various types of fixes and their attributes that dictate the correct action for each fix type. Armed with this information, testers will be better equipped to identify different types of fixes and ask more targeted questions when requesting additional information.

Bug life cycle prerequisites

To make the best decisions on how to deal with bug retests, testers need detailed information about how the bug was fixed, given to them using terminology that they can understand. Some teams create a specific field for this information when resolving bugs, others leave bug fixers to add updates through comments on the bug. Whatever method your project chooses, bug fixers should be encouraged to add any contextual and relevant information to the bug. This information should describe how the bug was fixed and give guidance on how risky the fix was. The most helpful comments may even suggest game systems which are at highest risk of regressing and should be included in retesting. Some developers I know will even list explicit scenarios that require deeper investigation. Here’s a quick template of the type of information that is helpful during retesting.

  • Builds + Branches: This information describes where the fix can be found.

  • Affected Areas: This information describes those game systems and areas which were modified during the fix or at risk of regressing because of the fix.

  • Notes: This provides any additional context to the fix and may include specific scenarios, technical details, a dev risk assessment, instructions for how to reproduce the bug or special instructions for what to do with the bug after retesting.

  • Bug fix method: This information describes what was changed to fix the bug and will tell the tester if they're expecting a new client build for the bug fix or if it will be delivered another way. Possible types include (but aren't limited to): client code fix, server code fix, server architecture updates, over-the-air asset deployment, over-the-air config deployment, text updates, UI layout changes, balancing data updates, art and asset updates, build machine config changes and 3rd party package updates. 

If you've worked on any game project ever, you'll know this level of detail can be a tall order, and the reality of project life is that testers will receive a mixed bag of helpful and unhelpful fix comments. There needs to be at least a baseline of knowledge transfer here; assuming the QA leadership on your project is promoting this best practice, testers should still be able to tailor their retesting to each bug.

With the prerequisites out of the way, let's dive into different types of fixes and their attributes. 

Trivial code logic fixes

This bucket of fixes most commonly comes from a client developer, as opposed to any other discipline (design, tech art, art, sound, etc.), and arrives in updated client builds. The number of changes (files and lines) required to fix the issue is minimal and the nature of the fix is clear: a component was broken due to a single error, and amending that error has fixed the bug. The attributes of the bug are often that the error was a single, obvious and undisputed mistake. The fix itself is isolated and doesn't have links or dependencies with other game systems. These types of fixes are also well-understood and predictable, meaning that they present limited unknowns and therefore limited risk. Finally, the 'expected result' after the fix is clear: what was previously failing should now not be.

In short, this type of bug fix is what we hope every fix to be: low risk and limited in scope. An 'open and shut case'. Regarding retesting, these bug fixes can be safely verified without further testing. Fixes arrive as commits to new client builds and testers can move freely between fix and non-fix builds.
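As a hypothetical illustration of just how small these fixes can be, here is an invented one-liner: a death check that used the wrong comparison operator, so a player at exactly 0 HP never died. Amending the single operator fixes the bug and nothing else changes.

```python
# Invented example of a trivial one-liner fix. A player at exactly
# 0 HP never died because the comparison excluded zero.

def is_dead(health: int) -> bool:
    # Buggy version was: return health < 0
    return health <= 0  # the one-line fix
```

The 'expected result' here is unambiguous: a retest at exactly 0 HP confirms the fix, and no other system depends on this line.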

Code restructure and larger code logic bugs

This category of fix is applied when there isn't a simple solution to the bug, often because the root cause of the bug isn't simple either. These fixes may also include changes to other non-code files, like assets renamed, regenerated or reorganised to work alongside the new code changes. They are larger in scope, both in files and lines modified, and in the number of game systems that are touched during the fix. Additionally, there is a large range of fix sizes within this category, the largest of which will be classified as new implementation and converted from a Bug type to a User Story type within the project database. Changes may have been made to functionality shared across multiple features, or the developer may have found references from other game systems to the changed code. The far-reaching and complex nature of these fixes means that there are more unknowns about how the fix will impact other dependent game systems, which increases risk and lowers predictability. This is an important point, because it means even the developer may not be able to tell you exactly which systems and scenarios should be included in retesting, or how they might be impacted. Testers should also know that tracing these references through the code base and understanding their impact is often a difficult and time-consuming task for developers.

For these more complex fixes, the player-facing result may be that the buggy behaviour stops occurring, but it’s equally likely that the code restructure changed the player-facing behaviour more holistically, preventing the buggy scenario on a higher level. As the new behaviour diverges further from the original ‘expected result’ of the bug, more work is required to communicate and appropriately test that new design. Developers and other bug fixers should understand that testers will be retesting against the original ‘expected result’ of the bug and may even mark the fix as a fail if the new design isn’t documented in the bug fix comments. These types of fixes require more detail in the bug fix comments and present a stronger case for regression testing of connected game components as well as deeper retesting of the new game behaviour.

Fixes will arrive in new client builds, but where the developer has identified the fix to be sufficiently risky or large, it will also be made in its own code branch, isolated from the main working copy of the game. This scenario requires coordination between dev and test disciplines to retest the bug and test the branch, then report back to the developer with either knock-on bugs or the positive test results. If the fix has created more problems than it was trying to solve, it may be abandoned and the branch left unmerged. 

New implementation fixes

Some bugs can’t be directly fixed because it’s technically infeasible, would require too much time or isn’t aligned with the holistic goals of the project. Bugs are often left unfixed because the buggy game system or dependent technology is planned to be replaced or updated in the future. For example, it wouldn’t make sense to fix individual audio bugs if the audio package the game engine uses is planned to be replaced before the release date, or to spend time fixing some types of 2D asset bugs if the texture atlas packing algorithm is planned to be updated. In these cases, the new system or package is implemented into the project and all affected bugs will be resolved as fixed or ‘requires retesting’. With new implementation, it's likely that existing bugs will no longer trigger but new bugs will be seen. Additionally, developers may not be able to easily predict the effect of the new implementation on each bug, so they can’t give much guidance on whether each bug will actually be fixed or not.

The recommended way to handle new implementations is to have a User Story to track the work and link all affected bugs to it. This guarantees the communication and test effort will be handled by QA analysts and the bug retesting grouped into test tasks for a singular test effort, not left to be assessed separately by each tester. However, for database hygiene and traceability, bug fixers should still leave comments on those bugs that are hoped to be fixed through the new implementation. Testers will need to more carefully assess the new game behaviour in the buggy area and decide whether to close or reopen the bug, and whether new bugs should be logged. Bugs that remain open should be updated with new logs and the details re-checked to make sure the reproduction steps and facts are still accurate. Without care, these types of bugs often turn into a mess of outdated and incomplete information from before and after the new implementation.

This is unfortunately an area where I've seen the most issues. It's too easy for feature authors to assume an update will fix the bugs and simply write "this is now fixed" in the bug comment, without further context that this bug didn't have a specific fix. Likewise, I've seen many cases where a similar, but fundamentally different bug is still seen and the tester has updated the repro steps drastically and marked the bug as a fix fail, when it's actually a new bug with the new implementation.

Like complex fixes, new implementations will arrive in new client builds and possibly also isolated into their own branch.

Defensive coding fixes

This category of fixes is created as a response to exceptions, errors and crashes for which the team doesn't know the root cause (referred to herein as ‘errors’ and ‘error reports’). Throughout development, errors will be triggered by testers, developers and other project team members. Some of these errors might be identified and investigated at the time they are triggered by each team member, while others are identified through crash reporting tools retrospectively. For these error reports, the team is not always able to identify the person who triggered the bug, and so doesn’t know what they were doing at the time the error was triggered and can’t retrieve a full game log or other normal bug artefacts. The limited information available provides the developer with a call stack showing the area and line of the code that failed, the device details, the build details and ‘breadcrumb’ logs which show some details of what happened leading up to the error. Importantly, the player state and exact steps leading up to the bug are not provided. Additionally, which errors are reported and what information is included within error reports needs to have been implemented by the development team ahead of time, meaning that the usefulness of this information varies from project to project.

When tackling these errors, experienced developers can often take an educated guess at the root cause steps which led to the failure and then attempt to reproduce it in their own development environment. In these cases, a way to reliably reproduce the error has been found and the fix will focus on preventing the root cause failure which later led to the error. The repro steps can also be added to the bug report for testers and other team members to follow during retesting. But deducing the root cause of all errors isn’t so easy. When the root cause can’t be found, the developer may instead choose to add additional logical checks around the line of code where the error fired. This prevents the error from triggering but doesn’t solve the root cause failure that led to the error. This is called defensive coding: adding checks to guard against states and scenarios which shouldn’t be possible, but have somehow been triggered elsewhere in the code. This is an important distinction from fixing the root cause failure, and it’s how a developer can fix an issue without even knowing the reproduction steps for it.
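A minimal sketch of what a defensive coding fix can look like, with all names invented: crash reports show a failure when a weapon is somehow missing, but the team can't reproduce how that state arises, so the developer guards against the 'impossible' state instead of fixing the unknown root cause.

```python
# Hypothetical defensive coding fix. Crash reports show equip_weapon()
# failing when `weapon` is None, but nobody knows how the inventory
# produces a None entry. The guard stops the crash without fixing the
# root cause, and logs the occurrence for later investigation.
import logging

logger = logging.getLogger("game")

class Player:
    def __init__(self):
        self.equipped = None

    def equip_weapon(self, weapon) -> bool:
        # Defensive check: this state "shouldn't be possible", but
        # crash reports prove it happens somewhere in the code.
        if weapon is None:
            logger.warning("equip_weapon called with None; skipping")
            return False
        self.equipped = weapon
        return True
```

Note that the root cause (whatever puts None into the inventory) is still out there; the guard only converts a crash into a logged no-op.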

These fixes are made by developers within the game client and will arrive into retesting as new client builds. The good news is the change is often only a few lines of code and low risk, the bad news is that repro steps are still unknown, leaving no possibility to confirm the fix through a retest. I’ve had a lot of testers ask me “how do I retest this bug?” for these issues, and the answer is, you don’t. If we weren’t able to reliably trigger the failure in the first place, any retesting to confirm the fix would be futile. At most, the tester will need to decide whether regression testing around the change is required, depending on the developer comments. This requires some rough guidance on the game area where the change was made. 

To actually get confidence in the fix, the development team will monitor occurrences of this error report within crash reporting tools. If the fix has been successful, occurrences of the error should decrease.

Logging and breadcrumb fixes

This bucket of fixes also deals with errors, exceptions and crashes where no reliable reproduction steps can be found. The changes are actually not attempts to fix the error at all, but add additional diagnostic information into the game so that when the error is triggered again, the code team has more information to deduce the root cause failure. While this isn’t a fix for the bug, many teams will still link the code commit to the bug and resolve it alongside the code change so that it's no longer in the 'to do'/active state. When the error is triggered again, the newly logged information is collected in the bug and it is reactivated and re-triaged.
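A sketch of what such a 'fix' commit might contain, using invented names: nothing about the error changes, the commit only records extra state in a rolling breadcrumb buffer so the next occurrence of the error carries more diagnostic detail.

```python
# Hypothetical breadcrumb "fix". The commit adds diagnostics only;
# the unreproducible error itself is untouched.
from collections import deque

# Rolling buffer of recent events, attached to crash reports.
BREADCRUMBS = deque(maxlen=50)

def leave_breadcrumb(event: str, **details):
    BREADCRUMBS.append({"event": event, **details})

def load_scene(scene_name: str, player_level: int):
    # The new diagnostic line added by the "fix": when the error fires
    # again, the report will show which scene and player state preceded it.
    leave_breadcrumb("scene_load_start", scene=scene_name, level=player_level)
    ...  # existing scene-loading logic, unchanged
```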

This flow can confuse testers because they can see a change was made and the bug resolved, so naturally assume a fix was attempted (and jump into retesting). Additionally, the extra diagnostics are required because a solid repro doesn't exist, leaving the tester at a loss for what to do with the bug. In reality, no retest action is required and a better state for the bug would be 'needs information' rather than resolved. A slightly better bug flow and some good communication in bug comments can avoid wasted time and tester confusion with these fix types.

Performance improvements

Performance bugs and their fixes cover many different performance metrics, including but not limited to: framerate, hardware utilisation %, game loading time, scene loading time, memory usage, storage usage, app size, network performance and server performance. Regardless of the metric involved, performance bugs have a unique attribute in that their state is not binary* (fixed or not-fixed); instead, they are improved through doing work. Sometimes that improvement is enough to meet an agreed quality bar like '30 frames-per-second' which the tester can use to verify the fix concretely. However, the reality is rarely so clear cut. Often, performance improvements are difficult to implement, improving but not fully fixing the negative behaviour. If the fix expectations are misaligned between dev and QA, testers can wrongly reactivate these bugs and cause bug ping-pong, wasting time for everyone. Further improvements with the same fix method are often not technically possible, limiting the extent of improvement of each fix. This means that if the bug is reactivated, the team has to accept the partial fix or go back to the drawing board and try a different approach.

Another unique attribute of performance fixes is that they're often a trade-off between two metrics. Improving one metric may make another worse. For example, moving assets to be downloaded on demand may improve game loading time and reduce app size, but make framerate and network utilisation worse as assets are downloaded asynchronously within the scene. 

These fixes require clear communication from the fixer on how much the bug is expected to be improved by the fix, as well as contextual information about the fix itself which should indicate if a trade-off has been made. Testers should recognise performance bugs as the unique case they are and take care when retesting. If the new result is in doubt, a direct message to the bug fixer may save time over reactivating the bug.

*Ok fine, a small number of performance bugs can be quite binary, but these are an exception to the rule. This can happen if a single piece of logic or feature is behaving badly and causing a massive degradation in performance for a specific scenario. In these cases, fixing the bad behaviour often shows a complete resolution to the affected performance metric.
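The quality-bar idea above can be sketched concretely. The helpers and numbers below are invented for illustration: a fix that improves average frame time from 45 ms to 38 ms is a real improvement, yet still fails a 30 FPS bar, which is exactly the misalignment that causes bug ping-pong.

```python
# Hypothetical check of a performance fix against an agreed quality bar.
# 30 FPS means a frame budget of 1000/30 ≈ 33.3 ms.

TARGET_FRAME_MS = 1000 / 30  # agreed quality bar: 30 frames per second

def meets_bar(frame_times_ms) -> bool:
    # Verify the average frame time against the agreed budget.
    avg = sum(frame_times_ms) / len(frame_times_ms)
    return avg <= TARGET_FRAME_MS

def improvement_pct(before_ms: float, after_ms: float) -> float:
    # How much the fix improved the metric, as a percentage.
    return round((before_ms - after_ms) / before_ms * 100, 1)
```

Whether a ~15% improvement that still misses the bar counts as 'fixed' is precisely the expectation that dev and QA need to align on in the bug comments.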

Content and art bug fixes

This category of fixes comes from non-code disciplines and includes a wide variety of content fixes, including but not limited to: 2D assets, 3D meshes, particle effects, animations, character rigs, object placements, map layouts, game environment changes, UI layouts, camera behaviours, sound effects, dialogue and music. These fixes are interesting to the retesting process because they present different scope, predictability and risk to code fixes. A basic bug example is that of missing assets (of any type) where the fix is to simply include the asset in the project or, if the file exists already, fix the reference to it. Missing assets require very little diagnosis before applying the fix and are mostly very predictable once the fix has been made. These attributes make the retesting process straightforward, verifying that the missing asset is now present in the game. The only added work here is that the newly added asset may have further bugs within it that need to be logged. Generally speaking, content fixes are far more predictable and isolated than code fixes and require no regression testing around the fix.

Many content fixes exhibit similar characteristics to performance bugs in that they're improved instead of fixed. When a content bug is fixed, the asset or area is edited by the author to stop showing the negative behaviour or visual. This work is far more freeform, arty and subjective than logical code fixes. A common example is 3D in-game objects that clip (intersect) one another in a visually undesirable way. The fix is usually to move or remove the objects, which will eliminate or reduce the clipping. The subjectivity and variance of the 'expected result' can cause problems during retesting, most often resulting in testers reactivating bugs because the bug isn't entirely fixed. This bug ping-pong creates wasted time and conflict between the test team and the designer or artist fixing the bug. This wasted time is amplified by the huge numbers of minor content bugs which are logged on every project. These types of fixes don't require lots of context from the fixer; a brief comment on the asset changes is enough to help the tester conduct effective retesting.

Lastly, minor content and art bugs arrive in the thousands for big projects and can include as little information as a screenshot of the bug and the world coordinates where it can be seen, making it quicker to log each bug, driving the total number higher. To fix and retest this scale of bugs requires a unique approach. Teams will choose not to retest bugs individually, but to re-run a test pass on a subset of content and log new bugs for issues seen. Resolved bugs are closed for the area as a batch without individual retests. In some cases, low severity environment bugs may go without a retest at all, because it's simply not worth the time to check such minor issues on such a large scale.

Testers should be aware of the subjectivity of content fixes and of how much time they're spending on the bug. No regression testing is required, but speed of verification is important.

Temporary vs. permanent fixes

As we've discovered so far, the difficulty and scope of bug fixes can differ greatly. In some cases the team knows the most correct way to fix a bug (or a group of bugs) but agree they don't have time to fix it that way right now. If the bug effect is severe enough, the team may decide to apply a quick fix and release it, then replace it later with a more comprehensive fix.

I've seen many examples of bugs like this which only occur for a subset of devices, features, players or AB test configurations. The quick fix is to simply disable the particular feature configuration through a remote data killswitch, then follow up with a code fix in a later client release so the config can be re-enabled. When this occurs, not only will the temporary fix be applied without a new build, but the real fix retest requires a two-step process of re-enabling the data killswitch (undoing the temporary fix) and testing with an updated client build. If not intimately understood, these complexities can trip up testers conducting retesting.
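A minimal sketch of the killswitch pattern described above, with all flag names invented: the broken feature is switched off via remote config (no client build needed), and the permanent code fix ships later, at which point the flag is re-enabled as part of retesting.

```python
# Hypothetical remote-data killswitch used as a temporary fix.
# In a real project this dict would be fetched from a remote
# config service at startup.

REMOTE_CONFIG = {
    # Temporary fix: the new shop UI crashes on some devices, so it
    # is disabled remotely while the permanent code fix is developed.
    "feature_new_shop_ui": False,
}

def feature_enabled(flag: str) -> bool:
    # Unknown flags default to off: failing closed is the safer choice.
    return REMOTE_CONFIG.get(flag, False)

def shop_screen() -> str:
    # The client checks the killswitch at runtime.
    return "new_shop_ui" if feature_enabled("feature_new_shop_ui") else "legacy_shop_ui"
```

Retesting the permanent fix is then the two-step process described above: flip the flag back on (undoing the temporary fix) and verify against the updated client build.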

Even if the temporary fix is a client change to code or assets, the double-fix confuses the regular retesting process because we don't want to close the bug after the temporary fix is applied; we want to keep it to track the later permanent fix. In this scenario, special instructions should be communicated to the tester to not close the bug if the retest shows a positive result, or create two separate bugs to track the temporary and permanent fixes separately, linking them and adding comments to avoid future confusion.

Regardless of the method of the temporary fix (client, data, etc.), the technical context of both temporary and permanent fixes is paramount. Without this information, testers are likely to confuse the two bug fixes, reactivate the wrong bug in the future, mix up the details between the two bugs, incorrectly assess the risk of each fix, not understand the difference between the fixes or trigger a myriad of other negative outcomes.

Updated designs and intended behaviour

These 'fixes' are made outside of the game code and so present very low risk during retesting. The bugs arise when a tester logs a difference between the running game and some point of truth, usually a design document, art reference or technical spec. For some of these bugs, the team will decide that the current game behaviour is the true intended design and the reference document is out of date or less desirable, and the 'fix' is to either update the reference document or mark it as out of date. This might occur because the design details were not feasible to technically implement in the first place. It could also be that the design was simply iterated on and the original document is now out of date.

It's normal and healthy that the running game will become the 'point of truth' over time and so these bugs serve as a record of the agreed and intended game behaviour, even if nothing was actually 'fixed'. Indeed, many teams will simply mark these as 'by design' as a separate bug resolution to a 'fix' without telling the tester that the design was updated on the fly. There is an important distinction here between bugs that were always intended design and bugs that have become the new intended design as a consequence of finding and triaging the bug. 

For this bucket of bugs, retesting involves reviewing the updated document to check that it is now aligned with the running game. Other times the bug can be closed directly.

Conclusions

Bug fixes are not all the same and can't be treated equally during retesting. Furthermore, many testers don't have the visibility or the experience to understand what work goes into a fix and the different game disciplines who commit bug fixes. Educating testers and passing contextual information through bug comments is a key part of an effective retesting process. 

