This page summarizes discussion after my Quality Week 97 talk on 'The Test Manager at the Project Status Meeting'. It covers two topics:
Some pre-talk email discussion
1) Hire the best testers you can find (passion for testing, strong technical skills, energy, etc.) and work hard to keep them. Building a great team can feed on itself. Hiring strong testers makes it easier to hire strong testers.
2) Be an active participant in (or at least attend) any project related activities that have anything remotely to do with test. For example, design reviews. In other words, have a presence whenever possible and make it valuable for test to attend by being an active contributor.
3) As your talk suggests, provide valuable/useful information about the status of the project. For example, if the ship criteria say that the product won't ship while any priority 1, severity 1 bug still exists, but the test results report all problems encountered, then the results might not be meaningful. In this case, presenting test results that indicate failures relative to the ship criteria would probably be more useful. On the other hand, it is important to make people aware that such results don't reflect all failures. Doing this item well will encourage development to become dependent on the test team.
4) When it comes to beta testing, internal dogfood testing, etc., be the driving group behind making these happen. Set the internal deployment milestones. Drive the early adopters program. Create the tools that help debug the product. If it has to do with testing or building a quality product, be the one doing it. When other groups do this, it tends to water down the value of the test team.
5) Be as much an expert in the technology as the developers are. If there are internal or external aliases/news groups, be active answering questions about the technology you test.
You hint at the central issue in your last comment about bug tracking.
The bug tracking system is the vehicle for tracking and communicating the exposed quality (or lack thereof) of the product under test. As you probably (but may not) know, there is one defect tracking system used within all of MS. This defect tracking system has a bit of inflexibility built in by design. For example, the resolutions for a bug are fixed values: By Design, Duplicate, External (to the product), Fixed, Not Reproducible, Postponed, and Won't Fix. This ensures that all defect resolution metrics have value (due to consistency). A similar technique exists for defect severities, which are defined within the tool as:
Severity 1: Bug is crashing, causes data loss, breaks major functionality, or is some other severe problem.
Severity 2: Bug is annoying, contributes to overall instability in this area, crashes in obscure cases, or breaks minor functionality.
Severity 3: Bug is minor, doesn't impair functionality, may affect "fit & finish".
Severity 4: Bug is trivial - a good case for postponement.
These definitions may be slightly more open to interpretation, but are, again, intended for consistency. Additionally, the defect tracking tool requires these fields to be entered for each defect.
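The fixed resolution and severity values described above can be sketched in code. This is a minimal illustration, not the actual tool's implementation; the names `Resolution`, `Severity`, and `validate` are hypothetical:

```python
from enum import Enum

class Resolution(Enum):
    """The fixed resolution values described above."""
    BY_DESIGN = "By Design"
    DUPLICATE = "Duplicate"
    EXTERNAL = "External"
    FIXED = "Fixed"
    NOT_REPRODUCIBLE = "Not Reproducible"
    POSTPONED = "Postponed"
    WONT_FIX = "Won't Fix"

class Severity(Enum):
    SEV1 = 1  # crash, data loss, breaks major functionality
    SEV2 = 2  # annoying, instability in the area, obscure crashes
    SEV3 = 3  # minor, may affect "fit & finish"
    SEV4 = 4  # trivial - a good case for postponement

def validate(resolution: str) -> Resolution:
    # Raises ValueError for any value outside the fixed set --
    # the constraint that keeps resolution metrics comparable.
    return Resolution(resolution)
```

The point of the closed sets is exactly what the email says: every team records the same values, so cross-team metrics stay meaningful.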
The consistent information associated with every bug provides for clear evaluation of the productivity/effectiveness of the test organization (i.e., an opportunity to differentiate your test organization). The following metrics are examples:
# of Bugs Reported
% Closed as Not Repro or Not a Bug
% of Duplicates
Open/Close Rate by Severity & Priority
* # of Days as Resolved
% of Bugs Reported by Customers (after release)
* Significant Bugs Found in last two weeks of Project
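Several of the metrics above fall out directly once each bug carries consistent fields. A minimal sketch, assuming a hypothetical record shape with `severity` and `resolution` fields (`resolution` is `None` while the bug is open):

```python
from collections import Counter

def bug_metrics(bugs):
    """Compute a few of the example metrics from a list of bug dicts.

    Each dict is assumed to have 'severity' (1-4) and 'resolution'
    (one of the fixed resolution strings, or None if still open).
    """
    total = len(bugs)
    resolved = [b for b in bugs if b["resolution"] is not None]
    not_repro = sum(1 for b in resolved if b["resolution"] == "Not Reproducible")
    dupes = sum(1 for b in resolved if b["resolution"] == "Duplicate")
    # Open-bug counts broken out by severity
    open_by_sev = Counter(b["severity"] for b in bugs if b["resolution"] is None)
    return {
        "reported": total,
        "pct_not_repro": 100.0 * not_repro / total if total else 0.0,
        "pct_duplicates": 100.0 * dupes / total if total else 0.0,
        "open_by_severity": dict(open_by_sev),
    }
```

High Not-Repro or Duplicate percentages say something about the test team's reporting quality, which is why they differentiate one test organization from another.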
Here are some things I've seen work:
- Track and graph bug find/fix rates. This provides an easy rejoinder to the development manager who says all the open bugs will be fixed by the end of the week.
- Have a clearly documented set of bug severity criteria. Update the criteria for your product. Have it actually reflect your priorities. If there are differences of opinion amongst the team, they can be resolved at this level.
- Track the number of build cycles or test/rebuild cycles you needed on previous releases. Use these as background when you project how many build cycles you expect to need for the next release.
- Track the effects of late checkins of minor bugfixes. Are they really worth it? TurboTax had to reship software because of one of these.
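The first item above - tracking and graphing find/fix rates - amounts to plotting cumulative curves and the gap between them. A minimal sketch, assuming hypothetical per-week counts pulled from the bug tracking system:

```python
def open_bug_trend(found_per_week, fixed_per_week):
    """Cumulative found/fixed counts and the running open-bug backlog.

    found_per_week and fixed_per_week are per-week bug counts,
    aligned by week, taken from the bug tracking system.
    """
    cum_found, cum_fixed, backlog = [], [], []
    f = x = 0
    for found, fixed in zip(found_per_week, fixed_per_week):
        f += found
        x += fixed
        cum_found.append(f)
        cum_fixed.append(x)
        backlog.append(f - x)  # bugs still open at end of that week
    return cum_found, cum_fixed, backlog
```

If the backlog series is flat or rising, that is the easy rejoinder: the data itself shows "all the open bugs will be fixed by the end of the week" is not how this project has ever behaved.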
Some statistics from a QA survey:
What measures of product quality does your organization use?
66% Passing of regression tests
62% Number of known bugs of a given severity in product
31% Number of hours tested
26% Bug find rate per hour of testing
Some discussion from the talk
Thanks to Johanna Rothman for being Recorder. I'm not exactly transcribing what she wrote; any resulting errors, misinterpretations, and overgeneralizations are entirely my fault.
Comments on using historical data as a predictor of the future: It is dangerous to give numbers about bugs expected to be shipped with the product to the project manager. He or she will claim that historical data should have helped the test manager improve the process to prevent those bugs. You should be able to find more bugs in follow-on versions because you have more intelligence about the product and process.
It is useful to track how bugs get demoted in priority. If, as the project gets closer to shipment, more bugs get downgraded in priority, that may tell you something. It's useful to have data about the degree to which the project is fooling itself.
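Tracking demotions is straightforward if the bug tracker keeps a change history. A minimal sketch, assuming hypothetical `(week, old_priority, new_priority)` tuples where 1 is the highest priority:

```python
from collections import Counter

def demotions_per_week(priority_changes, num_weeks):
    """Count priority downgrades in each week of the project.

    priority_changes: (week, old_priority, new_priority) tuples from
    the bug tracker's change history; a larger number means a lower
    priority, so new > old is a downgrade.
    """
    downgrades = Counter(
        week for week, old, new in priority_changes if new > old
    )
    return [downgrades.get(w, 0) for w in range(num_weeks)]
```

A rising count in the final weeks is the "degree to which the project is fooling itself" made visible.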
How do you find hot spots? Ask the developers. They'll tell you about their code. They'll also tell you about other developers.
John Musa recommends tracking failure intensity, rather than number of bugs:
Were both failure intensity and number of defects equally easy to capture with equal accuracy, I think there would be no question but that failure intensity with respect to an operational profile would be the hands-down winner. Where John and I differ is our assessment of where effort should be directed given that defect data is relatively easier to capture in typical projects, and on how usefully accurate the failure intensity data is for products where the operational profile is difficult to predict.
I worry that John, as a leading and early proponent of a great new idea, falls prey to the "if all you have is a hammer, every problem looks like a nail" syndrome. On the other hand, I worry that I, as someone practiced in an earlier approach, fall prey to the "what do I need a hammer for? This rock's always worked for me" syndrome. In the future, I hope to see both failure- and defect-driven approaches applied on the same projects, each in due proportion, each providing some useful data. We can both be right.
Hot spots should be presented by the development manager, not the test manager. (Audience reaction: that's not going to happen.) Brian responded: "Either way is OK by me. The important thing is that it get presented. Note that the testing team still will be the ultimate provider of this information - the development team won't know (well enough) where the hot spots are."
Someone working in a highly regulated environment, moving to higher levels of the CMM, says:
Involve the user as part of the test team. Get testers, developers, and users talking about what should be tested and how.
Technical writers have a good feel for what should be tested. (If it's hard to explain, it's hard to get right.) Customer support, as domain specialists, have a good feel for where testing effort should be directed.
Defect classification: it's important to agree on classifications and priorities, else understanding is weakened.
The human side of software process improvement is hard. Some recommended books:
Steve Maguire, Debugging the Development Process
Kaner, Falk, and Nguyen, Testing Computer Software
Doyle and Straus, How to Make Meetings Work
Gerald Weinberg, Quality Software Management