How the mighty have fallen. Starting in the 1980s, Japanese companies became legendary for quality, and none more legendary than Toyota. Today, Toyota leads the news—due to quality problems. The situation is so severe that Toyota CEO Akio Toyoda personally appeared at a Congressional hearing. In that hearing, Toyoda said, “We know that the problem is not software, because we tested it.”
Is this a realistic way to think about software quality assurance? In fact, increasing indications (including reliable information from confidential sources in Japan) are that some of the problems are software-related. Let’s look at the quality and testing lessons we can draw from Toyota’s debacle. Let’s start with that quote from Toyoda, because it’s so categorical—and so wrong.
Size can deceive. Consider bridges. The Sydney Harbour Bridge, the Golden Gate Bridge, and the Tsing Ma Bridge are enormous structures. However, they are built of well-understood engineering materials such as concrete, steel, stone, and asphalt, which have well-defined engineering, physical, and chemical properties. Being physical objects, they obey the laws of physics and chemistry, as do the materials that interact with them—air, water, rubber, pollution, salt, and so forth. Further, we’ve been building bridges for thousands of years. We know how bridges behave, and how they fail. Ironically enough, given some of the lessons in this post, our ability to use computers to design and simulate bridges has increased their reliability even further.
Size notwithstanding, a bridge is a simpler thing to test than a Toyota Prius. In the complex system of systems that controls the Prius, there are too many states, too many lines of code, too many data flows, too many use cases, too many sequences of events, too many transient failures to recover from. Consider this example: Engineers at Sun Microsystems told an associate of mine that the number of possible internal states in a single Solaris server is 10,000 times greater than the number of molecules in the universe.
I have been involved in testing and quality almost my whole 25-plus year career. I know how important testing is to quality. In the two lost Shuttle missions, software failure was not the cause, thanks to the legions of software testers who worked on the mission control systems. However, there’s less software involved in a shuttle mission than in driving your Prius to the grocery store. Software for late 1970s and early 1980s hardware is orders of magnitude simpler and smaller than software for 2010-era computers. You could not run an iPhone, not to mention a Prius, on the computers that run the shuttle. And even when computers were smaller and simpler, you could not exhaustively test the systems. Glenford Myers, in the first book on software testing, written in 1979, recognized this fact. Whether testing cars or data centers, software testing is a necessary but insufficient means to quality.
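To see why exhaustive testing is out of reach even for tiny pieces of software, consider a back-of-the-envelope sketch. The function name, the choice of two 32-bit inputs, and the rate of one billion tests per second are assumptions made up for illustration; they are not figures from Toyota, NASA, or Myers.

```python
# Rough arithmetic: how long would it take to try every possible input?
# Assumptions (illustrative only): the inputs are independent bits, and the
# test rig executes a fixed number of tests per second without pause.

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_test_exhaustively(input_bits: int, tests_per_second: float) -> float:
    """Return the years needed to try every combination of the input bits."""
    combinations = 2 ** input_bits
    return combinations / tests_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Two 32-bit integer inputs give 64 bits of input space.
    years = years_to_test_exhaustively(input_bits=64, tests_per_second=1e9)
    print(f"About {years:,.0f} years")
```

Under those assumptions, a function with just two 32-bit inputs would take more than five centuries to test exhaustively, and that ignores internal state, timing, and sequences of events.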
This brings us to the next lesson from Toyota, though it is by no means company or culturally specific. We have clients around the world. It is common, across borders, across companies, across cultures, for people to forget that complex systems can exhibit unpredictable, in some cases catastrophic, failures. It is also common for people to forget that failures are not proportional to the size of the defect.
To see examples, consult the Internet for the answers to four questions. Why did a SCUD missile evade the Patriot missiles and hit a troop barracks in the first Gulf War? Why did the first Ariane 5 rocket explode? Why did not one but two NASA Mars missions fail? Why did the Therac-25 kill cancer patients? In each instance, the answer is discouragingly simple: an infinitesimally small percentage of the code proved defective.
Again, size deceives. If you knock a rivet out of a bridge, does the bridge fall? No. If you nick a wire in a single suspension cable, does the bridge fall? No. If you carve your name in a facing stone on a pillar, does the bridge fall? No. Yet some software fails for similarly small defects involving just a few lines of code.
So, what can we do? Well, first, remember that software testing alone cannot save us from this problem. However, there are many different software testing techniques. Each type of testing can expose a different set of defects. Testers must use different test techniques, test data, and test environments for different bugs during different levels of testing. Each technique, each set of test data, each environment, and each test level filters out a different set of bugs. There is no “one true way” to test software.
Now, I’m not saying that Toyota believes in a “one true way” to quality. Toyota learned quality management from J.M. Juran and W.E. Deming, heroes in the pantheon of quality. Juran and Deming knew much better than to believe in a single magic bullet for quality. However, as we saw from Toyoda’s comments, he did believe too much in testing. In addition, I suspect that Toyota as a company believed too little in integration testing, and perhaps too much in vendors.
Here’s the problem: When complex systems are built from many subsystems, and some of the subsystems are produced by vendors, risks can go up and accountability can go down. It’s not that vendors don’t care; it’s that they can’t always foresee how their subsystems will be used. It’s not that people won’t take responsibility (though that happens); it’s that, when multiple subsystems are at fault, no vendor wants to take all the blame. So, understand, measure, and manage the quality assurance process for such systems from end to end, including vendor subsystems. After all, the end user drives just one car, not a dozen, and there is only one brand on the grill. This is true for the systems you test, too, isn’t it?
We have spent the last couple years in an economic downturn, and no one seems to know how much longer it will last. For the foreseeable future, management will exhort testers and test teams to do more with less. A tedious refrain, indeed, but you can improve your chances of weathering this economic storm if you take steps now to address this efficiency fixation. In this blog, I’ll give you four ideas you can implement to improve test efficiency. All can show results quickly, within the next six months. Better yet, none require sizeable investments which you could never talk your managers into making in this current economic situation. By achieving quick, measurable improvements, you will position yourself as a stalwart supporter of the larger organizational cost-cutting goals, always smart in a down economy.
Know Your Efficiency
The first idea—and the foundation for the others—is that you should know your efficiency to know what to improve. All too often, test teams have unclear goals. Without clear goals, how can you measure your efficiency? Efficiency at what? Cost per what? Here are three common goals for test teams:
You should work with your stakeholders—not just the people on the project, but others in the organization who rely on testing—to determine the right goals for your team. With the goals established, ask yourself, can you measure your efficiency in each area? What is the average cost of detecting and repairing a bug found by your test team, and how does that compare with the cost of a bug found in production? (I describe this method of measuring test efficiency in detail in my article, “Testing ROI: What IT Managers Should Know.”) What risks do you cover in your testing, and how much does it cost on average to cover each risk? What requirements, use cases, user stories, or other specification elements do you cover in your testing, and how much does it cost on average to cover each element? Only by knowing your team’s efficiency can you hope to improve it.
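As a rough illustration of the questions above, the following sketch computes three simple efficiency figures. The class name, field names, and sample numbers are hypothetical, and the formulas are one plausible reading of "average cost per bug" and "average cost per risk covered", not the exact method from the article cited above.

```python
from dataclasses import dataclass


@dataclass
class PeriodData:
    """Hypothetical inputs for one release or reporting period."""
    testing_cost: float         # total cost of the test effort
    fix_cost_in_test: float     # total cost to repair bugs found by testing
    bugs_found_in_test: int     # number of bugs the test team detected
    production_bug_cost: float  # average cost of one bug found in production
    risks_covered: int          # quality risk items with at least one test


def cost_per_test_bug(p: PeriodData) -> float:
    """Average cost of detecting and repairing a bug found by the test team."""
    if p.bugs_found_in_test <= 0:
        raise ValueError("need at least one bug found in test")
    return (p.testing_cost + p.fix_cost_in_test) / p.bugs_found_in_test


def production_to_test_cost_ratio(p: PeriodData) -> float:
    """How many times more a production bug costs than a test bug."""
    return p.production_bug_cost / cost_per_test_bug(p)


def cost_per_risk_covered(p: PeriodData) -> float:
    """Average testing cost per quality risk item covered."""
    if p.risks_covered <= 0:
        raise ValueError("need at least one covered risk")
    return p.testing_cost / p.risks_covered


if __name__ == "__main__":
    # Made-up sample figures, for illustration only.
    sample = PeriodData(50_000, 10_000, 200, 1_500, 120)
    print(cost_per_test_bug(sample))              # 300.0
    print(production_to_test_cost_ratio(sample))  # 5.0
    print(round(cost_per_risk_covered(sample), 2))
```

Tracking these figures release over release gives you the baseline against which any later improvement can be shown.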
Institute Risk-Based Testing
I mentioned risk reduction as a key testing goal. Many people agree, but few people can speak objectively about how they serve this goal. However, those people who have instituted analytical risk-based testing strategies can. Let me be clear on what I mean by analytical risk-based testing. Risk is the possibility of a negative or undesirable outcome, so a quality risk is a possible way that something about your organization’s products or services could negatively affect customer, user, or stakeholder satisfaction. Through testing, we can reduce the overall level of quality risk. Analytical risk-based testing uses an analysis of quality risks to prioritize tests and allocate testing effort. We involve key technical and business stakeholders in this process. Risk-based testing provides a number of efficiency benefits:
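One way to picture how a risk analysis can drive prioritization and effort allocation is sketched below. The 1-to-5 rating scales, the likelihood-times-impact score, and the proportional split of hours are assumptions chosen for illustration; the text above does not prescribe a particular scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityRisk:
    """A quality risk item rated by stakeholders (hypothetical 1-5 scales)."""
    name: str
    likelihood: int  # 1 = very unlikely, 5 = very likely
    impact: int      # 1 = minor annoyance, 5 = severe damage

    @property
    def priority(self) -> int:
        # Assumed scoring rule: a higher product means a higher priority.
        return self.likelihood * self.impact


def allocate_effort(risks, total_hours):
    """Rank risks by priority and split hours in proportion to priority."""
    total_priority = sum(r.priority for r in risks)
    if total_priority == 0:
        return []
    ranked = sorted(risks, key=lambda r: r.priority, reverse=True)
    return [(r.name, round(total_hours * r.priority / total_priority, 1))
            for r in ranked]


if __name__ == "__main__":
    risks = [
        QualityRisk("cosmetic glitch in report footer", 5, 1),
        QualityRisk("payment is charged twice", 4, 5),
        QualityRisk("data lost on failover", 3, 5),
    ]
    for name, hours in allocate_effort(risks, total_hours=80):
        print(f"{hours:5.1f} h  {name}")
```

With these sample ratings, the 80 hours split into 40, 30, and 10 hours, with the highest-priority risk tested first.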
You can learn more about how to implement risk-based testing in Chapter 3 of my book, Advanced Software Testing: Volume II. You can also read the article I co-wrote with an RBCS client, CA, on our experiences with piloting risk-based testing at one of their locations.
Tighten Up Your Test Set
With many of our clients, RBCS assessments reveal that they are dragging around heavy, unnecessarily large regression test sets. Once a test is written, it goes into the regression test set, never to be removed. However, in the absence of complete test automation, this leads to inefficient, prolonged test execution periods. The scope of the regression test work will increase with each new feature, each bug fix, each patch, eventually overwhelming the team. Once you have instituted risk-based testing, you can establish traceability between risks and test cases, identifying those risks which you are over-testing. You can then remove or consolidate certain tests. You can also apply fundamental test design principles to identify redundant tests. We had one client that, after taking our Test Engineering Foundation course, applied the ideas in that course to reduce the regression test set from 800 test cases to 300 test cases. Since regression testing made up most of the test execution effort for this team, you can imagine the kind of immediate efficiency gain that occurred.
Introduce Lightweight Test Automation
I mentioned complete test automation above. That’s sometimes seen as an easy way to improve test efficiency. However, for many of our clients, that approach proves chimerical. The return on the test automation investment some of our clients see is low, zero, or even negative. Even when the return is strongly positive, for many traditional forms of GUI-based test automation, the payback period is too far in the future and the initial investment is too high. However, there are cheap, lightweight approaches to test automation. We helped one of our clients, Arrowhead Electronic Healthcare, create a test automation tool called a dumb monkey. It was designed and implemented using open source tools, so the tool budget was zero. It required a total of 120 person-hours to create. Within four months, it had already saved almost three times that much in testing effort. For more information, see the article I co-wrote with our client.
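For readers unfamiliar with the term, a dumb monkey feeds random input to a system and checks only for crashes or other unexpected failures. The sketch below is a generic illustration, not the Arrowhead tool, whose design is described in the co-written article rather than here. The function parse_quantity is a made-up stand-in for a system under test.

```python
import random
import string


def parse_quantity(text: str) -> int:
    """Hypothetical system under test: parse a non-negative quantity field."""
    value = int(text.strip())
    if value < 0:
        raise ValueError("quantity must not be negative")
    return value


def dumb_monkey(target, runs: int, seed: int = 0, expected=(ValueError,)):
    """Call target with random strings; return inputs that failed unexpectedly.

    Exceptions listed in expected count as clean rejections of bad input.
    Any other exception is recorded as a potential bug.
    """
    rng = random.Random(seed)  # seeded so that a failing run can be repeated
    alphabet = string.digits + string.ascii_letters + " -+._"
    failures = []
    for _ in range(runs):
        length = rng.randint(0, 8)
        text = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            target(text)
        except expected:
            pass  # rejecting bad input cleanly is acceptable behavior
        except Exception as exc:
            failures.append((text, repr(exc)))
    return failures


if __name__ == "__main__":
    problems = dumb_monkey(parse_quantity, runs=1000)
    print(f"{len(problems)} unexpected failures")
```

The monkey knows nothing about correct results, which is what keeps it cheap to build; it complements, rather than replaces, tests that check expected behavior.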
In this blog, I’ve shown you four ideas you can implement quickly to improve your efficiency. Start by clearly defining your team’s goals, then derive efficiency metrics for those goals and measure your team now. With that baseline measurement, move on to put risk-based testing in place, ensuring the right focus for your effort. Next, apply risk-based testing and other test fundamentals to reduce the overall size of your test set while not increasing the level of regression risk on release. Finally, use dumb monkeys and other lightweight test automation tools to tackle manual, repetitive test tasks, saving your people for other more creative tasks. With these changes in place, measure your efficiency again six months or a year from now. If you are like most of our clients, you’ll have some sizeable improvements to show off for your managers.
Software security is an important concern, and it’s not just for operating system and network vendors. If you’re working at the application layer, your code is a target. In fact, the trend in software security exploits is away from massive, blunt-force attacks on the Internet or IT infrastructure and towards carefully crafted, criminal attacks on specific applications to achieve specific damage, often economic.
How can you respond effectively? While the threat is large and potentially intimidating, it turns out that there is a straightforward seven-step process that you can apply to reduce your software’s exposure to these attacks.
Carefully following this process will allow your organization to improve your software security in a way which is risk-based, thoroughly tested, data-driven, prudent, and continually re-aligned with real-world results. You can read more about this topic in my article, Seven Steps to Reduce Software Security Risk.
Businesses spend millions of dollars annually on software test automation. A few years back, while doing some work in Israel (birthplace of the Mercury toolset), someone told me that Mercury Interactive had a billion dollars in a bank in Tel Aviv. Probably an urban legend, but who knows? Mercury certainly made a lot of money selling tools over the years, which is why HP bought them.
That's nice for Mercury and Hewlett Packard, but so what, right? I don't know about your company, but none of RBCS' clients buy software testing tools so that they can help tool vendors make money. Our clients buy software testing tools because they expect those tools will help them make money.
Unfortunately, many organizations lack a clear business case for software test automation. Without a clear business case, there's no clear return on investment. This leads to a lack of clear success (or failure) of the automation effort. Efforts that should be cancelled continue too long, and efforts that should continue are cancelled.
So, one of the prerequisites of software test automation success is a clear business case, leading to clear measures of success. Here are the top three business cases for software test automation that we've observed with our clients:
This list is not exhaustive, and, in some cases, two or more reasons may apply. One of the particularly nice aspects of each of these three business cases is that the return on investment is clearly quantifiable. That makes achieving success in one or more of these areas easy to measure and to demonstrate. It also makes it easy to determine which tests should be automated and which should not.
We often want--and need--testing to go more quickly, don't we? So, here's a list of organizational behaviors and attributes that tend to accelerate the test process. Encourage these activities and values among your peers, and jump at the opportunities to perform them yourself where appropriate.
Testing throughout the project. I use the phrase testing throughout the project in a three-dimensional sense. The first dimension involves time: in order to be properly prepared, and to help contain bugs as early as possible, the test team must become involved when the project starts, not at the end. The second dimension is organizational: the more a company promotes open communication between the test organization and the other teams throughout the company, the better the test group can align its efforts with the company’s needs. The third dimension is cultural: in a mature company, testing as an entity, a way of mitigating risk, and a business-management philosophy permeates the development projects. I also call this type of testing pervasive testing.
Smart use of cheaper resources. One way to do this is to use test technicians. You can get qualified test technicians from the computer-science and engineering schools of local universities and colleges as well as from technical institutes. Try to use these employees to perform any tasks that do not specifically require a test engineer’s level of expertise. Another way to do this is to use distributed and outsourced testing.
Appropriate test automation. The more automated the test system, the less time it takes to run the tests. Automation also allows unattended test execution overnight and over weekends, which maximizes utilization of the system under test and other resources, leaving more time for engineers and technicians to analyze and report test failures. You should apply a careful balance, however. Generating a good automated test suite can take many more hours than writing a good manual test suite. Developing a completely automated test management system is a large endeavor. If you don’t have the running room to thoroughly automate everything you’d like before test execution begins, you should focus on automating a few simple tools that will make manual testing go more quickly. In the long run, automation of test execution is typically an important part of dealing with regression risk during maintenance.
Good test system architecture. Spending time in advance understanding how the test system should work, selecting the right tools, ensuring the compatibility and logical structure of all the components, and designing for subsequent maintainability really pays off once test execution starts. The more intuitive the test system, the more easily testers can use it.
Clearly defined test-to-development handoff processes. Let's illustrate this with an example. Two closely related activities, bug isolation and debugging, occur on opposite sides of the fence between test and development. On the one hand, test managers must ensure that test engineers and technicians thoroughly isolate every bug they find and write up those isolation steps in the bug report. Development managers, on the other hand, must ensure that their staff does not try to involve test engineers and technicians, who have other responsibilities, in debugging activities.
Clearly defined development-to-test handoff processes. The project team must manage the release of new hardware and software revisions to the test group. As part of this process, the following conditions should be met:
Automated smoke tests run against test releases, whether in the development, build (or release engineering), or testing environments (or all three), are also a good idea to ensure that broken test releases don’t block test activities for hours or even days at the beginning of a test cycle.
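A smoke-test gate of this kind can be very small. In the sketch below, the check names and the checks themselves are placeholders; in practice each check would exercise the real build, for example by starting the application or logging in.

```python
from typing import Callable


def run_smoke_tests(checks: dict) -> list:
    """Run each quick check and return the names of the checks that failed.

    checks maps a check name to a zero-argument callable returning True on
    success. A check that raises an exception is treated as a failure.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a crash during a smoke check counts as a failure
        if not ok:
            failed.append(name)
    return failed


def accept_build(checks: dict) -> bool:
    """Accept a test release only if every smoke check passes."""
    return not run_smoke_tests(checks)


if __name__ == "__main__":
    # Placeholder checks; replace them with calls against the real build.
    checks: dict[str, Callable[[], bool]] = {
        "application starts": lambda: True,
        "login page loads": lambda: True,
        "database reachable": lambda: False,
    }
    print("failed checks:", run_smoke_tests(checks))
    print("accept build:", accept_build(checks))
```

Running such a gate before installing a build in the test environment keeps a broken release from consuming the first hours or days of a test cycle.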
Another handoff occurs when exit and entry criteria for phases result in the test team commencing or ending their testing work on a given project. The more clearly defined and mutually accepted these criteria are, the more smoothly and efficiently the testing will proceed.
A clearly defined system under test. If the test team receives clear requirements and design specifications when developing tests and clear documentation while running tests, it can perform both tasks more effectively and efficiently. When the project management team commits to and documents how the product is expected to behave, you and your intrepid team of testers don’t have to waste time trying to guess—or dealing with the consequences of guessing incorrectly. In a later post, I'll give you some tips on operating without clear requirements, design specifications, and documentation when the project context calls for it.
Continuous test execution. Related to, and enabled by, test automation, this type of execution involves setting up test execution so that the system under test runs as nearly continuously as possible. This arrangement can entail some odd hours for the test staff, especially test technicians, so everyone on the test team should have access to all appropriate areas of the test lab.
Continuous test execution also implies not getting blocked. If you’re working on a 1-week test cycle, being blocked for just 1 day means that 20 percent of the planned tests for this release will not happen, or will have to happen through extra staff, overtime, weekend work, and other undesirable methods. Good release engineering and management practices, including smoke-testing builds before installing them in the test environment, can be a big part of this. Another part is having an adequate test environment so that testers don’t have to queue to run tests that require some particular configuration or to report test results.
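The 20 percent figure follows from simple arithmetic, assuming a 1-week cycle means five working days and that tests are spread evenly across the cycle. The function below is a trivial sketch of that calculation; its name and default are made up for illustration.

```python
def fraction_of_tests_lost(blocked_days: float, cycle_working_days: float = 5) -> float:
    """Fraction of planned tests lost when testing is blocked.

    Assumes tests are spread evenly across the working days of the cycle.
    """
    if cycle_working_days <= 0:
        raise ValueError("cycle must have at least some working time")
    return min(blocked_days / cycle_working_days, 1.0)


if __name__ == "__main__":
    lost = fraction_of_tests_lost(blocked_days=1)
    print(f"{lost:.0%} of planned tests lost")  # 20% of planned tests lost
```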
Adding test engineers. Fred Brooks once observed that “adding more people to a late software project makes it later,” a statement that has become known as Brooks’s Law. Depending on the ramp-up time required for test engineers in your projects, this law might not hold true as strongly in testing as it does in other areas of software and hardware engineering. Brooks reasoned that as you add people to a project, you increase the communication overhead, burden the current development engineers with training the new engineers, and don’t usually get the new engineers up to speed soon enough to do much good. In contrast, a well-designed behavioral test system reflects the (ideally) simpler external interfaces of the system under test, not its internal complexities. In some cases, this can allow a new engineer to contribute within a couple of weeks of joining the team.
My usual rule of thumb is that, if a schedule crisis looms six weeks or more in my future, I might be able to bring in a new test engineer in time to help. However, I have also added test engineers on the day system test execution started, and I once joined a laptop development project as the test manager about two weeks before the start of system test execution. In both cases, the results were good. (Note, though, that I am not contradicting myself. Testing does proceed most smoothly when the appropriate levels of test staffing become involved early, but don’t let having missed the opportunity to do that preclude adding more staff.) Talk to your test engineers to ascertain the amount of time that’ll be required, if any, to ramp up new people, and then plan accordingly.
While these software test process accelerators are not universally applicable--or even universally effective--consider them when your managers tell you that you need to make the testing go faster.
Smart professionals learn continuously. They learn not only from their own experience, but also from the experience of other smart professionals. Learning from other smart people is the essence and origin of best practices. A best practice is an approach to achieving important objectives or completing important tasks that generally gives good results, when applied appropriately and thoughtfully.
I have identified a number of software testing best practices over the years. Some I learned in my own work as a test manager. I have learned many more in my work as a consultant, since I get a chance to work with so many other smart test professionals in that role. Here are five of my favorite software testing best practices:
You can listen to me explain these five software testing best practices in a recent webinar. I've also included links above for webinars that deal with some of these software testing best practices specifically.
If you have adopted risk based testing and are using it on projects, how do you know if you are doing it properly? Measure the effectiveness, of course.
I've discussed good software testing metrics previously. Good metrics for a process derive from the objectives that process serves. So, let's look at the four typical objectives of risk based testing and how we might measure effectiveness.
After each project, you can use these metrics to assess the effective implementation of risk based testing.
Next week, I'll be in Germany for the Testing and Finance conference, giving a keynote speech on how testing professionals and teams can satisfy their stakeholders. One of the key themes of that presentation is the following:
There are a wide variety of groups with an interest in testing and quality on each project; these are testing stakeholders. Each stakeholder group has objectives and expectations for the testing work that will occur.
When we do test assessments for our clients, we often find that test teams are not satisfying their stakeholders.
Why? Well, many times, what the testers think the stakeholders need and expect from testing differs from what the stakeholders actually need and expect. In order to understand the stakeholder's true objectives and expectations, testers need to talk to each stakeholder group. Since in many cases the stakeholders have not thought about this issue before, these talks often take the form of an iterative, brainstorming discussion between testers and stakeholders to articulate and define these objectives and expectations.
To truly satisfy these stakeholders, the test team needs to achieve these objectives effectively, efficiently, and elegantly.
The next step of defining these objectives, and what it means to achieve them effectively, efficiently, and elegantly, is often to define a set of metrics, along with goals for those metrics. These metrics and their goals allow the test team to demonstrate the value they are delivering. With goals achieved, testers and stakeholders can be confident that testing is delivering satisfying services to the organization.
Are you satisfying your stakeholders? Catch me in Bad Homburg, Germany, on June 8, to discuss the topic with me directly. Or, you can e-mail firstname.lastname@example.org to find out when we will post the recorded webinar on the RBCS Digital Library.
When I talk to senior project and product stakeholders outside of test teams, confidence in the system—especially, confidence that it will have a sufficient level of quality—is one benefit they want from a test team involved in system and system integration testing. Another key benefit such stakeholders commonly mention is providing timely, credible information about quality, including our level of confidence in system quality.
Reporting their level of confidence in system quality often proves difficult for many testers. Some testers resort to reporting confidence in terms of their gut feel. Next to major functional areas, they draw smiley faces and frowny faces on a whiteboard, and say things like, “I’ve got a bad feeling about function XYZ.” When management decides to release the product anyway, the hapless testers either suffer the Curse of Cassandra if function XYZ fails in production, or watch their credibility evaporate if there are no problems with function XYZ in production.
If you’ve been through those unpleasant experiences a few times, you’re probably looking for a better option. In the next 500 words, you’ll find that better option. That option is using multi-dimensional coverage metrics as a way to establish and measure confidence. While not every coverage dimension applies to all systems, you should consider the following:
Notice that I talked about “passing tests” in my metrics above. If the associated tests fail, then you have confidence that you know of—and can meaningfully describe, in terms non-test stakeholders will understand—problems in dimensions of the system. Instead of talking about “bad feelings” or drawing frowny faces on whiteboards, you can talk specifically about how tests have revealed unmitigated risks, unmet requirements, failing designs, inoperable environments, and unfulfilled use cases.
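A coverage report along one dimension might be computed as in the sketch below. The rule used here (an item counts as covered only if it has at least one test and all of its tests pass) and the sample requirement IDs are assumptions for illustration; the same function could be applied to risks, designs, environments, or use cases.

```python
from dataclasses import dataclass, field


@dataclass
class DimensionSummary:
    """Coverage summary for one dimension, such as requirements or risks."""
    passing_pct: float
    failing: list = field(default_factory=list)   # items with a failed test
    untested: list = field(default_factory=list)  # items with no tests at all


def summarize_dimension(items: dict) -> DimensionSummary:
    """Summarize coverage for one dimension.

    items maps an item name to its list of test results (True = passed).
    """
    untested = [name for name, results in items.items() if not results]
    failing = [name for name, results in items.items()
               if results and not all(results)]
    passing = len(items) - len(untested) - len(failing)
    pct = 100.0 * passing / len(items) if items else 0.0
    return DimensionSummary(pct, failing, untested)


if __name__ == "__main__":
    requirements = {
        "REQ-1": [True, True],
        "REQ-2": [True, False],
        "REQ-3": [],
        "REQ-4": [True],
    }
    summary = summarize_dimension(requirements)
    print(f"{summary.passing_pct:.0f}% of requirements covered by passing tests")
    print("known problems:", summary.failing)
    print("unknown quality:", summary.untested)
```

Separating failing items from untested items matters for the report: the first list describes known problems, while the second describes areas where you have no information at all.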
What about code coverage? Code coverage measures the extent to which tests exercise statements, branches, and loops in the software. Where untested statements, branches, and loops exist, that should reduce our confidence that we have learned everything we need to learn about the quality of the software. Any code that is uncovered is also unmeasured from a quality perspective.
If you manage a system test or system integration test team, it’s a useful exercise to measure the code coverage of your team’s tests. This can identify important holes in the tests. I and many other test professionals have used code coverage this way for over 20 years. However, in terms of designing tests specifically to achieve a particular level of code coverage, I believe that responsibility resides with the programmers during unit testing. At the system test and system integration test levels, code coverage is a useful tactic for finding testing gaps, but not a useful strategy for building confidence.
The other dimensions of coverage measurement do offer useful strategies for building confidence in the quality of the system and the meaningfulness of the test results. As professional test engineers and test analysts, we should design and execute tests along the applicable coverage dimensions. As professional test managers, our test results reports should describe how thoroughly we’ve addressed each applicable coverage dimension. Test teams that do so can deliver confidence, both in terms of the credibility and meaningfulness of their test results, and, ultimately, in the quality of the system.
In a later post, I'll talk about what software testing is, and what it can do. However, in this post, I'd like to talk about what software testing isn't and what it can't do.
In some organizations, when I talk to people outside of the testing team, they say they want testing to demonstrate that the software has no bugs, or to find all the bugs in it. Either is an impossible mission, for four main reasons:
It's important to understand and explain these limits on what software testing can do. Recently, the CEO of Toyota said that software problems couldn't be behind the problems with their cars, because "we tested the software." As long as non-testers think that testers can test exhaustively, those of us who are professional testers will not measure up to expectations.