For those readers outside the US or too young to remember this ad slogan, there is a glue-trap-based insect control device called the Roach Motel. If a cockroach (or other bug) walked onto the floor of the rectangular box (open at the ends, of course), it would become trapped on the glue. The slogan was, naturally enough, "Bugs check in but they don't check out."
While it might seem like I'm setting up a story about a software development team that never fixes bugs, I actually have a different story to tell.
A number of years ago, I broke one of my rules--be a conservative adopter of new technologies--and was an early adopter of LinkedIn. However, within a short period of time, I read an article (I believe in Computerworld or InformationWeek) that said LinkedIn was planning on "monetizing" its business model by selling information to recruiters and other interested parties. So, I stopped using it and started declining invites.
Being somewhat lazy--and not having many contacts at risk--I didn't get around to trying to delete the account for some time. About a year or so ago, I finally got tired of having to decline two or three LinkedIn invites every week, so I went ahead and deleted the account.
It wasn't easy to figure out how to do it; usability of that feature didn't seem to be a high priority. Finally, I did manage to get to the right page and did get a confirmation that the account was deleted. It turns out, I should have saved a screenshot of that confirmation message.
Because, to my surprise, the LinkedIn invites keep on coming. I guess they didn't test whether this delete feature works, or at least didn't test it very well. Of course, to some extent, this confirms my decision not to get caught up in the LinkedIn mania. I'm glad I didn't get a lot of contacts in there before I read that article.
So, if you are a colleague of mine, and you send me a LinkedIn invite, you're likely to receive an e-mail much like this one:
I hope you are doing well. It's been a while since we've touched base. How are things going with you?
I've stopped using LinkedIn because I'm not comfortable with the way they manage personal data and contact data.
In the meantime, I will probably spend some time in the next few weeks trying to figure out how to check out of the Roach Motel of social networking. I'll post an update if I can find any way to unstick my little paws from the glue-trap floor.
Here's a question about equivalence partitioning.
My name is Nhat Khai Le and I'm taking ISTQB Advanced Technical Test Analyst online course (provided by RBCS).
I faced this sample question:
You are testing an accounting software package. You have a field which asks for the month of a transaction.
You know that months are treated differently in this software depending on the month of the quarter (the first month of the quarter is treated differently from the second, which is treated differently from the third).
How many test cases, minimum, would you need to test this field to get equivalence coverage?
I chose answer B, i.e., 3, but the answer is C, i.e., 5.
The answer is C because there are three months in each quarter, each of which is treated differently, plus the two months immediately preceding and following the quarter. It's important to remember that, when using equivalence partitioning for testing, you must include partitions outside the "normal" range.
To help visualize the answer, see the illustration below:
Equivalence Partitioning for the Months in the Quarter:
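To make the five partitions concrete, here is a minimal Python sketch (my own illustration, not from the course materials). It takes Q2 as the quarter under test--an assumption for the example--and picks one representative month per partition:

```python
# Five equivalence partitions for a month field where processing depends on
# position within the quarter. Following the answer above, we take one
# quarter (Q2: April-June, chosen arbitrarily for illustration) plus the
# months immediately before and after it.

def quarter_position(month: int) -> str:
    """Classify a month (1-12) by its position within its quarter."""
    if not 1 <= month <= 12:
        raise ValueError("month must be 1-12")
    pos = (month - 1) % 3  # 0, 1, or 2 within the quarter
    return ["first", "second", "third"][pos]

# Minimum test set: one representative value per partition.
test_months = {
    3: "third",   # March: month immediately preceding the quarter
    4: "first",   # April: first month of the quarter
    5: "second",  # May: second month of the quarter
    6: "third",   # June: third month of the quarter
    7: "first",   # July: month immediately following the quarter
}

for month, expected in test_months.items():
    assert quarter_position(month) == expected
```

Five inputs, one per partition, gives the minimum set for equivalence coverage.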
Reader Gianni Pucciani has a good question about a question in the Advanced Software Testing: Volume 2 book:
I have another doubt for a question in Advanced Software Testing Vol. 2. It is about the first question in Chapter 7, Incident Management. The book says that the correct answer is C, "Insufficient Isolation". What does it mean? I had chosen B, "Inadequate classification information", because all the rest was not making sense to me. For B, I could justify it by saying that more information could be added to the incident report, e.g., the error message displayed by the application.
Here is the question from the book:
Assume you are a test manager working on a project to create a programmable thermostat for home use to control central heating, ventilation, and air conditioning (HVAC) systems. In addition to the normal HVAC control functions, the thermostat also has the ability to download data to a browser-based application that runs on PCs for further analysis.
During quality risk analysis, you identify compatibility problems between the browser-based application and the different PC configurations that can host that application as a quality risk item with a high level of likelihood.
Your test team is currently executing compatibility tests. Consider the following excerpt from the failure description of a compatibility bug report:
1. Connect the thermostat to a Windows Vista PC.
2. Start the thermostat analysis application on the PC. Application starts normally and recognizes connected thermostat.
3. Attempt to download the data from the thermostat.
4. Data does not download.
5. Attempt to download the data three times. Data will not download.
Based on this information alone, which of the following is a problem that exists with this bug report?
A. Lack of structured testing
B. Inadequate classification information
C. Insufficient isolation
D. Poorly documented steps to reproduce
The answer is C because we see no evidence of the tester trying different scenarios to see whether the data downloads properly under other conditions. The testing is clearly well structured and carefully thought out, and the steps to reproduce are well described. The classifications are not given, so we have no way of saying, based on this information alone, whether those classifications are correct.
I received an interesting e-mail from long-time reader, John Singleton:
I'm so proud of my 6-year old, Josh. This weekend, we were playing Wii. Those of you likewise hooked on Wii will know that pushing the Home button on the controller pauses the current game and allows the user to access some configurations and such. It also displays the battery level for all connected Wii controllers. This time, the battery icon had only one bar, with the red color to get your attention. Josh said, "Hey Dad, have you ever noticed that when you push Home, it shows the battery is full for just a second before it shows that it's empty?"
I had him show me, and sure enough. There on the Home screen, it shows the battery meter as blue and full for just an instant before it refreshes with the red, almost-empty icon. I think I spouted off something geeky about how it probably shows a default value for just a moment while it queries for the actual battery value.
I don't know if I mentioned, but my son also has some special needs, including Sensory Processing Disorder, which tends to make one much more acutely aware of any kind of sensory stimulus. I wonder if this episode is similar to the kinds of dynamics that have led people to hire individuals with Asperger's syndrome or Autism Spectrum Disorder for software testing. Or maybe he's just quirky, like his dad...
Regardless, my heart swells to about ten times its normal size when I hear my six-year-old finding obscure software defects in robust commercial products!
An interesting set of questions raised here, John. Certainly, my children are very adept at digital devices, but they don't seem to show that level of awareness for problems. Is testing an innate skill? Do certain traits which are thought of as "disorders" in some contexts actually provide skills in other contexts?
Reader Gianni Pucciani has another good question in his review of the Advanced Software Testing: Volume 2 book. He asks:
My doubt is on question 4 of chapter 10 (People Skills and Team Composition).
The correct answer is B, and I had chosen B by excluding all the others which were for sure wrong.
However, my question is: how do you know that your team found 90% of defects by the time you need to give bonuses?
You know for sure the number of defects found prior to release, but how do you know the total number of defects if not after an agreed period (1 year?) of production use?
How would you implement this approach in a real life situation?
Here's the question from the book:
You are a test manager in charge of system testing on a project to update a cruise-control module for a new model of a car. The goal of the cruise-control software update is to make the car more fuel efficient. Assume that management has granted you the time, people, and resources required for your test effort, based on your estimate. Which of the following is an example of a motivational technique for testers that will work properly and is based on the concept of adequate rewards as discussed in the Advanced syllabus?
A. Bonuses for the test team based on improving fuel efficiency by 20% or more
B. Bonuses for the test team based on detecting 90% of defects prior to release
C. Bonuses for individual testers based on finding the largest number of defects
D. Criticism of individual testers at team meetings when someone makes a mistake
Gianni is of course right: the answer is B. He is also right that some lag time after release is required to calculate the defect detection effectiveness (DDE), which is calculated as
DDE = (defects detected)/(defects present).
In the case of the final stage of testing, you can calculate this as
DDE = (defects detected in testing)/(defects detected in testing + defects detected in production).
The bottom side of that equation (the denominator) is a reasonably good approximation of "defects present" if you wait long enough.
So, how long is "long enough"? Most of our clients find that they can determine the typical period of time in which 90% of the defects will be reported on a given release, usually through analysis of the field failure information. In some organizations, this is as short as 30 days, though 90 days seems a more typical number.
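To make the arithmetic concrete, here is a minimal Python sketch of the DDE calculation; the defect counts are hypothetical:

```python
def defect_detection_effectiveness(found_in_testing: int,
                                   found_in_production: int) -> float:
    """DDE = defects detected in testing / (testing + production defects).

    The denominator approximates "defects present" once the post-release
    reporting window (often 30 to 90 days) has elapsed.
    """
    return found_in_testing / (found_in_testing + found_in_production)

# Hypothetical release: 450 defects found in testing, 50 more reported
# from the field within the 90-day window.
dde = defect_detection_effectiveness(450, 50)
print(f"DDE = {dde:.0%}")  # DDE = 90%
```

With those numbers, the team would have hit the 90% threshold in answer B, but only once the reporting window closed.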
In the last decade, outsourcing became a powerful force in the software industry. Motivations behind outsourcing vary, but the reason our clients mention most is that of cost savings. Unfortunately, all too often our clients also mention that previous attempts at outsourcing failed to deliver the desired efficiencies, or perhaps failed to deliver anything at all.
So, is outsourcing some siren on rocky project shores, luring to doom the captains of IT who dare to listen to the siren’s song? Not at all, but outsourcing is not without its risks. Over the last twenty years, I’ve worked on both sides of the outsourced IT relationship, and have seen it work. Let’s examine what successful outsourced efforts have in common.
Successful outsourcing involves planning for and handling the unique logistical details of outsourcing. For example, e-mail and intranet communication, synchronized software lifecycles, procedures for file transfer, effective configuration management, support for development, test, and staging environments, sufficient test data, common tool usage, and compliance with applicable standards are necessary for success on many software projects. Project teams must understand the tactical details of how the work will get done, day by day and person by person, and resolve in advance any logistical obstacles that could occur. Good project logistics are like air and water: you don’t notice them until they’re bad or, worse yet, completely missing. However, because outsourcing logistics are complex and often span organizational areas of responsibility (or even fall into gaps between them), problems happen often and cause many outsourcing difficulties and failures.
Successful outsourcing also involves good working relationships with mutual trust and open communication. Studies show that simply locating people on separate floors of the same building can dramatically reduce communication and relationship building. Having people located thousands of miles and half a dozen or more time zones away is even harder on relationship building and maintenance. However, successful outsourcing requires that people actively nurture good working relationships across organizational and geographical boundaries. If relationships are weak, trust is missing, and communication is infrequent, every project challenge becomes harder to deal with. In the long term, relationships sour and morale suffers. In addition to creating an emotionally unpleasant working situation for everyone, both quality and efficiency decrease.
Successful outsourcing requires understanding what CMMI does—and doesn’t—tell you about an outsource vendor’s capabilities. Properly applied, CMMI will lead to more orderly, consistent practices, which can increase quality and efficiency. We have clients who use CMMI to improve their processes, reduce costs, and deliver better software. That said, the jury is still out on whether there is a statistically valid and reliable correlation between CMMI levels and: 1) the cost per delivered KLOC (or Function Point); or, 2) the reliability or defect density of the delivered software. If that seems to contradict what I said about some of our clients, my point is that it is a logical fallacy to say that, since some companies have success with CMMI, therefore every company that achieves a high level of CMMI maturity will produce better, cheaper software than another company with a lower level of maturity. In addition, even Bill Curtis of the Software Engineering Institute, one of the fathers of CMMI, admitted (at the ASM/SM 2002 conference) that, when used purely as a marketing device, CMMI does not significantly improve quality or efficiency. So, if an organization says they are CMMI accredited (at whatever level), dig further to see exactly what that means in terms of their daily practices, and look at solid metrics for efficiency and quality.
This brings us to the final factor for successful outsourcing: selecting the right outsource service provider. As I mentioned above, just looking at CMMI levels won’t suffice, but even if you satisfy yourself that a vendor is mature, efficient, and delivering quality, remember, as an investment prospectus would say, past results are not necessarily an indicator of future performance. In other words, just because a vendor has had good results on past projects doesn’t mean they will succeed on your projects. Here are some other questions to consider:
If all this seems difficult and complex, keep two things in mind. First, even relatively small projects can have significant costs, especially opportunity costs, if they fail, so outsourcing is always a decision to be made with care. Second, in most cases the real efficiencies of outsourcing will only kick in after a few projects, so organizing for outsourcing success is worth doing well, because, if you do it right, you only have to do it once. Once you have established a successful working relationship with an outsourcing vendor, you will find yourself reaping the benefits, project after project.
Many of us got into the computer business because we were fascinated by the prospect of using computers to build better ways to get work done. (That and the almost magical way we could command a complex machine to do something simply through the force of words coming off our fingers, into a keyboard, and onto a screen.) Ultimately, those of us who consider ourselves software engineers, like all engineers, are in the business of building useful things.
Of course, engineers need tools. Civil engineers have dump trucks, trenching machines, and graders. Mechanical engineers have CAD/CAM software. And we have integrated development environments (IDEs), configuration management tools, automated unit testing and functional regression testing tools, and more. Many great software testing tools are available, and some of them are even free. But just because you can get a tool doesn’t mean that you need the tool.
When you get beyond the geek factor of some tool, you come to the practical questions: What is the business case for using a tool? There are so many options, but how do I pick one? How should I introduce and deploy the tool? How can I measure the return on investment for the tool? This article will help you uncover answers to these questions as you contemplate tools.
Let’s start with the business case. Remember: without a business case, it’s not a tool, it’s a toy. Often, the business case comes down to one or more of the following:
There can be other business cases, but one or more of these will frequently apply. Sometimes the business case masquerades as something else, such as improving consistency of tasks or reducing repetitive work, but notice that these two are actually the first and last bullet items above, respectively, if you consider them carefully.
Once you’ve established a business case, you can select a tool. With the internet, it is easy to find candidate tools. Before you start that, consider the fact that you are going to live with the tool you select for a long time—if it works—and potentially spend a lot of money on it. So, I recommend that you consider tool selection as a special project, and manage it that way. Form a team to carry out a tool selection. Identify requirements, constraints, and limitations. At this point, start searching the Internet to prepare an inventory of suitable tools. If you can’t find any, then perhaps you can find some open source or freeware constituent pieces that could be used to build the tool you need? Assuming you do find some candidate tools, you should perform an evaluation and, ideally, have a proof-of-concept with your actual business problem. (Remember, the vendor’s demo will always work, but you don’t learn much from a demo about how the tool will solve your problems.) With that information in hand, you’re ready to choose a tool.
Once you’ve chosen the tool, it’s time to pilot the tool and then deploy it. In the pilot, select a project that can absorb the risk associated with the piloting of a tool. Your goals for the pilot should include the following:
Based on what you learned from the pilot, you’ll want to make some adjustments. Once those adjustments are in place, you’ll want to proceed to deployment of the tool. Here are some important ideas to remember for deployment:
Finally, let’s address this question of return on investment (ROI). For process improvements (including introduction of tools), we can define ROI as follows:
ROI = (net benefit of improvement)/(cost of improvement)
This question of net benefit returns us to where we started: business objectives. Any meaningful measure of return on investment has a strong relationship with the objectives initially established for the tool. Let’s look at an example. Suppose you have developers who currently use manual approaches for code integration and unit testing. This consumes 5,000 person-hours per year. With the tool, one developer will spend 50% of their time as integration/test toolsmith, using Hudson and other associated tools to automate the process. By doing so, developer effort for this process will shrink to 500 person-hours (plus the 50% of the person-year for the toolsmith). So, ROI is:
ROI = (net benefit from investment)/(cost of investment) = (5000 - (500 + 1000))/1000 = 350%
Notice that, in this case, since the tools are free, I did the calculation entirely using person hours. Sometimes, with commercial tools, you have to perform this whole calculation in dollars or whatever your local currency is.
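The example's arithmetic can be sketched as follows; the 2,000-hour person-year behind the toolsmith's 1,000 hours is my assumption:

```python
def tool_roi(effort_before: float, effort_after: float, cost: float) -> float:
    """ROI = (net benefit of improvement) / (cost of improvement).

    All quantities in person-hours, since the tools in the example are free;
    with commercial tools you would do this in currency instead.
    """
    net_benefit = effort_before - (effort_after + cost)
    return net_benefit / cost

# From the example: 5,000 person-hours of manual integration and unit
# testing, reduced to 500 person-hours, plus a toolsmith at 50% of a
# 2,000-hour person-year (1,000 hours), which is also the cost.
roi = tool_roi(effort_before=5000, effort_after=500, cost=1000)
print(f"ROI = {roi:.0%}")  # ROI = 350%
```

Note that the toolsmith's 1,000 hours appear twice: once as ongoing effort subtracted from the benefit, and once as the cost in the denominator, matching the formula above.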
As software engineers, we want to build useful things, and tools can make us more effective and efficient in doing so. Before we start to use a tool, we should understand the business objectives the tool will promote. Understanding the business case will allow us to properly select a tool. With the tool selected we can then go through one or more pilot projects with the tool, followed by a wider deployment of the tool. As we deploy—and after we deploy—we should plan to measure the return on investment, based on the business case. By following this simple process, you can not only achieve success with tools—you can prove it, using solid ROI numbers.
A quick follow-up related to my earlier post on evidence. As some readers may know, avionics software that controls flight on airplanes (e.g., cockpit software) is subject to a test coverage standard, FAA DO-178B. That standard applies lower standards of test coverage to software that is not safety critical.
So far, so good.
Here's an example of why such standards are useful. During my flight from the US to China today, I managed to crash the entertainment software running at my seat not once but three times. I did this by pausing, rewinding, and resuming play while the flight attendants were taking my dinner order (i.e., not by unusual actions). I was ultimately able to get it working again, thanks to a series of hard reboots by a flight attendant. One of my fellow passengers wasn't so lucky, as his system never recovered.
Okay, that's just entertainment, and anyone who travels regularly knows they should bring a book or plan to winnow down their sleep deprivation balance on long flights.
However, what if the flight control software were as easy to crash? Who would want to hear a cockpit announcement along the lines of the following: "Our entire flight control system just crashed. This enormous airliner is now essentially an unpowered and uncontrolled glider. We'll reboot the system until we get it working again, or until we have an uncontrolled encounter with terrain"?
Personally, I want people testing the more safety-critical aspects of avionics software to adhere to higher standards of coverage, and to be able to provide evidence of the same.
I received another interesting e-mail from a colleague a few weeks ago. Sorry about the delay in response, Simon, but here are my thoughts. First, Simon's e-mail:
I have been reading the Advanced Test Manager book & have been discussing the possibility of adopting an informal risk based approach in my test team, but I am encountering some resistance, which has also got me thinking. You have covered (in several places) the topic of gaps in risk analysis from a breadth point of view, but how about the issue of disparity in 'depth' for identified risk items? For example in your ‘Basic-Sumatra’ Spreadsheet there is a huge variation in depth
between, for example the risk item ‘Can’t cancel incomplete actions using cancel or back.' (A functional item that has a risk score) and 'Regression of existing Speedy Writer features.' (This is also a functional item, but may constitute several hundred test cases).
In my case an experienced tester is against the idea of informal risk analysis due to the effort involved. The scenario is one where a regression 'plan' (set of test cases) is already in place for an enterprise scale solution with 10 main components deployable in both a
Web & Windows client manner. So the usual regression test execution 'plan' requires executing a complex test procedure 10x2 times. In total there is several hundred test cases to execute (some components have approx 100 test cases).
When I suggested an informal (PRAM) style risk identification to each new project the response was:-
The effort of establishing such a 'test plan' seems to be enormous considering that the whole thing has to be performed per application component for each Win and Web client (i.e. 10 x 2 times). I estimate that the number of items requiring risk scoring will be approx 100 for each of the bigger components let alone the whole of the application.
In response to this I pointed out that we could have a 'coarse grained' risk item identification & score - perhaps 20 lines on the risk assessment spreadsheet- 1 for each component\deployment combination.
The response to that was:-
If each of these 20 lines has got an RPN and all the test cases assigned to it just inherited this RPN, this would mean that we would perform an 8 hour test on ‘Securities Win client’ before even beginning with the test of another component, which has got a lower
RPN. Further, this could mean that low-priority components might not be tested at all in a tight time schedule. This cannot be the desired test procedure. It must be ensured that each component is at least tested basically on Win and Web … which would again lead us to scoring risk items at the test case level within each component for Windows and Web & that has the problem of the effort involved.
Do you have any suggestions for handling this depth of risk identification issue?
This is an important question, Simon, that brings up three important points.
First, the amount of effort invested must be considered. We usually find that the risk analysis can be completed within a week. The time involved depends on the approach used. If you use the group brainstorm approach, each participant must invest an entire day, with the leader of the risk analysis typically investing a couple of additional days on preparation, creating the analysis, follow-up, and so forth. If you use the sequential interview approach, each participant invests about three hours--90 minutes in the initial interview and 90 minutes in the review/approval process for the document--with the leader of the risk analysis again investing about three days of effort.
Second, the question of granularity of the risk analysis is also important. The granularity must be fine-grained enough to allow unambiguous assignment of likelihood and impact scores. However, if you get too fine-grained then the effort goes up to an unacceptable level. A proper balance must be struck.
Third, the question of whether we might not test certain important areas at all because they are seen as low risk is indeed a problem. What we typically suggest is what's called a "breadth-first" approach, which means that to some extent the risk-order execution of tests is modified to ensure that all major areas of the software are tested. These areas are tested in a risk-based fashion, but every area gets at least some amount of testing.
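To illustrate, here is a minimal Python sketch of that breadth-first idea; the component names, test names, and risk scores are invented for illustration, and I assume higher score means higher risk:

```python
from collections import defaultdict

# Hypothetical test cases: (test name, component, risk score).
tests = [
    ("T1", "Securities Win", 9), ("T2", "Securities Win", 8),
    ("T3", "Securities Web", 7), ("T4", "Reports Win", 3),
    ("T5", "Reports Web", 2), ("T6", "Securities Win", 6),
]

def breadth_first_order(tests):
    """First pass: the single highest-risk test from each component, so
    every area gets at least basic coverage even under a tight schedule.
    Second pass: the remaining tests in descending risk order."""
    by_component = defaultdict(list)
    for t in sorted(tests, key=lambda t: -t[2]):
        by_component[t[1]].append(t)
    first_pass = [group[0] for group in by_component.values()]
    first_pass.sort(key=lambda t: -t[2])
    rest = [t for group in by_component.values() for t in group[1:]]
    rest.sort(key=lambda t: -t[2])
    return first_pass + rest

order = [name for name, _, _ in breadth_first_order(tests)]
print(order)  # ['T1', 'T3', 'T4', 'T5', 'T2', 'T6']
```

Notice that the low-risk Reports components (T4, T5) run before the second and third Securities tests: within each pass execution is still risk-ordered, but no component is skipped entirely.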
Many of these topics are addressed in the sequence of videos on risk based testing that you can find on our digital library. I'd encourage interested readers to take a look at those brief videos for more ideas on these topics.
I recently received an interesting e-mail from a colleague:
To Whom It May Concern-
Do you have any articles on the value of collecting/capturing detailed test evidence (e.g., screenshots attached to test cases)?
In my opinion, for mature systems with experienced, veteran testers, the need for an abundance of test evidence in the form of screenshots attached to test runs in QC is overkill and unnecessary, and adds more time to release cycles. The justification for this is always "For Audit" as opposed to "Improves Quality". I looked in several articles on this fantastic site, and couldn't find anything pertaining to test evidence. Do you have any articles that provide evidence that an abundance of test evidence improves quality (even if it's just a correlation and not necessarily causation)?
We have clients that do need to retain such detailed software testing evidence; e.g., clients working in safety critical systems (such as medical systems) who must satisfy outside regulators that all necessary tests have been run and have passed. For them, retaining such evidence is a best practice, as not doing so can result in otherwise-worthy systems being barred from the market due to the lack of adequate paperwork.
As someone who relies on such systems to work--indeed, as we all do--I appreciate these regulations and would not want to see software held to a lesser standard. However, Erik makes a very valid point in terms of the trade-off. As time is spent on these audit-trail activities, that is time not spent doing other tasks that would perhaps result in a higher level of quality. Of course, these audit-trail activities are designed to ensure that all critical quality risks are addressed. So, the key question is how should organizations balance the risk of failing to test certain critical quality attributes against the reduction in breadth of quality risk coverage?
I'd be interested in hearing from other readers of this blog on their thoughts. Erik, if you have further comments on this matter, I'm sure the readers of this blog would benefit from those ideas, as this is clearly an important area to consider. I certainly agree it's an interesting topic for an article, and this blog discussion may well inspire me to collaborate with you and other respondents to write one.