Managing a Key Risk in Risk Based Software Testing

By Rex Black

As regular readers of my posts, books, and/or articles know, I like risk based testing.  That said, it's not without its own risks.  One key project risk in risk based testing is missing some key quality risks.  If you don't identify the risk, you can't assess the level of risk, and, of course, you won't cover the risk with tests--even if you really should. 

How to mitigate this risk? Well, one part is getting the right stakeholders involved, and I have thoughts on doing that in a previous blog post.  Another part is to use the right approach to the analysis, as discussed in this blog post.

However, another key part of getting as thorough-as-possible a list of risks is to use a framework or checklist to structure and suggest quality risks.  I've seen four common approaches to this, two of which work and two of which don't work.

  1. A generic list of quality risk categories (such as the one available in the RBCS Basic Library here).  Such a list is easy to learn and use, which is important, because all the participants in the risk analysis need to understand the framework.  However, it is very informal, and needs tailoring for each organization.
  2. ISO 9126 quality characteristics (for an example of ISO 9126, see here).  This is very structured and designed to ensure that software teams are aware of all aspects of the system that are important for quality.  It is harder to learn, which can create problems with some participants.  It also doesn't inherently address hardware-related risks, which is a problem for testing hardware/software systems.
  3. Major functional areas (e.g., formatting, file operations, etc. in a word processor).  I do not recommend this for higher-level testing such as system test, system integration test, or integration test, unless the list of major functional areas is integrated into a larger generic quality risk categories list that includes non-functional categories.  By themselves, lists of major functional areas focus testing on fine-grained functionality only, omitting important use cases and non-functional attributes such as performance or reliability.
  4. Major subsystems (e.g., edit engine, user interface, file subsystem, etc. in a word processor).  This approach does work for hardware, and in fact is described in some books on formal risk analysis techniques like failure mode and effect analysis such as Stamatis's classic.  However, as with the functional areas, risk lists generated from subsystems tend to miss emergent behaviors in software systems, such as--once again--end-to-end use cases, performance, reliability, and so forth.

Here's my recommendation for most clients getting started with risk based testing.  Start with the general list of quality risk categories I mentioned above.  Customize the risk categories for your product, if needed, but beware of dropping any risk categories unless everyone agrees nothing bad could happen in that category.  If you find you need a more structured framework after a couple projects, move to ISO 9126.
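To make the checklist idea concrete, here is a minimal Python sketch of how a team might flag categories for which nobody has identified a risk yet, so that each gap gets an explicit "nothing bad could happen here" decision rather than being dropped silently. The checklist uses the six top-level ISO 9126 characteristics only because they are publicly known; a tailored generic list would slot in the same way. The function name and the word processor risk items are hypothetical.

```python
# The six top-level quality characteristics named by ISO 9126.
ISO_9126_CHARACTERISTICS = [
    "functionality", "reliability", "usability",
    "efficiency", "maintainability", "portability",
]


def uncovered_categories(checklist, risk_items):
    """Return the checklist categories that have no identified risk items."""
    covered = {category for category, _description in risk_items}
    return [category for category in checklist if category not in covered]


# Hypothetical risk items for a word processor: (category, description).
risk_items = [
    ("functionality", "Saved file loses formatting"),
    ("reliability", "Crash when opening a very large document"),
    ("efficiency", "Slow scrolling in long documents"),
]

gaps = uncovered_categories(ISO_9126_CHARACTERISTICS, risk_items)
print(gaps)  # ['usability', 'maintainability', 'portability']
```

Each category in the printed list is a prompt for the stakeholders: either identify risks there, or agree as a group that the category does not apply.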

— Published

A Cautionary Tale in System Reliability

By Rex Black

I want to depart a bit from the usual theme to share a cautionary tale about reliability that has lessons for system design, system testing, cloud computing, and public communication.  Regular readers will have noticed that we had only one post last week, down from the usual two posts.  The reason is that Friday's post was pre-empted by a thunderstorm that knocked out the high-speed internet to our offices.  We get our internet from a company called GVTC.  In fact, the storm appears to have affected hundreds of customers, because we are only now (a full three days after the failure) finding out that we won't have a GVTC service person here until Thursday.

Shame on RBCS for not having backup, you might think.  But we did have backup.  In addition to GVTC's fiber-based wired connection, we had a failover router (a Junxion Box) with an AT&T 3G wireless card in it.  However, when the fiber-to-ethernet adapter failed, it created a surge in the ethernet connection (which ran through the router).  That surge completely destroyed the router. So, no backup internet.  Worse yet, because the router was also acting as the DHCP server, the entire local area network was now inaccessible.

Chaos ensued, as you might imagine, and we're still recovering from it.  I'll spare you the details of what we have done and are still doing, and jump to the lessons.

  • Testing lesson: Yes, we had tested the failover to the 3G router.  We did it by disconnecting the ethernet connection to the GVTC fiber-to-ethernet hardware. That tested the "what happens if the connection goes dark" condition.  It didn't test the "what happens if the hardware is damaged and starts sending dangerous signals" condition.  The lesson here for testers is, when doing risk analysis for reliability testing, make sure to consider all possible risks. Murphy's Law says that the one risk you forget is the one that'll get you.
  • Design lesson: When you're designing for reliability, don't assume that single-points-of-failure can be eliminated simply by adding a failover resource.  If the failover resource is connected in some way to the primary resource, there may well be a path for failure of the primary to propagate to the failover.  Our particular problem is the kind of design flaw that the iterative application of hazard analysis could have revealed.  I'll be more careful with choice of contract support personnel as I rebuild this network.
  • Cloud lesson:  Cloud computing and software as a service (SaaS) are the latest thing, and gaining popularity by leaps and bounds.  RBCS doesn't rely much on the cloud, other than having our e-learning systems remotely hosted.  That hosting of e-learning was a good decision, it turns out, because the loss of connectivity to our offices did not affect our e-learning customers.  However, had we relied on our e-learning system for internal training over the weekend, we'd have been out of luck.  A key takeaway here--especially if you run a small business like I do--is that, if you rely on the cloud or SaaS, those applications are no more reliable than your high speed internet access.
  • Public communication lesson: For those who communicate to the public, GVTC's handling of this problem is a textbook example of how not to communicate.  They did not issue any e-mail or phone information about what to expect.  My business partner spent over five hours on the phone with them in the last 72 hours, and it wasn't until today that we got even the remotest promise of resolution.  She was told conflicting stories on each call.  The IVR system at one point instructed her to "dial 9 for technical support," and, when she did, it replied, "9 is not a supported option." Clear communication to affected customers when a service fails will have a big impact on the customers' experience of quality.  Conversely, failing to communicate sends a clear message, too: "We don't care about you."
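The design lesson above can be turned into a simple check. The following Python sketch models components as nodes and physical connections as edges, then asks whether any path exists from the primary resource to the failover resource. The component names and the connection map are a hypothetical simplification of the network in this story, and a real hazard analysis would consider far more than reachability.

```python
from collections import deque


def can_propagate(connections, source, target):
    """Breadth-first search: is there any connection path from source to target?"""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in connections.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Hypothetical model: edges are links along which a surge could travel.
connections = {
    "fiber_adapter": ["router"],         # the ethernet link ran through the router
    "router": ["fiber_adapter", "lan"],  # the router also hosted the 3G card
    "lan": ["router"],
}

print(can_propagate(connections, "fiber_adapter", "router"))  # True
```

A result of True means the failover shares a failure path with the primary, which is exactly the situation that a disconnect-the-cable test does not exercise.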

Enough ruminations on the lessons learned.  Later this week, we'll be back to our regularly-scheduled programming.  In the meantime, give a thought to reliability--before circumstances force you to do so.

— Published

Five Tips for Quality Risk Analysis

By Rex Black

Let's suppose you've succeeded in convincing your project team to adopt risk based testing (e.g., using the pitch outlined in this previous blog post). Quality risk analysis is the initial step in risk based testing, where we identify the quality risks that exist for the software or system we want to test, and assess the level of risk associated with each risk item (see previous blog post here for more on risk factors). Obviously, it's important to get this step right, since everything else will follow from the risks you identify and your assessment of them.  Here are five tips to help you do the best possible job of it.

  1. Use a cross-functional brainstorming team to identify and assess the risks, ensuring good representation of both business and technical stakeholder groups.  This is absolutely the most critical of these five tips.  The better the quality risk analysis team represents the various perspectives that exist in the project, the more complete the list of quality risk items and the more accurate the assessment of risk associated with each item.
  2. Identify the risk items first, then assign the level of risk.  This two-pass approach ensures that people consider risks in relationship to each other before trying to decide likelihood and impact, which helps reduce the risk rating inflation or deflation that can occur when each risk is considered in isolation.
  3. Only separate risk items when necessary to distinguish between different levels of risk.  In other words, you typically want to keep the risk items as "coarse grained" as possible, to keep the list shorter and more manageable.  Remember, this is test analysis, not test design.  You're trying to decide what to test, not how to test it.  You'll identify specific test cases for each risk item once you get into test design.
  4. Consider risk from both a technical and a business perspective.  Risk items can arise from technical attributes of the system and from the nature of the business problem the system solves.  Technical considerations determine the likelihood of a potential problem and the impact of that problem on the system should it occur. Business considerations determine the likelihood of usage of a given feature and the impact of potential problems in that feature on users.
  5. Follow up and re-align your risk analysis with testing activities and project realities at key project milestones.  No matter how well you do the risk analysis during the initial step, you won't get it exactly right the first time.  Fine-tuning and course-correcting are required.

If you apply these five tips to your quality risk analysis activities, you'll be well on your way to doing good risk based testing.  You might consider some of the other suggestions I have in the video on risk based testing available on our Digital Library here.

— Published

Motivating Stakeholder Participation in Risk Based Testing

By Rex Black

As I've mentioned before in this blog (and elsewhere), we work with many clients to help them implement risk based software and system testing.  Two key steps in this process are the identification and assessment of risks to the quality of the system.  In my 15 years of experience doing risk based software testing, I've found that the only reliable way to do this is by including the appropriate business and technical stakeholders. 

When I explain this to people getting started, I sometimes hear the objection, "Oh, our project team members are all very busy, so I can't imagine they'll want to spend all the time required to do this."  Fortunately, this is an easily resolved concern.

First of all, to answer the "why would they want to participate?" implication of that objection, I'd refer you to my previous blog post on this topic here.  Next, let's look at the "how much time are we talking about" implication.

For most stakeholders, risk based testing involves only a little bit of their time to collect their thoughts.  Risk identification via one-on-one or small team interviews requires about 90-120 minutes each, with risk assessment interviews either being separate follow up discussions of about the same length or even sometimes included in the risk identification interview.  There's typically a subsequent meeting to review, finalize, and approve the risk assessment.

It's true that, with project team brainstorming sessions, the workload on the stakeholders is higher than just 2-3 hours total.  Risk identification and assessment via brainstorming sessions requires a single, typically one-day meeting.  Most of our clients choose the "sequence of interviews" approach, because of the difficulty of scheduling all-day meetings.

Either way, the interview or session participants need to think about three questions during these steps in the process:

  • What are the potential quality problems with the system (i.e., what are the quality risks)?
  • How likely is each potential problem (i.e., how often do we find such problems during testing or in production)?
  • How bad is each potential problem (i.e., what is the business and customer pain associated with such problems)?

By including the right selection of technical and business stakeholders, and thinking realistically (i.e., with neither excessive pessimism nor excessive optimism) about these three questions, the stakeholder team can produce a realistic and practical quality risk assessment.

If you're interested in more information on risk based testing, you might want to take a look at the videos and other resources available here.

— Published

Ten Steps to Better Bug Reports

By Rex Black

Good bug reports are important.  For many test teams, they are the primary deliverable, the most frequent and common touchpoint between the test team and the rest of the project.  So, we'd better do them well. 

Here are ten steps I've used to help my testers learn how to write better bug reports:

  1. Structure: test carefully, whether following scripts, software attacks, or exploratory testing.
  2. Reproduce: test the failure again, to determine whether the problem is intermittent or reproducible.
  3. Isolate: test the failure differently, to see what variables affect the failure.
  4. Generalize: test the same feature elsewhere in the product, or in different configurations.
  5. Compare: review similar test results, to see if the failure has occurred in the past.
  6. Summarize: relate the test and its failure to customers, users, and their needs.
  7. Condense: trim unnecessary information from the report.
  8. Disambiguate: use clear, unambiguous words and phrases, and avoid words like disambiguate.
  9. Neutralize: express the failure impartially, so as not to offend people.
  10. Review: have someone look over the failure report, to be sure, before you submit it.

If you apply these steps to your daily work as a tester, you'll find that bugs get fixed quicker, more bugs get fixed, and programmers will appreciate your attention to detail.

— Published

Simple Factors for Risk Based Software Testing

By Rex Black

We work with a number of clients to help them implement risk based testing.  It's important to keep the process simple enough for broad-based participation by all stakeholders.  A major part of doing so is simplifying the process of assessing the level of risk associated with each risk item.

To do so, we recommend that stakeholders assess two factors for each risk item:

  • Likelihood.  Upon delivery for testing, how likely is the system to contain one or more bugs related to the risk item?
  • Impact.  If such bugs were not detected in testing and were delivered into production, how bad would the impact be?

Likelihood arises primarily from technical considerations, such as the programming languages used, the bandwidth of connections, and so forth.

Impact arises from business considerations, such as the financial loss the business will suffer, the number of users or customers affected, and so forth.

We have found that project stakeholders can use these two simple factors to reach consensus on the level of risk for each risk item.  These two factors are also sufficient to achieve the benefits of risk based testing that I discussed in an earlier post.  Using more than these two factors tends to make this process overly complicated and often results in the failure of attempts by project teams to implement risk based testing.
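As an illustration of how little machinery two factors need, here is a minimal Python sketch that rates each risk item on likelihood and impact and ranks the items by a combined score. The 1-to-5 scales, the choice to multiply the two factors, and the example risk items are all assumptions made for this sketch; the post itself does not prescribe a scale or a formula, and teams can agree on whatever convention suits them.

```python
from dataclasses import dataclass


@dataclass
class RiskItem:
    name: str
    likelihood: int  # assumed scale: 1 (very unlikely) to 5 (very likely)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk_score(self) -> int:
        # Multiplying the two factors is one common convention, assumed here.
        return self.likelihood * self.impact


# Hypothetical risk items with ratings agreed by the stakeholders.
items = [
    RiskItem("Slow page load under peak traffic", likelihood=4, impact=3),
    RiskItem("Incorrect invoice total", likelihood=2, impact=5),
    RiskItem("Typo in help text", likelihood=3, impact=1),
]

# Highest score first: these items get the most test effort, earliest.
ranked = sorted(items, key=lambda item: item.risk_score, reverse=True)
for item in ranked:
    print(f"{item.risk_score:>2}  {item.name}")
```

With these example ratings the scores come out as 12, 10, and 3, which gives the team a simple, explainable order in which to allocate testing.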

— Published

Software Testing Strategies

By Rex Black

It's important for project teams to select the right mix of test strategies for their projects.  In our training, outsourcing and consulting work, we typically see one or more of the following strategies applied:

  • Analytical strategies, such as risk-based testing and requirements-based testing;
  • Model-based strategies, such as performance testing based on statistical usage profiles;
  • Methodical strategies, such as checklists of important functional areas or typical bugs;
  • Process- or standard-compliant strategies, such as IEEE 829 documentation and agile testing approaches;
  • Dynamic or heuristic strategies, such as the use of software attacks or exploratory testing;
  • Consultative strategies, such as asking key project stakeholders about the critical quality characteristics;
  • Regression testing strategies, such as automated testing at the unit or graphical user interface levels.

It's important to remember that strategies may be combined; test managers should employ all of the strategies that they can effectively and efficiently employ on a given project.  Test managers should carefully select and tailor the strategies they use for a project.

— Published

What Is Software Quality?

By Rex Black

Since software testing is an assessment of quality, this question is not a theoretical one.  I suggest using a definition from J. M. Juran, one of the quality gurus who helped Japan achieve its astounding progress in the last 60 years.  Juran wrote that "quality is fitness for use. Features [that] are decisive as to product performance and as to 'product satisfaction'... The word 'quality' also refers to freedom from deficiencies…[that] result in complaints, claims, returns, rework and other damage. Those collectively are forms of 'product dissatisfaction.'"  Since satisfaction revolves around the question of who is (or isn't) satisfied, it's clear that Juran is referring to the satisfaction of key stakeholders.

Okay, that's pretty straightforward.  So, how do we think about quality?  I suggest there are three ways to approach this question:

  • Outcomes: What outcomes do we enjoy if we deliver a quality product or service?  Clearly, we would have customer satisfaction, conformance to requirements, etc.
  • Attributes: What attributes must the product or service have to deliver quality? These would include functionality, performance, security, etc.
  • Means: What must we do to build those attributes into the product?  These activities would include good requirements, good design, good testing, etc.

So, if we think about quality properly, and think about the approach to thinking about quality properly, then we can approach the assessment of quality properly.

— Published

Four Simple Rules for Good Software Testing Metrics

By Rex Black

Here are four simple rules for good software testing metrics:

  1. Define a useful, pertinent, and concise set of quality and test metrics for a project.  Ask yourself, about each metric, "So what? Why should I care about this metric?"  If you can't answer that question,  it's not a useful metric.
  2. Avoid too large a set of metrics. For one thing, large collections of charts and tables create a lot of ongoing work for the test team or manager, even with automated support.  For another thing, they lead to a "more data, less information" situation, as the volume of (sometimes apparently inconsistent) metrics becomes confusing to participants.
  3. Ensure uniform, agreed interpretations of these metrics.  Before you start using the metrics, educate everyone who will see them about how to evaluate them, in order to minimize disputes and divergent opinions about measures of outcomes, analyses, and trends.
  4. Define metrics in terms of objectives and goals for a process or task, for components or systems, and for individuals or teams.  Instead of starting with the metric and looking for a use for it, start with clearly defined objectives, define effectiveness, efficiency, and elegance metrics for those objectives, and set goals for those metrics based on reasonable expectations.

While simple to state, these rules are important.  My associates and I find that, almost every time there is a test results reporting problem in an organization, one or more of these rules is being violated.  Follow these rules for better software test metrics, and thus more effective software test management.

— Published

In Search of the Elusive Software Test End Date

By Rex Black

One of the great frustrations for testers, test managers, and other project team managers is the elusive test execution completion date.  It always seems to take longer than you'd think to get done with test execution, especially on larger projects. 

So, how do you predict when you will be done executing the tests? Part of the answer is when you’ll have run all the planned tests once.  This involves knowing three things:

  1. Total estimated test time (the sum of the estimated effort for all planned tests).
  2. Total person-hours of tester time available per week.
  3. The percentage of time per day spent executing tests by each tester (as opposed to being involved in other activities like meetings, updating test cases, etc.).

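Dividing the total estimated test time by the tester hours actually spent executing tests each week gives a figure that sets a minimum time required to finish test execution.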
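The arithmetic behind that minimum can be sketched in a few lines of Python. The function name and the example numbers are hypothetical; the calculation simply divides the total estimated test time by the hours per week that testers actually spend executing tests.

```python
def minimum_execution_weeks(total_test_hours, tester_hours_per_week, execution_fraction):
    """Weeks needed to run every planned test once.

    total_test_hours: sum of the estimated effort for all planned tests
    tester_hours_per_week: total person-hours of tester time available per week
    execution_fraction: share of tester time spent executing tests (0 to 1)
    """
    if tester_hours_per_week <= 0 or not 0 < execution_fraction <= 1:
        raise ValueError("need positive weekly hours and a fraction in (0, 1]")
    return total_test_hours / (tester_hours_per_week * execution_fraction)


# Hypothetical project: 600 hours of planned tests, four testers at 40 hours
# per week, half of their time spent actually executing tests.
weeks = minimum_execution_weeks(600, 4 * 40, 0.5)
print(weeks)  # 7.5
```

In this example the four testers deliver 80 hours of test execution per week, so one pass through 600 hours of tests takes at least seven and a half weeks.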

However, test execution can last longer than this, because the other part of the answer is when you’ll have found the important bugs and confirmed those bugs to be fixed.  This involves using historical data (or extremely good guesses) to determine four things:

  1. The total number of bugs you'll find.
  2. The bug find rate at various stages of test execution.
  3. The bug fix rate at various stages of test execution.
  4. The average bug closure period (i.e., the time from initial discovery to final resolution).

Obviously, solid historical data from similar past projects really helps with this kind of estimation.  Our clients who have such data, along with formal defect removal models, can get quite accurate, often predicting the end date of test execution to within plus or minus 10% even on test execution efforts that last over six months.
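To show how the four inputs interact, here is a deliberately crude Python sketch of a find-fix simulation. It is not a formal defect removal model: it assumes constant weekly find and fix rates (real rates vary across the stages of test execution), a whole number of weeks for the closure period, and a hypothetical function name and example numbers.

```python
def find_fix_weeks(total_bugs, find_rate, fix_rate, closure_weeks, max_weeks=520):
    """Simulate week by week until every expected bug is found and closed.

    find_rate and fix_rate are bugs per week, assumed constant.
    closure_weeks is the minimum whole number of weeks from find to closure.
    """
    found_by_week = [0]  # cumulative bugs found; index 0 is the start of testing
    closed = 0
    for week in range(1, max_weeks + 1):
        found_by_week.append(min(total_bugs, found_by_week[-1] + find_rate))
        # Only bugs found at least closure_weeks ago are eligible to close.
        eligible_index = week - closure_weeks
        eligible = found_by_week[eligible_index] if eligible_index >= 0 else 0
        closed = min(eligible, closed + fix_rate)
        if closed >= total_bugs:
            return week
    raise ValueError("rates too low to finish within max_weeks")


# Hypothetical project: 100 bugs expected, 20 found and 15 fixed per week,
# with at least one week between finding a bug and closing it.
print(find_fix_weeks(100, find_rate=20, fix_rate=15, closure_weeks=1))  # 8
```

In this example all 100 bugs are found by week 5, but the slower fix rate and the closure lag push completion out to week 8, which is why the find-fix duration can exceed the time needed to run every test once.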

For a simple spreadsheet that can serve as a starting point for predicting bug find-fix duration, you can take a look at this one, from the RBCS Advanced Library.

— Published

Copyright © 2020 Rex Black Consulting Services.
All Rights Reserved.
