Misbehaviours

The TA-MISBEHAVIOURS assertion from the TSF is a key focus of RAFIA, and many of its activities revolve around the concept of Misbehaviours.

This term is used instead of faults or failure modes because it encompasses unintended or unspecified behaviours of the software (or the system that it is part of) as well as behaviour that is in violation of its specified Expectations.

Identified and confirmed Misbehaviours correspond to Faults: deviations from specified Expectations. However, we also wish to identify Misbehaviours that have not yet been considered, or are not yet adequately specified. Hazard analysis techniques such as STPA specifically encourage us to consider classes of Misbehaviour that may result when a system is behaving exactly as specified, but not as intended.

The following diagram illustrates how RAFIA processes are used to identify, document and make use of Misbehaviours:

Role of Misbehaviours in RAFIA

The processes indicated are all described in this section, and most of them correspond to activities described in the Risk Analysis and Automation sections; the exception is Fault and Defect analysis, which is independent of RAFIA but may nevertheless inform, and be informed by, these processes.

The results of analysis, including new or refined Expectations and Assertions, test specifications and descriptions of identified Misbehaviours, are captured as part of a TSF Specification in the form of Statements and Artifacts.

Test results, other collected test data and Faults or Defects are managed outside this specification, but may be referenced by Statements (e.g. to automate assessment of test results using validators).
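As a sketch of how such a validator might work, a Statement could reference a check like the one below. The results format, field names and the `validate_test_results` helper are illustrative assumptions, not part of any TSF specification.

```python
def validate_test_results(results, required_tests):
    """Hypothetical validator: confirm that every test required by a
    Statement is present in the externally managed results and passed.

    `results` maps test names to outcomes, e.g. {"test_boot": "pass"};
    this format is an assumption made for illustration only.
    """
    missing = [t for t in required_tests if t not in results]
    failed = [t for t in required_tests
              if t in results and results[t] != "pass"]
    return {
        "valid": not missing and not failed,
        "missing": missing,  # apparatus never measured these
        "failed": failed,    # measured, but the Expectation was not met
    }
```

A validator of this kind lets automated assessment distinguish tests that failed from tests that were never run, a distinction that matters when separating System Faults from Testing Faults later in this section.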

Identifying and categorising Misbehaviours

As shown in the preceding diagram, all of these activities revolve around the identification of Misbehaviours.

There will always be a set of Misbehaviours that have not been identified, and our Expectations, Assertions, tests and Fault inductions may not cover or address all of the Misbehaviours that we have identified. However, we can identify new Misbehaviours by analysing the results of our existing tests, by developing new tests with the specific aim of exposing unidentified Misbehaviours, and by using the knowledge gained in this way to refine the results of our Risk analysis.

In addition to identifying new Misbehaviours, Risk analysis and Test analysis can help to identify and monitor Advance Warning Indicators, which can then be used to proactively respond to known conditions that may lead to a deviation from expected Behaviour, instead of simply reacting after a Misbehaviour has occurred. Furthermore, the data that is gathered by monitoring these indicators can itself enable us to identify new Misbehaviours.

The following diagram illustrates how a clear understanding of Misbehaviours can inform the processes of testing, analysis and specification:

Testing quadrants

Risk Analysis

Risk analysis is a key source of Misbehaviours, and should be used where possible to characterise all documented Misbehaviours.

This means that Misbehaviours are described in terms of the analytical model(s) used to perform Hazard Analysis, which allows Misbehaviours observed in testing (or in deployed software), or identified by test analysis, to be aligned with those identified through Risk Analysis.

This is important because it enables us to identify limitations or gaps in the Risk Analysis results, or in the models that are used to perform it.

Fault Induction

Misbehaviours are used to create Fault Induction tests, and new Misbehaviours can be identified using the same techniques.
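A minimal sketch of the idea in Python, using `unittest.mock` to induce a Fault in a stand-in component; the component, its failure mode and the mitigation are all invented for illustration.

```python
import unittest
from unittest import mock

def read_sensor():
    """Stand-in for a real component; assumed to raise IOError on hardware faults."""
    return 21.5

def safe_read_sensor(fallback=0.0):
    """Mitigation under test: fall back to a safe default if the sensor fails."""
    try:
        return read_sensor()
    except IOError:
        return fallback

class FaultInductionTest(unittest.TestCase):
    def test_sensor_fault_is_mitigated(self):
        # Induce the Fault: force the sensor read to fail.
        with mock.patch(__name__ + ".read_sensor",
                        side_effect=IOError("bus error")):
            self.assertEqual(safe_read_sensor(fallback=0.0), 0.0)
```

Running the same induction with the mitigation removed should make the test fail, which confirms that the test itself is capable of detecting the Misbehaviour.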

Testing

The Automated testing section describes how Pre-merge tests are used to verify that the software satisfies its specified Expectations and Assertions. These tests can use Fault Induction techniques to verify that the tests are fit for purpose or that mitigations correctly handle exceptions.

However, other types of test are used to identify new Misbehaviours, by subjecting the software (and the system that it is part of) to environmental factors, inputs or simulated Faults that are designed to exercise behaviours that may not yet be covered by Pre-merge tests or Behaviour specifications.

  • Soak tests can help to identify new Misbehaviours by simply executing the software repeatedly or over a longer timeframe, thereby triggering behaviour that may not occur in shorter or more atomic Pre-merge tests.

  • Stress tests can identify Misbehaviours by simulating environmental factors that may impact the software's behaviour, trigger exception-handling routines or mitigations, or cause it to break in as-yet-unanticipated ways.

  • Performance tests are primarily intended to help calibrate the software or system, to understand and document any limitations that should be placed on its configuration, or restrictions on how it should be used. However, they can also help to identify Misbehaviours by pushing the performance of the software or hardware to its limits.

Fault and Defect analysis

Faults identified in components, whether found through testing or reported by the originators of the component, can be a valuable source of new Misbehaviours. If a Fault cannot be mapped to a documented Misbehaviour, or if it cannot be characterised by a Hazard Analysis model, then this may suggest that it represents a new category of Misbehaviour.

Defects identified in other artifacts associated with the software or system, such as incomplete, incorrect or misleading specifications, can also suggest Misbehaviours that may not have been considered.

As noted above, a useful first step in Fault or Defect analysis can be attempting to describe the observed problem(s) using one of the models used for Risk Analysis (e.g. a software architecture diagram or specification), to make it easier to determine whether it corresponds to a previously identified Misbehaviour.

System and Testing Faults

When reporting and analysing Faults detected during testing, it is important to distinguish between System Faults and Testing Faults:

  • System Faults are those cases where testing has positively determined that the software or system itself has exhibited a Misbehaviour; this affects the TA-MISBEHAVIOURS assertion of the software's TSF graph
  • Testing Faults are those cases where the automated testing apparatus failed to measure the software or system, so no conclusions can be drawn about the presence or absence of Misbehaviours; this concerns the validity of the tests, so it affects the TA-ANALYSIS assertion of the software's TSF graph, and also the TA-MISBEHAVIOURS assertion of the automated test framework itself, if that framework is additionally subject to analysis against the TSF
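This distinction can be made mechanical during fault triage. In the sketch below, the report fields and the triage rule are assumptions invented for illustration; only the two Fault categories and the TSF assertion names come from the text above.

```python
from dataclasses import dataclass

@dataclass
class FaultReport:
    test_name: str
    measurement_obtained: bool  # did the apparatus produce a valid measurement?
    expectation_met: bool       # if measured, was the Expectation satisfied?

def triage(report):
    """Map a Fault detected during testing to the TSF assertion it affects."""
    if not report.measurement_obtained:
        # The apparatus failed, so nothing can be concluded about Misbehaviours.
        return "Testing Fault: affects TA-ANALYSIS"
    if not report.expectation_met:
        # The software or system itself exhibited a Misbehaviour.
        return "System Fault: affects TA-MISBEHAVIOURS"
    return "No Fault"
```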

Test data analysis

As described in Automation, test results, test metrics, historical statistics and other data collected during tests or accumulated over time can also help us to identify new (or confirm predicted) Misbehaviours. In some cases this will be very obvious: a test that was passing now fails after a particular change. In other cases we may only observe anomalous patterns that need further investigation.

Examples include:

  • Intermittent errors in regular tests, which disappear if the test is re-run, only to re-appear in subsequent runs.
  • Anomalous patterns observed in collected test data, which do not cause tests to fail, but also do not correlate with what we would expect, or what we have observed in historical tests.
  • New warnings in system logs, which may not indicate an actual failure, but are indicative that something has changed.
  • Changes in observed behaviour, or in measured performance characteristics, that were not an expected result of a change.
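One simple way to surface such patterns automatically is to compare each new measurement against the historical distribution for the same test. The three-sigma rule and the `flag_anomaly` helper below are illustrative choices, not something RAFIA prescribes.

```python
from statistics import mean, stdev

def flag_anomaly(history, latest, threshold=3.0):
    """Flag `latest` if it deviates from historical measurements by more
    than `threshold` standard deviations (an illustrative heuristic)."""
    if len(history) < 2:
        return False  # too little history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu  # history is constant: any change is anomalous
    return abs(latest - mu) > threshold * sigma
```

A flagged value does not identify the Misbehaviour itself; like the examples above, it is a prompt for investigation, and a recurring flag may be a candidate Advance Warning Indicator.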

These events or patterns do not always indicate the nature of the Misbehaviour, but can make us aware that some aspect of our system is not behaving as we had expected. The problem might be a poorly implemented test, or an inadequately controlled test environment, but it could equally be a badly specified test, or an unnoticed design flaw.

Existing Misbehaviours and Risk Analysis can help to narrow down what is behind anomalous patterns, and confirm whether this is expected behaviour, a new category of Misbehaviour, or an example of an existing category. Where a link to Misbehaviours is established, it may also be valuable to consider whether the observed anomaly or pattern might be used as an Advance Warning Indicator, or as the basis of a new test.