Risk Analysis
Risk Analysis describes a set of processes that are used to document and verify both what software is expected to do, and what it must not do, by focussing on outcomes that we specifically wish to avoid or prevent.
There are a number of established methodologies and approaches to this, many of which focus on the potential faults or failure modes of a system (including software systems), and how the system or a given component either prevents these, or mitigates their effects.
The RAFIA approach was developed using System-Theoretic Process Analysis (STPA), a methodology developed at MIT, and is strongly informed by this technique.
Hazard analysis
The first objective of Risk Analysis is hazard analysis: identifying and characterising Misbehaviours, and classifying them by reference to their negative outcomes.
Detailed procedures and guidance for accomplishing this using STPA are provided in the STPA section, but its objectives can be summarised as follows:
- Describe a system or subsystem that incorporates the software, which might be a physical or a software system, or a discrete part of a larger system
- Identify losses (outcomes that are unacceptable for the system's stakeholders) and hazards (system-level conditions that can lead to these losses)
- Specify a hierarchical control structure, which describes the functionality of the system in terms of its elements (notably controllers and controlled processes) and interactions between them (notably control actions and feedback)
- Analyse this structure to identify unsafe control actions (UCA) (interactions between a controller and a controlled process that may result in a hazard)
- Identify causal scenarios (factors that can lead to unsafe control actions, or directly to hazards)
- Devise and specify constraints (Statements about the software or a system that must be true in order to avoid a given hazard, UCA or causal scenario)
These objectives serve as a starting point from which the analysis can be further refined. Beyond the basics, STPA variants expand the analysis with domain-specific knowledge, and supporting analyses such as Fault Tree Analysis (FTA) are often needed to incorporate derived test results. Whatever processes are followed, appropriate review guidelines must be established to meet project-specific needs.
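As an illustration of how the outputs of these steps relate to one another, the following is a minimal sketch in Python. The class names, identifiers and example texts are all hypothetical and are not part of STPA or RAFIA. It assumes that each hazard, UCA and causal scenario records what it can lead to, and that each constraint lists the items it addresses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One analysis artefact: a loss, hazard, UCA or causal scenario."""
    ident: str
    text: str
    leads_to: tuple = ()  # identifiers of the items this one can result in


@dataclass(frozen=True)
class Constraint:
    """A statement that must hold to avoid the items listed in `addresses`."""
    ident: str
    text: str
    addresses: tuple


def unaddressed(items, constraints):
    """Return identifiers of hazards, UCAs and scenarios with no constraint.

    Losses have an empty `leads_to` and are skipped, because constraints
    in this sketch target hazards, UCAs and causal scenarios only.
    """
    covered = {target for c in constraints for target in c.addresses}
    return sorted(i.ident for i in items if i.leads_to and i.ident not in covered)


# Hypothetical example data
items = [
    Item("L-1", "Loss of user data"),
    Item("H-1", "Data is written to storage that is not ready", ("L-1",)),
    Item("UCA-1", "Controller issues a write before storage reports ready", ("H-1",)),
    Item("CS-1", "The ready signal seen by the controller is stale", ("UCA-1",)),
]
constraints = [
    Constraint("C-1", "Controller must not issue a write until storage is ready", ("UCA-1",)),
]

print(unaddressed(items, constraints))  # ['CS-1', 'H-1']
```

Whether every hazard and causal scenario needs its own constraint is a project decision. The check here simply flags the items that no constraint names, so that a reviewer can decide.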
Note
If you wish to use a methodology other than STPA to apply RAFIA, you should first familiarise yourself with how STPA approaches hazard analysis, to ensure that your selected methodology fulfils these objectives.
Traceability
To document how the results of hazard analysis inform the design, implementation and verification of our software, we use the model described by the TSF.
With STPA, for example, the losses, hazards and UCAs documented in the analysis represent a set of prohibited Misbehaviours, which may be documented as Expectations. The derived set of constraints is documented as Assertions, which specify how the risks associated with these Misbehaviours are managed for the software, or in a given system. Other risk analysis techniques can provide inputs in a similar way.
This set of Assertions, together with Assertions developed through analysis of other software or system expectations, may then inform (or be mapped to existing) test specifications and related fault inductions, which provide Evidence that the identified risks are indeed managed as asserted.
The Misbehaviours identified by hazard analysis are also a valuable source of scenarios that need to be tested, and fault inductions that can be used to verify the tests, or the implemented mitigations.
If all top-level objectives are tracked as Statements (Expectations, Assertions, Evidence) with forward and backward traceability, RAFIA establishes a verification-driven workflow with analysis-led traceability:
- Expectations must be supported by sufficient Evidence, with progress tracked through reviewed confidence measurements.
- The workflow requires analysis-led traceability linking analysis to Statements about system objectives, architecture, design, and verification and validation outcomes.
Consequently, a change to any Statement requires re-evaluating the analysis links that connect it to other Statements.
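The two checks implied here can be sketched in Python as follows. This is an illustrative model only: the identifiers, the link direction and the function names are assumptions made for the example, and it does not reflect the data format or tooling used by the TSF.

```python
from collections import defaultdict

# Hypothetical Statements: identifier -> kind
statements = {
    "EXP-1": "Expectation",
    "ASR-1": "Assertion",
    "ASR-2": "Assertion",
    "EVD-1": "Evidence",
}

# Assumed convention: each link says the first Statement is supported by the second
links = [("EXP-1", "ASR-1"), ("EXP-1", "ASR-2"), ("ASR-1", "EVD-1")]


def links_to_review(changed):
    """Return every link that has the changed Statement at either end."""
    return [link for link in links if changed in link]


def unsupported_expectations():
    """Return Expectations from which no Evidence can be reached."""
    supported_by = defaultdict(list)
    for parent, child in links:
        supported_by[parent].append(child)

    def reaches_evidence(ident, seen):
        if statements[ident] == "Evidence":
            return True
        seen.add(ident)  # guard against cycles in the link data
        return any(
            reaches_evidence(child, seen)
            for child in supported_by[ident]
            if child not in seen
        )

    return [
        ident
        for ident, kind in statements.items()
        if kind == "Expectation" and not reaches_evidence(ident, set())
    ]


print(links_to_review("ASR-1"))    # [('EXP-1', 'ASR-1'), ('ASR-1', 'EVD-1')]
print(unsupported_expectations())  # []
```

Reaching at least one piece of Evidence is a much weaker condition than having sufficient Evidence. Judging sufficiency still depends on the reviewed confidence measurements described above.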
Ensuring that knowledge remains in sync across design and evaluation cycles is addressed in Automation of information gathering and presentation.
Risk Evaluation
The STPA methodology does not address another important aspect of Risk Analysis: evaluating the relative importance or criticality of the Hazards and/or Misbehaviours that have been identified.
This can be partially addressed using STPA's concept of Losses, which allow us to categorise negative outcomes that are unacceptable to stakeholders, and prioritise the necessary remedies or mitigations on this basis. However, this only covers one aspect of risk evaluation: determining the severity of a risk.
Risk evaluation of an identified hazard needs to consider at least two things:
- The severity, impact or consequences of the hazard, in terms of its potential adverse effects
- The likelihood or frequency of the hazard, in terms of the probability that it will occur in a given timeframe
Other factors to be considered include:
- Controllability: If the hazard does occur, how effectively can its adverse effects be mitigated?
- Exposure or demand: To qualify likelihood, how often is the entity impacted by the hazard likely to be exposed to it in their use of the system?
Together, these factors are used to categorise Misbehaviours and determine their relative importance. This is valuable when weighing the cost of eliminating a hazard, or of adding mitigations to address it, against the net effect on overall risk. It is particularly important when a mitigation or remedy may have a significant impact on the overall design, since the changes involved may themselves introduce more risk.
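As a purely illustrative sketch, the four factors could be combined to rank Misbehaviours as shown below. The 1 to 4 ordinal scales, the use of a simple product as the score, and the example risks are all assumptions made for this example. Neither STPA nor RAFIA prescribes this scheme, and a project may prefer a risk matrix or another established method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    severity: int         # 1 (minor) to 4 (catastrophic)
    likelihood: int       # 1 (rare) to 4 (frequent)
    exposure: int         # 1 (seldom exposed) to 4 (always exposed)
    controllability: int  # 1 (easily mitigated) to 4 (cannot be mitigated)

    def score(self):
        # Illustrative only: a plain product of the four ordinal ratings
        return self.severity * self.likelihood * self.exposure * self.controllability


def rank(risks):
    """Order risks from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score(), reverse=True)


# Hypothetical ratings for two hypothetical Misbehaviours
risks = [
    Risk("Log file fills disk", severity=2, likelihood=3, exposure=4, controllability=1),
    Risk("Stale ready signal", severity=4, likelihood=2, exposure=2, controllability=3),
]

for r in rank(risks):
    print(r.name, r.score())
```

Multiplying ordinal ratings is a convenient way to produce an ordering, but the resulting numbers have no meaning beyond that ordering and should not be read as probabilities or costs.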
Evaluation of risk is considered in the Eclipse Trustable Software Framework as part of TA-CONFIDENCE.