Table of Contents
About
Eclipse Trustable Software Framework (TSF)
The Eclipse Trustable Software Framework (TSF) approach is designed for software where factors such as safety, security, performance, availability and reliability are critical. Broadly, we assert that any consideration of trust must be based on evidence.
See the accompanying documentation for an explanation of:
- the context and rationale for this approach
- its underlying model and methodology
- guidance on its implementation.
Info
You can also view this site as a single page.
This overview uses a generic "XYZ" software project/product as an example. For XYZ specifically, we need to collect and evaluate evidence for all of the following:
- Provenance: where it comes from, who produced it, and what claims they make about it
- Construction: how to construct, install and run it. Also, how to be sure that we constructed, installed and ran it correctly
- Change: how to update it, and be confident that it will not break or regress
- Expectations: what it is expected to do, and what it must not do
- Results: what it actually does, compared to our expectations
- Confidence: our confidence in the software, based on all of the above
Software and systems complexity has increased to the extent that in most cases we cannot aspire to bug-free software, deterministic behaviours, or complete test coverage.
In practice XYZ must be released into production regularly, to introduce planned updates and improvements, and irregularly to address severe problems and security vulnerabilities. Every release can and should be considered to involve risk. The Eclipse Trustable Software Framework aims to provide guidance for measuring, managing and minimising the risk associated with a software product/project, both in general and for each release.
Trustable Tenets and Assertions
Our working model for a given release of XYZ is as follows. You can click on the boxes in the diagram or browse the specification to see the detailed definitions:
Goal
Our goal is to provide a structured argument for trustability of a given release of XYZ, by gathering and presenting evidence across a range of factors (Trustable Tenets and Trustable Assertions) based on a TSF specification of the software project.
We aim to calculate (or at least estimate) a confidence score for each Tenet and each Assertion based on the available evidence, and then distil these scores into an overall Trustable Score for the release.
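As a purely illustrative sketch (not the scoring algorithm implemented by the TSF tooling), distilling per-Tenet and per-Assertion confidence values into an overall score could be as simple as a weighted average:

```python
# Hypothetical illustration only: combine per-Tenet/per-Assertion confidence
# values (each in [0, 1]) into a single overall score via a weighted average.
# The actual trudag scoring algorithm is documented with the tool itself.

def trustable_score(scores: dict[str, float],
                    weights: dict[str, float] | None = None) -> float:
    """Weighted average of confidence scores; equal weights by default."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total

# Three Assertions with differing confidence:
print(trustable_score({"TA-TESTS": 0.8, "TA-FIXES": 0.6, "TA-UPDATES": 0.9}))
```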
Note this approach is intended to be used alongside, or after, the de-facto development process for XYZ.
TSF Rationale
Context
Typical arguments for critical systems treat software differently from hardware. Whereas hardware is expected to suffer both random and systematic failures, only systematic failures are considered for software.
In practice this has led safety engineers to focus on processes and practices which have not kept pace with industry norms, and are arguably no longer fit for purpose in our rapidly evolving environment. Whereas most software organisations claim to be "Agile" and rely heavily on open source, critical systems practitioners are still advocating "Waterfall" and relying on techniques (and in some cases even tools and software) that fell out of mainstream use decades ago.
The world has changed significantly, but the standards (and indeed the techniques used to devise and maintain the standards) are not keeping up.
Where are we now?
Any promises we hope to make about software must be made with the knowledge that:
- most modern target hardware is complex and non-deterministic, and comes bundled with a huge amount of firmware (which is just hidden, non-certified, binary software).
- complex software contains bugs, is non-deterministic and evolves rapidly
- the external environment for software is also evolving
- it is not feasible to specify fully the behaviour of complex systems
- developing new software at scale is obviously risky - usually much more risky than reusing existing code that is widely used and actively maintained
- modern systems are connected to networks, and thus subject to evolving security threats which must be mitigated throughout their product lifetime
As a result, we must consider that:
- we cannot hope to achieve 100% confidence in most software, particularly complex software running on multicore processors
- software cannot be considered to be 100% "safe" or "secure" or "reliable" or "bug free"
So how do we do better?
In our view, the best we can (and should) realistically aim for is to:
- analyse the specific Behaviours we require from software in a specific context, i.e. running on specific hardware, for a specific set of use-cases
- analyse what could go wrong in that context
- devise fixes and/or appropriate monitoring and mitigations
- demonstrate that the software provides the Expected Behaviours
- demonstrate that things typically don't go wrong
- demonstrate that mitigations work as expected when things do go wrong
- measure our confidence in the above
- be ready to repeat the above every time we need to change the software, or hardware or both
We consider that delivery of software for critical systems must involve identification and management of the risks associated with the development, integration, release and maintenance of the software.
Further we consider that delivery is not complete without appropriate documentation and systems in place to review and mitigate those risks.
The Eclipse Trustable Software Framework provides a basis to help us, and our customers, to manage these risks as we understand them. Broadly the approach is to consider supply chain and tooling risks as well as the risks inherent in pre-existing or newly developed software, and to apply statistical methods to measure confidence in the whole solution. We believe that this is most usefully applied at the integration level, which is where the problems will usually be noticed.
Our approach is to:
- make specific promises, in a specific context
- devise methods to show that the promises are usually met
- verify that these methods are reporting truthfully
- analyse the ways that our promises may be broken, and either fix or mitigate
- measure how often the promises are broken, in engineering and in production
- calculate confidence values based on the measurements, for each software release
- provide all of the evidence and tooling for the above, along with source code to our customers, so they can incorporate our work into their overall solution and make their own promises
So for critical software to be considered 'Trustable' we suggest it must be provided with the following constraints:
- risks/hazards associated with the planned use of the software are analysed
- Expected Behaviours are explicitly documented
- prohibited Misbehaviours are explicitly documented
- Expected Behaviours are shown to be provided, by tests
- test procedures and results are verified
- prohibited Misbehaviours are shown to be absent, mitigated or fixed
- process artifacts and test results are captured as evidence
- evidence is analysed, distilled and presented with confidence values for each release
Key insights
We can and should:
- accept that complex software cannot in practice be 100% risk-free
- expect and intend to provide timely updates to mitigate problems as they arise
- expect complex software to exhibit random/stochastic behaviour
- apply statistical methods to establish confidence in software
- use soak testing to explore software behaviour over extended time periods
- use stress testing to identify and analyse rare events
- use CICD to lock down the target code and the whole supply chain
Compliance
Compliance with Trustable means making a commitment to:
1. Only make claims that you provide evidence for.
2. Use evidence to measure the extent to which those claims are met.
The Trustable Methodology provides a flexible approach to building chains of Claims connecting Expectations to Evidence.
The Trustable Score provides a measurement of the degree to which your Expectations (and Assertions) are met, based on the quality of your Evidence and your argument.
Using the `-trustable` suffix
A release that meets conditions (1) and (2) above is considered to comply with Trustable.
This should be indicated by appending the `-trustable` suffix to the release tag.
Note
Compliance with Trustable (or equivalently a release marked with
`-trustable`) does not mean that the software can be trusted implicitly
(that is, without question or reservation). Rather, it means that
sufficient information about the software and its properties is provided
to the user, to enable them to make an informed decision as to whether
they can trust the software for their application.
Reference implementation of Trustable
We believe that our implementation of Trustable (following the Trustable Methodology and using the tooling we provide to produce a Trustable Report) is a systematic method for achieving Trustable Compliance. Therefore, consumers of this reference implementation of Trustable may interpret (1) and (2) as equivalent to the following:
- The Trustable Methodology is applied to all claims made for their software
- `trudag` is used to store and track confidence in these claims
- A Trustable Report is included with the release:
  - The report is not meant to be checked in, but should be provided with each release as a downstream consumable artefact.
  - Optionally, data can be shared with each release to allow reproduction of the generated report.
A release meeting these conditions may also use the `-trustable` tag.
Releases
TSF Release Process
Release versions
Releases and release candidate versions are tracked with git tags using a Semantic Versioning 2.0 scheme as follows:
- Release tags of the form: `<MAJOR>.<MINOR>.<PATCH>`
- Release candidate tags of the form: `<MAJOR>.<MINOR>.<PATCH>-rc<N>`

where:

| Component | Meaning |
|---|---|
| `<MAJOR>` | incompatible API changes |
| `<MINOR>` | backward-compatible features |
| `<PATCH>` | backward-compatible bug fixes |
| `<N>` | release candidate number |

All components are numerical (e.g. `0.7.2-rc1`).
The project aims to have a release each month.
Releases prior to 0.1 use the CalVer versioning scheme and
are considered breaking.
Version 1.0.0 will be the version in which the Python-based `trudag` will be
deprecated in favour of the Rust-based tool, and
a stable API will be defined.
Release candidates are a means for testing a release before it is finalised. Project users can use the release candidate to test the tooling on their project, reporting back any bugs or breaking changes that occur. If there are a number of bugs, there will be a second release candidate and so forth. If the release candidate is considered acceptable, a full release will follow ~2 days later.
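As an illustration of this scheme (not part of the TSF tooling itself), a tag can be checked against the release and release-candidate formats with a few lines of Python:

```python
# Illustrative check that a git tag matches <MAJOR>.<MINOR>.<PATCH> or
# <MAJOR>.<MINOR>.<PATCH>-rc<N>, as described above.
import re

TAG_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-rc(\d+))?$")

def parse_release_tag(tag: str) -> dict | None:
    """Return version components, or None for a non-release tag."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    major, minor, patch, rc = match.groups()
    return {"major": int(major), "minor": int(minor),
            "patch": int(patch), "rc": int(rc) if rc else None}

print(parse_release_tag("0.7.2-rc1"))  # {'major': 0, 'minor': 7, 'patch': 2, 'rc': 1}
print(parse_release_tag("1.0.0"))      # {'major': 1, 'minor': 0, 'patch': 0, 'rc': None}
```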
Tooling Interface Stability
The current `trudag` tool is still in the development stage and therefore cannot be considered fully stable.
However, favouring stability over constant change, we loosely follow the stability rules below in practice.
The interface surfaces and what should be accounted for in the context of stability are enumerated below:
Dependencies
Any changes to dependencies should only be considered in major releases.
Command Line
Commands and their respective options will not be removed or changed without a major release. New commands and flags should be added in minor release updates.
Artifact
The data model should only change in a minor release if additional keys are added in a backwards-compatible manner; otherwise changes should only be considered for major releases.
Library
Changes to the public library interface are discouraged in minor releases; however, as the project is still in the development stage, they may still occur.
Release preparation
- Add an issue with the `release` template to track the release in GitLab.
- For a stable release, create a merge request and update the Release Notes:
  - Update `docs/NEWS.md` to summarise changes, issues resolved, breaking changes and new issues identified
  - Update the package version specified in `pyproject.toml` to the planned release version
- Create a tag from the MR branch, with a corresponding release candidate version specifier, and check that the release job runs successfully
- Verify the generated release package by:
  - Downloading it from the package registry
  - Following the instructions for installing and updating the TSF tools
- Create the release:
  - Create a release tag on `main` (see format above) after the release MR has been approved and merged
  - Check that the release has been created and verify the release package
  - Announce the release to the tsf-dev mailing list
- Update and close the issue
Release Notes
Current version
The current software package for TSF (also available as a .tar.gz) is:
- `trustable-0.2.0-py3-none-any.whl`
Refer to the provided instructions for installing and updating.
v0.2.0
- Trustable Changes:
  - Reformat assertions to add section header files for TAs.
- Performance improvements:
  - Added concurrency to references.
  - Added option to resolve validators concurrently during scoring and publishing operations (with the `--concurrent-validation` flag)
  - Improved performance of `trudag publish` through caching and simplification of report figures.
  - Added caching of artifact references for faster `trudag` operations.
  - Added cached successors and predecessors for `BaseGraph` for faster graph read-only operations.
  - Changed the scoring algorithm from an adjacency matrix to an adjacency list to improve performance.
- Tooling improvements:
  - Changed the default exporting of artifacts to stop when encountering failed references (this can be changed through the `--allow-failure` flag).
  - Sets the `--fail-on-error` flag to `True` by default. This flag is generally applied to linting and diffing `trudag` operations.
  - Added item file order and SME scores to exported artifacts.
  - Various improvements to the `SourceSpanReference` reference type.
  - Added version hash postfix for `trudag --version` and `trudag shell`.
  - Links can now be made if the parent/child item is unreviewed.
- New Features:
  - Added `trudag manage move-item` command to move items between `.dotstop.dot` and `.needs.dot` files.
  - Added publishing from artifacts through `trudag publish --artifact artifact_path`.
  - `trudag publish --sensitivity` is made functional.
- Documentation updates:
  - Fixed README typos.
  - Added explanation of system and testing faults in RAFIA documentation.
  - Improved Remote Graph documentation.
  - Improved Scoring implementation documentation.
v0.1.0
- Project Updates:
  - The Trustable Software Framework is now hosted at https://gitlab.eclipse.org/eclipse/tsf/tsf in co-ordination with the Eclipse Foundation as the `Eclipse Trustable Software Framework` (project home at https://projects.eclipse.org/projects/technology.tsf).
  - TSF releases now follow SemVer versioning
  - Updated release process with inclusion of release candidates.
  - Updated project chat service URL from the `#trustable` channel on https://www.libera.chat to https://matrix.to/#/#technology.tsf:matrix.eclipse.org.
  - Added `SECURITY.md` for guidance on vulnerability reporting.
  - Added `NOTICE.md` for notices on Copyright and licensing information.
  - Updated `CODE_OF_CONDUCT.md` to be in line with the Eclipse Foundation.
  - Updated `CONTRIBUTING.md` to be in line with the Eclipse Foundation.
  - Updated Copyright headers.
  - Added `Eclipse` branding to the project.
  - TSF artifacts/Remote graphs now available on mainline pipelines for users.
- New Features:
  - `trudag shell` can be used to create a single-session REPL through which users can run various `trudag` commands and make changes in-session. For more information, please look at the documentation on using the shell.
  - Adds artifact versioning. (Currently it's optional and not enforced, but will be in the future; see issue #491.)
  - Adds a delete file option (default being enabled) for `trudag manage rename-item`/`trudag manage remove-item`.
- Fixes:
  - Spelling fixes for documentation and TSF tenets and assertions.
  - Removed key duplication in scoring and collecting validators. Fixes a bug where validation results were being duplicated.
  - Links in published index file corrected.
  - Disables validation result inclusion from artifacts to avoid errors with exporting of Trustable Remote Graphs/Artifacts (will be correctly included in the next release).
  - `trudag manage format` now supports SemVer and CalVer versioning.
  - Corrects sensitivity admonition for reports.
  - Corrects navigation index in published reports when using Artifact References.
  - Clearer `--reviewed` option for `trudag set-item`.
  - Option to enable/disable item text in `trudag plot`.
- Tool Improvements:
  - Faster resolution of Remote Graph/Artifact references in consumer/downstream projects.
  - Faster validation of `TrustableGraph` during scoring.
  - Pooled requests and `retries` option for GitLab References.
- Documentation:
  - Added a `Trustable in Media` page to the project documentation.
  - Improved installation instructions.
  - TSF Objectives section added to explain the TSF objectives and the principles behind them.
  - Adds a dynamic layout footer section (useful when browsing the documentation with mobile devices).
v2026.01.28-01
- Fixes:
  - Removed `typing_extensions` as a runtime dependency.
v2026.01.28
- Breaking Changes:
  - `trudag` build dependency updated: python3.10 -> python3.11.
  - Turns non-normative trustable assertions into file references in the trustable repository.
- Trudag Tool Enhancements:
  - Improved output/logging for:
    - `trudag manage lint`
    - `trudag` help messages.
    - `trudag manage` errors.
  - Improved performance for:
    - `trudag manage show-link`
    - `trudag manage lint`
    - Loading of Trustable Items during initialisation of the `trudag` tool.
  - `trudag manage show-item --path` command added to show the path of an Item.
  - `trudag manage add-item` command added to add an existing markdown item file to the Trustable project database.
  - Fixed regression (introduced in v2025.12.18) causing validators to be run twice in `trudag`.
  - Linking is permitted only between reviewed items.
  - Preserve links between items when importing a Trustable Artifact.
  - Add optional validation for the Trustable artifact export process in `trudag`.
  - Preserve metadata in Trustable item references when importing Trustable artifacts.
  - Failures in resolution of item references are logged instead of raised as exceptions.
- Report Improvements:
  - Item sensitivities sorted, using scientific notation.
  - SME Name, SME Score, and SME reasoning are shown as tables in the compliance report.
  - Adds rendering options for the `FileReference` reference type.
- Documentation:
  - Broken links are fixed.
  - Improve dark mode colours.
  - Remove documentation for un-implemented feature (configuration for `trudag`).
v2025.12.18
- Breaking Changes:
  - Score origins included in Trustable artifact schema; artifacts from prior releases are unusable.
  - Unified `BaseItem` interface to concrete type `Item`.
  - Serialisation and deserialisation formats have changed.
- Trudag Tool Enhancements:
  - Ability to reference subgraphs in artifacts.
    - Referenced artifact reports are generated as sub-reports.
    - Reference renders as a link to subgraph root items.
  - TSF artifact generated in Trustable repository CI.
  - All TSF tenets and argumentations are considered "needs".
  - Support for a general configuration file.
  - Support for `metadata` in Reference definitions.
- Report Improvements:
  - Origin of score shown in compliance report.
  - Additional validator information displayed.
- Documentation:
  - Restructuring of documentation.
  - Roadmap for new Rust tools.
  - Documenting proposed use cases for remote graphs.
  - TSF expectations graph expands on mouse hover.
v2025.11.27
- Breaking Changes
  - Change default Trustable Report and Graph plot output path from `docs/doorstop` to `docs/trustable` (as the `doorstop` backend is no longer supported).
  - Change importing directory option for `trudag import` from `--needs-dir` to `--import-dir`.
  - Remove resolved graph's root node from being exported/imported through Trustable artifacts.
- Trudag Tool Enhancements
  - `GitlabFileReference` now fetches files from private GitLab repositories more securely.
  - More expressive log messages and warnings for:
    - References
    - Needs/AOU items during exporting/importing of Trustable artifacts.
  - Add `pyyaml` python package as a runtime dependency (fixes issues with fresh installations).
  - Importing of Items from Trustable artifacts required providing a namespace; the namespace is now checked to be unique during import.
  - Needs/AOU items can now contain Trustable references.
  - Trustable projects containing only needs/AOU items and without scored/resolved graph items can now be exported/imported through Trustable artifacts. (Trudag's inability to consume parts of the resolved/scored graph in the artifact is a known issue and will be worked on in the future.)
- Report Improvements
  - Add Sensitivity Summarisation for Trustable Statements (this feature has known performance limitations).
  - Bugfix: Remove Fallacies from showing up in reports, as they are not fully implemented.
- Documentation
  - Bugfix: Make the Trustable Graph on the landing page clickable.
  - Add expected architectural design of TSF, informing future tooling and data store implementation directions.
  - Add instructions for the `trudag` tool's `publish` and `plot` commands.
  - Update Architectural diagram illustrating Remote Graphs.
v2025.10.22
- Breaking Changes
  - Removal of doorstop as a supported backend and dependency
    - Removal of `migrate` command
  - `init` is now a command, not an option
  - Unreviewed items and links are now scored as 0, rather than being removed
    - This means lint warnings consistently degrade the score.
  - Clarifications to all TSF Trustable Tenets and Assertions
- Trudag Tool Enhancements
  - Re-add concurrency and parallelism to graph linting
  - Allow use of project name or id number for Gitlab file references
  - Fix for artifact bug when root node score was 0
  - Rounding of scores to 5 decimal places
  - Introduce `trudag manage rename-item`
    - Creates a copy of the markdown item with the new name
  - Eager evaluation of `--help`
- Report Improvements
  - Limiting rows in historical data table to 20 to prevent it growing too large
  - Better presentation of information for item sections
  - Sensitivity analysis inclusion in report
    - Defaults to off, because sensitivity analysis can take an extraordinarily long time for large graphs.
  - Fixes table colour for dark mode
  - Display commit datetime in a more readable format
  - Replace Emojis with Unicode characters
- Documentation
  - Add documentation for RAFIA validation
  - Improve documentation on artifacts and remote graphs
  - Small fixes to API docs
v2025.09.16
Remote scoring now supports creating serialized artifacts from resolved items and needs for downstream projects in the local environment, which can be imported into remote graphs. This is an early, unstable iteration, and both the user experience and documentation will be refined in future updates.
This release also includes bug fixes and introduces a new interface for diff commands.
- Breaking Changes
  - Introduce `trudag manage diff` commands
    - And remove `trudag manage lint --diff` options
- Trudag Tool Enhancements
  - Add feature to generate artifacts for remote graphs
  - Improve error message for unexpected items without scores
  - Improve error message for normative scored items marked as informative
  - Add feature to pass boolean values to validators
  - Add feature to set-item unreviewed from cli
  - Fix missing `frontends/cli` declaration in `pyproject.toml`
  - Fix not providing a default graph name during `trudag init`
  - Fix `--help` evaluating subcommands from cli command groups
  - Refactor internal graph object access functions
- Documentation
  - Fix single page view
  - Clarify SME score replacement options
- Miscellaneous
  - Improve Flox setup
    - Run poetry install on activate
    - Add graphviz dependency for plotting
v2025.08.05
This version includes performance improvements across several areas of the
Trudag tool. In this release, we added multiprocessing support for linting
tasks, which leads to faster execution. We also introduced caching for items
and the reference builder. These changes improve the overall experience,
especially when working with larger graphs.
- Trudag Tool Enhancements
  - Linting tasks are now processed in parallel
  - Caching added for the reference builder and items
  - `set-link` now supports multiple links in one call
  - `set-link` now logs errors instead of raising them
  - Fixed broken links in the documentation
  - Small changes to data store values:
    - "Commit SHA" now stores the whole SHA rather than an abbreviated version.
    - "Commit tag" appends the abbreviated commit SHA.
  - Extended documentation and glossary for STPA.
v2025.07.23
The new Remote Scoring feature allows the integration of multiple trustable
graphs (hosted in external Git repositories) into the scoring process of the
local trustable graph. This update lays the groundwork for upcoming features
related to the consumption of external graphs and introduces improvements to user experience.
Change Summary
- Trudag Tool Enhancements
  - Added new reference type: `source code reference`
  - Redesigned the graph interface to simplify the addition of new backends
  - Introduced support for an ignore list file to exclude specific files from Trudag operations
  - Reference classes are now loaded from the entry point file location rather than the working directory
  - Added monitoring for validator execution time
  - Improved exception handling and linting functions
  - Updated documentation for usage with Doorstop
  - More explanatory user-facing messages
v2025.06.25
A new feature has landed: Data Store. This feature enables users to visualize
the progression of each expectation as a graph. Additionally, users can now
slice the graph directly from the CLI for more targeted insights.
This release also includes several user experience improvements to the
trudag tool.
Change Summary
- Trudag Tool Enhancements
  - Nodes and edges are now organized by name and source name
  - Introduced the new `Data Store` feature:
    - Graphical view of expectations
    - CLI-based graph slicing
  - Added more tests:
    - Including coverage for the `publish` command
  - Publish command improvements:
    - Now runs faster
    - Outputs only SVG images
  - Matrix calculations are now more efficient
  - Removed mutable object from constructor interface
  - Improved tracebacks for validators and reference plugins
- Documentation
  - Clarified behavior of "inputs"
  - Fixed pinned links in the NEWS section
- CI Pipeline Improvements
  - Fixed issues in linting stage
  - Added more tests
  - Added Markdown linting
- Misc
  - Fixed a known issue related to a hidden Python 3.13 dependency
Known Issues
- All `trudag` commands require a GitLab auth token if a `public: false` reference of `type: gitlab` is present in the tree.
v2025.05.29
Refinements to the user experience for Trudag.
Change Summary
- Improvements to `trudag` CLI tool
  - Proper logging of `StopIteration` error raised on encountering missing item files
  - Proper logging of `schema.SchemaError` raised on encountering invalid item frontmatter
  - Proper logging of `Exception` errors raised when building references
  - Fixed bug where inconsistent dotfiles (an edge including an unspecified node) caused an unhandled Exception
  - Fixed bug where warnings in Validator plugins were treated as Exceptions
  - Fixed bug where mkdocs (specifically its use of jinja) interpreted raw escape sequences in referenced content as template specifiers
  - Fixed bug where markdown items were not ordered according to their level
  - Fixed bug where duplicate markdown item files caused unannounced nondeterministic behaviour
  - Fixed bug where `trudag manage create-item` caused an unhandled Exception
  - Improved import of custom Validators; each Validator is now only imported once
  - `public` option added to references of type `gitlab`, allowing public repositories to be referenced without an access token.
  - Upgrade to mkdocs-puml 2.3.0
  - Documents are now ordered alphabetically
  - Remove HEAD request from references of type `gitlab` to improve performance on large graphs.
- Documentation and Methodology
  - Improved Reference and Validator plugin documentation to include worked examples
  - Introduced new code of conduct for contributors
  - Trudag API reference documentation automated and updated across the codebase
Known issues
- Exceptions in custom Validator and Reference plugins are not handled reliably
- Additional undocumented dependencies required when using python 3.13 or newer
- All `trudag` commands require a GitLab auth token if a `public: false` reference of `type: gitlab` is present in the tree.
v2025.04.30
First release including `trudag`, a Statement management and analysis tool designed specifically for Trustable.
Change Summary
- Improvements to `trudag`/`trustable-compliance` CLI tool
  - Rename `trustable-compliance` as `trudag`
  - New subcommand `manage` for manipulating, linting and migrating to dot
  - Support for completeness and correctness model in scoring backend
  - Validator plugin API and optional use through `--validate` flag
  - Reference plugin API
  - Improved logging, including `--verbose` flag
  - Score dumping functionality, including `--dump` flag
- TSF Statements
  - TSF Statements now use dotstop backend, not doorstop
- Documentation and Methodology
  - Improved explanation of Misbehaviours and their role in RAFIA
  - Improved STPA schema
  - Improved methodology documentation
  - Safety requirement process refinement
- Project workflow
  - New base container stored in container registry to mitigate DockerHub pull rate limit impact
  - End-to-end tests for command availability, score dumping and dot management
  - Test coverage tracking
  - Lychee link linting
Breaking Changes
- `trustable-compliance` CLI tool and library renamed `trudag`
- `trudag` defaults to dotstop, not doorstop backend
- Leaf nodes are now called Premises. When linked to Artifacts they are Evidence, otherwise they are Assumptions
Known issues
- Exceptions in custom Validator and Reference plugins are not handled
- Additional undocumented dependencies required when using python 3.13 or newer
v2025.03.14
Documentation improvements, ongoing implementation of dotstop (TSF-specific replacement for Doorstop) and documentation on the RAFIA process.
Change Summary
- Added API references for trustable_compliance, graphalyzer and dotstop
- Added Lint and format checks using ruff
- Progress on implementation of dotstop
  - Read and write dotstop graphs and items natively from files
  - Reference management and reference plug-in support, including unit tests
  - YAML front matter schema
- Improved guidance for TA-MISBEHAVIOURS and associated glossary definitions
- RAFIA process documentation, illustrating some specific applications of TSF
- Renamed and expanded the TAs (see Breaking Changes)
Breaking Changes
The identifiers for the Trustable assertions have been changed in this release, to replace the numbers with more meaningful names. A new TA (TA-CONSTRAINTS) has also been added. The following table summarises the changes.
| Old | New |
|---|---|
| TA_A-01 | TA-SUPPLY_CHAIN |
| TA_A-02 | TA-INPUTS |
| TA_A-03 | TA-RELEASES |
| TA_A-04 | TA-TESTS |
| TA_A-05 | TA-ITERATIONS |
| TA_A-06 | TA-FIXES |
| TA_A-07 | TA-UPDATES |
| TA_A-08 | TA-BEHAVIOURS |
| TA_A-09 | TA-MISBEHAVIOURS |
| TA_A-10 | TA-INDICATORS |
| TA_A-11 | TA-VALIDATION |
| TA_A-12 | TA-DATA |
| TA_A-13 | TA-ANALYSIS |
| TA_A-14 | TA-METHODOLOGIES |
| TA_A-15 | TA-CONFIDENCE |
| (new) | TA-CONSTRAINTS |
Known issues
- Needs guidance on how to:
  - Import existing requirements into a TSF graph
  - Integrate TSF graphs from other projects
- Support for multi-repository usage of TSF is experimental only and has a number of known issues
v2025.02.18
Improved user instructions
Changes
- Improved tool and getting started documentation
- Guidance on organisation of items and graphs
- Clarifications and improved guidance for TA items
- Criteria for SME assessments
- Clarified documentation about review and the review status field
- Small TSF tool improvements, including graduated colour maps for scores
Known issues
- See list for previous release (unchanged)
v2025.01.30
Release of project in preparation for public launch.
Changes
- Added licenses for documentation and tools
- Removed deprecated validator code
- Improved release job to publish package in project PyPi via twine
- Improved readability for text on coloured background in compliance report
Known issues
- Mermaid diagram rendering is broken in the Compliance documentation
- Needs guidance on how to:
  - Import existing requirements into a TSF graph
  - Integrate TSF graphs from other projects
- Missing API reference pages for `trustable-compliance`
v2025.01.29
Interim release of project in its new context on gitlab.com
Changes
- CI and tooling updated to build for the new context
- Theme and document generation configuration updated to a Trustable theme
Resolved issues
- Guidance for TA-18 and TA-20 needs adding
Known issues
- Mermaid diagram rendering is broken in the Compliance documentation
- Needs guidance on how to:
  - Import existing requirements into a TSF graph
  - Integrate TSF graphs from other projects
- Missing API reference pages for `trustable-compliance`
- Support for multi-repository usage of TSF is experimental only and has a number of known issues
v2025.01.28
Improvements to documentation, tooling to support future migration to a new graph model design and renumbering of the Trustable Assertions
Changes
- Tooling extended and improved to support:
  - New 'dotstop' interface layer to support new graph model design in future
  - Restructured Trustable report
- Documentation extended and improved to add:
  - Design concept description for 'dotstop'
  - Refined requirements management process
  - Expanded and improved methodology
- Trustable Assertions and Tenets updated to:
  - Separate informative (guidance) from normative (Statements) text
  - Change TA UIDs to define a new 'series', with a series identifier (A) and a new numerical sequence for the Items
Known issues
- Needs guidance on how to:
  - Import existing requirements into a TSF graph
  - Integrate TSF graphs from other projects
- Missing API reference pages for `trustable-compliance`
- Guidance for TA-18 and TA-20 needs adding
- Support for multi-repository usage of TSF is experimental only and has a number of known issues
v2024.12.19
Further refinement of the methodology, tooling to support confidence scoring and a refactoring of the Trustable Assertions to align with the TSF graph model.
Changes
- Tooling extended and improved to support:
  - Autogenerating the Trustable Tenets and Assertions diagram from Doorstop Items
  - Aggregating confidence scores from Doorstop Items
  - Inclusion of inlined Artifacts in the Trustable Report
- Documentation extended and improved to:
  - Clarify and refine the methodology, adding diagrams to illustrate the TSF graph model
  - Provide detailed instructions for creating, storing, and managing a TSF graph
  - Refactor the Trustable Assertions to address identified problems and align their relationships with the TSF graph logic
Resolved issues
- Guidance added to documentation for how to:
  - Integrate the Trustable TSF graph into a project
  - Update a TSF graph when evidence changes
Known issues
- Needs guidance on how to:
  - Import existing requirements into a TSF graph
  - Integrate TSF graphs from other projects
- Missing API reference pages for `trustable-compliance`
- Guidance for TA-18 and TA-20 needs adding
- Support for multi-repository usage of TSF is experimental only and has a number of known issues
v2024.11.29
Updates to support confidence scoring, reporting and provide guidance to users on the use of the TSF tools and methodology.
Changes
- Tooling extended and improved to support:
  - Recording and management of confidence scores
  - Plotting graphical summaries of databases
  - Improved compliance reporting
- Documentation extended and improved to:
  - Describe what TSF compliance involves
  - Explain how to start applying TSF to a project
  - Add instructions for installing and updating TSF tools
  - Add guidance for projects applying TSF on the use of a `-trustable` suffix on release versions
Resolved issues
- Tooling to support confidence scoring is now provided
- Spurious 'fatal error' reported by `trustable-compliance` fixed
Known issues
- Needs guidance on how to:
  - Integrate the Trustable TSF graph into a project
  - Import existing requirements into a TSF graph
  - Update a TSF graph when evidence changes
  - Integrate TSF graphs from other projects
- Missing API reference pages for `trustable-compliance`
v2024.10.24
Initial release of Eclipse Trustable Software Framework
Changes
- Overview and rationale for Trustable
- Preliminary guidance on implementation
- Methodology for authoring and organising TSF metadata
- Instructions for managing metadata using Doorstop
- Examples of automated verification
- Report generation tooling based on custom version of Doorstop
Known issues
- Integration of the TSF Doorstop tree into a project is a manual process
- Tooling to support confidence scoring is not yet provided
- Needs clearer guidance on:
  - How to import existing requirements into a TSF Doorstop tree
  - How to update Doorstop when evidence changes
Media
All media related to TSF is enumerated here in reverse chronological order. Additions via merge request are highly encouraged.
Videos
Open Source Summit - August, 2025
ELISA - May, 2025
Eclipse Trustable Software Framework: Introduction - April, 2025
Paul Sherwood on Trustable Software - April, 2025
FOSDEM - February, 2025
Objectives
TSF Objectives
TSF objectives express the desired outcomes needed to evaluate Trustability for any software project. They are inspired by difficulties in establishing trust in software identified in the Trustable whitepaper.
These objectives are high-level, forming a foundation upon which individual projects expand argumentation and provide evidence.
TSF objectives should adhere to the following principles:
- accommodate existing software standards and best practices
- apply to any software system, regardless of domain or scale
- express Trustability concerns (What and Why) without prescribing implementation mechanisms
- focus exclusively on critical aspects required to establish Trustability
- be extendable
Projects should expand upon these objectives with a suitable model and methodology, such as that proposed by TSF, and develop the argumentation with processes recommended here.
Model & Methodology
Trustable Model and Methodology
Trustable provides a theoretical model for reasoning about critical requirements, and a methodology for applying it to software projects in practice. The model and methodology are structured so that an approximation of organizational confidence in XYZ can be computed automatically, in CI. Better still, by reasoning about multiple systems with the same model we can manage their interactions in a scalable manner.
Motivation and Approach
Poorly-written and poorly-organised requirements are a frequent problem for software projects. Where these projects contribute to critical systems, creating numerous requirements can worsen the very problems they are intended to solve.
Poorly-written requirements are characterised by imprecise or ambiguous language, unspecified contexts and implicit criteria. When exhibited in high-level requirements these problems can be propagated downwards: from architecture into design and ultimately implementation.
On the other hand, flaws can also propagate upwards. When teams begin work without requirements, or use prior art, the objectives, motivation and reasoning for that work are often lost. Integrating this work into the wider design or architecture represents an organizational challenge.
Traditional approaches in safety mitigate these risks by defining a complex multistage lifecycle through which XYZ and its requirements are created, maintained and refined throughout the life of the product. Such a lifecycle cannot be applied to FOSS as it is inherently pre-existing, rapidly evolving and not subject to a well-defined development lifecycle.
Trustable endeavours to provide an alternative solution for managing the critical requirements of complex systems that addresses these commonplace problems in a way that can be applied to FOSS. We start from a minimal theoretical model for requirements, from which all of our terminology, methodology and tools are derived. This provides an unambiguous approach to requirements that is flexible and extensible but always self-consistent. By following simple rules in our implementation, we aim to build a complex model of XYZ, whatever its state of development, that is meaningful, robust and rigorous.
Model
Trustable Graphs
The basic building blocks of our model for requirements are Statements: Definitive expressions with meaningful interpretations when considered to be True and False.
A good Statement
The Trustable project provides tools that are implemented in Python.
This is a good statement because:
- If we suppose it is True, we know what to conclude: Trustable offers some tooling which is written in Python.
- If we suppose it is False, we know what to conclude: Trustable may or may not offer some tooling, but none of the tools are implemented in Python.
A bad Statement
Trustable should be written in Python.
This is a bad Statement because it is difficult to infer what it means for this to be True:
- Who or what thinks Trustable should use Python?
- What should be written in Python? The tooling? What about documentation?
Similarly, it is difficult to understand what this means if the Statement is False.
Because Statements must be expressed in natural language, there is an unavoidable element of subjectivity in how they should be written (but not what they mean!). Recognising that there will be necessary and/or sensible exceptions, we recommend that Statements:
- Use the indicative mood
- Use the third person perspective
- Use the present tense
- Are affirmative
- Are single sentences
Statements are connected by Links. A Link from Statement A to Statement B means that Statement A logically implies Statement B. It can be helpful to remember this is equivalent to "B is a necessary but not sufficient condition for A". By convention, we refer to Statement A as the parent and B as the child.
The set of Statements about a project, together with their Links, forms a directed graph. To remove the possibility of circular arguments, we insist that this graph is acyclic. We call the resulting directed acyclic graphs (DAGs) Trustable Graphs.
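To make this concrete, here is a minimal sketch of Statements, Links and the acyclicity requirement; the Statement names and graph representation are illustrative assumptions, not the dotstop data model:

```python
# Minimal sketch of the model: Statements connected by Links, with a check
# that the resulting graph is acyclic. Names here are illustrative only.

links: dict[str, list[str]] = {
    "XYZ-RELIABLE": ["TESTS-PASS", "UPDATES-TIMELY"],  # parent -> children
    "TESTS-PASS": ["CI-GREEN"],
    "UPDATES-TIMELY": [],
    "CI-GREEN": [],
}

def is_acyclic(graph: dict[str, list[str]]) -> bool:
    """Depth-first search for back edges; True if no cycle exists."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for child in graph.get(node, []):
            if colour.get(child, WHITE) == GREY:
                return False  # back edge: a circular argument
            if colour.get(child, WHITE) == WHITE and not visit(child):
                return False
        colour[node] = BLACK
        return True

    return all(visit(n) for n in graph if colour[n] == WHITE)

assert is_acyclic(links)  # a valid Trustable Graph must be a DAG
```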
Tip
Statements and Links are the basis for everything that follows. Make sure you have a good understanding of these concepts before reading further. If you are struggling, the TSF Matrix chat room is a good place to ask for help!
Classifying Statements
The Statements comprising a Trustable Graph can be naturally classified into two overlapping categories:
- A Request is a Statement that has one or more children.
- A Claim is a Statement that has one or more parents.
These simple definitions allow us to define three further and disjoint categories of Statement:
- An Expectation is a Statement that is a Request, but not a Claim.
- An Assertion is a Statement that is both a Request and a Claim.
- A Premise is a Statement that is a Claim, but not a Request.
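These categories follow mechanically from a Statement's parents and children, as this illustrative sketch shows (again, the graph representation is an assumption for illustration):

```python
# Sketch of the Statement categories defined above, derived purely from a
# Statement's parents and children in a parent -> children mapping.

def classify(statement: str, children: dict[str, list[str]]) -> str:
    has_children = bool(children.get(statement))                      # a Request
    has_parents = any(statement in kids for kids in children.values())  # a Claim
    if has_children and not has_parents:
        return "Expectation"   # a Request that is not a Claim
    if has_children and has_parents:
        return "Assertion"     # both a Request and a Claim
    if has_parents:
        return "Premise"       # a Claim that is not a Request
    return "unlinked"          # not yet part of any argument

links = {"XYZ-RELIABLE": ["TESTS-PASS"], "TESTS-PASS": ["CI-GREEN"], "CI-GREEN": []}
print(classify("XYZ-RELIABLE", links))  # Expectation
print(classify("TESTS-PASS", links))    # Assertion
print(classify("CI-GREEN", links))      # Premise
```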
Warning
Historically, we have used the term Evidence to mean a Premise. However, this wording becomes confusing when dealing with "unsupported" Statements. Evidence has a more specific meaning that will be discussed later.
Architecture
The Eclipse Trustable Software Framework is designed to be tool-agnostic, with one exception: the data defining a Trustable Graph is expected to be stored under version control in a repository managed by git.
The CLI tool (trudag) and data store backend (dotstop) that are provided as part of this project are the default implementation of TSF. This page documents the expected architectural design of TSF, which is intended to inform these and any other tool and data store implementations. Note that some of the features documented here may not yet be supported by trudag.
This architecture view looks only at the conceptual elements of the TSF graph (the types of data processed by TSF and the types of processes that consume that data) and how they interact. It does not address:
- Data formats, grouping and segmentation
- Processing and dataflow orchestration and configuration
- User, library and process interface definitions
- Examples showing how the discussed elements might be used to implement an argument
Overview
In defining this architectural view, we consider which types of information matter when applying TSF. First, we identify which parts of this information are managed by the user. Then, we look at the elements that process this data and create intermediate artifacts. Finally, we discuss these intermediate artifacts.

The TSF architecture distinguishes between three distinct categories of element that correspond to these topics, as illustrated above:
- Managed data defines a Trustable graph. It is always stored as files under version control in a single git repository, but may include references to data stored or managed in other contexts, including content in the graph itself, and files in its local git repository, or another repository. This type of data is expected to be persistent, i.e. it should be consistent for any instantiation of the graph unless explicitly changed. Note that this expectation may not apply to referenced data managed in another context (see Output data below).
- Providers are software components for a given TSF implementation, providing functions relating to a specific type of Managed data, which obtain or process Output data. These functions may be 'built-in' behaviours of the tooling associated with the implementation, or they may be implemented as extensions (e.g. plug-in libraries), enabling users to extend or customise the tooling's behaviour to support different types and sources of data.
- Output data is generated by Providers using information from an associated type of Managed data to obtain and process data from a referenced context. This type of data is expected to be transient, i.e. it is generated for each instantiation of the graph, and its content may vary. Examples include the results of executing a test, or of querying an external data source. Also included is referenced data from an external context, such as a file managed in another git repository, which may be subject to uncontrolled changes (e.g. if the reference is to a branch in another git repository).
The combination of the Managed data for a given iteration of the graph, and the Output data obtained or generated for it by a given set of Providers is called a Resolved graph.
Managed Data
This is the data that defines a TSF graph, which is created and managed by users, and stored in a git repository.
Statement
A Statement is Managed data that represents the most fundamental element of a TSF Graph, as described by the Model. A Statement must include a textual component, which is expected to consist of a single sentence.
Statements may be connected to other Managed data elements, as characterised by the following relationships:
- Other Statements that it supports, or which support it
- References that qualify it, or provide additional context
- Evidence that is used to confirm its validity
- Scores that record evaluations of its validity by a human
- Graphs that include it as a member (at least one)
- Namespaces that identify a group of related Statements
Reference
A Reference is Managed data that specifies data from this or another context, which is obtained via a Resolver and used to qualify, provide context or validate a Statement. A Reference may be characterised by attributes:
- The type attribute is always required; it determines the Resolver that is used to obtain Content data associated with the Reference. For example, the `file` type of Reference in trudag specifies a file managed in the same git repository as the graph.
- A class attribute may be specified to characterise the relationship between the element that owns the Reference and the Content data that it specifies.
As with the associated Content data, there are two categories of Reference:
- For Persistent References, the resolved Content data is expected to be consistently reproducible for a given iteration of the associated Reference. Examples include data obtained from a file managed in the same git repository as the Trustable graph, or a file managed in another git repository that is referenced using a specific tag or SHA-1.
- For Changeable References, the associated Content data may be expected to vary each time it is resolved. Examples include a file from the main branch of another git repository, the log of the most recent execution of a CI job, or the result of executing a query on a database of accumulated result data.
In both categories, the Reference implementation may include measures to determine whether the resolved Content data has changed since the reference was last updated. For example, a hash value for the content generated by a cryptographic algorithm might be stored as part of the Reference, and used by the Resolver to determine whether the data has changed.
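As an illustration of such a measure (the field names are assumptions, not trudag's schema), a Resolver might compare a stored SHA-256 digest against the freshly resolved content:

```python
# Sketch of the change-detection measure described above: store a SHA-256
# digest of the resolved Content data alongside the Reference, and compare
# it on each resolution. Field names are illustrative assumptions.
import hashlib

def content_digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

def has_changed(reference: dict, resolved_content: bytes) -> bool:
    """True if the resolved Content data differs from the digest recorded
    when the Reference was last updated."""
    return reference.get("content_hash") != content_digest(resolved_content)

ref = {"type": "file", "path": "docs/NEWS.md",
       "content_hash": content_digest(b"old contents")}
print(has_changed(ref, b"old contents"))  # False
print(has_changed(ref, b"new contents"))  # True
```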
References may also be connected to other Managed data elements, as characterised by the following relationships:
- A Namespace may define a class of References (e.g. the set of documented Misbehaviours)
- A Reference may specify a subgraph of a specified Graph (e.g. all Statements satisfying a given set of criteria)
- (Note: a subgraph in this context may consist of a single Statement.)
- A Reference may provide a documented description (e.g. context for a group of Statements) for a Namespace
- A Reference may provide documented qualification or context for a Statement
- A Reference may provide an artifact that is used as Evidence
In a Resolved graph, References are also associated with the corresponding Content data provided by their related Resolver.
Evidence
Evidence is Managed data that is used to validate a Statement. It defines the artifacts that are to be evaluated and may define criteria used in their evaluation. It may be characterised by the following additional relationships:
- References define input artifacts for use in an evaluation.
- Validators provide functions that are used to perform automated evaluations.
In a Resolved graph, Evidence is also associated with the Result data that is produced by its related Validator(s).
Where Validators use discrete inputs from an external source of data (e.g. a file managed in an external git repository, or the id of the latest CI pipeline), these should be specified in the Evidence element using References. This serves to document the specific source and context of inputs provided to the Validator, and can be used to capture and store the associated Content data as part of the Resolved graph. This can also be used to verify the reproducibility of the associated Result data in the Resolved graph.
Where a Validator interacts directly with an external source of data (e.g. executing a series of queries that depend upon each other), however, the data source and the input criteria may instead be specified as attributes of the Evidence element.
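The sketch below illustrates the shape such an Evidence element might take; the field names and the Validator name are assumptions for illustration only, not the dotstop schema:

```python
# Hypothetical shape of an Evidence element: References define the input
# artifacts, and a Validator plus criteria define the automated evaluation.
evidence = {
    "references": [
        {"type": "file", "path": "results/unit-tests.log"},     # local input
        {"type": "gitlab", "project": "example/project",
         "path": "ci/latest-pipeline.json", "public": True},    # external input
    ],
    "validator": "max_failures_validator",  # hypothetical Validator name
    "criteria": {"max_failures": 0},        # criteria used in the evaluation
}
```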
Score
A Score is Managed data recording the outcome of an evaluation of a Statement by a human, based on its defined Evidence and any Statement(s) that it supports.
Where a Statement has a Score, it must also have Evidence, which identifies the set of inputs that are to be evaluated, and any Validators that may be involved in the evaluation.
A Score may have three components:
- evaluator: Who or what determined this Score?
- verdict: Based on the specified Evidence, is the related Statement true, false, or undetermined?
- confidence: A value expressing the evaluator's degree of confidence in the validity of the verdict
The evaluator component identifies the individual(s) or Validator who provide the Score. A Statement may be assigned more than one Score, so this property is used to distinguish between them.
Multiple Scores provided by humans may be recorded for a single Statement, reflecting reviews by individuals with different areas of expertise, or responsibility.
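An illustrative representation of these three components follows; the storage format used by an actual implementation may differ:

```python
# Illustrative representation of the three Score components described above.
from dataclasses import dataclass

@dataclass
class Score:
    evaluator: str      # who or what determined this Score
    verdict: str        # "true", "false" or "undetermined"
    confidence: float   # degree of confidence in the verdict, in [0, 1]

# A Statement may carry several Scores from different evaluators:
scores = [
    Score(evaluator="safety-sme", verdict="true", confidence=0.9),
    Score(evaluator="security-sme", verdict="undetermined", confidence=0.5),
]
```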
Graph
A Graph is Managed data consisting of a set of Statements and the relationships between them.
The scope of a Graph may encompass all of the Statements in a Trustable graph, or a subset of these. It is characterised by the following relationships:
- A Graph defines the set of Statements that are its members.
- A Graph may be identified by a Namespace.
- A Reference may specify a subgraph of an associated Graph.
The Graph also documents the Links between Statements, which may have associated attributes.
A link should be annotated with a cryptographic hash, which records the state of the parent and child statements. This enables a tooling implementation to report when either of these Statements has changed, which should prompt a re-evaluation of the link to determine whether it is still valid.
A link may also be annotated with a weight, which expresses the relative significance or importance of a linked child Statement with respect to other children of the same parent.
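A sketch of how such link annotations might be computed and stored; the hashing scheme and field names are assumptions, not a prescribed format:

```python
# Sketch of the link annotations described above: a digest over the parent
# and child Statement texts (to flag stale links) plus a relative weight.
import hashlib

def link_fingerprint(parent_text: str, child_text: str) -> str:
    data = (parent_text + "\x00" + child_text).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

link = {
    "parent": "TA-TESTS",
    "child": "CI-GREEN",
    "fingerprint": link_fingerprint("All Expected Behaviours are tested.",
                                    "The CI pipeline for the release passed."),
    "weight": 2.0,  # this child counts twice as much as a sibling of weight 1.0
}

# If either Statement's text changes, the recomputed fingerprint will differ,
# prompting a human re-evaluation of whether the link is still valid.
```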
Namespace
A Namespace is Managed data that identifies a specific Graph.
Namespaces may be connected to other Managed data elements, as characterised by the following relationships:
- A Namespace may be associated with a class of References (e.g. the set of documented Misbehaviours).
- A Namespace may specify a definition Reference, which defines an associated subgraph of a specified Graph (e.g. all Statements satisfying a given set of criteria)
- A Namespace may also specify a description Reference (e.g. context for a group of Statements).
- A group of Statements may be associated with a Namespace (e.g. because their UIDs share the same prefix).
Providers
These are discrete elements of a TSF tooling implementation that are associated with particular types of Managed and Output data.
How Providers are implemented, scheduled and configured, and how their inputs and outputs are managed by an orchestrating process or processes, is outside the scope of this document.
How outcome documents (such as a Trustable Compliance Report) are constructed and laid out by Publishers is also outside the scope of this document.
Resolver
A Resolver is a Provider that obtains and verifies Content data for a given type of Reference. This data may be obtained from an external context (e.g. a file managed in a different git repository), or from the local context of the graph (e.g. a file in the same git repository), or from an element of the graph itself (e.g. a Statement).
The Resolver may implement a mechanism to determine whether the resolved Content data has changed since the associated Reference was last updated. See Reference for a further discussion of this.
Validator
A Validator is a Provider that performs automated evaluation using input data and criteria specified by an Evidence element. Validators may use referenced Content data and generate Result data as part of the validation process.
Every validator checks whether its input criteria are satisfied and its result data is as expected, reporting any violations with suitable error codes and messages as part of its Result data.
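A minimal sketch of a Validator under these assumptions (the interface shown is illustrative, not trudag's plugin API):

```python
# Sketch of a Validator: it evaluates referenced Content data against
# criteria from an Evidence element and returns Result data with a verdict.

def max_failures_validator(evidence: dict, content: str) -> dict:
    """Pass if the referenced test log reports no more failures than the
    Evidence element's criteria allow."""
    allowed = evidence.get("criteria", {}).get("max_failures", 0)
    failures = content.count("FAIL")
    result = {"verdict": failures <= allowed, "failures": failures}
    if not result["verdict"]:
        result["error"] = f"{failures} failures exceed the allowed {allowed}"
    return result

evidence = {"criteria": {"max_failures": 0}}
print(max_failures_validator(evidence, "PASS\nPASS\nPASS\n"))  # verdict True
print(max_failures_validator(evidence, "PASS\nFAIL\n"))        # verdict False
```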
Publisher
A Publisher is a Provider that generates documents from a given Resolved graph. The Trustable report generated by trudag is an example of such a document.
Publishers are provided with a Resolved Graph as an input, which includes all of the necessary Content and Result data, so that this can be included or rendered in the output document.
If the required data referenced by Statements in the provided input graph has not been obtained by the associated Resolvers or produced by the associated Validators, then the Publisher must omit it, or replace it with an explanation.
Output Data
This is data that is produced by Providers acting on Managed data. First, Content data is retrieved by Resolvers, to better contextualise parts of the Managed data. Then, Result data is gathered or generated against that context by Validators. Finally, all of this is collected together with the Managed data to create a Resolved graph.
Content data
Content data is Output data that is obtained when a Reference is processed by a Resolver. Two categories of Content data may be referenced:
- Persistent: the same result should be obtained, unless some attribute of the Reference is changed.
- Changeable: the result obtained may vary each time the Content data is obtained.
For both categories, the associated Reference type and Resolver may include a mechanism to detect when the resolved data has changed. See Reference for a further discussion of this.
Result data
Result data is Output data that is generated and used by a Validator as part of its evaluation of Evidence. This type of data is transient, but may be exported as part of a Resolved Graph.
Result data should always include a verdict component, where true means that the evaluated set of artifacts and results fulfil the criteria specified by the Evidence, and false means that they do not, or that an error prevented the Validator from completing its evaluation.
Result data may optionally include other types of data, which may be used to quantify or qualify the verdict. The meaning and significance of such data is Validator-specific. For example, a Validator may perform a calculation, and return a verdict of true if it falls within defined parameters, but also output the calculated result. For some Validators, this may be used to provide a confidence component of the associated Score.
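For example, Result data might take the following shape (an illustrative sketch only; the actual schema is Validator- and implementation-specific):

```python
# Illustrative sketch of Result data; the schema is implementation-specific.
result = {
    "verdict": True,  # the Evidence criteria are satisfied
    "value": 42.7,    # optional Validator-specific output, e.g. a calculated result
    "messages": [],   # error codes/messages for any violations
}
```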
Result data is always related to the corresponding Evidence element, and is typically associated with a corresponding Score element of the related Statement.
Resolved graph
When all of the Output data for a Trustable graph has been obtained or produced by the associated Resolvers and Validators, the result is called a Resolved graph.
This represents the state of the Trustable graph for a specific iteration of the Managed data at a given moment in time and for a given evaluation context.
An implementation of TSF should include methods for storing a Resolved graph as a persistent artifact, and for instantiating a Resolved graph from such an artifact.
Methodology
This section explains how this abstract model can be applied to XYZ in practice. Applying the Trustable Methodology is an iterative process. It has distinct stages, but these can be tackled in any order. They are:
- Setting Expectations
- Providing Evidence
- Documenting Assumptions
- Recording Reasoning
- Assessing Confidence
Whichever order(s) you perform these stages in, moving from one stage to another will always require Modification, which we discuss separately.
Setting Expectations
Stakeholders of XYZ must agree on the most important functional or non-functional requirements for XYZ within the context of the project. These may arise from or be informed by requirements outside of XYZ. This is anticipated and will be addressed directly later. Stakeholders should express these critical requirements as a set of Statements.
Thinking now about the model, we notice these are Requests that are not used to make any further Claims about XYZ. Therefore they must be Expectations. We conclude that Expectations originate from Stakeholders, who may be:
- Consumers (e.g. a customer or product owner providing requirements)
- Contributors (e.g. an architect identifying a key design goal)
- Others (e.g. the authors of a regulation or safety standard)
Tip
If you find you have high-level requirements that are important both within and outside of your project, don't worry! You can manage this using Access Specifiers, which are introduced later.
Providing Evidence
Contributors are often better placed than other Stakeholders to understand the properties of XYZ. It is recommended that, at least initially, Contributors own the process of gathering and providing Evidence.
In order to support the Claims we make about XYZ, we must express these properties as Statements and then measure or otherwise determine them. This will always require Artifacts: Identifiable components, products or byproducts of XYZ's execution and development. These Artifacts can be persistent (e.g. source code, process documentation) or transient (e.g. test results, incident response times).
Properties of transient Artifacts are best measured algorithmically. For instance, we can procedurally inspect and average the results of performance tests and record the desired performance threshold as a Statement. Our algorithm will now automatically tell us to what extent this Statement is True. We call this relationship between a Statement and an Artifact a Validation.
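For instance, a minimal sketch of such a Validation (the threshold value and the source of the results are hypothetical examples):

```python
# Minimal sketch: procedurally check a performance Statement against transient test results.
# The threshold and the source of `results_ms` are hypothetical.
def validate_mean_latency(results_ms: list[float], threshold_ms: float = 200.0) -> float:
    """Return 1.0 if the mean latency meets the recorded threshold, else 0.0."""
    if not results_ms:
        return 0.0  # no data: score conservatively
    return 1.0 if sum(results_ms) / len(results_ms) <= threshold_ms else 0.0
```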
Properties of persistent Artifacts such as documentation or source code can be measured algorithmically, but often we are interested in more complex or "soft" properties when dealing with them. In such cases, we make a Statement about the Artifact and then use Subject Matter Expert (SME) review to determine the extent to which that Statement is True. We call this relationship between a Statement and an Artifact a Reference.
Tip
When starting a project, it is often easier to start by referencing all or most of the Artifacts you are interested in, before thinking about validation; this is encouraged.
We appeal to Artifacts when we cannot support a Statement with other Statements. We call such Statements Evidence: Premises that are supported by reference to or validation of Artifacts.
Documenting Assumptions
Some requirements for XYZ can never be satisfied within the context of the project. For instance, nearly all software projects require a specific operating system or hardware; it is not sensible or reasonable to expect the author of a small library to also provide an OS!
These requirements should still be recorded as Statements, but we leave them "dangling". That is, they are Premises for which we have no justification at all. We refer to such Statements as Assumptions.
Recording Assumptions is just as important as recording Evidence, if not more so.
Tip
During the evolution of XYZ, you will often find Assumptions that actually reflect work that needs to be done within XYZ by its Contributors, not the Consumer. Recording these is encouraged, as it provides transparency for developers and users; they can clearly see what they need to fix or provide to achieve the functionality XYZ promises.
Recording Reasoning
The measurable properties of XYZ that we capture as Evidence, and the Assumptions we make about XYZ's properties, cannot always be directly Linked to our Expectations for XYZ. For instance, there may be a significant logical argument or chain of implication between Expectations and the properties of XYZ. This is particularly true for complex projects.
Instead of making large and undocumented leaps in logic, we make smaller logical steps using intermediate Statements. These Statements will necessarily have parents and children; they are therefore Assertions.
In a strictly logical sense the set of Statements alone should be enough to capture the argument, but they will not necessarily be sufficient to explain it to Contributors and Consumers. This is the role of Informative items. These are passages associated with a set of Statements and Links, which explain the reasoning for and thinking behind the argument represented by the subgraph.
The intermediate reasoning we make can be complex and detailed. It is not reasonable to assume it can all be broken down into simple sentences. For this reason we allow Assertions to be Qualified by Artifacts. That is, the Assertion can state that some complex property expressed in an Artifact is True, rather than reproducing the definition in the Statement itself. This relationship between Artifacts and Assertions is called Qualification.
Assessing Confidence
At any stage in the project's lifecycle and ideally every time it is changed, the Contributors to and Consumers of XYZ need to understand to what extent XYZ has achieved its stated goals. This is the role of a Confidence Assessment.
Evidence is scored, both by SME review of the referenced Artifacts and by executing validation algorithms. Scores for every Statement in the Trustable Graph are then calculated recursively. These scores should be used to inform where effort should be focused next.
Modification
While it can be helpful to consider a Trustable Graph as "requirements-as-code" in many contexts, there is a crucial difference we must bear in mind: unlike code, we cannot lint, test or otherwise automatically verify the underlying logic of the argument. Ultimately humans, not machines, have to decide whether the Links between Statements are valid.
To help us maintain the quality of the underlying logic, we introduce a new adjective to describe Statements and Links in our tooling: Suspect.
The first time a Statement is written, we confirm that it is a valid Statement. If it has references, then, when confidence is assessed, it will receive an SME review and a score. If the Statement later changes, how do we know that the Statement and its recorded score are still valid? We enforce the convention that any change to a Statement makes it Suspect, until it is reviewed by a human. By convention, we refer to items that are not Suspect as reviewed. Therefore, unlike ordinary Statements, Suspect Statements and their scores are not treated as valid.
Similarly, the first time we record a Link, we confirm it represents a logical implication. What happens if the parent or child Statement changes? The two Statements are still related, but we can't be sure that the relationship is a logical implication. The Link is now Suspect, until reviewed by a human. By convention, we refer to links that are not Suspect as clear or cleared.
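As a sketch of how tooling can detect this automatically (assuming the concatenated-content hashing scheme used by dotstop, described later in this documentation):

```python
import hashlib

def link_fingerprint(parent_text: str, child_text: str) -> str:
    """Fingerprint a Link as a hash of the concatenated Statement contents (sketch)."""
    return hashlib.sha256((parent_text + child_text).encode()).hexdigest()

def is_suspect(stored_sha: str, parent_text: str, child_text: str) -> bool:
    """A Link is Suspect when its stored fingerprint no longer matches the current contents."""
    return stored_sha != link_fingerprint(parent_text, child_text)
```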
Tip
An example process for managing modifications is provided here.
Summary
In this section we will quickly recap the new objects and relationships we introduced in the methodology:
- Artifacts are components, products or byproducts of XYZ.
- Evidence is a Premise that is supported by an Artifact.
- Assumptions are Premises that are unsupported.
- Artifacts can Qualify Assertions.
- Artifacts can Validate Premises, making them Evidence.
- Premises can Reference Artifacts to create Evidence.
Composing Projects
A key aim of Trustable is composability. Where XYZ and its dependencies both use Trustable, their respective Graphs can be easily combined.
When composing XYZ and a dependency, we simply decide which Assumptions in XYZ can be satisfied by Requests in the dependency and add a Link. What was an Assumption in the parent and a Request in the child becomes an Assertion in the context of the wider project.
Warning
Currently composing projects is done by manually vendoring items between repositories. This is a complex process that is error-prone and tedious. Tooling solutions are in active development.
Access Specifiers
Allowing consumers to depend on implementation-specific Assumptions or Assertions in your argument is a recipe for breaking changes. To avoid this, Statements can be classified with an access specifier:
- Public Statements can be freely used by consumers. Any change to Public Statements is a breaking change. A good project will ensure Consumers using Public Statements can easily move between versions of the software and argument.
- Private Statements are not intended to be used by consumers. Upstream projects may change, add or remove Private Statements as needed. Sensible Consumers will not use Private Statements from upstream sources.
Danger
This feature is not yet implemented in tooling. Instead, you should maintain a dialogue with your up- and downstream projects and communicate this information by other means. Clearly, this is a potential bottleneck for projects using TSF and great care should be taken when managing this, at least until tools are available.
Applying TSF
The Eclipse Trustable Software Framework is a special kind of project that is purely formed of requirements. It is intended to be composed with large software projects like XYZ, enabling them to audit the quality, completeness and correctness of their own arguments.
The TSF and XYZ should be managed separately. We recommend that you first perform one iteration of the Methodology directly to XYZ, before applying TSF.
This means deciding on your Expectations and building an argument for them out of Statements. At this stage, it's fine to have many broad Assumptions. The image on the right shows an example of how this may look: We have two Expectations, X1 and X2, supported by several Assertions which in turn are linked to Statements left as Assumptions, Zi.
Now we apply the TSF. Each TA in the TSF is in fact an Assumption that must be satisfied by XYZ. When we compose TSF with XYZ, we turn these Assumptions into Assertions by linking them into both new and existing argumentation.
To turn each TA from an Assumption into an Assertion, consider the Statements and Artifacts from XYZ that can be used to support the TAs. Note that this may require treating some Statements as Artifacts. For instance, in TA-BEHAVIOURS you will need to reference XYZ's Expectations. Similarly, TA-CONSTRAINTS requires you to reference XYZ's Assumptions. You may need to make new Artifacts to support TAs you have not considered before. For example, TA-ITERATIONS requires you to assemble and provide all source code with each constructed iteration of XYZ.
Note
The example below is incomplete and does not represent a sufficient argument for any TA.
The image below illustrates what this may look like for a subset of the TSF. Intermediate Statements Ui are used to tie XYZ's Statements and Artifacts into the TSF:
- U1 makes a Statement about the source code XYZ provides, supporting TA-ITERATIONS
- U2 makes a Statement about a property of XYZ's Expectations, supporting TA-BEHAVIOURS
- U3 makes a Statement about a property of XYZ's Assumptions, supporting TA-CONSTRAINTS
The Expectation in the TSF, TRUSTABLE-SOFTWARE, therefore provides a transparent and arm's-length (though not truly independent) assessment of the trust we can place in XYZ's Expectations and their score. This structure, although optional, allows upstream and downstream projects to reuse the argumentation body independently of the TSF and to separately reevaluate their trustability.
Scoring
The Trustable Score
The Trustable Score for each element of the set of Statements made to support Expectations is defined recursively from the Evidence scores. Evidence scores are based on either:
- Automated assessment of Artifacts using a well-defined metric.
- Calibrated Subject Matter Expert (SME) assessment of the Evidence Statement using the referenced Artifacts.
Warning
The implementation of Trustable Score calculation and its definition in the tools is work-in-progress. This section describes the current behaviour of the tools. For the complete and correct theoretical definition of the score, see the Scoring Roadmap.
Calibrated SME Assessment
Statements in the Trustable Methodology must be verifiable propositions. That is, the notion that they are true or false must be meaningful and measurable.
There are many examples of Statements that are verifiable propositions but are too complex or high-level to measure directly (e.g. "It will rain tomorrow"). The field of Decision Analysis provides a path forward in such cases: Calibrated Probability Assessment. This means using Subject Matter Experts' assessments of their confidence in a given Statement as a probability measure. While early work confirmed this approach had merit in some cases (e.g. weather forecasting), it also identified assessor overconfidence as having a serious impact on assessor accuracy¹. More recent work has identified credible methods for calibrating assessors, such as the use of structured feedback².
Key Concepts
- Confidence: A measure of the probability a Statement is true, given by a Subject Matter Expert.
- Accuracy: Typical error between a Subject Matter Expert's confidence in a Statement and its actual probability.
- Calibration: Adjustment of Subject Matter Expert measurements to achieve a known standard of accuracy.
Assessing Evidence
In the Trustable Methodology, we require assessors to have two key qualities:
- Expertise. Assessors with good knowledge of the subject area are needed in order to reduce epistemic uncertainty. While in theory anyone can provide accurate estimates of their own confidence, these are likely to reflect their uncertainty in the subject (i.e. their scores will lie close to 0.5) and provide little additional value. Subject Matter Experts, on the other hand, have the context and understanding to offer assessments with reduced epistemic uncertainty (though not necessarily greater accuracy).
- Calibration. The Assessor should be calibrated to compensate for their innate overconfidence. This provides improved accuracy, though not necessarily reduced uncertainty.
In practice, this means SMEs must only assess topics within their expertise and must undergo calibration exercises. Furthermore, since confidence assessments are probabilities, their correlation with reality grows with the number of assessments. Therefore, assessment should be performed frequently and by a significant number of individuals, and used to infer long-run trends rather than as an exact reflection of the current reality.
When assessing a Statement, SMEs should consider the following:
- Is my assessment based on only the referenced Artifacts, or do I need to reference other documents before providing a score?
- Is my assessment of the whole Statement, or do I need to break the Statement down further before providing a score?
- Am I sufficiently calibrated?
- Would an automated validator reach a similar conclusion?
Note
Unscored Evidence is always assumed to have a score of zero.
Calibration
In addition to general calibration training, Statements that can be verified by testing can also be used to help calibrate SMEs. Where testing cannot be used, the following strategies can be used to improve calibration:
- Self-Validation of historic estimates
- Cross-Validation using other estimates
- Statistical Anomaly Detection
SME Scoring Guidance
The SME assessment is an assignment of probability to the likelihood that a statement is true. The purpose of calibration is to ensure that the SME's assessments match the probability of the statement's truth (strictly speaking: of the statements to which an SME assigns 90% confidence, about 90% actually are true). Therefore:
- A score of 0 means the SME is certain the Statement is false
- A score of 1 means the SME is certain it is true.
- A score of 0.5 means that the SME has no information or intuition to indicate whether it is more likely to be true or false.
Defining scores for all items
A Trustable Graph comprises a set of Statements \(S\) and a set of directed edges \(L\), such that the graph is defined by the ordered pair \((S,L)\). The existence of an edge \((s,s')\in L\) means that Statement \(s\) is supported, in whole or in part, by the claim made by Statement \(s'\).
The Trustable Score function \(T: S \rightarrow [0,1]\) is defined as

\[
T(s) = \frac{1}{\left|\{s' : (s,s')\in L\}\right|}\sum_{(s,s')\in L} T(s').
\]

That is, the Trustable Score of a Statement \(s\) is the mean of the scores of its supporting Statements.
Therefore, if scores are defined for all Evidence Statements (recall this is the set of Statements with no outgoing edges, \(S_E = \{s\in S : (s,s') \not\in L,\; \forall s' \in S\}\)), the definition of \(T(s)\) is sufficient to recursively define the scores of all items in the graph.
Calculating scores for all items
We briefly discuss here how the Trustable Score can be calculated using powers of the adjacency matrix. Given a graph of \(n\) nodes, label the nodes with indices \(1\leq i\leq n\), such that the set of statements \(S\) is equivalent to \(\{s_i:i= 1,...,n\}\). We may then write the entries of the adjacency matrix \(\mathbf{W}\) as

\[
W_{ij} = \begin{cases}\dfrac{1}{\left|\{k : (s_i,s_k)\in L\}\right|} & \text{if } (s_i,s_j)\in L,\\[6pt] 0 & \text{otherwise.}\end{cases}
\]

That is, the \((i,j)^\text{th}\) entry of the adjacency matrix is zero where there is no edge from \(s_i\) to \(s_j\) and the inverse of the number of children of \(s_i\) otherwise.
Then, given the vector of Evidence scores \(\mathbf{t}_E\), whose entries are given by

\[
(\mathbf{t}_E)_i = \begin{cases} T(s_i) & \text{if } s_i \in S_E,\\ 0 & \text{otherwise,}\end{cases}
\]

we claim that the trustable score for all nodes \(\mathbf{t}\), \(t_i = T(s_i)\), is given by the sum of the products of adjacency matrix powers with the Evidence scores,

\[
\mathbf{t} = \sum_{m=0}^{n} \mathbf{W}^m\,\mathbf{t}_E,
\]
since \(\mathbf{W}^m \mathbf{t}_E\) contains the contributions to the score of all nodes from paths of length \(m>0\) and an acyclic digraph of \(n\) nodes cannot contain paths of length greater than \(n\).
Note this coincides exactly with the result presented in the Roadmap, under the assumption that all leaf scores are considered to be correctness scores, \(\mathbf{t}_E={\mathbf{c}_r}_E\) and that the argument is complete, such that \(\mathbf{C}_p=\mathbf{I}\).
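For concreteness, a minimal NumPy sketch of this power-series evaluation (assuming \(\mathbf{W}\) and \(\mathbf{t}_E\) are supplied as arrays as defined above):

```python
import numpy as np

def trustable_scores(W: np.ndarray, t_E: np.ndarray) -> np.ndarray:
    """Evaluate t = sum_{m=0}^{n} W^m t_E; W is nilpotent for an acyclic graph."""
    t = t_E.astype(float)
    term = t_E.astype(float)
    for _ in range(len(t_E)):
        term = W @ term  # contributions from paths one edge longer
        t = t + term
    return t
```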
Equivalent Adjacency List Implementation
Our reference implementation of the Trustable Score calculation uses the
graphalyzer backend. graphalyzer represents Trustable graphs as directed
acyclic graphs using adjacency lists and evaluates the score by dynamic
programming over the graph. For acyclic graphs this permits score computation in
time proportional to the number of nodes and edges.
This recursion is mathematically equivalent to the matrix power-series formulation above, but evaluates it directly on the graph structure.
Trustable Score
The Trustable Scores can be computed by the following recurrence:

\[
s(v) = c(v)\left(r(v) + \sum_{u\in \mathrm{succ}(v)} w_{vu}\, s(u)\right),
\]
where:
- \(v, u\) are nodes in the Trustable graph,
- \(s(v)\) is the Trustable score of node \(v\),
- \(c(v)\) is the completeness factor associated with node \(v\),
- \(r(v)\) is the correctness value of node \(v\),
- \(succ(v)\) is the set of immediate successor nodes of \(v\)
- \(w_{vu}\) is the weight of the directed edge \(v \rightarrow u\).
For leaf (Evidence) nodes, the successor set is empty and the definition reduces to \(s(v)=c(v)\,r(v)\).
The code matching the formulation is as follows:
```python
n = graph.size
score = np.zeros(n)
for v in reversed(graph.topological_order):
    score[v] = completeness[v] * (
        correctness[v]
        + sum(weight_vu * score[u] for u, weight_vu in graph.successors[v])
    )
return score
```
The implementation iterates through the nodes in reverse topological order, ensuring that all successor nodes are scored before their parents. Each node’s score is computed as a single expression matching the recurrence: the correctness value plus weighted successor contributions, scaled by completeness. Leaf (Evidence) nodes have no successors, so their score reduces to completeness[v] * correctness[v].
Node Sensitivity
For node sensitivity, the accumulated influence of each node on a target node \(t\) can be calculated with the following recurrence:

\[
b(u) = \begin{cases} 1 & \text{if } u = t,\\[4pt] \displaystyle\sum_{v\in \mathrm{pred}(u)} w_{vu}\, c(v)\, b(v) & \text{otherwise,}\end{cases}
\]
where:
- \(t\) is the target node,
- \(v\) and \(u\) are nodes in the graph,
- \(b(v)\) is the accumulated influence of node \(v\) on the target \(t\),
- \(c(v)\) is the completeness factor of node \(v\),
- \(w_{vu}\) is the weight of the directed edge \(v \rightarrow u\),
- \(\mathrm{pred}(u)\) is the set of immediate predecessor nodes of \(u\).
Because the graph is acyclic, this recurrence can be evaluated in a single topological traversal of the graph.
The code matching the formulation is as follows:
```python
sensitivity = np.zeros(graph.size)
sensitivity[t] = 1.0
pos_t = graph.topological_order.index(t)
for u in graph.topological_order[pos_t + 1 :]:
    sensitivity[u] += sum(
        weight_vu * completeness[v] * sensitivity[v]
        for v, weight_vu in graph.predecessors[u]
    )
return sensitivity
```
The sensitivity of the target node is initialised to 1. The graph is then traversed in topological order. For each edge v -> u, influence is propagated from v to u, scaled by:
- the edge weight w_vu
- the completeness factor of the parent node v
This procedure accumulates the total influence of every node on the target. Because the graph is acyclic, each propagation step is performed exactly once, yielding linear time complexity in the size of the graph.
Edge Sensitivity
For Edge Sensitivity, we compute \(\frac{\partial s(t)}{\partial w_{vu}}\) for all nodes \(t\) by propagating the influence of \(v\) to its ancestors in reverse topological order:

\[
\frac{\partial s(t)}{\partial w_{vu}} = c(v)\, s(u)\, b(t), \qquad b(t) = \begin{cases} 1 & \text{if } t = v,\\[4pt] \displaystyle c(t)\sum_{i\in \mathrm{succ}(t)} w_{ti}\, b(i) & \text{otherwise,}\end{cases}
\]
where:
- \(v\) is the parent node of the edge,
- \(u\) is the child node of the edge,
- \(t\) is a target node,
- \(i\) is a successor node of \(t\),
- \(w_{vu}\) is the weight of the edge \(v \rightarrow u\),
- \(s(u)\) is the Trustable score of node \(u\),
- \(c(v)\) is the completeness factor of node \(v\),
- \(b(t) = \frac{\partial s(t)}{\partial s(v)}\) is the sensitivity of node \(t\)'s score to node \(v\)'s score.
The code matching the formulation is as follows:
```python
n = graph.size
v, u = edge
score = _vector_score(graph, completeness, correctness)
dscore_dv = np.zeros(n)
dscore_dv[v] = 1.0
pos_v = graph.topological_order.index(v)
for t in reversed(graph.topological_order[:pos_v]):
    dscore_dv[t] = completeness[t] * sum(
        weight_ti * dscore_dv[i] for i, weight_ti in graph.successors[t]
    )
return completeness[v] * score[u] * dscore_dv
```
The implementation first computes the global score vector. It then computes \(b(t) = \partial s(t) / \partial s(v)\) for all \(t\) in a single reverse topological traversal, starting from \(b(v) = 1\) and accumulating weighted successor contributions for each ancestor. The final edge sensitivity is the product of the completeness of \(v\), the score of \(u\), and \(b(t)\).
This is equivalent to the chain rule applied to the recursive score equation, but avoids computing per-node sensitivities separately, yielding linear time complexity in the size of the graph.
1. Lichtenstein S, Fischhoff B, Phillips LD. 1982. Calibration of probabilities: The state of the art to 1980. In Judgment under Uncertainty: Heuristics and Biases. pp306-334. Cambridge University Press. ↩
2. Moore A, Swift S, et al. 2017. Confidence Calibration in a Multiyear Geopolitical Forecasting Competition. Management Science. 63(11) pp3552-3565. https://doi.org/10.1287/mnsc.2016.2525. ↩
Mathematical Roadmap for the Trustable Score
This section describes the theoretical definition of the Trustable Score and intended method of calculation. The current implementation evaluates these definitions using graph algorithms that are mathematically equivalent to the formulations presented here, though not all features (such as user-defined weights and completeness scores) are fully exposed yet. A description of the score as computed by the tooling is documented here.
Scoring is performed in two stages. First, evidence scores are prepared for
each leaf node (trudag/score.py). Then, these scores are propagated through
the graph using the algorithms described below (trudag/graphalyzer/analysis.py).
Currently, edge weights are included in the scoring calculations but are not
user-configurable. All weights are initialised to 1 and normalised by the
number of children.
Completeness scores are supported by the graph algorithm but are not yet
exposed to users as a separate input. All completeness values passed to the
graph algorithm default to 1. However, for Evidence items with both an SME
assessment and a Validator score, the SME score acts as completeness and the
Validator score acts as correctness. Their product (sme_score * validator_score)
is used as the evidence score, effectively incorporating completeness into the
leaf score before it reaches the graph algorithm. For Evidence with only an SME
score or only a Validator score, the missing value is assumed to be 1, and the
single score is treated as correctness.
Notation and naming
Many of the concepts here will be familiar if you have previously encountered graph theory. Much of the following is basic graph theory framed in the language of Trustable. This is done to permit the tools and methodology to be used with an intuitive, rather than mathematical understanding of the concepts at play.
Modelling arguments
Consider some set of Statements related to XYZ, \(S\). We call the power set of \(S\), \(\mathcal{P}(S)\), the set of Arguments. We call the set of ordered pairs \(C=\mathcal{P}(S)\times S\) the set of Conjectures. A Conjecture \((A,c)\) is composed of an Argument, \(A\in\mathcal{P}(S)\) and a Conclusion \(c\in S\). A Conjecture is called a Proof if and only if

\[
\left(\bigwedge_{a\in A} a\right) \implies c.
\]
Naturally, the set of Proofs \(Pr \subseteq C\).
Info
This model is not intended as a rigorous mathematical model of a logical proof. Such a treatment of the Trustable Graph is, for the moment, not considered. Rather, we are trying to describe networks of requirements we already have in language that is unambiguous and as concepts that we can reason about mathematically; the model is rigorous, the arguments it describes are not.
Defining the score
A Trustable Graph is the ordered pair \((S,L)\), composed of a set of Statements \(S\) and a set of directed edges or links between Statements, \(L\).
A link between two Statements, \((s,s')\in L\), exists if and only if \(\exists (A,s) \in C, s'\in A\).
We aim to define a "Trustable Score" for each Statement in the Trustable Graph, representing our collective confidence in the truth of that Statement. To do so, we must first define some simpler concepts.
SME Assessment Scores
Mathematically, an SME Assessment score is some function \(c: S \rightarrow [0,1]\), which maps Statements to a level of confidence in the truth of that Statement as a probability.
Completeness of a proof
We previously introduced the idea of an Argument: a set of Statements whose conjunction is claimed to imply the Conclusion. In reality there will be a level of uncertainty that a given Conjecture in the graph is in fact a Proof.
The aggregated SME assessment of this probability is called the completeness score of the Statement, \(c_p: S \rightarrow [0,1]\). Given \(k\) recorded SME assessments \(c_1(s),\dots,c_k(s)\) of the probability that the Conjecture \((A,s)\) is a Proof,

\[
c_p(s) = \frac{1}{k}\sum_{j=1}^{k} c_j(s).
\]
This probability is a measure of both aleatory and epistemic uncertainty.
For Statements without an Argument (or equivalently Evidence) \(c_p(s)\) is defined differently:
- If the Statement has a Validator, \(c_p\) is the SME confidence assessment in the Validator as an accurate measure of the truth of the Statement.
- If the Statement has no Validator, \(c_p\) is the SME confidence in the Statement on the basis of the linked Artifacts.
Weighting of the Argument
Different groups of Stakeholders may wish to emphasise particular aspects of an Argument. Each group of Stakeholders who are interested in the output of the project assigns a weight to each link in the graph. In doing so, they define a link-weighting function \(\mathcal{W}:L\rightarrow[0,1]\) which accounts for their organizational priorities and risk management strategy.
This function must satisfy the additional constraint,

\[
\sum_{\{s'\,:\,(s,s')\in L\}} \mathcal{W}\big((s,s')\big) = 1 \qquad \text{for each } s\in S \text{ with at least one outgoing Link.}
\]

For convenience, we define the weighted-adjacency function \(W:S\times S \rightarrow [0,1]\),

\[
W(s,s') = \begin{cases} \mathcal{W}\big((s,s')\big) & \text{if } (s,s')\in L,\\ 0 & \text{otherwise.}\end{cases}
\]
Info
While well-defined, weights do not have a formal interpretation.
Correctness of an Argument
The degree to which the conjunction of an Argument for a Statement is True (regardless of whether that Argument and Statement form a Proof or Conjecture) is measured by the correctness function, \(c_r: S\rightarrow [0,1]\). This is a poorly defined extension of boolean logic where a score of \(0\) means the Argument is False, \(1\) means True and values in between indicate a degree of uncertainty (though expressly not in any rigorous sense).
For Requests, we define the correctness function \(c_r: S \rightarrow [0,1]\) recursively from the Statement's Argument,

\[
c_r(s) = \sum_{(s,s')\in L} W(s,s')\, T(s').
\]
For Evidence, this value is determined by the Validator, or set to \(1\) if the Statement has no Validator (i.e. it is purely SME assessed).
Info
Again correctness does not have a formal interpretation, but is well-defined.
The Trustable Score
Finally, we'd like to consider our confidence that a given Statement is True, based on its Argument. A reasonable approximation for this is the product of the completeness and correctness of its Argument:

\[
T(s) = c_p(s)\, c_r(s).
\]
We call this the Trustable Score for a Statement. If completeness and correctness scores are defined for all Evidence, the definition of \(T(s)\) above is sufficient to recursively define the scores of all items in the graph.
The Trustable Score is therefore an approximate measure of an organisation's collective confidence that a given Statement is True, solely on the basis of Evidence.
It is important to be clear that the Trustable Score is only an approximate measure, in the sense that:
- For a given set of weights, equivalent Trustable Scores for two Statements does not mean equal likelihood those Statements are True.
- For a given set of weights, an increase or decrease in Trustable Score does not necessarily imply an increase or decrease in the likelihood a Statement is True.
However it is still useful, in the sense that:
- A Trustable Score of \(1\) does correspond to absolute organizational confidence in a Statement.
- Long-run trends in Trustable Scores should correlate strongly with changes in organizational confidence.
To improve the reliability of the Trustable Score as a measure of organizational confidence, the following avenues of research should be considered:
- Statistical analysis of the relationship between Trustable Scores and the perceived success or failure of projects.
- More rigorous application of fuzzy logic and probability theory in the definition of the Trustable Score to help define its formal meaning.
Contributions and critique
Contributions to and criticism of the Trustable Score are welcomed. In the first instance, the preferred method of communication for this is via GitLab issues.
Calculating the score
As suggested above, we must begin by determining the correctness of Evidence. Recall that correctness for Evidence is the value returned by a Validator if one exists, and one otherwise.
Next, we define the completeness score for all Statements, using the mean of all recorded SME assessments for a given Statement.
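As a sketch, assuming assessments are recorded as a mapping from assessor name to score (as in the score frontmatter field used by trudag):

```python
def completeness(sme_scores: dict[str, float]) -> float:
    """Mean of the recorded SME assessments for a Statement; defaults to 1 if none are recorded."""
    return sum(sme_scores.values()) / len(sme_scores) if sme_scores else 1.0
```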
For a graph of \(n\) nodes, label the nodes with indices \(1\leq i\leq n\), such that the set of statements \(S\) is equivalent to \(\{s_i:i= 1,...,n\}\), and define the diagonal completeness matrix \(\mathbf{C}_p\), whose entries are given by,

\[
(\mathbf{C}_p)_{ij} = \begin{cases} c_p(s_i) & \text{if } i=j,\\ 0 & \text{otherwise.}\end{cases}
\]

Similarly, the entries of the weighted adjacency matrix \(\mathbf{W}\) are defined as,

\[
W_{ij} = W(s_i,s_j).
\]

Then, given the vector of correctness scores for Evidence items, \({\mathbf{c}_r}_E\), whose entries are given by

\[
({\mathbf{c}_r}_E)_i = \begin{cases} c_r(s_i) & \text{if } s_i\in S_E,\\ 0 & \text{otherwise,}\end{cases}
\]

we claim that the correctness score for all nodes \(\mathbf{c}_r\), \({c_r}_i = c_r(s_i)\), is given by the following sum

\[
\mathbf{c}_r = \sum_{m=0}^{n} (\mathbf{W}\mathbf{C}_p)^m\, {\mathbf{c}_r}_E,
\]

and by extension the trustable score \(\mathbf{t}\), \({t}_i = T(s_i)\),

\[
\mathbf{t} = \mathbf{C}_p\,\mathbf{c}_r = \sum_{m=0}^{n} \mathbf{C}_p\,(\mathbf{W}\mathbf{C}_p)^m\, {\mathbf{c}_r}_E.
\]
Because Trustable graphs are acyclic, the series above can be evaluated by a finite dynamic program over the graph. In practice this corresponds to computing scores by traversing the graph in topological order rather than by explicit matrix powers, while remaining mathematically equivalent to the expression above.
Sensitivity Analysis
The derivatives of the Trustable score can be used to study the effect of varying weights and scores on the trustable scores of Statements in the Graph.
In implementation, these derivatives are evaluated by propagating influence along the DAG using recurrences equivalent to the matrix expressions above.
Edge-weight sensitivity
The partial derivative of trustable scores with respect to edge weights is given by

\[
\frac{\partial \mathbf{t}}{\partial W_{rs}} = \sum_{m=1}^{n}\sum_{k=0}^{m-1} \mathbf{C}_p\,(\mathbf{W}\mathbf{C}_p)^{k}\,\mathbf{E}_{rs}\,\mathbf{C}_p\,(\mathbf{W}\mathbf{C}_p)^{m-1-k}\,{\mathbf{c}_r}_E,
\]

where \(\mathbf{E}_{rs}\) is the matrix with entries

\[
(\mathbf{E}_{rs})_{ij} = \delta_{ri}\,\delta_{sj}.
\]
Node-score sensitivity
The partial derivative of trustable scores with respect to other scores is more complex. Beginning with the definition of the trustable score,

\[
\mathbf{t} = \mathbf{C}_p\,\mathbf{c}_r,
\]

expanding gives us

\[
\mathbf{t} = \mathbf{C}_p\big(\mathbf{W}\,\mathbf{t} + {\mathbf{c}_r}_E\big) = \mathbf{C}_p\mathbf{W}\,\mathbf{t} + \mathbf{C}_p\,{\mathbf{c}_r}_E.
\]

It is then clear that we can write the trustable score of all nodes in terms of the trustable score of the leaf nodes, \(\mathbf{t}_E = \mathbf{C}_p\,{\mathbf{c}_r}_E\),

\[
\mathbf{t} = \sum_{m=0}^{n} (\mathbf{C}_p\mathbf{W})^m\, \mathbf{t}_E.
\]

We interpret this as meaning the trustable score is the product of powers of the modified adjacency matrix \(\mathbf{C}_p\mathbf{W}\) and the vector of leaf scores. These powers represent the number of walks from one node in the graph to another. Determining the partial derivative is therefore as simple as choosing the correct entry of the modified adjacency matrix,

\[
\frac{\partial \mathbf{t}}{\partial t_r} = \sum_{m=0}^{n} (\mathbf{C}_p\mathbf{W})^m\, \mathbf{d}_r,
\]
where \((d_r)_s = \delta_{rs}\).
Tools
Installing trudag
System Dependencies
trudag requires:
- `python3` (version >= 3.11, < 4.0) and `git` as runtime dependencies
- `graphviz` as an optional dependency; required if using the `plot` command
Debian/Ubuntu
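The dependencies can typically be installed with apt (package names assumed here; verify against your distribution):

```
sudo apt-get install python3 git graphviz
```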
For alternative package managers, consult the respective documentation.
User
You can install the latest release of trudag from the project's package repository:
pip install trustable --index-url https://gitlab.eclipse.org/api/v4/projects/12202/packages/pypi/simple
pipx can also be used and is recommended instead of a global install:
pipx install trustable --index-url https://gitlab.eclipse.org/api/v4/projects/12202/packages/pypi/simple
General python package installation documentation can be found here.
Remember, if you require additional dependencies when using trudag, (such as
validator or reference plugins), you can inject these into the environment of
your pipx install:
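```
pipx inject trustable <plugin-package>
```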
Developer
If you are planning to modify the source code, you should use poetry instead. Installation instructions can be found here.
Clone the repository and enter it, then install trudag's dependencies.
Finally activate poetry's built-in virtual environment (Some IDEs may do this by default).
```
git clone https://gitlab.eclipse.org/eclipse/tsf/tsf.git
cd tsf
poetry install
eval $(poetry env activate)
```
Confirm everything is working as expected by running pytest.
Alternatively, you can also activate the local dev shell using Flox.
This shell allows us to share the same pinned dependencies and installs all the required tools for trudag development.
All subsequent changes you make to trudag's libraries or CLI tool will be
automatically reflected in your current shell.
Pre-commit
It is recommended to set up pre-commit, which enables local linting and checks that can catch CI failures before you push.
Either install pre-commit using your system package manager, or use poetry as pre-commit is listed as a dependency.
Run the following command to install the required git hooks:
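```
pre-commit install
```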
Dotstop
Dotstop is an in-development text-based requirements management package. It is similar to doorstop, but uses a decoupled representation of the edges and nodes of the requirements graph both in memory and on disk. This has significant advantages for modularity and scalability, particularly for large software projects distributed across multiple repositories or even multiple platforms.
Concept¹
In the simplest sense, dotstop is just a set of uniquely named markdown files
that are linked by edges described in a
.dot or "DAG of tomorrow" file.
.dot is a near-universal and open-source language for describing directed
acyclic graphs. This has many benefits for data analysis of Trustable, since
.dot can be directly parsed by many powerful open-source network analysis and
visualization tools, such as networkx, gephi and graphviz.
dotstop uses two small (and syntax-compliant) extensions to the .dot language.
First, dot's custom node and edge attributes are used to store the hashes of
items and their links:
"ITEM-001" [sha="<hash of file content>"]
"ITEM-001" --> "ITEM-002" [sha = "<hash of concatenated file contents>"]
This allows the Suspect status of items and links to be straightforwardly and robustly tracked.
Secondly, .dot supports C++ style comments (`/* */` and `//`) and ignores
all lines beginning with `#`. Crucially though, such lines are not considered
comments. Since links (and their hashes) are stored separately from the content of
items, handling external dependencies can be reduced to a simple namespacing
problem:
```
# import https://github.com/project/project.git@sha as project
"project.ITEM-001" [sha="<hash of file content>"]
"project.ITEM-001" -> "ITEM-002" [sha="<hash of concatenated file contents>"]
```
Implementation
Warning
Dotstop has now been developed to the point where we have fully deprecated doorstop as a viable backend.
The first important type introduced by dotstop is the Item class. These
objects possess all the data and functionality associated with a set of
Statements:
- Statement names (hashed)
- Statement text (hashed)
- Artifact references (hashed)
- Status (normative/non-normative) (hashed)
- Expression as a header
- Parent document
- Score
- Ordering, level-based and/or alphabetical
The second crucial type is the Graph class. Using a pydot.Graph
representation of the .dot file, this object provides all of the data and
functionality associated with the relationship between the set of Statements
and the wider project:
- Network adjacency matrix
- Representation as `.dot` source
- Review status of `Item`s, based on item hashes
- Link status, based on item hashes
Roadmap
Dotstop is under heavy development, and the design sketch below is for context only:
flowchart TB;
subgraph trustablemodule["`**trudag**`"]
direction BT
cli["`**trudag.cli**`"]
score["`**trudag.score**`"]
manage["`**trudag.manage**`"]
plot["`**trudag.plot**`"]
subgraph publishmodule["`**trudag.publish**`"]
direction BT
Report["`_trudag.publish.Report_`"]
end
plot
manage --"called by"-->cli
plot --"called by"--> cli
publishmodule --"called by"--> cli
score --"called by"--> cli
end
graphalyzer --"called by"--> score;
score --"called by"--> Report
subgraph graphalyzer["`**graphalyzer**`"]
DirectedAcyclicGraph["`_graphalyzer.DirectedAcyclicGraph_`"]
end
DotstopGraph --"built from"--> DirectedAcyclicGraph
DotstopGraph --"member of"--> Report
dotstopmodule --"called by"--> plot
dotstopmodule --"called by"--> manage
subgraph dotstopmodule["`**dotstop**`"];
DotstopGraph["`_dotstop.Graph_`"]
PydotGraph["`_pydot.Graph_`"]
MarkdownItem["`_dotstop.MarkdownItem_`"]
MarkdownItem --"member of"--> DotstopGraph;
PydotGraph --"member of"-->DotstopGraph;
end
dot --"read/write"--> PydotGraph;
item --"read/write"--> MarkdownItem;
subgraph Disk;
dot[/graph.dot/]
item[/ITEM-*.md/]
end
1. This section is intended to serve as motivation for the goals of the project and the technical decisions being made. These features are not yet implemented. ↩
Using the trudag CLI tool
trudag is the Trustable project's tooling solution for applying the Trustable Methodology.
You don't need to use trudag to apply Trustable, but it's likely to be the easiest route for greenfield projects.
This documentation is intended to explain how to use the tool to apply the Trustable Methodology.
To this end, this page is structured to mirror the methodology documentation.
If you haven't read that yet, please read it first.
For specific technical help, check trudag --help, the API reference pages or ask in our TSF Matrix chat room.
Model
Trudag stores statements as markdown files that are uniquely named within a project's git repository.
The name of the file serves as the Statement's identifier (without the .md extension).
Each statement file contains a small amount of frontmatter and the Statement itself.
An illustrative example, SUN-BRIGHTNESS.md, is shown below:
```
---
# Is the item Normative or Informative.
normative: True

# Used to specify a Validator and its arguments.
evidence:
  type: "name"
  configuration: {}

# Used to specify Reference artefacts.
references:
  - type: "file"
    path: "path/to/ref"

# Collection of SME scores.
score:
  Neil: 0.8
  Rachel: 0.6
---
The Sun is bright.
```
Statement files are intended for human editing only.
They are read by trudag, but trudag will never write to existing Statement files.
Warning
Statement identifiers may not contain the " character (This is a limitation of the pydot library currently used for parsing .dot).
The links between items are stored in a .dot file, .dotstop.dot, which is placed in the top-level of a project's git repository.
This file is intended to be human-readable (in particular when reading diffs in a version control system) but should not be edited by hand.
An example .dotstop.dot file is shown below:
```
# This file is automatically generated by dotstop and should not be edited manually.
# Generated using trustable 2025.3.14.
digraph G {
    "SUN-BRIGHTNESS" [sha="0d200951"];
    "SUN-TEMPERATURE" [sha="7dh394h"];
    "SUN-BRIGHTNESS" -> "SUN-TEMPERATURE" [sha="h3d730s1"];
}
```
trudag uses a strict subset of the DOT language.
That means trudag output is guaranteed to be valid DOT and, by extension, compatible with popular libraries like networkx,
tools like gephi and,
of course, graphviz itself.
Methodology
This section explains how trudag can be used to apply the Trustable Methodology.
Recall that the methodology is defined in terms of several distinct and unordered stages:
- Setting Expectations
- Providing Evidence
- Documenting Assumptions
- Recording Reasoning
- Assessing Confidence
When moving between any two stages,
the Modification guidance must also be considered.
The next section discusses how trudag is used in each of these stages.
Setting Expectations
To begin a trudag project, run
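```
trudag init
```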
This will create an empty dotstop database. (You can optionally pass a name for the database with trudag init -n NAME.)
Applying the TSF
If you're planning to use the Trustable Methodology to apply TSF, you can import the TSF artifact into your project (available at https://gitlab.eclipse.org/eclipse/tsf/tsf/-/releases).
To populate this database, run
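```
# hypothetical invocation -- the exact subcommand may differ; see `trudag --help`
trudag manage create SUN COLOR /path/to/desired/location
```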
to create your first item. This will create the item SUN-COLOR, stored in the markdown file /path/to/desired/location/SUN-COLOR.md.
The first argument is the "document" prefix.
Items with the same prefix are grouped together for presentation purposes.
This has no other effect.
The second argument is the item name.
It may not contain the - character, which is reserved for document prefixes.
Open the file and write your Expectation, following the example in the model section.
You can also add an existing item inside the project directory by:
This will add the SUN item file to your database and allow you to perform further operations.
Providing Evidence
Recall the methodology introduces two classes of Evidence:
- Evidence argued on the basis of a Reference
- Evidence argued on the basis of a Validator
These two classes are addressed in turn in the following subsections.
References
To provide a Reference, add a references field to the frontmatter of an Item.
You may provide multiple references as an array. For each reference you should specify a type.
You will also need to provide additional fields that are dependent on the type of reference you are using.
Check the documentation for the specific reference type you are interested in.
Tip
You can define custom references outside of the builtin types.
The following sections describe the builtin types and their uses:
Type: LocalFileReference
References to Artifacts that are regular local files, in the git sense.
This means any regular file that is present in the git tree for the current commit of the local repository.
For example, a reference to a locally stored text file:
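```yaml
---
references:
  - type: "file"
    path: "docs/notes.txt"  # illustrative path
---
This Statement has a local file reference.
```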
Type: GitlabFileReference
References to Artifacts that are regular files in a remote GitLab repository.
For example, a reference to the README.md for the official GitLab repository:
```yaml
---
references:
  - type: "gitlab"
    url: "gitlab.com"
    id: gitlab-org/gitlab
    path: README.md
    ref: main
---
This Statement has a Gitlab reference.
```
Validators
To indicate a Statement should be automatically Validated, add an evidence field to the frontmatter of an Item.
You may only provide a single Validator against a Statement.
You should specify a type (a string) and a configuration (a dictionary of any schema).
The fields of configuration are passed directly to the Validator function.
Out of the box, trudag does not contain any Validators.
Check out how to write your own here.
The following is an example of using a validator named my_custom_validator:
```yaml
---
evidence:
  type: my_custom_validator
  configuration:
    arg1: 123
    some_path: docs/validators.md/process/personnel.md
---
Item body
```
This results in my_custom_validator being used to score the item with the stated configuration.
Documenting Assumptions
From the perspective of our data model,
Assumptions are just Items without Children or Artifacts.
The same is true in trudag.
Any Item without Children or Artifacts will be automatically identified as an Assumption and scored appropriately.
Recording Reasoning
To construct an argument, use trudag to link different Statements together.
For instance, to create a link from the new Expectation SUN-COLOR to the existing Evidence SUN-TEMPERATURE, run
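```
# hypothetical subcommand; see `trudag --help` for the exact link command
trudag manage link SUN-COLOR SUN-TEMPERATURE
```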
This command will automatically update the dotstop database to include the link.
Remember, you can Qualify Assertions using the references field exactly as for Evidence.
Assessing Confidence
In the methodology, the assessment of confidence in XYZ was broken into two stages. First, Evidence is subjected to SME review. Secondly, all validators are executed and the scores for all Requests are calculated recursively from the Evidence.
SME review
To provide an SME review for an Evidence Statement on the basis of its References,
add a score field and populate it with key-value pairs of assessor names and scores.
Info
The names of SMEs are not currently used by trudag.
To aid with traceability we suggest that:
- Names are unique to an individual and each individual uses only one name.
- Names coincide with usernames in the context they are used, e.g. GitHub or GitLab users.
Validation and Calculation
To recursively calculate scores from Evidence and execute all validators, run
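```
# assumed invocation; see `trudag --help` for the exact subcommand
trudag score
```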
This command will print a summary of scores to the console.
This is not a very nice way to consume the data as a human being.
To produce a markdown report that can be rendered using mkdocs, run
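```
trudag publish
```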
You may also disable validators by passing --no-validate for commands that calculate scores,
or use --concurrent-validation to enable validations to run concurrently, which is particularly
beneficial if your validators are primarily I/O bound.
To monitor your trustable scores over time consider setting up a data store.
Leverage your historical data to include figures in the report that show the change in scores over time.
To do this, use trudag publish --figures.
Tip
The environment created by running poetry install --with dev; eval $(poetry env activate); is guaranteed to have all the mkdocs plugins and extensions needed to correctly render the generated report.
Configuring report
For publishing, item files can be configured through the publish field.
All of the options are shown below:
```yaml
---
publish:
  group: "group_a"  # Group, used for ordering items
  order: 100        # Priority value, used for ordering items
---
```
By default, the items in the report are ordered by item name.
You can change this using the group and order fields.
Items with the same group name are grouped together; groups themselves are ordered by name.
The order field sets the priority of an item within its group, sorted from low to high, with the lowest value shown at the top of the group.
If group is unspecified, the item is placed in an unnamed group, which is shown at the top of the document prefix in the report.
If order is unspecified, the item is pushed to the top of its group.
If two items have the same group and order, they fall back to being sorted by name.
Usage
This section describes briefly how to use trudag. For additional information or options on any
of the commands used, append --help directly after the command.
Tip
Remember, an example process for managing modifications is provided here.
trudag provides all of the functionality necessary to follow this process.
Using the Shell
trudag shell can also be used to initialise a session in which all of the commands below can be run
whilst keeping the graph in context. This can increase the speed of workflows, particularly for large graphs,
and enables both command and item completion. To use the shell to operate on the needs graph, run trudag --needs shell.
Commands in the shell should be entered in the same way, except the leading trudag is removed.
Every command in the shell updates the dotfile immediately, rather than only at the end of the shell session.
Command history persists between sessions on a per-project basis;
history files are stored in a trudag/ directory under $XDG_DATA_HOME if defined, otherwise under $HOME/.local/share/.
Internal REPL commands are prefixed by :; use :help for information on them.
External terminal commands can also be run inside the shell by prefixing them with !, e.g. !ls.
Use ctrl+D or :q to exit the shell.
Adding new items
To create new items use the following:
Inspecting items
To inspect specific items or links, you can use the corresponding item and link inspection subcommands (see trudag --help for their names).
For a big-picture view, use plot to inspect the entire graph (defaults to svg format):
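```
trudag plot
```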
Reviewing items
Review is used to track whether an item is included in our Trustable argument
and up-to-date. It is conducted by a human using the command:
or to review multiple items at once
Note: To mark both the item (or all items provided) and all related links use
the flag --link:
Clearing suspect items
When a linked Item is modified, trudag will report a Suspect link. This
indicates that the fingerprint of the link between the two items is no longer valid,
necessitating a reevaluation to determine whether the linkage is still relevant.
Linting a Trustable graph
To view the Suspect status of all items and links in a graph, run the manage lint subcommand as shown below (with an example output).
```
>> trudag manage lint
WARNING: Suspect Link: SUN-BRIGHTNESS -> SUN-TEMPERATURE
WARNING: Unreviewed Item: SUN-BRIGHTNESS
```
Generating a report
To generate a Trustable report, run the publish command:
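```
trudag publish
```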
By default, report files will be generated in a directory named docs/trustable.
Then you can either view the markdown files directly, or use mkdocs serve to serve the page locally if you have
mkdocs installed.
Data Store
Long-term maintenance entails trend analysis, which depends on data about the project's state.
To perform trend analysis on your TSF project, data about the graph's state must be stored persistently.
In aid of this, trudag is able to interface with an external data store.
Please see the data store documentation for guidance.
Needs
A "needs" graph captures unresolved assumptions that a project should satisfy.
It is stored alongside the main dotstop graph in a .needs.dot file.
To initialise a needs graph, run:
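```
# assumed form, combining the global --needs flag with init
trudag --needs init
```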
Items in the needs graph are created and managed using the --needs flag:
Moving items between graphs
To move an existing item from the dotstop graph into the needs graph:
To move an item from the needs graph back to the dotstop graph:
Warning
An item can only be moved if it has no links (parents or children) in its source graph. Remove any links before moving.
Migration from doorstop
Though doorstop was previously a supported backend, as trudag has evolved, it can no longer be supported. If you still use a doorstop database, use release v2025.05.29, the last release that supported the migration command.
Ignoring Specific Files or Folders
Sometimes you may want to exclude certain files or directories from processing.
To do this, we provide a .trustableignore file (located either at the git
repository root or in the same directory as your .dotstop.dot file).
This file allows you to explicitly tell trudag which files or folders to
ignore.
Note: Globbing (wildcard patterns like `*` or `?`) is not supported.
In some cases, you might have files with the same name across different
repositories.
If a Markdown file happens to share a name with an item in your trustable graph,
this can cause conflicts.
To prevent that, simply add the file path (relative to the git root) to your
.trustableignore.
Example: Ignoring a Specific File
If you have a file named TRUSTABLE-SOFTWARE.md inside a folder that’s
not related to TSF, add the following line to your .trustableignore file:
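```
docs/unrelated/TRUSTABLE-SOFTWARE.md
```

(The folder name here is an illustrative placeholder; use the actual path relative to the git root.)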
To ignore an entire folder, add its path (relative to the git root) on its own line:
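```
docs/unrelated
```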
Plugins
Using references
trudag can be used to Reference any data that can be consistently hashed.
trudag natively supports References to two sources of data: files in the git tree and files in a remote repository hosted in gitlab.
The included sources can be easily extended using the BaseReference interface.
We illustrate this with an example: Referencing content at a specific url, using the frontmatter syntax below.
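```yaml
---
references:
  - type: "webpage"
    url: "example.com"  # illustrative url
---
This Statement references content at a url.
```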
Deriving a custom reference type from BaseReference
We begin by defining a subclass of BaseReference, WebReference, whose type() is "webpage".
```python
from trudag.dotstop.core.reference.references import BaseReference

class WebReference(BaseReference):
    @classmethod
    def type(cls):
        return "webpage"
```
Next, let's add a constructor that matches the keywords used in the frontmatter. Also, let's use that url to fetch the relevant content (note this is an example; naively hashing all of the html at some url is not a great plan).
```python
import requests

from trudag.dotstop.core.reference.references import BaseReference

class WebReference(BaseReference):
    def __init__(self, url: str) -> None:
        self.url = url

    @classmethod
    def type(cls) -> str:
        return "webpage"

    @property
    def content(self) -> bytes:
        response = requests.get("https://" + self.url)
        return response.text.encode()
```
Finally, let's add a summary as valid markdown. For simplicity, we will just use the url itself.
```python
import requests

from trudag.dotstop.core.reference.references import BaseReference

class WebReference(BaseReference):
    def __init__(self, url: str) -> None:
        self._url = url

    @classmethod
    def type(cls) -> str:
        return "webpage"

    @property
    def content(self) -> bytes:
        response = requests.get("https://" + self._url)
        return response.text.encode()

    def as_markdown(self, filepath: None | str = None) -> str:
        return f"`{self._url}`"
```
Now that we have defined our new reference type, we need to make it available to trudag.
There are two ways to do this: using the local extensions system, or a packaged plugin.
Using packaged plugins is recommended for production.
The local extensions system is intended for prototyping and small projects only.
A note on performance
All required BaseReferences are constructed when the graph is built.
However, the content property is only evaluated on request.
Therefore, loading data or performing expensive checks in the constructor will lead to poor performance.
Instead, check the validity of arguments at construction, but load data only in the content property.
Adding as a local extension
To add WebReference as a local extension, create a new file `.dotstop_extensions/references.py` in the current working directory.
All concrete implementations of the BaseReference class available in this namespace will now be usable within trudag, accessible using their type() method.
Packaging as a plugin
To mark a package as containing plugins, add the following entry point to its pyproject.toml:
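A minimal sketch, where my_package.references is a hypothetical module containing the WebReference class; the entry point group trustable.reference.plugins is the group trudag looks for:

```toml
[project.entry-points."trustable.reference.plugins"]
references = "my_package.references"
```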
All concrete BaseReference subclasses present in the entry point namespace will now be accessible, again by using their type() method.
Inbuilt reference types
trudag also exposes further reference classes, that can be used as-is in references, or extended to create your own. These are:
- gitlab (GitlabFileReference): References a file in a remote GitLab repository.
- file (LocalFileReference): References a local file.
- source span (SourceSpanReference): References a span of source code.
- artifact (ArtifactReference): References a subgraph of a resolved graph within an artifact.
Further documentation on their use can be found in the classes' docstrings.
Using validators
trudag can be used to validate any data that can be reduced to a floating point metric.
However, trudag does not provide any validation functionality out of the box: we expect users to have their own use cases that are highly specific to their Artifacts.
This page describes how to write and integrate your validators.
A Validator is any type-hinted function with the signature `(configuration: dict[str, yaml]) -> tuple[float, list[Exception | Warning]]`, where `yaml` is a type alias for loaded YAML data stored as a nested dict/list structure (it is defined in trudag.dotstop.core.validator).
Writing a validator
As an example, suppose we would like to compare the response time of an http request to some target. First, write the function signature. Remember, the function name will be used to identify the validator in item files, so choose something sensible.
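```python
def https_response_time(
    configuration: dict[str, yaml],
) -> tuple[float, list[Exception | Warning]]: ...
```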
Next, decide what the entry in a Markdown file will look like for your validator. For our use case, the following yaml fields seem sensible:

```yaml
evidence:
  type: https_response_time
  configuration:
    target_microseconds:  # Target to be met, or beaten.
    url:                  # url to query
```
The contents of the configuration field will be passed directly to the validator as a dict.
We can now write the body of our validator.
A good validator will always return a score; remember it is always safe to be conservative and return 0.0.
```python
import requests


def https_response_time(
    configuration: dict[str, yaml],
) -> tuple[float, list[Exception | Warning]]:
    target = configuration.get("target_microseconds", None)
    url = configuration.get("url", None)
    if not url:
        return (0.0, [ValueError("No url specified for https_response_time validator")])
    if not target:
        return (0.0, [ValueError("No target time specified for https_response_time validator")])
    response = requests.get("https://" + url)
    score = min(target / response.elapsed.microseconds, 1.0)
    return (score, [])
```
Tip
Validators are used by default when scoring items; pass --no-validate to skip them.
Check the usage for more information.
Adding as a local extension
To add https_response_time as a local extension, create a new file `.dotstop_extensions/validators.py` in the current working directory.
All functions with the signature `(configuration: dict[str, yaml]) -> tuple[float, list[Exception | Warning]]` available in this namespace will now be usable within trudag, accessible using their function name.
Packaging as a plugin
To mark a package as containing validator plugins, add the following entry point to its pyproject.toml:
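A minimal sketch, where my_package.validators is a hypothetical module containing the validator functions; the entry point group trustable.validator.plugins is the group trudag inspects:

```toml
[project.entry-points."trustable.validator.plugins"]
validators = "my_package.validators"
```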
All functions with the signature `(configuration: dict[str, yaml]) -> tuple[float, list[Exception | Warning]]` present in the entry point namespace will now be usable within trudag, again accessible using their function name.
Setting Up a Trustable Data Store
Long-term maintenance entails trend analysis, which depends on data about the project's state. To perform trend analysis on your TSF project, data about the graph's state must be stored persistently.
Git is designed as a store for your source code, hence it is suitable for storing the structural component of the graph state. Storing the artifacts of this structure, such as scores, would quickly become unmaintainable in git. Furthermore, a historic snapshot of your output state is not always practical to reproduce from the project's source code and graph structure alone. An external data store provides a maintainable solution for recording the graph's output state.
The following documentation shows how the reference tool, trudag, integrates with external storage.
The interface between trudag and external platforms is user-defined, allowing integration with systems like SQL databases or OpenSearch-like data stores.
For starters, to build a simple proof of concept, trudag can dump the score, which can then be transformed and pushed to an external platform using a custom script (as shown below).
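A minimal sketch of such a script, assuming the scores have been dumped to a local scores.json file and that your platform exposes an HTTP ingest endpoint (both the file name and the URL are hypothetical):

```python
import json

import requests

# Load a previously dumped trudag score file (hypothetical path).
with open("scores.json") as f:
    data = json.load(f)

# Push the records to a hypothetical ingest endpoint.
response = requests.post("https://datastore.example.com/ingest", json=data)
response.raise_for_status()
```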
Throughout the rest of this document, more robust and long-term data model solutions provided by trudag are explained.
Data Models
For long-term maintainability, it's good practice to define a data model to ensure consistency, simplify data store design, and reduce the risk of corruption.
Accordingly, trudag relies on a specific internal data model, so care must be taken to ensure it aligns with the external data store to avoid conflicts.
In order to prevent clashes, the data_model.py file provides a schema that ensures all fields are present, and in a usable format.
This schema ignores extra keys by design, to support extended data models.
The data model you provide must be a superset of the one expected by trudag.
Transformations enable you to maintain backwards compatibility across changes to your data model.
These transformations should be included in your implementation of the data store connector, such that the output of data_store_pull (see below) fits the superset criteria mentioned above.
To aid with backward and forward compatibility, trudag provides a Schema version field to track any changes to the data model.
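A minimal sketch of such a transformation, assuming (hypothetically) that records produced under an older data model lack the Schema version field; fetch_raw_records stands in for your store's query logic:

```python
def upgrade_record(record: dict) -> dict:
    # Hypothetical transformation: older records are assumed here to
    # predate schema "1" and so lack a "Schema version" field; add one.
    info = record.setdefault("info", {})
    info.setdefault("Schema version", "1")
    return record


def data_store_pull() -> list[dict]:
    # Apply the transformation so every returned record satisfies the
    # superset criteria described above.
    return [upgrade_record(r) for r in fetch_raw_records()]
```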
Data Spec
Below is a schema defining the shape of the core data model expected by trudag.
This is expected in a .json format.
RootLevel
| Field | Type | Notes |
|---|---|---|
| scores | list[ScoreDict] | See ScoreDict below. |
| info | Info | See Info below. |
ScoreDict
| Field | Type | Notes |
|---|---|---|
| id | str | The name of the statement e.g. SUN-BRIGHT. |
| score | float | The trustable score. |
Info
| Field | Type | Notes |
|---|---|---|
| Repository root | str | The top level directory of the git repository. |
| Commit SHA | str | The current git commit SHA. |
| Commit tag | str | The latest git tag, if one exists. |
| Commit date/time | int | The UNIX timestamp in seconds when the commit occurred. |
| CI job id | str | The id of the current CI job or "run_locally" if run outside of CI. |
| Schema version | str | The version of this schema used. |
Configuring a Data Store Connector
A data store connector defines an interface for trudag to connect with your data store without an external script.
The connector is implemented by you as a dotstop extension in the file .dotstop_extensions/data_store.py.
The extension must include at least two Python functions - data_store_push and data_store_pull - which trudag uses to push and pull data, respectively, in the following format:
Trudag Data Schema
```
[
    {
        "scores": [
            {"id": "SUN-BRIGHT", "score": 1.0, ...}
            ...
        ],
        "info": {
            "Commit date/time": 1747754766,
            "Schema version": "1",
            ...
        }
    }
    ...
]
```
Please note that the date format matters here: Commit date/time must be a UNIX timestamp in seconds, as specified above.
The data format is identical to the one created when dumping to a json file with trudag.
For more information on data models please see the section above.
A simple data_store.py file might look something like this:

```python
def data_store_pull() -> list[dict]:
    data = get_my_data()
    return data


def data_store_push(data: list[dict]) -> None:
    push_my_data(data)


def get_my_data() -> list[dict]:
    # Insert data store interfacing logic here.
    ...


def push_my_data(data: list[dict]) -> None:
    # Insert data store interfacing logic here.
    ...
```
Using a Data Store Connector
To push data to the data store, run any command that supports --dump with the argument data_store.
Collecting data with each change to your project's mainline branch ensures accountability for each change's impact on the graph state.
Remote Graph
TSF graphs are easier to manage when broken down into smaller, verifiable projects that can be integrated into larger graphs. A Remote Graph is one of these composable building blocks. It is a finished artifact or snapshot of a TSF graph that you can depend on but not change.
Currently, remote graphs are implemented as artifacts: files that package all the data required from one graph for reference by a different graph that builds upon it.
What is a Remote Graph?
Think of a remote graph like a published library:
- It is immutable: once published, it never changes.
- You must use it as-is: you cannot edit its internals.
- It defines clear needs: it tells you what you must resolve in order for you to use it safely.
- It acts as an interface: connecting what the remote graph provides with what your local graph must supply.
The most common use-case of remote graphs is the integration of an upstream project's graph into a downstream project's graph. Upstream can provide argumentation and declare needs, which downstream can reference and substantiate respectively. This enables traceability between the two projects, without either requiring access to the other's working graph.
Parts of a Remote Graph
A remote graph usually contains two complementary pieces: the Resolved Graph and the Needs Graph.
Resolved Graph
The resolved graph is a frozen snapshot of a graph at a specific point in time: everything that has already been computed (Validators) or processed (References).
Key points:
- Includes pre-computed scores and references.
- Read-only: you cannot attach new local items directly into it.
- Exports content and references used to build it, including needs from another remote graph.
- Does not expose transitive dependencies (you can’t automatically see what it depends on).
Needs Graph
The needs graph describes what is not yet resolved. This is what your local graph must provide in order to make use of the remote graph. Think of it as the “contract clauses” of the artifact.
Key points:
- Includes unresolved assumptions such as AoUs (Assumptions of Use).
- Imported into your local graph under a namespace for clarity.
- Must be scored locally, even if you choose to ignore parts of it.
- Important: ignored assumptions must still be documented.
- Ensures your local graph complies with the requirements of the remote graph.
Usage
Consumed artifacts should be associated with the same revision of the project that is being used.
If you are using v0.3.0 of a project, the consumed artifact should also have been generated
at release v0.3.0. It is therefore recommended that producers of artifacts export on release
as part of a CI pipeline, so that the artifact is versioned and is the source of truth.
Needs
Needs document dependencies (or 'assumptions of use') that are defined by a graph and need to be satisfied in the context of a downstream consumer of that graph. These typically request evidence and/or assertions that cannot be defined in the originating graph, because they are context-specific. They may also document known limitations or constraints on the usage of the associated software, or known gaps in the reasoning or evidence provided.
Initialising and Populating the Needs Graph
To create and operate on the "needs" graph, use the top-level --needs option:
- `trudag --needs init` will create the `.needs.dot` file.
- `trudag --needs manage create-item ...` will create an item in the needs graph.
- `trudag --needs manage move-item --to-dotstop MY-ITEM` will move MY-ITEM to dotstop, from needs.
Any trudag command, except export, can be run with this option; the needs graph
is managed in the same way the main graph is managed.
Moving existing items into the needs graph
If an item already exists in the dotstop graph and should be treated as a "need", it can be moved using the move-item command.
The item must have no links (parents or children) in the source graph before it can be moved.
The underlying markdown file is not moved or deleted, only the graph entry is transferred between
.dotstop.dot and .needs.dot.
To move an item back from needs to dotstop:
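```
trudag --needs manage move-item --to-dotstop MY-ITEM
```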
Guidance on Writing Needs
Needs are written in the same manner as assertions, and still provide argumentation about the given project, but differ in that it is up to the consumer of the artifact to provide evidence that the needs are fulfilled. Needs can contain references, but SME scores and validators are not permitted. For example:
```markdown
---
references:
  - type: file
    path: path_to_reference.txt
---
Project XYZ is built reproducibly using tooling ABC.
```
Producing an Artifact
Run trudag export --artifact <path> --project-name <name> to write the artifact to a file specified by --artifact.
This will both resolve references and run validators for the graph, so any required plugins must be
available when performing an export.
This command will fail if any references fail, as issues can propagate to artifact consumers. However, the option
--allow-failure is available if you wish to ignore this.
Warning
Though it is not required, it is highly recommended that all items and links are reviewed before exporting, particularly if the artifact is for a release.
Consuming an Artifact
Currently, when working with artifacts, we expect the artifact to be present as a file in a reachable directory.
Importing Needs
For a consumer project, once the artifact has been obtained, run trudag import
to import the needs items into the local trustable graph. This command uses the following parameters:
- `--artifact`: (required) The path from which to read the artifact.
- `--import-dir` / `-d`: (required) The path to the directory where imported items should be extracted.
- `--namespace` / `-n`: (required) Prefix for any extracted items. This helps resolve any name conflicts.
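For example (the artifact path, directory and namespace here are hypothetical):

```
trudag import --artifact ./upstream.artifact --import-dir imported --namespace UPSTREAM
```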
During the import process, the needs items will be created in the specified directory, prefixed with the namespace, and imported into the consumer project's graph.
The consumer project can now use these needs items like any other item in their local project (including attaching evidence to them).
Referencing the Resolved Graph
The ArtifactReference allows an item in the local graph to reference a subgraph of an artifact's resolved graph.
For example, an item with the following frontmatter will reference the descendant subgraph with roots ITEM-1 and ITEM-2:
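A sketch of such frontmatter, following the ArtifactReference parameters (the artifact path is hypothetical):

```yaml
---
references:
  - type: artifact
    path: path/to/project.artifact
    roots:
      - ITEM-1
      - ITEM-2
---
```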
Consequently:
- Any changes to the referenced items will prompt a review for the referencing item.
- This may be the case if a newer artifact is pulled.
- The published report will also include a report of the referenced artifacts. Referenced items can be clicked through to in the rendered reference.
Planned Features
- Version migration support: smoother upgrades between remote graph versions.
- Backward compatibility:
- Load newer versions with warnings if they break rules.
- Safely use older versions without issues.
- Explicit exports: if something should be shared, it must be marked `public` (similar to `public` vs. `private` in OOP).
- Implementation of read-only enforcement for the Resolved Graph.
Remote subgraph use cases
This page describes the use cases for the planned future development of the remote graph feature.
[Diagram: exporting and importing TSF project artifacts between trustable projects]
Trustable projects can export validated instances of their graph as TSF project artifacts. These can define subgraphs, such as a needs graph, which may be referenced or imported by other projects to expand upon them in their own context. The resolved evidence artifacts provided by the exporting projects can be referenced by a project consuming the artifact, but not imported.
[Diagram: projects X, Y and Z, where Y consumes a subgraph from X and Z consumes the same version of X via Y's junction]
Consuming projects use Junctions to specify the graphs and subgraphs they import. These specify the project artifact for the graph and namespaces for imported subgraphs. Statements in the local project can then include References to Statements in the remote graph, or link to Statements in an imported subgraph using the namespaces specified in the Junction.
A junction can also reference a Junction specified in another project, to consume the same version of a 'parent' graph.
- e.g. In the illustration above, Z is consuming a subgraph from the same version of X as specified by Y.
This junction-of-a-junction approach is included to enable the explicit selection and inclusion of a specific version of an upstream graph as specified by a given project.
- This is preferred over implicit inclusion because a project may have more than one remote graph, which may in turn have a shared remote
- An obvious example of a shared remote is the Trustable project itself
Types of subgraph
[Diagram: types of subgraph, with artifact and evidence inclusion variants marked as refs. 1-4]
Imported subgraphs should include any artifacts referenced by the included Statements (ref. 1 in the diagram).
- e.g. A document referenced by a Statement to qualify / clarify its meaning
- For this reason, any such artifacts should be stored in coordination with the Statements that reference them
However, imported subgraphs should not include evidence artifacts that are specific to the originating project.
- Distinguishing these automatically is not possible with the current trudag / dotstop implementation.
- Future versions of the tooling and data format should make an explicit distinction between artifacts referenced as evidence and referenced artifacts that qualify a Statement
- See the TSF Architecture for the use of References and Evidence to facilitate these distinctions.
Given the Architecture as specified, subgraphs may include (or omit) evidence in a variety of ways:
- For ref. 2, only a Premise is included in the subgraph, so the importing project must attach context-specific Evidence to this.
- For ref. 3, an Evidence element is included, and this identifies the Validator that will be used, but this may need to be overridden in the consuming context to describe the evidence artifacts that the Validator will process.
- For ref. 4, an Evidence element with a Reference is included. The Reference identifies a file that is local to the git repository, which identifies the input evidence artifacts needed by the Validator. The consuming project may therefore provide a file in the same file path, which provides the Validator with the required information.
Rust Roadmap
Overview
In the near future we plan to introduce new tooling for TSF to replace trudag. The tooling will be written in Rust and is currently in the planning stage. A name for the new tools has not yet been decided.
Motivation
The motivation for developing new tools arises from multiple pain points associated with Python and the current architecture. With the new tooling, we expect the following improvements:
- Increased Performance: Rust offers better performance and reduced resource usage.
- Ease of Use and Distribution: Single binary rather than convoluted Python environments.
- Extensible Plugins: Plugins that are not tied directly to python objects.
- Improved Remote Graph: Remote graph implementation will be heavily considered in initial design.
Roadmap
The migration from trudag to the new Rust based tools will proceed through the following phases:

1. Documenting Design:
   - Document new architecture, schemas and interfaces of the new tool.
   - Define an interface for plugins.
   - Establish expectations.
2. Acceptance Tests:
   - Define an acceptance test framework.
   - Set up CI capable of building and testing the new tool.
   - Run the acceptance tests against both the old and new tools simultaneously.
3. Migration:
   - Release tools enabling existing users to port persistent data to the new tools.
   - Implement shims to enable compatibility between Python trudag plugins and the new tools.
4. Transition:
   - Cease development of Python trudag.
Plugin Interface
Currently, trudag plugins are implementations of a Python abstract base class in external modules, which has proven quite restrictive in the following ways:
- Plugins must be written in Python.
- Plugins cannot easily be shared, so operations are often duplicated.
- There is tight coupling between plugin and trudag code.
- Plugins are difficult to version.
In the new tools, we want plugins to satisfy a strict but minimal interface, prioritising extensibility and reliability.
Plugin Protocol
The goal for the new tools is to move computation done in plugins to a structured communication mechanism between the tool and external sources via a remote procedure call. This model decouples plugins from the tool such that they can be developed, tested and deployed completely separately.
Initially, a compatibility shim will be provided to incorporate existing python plugins into the new protocol, which should ease the transition to this new model.
Remote Graphs & Junctions
The integration of multiple projects in TSF is a core part of the design of the new tools.
The new tooling should provide the following functionality:
- Reference and reuse argumentation from projects used in a software context, to construct a valid argument about why the integration is Trustable.
- Import templated arguments, for example the TSF tenets and assertions, that can be expanded upon.
- Reflect changes to upstream projects in the importing project.
The plan is to introduce a "junction" element, separate from statements, that serves as an interface between projects' argumentation.
With junctions, a project implementing TSF can define:
- Argumentation that can be utilised by other projects, with accompanying constraints.
- Which argumentation is being imported from other projects and how it integrates into their own argument structure.
This approach enables projects to build upon established arguments rather than reproducing them, whilst maintaining clear separation and tracking of provenance.
Scoring
Current scoring is achieved through representing the graph as an adjacency matrix on which we perform matrix operations to percolate scores up the graph.
With the new tooling, we aim to make scoring more extensible, so that different score propagation operations can be used.
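As a minimal sketch of the current matrix-based style of computation, the following assumes (our assumption, not a confirmed detail of trudag's algorithm) that each non-leaf node's score is the weighted sum of its children's scores under normalised edge weights; the graph and scores are hypothetical:

```python
import numpy as np

# Adjacency matrix A: A[i][j] is the (normalised) weight of edge i -> j.
# Node 0 depends on leaf nodes 1 and 2, weighted equally.
A = np.array([
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
])

scores = np.array([0.0, 0.8, 0.4])  # leaf correctness scores

# Repeatedly propagate leaf scores upwards; for a DAG this converges in
# at most as many iterations as the graph is deep.
for _ in range(len(scores)):
    propagated = A @ scores
    # Leaves (rows with no outgoing edges) keep their own scores.
    leaf_mask = A.sum(axis=1) == 0
    scores = np.where(leaf_mask, scores, propagated)

print(scores)  # node 0 receives (0.8 + 0.4) / 2 = 0.6
```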
Implementation Architecture
From a technical perspective, the new tooling will be structured as follows:
- core: Contains primitives (statement, item and graph). Defines the core API for interacting with the graph as a library, including serialisation to and deserialisation from an artifact.
- remote: Implementation of junctions.
- resolvers: Defines the plugin interface.
- score: Scoring algorithms.
- publish: Report generation.
- cli: Command line frontend.
API Reference
trudag
trudag
Tools for summarising the compliance of trustable software projects, in written, graphical and numerical forms.
config
trudag.config
trudag.config.config
GeneralConfig
Bases: TrustableConfig
Given a custom configuration list, validates it. Given no configuration, uses the default configuration in trudag.
get
Provides a way to search for values in the sub-sections of the configuration by recursively searching for the value of a given keys list. For example: get(["reports", "TSF", "enable_figures"]) will return the value for enable_figures under the TSF report (i.e. False).
override_config
Overrides the configuration content with the given custom_config content.
safe_get
Provides a safe way to search for values in the sub-sections of the configuration by recursively searching for the value of a given keys list. If the given keys don't exist in the configuration, we fall back on the default general config. For example: safe_get(["reports", "TSF", "enable_figures"]) will return the value for enable_figures under the TSF report (i.e. False).
TrustableConfig
Bases: Protocol
Protocol to support different kinds of configuration to be used in the trustable project. It has a basic requirement to support a schema accessor, the configuration version, and the configuration content.
schema
property
schema: Schema
Getter method to be implemented for returning the schema used in the configuration.
Returns:
trudag.config.schema.Schema of the configuration that validates the configuration content.
trudag.config.schemas
dotstop
trudag.dotstop
A Library for managing Trustable Graphs stored as dot files.
core
trudag.dotstop.core
trudag.dotstop.core.artifact
Artifact
Artifact(artifact_file: Path | None = None)
Provides a class to interact with artifact files. This can be used to export and import artifacts. This class also supports the context manager protocol, to be able to roll back any changes made while importing from artifact files in the case of failure.
export_to
export_to(
graph: TrustableGraph,
project_name: str,
file: Path,
validate: bool,
concurrent_validation: bool,
workers: int | None = None,
allow_failure: bool = False,
)
Exports an artifact including the graph and needs to the given file named with project_name.
Args:
graph(TrustableGraph): The graph to export through an artifact.
project_name(str): The name stored for the project in the artifact metadata.
file(Path): Output path for the artifact file.
import_from
import_from(
file: Path,
graph: TrustableGraph,
local_dot: Path,
namespace: str,
import_dir: Path,
)
Imports needs items and a resolved remote graph, to be prefixed with a namespace, from an artifact file into the import_dir directory.
Args:
file(Path): Input path for the artifact file.
graph(TrustableGraph): The local graph to be updated.
local_dot(Path): The path of the local dotstop file to be updated.
namespace(str): The namespace name to be prepended for the imported items.
import_dir(Path): The path of the output directory for the imported files to be stored.
trudag.dotstop.core.constants
A module providing constants used throughout dotstop.
DataObject
module-attribute
DataObject: TypeAlias = (
str
| int
| float
| bool
| None
| list["DataObject"]
| dict[str, "DataObject"]
)
Type alias for JSON/Yaml data possibly stored as a nested dict/list data structure.
IGNORE_FILE
module-attribute
A message written to all files automatically generated by dotstop.
LAST_FORMAT_VERSION
module-attribute
LAST_FORMAT_VERSION = parse('0000.0.0')
The last trustable version that changed dotfile formatting.
data_store
trudag.dotstop.core.data_store
trudag.dotstop.core.data_store.data_model
trudag.dotstop.core.data_store.data_store_client
DataStoreClient
Class to handle interactions with the data store defined by the user in .dotstop_extensions/data_store.py
DataStoreClientProtocol
DataStoreClientSingleton
Simple class to ensure only one instance of data store client exists
trudag.dotstop.core.data_store.version
trudag.dotstop.core.exception
Categories of Error that can be encountered when using dotstop.
ArtifactError
Bases: ValueError
An error caused by artifact operation problems.
ArtifactModelError
Bases: Exception
An Error caused by a mismatch between the artifact schema used by the tool and the actual artifact schema.
ConfigError
Bases: ValueError
An error caused by a failing configuration operation.
DataModelError
Bases: Exception
An Error caused by a mismatch between the data schema used by the tool and the data schema used by a store.
DotstopError
Bases: Exception
Generic Exception for all Errors encountered by the dotstop library.
GitError
Bases: DotstopError
An error caused by a failing git operation.
GraphActionError
Bases: DotstopError
An Error caused by trying to perform an invalid action (e.g. a query or operation) on a valid graph.
GraphStructureError
Bases: DotstopError
An Error caused by trying to create (or perform a valid action on) an invalid graph.
InvalidArgumentError
Bases: ValueError
An error caused by an illegal argument value
ItemError
Bases: DotstopError
An Error caused by discovery of an invalid item, either when creating, using or modifying it.
PluginError
Bases: DotstopError
An Error encountered when using a plugin.
ReferenceError
Bases: DotstopError
An Error encountered when checking references in a dotstop item.
VersionError
Bases: DotstopError
An Error caused by a version mismatch between current trudag and the version used to generate the dotfile.
graph
trudag.dotstop.core.graph
A module containing the implementation of the TrustableGraph and BaseGraph classes.
trudag.dotstop.core.graph.base_graph
BaseGraph
Bases: ABC
A base abstract class that describes abstract methods that must be implemented for a class to serve as the backend for TrustableGraph.
add_edge
abstractmethod
Add the edge parent_id -> child_id to the graph.
edges
abstractmethod
empty
abstractmethod
classmethod
empty() -> BaseGraph
Create an instance of BaseGraph with no nodes or edges.
from_file
abstractmethod
classmethod
Construct an instance of BaseGraph from a file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to file to build graph from | required |
from_string
abstractmethod
classmethod
Construct an instance of BaseGraph from a string source.
get_edge_attr
abstractmethod
Get the attribute associated with key from edge parent_id -> child_id.
Returns:
| Type | Description |
|---|---|
| Attribute | String if attribute exists, otherwise None. |
get_edge_attrs
abstractmethod
Get all attributes of edge parent_id -> child_id.
get_graph_attr
abstractmethod
Get the attribute associated with key from graph.
Returns:
| Type | Description |
|---|---|
| Attribute | String if value exists, otherwise None |
get_graph_attrs
abstractmethod
Get all attributes for graph.
get_node_attr
abstractmethod
Get the attribute associated with key from node with node_id.
Returns:
| Type | Description |
|---|---|
| Attribute | String if value exists, otherwise None |
get_node_attrs
abstractmethod
Get all attributes for node with node_id.
has_edge
abstractmethod
True if the edge parent_id -> child_id exists in the graph.
predecessors
Returns nodes which have an outgoing edge to the specified node.
remove_edge
abstractmethod
Remove the edge parent_id -> child_id from the graph.
remove_node
abstractmethod
remove_node(node_id: str) -> None
Remove a node with node_id from the graph.
set_edge_attrs
abstractmethod
Set the attributes for each key-value pair in attrs for edge parent_id -> child_id.
Attributes operate like a dictionary: existing attributes are updated, new attributes are appended.
set_graph_attrs
abstractmethod
set_graph_attrs(**attrs: Attribute) -> None
Set the attributes for each key-value pair in attrs for graph.
set_node_attrs
abstractmethod
Set the attributes for each key-value pair in attrs for node with node_id.
successors
Returns nodes which have an incoming edge from the specified node.
trudag.dotstop.core.graph.graph_factory
build_trustable_graph
build_trustable_graph(
graph_source: Path | str | None = None,
items_source: Path | list[Item] | None = None,
trustable_ignore: Path | None = None,
) -> TrustableGraph
Builds a TrustableGraph from given sources.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph_source | Path \| str \| None | DOT file path or string, or None for an empty graph. | None |
| items_source | Path \| list[Item] \| None | Markdown directory path, raw items list, or None for an empty list. | None |
| trustable_ignore | Path \| None | Trustable ignore file, like .gitignore but without globbing support. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| TrustableGraph | TrustableGraph | Complete TrustableGraph object. |
trudag.dotstop.core.graph.pydot_graph
trudag.dotstop.core.graph.trustable_graph
LinkStatus
Bases: Enum
Possible statuses of relationships between a pair of Items in a TrustableGraph.
LINKED
class-attribute
instance-attribute
The two items are linked and both items have not been changed since the link was last reviewed.
SUSPECT
class-attribute
instance-attribute
The two items are linked, but one or more of the items has changed since the link was last reviewed.
TrustableGraph
Construct an instance of TrustableGraph with a BaseGraph and a list of Items.
The nodes of the graph and the names of the items are expected to match exactly.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | BaseGraph | An object describing relationships between items that implements the BaseGraph abstract class. | required |
| items | list[Item] | A list of Items. | required |
add_items
add_items(
new_items: list[Item],
parent: str | None = None,
reviews: list[bool] | None = None,
) -> None
Add new items as nodes (and edges, if parent item exists).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| new_items | list[Item] | A list of Items to add. | required |
| parent | str \| None | The name of an optional parent/source. | None |
| reviews | list[bool] \| None | A list of booleans of equal length to new_items representing the review status of the items. | None |
Raises a GraphActionError if item already exists.
add_namespace
Modify graph during runtime by adding a namespace to all nodes and edges.
check
Raise a GraphStructureError if the Graph object contains illegal items or links.
- The graph does not:
  - contain links to or from non-normative items
  - contain duplicate nodes or edges
- The list of items has:
  - unique names
  - a separator in each name
  - names that correspond exactly with the set of node ids in the graph
documents_to_items_map
A dictionary of document names and their constituent Items.
Items are returned in ascending order. Documents are sorted alphabetically by name.
find_shortest_path
Find the shortest path between parent and child node. Returns a list of nodes from parent to child, or None if no path exists.
get_expectations
get_item
Return the Item with name.
Raises a GraphActionError if no such item is in the graph.
get_item_children
Return the list of children of the Item corresponding to name.
Raises a GraphActionError if no item named name is in the graph.
get_item_parents
Return the list of parents of the Item corresponding to name.
Raises a GraphActionError if no item named name is in the graph.
get_item_sha
Get the stored sha256 checksum of the Item with name name.
get_link_sha
Get the stored sha256 checksum of the link between items parent and child.
get_link_status
get_link_status(parent: str, child: str) -> LinkStatus
Return the LinkStatus of the edge from parent_name to child_name.
Raises a GraphActionError if either item does not exist.
get_orphaned_items
Returns a list of items with no parent or child nodes.
get_review_status
Return True if the Item with name name is reviewed, False otherwise.
Raises a GraphActionError if no item named name is in the graph.
namespaces
Returns the list of all namespaces on nodes and edges in a trustable graph.
remove_item
Remove the item with name name.
Raises a GraphActionError if the item does not exist.
resolve_all
resolve_all(workers: int | None = None) -> None
Resolves references and shas of all items concurrently.
set_link_status
set_link_status(
parent: str,
child: str,
status: LinkStatus,
force: bool = False,
) -> None
Set the status of a link to status.
Raises a GraphActionError if:
- Either item does not exist
- The link is set to its current status.
- Attempt to set status from unlinked to suspect.
set_review_status
Set the reviewed status of item name to status.
Raises a GraphActionError if no item named name is in the graph.
sha_link
A sha256 checksum of the link between self and child.
Computed from the shas of self and of child.
stamp_needs
Stamps items in a needs graph by adding a node attribute that is the hash of the item's text.
trudag.dotstop.core.item
Classes for managing the items in a TrustableGraph.
Item
Item(
name: str,
text: str = "",
normative: bool = True,
unresolved_references: list[BaseReference]
| None = None,
resolved_references: list[ResolvedReference]
| None = None,
validator_config: dict | None = None,
sme_scores: dict[str, float | dict[str, float | str]]
| float
| None = None,
fallacies: dict[str, Fallacy] | None = None,
order: ItemOrder | None = None,
)
A representation for a falsifiable statement and its associated properties.
name
property
name: str
The name of the item constructed from two parts, separated by a hyphen: DOCUMENT-ID
- `DOCUMENT` is the document prefix and may contain any alphanumeric or special character except `.` and `-`. Items with the same document prefix are grouped together in a TrustableGraph.
- `ID` is used to uniquely identify the item within a document. ID may contain any alphanumeric or special character except `.` and `-`.
Example
An Item with the name TSF-ITEM_1 is within the TSF document, uniquely identified in this document as ITEM_1.
text
property
text: str
The statement associated with a normative item. If the item is not normative this can be any text.
add_namespace
add_namespace(namespace: str) -> None
Namespaces the item by prepending its name with the namespace name, followed by a `.`.
header
A short text summary of the item, containing more information than the name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| include_name | bool | Prepend the item's name to the header. | True |
references
references() -> list[ResolvedReference]
Returns references in resolved form.
Unresolved references are resolved lazily. This is an in-place operation; the unresolved state is lost.
ItemOrder
dataclass
Storage for item ordering properties.
create_default_item_markdown
Creates a default item and writes it as markdown in dir.
item_from_markdown
item_from_markdown(
item_name: str,
md_content: str,
reference_builder: ReferenceBuilder = ReferenceBuilder(),
) -> Item
Creates an item from markdown content.
migrate_to_versioned_schema
This is a conversion function making artifact schemas from before the addition of the artifact version compatible with artifact schemas after its addition. This should be removed in the next release; see: https://gitlab.eclipse.org/eclipse/tsf/tsf/-/issues/492
reference
trudag.dotstop.core.reference
trudag.dotstop.core.reference.builder
Tools for managing the construction of BaseReference, whether they are built-in,
locally-defined plugins or packaged plugins.
ReferenceBuilder
ReferenceBuilder(dir_path: Path | None = None)
Functionality for building References to Artifacts whose types are not known until runtime.
Create a map of Reference types to their implementation as subclasses of BaseReference.
All derived subclasses of BaseReference are available when using this builder,
including those provided by local and packaged plugins.
Local reference plugins are loaded from ./.dotstop_extensions/references.py.
Packaged plugins are identified by the presence of a trustable.reference.plugins entry point in their package metadata
and imported as objects into the current module namespace.
build
Build a BaseReference subclass, inferring the concrete type from the dictionary's type field.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reference_dict | dict[str, bool \| int \| float \| str] | Dictionary to be passed to BaseReference.from_dict. | required |
trudag.dotstop.core.reference.references
Extensible classes for storing References to Artifacts
ArtifactReference
Bases: BaseReference
An ArtifactReference references a subgraph of another Trustable project
through the artifact.
When publishing, an additional report will be published from this artifact,
which the rendered reference will link to.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path of artifact file | required |
| roots | list[str] | Roots of the subgraph | required |
BaseReference
Bases: ABC
Abstract base class for defining References to Artifacts.
Concrete subclasses of BaseReference are
identified by their type property and:
- Are expressible as a string of valid markdown using as_markdown()
- Have a persistent sha256 checksum, sha
- Express a relationship to a specific sequence of bytes, content
content
abstractmethod
cached
property
content: bytes
The content of the Referenced Artifact as bytes.
sha
property
sha: str
The content of the Referenced Artifact as a sha256 checksum, formatted as a hexadecimal str.
Raises a ReferenceError if the content cannot be accessed.
as_markdown
abstractmethod
The content of the Referenced Artifact as a string of valid markdown.
The optional argument filepath is provided to allow subclasses of BaseReference to express links to other documentation or data.
This is required as markdown links are always relative.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filepath | Path \| None | Path to markdown file being written to. Defaults to None. | None |
from_dict
classmethod
Construct a BaseReference from a dictionary with the appropriate type value.
Raises a ReferenceError if the object cannot be built.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reference_dict | dict[str, bool \| int \| float \| str] | Dictionary used to construct the reference. | required |
type
abstractmethod
classmethod
type() -> str
A unique human-readable identifier for the Reference.
This identifier is used by ReferenceBuilder to select the appropriate reference type.
This type is usually specified in non-python contexts, such as markdown frontmatter.
FileReference
Bases: BaseReference, ABC
Abstract base class for References to Artifacts that are regular files.
Provides a concrete implementation of as_markdown().
FILE_EXTENSION_ALIASES
class-attribute
instance-attribute
Aliases for unusual file formats.
Each key-value pair expresses a mapping from a non-standard file extension (the key) to an extension supported by pygments (the value). This allows unusual files to be formatted correctly in a fenced markdown block.
as_markdown
The content of the Referenced Artifact as a string of valid markdown.
Supports all file formats that are supported by pygments.
Note
.csv format files are always assumed to use a comma delimiter, as
the delimiter of a csv file cannot be reliably determined.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filepath | Path \| None | Path to markdown file being written to. Defaults to None. | None |
GitlabFileReference
GitlabFileReference(
url: str,
id: int | str,
path: str,
ref: str,
token: str = "GITLAB_CI_TOKEN",
public: bool = False,
retries: int = 3,
**kwargs,
)
Bases: FileReference
References to Artifacts that are regular files in a remote GitLab repository.
A valid gitlab access token with sufficient read permissions must be available in the current environment. Several attempts are made to get a token, with the following precedence:
- User-specified `token` argument
- `$GITLAB_CI_TOKEN`
- `$CI_JOB_TOKEN`
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| url | str | Url of the gitlab instance | required |
| id | int | Repository/project id (this is an integer value, unique to the repository) | required |
| path | str | Path to the Artifact, relative to the root of the repository | required |
| ref | str | Tag, branch or sha | required |
| token | str | Environmental variable containing a suitable access token. Defaults to "GITLAB_CI_TOKEN". | 'GITLAB_CI_TOKEN' |
| retries | int | Specify max amount of retries for request. | 3 |
LocalFileReference
Bases: FileReference
References to Artifacts that are regular local files, in the git sense.
This means any regular file that is present in the git tree for the current commit of the local repository.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to the Artifact, relative to the root of the repository. | required |
| rendering | str | Optional rendering style: 'inline-expanded', 'inline-collapsible', 'codeblock-expanded' or 'codeblock-collapsible'. Defaults to 'inline-collapsible'. | None |
ResolvedReference
dataclass
ResolvedReference(
type_: str,
sha: str | None,
origin: str,
text_content: str,
metadata: dict,
logs: list[str] = list(),
)
An immutable data structure for a reference that has been resolved.
SourceSpanReference
Bases: BaseReference, ABC
References to a span of source code in the repository.
Should be used as the base class for your own reference types that find the span of the referenced source code at reference-time (say, by looking up a function name).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to the source file, relative to the root of the repository. | required |
| span | list[tuple[int, int]] | List of two 2-tuples. The tuples are the start and end [line, character] positions spanning the code we wish to reference. | required |
language
abstractmethod
classmethod
language() -> str
The markdown name of the language of the source code you are referencing.
csv_to_markdown_table
trudag.dotstop.core.validator
yaml
module-attribute
Type alias for loaded yaml data stored as a nested dict/list data structure.
Validator
Find, store and provide access to validator functions that are not known until runtime.
On construction, the Validator object will collect all functions with the signature `(configuration: dict[str, yaml]) -> tuple[float, list[Exception | Warning]]` that are available in the file .dotstop_extensions/validators.py, or are available in module entry points belonging to the group trustable.validator.plugins.
Build a Validator instance.
is_validator_function
staticmethod
True if the provided object is a validator function.
trudag.error
graphalyzer
trudag.graphalyzer
Support for simple applications of graph theory in python using adjacency-matrix-backed representations of graphs.
ABS_TOL
module-attribute
Absolute error tolerance for all inexact floating point operations.
For the double precision floating point values used by this module, setting this
value smaller than its default 1.0e-15 will result in well-defined but
unhelpful behaviour.
trudag.graphalyzer.analysis
Functions acting on the structure of a DirectedAcyclicGraph.
score
score(
graph: DirectedAcyclicGraph,
correctness_dict: dict[str, float],
completeness_dict: dict[str, float] | None = None,
) -> dict[str, float]
Given a dictionary of correctness scores for leaf (or equivalently Premise) items, compute the Trustable Scores of every node in the graph.
Note
- Non-leaf item scores included in correctness_dict are disregarded.
- Unscored leaves are assumed to have score zero.
- If no completeness scores are provided, the argument is assumed to be complete.
- If partial completeness scores are provided, omitted scores are set to zero.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | DirectedAcyclicGraph | Graph of the project being scored. | required |
| correctness_dict | dict[str, float] | Dictionary of (leaf item name, leaf item correctness) pairs. | required |
| completeness_dict | dict[str, float] \| None | Dictionary of (item name, item completeness). Defaults to None. | None |
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dictionary of (item name, item trustable score) pairs. |
score_edge_sensitivity
score_edge_sensitivity(
graph: DirectedAcyclicGraph,
edge: tuple[str, str],
correctness_dict: dict[str, float],
completeness_dict: dict[str, float] | None = None,
) -> dict[str, float]
Dictionary of node scores differentiated with respect to the chosen edge weight.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | DirectedAcyclicGraph | Graph of the project being analyzed. | required |
| edge | tuple[str, str] |  | required |
| correctness_dict | dict[str, float] | Dictionary of (leaf item name, leaf item correctness) pairs. | required |
| completeness_dict | dict[str, float] \| None | Dictionary of (item name, item completeness). Defaults to None. | None |
Returns:
| Type | Description |
|---|---|
| dict[str, float] | (node, node-derivative) key-value pairs for all nodes. |
score_node_sensitivity
score_node_sensitivity(
graph: DirectedAcyclicGraph,
node_label: str,
completeness_dict: dict[str, float] | None = None,
) -> dict[str, float]
The partial derivatives of node_label's score with respect to other node scores.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | DirectedAcyclicGraph | Graph of project under analysis. | required |
| node_label | str | Label of the node to return partial derivatives of. | required |
| completeness_dict | dict[str, float] \| None | Dictionary of (item name, item completeness). Defaults to None. | None |
Returns:
| Type | Description |
|---|---|
| dict[str, float] | (node, node-derivative) key-value pairs for all nodes. |
trudag.graphalyzer.graph
Adjacency-list-backed representations of graphs, with methods for performing graph analysis using dynamic programming.
DirectedAcyclicGraph
An adjacency-list representation of a Directed Acyclic Graph (DAG). An adjacency list contains a list of nodes and, for each node, references to other nodes; these references represent edges between the nodes. A directed acyclic graph is a graph that contains directed edges only, and contains zero cycles.
- https://en.wikipedia.org/wiki/Adjacency_list
- https://en.wikipedia.org/wiki/Directed_acyclic_graph
Construct a DAG from a list of node labels and weighted edges.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| nodes | list[str] | Unique string labels for each node. | required |
| edges | list[tuple[int, int, float]] | Tuples of (from_index, to_index, weight) defining directed edges. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If node labels are not unique, edges reference invalid nodes, weights are negative, duplicate edges exist, or the graph contains a cycle. |
is_connected
is_connected() -> bool
Return True if the graph is weakly connected.
A DAG is weakly connected if every node is reachable from every other node when edge directions are ignored. Returns False for empty graphs.
is_normalised
is_normalised() -> bool
Return True if outgoing edge weights from each node sum to 1 (or 0 for leaves).
is_unweighted
is_unweighted() -> bool
True if the graph is unweighted, false otherwise.
A graph is unweighted if all stored edge weights are either 0 or 1.
label_to_index
Return the index of the node with the given label.
Raises:
| Type | Description |
|---|---|
| ValueError | If no node with the given label exists in the graph. |
leaves
Return an iterator of (label, index) pairs for nodes with no outgoing edges.
nodes
Return an iterator of (label, index) pairs for all nodes in the graph.
normalised
normalised() -> DirectedAcyclicGraph
Return a new DAG with outgoing edge weights normalised to sum to 1 per node.
Leaf nodes (with zero total outgoing weight) are left unchanged.
trudag.manage
Additional functionality required by but not specific to the CLI.
add_item
add_item(
current_graph: TrustableGraph,
dotstop_path: Path,
filepath: Path,
parent: str,
)
Takes the markdown file at filepath and adds it to the graph with the given prefix and id. If parent is provided the item is linked appropriately.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| current_graph | TrustableGraph | The current trustable graph | required |
| dotstop_path | Path | The path to the project's dotstop file | required |
| filepath | Path | The path to the new item's .md file | required |
| parent | str | The parent item of the new item. Can be set to "" to create an orphaned node. | required |
Raises:
| Type | Description |
|---|---|
| GraphActionError | If the item already exists in the current graph. |
| FileNotFoundError | If the file path does not contain a markdown file. |
| ItemError | If the markdown at filepath is malformatted. |
create_new_item
create_new_item(
prefix: str,
path: Path,
parent: str,
uid: str,
output_file: Path,
dot_graph: TrustableGraph,
existing_file: bool = False,
) -> None
Create a new Item in the filesystem and add it to dot_graph.
Updates dot_graph and writes it as dot to output_file.
Creates a default item file at f"{path}/{prefix}-{name}.md".
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prefix | str | Document prefix for the new item | required |
| path | Path | Directory where new item will be written. | required |
| parent | str | Name of the parent item, if any. | required |
| uid | str | Unique (within a document) name for the item. | required |
| output_file | Path | File to write the updated graph to. | required |
| dot_graph | TrustableGraph | Graph to update and write to file. | required |
| existing_file | bool | When True use a pre-existing markdown file. | False |
Raises:
| Type | Description |
|---|---|
| GraphActionError | If the item to create contains a reserved character. |
| FileExistsError | If the file path for the item to create is not a valid directory. |
describe_item
describe_item(
graph: TrustableGraph, item_name: str, statement: bool
) -> str
Describes item and formats item details as a human-readable output.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | Graph to use for describing the item | required |
| item_name | str | Item name to be described | required |
| statement | bool | Whether the statement of the item should be included in the output | required |
diff_graph_git
diff_graph_git(
current_graph: TrustableGraph, revision: str
) -> ExitCodes
Compare the current graph with the version of the graph in a given Git branch.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| current_graph | TrustableGraph | The current trustable graph. | required |
| revision | str | Git revision to compare against. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| ExitCodes | ExitCodes | SUCCESS if no changed SME items are detected, else SME_ITEM_CHANGE_DETECTED. |
Raises:
| Type | Description |
|---|---|
| GitError | If the specified branch cannot be found or compared. |
Notes
- Logs removed and added items.
- Logs content differences for changed items.
- Logs SME scores for changed items if available.
diff_lint_report
Compare two lint reports and log differences.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lint_report_old_path | Path | Path to the old lint report. | required |
| lint_report_new_path | Path | Path to the new lint report. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| ExitCodes | ExitCodes | SUCCESS if reports match; otherwise, a LINT_FAILURE. |
Raises:
| Type | Description |
|---|---|
| TypeError | If reports cannot be loaded or parsed. |
lint
lint(graph: TrustableGraph, dump: Path | None) -> ExitCodes
Perform linting on the graph and log warnings for any issues found.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | The graph to lint. | required |
| dump | Path \| None | Dumps the result into a specified path. Extension determines the format of the dump. Allowed extensions: .json | required |
Returns:
| Name | Type | Description |
|---|---|---|
| ExitCodes | ExitCodes | LINT_FAILURE if issues are found, SUCCESS otherwise. |
lint_diff
lint_diff(
current_graph: TrustableGraph,
compare_branch: str,
workers: int | None,
) -> ExitCodes
Compare the current graph against another git branch's graph and log differences.
Logs
- New or removed unreviewed items
- New or removed suspect links
- Items with changed SHAs
- Changed items with associated SME scores
Warning
This function checks out the target git branch temporarily, and restores HEAD afterward.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| current_graph | TrustableGraph | The current working graph. | required |
| compare_branch | str | The git branch to compare against. | required |
| workers | int \| None | The upper bound for worker threads to spawn. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| ExitCodes | ExitCodes | LINT_FAILURE if any differences are found, SUCCESS otherwise. |
move_item
move_item(
source_graph: TrustableGraph,
source_dotstop: Path,
dest_graph: TrustableGraph,
dest_dotstop: Path,
items: tuple[str, ...],
)
Move item(s) from one graph to another (e.g. needs to dotstop or vice versa).
The item must have no links (parents or children) in the source graph. The markdown file is not moved or deleted.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_graph | TrustableGraph | The graph to move items from | required |
| source_dotstop | Path | The path to the source dotstop file | required |
| dest_graph | TrustableGraph | The graph to move items to | required |
| dest_dotstop | Path | The path to the destination dotstop file | required |
| items | tuple[str, ...] | Names of items to move | required |
Raises:
| Type | Description |
|---|---|
GraphActionError
|
If an item doesn't exist, already exists in destination, or has links in the source graph. |
remove_item
remove_item(
current_graph: TrustableGraph,
dotstop_path: Path,
items: tuple[str, ...],
delete_file: bool = True,
)
Remove item(s) from the graph and optionally delete their markdown files.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| current_graph | TrustableGraph | The current trustable graph | required |
| dotstop_path | Path | The path to the project's dotstop file | required |
| items | tuple[str, ...] | Names of items to remove | required |
| delete_file | bool | If true, delete the items' markdown files | True |

Raises:

| Type | Description |
|---|---|
| GraphActionError | If an item doesn't exist in the current graph. |
rename_item
rename_item(
current_graph: TrustableGraph,
dotstop_path: Path,
name_from: str,
prefix: str | None = None,
identifier: str | None = None,
delete_file: bool = True,
)
Takes the item :name_from: and renames it with a new :prefix: or :id:. This involves copying the .md file to a new name, then updating the dotstop with the new item and links and removing the old item and links. The old .md file is deleted only if delete_file is set.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| current_graph | TrustableGraph | The current trustable graph | required |
| dotstop_path | Path | The path to the project's dotstop file | required |
| name_from | str | The name of the item to rename | required |
| prefix | str | The new prefix to move the item to | None |
| identifier | str | The new id for the item (optional); if no value is provided, the id from name_from is used | None |
| delete_file | bool | Delete the file for the item with the old name | True |

Raises:

| Type | Description |
|---|---|
| GraphActionError | If the item to rename doesn't exist in the current graph. |
| FileExistsError | If the file path for the item to rename is not a valid directory. |
set_all_link_status
set_all_link_status(
current_graph: TrustableGraph,
item: str,
status: LinkStatus,
) -> None
Sets the link status of all links from the item to the given status. Links to items that aren't reviewed are not changed.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| current_graph | TrustableGraph | The current working graph. | required |
| item | str | The item whose links should be updated. | required |
| status | LinkStatus | The status to set the links to. | required |
trudag.plot
A library for creating graphical summaries of trustable software projects.
Functions for creating .dot files, and interfacing between graphviz and dotstop.
break_line_at
Take a one-line string and add line breaks at the first whitespace after every char_limit characters. This will allow words to overrun where necessary.
Raises a ValueError if the input string is multiline.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| line | str | Line to break. | required |
| char_limit | int | Soft limit on line length. | required |

Returns:

| Type | Description |
|---|---|
| str | The input line with line breaks inserted. |
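An illustrative sketch of the wrapping rule described above; this models the documented behaviour and is not the library's implementation:

```python
def break_line_at_sketch(line: str, char_limit: int) -> str:
    """Break at the first whitespace after every char_limit characters."""
    if "\n" in line:
        raise ValueError("input string must be a single line")
    chars: list[str] = []
    since_break = 0
    for ch in line:
        # Replace the first space *after* the soft limit with a newline;
        # words are allowed to overrun until the next whitespace.
        if ch == " " and since_break >= char_limit:
            chars.append("\n")
            since_break = 0
        else:
            chars.append(ch)
            since_break += 1
    return "".join(chars)

print(break_line_at_sketch("the quick brown fox", 10))
# the quick brown
# fox
```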
build_subgraph
build_subgraph(
graph: TrustableGraph,
pick: list[tuple[str, int | None, int | None]],
include_orphan_nodes: bool,
) -> int
Creates a slice of the original graph according to the supplied arguments.
!!! THIS FUNCTION IS DESTRUCTIVE !!! The slicing is performed directly on the graph passed as an argument. If the original graph is needed later, make a copy first.
The intention behind this function is to produce a reduced graph for projects whose full graphs are too dense to view and navigate, allowing users to focus on a slice of interest.
This function does nothing if none of the filtering arguments has any effect.
To allow filter operations to be combined, items are not removed from the graph as each operation is processed; instead, the function first tracks which items each filter retains, and at the end removes every item that is not marked for retention.
The pick and orphan_nodes filters provide a basic set of operations for focusing on points of interest in the graph.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | Dotstop graph to filter. | required |
| pick | list[tuple[str, int \| None, int \| None]] | Picks nodes to be retained in the graph, specified by item name and levels of parents/children to keep. | required |
| include_orphan_nodes | bool | Retain nodes that aren't linked to anything else. | required |

Returns:

| Type | Description |
|---|---|
| int | Number of items removed. |
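A hypothetical usage sketch, assuming `graph` is a loaded TrustableGraph; each pick tuple is (item name, parent levels, child levels), where None is taken here to mean "no limit" (an assumption based on the `int | None` type above):

```python
from copy import deepcopy

# build_subgraph is destructive, so keep a copy of the full graph.
original = deepcopy(graph)

removed = build_subgraph(
    graph,
    # Keep TA-SUPPLY_CHAIN, one level of parents, and all descendants.
    pick=[("TA-SUPPLY_CHAIN", 1, None)],
    include_orphan_nodes=False,
)
print(f"{removed} items removed from the slice")
```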
format_source_from_graph
format_source_from_graph(
graph: TrustableGraph,
line_length: int,
same_rank: list[list[str]],
invis_deps: list[tuple[str, str]],
base_url: str = "",
body: bool = True,
) -> str
Return a string of dot source code including formatting metadata.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | Dotstop graph to generate source from | required |
| line_length | int | Soft limit for characters-per-line in node labels. | required |
| same_rank | list[list[str]] | List of lists of item uids that should appear on the same rank in the plotted graph. | required |
| invis_deps | list[tuple[str, str]] | List of tuples of item uids that should be invisibly linked. | required |
| base_url | str | Base url for tooltips. If "", no tooltips are added. | '' |
plot
plot(
graph: TrustableGraph,
line_length: int,
same_rank: list[list[str]],
invis_deps: list[tuple[str, str]],
pick: list[tuple[str, int | None, int | None]],
orphan_nodes: bool,
output_file_path: Path = Path("./graph.svg"),
base_url: str = "",
body: bool = True,
) -> None
Given a dotstop graph in cwd, plot the tree using Graphviz.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | dotstop.Graph to work on | required |
| line_length | int | Soft limit for characters-per-line in node labels. | required |
| same_rank | list[list[str]] | List of lists of item uids that should appear on the same rank in the plotted graph. | required |
| invis_deps | list[tuple[str, str]] | List of tuples of item uids that should be invisibly linked. | required |
| pick | list[tuple[str, int \| None, int \| None]] | Picks nodes to be retained in the graph, specified by item name and levels of parents/children to be picked. | required |
| orphan_nodes | bool | Retain nodes that aren't linked to anything else. | required |
| output_file_path | Path | Output path for the plot file. | Path('./graph.svg') |
| base_url | str | Base url for tooltips. If "", no tooltips are added. | '' |
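A hypothetical usage sketch, assuming `graph` is a loaded TrustableGraph; all values are illustrative:

```python
from pathlib import Path

plot(
    graph,
    line_length=30,                      # soft wrap for node labels
    same_rank=[["TT-PROVENANCE", "TT-CONSTRUCTION"]],
    invis_deps=[],                       # no invisible links
    pick=[],                             # no slicing; plot the whole graph
    orphan_nodes=True,                   # keep unlinked nodes
    output_file_path=Path("./graph.svg"),
)
```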
trudag.publish
Library for creating written summaries of trustable software projects.
Report
Report(
graph: TrustableGraph,
project_name: str,
output_path: Path,
scores: dict[str, ItemScore],
non_normative_body: bool = False,
name_references_only=False,
figures: bool = False,
score_tables: bool = False,
data_store_client: DataStoreClientProtocol
| None = None,
sensitivity: dict[str, dict[str, float]] | None = None,
artifacts_published: dict[str, str] | None = None,
root_report: bool = True,
)
The set of information defining a Trustable Report, writeable as markdown.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | Graph for the project | required |
| project_name | str | Name of the project | required |
| output_path | Path | Directory to write all generated files within. | required |
| scores | dict[str, ItemScore] | Dictionary of (item_str, confidence score) pairs | required |
| non_normative_body | bool | Include the body text of non-normative items. | False |
| name_references_only | bool | Do not include the contents of external references. | False |
| figures | bool | Include time series plots (requires data store). | False |
| score_tables | bool | Include historic score tables (requires data store). | False |
| data_store_client | DataStoreClient | Injectable DataStoreClient for testing. | None |
| sensitivity | dict[str, dict[str, float]] | Dictionary of items and their graph sensitivity values. | None |
| artifacts_published | dict[str, str] | Dictionary of artifact reference paths with their project names to be published. | None |
| root_report | bool | Flag indicating whether the report being published is the top-level report. | True |
write_navigation_index
The Trustable Report's navigation as formatted markdown.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| report_name | str | Trustable Report's file stem. | required |
create_markdown_table
create_markdown_table(
headers: list[str],
data: list[list[str | float]],
widths: list[int] | None = None,
line_break: bool = True,
indent: int = 0,
) -> str
A helper function for creating markdown tables. widths[i] corresponds to the % width of column i.
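A minimal usage sketch, assuming the signature above; the headers, data, and widths are illustrative:

```python
table = create_markdown_table(
    headers=["Item", "Score"],
    data=[["TA-SUPPLY_CHAIN", 0.0], ["TA-INPUTS", 0.0]],
    widths=[70, 30],  # Item column gets 70% of the width, Score 30%
)
print(table)
```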
publish
publish(
graph: TrustableGraph,
project_name: str,
all_bodies: bool,
output_path: Path,
scores: dict[str, ItemScore],
figures: bool = False,
score_tables: bool = False,
data_store_client: DataStoreClientProtocol
| None = None,
sensitivity: dict[str, dict[str, float]] | None = None,
artifacts_published: dict[str, str] | None = None,
root_report: bool = True,
) -> None
Given a trustable graph in cwd, summarise the graph (and its trustable score) in the markdown file "trustable_report_for_" + project_name + ".md". Also produce an item summary and nav file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | Graph | Graph to summarise | required |
| project_name | str | Name of the trustable software project. | required |
| all_bodies | bool | Include the body text of all (i.e. including non-normative) items in the generated markdown. | required |
| output_path | Path | Directory to write all generated files within. | required |
| scores | dict[str, float] | Dictionary of (item_str, confidence score) pairs | required |
| figures | bool | Include time series plots (requires data store). | False |
| score_tables | bool | Include historic score tables (requires data store). | False |
| data_store_client | DataStoreClient | Injectable DataStoreClient for testing. | None |
trudag.score
Library for Trustable score calculations.
In practice, this means functions for interfacing between dotstop and graphalyzer.
item_sensitivity
item_sensitivity(
graph: TrustableGraph,
items_labels: list[str] | None = None,
)
Perform the sensitivity analysis for a given TrustableGraph, or for a specific set of nodes selected via the items_labels filter.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | Graph to be analysed | required |
| items_labels | list[str] | Optional filter of items to be evaluated | None |

Returns: (dict[str, dict[str, float]]): Dictionary mapping each item to a map of that item's importance to the other items.
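A hypothetical usage sketch, assuming `graph` is a loaded TrustableGraph; the item label is illustrative:

```python
sensitivity = item_sensitivity(graph, items_labels=["TA-SUPPLY_CHAIN"])
for item, influence in sensitivity.items():
    for other, value in influence.items():
        print(f"{item} -> {other}: {value:.3f}")
```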
score
score(
graph: TrustableGraph,
validator: Validator | None = None,
concurrent: bool = False,
workers: int | None = None,
dump: Path | None = None,
) -> dict[str, ItemScore]
Compute the trustable score of graph and all its reviewed Items.
The score is calculated recursively from leaf Items, with each Item being assigned a score equal to the weighted sum of its child Items. The score for each leaf Item should be recorded in an attribute named score, else its score will be assumed to be zero.
Unreviewed Items will not be scored. Contributions from child Items associated by a suspect link are ignored.
Warning
Scoring is still under heavy development. Currently:
- Scores for non-leaf nodes are ignored.
- All weights are assumed to be one and are normalised accordingly.
This behaviour is likely to change.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | Graph to score. | required |
| validator | Validator \| None | Validator class object to run validations with. | None |
| dump | Path | Output file path for the Trustable Scores file. | None |

Returns:

| Type | Description |
|---|---|
| dict[str, float] | Dictionary of (UID, confidence score) pairs for all reviewed Items. |
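The recursion can be sketched as follows; this is an illustrative model of the rule described above (unit weights, leaf scores in a score attribute), not the library's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    uid: str
    reviewed: bool = True
    score: float = 0.0          # only meaningful for leaf nodes
    children: list["Node"] = field(default_factory=list)

def node_score(node: Node) -> float:
    if not node.reviewed:
        return 0.0              # unreviewed Items are not scored
    if not node.children:
        return node.score       # leaf: use recorded score (default zero)
    # Non-leaf: unweighted mean of child scores (all weights assumed one,
    # normalised, per the Warning above).
    return sum(node_score(c) for c in node.children) / len(node.children)

root = Node("TA-X", children=[Node("E1", score=0.8), Node("E2", score=0.4)])
print(node_score(root))  # 0.6 (up to float rounding)
```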
score_origin
score_origin(
item: Item,
premises: list[Item],
review_statuses: dict[str, bool],
validations: dict[str, dict],
score: float,
) -> str
Given an item, extract its score origin.
scores_from_graph
scores_from_graph(
premises: list[Item],
review_statuses: dict[str, bool],
validations: dict[str, dict],
) -> dict[str, float]
Given premises with their review statuses and validations, extract the Evidence scores.
- Where items have neither an SME nor a validation score, assign them a score of zero, logging a warning.
- Where items have an SME score but are missing both a validation score and references, assign them a score of zero, logging a warning.
- Where items have both an SME and a validation score, take their product.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| premises | list[Item] | List of premises to extract Evidence scores from. | required |
| review_statuses | dict[str, bool] | Review statuses of all items. | required |
| validations | dict[str, dict] | Validation results of all items. | required |

Returns:

| Type | Description |
|---|---|
| dict[str, float] | Dictionary of (item_str, confidence score) pairs for Evidence items |
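A minimal sketch of these three rules, with hypothetical field names (sme_score, validation_score, references) used purely for illustration:

```python
import logging

def evidence_score(sme_score: float | None,
                   validation_score: float | None,
                   references: list[str]) -> float:
    if sme_score is None and validation_score is None:
        logging.warning("no SME or validation score; assigning zero")
        return 0.0
    if sme_score is not None and validation_score is None and not references:
        logging.warning("SME score without validation score or references; "
                        "assigning zero")
        return 0.0
    if sme_score is not None and validation_score is not None:
        return sme_score * validation_score  # product of the two scores
    # Remaining cases (e.g. validation score only, or SME score with
    # references) are not specified above; fall back to whichever exists.
    return sme_score if sme_score is not None else (validation_score or 0.0)
```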
trudag.utils
OUTPUT_QUOTE
module-attribute
Filename in the root directory assumed to contain a dot graph generated by trudag.
dump_scores
Outputs the provided Trustable scores to the given dump path in one of the supported file formats.
get_commit_timestamp
get_commit_timestamp() -> int | None
The UNIX timestamp in seconds for the commit at HEAD.
get_last_tag
get_last_tag() -> str | None
The most recent git tag (either annotated or lightweight) for HEAD, if it exists.
If the current commit is not tagged, the returned string has the format <latest tag>-<# commits since last tag>-<short SHA> (e.g. 0.2.0-5-g9e93651); if the commit is tagged, <latest tag> is returned.
get_root_dir
get_root_dir() -> str
Returns the top-level directory of the git repository. If no git repository is found, returns the current working directory.
get_workdir
get_workdir() -> Path
Get the CLI working directory.
If the current working directory is a git repo, return the top level directory of the repo. If not, return the current working directory.
frontends.cli
The trudag CLI frontend.
main
main(
ctx: Context,
verbose: bool,
workers: int | None,
needs: bool,
config_file: Path | None = None,
) -> None
Manipulate, analyze and present the contents of a Trustable Acyclic Graph.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ctx | Context | Context to append to and pass to subcommands | required |
| verbose | bool | Enable verbose logging | required |
| workers | int | Upper bound on the number of worker threads to spawn | required |
| needs | bool | Whether to operate on the needs graph instead of the main graph. | required |
frontends.cli.utils
abort_click_on
Returns a decorator that catches all exceptions of type kinds, logs them, then raises a click.Exception.
Decorator factory function that returns an exception handling decorator for use with click CLI programs. When an error of type kinds is encountered, it is logged at CRITICAL level. The error is also logged with a full stack trace at DEBUG level for later inspection. After logging, a click.Exception is raised, which click handles to gracefully exit the program.
Example

Consider the simple function square_root:

```python
def square_root(arg: float):
    if arg < 0.0:
        raise ValueError("Cannot compute square root of negative value.")
    return arg**0.5
```

If this is called with a negative value by a click program, we'll get a crash with a stack trace.

Using the abort_click_on decorator:

```python
@abort_click_on(ValueError)
def square_root(arg: float):
    if arg < 0.0:
        raise ValueError("Cannot compute square root of negative value.")
    return arg**0.5
```

We instead get a clean exit.
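A hypothetical end-to-end sketch of the decorator in a click command; the command name and import path are illustrative:

```python
import click
from frontends.cli.utils import abort_click_on  # assumed import path

@click.command()
@click.argument("value", type=float)
@abort_click_on(ValueError)
def sqrt(value: float) -> None:
    """Print the square root of VALUE."""
    click.echo(value**0.5)

if __name__ == "__main__":
    sqrt()  # e.g. `python sqrt.py -- -1` exits cleanly with a logged error
```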
describe_item
describe_item(
graph: TrustableGraph, item_name: str, statement: bool
) -> str
Describes an item and formats its details as human-readable output.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | Graph to use for describing the item | required |
| item_name | str | Item name to be described | required |
| statement | bool | Whether the statement of the item should be included in the output | required |
filter_unreviewed_items
filter_unreviewed_items(
graph: TrustableGraph, items: list[str]
) -> list
Returns a list of unreviewed items from the supplied list of items. Raises an error for any items that are not in the graph.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| graph | TrustableGraph | Graph used to check review status. | required |
| items | list[str] | A list of item names to check | required |

Returns: a list of strings, from the items argument, that are in the graph and unreviewed.
parse_ranks
Parse the 'list as string' of a rank tuple into a Python list.
validate_trudag_version
validate_trudag_version(graph_source: Path) -> None
Compare the trudag version with the version used to generate the dotfile.
Raises:

| Type | Description |
|---|---|
| VersionError | If the trudag version and the version used to generate the dotfile have different dotfile formatting. |
Reports
Trustable Report
Trustable Compliance Report
Item status guide
Each item in a Trustable Graph is scored with a number between 0 and 1. The score represents aggregated organizational confidence in a given Statement, with larger numbers corresponding to higher confidence. Scores in the report are indicated by both a numerical score and a colormap.
The status of an item and its links also affect the score.
Unreviewed items are indicated by a cross in the status column. The score of unreviewed items is always set to zero.
Suspect links are indicated by a cross in the status column. The contribution to the score of a parent item by a suspiciously linked child is always zero, regardless of the child's own score.
Compliance for TA
| Item | Summary | Score | Score Origin | Status |
|---|---|---|---|---|
| TA-SUPPLY_CHAIN | All sources for XYZ and tools are mirrored in our controlled environment | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-INPUTS | All inputs to XYZ are assessed, to identify potential risks and issues | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-TESTS | All tests for XYZ, and its build and test environments, are constructed from controlled/mirrored sources and are reproducible, with any exceptions documented | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-RELEASES | Construction of XYZ releases is fully repeatable and the results are fully reproducible, with any exceptions documented and justified. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-ITERATIONS | All constructed iterations of XYZ include source code, build and usage instructions, tests, results, and attestations. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-FIXES | Known bugs or misbehaviours are analysed and triaged, and critical fixes or mitigations are implemented or applied. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-UPDATES | XYZ components, configurations and tools are updated under specified change and configuration management controls. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-BEHAVIOURS | Expected or required behaviours for XYZ are identified, specified, verified and validated based on analysis. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-MISBEHAVIOURS | Prohibited misbehaviours for XYZ are identified, and mitigations are specified, verified and validated based on analysis. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-CONSTRAINTS | Constraints on adaptation and deployment of XYZ are specified. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-INDICATORS | Advance warning indicators for misbehaviours are identified, and monitoring mechanisms are specified, verified and validated based on analysis. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-ANALYSIS | Collected test and monitoring data for XYZ is analysed using verified methods to validate expected behaviours and identify new misbehaviours. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-DATA | Test and monitoring data from development and production are appropriately collected and retained. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-VALIDATION | Tests exercise both stressed and representative conditions, validating behaviour through systematic, scheduled repetition. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-METHODOLOGIES | Manual methodologies applied for XYZ by contributors, and their results, are managed according to specified objectives. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TA-CONFIDENCE | Confidence in XYZ is measured based on results of analysis | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
Compliance for TRUSTABLE
| Item | Summary | Score | Score Origin | Status |
|---|---|---|---|---|
| TRUSTABLE-SOFTWARE | This release of XYZ is Trustable. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
Compliance for TT
| Item | Summary | Score | Score Origin | Status |
|---|---|---|---|---|
| TT-PROVENANCE | All inputs (and attestations for claims) for XYZ are provided with known provenance. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TT-CONSTRUCTION | Tools are provided to build XYZ from trusted sources (also provided) with full reproducibility. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TT-CHANGES | XYZ is actively maintained, with regular updates to dependencies, and changes are verified to prevent regressions. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TT-EXPECTATIONS | Documentation is provided, specifying what XYZ is expected to do, and what it must not do, and how this is verified. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TT-RESULTS | Evidence is provided to demonstrate that XYZ does what it is supposed to do, and does not do what it must not do. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
| TT-CONFIDENCE | Confidence in XYZ is achieved by measuring and analysing behaviour and evidence over time. | 0.00 | Missing | ✔ Item Reviewed ✔ All Children Linked |
Generated for: TSF
- Repository root: /builds/eclipse/tsf/tsf
- Commit SHA: 9e93651f6a8bfe22ba8c741080d134e5cf06ba25
- Commit date/time: 2026-04-09 15:30:25+00:00 UTC
- Commit tag: 0.2.0-5-g9e93651
Dashboard
Evidence Score Distribution
The distribution of scores for evidence nodes across the graph.
| bin | count |
|---|---|
| 0.0-0.1 | 16 |
| 0.1-0.2 | 0 |
| 0.2-0.3 | 0 |
| 0.3-0.4 | 0 |
| 0.4-0.5 | 0 |
| 0.5-0.6 | 0 |
| 0.6-0.7 | 0 |
| 0.7-0.8 | 0 |
| 0.8-0.9 | 0 |
| 0.9-1.0 | 0 |
Expectations Score Distribution
The distribution of scores for expectations nodes across the graph.
| bin | count |
|---|---|
| 0.0-0.1 | 1 |
| 0.1-0.2 | 0 |
| 0.2-0.3 | 0 |
| 0.3-0.4 | 0 |
| 0.4-0.5 | 0 |
| 0.5-0.6 | 0 |
| 0.6-0.7 | 0 |
| 0.7-0.8 | 0 |
| 0.8-0.9 | 0 |
| 0.9-1.0 | 0 |
All Score Distribution
The distribution of scores for all nodes across the graph.
| bin | count |
|---|---|
| 0.0-0.1 | 23 |
| 0.1-0.2 | 0 |
| 0.2-0.3 | 0 |
| 0.3-0.4 | 0 |
| 0.4-0.5 | 0 |
| 0.5-0.6 | 0 |
| 0.6-0.7 | 0 |
| 0.7-0.8 | 0 |
| 0.8-0.9 | 0 |
| 0.9-1.0 | 0 |
Summary
| Category | Count |
|---|---|
| statements | 30 |
| reviewed statements | 30 |
| unreviewed statements | 0 |
| orphaned statements | 0 |
| statements with evidence | 6 |
| evidence | 16 |
| expectations | 1 |
TA
This document lists the Trustable Assertions (TA) for XYZ, grouped by the Trustable Tenets (TT) they support.
Provenance
TA-SUPPLY_CHAIN | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-PROVENANCE | All inputs (and attestations for claims) for XYZ are provided with known provenance. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-SUPPLY_CHAIN_CONTEXT.md

Guidance
This assertion is satisfied to the extent that we have traced and captured source code for XYZ and all of its dependencies (including transitive dependencies, all the way down), and for all of the tools used to construct XYZ from source, and have mirrored versions of these inputs under our control. Any associated data and documentation dependencies must also be considered.
'Mirrored' in this context means that we have a version of the upstream project that we keep up-to-date with additions and changes to the upstream project, but which is protected from changes that would delete the project, or remove parts of its history.
Clearly this is not possible for components or tools (or data) that are provided only in binary form, or accessed via online services - in these circumstances we can only assess confidence based on attestations made by the suppliers, and on our experience with the suppliers' people and processes.
Keep in mind that even if repositories with source code for a particular component or tool are available, not all of it may be stored in Git as plaintext. A deeper analysis is required in TA-INPUTS to assess the impact of any binaries present within the repositories of the components and tools used.
Evidence
- list of all XYZ components, including:
- URL of mirrored projects in controlled environment
- URL of upstream projects
- successful build of XYZ from source
- without access to external source projects
- without access to cached data
- update logs for mirrored projects
- mirrors reject history rewrites
- mirroring is configured via infrastructure under direct control
Confidence scoring
Confidence scoring for TA-SUPPLY_CHAIN is based on confidence that all inputs and dependencies are identified and mirrored, and that mirrored projects cannot be compromised.
Checklist
- Could there be other components, missed from the list?
- Does the list include all toolchain components?
- Does the toolchain include a bootstrap?
- Could the content of a mirrored project be compromised by an upstream change?
- Are mirrored projects up to date with the upstream project?
- Are mirrored projects based on the correct upstream?
TA-INPUTS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-PROVENANCE | All inputs (and attestations for claims) for XYZ are provided with known provenance. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-INPUTS_CONTEXT.md

Guidance
Anything that can influence the output of the XYZ project is considered an input. This includes:
- Software components used to implement specified features and meet defined Expectations
- Software tools, and their outputs, used for design, construction and verification
- Infrastructure that supports development and release processes
All inputs (components, tools, data) and their dependencies (recursively) used to build and verify XYZ releases must be identified and assessed, since they are untrusted by default.
Each input should be evaluated on verifiable merits, regardless of any claims it makes (including adherence to standards or guidance). Evaluation must include the project's defined Expectations to ensure that inputs meet requirements, and that risks are recorded and addressed appropriately.
For components, we need to consider how their misbehaviour might impact achieving project XYZ's Expectations. Sources (e.g. bug databases, advisories) for known risks should be identified, their update frequency recorded, and tests defined for detecting them. These form the inputs to TA-FIXES.
For the tools used to construct and verify XYZ, we need to consider how their misbehaviour could:
- Introduce unintended changes
- Fail to detect Misbehaviours during testing
- Produce misleading data used to design or verify the next iteration
Where any input impacts are identified, consider:
- How serious their impact might be, and whether Expectations or analysis outcomes are affected (severity)
- Whether they are detected by another tool, test, or manual check (detectability)
Confidence in assessing severity and detectability can be supported by analysing development history and practices of each input to evaluate upstream sources (both third-party and first-party) for maintainability and sustainability (including, for example, testability, modularity and configurability) to reduce failure impact and support safe change.
These qualities can be estimated through evidence of software engineering best practice, applied through:
- Processes defining and following design, documentation and review guidelines, carried out manually (advocating simple design, reuse, structured coding constructs, and competent release management)
- Appropriate use of programming languages and their features, supported by tools such as static analysis, with regular improvement of their configurations
For impacts with high severity or low detectability (or both), additional analysis should assess whether existing tests effectively detect Misbehaviours and their impacts.
As a result, for example, any binary inputs without reproducible build steps or clear development history and maintenance processes should be treated as risks and mitigated appropriately.
Evidence
- List of components used to build XYZ, including:
- Whether content is provided as source or binary
- Record of component assessments:
- Originating project and version
- Date of assessments and identity of assessors
- Role of component in XYZ
- Sources of bug and risk data
- Potential misbehaviours and risks identified and assessed
- List of tools used to build and verify XYZ
- Record of tool assessments:
- Originating project and tool version
- Date of assessments and identity of assessors
- Role of the tool in XYZ releases
- Potential misbehaviours and impacts
- Detectability and severity of impacts
- Tests or measures to address identified impacts
Confidence scoring
Confidence scoring for TA-INPUTS is based on the set of components and tools identified, how many of (and how often) these have been assessed for their risk and impact for XYZ, and the sources of risk and issue data identified.
Checklist
- Are there components that are not on the list?
- Are there assessments for all components?
- Has an assessment been done for the current version of the component?
- Have sources of bug and/or vulnerability data been identified?
- Have additional tests and/or Expectations been documented and linked to component assessment?
- Are component tests run when integrating new versions of components?
- Are there tools that are not on the list?
- Are there impact assessments for all tools?
- Have tools with high impact been qualified?
- Were assessments or reviews done for the current tool versions?
- Have additional tests and/or Expectations been documented and linked to tool assessments?
- Are tool tests run when integrating new versions of tools?
- Are tool and component tests included in release preparation?
- Can patches be applied, and then upstreamed for long-term maintenance?
- Do all dependencies comply with acceptable licensing terms?
Construction
TA-TESTS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-CONSTRUCTION | Tools are provided to build XYZ from trusted sources (also provided) with full reproducibility. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-TESTS_CONTEXT.md

Guidance
This assertion is satisfied if all build and test environments and tools used to support Expectations are shown to be reproducible, all build and test steps are repeatable, and all required inputs are controlled. TA-TESTS does not include reproducibility of XYZ itself; this is instead covered by TA-RELEASES.
All tools and test environments should be constructed from change-managed sources (see TA-UPDATES) and mirrored sources (see TA-SUPPLY_CHAIN). Additional evidence needs to demonstrate that construction of tools and environments produces the same binary fileset used for testing and that builds can be repeated on any suitably configured server (similar to how the XYZ is evaluated for TA-RELEASES).
Test environment repeatability should be ensured to enable effective Misbehaviour investigations and to enable additional data generation (including by third parties). To achieve repeatability, all infrastructure, hardware, and configurations must be identified and documented for all test environments. Storage of this information is evaluated in TA-DATA, and its availability is considered in TA-ITERATIONS.
Evidence
- Test build environment reproducibility
- Test build configuration
- Test build reproducibility
- Test environment configuration
Confidence scoring
Confidence scoring for TA-TESTS is based on confidence that the construction and deployment of test environments, tooling and their build environments are repeatable and reproducible.
Checklist
- How confident are we that our test tooling and environment setups used for tests, fault inductions, and analyses are reproducible?
- Are any exceptions identified, documented and justified?
- How confident are we that all test components are taken from within our controlled environment?
- How confident are we that all of the test environments we are using are also under our control?
- Do we record all test environment components, including hardware and infrastructure used for exercising tests and processing input/output data?
- How confident are we that all test scenarios are repeatable?
TA-RELEASES | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-CONSTRUCTION | Tools are provided to build XYZ from trusted sources (also provided) with full reproducibility. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-RELEASES_CONTEXT.md

Guidance
This assertion is satisfied if each iteration of XYZ is repeatable, with all required inputs controlled, and reproducible (covering both XYZ and the construction toolchain/environment, as described in TA-TESTS).
This assertion can be most effectively satisfied in a Continuous Integration environment with mirrored projects (see TA-SUPPLY_CHAIN) and build servers without internet access. The aim is to show that all build tools, XYZ components, and dependencies are built from controlled inputs, that rebuilding produces the same binary fileset, and that builds can be repeated on any suitably configured server, with server differences shown not to affect reproducibility.
For releases in particular, builds from source must be shown to produce identical outputs both with and without cache access.
Again this will not be achievable for components/tools provided in binary form, or accessed via an external service - we must consider our confidence in attestations made by/for the supply chain.
All non-reproducible elements, such as timestamps or embedded random values from build metadata, are clearly identified and considered when evaluating reproducibility.
As a result, we gain increased confidence that the toolchain behaves correctly during version upgrades: unintended changes to the project are avoided, intended fixes produce the expected effects, and the constructed output of XYZ shows the correct behavioural changes, verified and validated with test results according to TT-RESULTS analysis.
Evidence
- list of reproducible SHAs
- list of non-reproducible elements with:
- explanation and justification
- details of what is not reproducible
- evidence of configuration management for build instructions and infrastructure
- evidence of repeatable builds
Confidence scoring
Calculate:
- R = number of reproducible components (including sources which have no build stage)
- N = number of non-reproducible components
- B = number of binaries
- M = number of mirrored projects
- X = number of things not mirrored

Confidence scoring for TA-RELEASES could possibly be calculated as R / (R + N + B + M / (M + X)).
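A worked example of this candidate formula with hypothetical counts; the formula itself is a tentative suggestion, not a fixed rule:

```python
# Hypothetical counts for illustration only.
R, N, B = 40, 3, 2          # reproducible, non-reproducible, binary components
M, X = 44, 1                # mirrored vs. not-mirrored inputs

score = R / (R + N + B + M / (M + X))
print(f"TA-RELEASES confidence ~= {score:.3f}")  # ~0.870
```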
Checklist
- How confident are we that all components are taken from within our controlled environment?
- How confident are we that all of the tools we are using are also under our control?
- Are our builds repeatable on a different server, or in a different context?
- How sure are we that our builds don't access the internet?
- How many of our components are non-reproducible?
- How confident are we that our reproducibility check is correct?
TA-ITERATIONS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-CONSTRUCTION | Tools are provided to build XYZ from trusted sources (also provided) with full reproducibility. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-ITERATIONS_CONTEXT.md

Guidance
This assertion is best satisfied by checking generated documentation to confirm that:
- every iteration is a working product with evidence-backed, falsifiable Statements, together with documentation of confidence in those Statements and all required Trustable Statements.
- every iteration includes instructions for building and using the product
- all components, dependencies, tools, and data are identified in a manifest
- the manifest provides links to source code
- where source code is unavailable, the supplier is identified
An iteration consists of each batch of changes accepted into the canonical version of the product. How the canonical version is managed must be documented (for TT-CHANGES) alongside the product's Expectations.
Every iteration must be usable as a standalone product, with verification and validation completed so that a hotfix could be released at any point. Documentation generated alongside the product must include build and usage guidance together with the project's documented Expectations and supporting Statements, enabling any maintainer or user to reverify the state of the product and associated Statements.
For each iteration, any changes must be accompanied by attestations and reasoning, explaining the tests performed and the review steps taken, together with their outcomes (e.g., results of source code inspections). Any attestations and impact assessments must be traceable to the specific changes, authors, reviewers, and the review process documentation used.
Collating and making available all appropriate data and documentation for every iteration must be automatable, so that the product's build can be reproduced and its analysis repeated end-to-end independently (best achieved using generated documentation and configuration as code). All relevant data, including approval statuses and dates, must be stored long-term and analysed as part of TA-DATA. For complex systems, the resulting information must be presented in a user-friendly, searchable, and accessible form.
Given such transparent documentation and attestations for every iteration, it becomes possible to analyse product and development trends over time. For releases, additional documentation should summarise all changes across the iterations since the previous release.
Evidence
- list of components with source
- source code
- build instructions
- test code
- test results summary
- attestations
- list of components where source code is not available
- risk analysis
- attestations
Confidence scoring
Confidence scoring for TA-ITERATIONS is based on:
- number and importance of source components
- number and importance of non-source components
- assessment of attestations
Checklist
- How much of the software is provided as binary only, expressed as a fraction of the BoM list?
- How much is binary, expressed as a fraction of the total storage footprint?
- For binaries, what claims are being made and how confident are we in the people/organisations making the claims?
- For third-party source code, what claims are we making, and how confident are we about these claims?
- For software developed by us, what claims are we making, and how confident are we about these claims?
Changes
TA-FIXES | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-CHANGES | XYZ is actively maintained, with regular updates to dependencies, and changes are verified to prevent regressions. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-FIXES_CONTEXT.md

Guidance
This assertion is satisfied to the extent that we have identified, triaged, and applied fixes or mitigations to faults in XYZ, as well as to bugs and publicly disclosed vulnerabilities identified in upstream dependencies.
Confidence can be improved by assessing known faults, bugs, and vulnerabilities to establish their relevance and impact for XYZ. An important aspect is documenting how issues are discovered and tracked, including identifying additional Misbehaviours (TA-MISBEHAVIOURS) that may require immediate mitigation measures (including recalls), and how such issues are communicated to users.
In principle, this analysis should include not only the code in XYZ but also its dependencies (all the way down) and the tools and data used to construct the release. In practice, however, the cost/benefit of this work must be weighed against:
- the volume and quality of available bug and vulnerability reports
- the likelihood that our build, configuration, or use case is actually affected
The triage process must be documented, reviewed, and evidenced as sufficient and consistently followed. Documentation must make clear how prioritisation, assignment, and rejection (e.g., for duplicates) are handled, and how mitigations are tracked to completion in a timely manner appropriate to the project's claims and the issues discovered.
Field incidents are a key source of high-priority Misbehaviours. These require additional rigour to ensure appropriate and timely responses. For every iteration and associated change, related issue resolutions must be documented with their impact (e.g., whether new Misbehaviours were found or parts of the analysis had to be redone) and linked to the specific change, ensuring visible traceability. This information must remain available to support decision traceability throughout the project's lifetime (as considered in TA-DATA).
As part of ongoing monitoring, the rate of incoming, resolved, and rejected issues across the project and its dependencies should be tracked for trends and anomalies, to identify shifts and to detect if a source of information is lost.
Evidence
- List of known bugs fixed since last release
- List of outstanding bugs still not fixed, with triage/prioritisation based on severity/relevance/impact
- List of known vulnerabilities fixed since last release
- List of outstanding known vulnerabilities still not fixed, with triage/prioritisation based on severity/relevance/impact
- List of XYZ component versions, showing where a newer version exists upstream
- List of component version updates since last release
- List of fixes applied to developed code since last release
- List of fixes for developed code that are outstanding, not applied yet
- List of XYZ faults outstanding (O)
- List of XYZ faults fixed since last release (F)
- List of XYZ faults mitigated since last release (M)
Confidence scoring
Confidence scoring for TA-FIXES can be based on
- some function of [O, F, M] for XYZ
- number of outstanding relevant bugs from components
- bug triage results, accounting for undiscovered bugs
- number of outstanding known vulnerabilities
- triage results of publicly disclosed vulnerabilities, accounting for undiscovered bugs and vulnerabilities
- confidence that known fixes have been applied
- confidence that known mitigations have been applied
- previous confidence score for TA-FIXES
With each iteration, we should improve the algorithm based on measurements.
Checklist
- How many faults have we identified in XYZ?
- How many unknown faults remain to be found, based on the number that have been processed so far?
- Is there any possibility that people could be motivated to manipulate the lists (e.g. bug bonus or pressure to close)?
- How many faults may be unrecorded (or incorrectly closed, or downplayed)?
- How do we collect lists of bugs and known vulnerabilities from components?
- How (and how often) do we check these lists for relevant bugs and known vulnerabilities?
- How confident can we be that the lists are honestly maintained?
- Could some participants have incentives to manipulate information?
- How confident are we that the lists are comprehensive?
- Could there be whole categories of bugs/vulnerabilities still undiscovered?
- How effective is our triage/prioritisation?
- How many components have never been updated?
- How confident are we that we could update them?
- How confident are we that outstanding fixes do not impact our Expectations?
- How confident are we that outstanding fixes do not address Misbehaviours?
TA-UPDATES | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-CHANGES | XYZ is actively maintained, with regular updates to dependencies, and changes are verified to prevent regressions. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-UPDATES_CONTEXT.md

Guidance
This assertion requires control over all changes to XYZ, including configurations, components, tools, data, documentation, and dependency versions used to build, verify, and validate it.
As part of change control, all automated checks must run and pass (e.g., tests, static analysis, lint checks) before accepting proposed changes. These checks must be configured against appropriate claims and coding guidelines. Where a change affects tracked claims, the impact must be identified, reasoned, and verified, with linked analysis performed (e.g., input analysis for new dependencies as per TA-INPUTS). Even changes with no direct impact on project claims must be justified.
Multiple roles (assigned to appropriate parties under suitable guidelines) should be involved in assessing changes. Reviews must focus on the integrity and consistency of claims, the software, and its tests. What each reviewer did or did not examine must be recorded, and this information (together with all checks) made available for every change throughout the project lifecycle (see TA-DATA). Details of manual quality management aspects are addressed in TA-METHODOLOGIES.
As a result, all changes must be regression-free (blocking problematic changes until resolved) and aim to exhibit the following properties:
- simple
- atomic
- modular
- understandable
- testable
- maintainable
- sustainable
Practices that enforce these properties help identify and resolve inconsistent changes early in development.
Change control itself must not be subverted, whether accidentally or maliciously. Process documentation, guidance, and automated checks must also be under change control, approved by appropriate parties, and protected with suitable security controls.
To prevent regressions and reduce the rate of bugs and vulnerabilities, consistent dependency updates must be applied and new issues promptly addressed (TA-FIXES). Evidence for each iteration must demonstrate that change control requirements are applied consistently and evolve as improvements are identified, ensuring the process remains repeatable and reproducible. Timeliness must be monitored across detection, resolution, and deployment, with automation and process improvements introduced where delays are found.
Ultimately, the trustable controlled process is the only path to production for the constructed target software.
Evidence
- change management process and configuration artifacts
Confidence scoring
Confidence scoring for TA-UPDATES is based on confidence that we have control over the changes that we make to XYZ, including its configuration and dependencies.
Checklist
- Where are the change and configuration management controls specified?
- Are these controls enforced for all components, tools, data, documentation and configurations?
- Are there any ways in which these controls can be subverted, and have we mitigated them?
- Does change control capture all potential regressions?
- Is change control timely enough?
- Are all guidance and checks understandable and consistently followed?
Expectations
TA-BEHAVIOURS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-EXPECTATIONS | Documentation is provided, specifying what XYZ is expected to do, and what it must not do, and how this is verified. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-BEHAVIOURS_CONTEXT.md

Although it is practically impossible to specify all of the necessary behaviours and required properties for complex software, we must clearly specify the most important of these (e.g. where harm could result if given criteria are not met), and verify that these are correctly provided by XYZ.
Guidance
This assertion is satisfied to the extent that we have:
- Determined which Behaviours are critical for consumers of XYZ and recorded them as Expectations.
- Verified these Behaviours are achieved.
Expectations could be verified by:
- Functional testing for the system.
- Functional soak testing for the system.
- Specifying architecture and verifying its implementation with pre-merge integration testing for components.
- Specifying components and verifying their implementation using pre-merge unit testing.
The number and combination of the above verification strategies will depend on the scale of the project. For example, unit testing is more suitable for the development of a small library than an OS. Similarly, the verification strategy must align with the chosen development methods and be supported by appropriate verification approaches and tools.
Regardless of the chosen strategy, the reasoning behind it must be recorded in a traceable way, linking breakdown and verification methods to the relevant reasoning, abstraction levels, and design partitioning (including system interfaces with users and hardware, or other system boundaries).
Finally, the resulting system must be validated, with the foundation of validation being a working system that has appropriately considered calibration targets such as capacity, scalability, response time, latency, and throughput, where applicable. Without this, specification and verification efforts cannot be considered sufficient.
Evidence
- List of Expectations
- Argument of sufficiency for break-down of expected behaviour for all Expectations
- Validation and verification of expected behaviour
Confidence scoring
Confidence scoring for TA-BEHAVIOURS is based on our confidence that the list of Expectations is accurate and complete, that Expectations are verified by tests, and that the resulting system and tests are validated by appropriate strategies.
Checklist
- How has the list of Expectations varied over time?
- How confident can we be that this list is comprehensive?
- Could some participants have incentives to manipulate information?
- Could there be whole categories of Expectations still undiscovered?
- Can we identify Expectations that have been understood but not specified?
- Can we identify some new Expectations, right now?
- How confident can we be that this list covers all critical requirements?
- How comprehensive is the list of tests?
- Is every Expectation covered by at least one implemented test?
- Are there any Expectations where we believe more coverage would help?
- How do dependencies affect Expectations, and are their properties verifiable?
- Are input analysis findings from components, tools, and data considered in relation to Expectations?
TA-MISBEHAVIOURS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-EXPECTATIONS | Documentation is provided, specifying what XYZ is expected to do, and what it must not do, and how this is verified. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-MISBEHAVIOURS_CONTEXT.md

The goal of TA-MISBEHAVIOURS is to force engineers to think critically about their work. This means understanding and mitigating as many as possible of the situations that cause the software to deviate from Expected Behaviours. This is not limited to the contents of the final binary.
Guidance
This assertion is satisfied to the extent that we can:
- Show we have identified all of the ways in which XYZ could deviate from its Expected Behaviours.
- Demonstrate that mitigations have been specified, verified and validated for all Misbehaviours.
Once Expected Behaviours have been identified in TA-BEHAVIOURS, there are at least four classes of Misbehaviour that can be identified:
- Reachable vulnerable system states that cause deviations from Expected Behaviour. These can be identified by stress testing, failures in functional and soak testing in TA-BEHAVIOURS and reporting in TA-FIXES. Long run trends in both test and production data should also be used to identify these states.
- Potentially unreachable vulnerable system states that could lead to deviations from Expected Behaviour. These can be identified using risk/hazard analysis techniques including HAZOP, FMEDA and STPA.
- Vulnerabilities in the development process that could lead to deviations from Expected Behaviour. This includes those that occur as a result of misuse, negligence or malicious intent. These can be identified by incident investigation, random sampling of process artifacts and STPA of processes.
- Configurations in integrating projects (including the computer or embedded system that is the final product) that could lead to deviations from Expected Behaviour.
Identified Misbehaviours must be mitigated. Mitigations include patching, re-designing components, re-designing architectures, removing components, testing, static analysis etc. They explicitly do not include the use of AWIs to return to a known-good state. These are treated specifically and in detail in TA-INDICATORS.
Mitigations could be verified by:
- Specifying and repeatedly executing false negative tests to confirm that functional tests detect known classes of misbehaviour.
- Specifying fault induction tests or stress tests to demonstrate that the system continues providing the Expected Behaviour after entering a vulnerable system state.
- Performing statistical analysis of test data, including using statistical path coverage to demonstrate that the vulnerable system state is never reached.
- Conducting fault injections in development processes to demonstrate that vulnerabilities cannot be exploited (knowingly or otherwise) to affect either output binaries or our analysis of it, whether this is by manipulating the source code, build environment, test cases or any other means.
- Stress testing of assumptions of use. That is, confirming assumptions of use are actually consistent with the system and its Expected Behaviours by intentionally misinterpreting or liberally interpreting them in a test environment. For example, we could consider testing XYZ on different pieces of hardware that satisfy its assumptions of use.
Remember that a Misbehaviour is anything that could lead to a deviation from Expected Behaviour. The specific technologies in and applications of XYZ should always be considered in addition to the guidance above.
At the core, a faulty design is inherently difficult to mitigate. The first priority, therefore, is to ensure a fault-tolerant and fault-avoidant design that minimises fault impact and maximises fault control across all modes and states. All design considerations should be traceable to analyses at the correct abstraction level, with appropriate partitioning and scoping, which address prevalent aspects in complex systems, such as:
- Spatial constraints (e.g., memory corruption)
- Temporal constraints (e.g., timing violations)
- Concurrency constraints (e.g., interference)
- Computational constraints (e.g., precision limits)
- Performance constraints (e.g., latency spikes under load)
- Environmental constraints (e.g., hardware non-determinism)
- Usability constraints (e.g., human interaction errors)
Finally, each new Expectation, whether a required behaviour or a misbehaviour mitigation, introduces the potential for unexpected emergent properties, highlighting the importance of simple, understandable designs that build on established and reusable solutions.
Suggested evidence
- List of identified Misbehaviours
- List of Expectations for mitigations addressing identified Misbehaviours
- Risk analysis
- Test analysis, including:
  - False negative tests
  - Exception handling tests
  - Stress tests
  - Soak tests
Confidence scoring
Confidence scoring for TA-MISBEHAVIOURS is based on confidence that identification and coverage of misbehaviours by tests is complete when considered against the list of Expectations.
Checklist
- How has the list of misbehaviours varied over time?
- How confident can we be that this list is comprehensive?
- How well do the misbehaviours map to the expectations?
- Could some participants have incentives to manipulate information?
- Could there be whole categories of misbehaviours still undiscovered?
- Can we identify misbehaviours that have been understood but not specified?
- Can we identify some new misbehaviours, right now?
- Is every misbehaviour represented by at least one fault induction test?
- Are fault inductions used to demonstrate that tests which usually pass can and do fail appropriately?
- Are all the fault induction results actually collected?
- Are the results evaluated?
- Do input analysis findings on verifiable tool or component claims and features identify additional misbehaviours or support existing mitigations?
TA-CONSTRAINTS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-EXPECTATIONS | Documentation is provided, specifying what XYZ is expected to do, and what it must not do, and how this is verified. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-CONSTRAINTS_CONTEXT.md
Guidance
Constraints on reuse, reconfiguration, modification, and deployment are specified to enhance the trustability of outputs. To ensure clarity, boundaries on what the output cannot do - especially where common domain assumptions may not hold - must be explicitly documented. These constraints are distinct from misbehaviour mitigations; instead, they define the context within which the system is designed to operate, including all modes and environmental considerations. This upfront documentation clarifies intended use, highlights known limitations, and prevents misinterpretation.
These constraints, categorised into explicit limitations and assumptions of use, guide both stakeholders and users (integrators, maintainers, operators, and end-users). They define the intended scope and provide a clear interface for how upstream and downstream systems can integrate, modify, install, reuse, or reconfigure to achieve the desired output. The documentation must also specify the contexts in which the integrity of existing Statements is preserved and whether reimplementation is required, considering device maintenance assumptions, including software updates and vulnerability mitigation.
Crucially, these limitations are not unresolved defects from triage decisions but deliberate exclusions based on design choices. Each omission should be supported by a clear rationale (linked to relevant Expectations and analyses with the appropriate architectural and abstraction levels) to ensure transparency for future scope expansion and to guide both upstream and downstream modifications.
To remain effective in practice, constraints must consider user-friendliness in relation to associated Misbehaviours (TA-MISBEHAVIOURS) and AWIs (TA-INDICATORS):
- Include mechanisms to prevent misuse (e.g., protecting runtime parameters from corruption or unauthorised modification during both development and operation), explicitly linking them to relevant Misbehaviours and their analyses (as defined in TA-MISBEHAVIOURS).
- Present constraint-related data with emphasis on availability, clarity, and transparent communication of defined safe states, along with the mechanisms that transition the system into those states, ensuring they are connected to the relevant AWIs (as defined in TA-INDICATORS).
Finally, the documentation must establish and promote a clear process for reporting bugs, issues, and requests.
Suggested evidence
- Installation manuals with worked examples
- Configuration manuals with worked examples
- Specification documentation with a clearly defined scope
- User guides detailing limitations in interfaces designed for expandability or modularity
- Documented strategies used by external users to address constraints and work with existing Statements
Confidence scoring
The reliability of these constraints should be assessed based on the absence of contradictions and obvious pitfalls within the defined Statements.
Checklist
- Are the constraints grounded in realistic expectations, backed by real-world examples?
- Do they effectively guide downstream consumers in expanding upon existing Statements?
- Do they provide clear guidance for upstreams on reusing components with well-defined claims?
- Are any Statements explicitly designated as not reusable or adaptable?
- Are there worked examples from downstream or upstream users demonstrating these constraints in practice?
- Have there been any documented misunderstandings from users, and are these visibly resolved?
- Do external users actively keep up with updates, and are they properly notified of any changes?
TA-INDICATORS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-EXPECTATIONS | Documentation is provided, specifying what XYZ is expected to do, and what it must not do, and how this is verified. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-INDICATORS_CONTEXT.md
Not all deviations from Expected Behaviour can be associated with a specific condition. Therefore, we must have a strategy for managing deviations that arise from unknown system states, process vulnerabilities or configurations.
This is the role of Advance Warning Indicators (AWIs). These are specific metrics which correlate with deviations from Expected Behaviour and can be monitored in real time. The system should return to a defined known-good state when AWIs exceed defined tolerances.
Guidance
This assertion is met to the extent that:
- We have identified indicators that are strongly correlated with observed deviations from Expected Behaviour in testing and/or production.
- The system returns to a defined known-good state when AWIs exceed defined tolerances.
- The mechanism for returning to the known-good state is verified.
- The selection of Advance Warning Indicators is validated against the set of possible deviations from Expected Behaviour.
Note, the set of possible deviations from Expected Behaviour is not the same as the set of Misbehaviours identified in TA-MISBEHAVIOURS, as it includes deviations due to unknown causes.
Deviations are easily determined by negating recorded Expectations. Potential AWIs could be identified using source code analysis, risk analysis or incident reports. A set of AWIs to be used in production should be identified by monitoring candidate signals in all tests (functional, soak, stress) and measuring correlation with deviations.
Telematics, diagnostics, or manual proof testing are of little value without mitigation. As such, AWI monitoring and mitigation should be automatic, traceable back to analysis, and formally recorded to ensure information from previously unidentified misbehaviours is captured in a structured way.
The known-good state should be chosen with regard to the system's intended consumers and/or context. Canonical examples are mechanisms like reboots, resets, relaunches and restarts. The mechanism for returning to a known-good state can be verified using fault induction tests. Incidences of AWIs triggering a return to the known-good state in either testing or production should be considered as a Misbehaviour in TA-MISBEHAVIOURS. Relying on AWIs alone is not an acceptable mitigation strategy. TA-MISBEHAVIOURS and TA-INDICATORS are treated separately for this reason.
The selection of AWIs can be validated by analysing failure data. For instance, a high number of instances of deviations with all AWIs in tolerance implies the set of AWIs is incorrect, or the tolerance is too lax.
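As a minimal sketch of this kind of validation, the following Python fragment computes the fraction of recorded deviation events for which every AWI reading was within tolerance; the indicator names and tolerances are illustrative assumptions, not part of the TSF.

```python
# Each deviation event carries the AWI readings sampled when the
# deviation occurred. Indicator names and tolerances are illustrative.
TOLERANCES = {"queue_depth": 1000, "p99_latency_ms": 250, "retry_rate": 0.05}

def awis_all_in_tolerance(readings: dict) -> bool:
    return all(readings[name] <= limit for name, limit in TOLERANCES.items())

def unexplained_fraction(deviation_events: list) -> float:
    """Fraction of deviations preceded by no out-of-tolerance AWI.

    A high value suggests the AWI set is incomplete or the tolerances
    are too lax, per the guidance above.
    """
    if not deviation_events:
        return 0.0
    unexplained = sum(1 for e in deviation_events if awis_all_in_tolerance(e))
    return unexplained / len(deviation_events)
```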
Evidence
- Risk analyses
- List of advance warning indicators
- List of Expectations for monitoring mechanisms
- List of implemented monitoring mechanisms
- List of identified misbehaviours without advance warning indicators
- List of advance warning indicators without implemented monitoring mechanisms
- Advance warning signal data as time series (see TA-DATA)
Confidence scoring
Confidence scoring for TA-INDICATORS is based on confidence that the list of indicators is comprehensive / complete, that the indicators are useful, and that monitoring mechanisms have been implemented to collect the required data.
Checklist
- How appropriate/thorough are the analyses that led to the indicators?
- How confident can we be that the list of indicators is comprehensive?
- Could there be whole categories of warning indicators still missing?
- How has the list of advance warning indicators varied over time?
- How confident are we that the indicators are leading/predictive?
- Are there misbehaviours that have no advance warning indicators?
- Can we collect data for all indicators?
- Are the monitoring mechanisms used included in our Trustable scope?
- Are there gaps or trends in the data?
- If there are gaps or trends, are they analysed and addressed?
- Is the data actually predictive/useful?
- Are indicators from code, component, tool, or data inspections taken into consideration?
Results
TA-ANALYSIS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-RESULTS | Evidence is provided to demonstrate that XYZ does what it is supposed to do, and does not do what it must not do. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-ANALYSIS_CONTEXT.md
Guidance
This assertion is satisfied to the extent that test data, and data collected from monitoring of deployed versions of XYZ, has been analysed, and the results used to inform the refinement of Expectations and risk analysis.
The analysis must be performed with sufficient precision to confirm that:
- all Expectations (TA-BEHAVIOURS) are met
- all Misbehaviours (TA-MISBEHAVIOURS) are detected or mitigated
- all advance warning indicators (TA-INDICATORS) are monitored
- failure rates (calculated directly or inferred statistically) are within acceptable tolerance (see the sketch after this list)
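As an illustration of the failure-rate check in the last item above, the following Python sketch compares an observed failure count against a tolerance using a one-sided 95% upper bound. The normal approximation is a simplifying assumption; for small counts an exact binomial or Wilson interval would be more appropriate.

```python
import math

def failure_rate_within_tolerance(failures: int, runs: int, tolerance: float) -> bool:
    # One-sided 95% upper bound on the true failure rate (normal approximation).
    p = failures / runs
    upper = p + 1.645 * math.sqrt(p * (1 - p) / runs)
    return upper <= tolerance

# Example: 3 failures in 10,000 scheduled runs against a 0.1% budget.
print(failure_rate_within_tolerance(3, 10_000, 0.001))  # True
```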
When tests reveal Misbehaviours missing from our analysis (TA-ANALYSIS), we update our Expectations (TA-BEHAVIOURS, TA-MISBEHAVIOURS). Guided by confidence evaluations (TA-CONFIDENCE), we refine and repeat the analysis as needed. Analysis results also inform confidence evaluations, allowing automatic generation through statistical modelling and defining Key Performance Indicators (KPIs) for consistent use across the TSF.
For increased confidence in the analysis specification and results, they should be evaluated in terms of their reliability, relevance, and understandability.
- Reliability: The analysis methods must be verified against both known good and bad data to ensure sufficient detection of false negatives and false positives. Accuracy degradation across methods should be tracked and aggregated, making outcomes more easily verifiable and providing visibility into how changes to the system under test or to the analysis mechanisms affect the results.
- Relevance: The results must account for hardware and hardware/software interactions. Calibration should address capacity, scalability, response time, latency, and throughput where applicable. To further increase confidence in estimated failure rates, the analysis should also cover testing sufficiency (with statistical methods where appropriate), cascading failures including sequencing and concurrency, bug analysis, and comparison against expected results and variability. The analysis should be automated and exercised repeatedly for timely feedback.
- Understandability: Both methods and results should be mapped to other analyses performed on the system (linked to TT-EXPECTATIONS) to ensure alignment with scope, abstraction levels, and partitioning, thereby guiding prioritisation. Effectiveness also depends on user-friendliness and presentation (involving semi-formal structured forms, supported by diagrams and figures with clear legends).
To gain increased confidence, test results should be shown to be reproducible. Even with non-deterministic software, representative test setups must be shown to produce reproducible results within a defined threshold, as specified by TT-EXPECTATIONS. Reproducible test results also support verification of toolchain updates (together with other measures in TA-FIXES), by confirming that test results remain unchanged when no changes are intended.
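One possible shape for such a reproducibility check is sketched below; the metric names and the relative threshold are illustrative assumptions standing in for whatever TT-EXPECTATIONS specifies.

```python
def reproducible(run_a: dict, run_b: dict, rel_threshold: float = 0.02) -> bool:
    """True if every shared metric differs by at most rel_threshold."""
    for name in run_a.keys() & run_b.keys():
        a, b = run_a[name], run_b[name]
        if abs(a - b) > rel_threshold * max(abs(a), abs(b), 1e-12):
            return False
    return True

# Example: two runs of a non-deterministic performance test, ~1.5% apart.
print(reproducible({"p99_latency_ms": 101.0}, {"p99_latency_ms": 102.5}))  # True
```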
Evidence
- Analysis of test data, including thresholds in relation to appropriate statistical properties.
- Analysis of failures
- Analysis of spikes and trends
- Validation of analysis methods used
Confidence scoring
Confidence scoring for TA-ANALYSIS is based on KPIs that may indicate problems in development, test, or production.
Checklist
- What fraction of Expectations are covered by the test data?
- What fraction of Misbehaviours are covered by the monitored indicator data?
- How confident are we that the indicator data are accurate and timely?
- How reliable is the monitoring process?
- How well does the production data correlate with our test data?
- Are we publishing our data analysis?
- Are we comparing and analysing production data vs test?
- Are our results getting better, or worse?
- Are we addressing spikes/regressions?
- Do we have sensible/appropriate target failure rates?
- Do we need to check the targets?
- Are we achieving the targets?
- Are all underlying assumptions and target conditions for the analysis specified?
- Have the underlying assumptions been verified using known good data?
- Has the Misbehaviour identification process been verified using known bad data?
- Are results shown to be reproducible?
TA-DATA | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-RESULTS | Evidence is provided to demonstrate that XYZ does what it is supposed to do, and does not do what it must not do. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-DATA_CONTEXT.md
Guidance
This assertion is satisfied if results from all tests and monitored deployments are captured accurately, ensuring:
- Sufficient precision for meaningful analysis
- Enough contextual information to reproduce the setup (e.g., runner ID, software version SHA), though not necessarily the exact results
Monitored deployments run in both production and development, validating monitoring mechanisms across environments and ensuring comparable results. Collecting and retaining all data that support project claims (together with traceability to reasoning and specifications, and including both established and experimental indicators as well as test data from all environments) preserves evidence for selecting appropriate measures and enables historical analysis.
To avoid misinterpretation, all data storage mechanisms and locations are documented, together with long-term storage strategies, so analyses can be reliably reproduced. How this data is made accessible is assessed as part of TA-ITERATIONS.
Storage strategies should account for foreseeable malicious activities and privacy considerations when handling sensitive data, including how the data is managed during transit and at rest, and whether it can be accessed in plaintext or only through appropriate tools (also considered for TA-INPUTS and TA-TESTS).
Appropriate storage strategies safeguard availability across the product lifecycle, with emphasis on release-related data, and account for decommissioning, infrastructure teardown, and post-project backups.
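As a minimal sketch of a result record carrying the contextual fields described above (runner ID, software version SHA), consider the following Python dataclass; the field names are illustrative assumptions, not a prescribed TSF schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class TestResultRecord:
    test_id: str          # links to the test specification
    outcome: str          # e.g. "pass", "fail", "error"
    sut_version_sha: str  # version of the system under test
    spec_ref: str         # specification reference, for traceability
    runner_id: str        # which runner/environment executed the test
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metrics: dict = field(default_factory=dict)  # optional performance data

record = TestResultRecord("T-0042", "pass", "a1b2c3d", "EXP-7", "runner-03")
```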
Evidence
- Time-stamped and traceable result records for each test execution, linked to associated system under test version and specification references.
- List of monitored indicators, linked to associated specification version references.
- Time-stamped and traceable test-derived data for each indicator, linked to associated system under test version and indicator specifications references.
- List of monitored deployments, linked to associated version and configuration references.
- Time-stamped and traceable production data for each indicator, linked to associated deployment metadata and specification references.
Confidence scoring
Confidence scoring for TA-DATA quantifies the completeness of test results (including pass/fail and performance) and the availability of data from all monitored deployments.
Checklist
- Is all test data stored with long-term accessibility?
- Is all monitoring data stored with long-term accessibility?
- Are extensible data models implemented?
- Is sensitive data handled correctly (broadcast, stored, discarded, or anonymised) with appropriate encryption and redundancy?
- Are proper backup mechanisms in place?
- Are storage and backup limits tested?
- Are all data changes traceable?
- Are concurrent changes correctly managed and resolved?
- Is data accessible only to intended parties?
- Are any subsets of our data being published?
TA-VALIDATION | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-RESULTS | Evidence is provided to demonstrate that XYZ does what it is supposed to do, and does not do what it must not do. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-VALIDATION_CONTEXT.md
Guidance
This assertion is satisfied when tests demonstrate that the features specified to meet project Expectations (TT-EXPECTATIONS) are present and function as intended. These tests run repeatedly in a controlled environment (TA-TESTS) on a defined schedule (e.g., daily, per change, or per candidate release of XYZ).
Confidence grows when tests not only verify Expectations but also validate (continuously) that they meet stakeholder and user needs. Robust validation depends on three aspects:
- TA-VALIDATION – a strategy that produces representative and stressing data.
- TA-DATA – appropriate handling of collected data.
- TA-ANALYSIS – analysis methods that remain dependable as the project evolves.
This structure enables iterative convergence toward required behaviours, even when early validation results are unsatisfactory.
A strategy to generate appropriate data addresses quantity, quality, and selection:
- Selection: Testing remains exploratory, combining monitoring with verified and new indicators (supporting TA-INDICATORS). Coverage spans input, design, and output analysis with traceable specifications and results (considering TA-BEHAVIOURS). Tests also support calibration of capacity, scalability, response time, latency, and throughput, executed in targeted conditions and under stress (e.g., equivalence class and boundary-value testing; a selection sketch follows this list).
- Quantity: Automation scheduling provides sufficient repetition and covers diverse environments (e.g., multiple hardware platforms). Failures block merge requests, with pre- and post-merge tests giving fast feedback. Adequacy of data is assessed through TA-ANALYSIS.
- Quality: Test suites include fault induction (considering TA-MISBEHAVIOURS) and checks that good data yields good results while bad data yields bad results.
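The equivalence-class and boundary-value selection mentioned under Selection can be illustrated with a short sketch; the valid range is an illustrative assumption.

```python
VALID_RANGE = (1, 100)  # illustrative valid range for a single parameter

def boundary_values(lo: int, hi: int) -> list:
    """One representative per equivalence class, plus the boundaries."""
    return [lo - 1,          # just below the valid range (invalid class)
            lo, lo + 1,      # lower boundary and its neighbour
            (lo + hi) // 2,  # interior representative (valid class)
            hi - 1, hi,      # upper boundary and its neighbour
            hi + 1]          # just above the valid range (invalid class)

print(boundary_values(*VALID_RANGE))  # [0, 1, 2, 50, 99, 100, 101]
```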
Evidence
- Test results from per-change tests
- Test results from scheduled tests as time series
Confidence scoring
Confidence scoring for TA-VALIDATION is based on verification that we have results for all expected tests (both pass / fail and performance).
Checklist
- Is the selection of tests correct?
- Are the tests executed enough times?
- How confident are we that all test results are being captured?
- Can we look at any individual test result, and establish what it relates to?
- Can we trace from any test result to the expectation it relates to?
- Can we identify precisely which environment (software and hardware) was used?
- How many pass/fail results would be expected, based on the scheduled tests?
- Do we have all of the expected results?
- Do we have time-series data for all of those results?
- If there are any gaps, do we understand why?
- Are the test validation strategies credible and appropriate?
- What proportion of the implemented tests are validated?
- Have the tests been verified using known good and bad data?
Confidence
TA-METHODOLOGIES | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-CONFIDENCE | Confidence in XYZ is achieved by measuring and analysing behaviour and evidence over time. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-METHODOLOGIES_CONTEXT.md
Guidance
To satisfy this assertion, all manual processes used in the verification of XYZ must be documented, including the methodologies applied, the results for specific aspects and iterations, and evidence that these processes were reviewed against documented criteria.
Most analysis (e.g., data analysis for TA-ANALYSIS) should be automated to enable continuous feedback. However, the quality of any remaining manual processes (whether performed by first parties or by external third parties) must be considered, along with how they are documented and reviewed. Consideration should also be given to how manual processes may affect the identification and addressing of Misbehaviours (TA-MISBEHAVIOURS).
Assignment of responsibilities for any manual work must follow a documented process that verifies competence and grants appropriate access, with automation applied where possible. Resulting assigned responsibilities must ensure organisational robustness (e.g., avoidance of conflicts of interest) together with appropriate independent verification and validation. Manual reviews involving source inspections must follow documented guidelines, with exceptions recorded and illustrated through examples. These guidelines should evolve over time and cover:
- coding patterns (e.g., good patterns, anti-patterns, defensive coding)
- structured design practices (e.g., control flow constraints)
- complexity management (e.g., limiting feature creep)
- documentation (e.g., clear, formal figures and diagrams)
- feature subset restrictions (e.g., programming language subsets)
- code of conduct guidelines (e.g., review etiquette, handling disagreements)
Nevertheless, specific coding rules (e.g., memory allocation, typing, concurrency) should be integrated into automatic linting and static analysis tools where appropriate.
All processes and checks must themselves be reviewed to drive continuous improvement following specified guidelines. Any resulting changes from reviews must follow change control, regardless of who initiates them or under what circumstances.
Evidence
- Manual process documentation
- References to methodologies applied as part of these processes
- Results of applying the processes
- Criteria used to confirm that the processes were applied correctly
- Review records for results
Confidence scoring
Confidence scoring for TA-METHODOLOGIES is based on identifying areas where manual processes are needed, assessing the clarity of the proposed processes, analysing the results of their implementation, and evaluating the evidence of effectiveness in comparison to the analysed results.
Checklist
- Are the identified gaps documented clearly to justify using a manual process?
- Are the goals for each process clearly defined?
- Is the sequence of procedures documented in an unambiguous manner?
- Can improvements to the processes be suggested and implemented?
- How frequently are processes changed?
- How are changes to manual processes communicated?
- Are there any exceptions to the processes?
- How is evidence of process adherence recorded?
- How is the effectiveness of the process evaluated?
- Is ongoing training required to follow these processes?
TA-CONFIDENCE | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-CONFIDENCE | Confidence in XYZ is achieved by measuring and analysing behaviour and evidence over time. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
None
References:
- trustable/assertions/TA-CONFIDENCE_CONTEXT.md
Guidance
To quantify confidence, either a subjective assessment or a statistical argument must be presented for each statement and then systematically and repeatably aggregated to assess whether the final deliverable is fit for purpose.
To improve the accuracy of confidence evaluations in reflecting reality, the following steps are necessary:
- Break down high-level claims into smaller, recursive requests.
- Provide automated evaluations whenever possible, and rely on subjective assessments from appropriate parties when automation is not feasible.
- Aggregate confidence scores from evidence nodes.
- Continuously adjust prior confidence measures with new evidence, building on established values.
Any confidence scores, whether tracked manually or statistically, must be based on documented review guidelines that are themselves reviewed and applied by appropriate parties. These guidelines should focus on detecting inconsistencies in the reasoning and evidence linked to related Expectations, and on assessing the relevance of all aspects considered. As a result, the argument structure must reflect the project scope, which in turn should be captured in the set of Expectations and linked to the project's analysis, design considerations, and partitioning. Within this structure, Statements must be ordered or weighted so that their relative importance and supporting reasoning are clear, with iteration scores capturing strengths and weaknesses and guiding decisions.
As subjective assessments are replaced with statistical arguments and confidence scores are refined with new evidence, evaluation accuracy improves. Over time, these scores reveal the project's capability to deliver on its objectives. The process itself should be analysed to determine score maturity, with meta-analysis used to assess long-term trends in sourcing, accumulation, and weighting.
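As a minimal sketch of one possible aggregation algorithm (the checklist below asks whether whatever algorithm is chosen is fit for purpose), the following fragment computes a weighted mean over a tree of Statements. The weighting scheme and example scores are illustrative assumptions; the TSF does not prescribe one.

```python
def aggregate(node: dict) -> float:
    """Weighted mean of child scores; leaf nodes carry their own score."""
    children = node.get("children", [])
    if not children:
        return node["score"]
    total_weight = sum(c.get("weight", 1.0) for c in children)
    return sum(c.get("weight", 1.0) * aggregate(c) for c in children) / total_weight

release = {"children": [
    {"score": 0.9, "weight": 2.0},  # e.g. a well-evidenced Tenet
    {"score": 0.6, "weight": 1.0},  # e.g. a weaker subjective assessment
]}
print(round(aggregate(release), 2))  # 0.8
```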
Evidence
- Confidence scores from other TA items
Confidence scoring
Confidence scoring for TA-CONFIDENCE is based on the quality of the confidence scores given to Statements.
Checklist
- What is the algorithm for combining/comparing the scores?
- How confident are we that this algorithm is fit for purpose?
- What are the trends for each score?
- How well do our scores correlate with external feedback signals?
TRUSTABLE
TRUSTABLE-SOFTWARE | Reviewed: ✔ | Score: 0.0
This release of XYZ is Trustable.
Supported Requests:
None
Supporting Items:
| Item | Summary | Score | Status |
|---|---|---|---|
| TT-PROVENANCE | All inputs (and attestations for claims) for XYZ are provided with known provenance. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TT-CONSTRUCTION | Tools are provided to build XYZ from trusted sources (also provided) with full reproducibility. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TT-CHANGES | XYZ is actively maintained, with regular updates to dependencies, and changes are verified to prevent regressions. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TT-EXPECTATIONS | Documentation is provided, specifying what XYZ is expected to do, and what it must not do, and how this is verified. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TT-RESULTS | Evidence is provided to demonstrate that XYZ does what it is supposed to do, and does not do what it must not do. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TT-CONFIDENCE | Confidence in XYZ is achieved by measuring and analysing behaviour and evidence over time. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
References:
None
TT
TT-PROVENANCE | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TRUSTABLE-SOFTWARE | This release of XYZ is Trustable. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
| Item | Summary | Score | Status |
|---|---|---|---|
| TA-SUPPLY_CHAIN | All sources for XYZ and tools are mirrored in our controlled environment | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-INPUTS | All inputs to XYZ are assessed, to identify potential risks and issues | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
References:
- trustable/tenets/TT-PROVENANCE_CONTEXT.md
Guidance
Anything that can affect the output of the XYZ project is considered to be an input. This will include:
- Software components used to implement specific features
- Software tools used for construction and verification
- Infrastructure that supports development and release processes
Ideally we want all XYZ contributors to be expert, motivated, reliable, transparent and ethical. Unfortunately this is not always achievable in practice:
- Given the scale, complexity and evolution of modern software systems it is impossible for engineers to be expert in all topics.
- Even the most competent engineers have bad days.
- Many engineers are unable to share information due to commercial secrecy agreements.
- Individuals and teams may be motivated or manipulated by external pressures, e.g. money and politics.
We can and should, however, consider who produced XYZ and its components, their motivations and practices, the assertions they make, supporting evidence for these assertions, and feedback from users of XYZ if available.
Similarly, we want existing software used to create XYZ to be well documented, actively maintained, thoroughly tested, bug-free and well suited to its use in XYZ. In practice, we will rarely have all of this, but we can at least evaluate the software components of XYZ, and the tools used to construct it.
TT-CONSTRUCTION | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TRUSTABLE-SOFTWARE | This release of XYZ is Trustable. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
| Item | Summary | Score | Status |
|---|---|---|---|
| TA-RELEASES | Construction of XYZ releases is fully repeatable and the results are fully reproducible, with any exceptions documented and justified. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-TESTS | All tests for XYZ, and its build and test environments, are constructed from controlled/mirrored sources and are reproducible, with any exceptions documented | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-ITERATIONS | All constructed iterations of XYZ include source code, build and usage instructions, tests, results, and attestations. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
References:
- trustable/tenets/TT-CONSTRUCTION_CONTEXT.md
Guidance
Where possible we prefer to build, configure and install XYZ from source code, because this reduces (but does not eliminate) the possibility of supply chain tampering.
When constructing XYZ we aspire to a set of best practices including:
- reproducible builds:
  - construction from a given set of input source files and build instructions leads to a specific fileset
  - re-running the build leads to exactly the same fileset, bit-for-bit
- ensuring that all XYZ dependencies are known and controlled (no reliance on external/internet resources, or unique/golden/blessed build server); and
- automated build, configuration and deployment of XYZ based on declarative instructions, kept in version control (e.g. no engineer laptop in the loop for production releases)
Some of these constraints may be relaxed during XYZ development/engineering phases, but they must all be fully applied for production releases. Note that when we receive only binaries, without source code, we must rely much more heavily on Provenance; who supplied the binaries, and how can we trust their agenda, processes, timescales and deliveries?
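A minimal sketch of a bit-for-bit reproducibility check, assuming two independently produced build trees on disk, might look like this; the directory names are illustrative.

```python
import hashlib
from pathlib import Path

def fileset_digest(root: Path) -> str:
    """Stable digest over the relative paths and contents of a build tree."""
    h = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(str(path.relative_to(root)).encode())
        h.update(path.read_bytes())
    return h.hexdigest()

assert fileset_digest(Path("build-a")) == fileset_digest(Path("build-b")), \
    "builds are not bit-for-bit reproducible"
```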
TT-CHANGES | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TRUSTABLE-SOFTWARE | This release of XYZ is Trustable. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
| Item | Summary | Score | Status |
|---|---|---|---|
| TA-FIXES | Known bugs or misbehaviours are analysed and triaged, and critical fixes or mitigations are implemented or applied. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-UPDATES | XYZ components, configurations and tools are updated under specified change and configuration management controls. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
References:
- trustable/tenets/TT-CHANGES_CONTEXT.md
Guidance
We expect that XYZ will need to be modified many times during its useful/production lifetime, and therefore we need to be sure that we can make changes without breaking it. In practice this means being able to deal with updates to dependencies and tools, as well as updates to XYZ itself.
Note that this implies that we need to be able to:
- verify that updated XYZ still satisfies its expectations (see below), and
- understand the behaviour of upstream/suppliers in delivering updates (e.g. frequency of planned updates, responsiveness for unplanned updates such as security fixes).
We need to consider the maturity of XYZ, since new software is likely to contain more undiscovered faults/bugs and thus require more changes. To support this we need to be able to understand, quantify and analyse changes made to XYZ (and its dependencies) on an ongoing basis, and to assess the XYZ approach to bugs and breaking changes.
We also need to be able to make modifications to any/all third-party components of XYZ and dependencies of XYZ, unless we are completely confident that suppliers/upstream will satisfy our needs throughout XYZ's production lifecycle.
TT-EXPECTATIONS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TRUSTABLE-SOFTWARE | This release of XYZ is Trustable. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
| Item | Summary | Score | Status |
|---|---|---|---|
| TA-BEHAVIOURS | Expected or required behaviours for XYZ are identified, specified, verified and validated based on analysis. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-MISBEHAVIOURS | Prohibited misbehaviours for XYZ are identified, and mitigations are specified, verified and validated based on analysis. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-INDICATORS | Advance warning indicators for misbehaviours are identified, and monitoring mechanisms are specified, verified and validated based on analysis. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-CONSTRAINTS | Constraints on adaptation and deployment of XYZ are specified. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
References:
- trustable/tenets/TT-EXPECTATIONS_CONTEXT.md
Guidance
While most modern software is developed without a formal requirement/specification process, we need to be clear about what we expect from XYZ, communicate these expectations, and verify that they are met.
In most (almost all?) cases, we need to verify our expectations by tests. These tests must be automated and applied for every candidate release of XYZ.
It is not sufficient to demonstrate that the software does what we expect. We also need to analyse the potential risks in our scenario, identify unacceptable and/or dangerous misbehaviours and verify that they are absent, prevented or mitigated.
In most cases it is not sufficient to demonstrate behaviours and mitigations only in a factory/laboratory environment. We also need to establish methods for monitoring critical behaviours and misbehaviours in production, and methods for taking appropriate action based on advance warning indicators of potential misbehaviour.
Consequently, when defining expectations, mitigations, and warning indicators, the scope and its assumptions about environment and usage must be specified and stated explicitly, so users understand the context of the claims.
TT-RESULTS | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TRUSTABLE-SOFTWARE | This release of XYZ is Trustable. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
| Item | Summary | Score | Status |
|---|---|---|---|
| TA-VALIDATION | Tests exercise both stressed and representative conditions, validating behaviour through systematic, scheduled repetition. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-DATA | Test and monitoring data from development and production are appropriately collected and retained. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-ANALYSIS | Collected test and monitoring data for XYZ is analysed using verified methods to validate expected behaviours and identify new misbehaviours. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
References:
- trustable/tenets/TT-RESULTS_CONTEXT.md
Guidance
We need to perform tests to verify expected behaviours and properties of XYZ in advance of every candidate release.
We also need to validate these tests, to confirm that they do actually test for the expected behaviours or properties, and that test failures are properly detected. Usually this can be done by inducing software faults to exercise the tests and checking that the results record the expected failure.
Similarly we need to verify that prevention measures and mitigations continue to work for each XYZ candidate release, and in production.
All these validated test results and monitored advance warning signal data from production must be collected and analysed on an ongoing basis. We need to notice when results or trends in advance warning signals change, and react appropriately.
TT-CONFIDENCE | Reviewed: ✔ | Score: 0.0
Supported Requests:
| Item | Summary | Score | Status |
|---|---|---|---|
| TRUSTABLE-SOFTWARE | This release of XYZ is Trustable. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
Supporting Items:
| Item | Summary | Score | Status |
|---|---|---|---|
| TA-METHODOLOGIES | Manual methodologies applied for XYZ by contributors, and their results, are managed according to specified objectives. | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
| TA-CONFIDENCE | Confidence in XYZ is measured based on results of analysis | 0.00 | ✔ Item Reviewed ✔ Link Reviewed |
References:
- trustable/tenets/TT-CONFIDENCE_CONTEXT.md
Guidance
Our overall objective is to deliver releases of XYZ that meet our expectations and do not cause harm. By collecting and assessing evidence for all of the factors above, we aim to assess (ideally measure) confidence in each release candidate, to support go/no-go decision-making. In assessing confidence we need to consider various categories of evidence, including:
- subjective (e.g. provenance, reviews and approvals)
- binary (e.g. test pass/fail)
- stochastic (e.g. scheduling test results over time)
- empirical (e.g. advance warning signal monitoring data from production deployments)
Extensions
Applications of TSF
The Eclipse Trustable Software Framework (TSF) is an objective-based framework, which can be applied in a number of ways. Aspects of the Tenets may be satisfied by the existing processes or practices applied to a project, while others may require these to be extended, or may identify new processes that could be added.
This section documents some specific applications of the framework, both as a generic template to be applied to projects, and as an illustration of how the TSF assertions can be extended.
- RAFIA is a process model for critical software projects, which applies the TT-EXPECTATIONS and TT-RESULTS Tenets.
- STPA is a methodology that can be applied as part of RAFIA's Risk Analysis process to satisfy the objectives of TA-MISBEHAVIOURS.
- Managing statements is a process description for applying the Trustable methodology to ensure independence and robustness in Statements tracked with TSF, satisfying aspects of TA-METHODOLOGIES.
RAFIA
RAFIA is a process model for critical software projects. It is a specific application of the Eclipse Trustable Software Framework (TSF), focussing on two of the Trustable tenets: TT-EXPECTATIONS and TT-RESULTS.
The name RAFIA is an acronym for Risk Analysis, Fault Induction and Automation, which are the key activities that differentiate it from other software engineering process models.
The following diagram shows how these activities fit into an iterative software development and maintenance workflow, and how they all relate to the TSF concept of Misbehaviours.
Refer to the sections linked above for more information.
The rounded boxes represent the high-level stages in a development workflow, while the coloured rectangles represent specific activities within these stages that are involved in RAFIA, with colours indicating different types of activity.
Solid curved lines show the main flow of information and artifacts. Dotted lines show how the outputs of one activity may provide inputs for another. Dashed lines with open arrowheads indicate feedback from one stage to another, such as bug reports identifying a problem or a gap in an existing artifact.
As a result, RAFIA processes can be used with TSF to demonstrate how desired complex system behaviours and misbehaviours are addressed, supporting Validation and complying with safety-related, cybersecurity-related, and other key requirements like high availability. This approach relies on the stochastic nature of modern software systems, which run on multiple multi-processor hardware platforms, to collect the rich data required.
Risk Analysis
Risk Analysis describes a set of processes that are used to document and verify both what software is expected to do, and what it must not do, by focussing on outcomes that we specifically wish to avoid or prevent.
There are a number of established methodologies and approaches to this, many of which focus on the potential faults or failure modes of a system (including software systems), and how the system or a given component either prevents these, or mitigates their effects.
The RAFIA approach was developed using System Theoretic Process Analysis (STPA), a methodology developed at MIT, and is strongly informed by this technique.
Hazard analysis
The first objective of Risk Analysis is hazard analysis: identifying and characterising Misbehaviours, and classifying them by reference to their negative outcomes.
Detailed procedures and guidance for accomplishing this using STPA are provided in the STPA section, but its objectives can be summarised as follows:
- Describe a system or subsystem that incorporates the software, which might be a physical or a software system, or a discrete part of a larger system
- Identify losses (outcomes that are unacceptable for the system's stakeholders) and hazards (system-level conditions that can lead to these losses)
- Specify a hierarchical control structure, which describes the functionality of the system in terms of its elements (notably controllers and controlled processes) and interactions between them (notably control actions and feedback)
- Analyse this structure to identify unsafe control actions (UCA) (interactions between a controller and a controlled process that may result in a hazard)
- Identify causal scenarios (factors that can lead to unsafe control actions, or directly to hazards)
- Devise and specify constraints (Statements about the software or a system that must be true in order to avoid a given hazard, UCA or causal scenario)
Beyond the basics, other STPA variants expand the analysis with domain-specific knowledge, and supporting analyses (such as FTA) are often needed to incorporate derived test results. This serves as a starting point from which the analysis can be further refined. Whatever processes are followed, appropriate review guidelines must be established to meet project-specific needs.
Note
If you wish to use a methodology other than STPA to apply RAFIA, you should first familiarise yourself with how STPA approaches hazard analysis, to ensure that your selected methodology fulfils these objectives.
Traceability
To document how the results of hazard analysis inform the design, implementation and verification of our software, we use the model described by the TSF.
With STPA, for example, the losses, hazards and UCAs documented in the analysis represent a set of prohibited Misbehaviours, which may be documented as Expectations. The derived set of constraints are documented as Assertions, which specify how the risks associated with these misbehaviours are managed for the software, or in a given system. Other risk analysis techniques can provide inputs in a similar way.
This set of Assertions, together with Assertions developed through analysis of other software or system expectations, may then inform (or be mapped to existing) test specifications and related fault inductions, which provide Evidence that the identified risks are indeed managed as asserted.
The Misbehaviours identified by hazard analysis are also a valuable source of scenarios that need to be tested, and fault inductions that can be used to verify the tests, or the implemented mitigations.
If all top-level objectives are tracked as Statements (Expectations, Assertions, Evidence) with forward and backward traceability, RAFIA establishes a verification-driven workflow with analysis-led traceability:
- Expectations must be supported by sufficient Evidence, with progress tracked through reviewed confidence measurements.
- The workflow requires analysis-led traceability linking analysis to Statements about system objectives, architecture, design, and verification and validation outcomes.
Consequently, changes to any Statement require re-evaluating the associated analysis links that connect them.
Ensuring that knowledge remains in sync across design and evaluation cycles is addressed in Automation of information gathering and presentation.
Risk Evaluation
The STPA methodology does not address another important aspect of Risk Analysis: evaluating the relative importance or criticality of the Hazards and/or Misbehaviours that have been identified.
This can be partially addressed using STPA's concept of Losses, which allow us to categorise negative outcomes that are unacceptable to stakeholders, and prioritise the necessary remedies or mitigations on this basis. However, this only covers one aspect of risk evaluation: determining the severity of a risk.
Risk evaluation of an identified hazard needs to consider at least two things:
- The severity, impact or consequences of the hazard, in terms of its potential adverse effects
- The likelihood or frequency of the hazard, in terms of the probability that it will occur in a given timeframe
Other factors to be considered include:
- controllability: If the hazard does occur, how effectively can its adverse effects be mitigated?
- exposure or demand: To qualify likelihood, how often is the entity impacted by the hazard likely to be exposed to it in their use of the system?
Together, these factors are used to categorise and determine the relative importance of Misbehaviours. This is valuable when considering the cost of eliminating or adding mitigations to address hazards against the net effect of this on overall risk. This is particularly important when a mitigation or remedy may have a significant impact on the overall design, since the changes involved may themselves introduce more risk.
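A minimal sketch of such a categorisation, combining severity, likelihood and controllability into a risk class, might look like the following; the scales and thresholds are illustrative assumptions, not part of STPA or the TSF, and real projects calibrate them against stakeholder-defined Losses.

```python
SEVERITY = {"negligible": 1, "marginal": 2, "critical": 3, "catastrophic": 4}
LIKELIHOOD = {"rare": 1, "occasional": 2, "frequent": 3}

def risk_class(severity: str, likelihood: str, controllability: int = 1) -> str:
    """controllability in 1..3: 1 = easily controlled, 3 = uncontrollable."""
    score = SEVERITY[severity] * LIKELIHOOD[likelihood] * controllability
    if score >= 12:
        return "intolerable"
    if score >= 6:
        return "mitigate"
    return "acceptable"

print(risk_class("critical", "occasional"))  # "mitigate" (3 * 2 * 1 = 6)
```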
Evaluation of risk is considered in the Eclipse Trustable Software Framework as part of TA-CONFIDENCE.
Fault Induction
Mitigating known faults and testing for Misbehaviours is accepted as good software development practice. However, if we are to trust software, we must have a method for exposing undetected Misbehaviours, as well as for identifying faults in our implemented mitigations and tests. These flaws in the software and its tests can be exposed by intentionally breaking or stressing the system. In RAFIA, the set of techniques used for this purpose are called fault inductions.
For example:
- Introduce software errors and misconfigurations into the target software
- Introduce workload stresses to starve resources or overload the system
- Cause software processes to terminate abnormally
- Trigger known component Misbehaviours
- Run software processes that deliberately misbehave
These techniques are used to:
- Demonstrate that tests react correctly to Misbehaviours
- Demonstrate that mitigations prevent Misbehaviours, or react as expected
- Expose Misbehaviours that we have not already identified or anticipated
- Catch unintended effects in later system changes, including integration of the software in a new system context
Misbehaviours, and the role of Fault Induction in identifying them, are further discussed in this section.
Automation
The final A in RAFIA stands for 'Automation'. We aim to automate as many steps in the software development process as possible, and collect the results of these steps in association with all changes to the software and its supporting documentation. This ensures that these steps are auditable, and ideally repeatable, and enables us to determine the impact of a change in its given context.
Automated builds
The automated construction of the software, and of the tools and environments that are used in its construction and verification, is covered by the TT-CONSTRUCTION tenet.
Automated tests
Wherever possible, verification of the software should be accomplished using automated tests.
System-level and integration tests are the primary focus of RAFIA, but component tests may add another layer of verification before system tests are executed, or be used to verify aspects of system behaviour that it is not feasible to replicate in the available system testing environments.
Test specification and analysis
This diagram shows how the specification of tests interacts with both Test Data Analysis and Risk Analysis.
The TSF Specification is an organised graph of records documenting:
- Expectations related to the software, how it is intended to be used, and how it should be developed, maintained and verified
- Assertions related to these, specifying criteria that need to be satisfied
- Evidence that exists or is generated (e.g. by an automated test) to support these assertions, or to determine the extent to which they are satisfied for a given iteration of the software
The Test Specification forms part of this evidence. It consists of Items in the TSF graph that reference the associated test scenario files, which are stored in a git repository. It documents:
- The Test Scenarios that are used to verify the behaviour of the software, the expected results of executing them, and any test data that needs to be collected for further analysis (see Test Data Analysis below)
- The Test Contexts within which these scenarios are to be executed. These may be actual target systems, virtualised simulations of these, or some other approximation of the environment within which the software under test will be deployed. A context definition may also include specific configurations of the hardware or software with which the software is integrated.
Test Data Analysis examines the results of executing a test, and processes these to produce further evidence.
- Test Results are the direct outputs of an executed test, which may include both the pass/fail outcomes of tests, and test data collected before, during or after the execution of tests.
- Test Metrics are derived from these results, typically by processing this 'raw' data and extracting specific parts of the data or calculating values derived from it. The result data that needs to be collected as an input to this process should be defined in the test scenarios; the desired extracts or inputs and the algorithms for processing them should be specified in, or referenced by, assertions.
- Historical Statistics are derived by analysing metrics recorded for previous executions of the same test(s); for example, examining how often a specific test metric fails to satisfy its criteria, or examining trends in a given performance metric over time (a minimal sketch follows this list).
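A minimal sketch of one such historical statistic, a rolling failure rate over recent executions of a single test, might look like this; the window size and history are illustrative.

```python
def rolling_failure_rate(outcomes: list, window: int = 50) -> list:
    """outcomes[i] is True if execution i failed; returns one rate per point."""
    rates = []
    for i in range(len(outcomes)):
        recent = outcomes[max(0, i - window + 1): i + 1]
        rates.append(sum(recent) / len(recent))
    return rates

# Example: a test that was stable and has recently started failing.
history = [False] * 45 + [True, False, True, True, False]
print(rolling_failure_rate(history, window=10)[-1])  # 0.3
```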
Test design and implementation
This diagram shows how Test Design and Test Implementation both inform and are informed by the specification and analysis.
The Test Design is derived from the Test Specification for execution in a given context. This defines executable sequences of actions that correspond to the scenarios, for use in specific types of test.
Pre-merge Tests are run for every proposed change to the software and its associated test contexts. These tests should all pass (or fail as expected, e.g. in the case of false negative tests). Any issues with these tests must be resolved before a change is permitted to be merged into the software's mainline. These should consist of:
- Happy Path Tests, which execute a scenario under 'normal' conditions, where the software's intended behaviour is expected to be observed.
- Exception Handling Tests, which execute scenarios where a known failure mode or other exceptional condition exists, resulting in an expected deviation from the 'happy path'.
- False Negative Tests, which deliberately create the conditions in which an existing test scenario should fail (using Fault Induction) to verify that the test both detects the condition (i.e. fails) and reports the expected failure (and not a different failure).
Post-merge Tests are periodically executed for integrated sets of changes (e.g. on the mainline, or a release or product branch). These may consist of:
- Soak Tests, which exercise the software in more extended sequences, or over an extended period of time.
- Stress Tests, which extend soak tests to run in a test context that simulates 'abnormal' system conditions, such as constrained memory or storage capacity, misbehaving system processes, or high-volume external input.
- Performance Tests, which extend soak or stress tests to collect specific measurements of the software behaviour, or to record observed deviations from expected performance criteria.
A Test Implementation may be required to enable test definitions to be executed in a given test context. This might include:
- Stressor implementations, artificially induced conditions intended to provoke Misbehaviours in a stress test (a minimal sketch follows this list).
- Other Fault Induction implementations for use in false negative or exception handling tests, which may include deliberately misconfigured components, simulated dependencies or client processes, and deliberately broken versions of software components or dependencies.
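A minimal sketch of one such stressor, a memory-pressure helper for use inside a stress test, might look like this; the sizing and lifecycle handling are illustrative, and a production stressor would need more careful control.

```python
class MemoryStressor:
    """Holds a configurable amount of memory while a scenario executes."""

    def __init__(self, megabytes: int):
        self._megabytes = megabytes
        self._blocks = []

    def __enter__(self):
        # Allocate and touch the memory so it is actually resident.
        for _ in range(self._megabytes):
            self._blocks.append(bytearray(1024 * 1024))
        return self

    def __exit__(self, *exc):
        self._blocks.clear()  # release the pressure when the scenario ends

with MemoryStressor(megabytes=512):
    pass  # run the test scenario under memory pressure here
```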
See Misbehaviours for more information about how different types of test and analysis of test results can help to inform other activities, and how their results can in turn be used to inform test design.
Automated verification
There are many verification actions beyond (as well as connected with) automated tests, which can and should be automated wherever possible.
The results of an automated build or test might be verified as part of an associated CI job, for example, or fed into a separate verification job. Verification actions here might be as simple as checking that the expected result outputs have been produced and successfully stored for each build or test job, or they might involve complex analytical processes to support Test Data Analysis.
The use of static analysis and linting tools to check source code is also a well-established practice. This class of tools can be extended to include spell-checking for documentation, formatting rules for code or commit messages, and other tasks that take the burden of review away from humans.
As part of linting, checking the status of Statements and their links to the risk analysis helps ensure that impact analysis for affected Statements is visible for every change. This is achieved when the example Statement management process is followed, ensuring that link review is applied systematically.
Manual verification
When some aspect of the software cannot be verified by an automated test, this step must be performed by a human reviewer. Examples include reviewing the implementation of a test to verify that it conforms to a test specification, or confirming that a test specification covers all of the criteria specified in an assertion to which it is linked.
Verification actions performed by humans should always be performed on files under change control, or reports generated by an automated process (such as a build or test), or documents produced by an automated documentation generation process.
Automated document generation
Any documents needed for manual verification should be generated by automated processes, using either input files managed under change control (e.g. the content of a document stored as Markdown in a git repository) or the results of some other automated verification actions (e.g. the results of running an automated build or test).
This also allows easy distribution of additional diagrams or figures provided alongside the risk analysis, associating them with Statements as qualifying information. Reviews should ensure these materials are appropriate and understandable (e.g., include clear legends).
Automating and coordinating these processes alongside all of the other automated actions serves two purposes: it minimises the time human reviewers need to spend trying to find and identify the review inputs for a given software iteration, and it can help to ensure that the set of documents required for a review are reproducible and clearly associated with that iteration.
Such tight automatic coupling of documentation ensures that partitioning and prioritisation of work are guided by linked safety analysis outcomes.
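The following sketch illustrates one way such document generation could be automated, assuming the inputs are Markdown files and test summaries kept under git; all paths and file names are hypothetical:

```python
# Sketch: generate a review document from change-controlled Markdown plus
# test results, stamping it with the exact source revision so the output is
# reproducible and clearly associated with one software iteration.
import subprocess
from pathlib import Path

def git_revision() -> str:
    """Identify the change-controlled revision the document was built from."""
    return subprocess.run(["git", "rev-parse", "HEAD"],
                          capture_output=True, text=True, check=True).stdout.strip()

def generate_review_doc(source_md: Path, test_summary: Path, out: Path) -> None:
    header = f"<!-- generated from revision {git_revision()} -->\n\n"
    evidence = "\n".join("    " + line  # indent results as a literal block
                         for line in test_summary.read_text().splitlines())
    out.write_text(header + source_md.read_text()
                   + "\n\n## Test results\n\n" + evidence + "\n")

generate_review_doc(Path("docs/spec.md"), Path("results/summary.txt"),
                    Path("review/spec-review.md"))
```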
Misbehaviours
The TA-MISBEHAVIOURS assertion from the TSF is a key focus of RAFIA, and many of its activities revolve around the concept of Misbehaviours.
This term is used instead of faults or failure modes because it encompasses unintended or unspecified behaviours of the software (or the system that it is part of) as well as behaviour that is in violation of its specified Expectations.
Identified and confirmed Misbehaviours correspond to Faults: deviations from specified Expectations. However, we also wish to identify Misbehaviours that have not yet been considered, or are not yet adequately specified. Hazard analysis techniques such as STPA specifically encourage us to consider classes of Misbehaviour that may result when a system is behaving exactly as specified, but not as intended.
The following diagram illustrates how RAFIA processes are used to identify, document and make use of Misbehaviours:
The processes indicated are all described in this section, and most of them correspond to activities described in the Risk Analysis and Automation sections; the exception to this is Fault and Defect analysis, which is independent of RAFIA but may nevertheless inform and be informed by these processes.
The results of analysis, including new or refined Expectations and Assertions, test specifications and descriptions of identified Misbehaviours, are captured as part of a TSF Specification in the form of Statements and Artifacts.
Test results, other collected test data and Faults or Defects are managed outside this specification, but may be referenced by Statements (e.g. to automate assessment of test results using validators).
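As an illustration of such a validator, the sketch below checks a JUnit-style XML report to confirm that every test case linked to a Statement is present and passed. The report format and the link convention are assumptions for illustration, not part of the TSF Specification:

```python
# Sketch of a Statement validator over an externally managed test report.
import xml.etree.ElementTree as ET
from pathlib import Path

def statement_tests_passed(report: Path, linked_tests: set[str]) -> bool:
    """Return True if every linked test case appears in the report and passed."""
    seen, failed = set(), set()
    for case in ET.parse(report).getroot().iter("testcase"):
        name = case.get("name", "")
        if name in linked_tests:
            seen.add(name)
            if case.find("failure") is not None or case.find("error") is not None:
                failed.add(name)
    return seen == linked_tests and not failed

ok = statement_tests_passed(Path("results/junit.xml"), {"test_heartbeat"})
print("Statement evidence valid" if ok else "Statement evidence NOT valid")
```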
Identifying and categorising Misbehaviours
As shown in the preceding diagram, all of these activities revolve around the identification of Misbehaviours.
There will always be a set of Misbehaviours that have not been identified, and our Expectations, Assertions, tests and Fault inductions may not cover or address all of the Misbehaviours that we have identified. However, we can identify new Misbehaviours by analysing the results of our existing tests, by developing new tests with the specific aim of exposing unidentified Misbehaviours, and by using the knowledge gained in this way to refine the results of our Risk analysis.
In addition to identifying new Misbehaviours, Risk analysis and Test analysis can help to identify and monitor Advance Warning Indicators, which can then be used to proactively respond to known conditions that may lead to a deviation from expected Behaviour, instead of simply reacting after a Misbehaviour has occurred. Furthermore, the data that is gathered by monitoring these indicators can itself enable us to identify new Misbehaviours.
The following diagram illustrates how a clear understanding of Misbehaviours can inform the processes of testing, analysis and specification:
Risk Analysis
Risk analysis is a key source of Misbehaviours, and should be used where possible to characterise all documented Misbehaviours.
This means that Misbehaviours are described in terms of the analytical model(s) used to perform Hazard Analysis. This enables Misbehaviours observed in testing (or in deployed software), or identified by test analysis, to be aligned with those identified through Risk Analysis.
This is important because it enables us to identify limitations or gaps in the Risk Analysis results, or in the models that are used to perform it.
Fault Induction
Misbehaviours are used to create Fault Induction tests, and new Misbehaviours can be identified using the same techniques.
Testing
The Automated testing section describes how Pre-merge tests are used to verify that the software satisfies its specified Expectations and Assertions. These tests can use Fault Induction techniques to verify that the tests are fit for purpose or that mitigations correctly handle exceptions.
However, other types of test are used to identify new Misbehaviours, by subjecting the software (and the system that it is part of) to environmental factors, inputs or simulated Faults that are designed to exercise behaviours that may not yet be covered by Pre-merge tests or Behaviour specifications.
- Soak tests can help to identify new Misbehaviours by simply executing the software repeatedly or over a longer timeframe, thereby triggering behaviour that may not occur in shorter or more atomic Pre-merge tests.
- Stress tests can identify Misbehaviours by simulating environmental factors that may impact the software's behaviour, trigger exception-handling routines or mitigations, or cause it to break in as-yet-unanticipated ways.
- Performance tests are primarily intended to help calibrate the software or a system, to understand and document any limitations that should be placed on its configuration, or restrictions on how it should be used. However, they can also help to identify Misbehaviours by pushing the performance of the software or hardware to its limits.
Fault and Defect analysis
Faults identified in components, whether found through testing or reported by the originators of the component, can be a valuable source of new Misbehaviours. If a Fault cannot be mapped to a documented Misbehaviour, or if it cannot be characterised by a Hazard Analysis model, then this may suggest that it represents a new category of Misbehaviour.
Defects identified in other artifacts associated with the software or system, such as incomplete, incorrect or misleading specifications, can also suggest Misbehaviours that may not have been considered.
As noted above, a useful first step in Fault or Defect analysis can be attempting to describe the observed problem(s) using one of the models used for Risk Analysis (e.g. a software architecture diagram or specification), to make it easier to determine whether it corresponds to a previously identified Misbehaviour.
System and Testing Faults
When reporting and analysing Faults detected during testing, it is important to distinguish between System Faults and Testing Faults:
- System Faults are those cases where testing has positively determined that the software or system itself has exhibited a Misbehaviour; this affects the TA-MISBEHAVIOURS assertion of the software's TSF graph
- Testing Faults are those cases where the automated testing apparatus failed to measure the software or system, so we cannot draw conclusions about the presence or absence of Misbehaviours; this concerns the validity of the tests, so affects the TA-ANALYSIS assertion of the software's TSF graph, but also the TA-MISBEHAVIOURS assertion of the automated test framework, if that is additionally subject to analysis against the TSF
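One way to keep this distinction explicit in fault reports is sketched below; the record structure and field names are illustrative assumptions, not a defined RAFIA format:

```python
# Sketch: recording the System Fault / Testing Fault distinction when
# reporting Faults from test runs. Field names are assumptions.
from dataclasses import dataclass
from enum import Enum

class FaultKind(Enum):
    SYSTEM = "system"    # the software misbehaved: affects TA-MISBEHAVIOURS
    TESTING = "testing"  # the apparatus failed to measure: affects TA-ANALYSIS

@dataclass
class FaultReport:
    test_id: str
    kind: FaultKind
    description: str

report = FaultReport("soak-042", FaultKind.TESTING,
                     "log collector crashed; no conclusions about the SUA")
```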
Test data analysis
As described in Automation, test results, test metrics, historical statistics and other data collected during tests or accumulated over time can also help us to identify new (or confirm predicted) Misbehaviours. In some cases this will be very obvious: a test that was passing is now failing since we made a certain change. In other cases we may only observe anomalous patterns that need further investigation.
Examples include:
- Intermittent errors in regular tests, which disappear if the test is re-run, only to re-appear in subsequent runs.
- Anomalous patterns observed in collected test data, which do not cause tests to fail, but also do not correlate with what we would expect, or what we have observed in historical tests.
- New warnings in system logs, which may not indicate an actual failure, but are indicative that something has changed.
- Changes in observed behaviour, or in measured performance characteristics, that were not an expected result of a change.
These events or patterns do not always indicate the nature of the Misbehaviour, but can make us aware that some aspect of our system is not behaving as we had expected. The problem might be a poorly implemented test, or an inadequately controlled test environment, but it could equally be a badly specified test, or an unnoticed design flaw.
Existing Misbehaviours and Risk Analysis can help to narrow down what is behind anomalous patterns, and confirm whether this is expected behaviour, a new category of Misbehaviour, or an example of an existing category. Where a link to Misbehaviours is established, it may also be valuable to consider whether the observed anomaly or pattern might be used as an Advance Warning Indicator, or as the basis of a new test.
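A very simple example of this kind of test data analysis is sketched below: flagging a test run whose duration deviates from its history by more than three standard deviations. The threshold and data are illustrative assumptions:

```python
# Minimal sketch, assuming test durations are collected per run: flag runs
# that deviate more than k standard deviations from historical values.
from statistics import mean, stdev

def is_anomalous(history: list[float], latest: float, k: float = 3.0) -> bool:
    """Return True if `latest` deviates more than k sigma from history."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu, sigma = mean(history), stdev(history)
    return sigma > 0 and abs(latest - mu) > k * sigma

history = [12.1, 11.8, 12.3, 12.0, 11.9]   # seconds, illustrative
print(is_anomalous(history, 15.7))          # True: investigate this run
```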
Validation
Formal validation of achieved integrity levels is often performed through processes mapped to applicable domain standards.
The purposes of Safety Integrity Levels (SILs, as defined in IEC 61508), as well as other abstraction labels that express integrity or assurance levels in these standards, include:
- Expressing integrity levels and expected risk reduction from components, subsystems, or systems
- Highlighting differences in required integrity at interfaces (with freedom from interference claims) in mixed-criticality systems
To optimise high-integrity systems, mixed-criticality design uses abstraction labels for components, subsystems, and processes (through requirements) to guide partitioning (architectural design), selection, and prioritisation. Yet, as noted in "Engineering a safer world: Systems thinking applied to safety" by Nancy G. Leveson, unintended emergent properties still arise at the system level despite the use of high-integrity components. Emergent behaviour in complex, evolving systems requires a holistic systems engineering approach, such as that provided by the TSF.
Trustable does not require SILs to be assigned during design, architecture, or requirements stages. Instead, the project defines which levels have been achieved at the system level based on project-specific claims and evidence, continuously presented through the TSF graph (including for certification). TSF guidance is thus agnostic to traditional integrity requirements expressed in domain-specific classification schemes (e.g., SIL, ASIL, DAL, Class, EAL), leaving assignment of such levels to systems (and subsystems) applying TSF.
The Trustable Tenets (TTs) are designed to be pragmatic and to avoid prescribed breakdowns, keeping them independent of any specific process (specifying the What, not the How). This document therefore applies an abstraction-level structure covering aspects often tied to integrity-level assignments, aligning them with the TTs. The TSF, in turn, allows projects to define alternative approaches consistent with the TTs. An initial in-context review ensures the completeness and applicability of the TTs:
- System-level integrity (TT-EXPECTATIONS, TT-RESULTS)
- Module-level integrity (TT-PROVENANCE)
- Process-level integrity (TT-CONSTRUCTION, TT-CHANGES, TT-CONFIDENCE)
System-level integrity
Top-level product claims must be tracked as Statements (Expectations, Assertions, Evidence). Expectations require supporting Evidence (e.g., test and analysis results) with reviewed confidence measures.
For instance, in safety-related systems, RAFIA guidance describes the structure of analysis and testing required to justify these claims. The RAFIA verification-driven workflow ensures safety-analysis-led traceability, linking STPA-driven analysis of features, mitigations, and their decomposition to design specifications and test results (primarily under TT-EXPECTATIONS). This workflow applies equally to bespoke elements and to managed selections of pre-existing elements. Results are reviewed against the analysis to establish failure rates for both system functions and interacting subsystems, using statistical methods such as FTA (primarily under TT-RESULTS).
Any failure to achieve the required failure rates is exposed and triggers further RAFIA processes until integrity targets are met, driving continuous improvement.
Module-level integrity
At the module level, TSF analyses inputs (components, tools, data) to identify risks, treating all inputs as untrustworthy by default (prevalent for TT-PROVENANCE). Module-level assessments, alongside system-level testing and analysis results, are linked to system-level objectives to address all critical misbehaviour mitigations with appropriate confidence measurements.
If a low score remains, the project may increase analysis or apply remedial measures to meet objectives. Possible actions include:
- Strengthening weak parts of the system
- Replacing components with higher-scoring alternatives
- Increasing redundancy
- Enhancing diagnostics
- Redesigning the system
Note that a module or subsystem may map its claims to TSF for reuse, but downstream projects must decide how to consume externally managed Statements. Reusing upstream scores as-is is invalid, since they require contextualisation through additional reasoning, data, and reviews.
Process-level integrity
At the process level in TSF, Evidence is mapped to objectives, which address not only system behaviours, but also robustness (primarily covered by TT-CONSTRUCTION, TT-CHANGES, TT-CONFIDENCE). Confidence measurements at this level are applied consistently across projects to ensure transparency, confirming that system- and module-level Statements are supported by appropriate (in-context) Evidence.
The TSF graph combines these measurements into confidence scores that show overall integrity across the system, processes, and data, while highlighting gaps. Scores and their prioritisation are refined using project-specific context (e.g., links to losses in STPA), weightings, and evidence from both development and production.
In conclusion, component or subsystem SILs are not directly applied within Trustable processes. The focus is on concrete system features, with design considerations and gaps documented transparently, regardless of prior subsystem claims. The system's performance level can then be derived from presented Evidence.
STPA
Using STPA for RAFIA
This section describes how System Theoretic Process Analysis is used as part of Risk Analysis. While other analysis techniques may be used to accomplish the same goals, the provided procedure and glossary introduce and discuss some of the core concepts that underpin RAFIA.
Note that many of the examples and some of the terminology used are focussed on safety-related hazard analysis, but the STPA methodology and RAFIA itself are also intended to be applicable to the analysis and management of risk in other domains, most notably cyber-security.
The tabular data structures used to record STPA results as described in the STPA procedure are documented in an informal schema.
RAFIA STPA Procedure
This is an adaptation of the methodology described in the STPA Handbook, for use with RAFIA, incorporating clarifications and improvements devised by Simon Whitely of Whitely Aerospace.
Terms that are specific to STPA are linked to the glossary, which also includes guidance about how to approach many of these steps and tips to avoid common pitfalls.
In this procedure:
- An element is a component, subsystem or collaborating set of components in a larger system, which has an identifiable purpose within that system.
- SUA stands for Software Under Analysis, meaning the software element that is the primary focus of your work.
- Target system is a software or hardware/software system that is intended to incorporate the SUA in a given role.
To apply STPA for a given SUA and a given Target system may require more than one analysis, each with different scope and/or purpose. For a given analysis, there may also be a number of iterations, which may extend or refine the defined purpose or the control structure for that analysis.
The results of each step in a given analysis are recorded in a workbook, which consists of a number of interlinked tables. The columns for these tables are described in the workbook schema and can be seen in the workbook template.
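To suggest the flavour of these interlinked tables, the sketch below models Elements and Interactions as simple records; the field names are assumptions for illustration only, and the workbook schema remains the authoritative definition:

```python
# Illustrative sketch of the kind of interlinked records an STPA workbook
# holds. Field names are assumptions; see the workbook schema for the
# actual column definitions.
from dataclasses import dataclass, field

@dataclass
class Element:
    id: str            # e.g. "E1"
    name: str
    responsibilities: list[str] = field(default_factory=list)

@dataclass
class Interaction:
    id: str            # e.g. "C1", "F1", "I1"
    kind: str          # "C" (control action), "F" (feedback), "I" (other)
    source: str        # Element id
    target: str        # Element id
    description: str = ""

workbook = {
    "elements": [Element("E1", "Supervisor", ["monitor the pump"]),
                 Element("E2", "Pump controller")],
    "interactions": [Interaction("C1", "C", "E1", "E2", "start/stop pump"),
                     Interaction("F1", "F", "E2", "E1", "pump status")],
}
```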
A given iteration of STPA for a defined scope and purpose concludes when all of the 10 steps have been completed, as specified in the Review criteria.
Note
STPA is an iterative process, and you should expect to revisit and refine or extend the results of earlier steps as a result of insights that you obtain in the course of the analysis.
However, if these insights lead to radical changes in your understanding of the existing scope or the control structure defined in step 3, then it may be more effective to conclude the current analysis and restart, rather than attempt to rework.
1. Define the scope of analysis
Describe a System that incorporates the SUA, which might be a physical or a software system, or a discrete part of a larger system.
- This defines the scope of the analysis:
- Typically you will define a scope that corresponds to the limits of your design responsibility, or your ability to influence the design of the system or its elements.
- You may also decide to focus on one aspect or part of a larger system.
- Identify the boundary or boundaries of the System, where applicable, and list the inputs or outputs that cross each boundary:
- A Boundary may be a non-physical interface, or set of interfaces, or represent a set of software APIs.
- A Boundary may be defined by the Elements that manage inputs and outputs between the System and its Environment.
- An initial version of the Control Structure can be drawn at this point, and may inform the above.
- Any design assumptions that you make regarding the System or its Elements should also be documented, especially where these relate to safety- or security-critical aspects of the system.
Review criteria
- Scope of the analysis is documented in the workbook, including:
- High level description of the system, its boundary and its environment
- A diagram illustrating the system and its boundary, or a preliminary control structure
- Scope is also recorded in artifacts under change control.
2. Define the purpose of analysis
Identify Losses (outcomes that are unacceptable for the System's stakeholders) and Hazards (System-level conditions that can lead to these Losses), and then derive or devise System-level constraints (SLC) (system conditions or behaviours that need to be satisfied to prevent hazards) for the latter.
These formally specify the focus of the analysis; typically this will be informed by the safety and/or security goals of the system being analysed. For a software system or component, these may be the assumed goals of an intended target system, or class of systems.
- Losses may include unacceptable commercial- (loss of profit), financial- (damage to property) or user-related (loss of mission) outcomes as well as safety-related (death or injury to people) ones.
- Hazards directly relate to the scope of the analysis:
- They identify a set of System conditions leading to Losses that the designers are responsible for preventing or mitigating.
- A single analysis may focus on a subset of the complete set of Hazards applicable to a System.
- When writing SLC, it is important to consider the System as a whole and what must be true in order to prevent the Hazards.
- See also the guidance for Constraints in general when writing SLC.
- SLC may also describe necessary Mitigations if it is not possible to prevent or avoid a Hazard.
Review criteria
- Losses are documented in the workbook, meeting the following criteria:
- Do not refer to individual Elements or specific causes.
- Hazards are documented in the workbook, meeting the following criteria:
- Do not refer to individual Elements of the System.
- Are linked to one or more Losses.
- Refer to factors that can be controlled or managed by the System designers and operators.
- Describe system-level conditions to be prevented, not failures or deviations from specified system functions.
- Do not use ambiguous or recursive words like "unsafe", "unintended", "accidental", etc.
- System Level Constraints are documented in the workbook, meeting the following criteria:
- Relate to the System, rather than individual components.
- Are linked to one or more Hazards.
- Communicate what needs to be true in order to avoid a negative outcome, rather than how that is to be accomplished.
- Use the indicative mood, rather than the imperative mood.
- Use the same context and terminology as the associated Hazards.
- Losses, Hazards and SLC are stored as artifacts under change control.
3. Describe a control structure
Specify a hierarchical Control Structure, which describes the functionality of the System in terms of its Elements (notably Controllers and Controlled Processes) and interactions between these (notably Control Actions and Feedback).
- The System concept from step 1 will provide a starting point for this.
- The SUA may be represented by one or more of the Elements:
- Multiple Elements can be used to show discrete functions of the SUA.
- Use a hierarchical control structure diagram to show the Elements and Interactions:
- Elements are represented by labelled rectangular boxes.
- Control actions are represented by arrows with a solid line, and should point downwards wherever possible.
- Feedback is represented by arrows with a dashed line, and should point upwards wherever possible.
- Other interactions are represented by arrows with a dotted line.
- Only include one arrow of each type between Elements in the diagram:
- More specific interactions will be detailed in the Interactions table.
- Label each of the elements with a unique identifier (e.g. `E1`) in addition to its name.
- Classify each of the interactions (and label accordingly) as:
  - Control Actions (`C` label)
  - Feedback (`F` label)
  - Other interactions (`I` label)
- It may be useful to pair Control Action and Feedback interactions (and number them accordingly) to help identify feedback loops (i.e. where feedback leads to control actions, or control actions lead to feedback). This is examined in step 6.
- Use the Elements and Interactions tables to record the diagram:
- Each box in the diagram is an Element.
- Each arrow in the diagram is an Interaction.
- Document more details for each Element, including its Responsibilities and its role(s) in the control structure.
- Document more details for each Interaction, including a short description of its type (C, F or I) and its start and end points.
- Break down the simplified Interactions shown in the diagram:
  - Detail more specific Control Actions, Feedback and Other Interactions as appropriate, to characterise discrete Interactions of each type as appropriate for the level of abstraction:
    - You should not try to include every possible signal exchanged between the Elements.
    - Control Actions may be an abstraction describing the overall intent of a sequence of interactions.
    - Feedback is an abstraction classifying information that the Controller needs, rather than specific signals conveying that information.
  - Assign these Interactions' identifiers based on the Diagram label:
    - e.g. More specific Control Actions associated with the diagram label `C1` would be assigned identifiers of `C1.1`, `C1.2`, etc.
Review criteria
- A hierarchical control structure diagram is completed:
- All elements are labelled appropriately.
- All interactions are labelled appropriately.
- Control structure is documented in the workbook:
- Control structure diagram is included in Scope, along with a description.
- Elements and Interactions tables are complete:
- Entries correlate with the Element and Interaction labels in the diagram.
- Interactions are broken down into more specific Control Actions, Feedback and Other Interactions as appropriate.
- More specific Interactions are grouped under the corresponding Interactions from the diagram, and appropriately labelled.
- Interactions are described at an appropriate level of abstraction
- Control structure is also stored as artifacts under change control.
4. Identify Unsafe Control Actions
Identify Unsafe Control Actions (UCA) (interactions between a Controller and a Controlled Process that may result in a Hazard) by analysing each Control Action (CA) in the control structure.
Note
Focus on identifying contexts and circumstances in which a control action might potentially lead to a Hazard, rather than what might cause this.
The resulting list feeds into the subsequent Causal Scenario analysis (see step 7), which will explore the reasons why a UCA might occur.
- Use the CA-Analysis table to record the analysis results
- For each Control Action in the Interactions table, consider each of the types of UCA and record whether it is applicable
  - If a UCA type is applicable, but would never result in an unsafe outcome, then the Analysis Result should be `Safe`.
  - If you can identify any worst-case scenario in which the type of UCA might apply, then the result should be `UCA`.
  - If the UCA type is not applicable to the Control Action, record the result as `N/A`, and document your reasoning for this in the Notes column.
- In the Justification field add:
  - A description or example of how a UCA could occur if the result is `UCA`
  - A justification for why the UCA type does not apply if the result is `N/A`
  - A justification for why a UCA cannot occur if the result is `Safe`
- Elements may be designated out of scope for a particular analysis if they have no Control Actions and provide no Feedback
- Record this designation in the Justification column
- A UCA must include a WHEN-type statement (UCA Context) to describe the system- or component-level condition in which it applies
- This provides additional context to explain why the UCA is applicable
- Add contexts in the UCA-Context table, so they may be shared between UCA
- You may also want to return to earlier Control Actions after adding a new UCA Context, as the new context may suggest a new UCA.
- Where any UCA is identified that you cannot link to a hazard, return to the hazard definitions and determine if you need to add a new hazard or SLC
- For each row with an Analysis Result of `UCA`, add a corresponding row in the UCA table.
  - For each item, review the Justification field and ensure that it provides a clear explanation of how a UCA of this type may result.
  - If the text needs improving, either update the corresponding cell in the CA-Analysis table, or replace the mirrored text in the UCA table.
Review criteria
- At least one set of CA-Analysis rows exists for each control action in the Interactions table
- Each set of rows for a given CA and UCA-Context includes a row for each of the UCA types defined by the UCAType table
- Specified UCA-Contexts are relevant and specific to the context of the linked CA-Analysis rows
- Each CA-Analysis row with a `UCA` result has a corresponding row in the UCA table.
- Each CA-Analysis row has a Justification, which either describes a UCA (see next) or explains/justifies why a different result was recorded.
- The UCA Description field for each row in the UCA table provides a clear description of how this type of UCA applies to the given CA and context.
- CA-Analysis considers problems arising from concurrency as a possible factor leading to a UCA, especially for the Sequence/Order category.
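One way to mechanise the "full set of rows" requirement above is sketched below, generating one empty CA-Analysis row per Control Action, UCA-Context and UCA type. The four type names follow the STPA Handbook's categories, but the workbook's UCAType table remains authoritative:

```python
# Sketch: generate the CA-Analysis skeleton for a set of Control Actions
# and UCA-Contexts. UCA type names are illustrative assumptions.
from itertools import product

UCA_TYPES = [
    "Not providing causes hazard",
    "Providing causes hazard",
    "Too early / too late / wrong order",
    "Stopped too soon / applied too long",
]

def ca_analysis_rows(control_actions: list[str], contexts: list[str]) -> list[dict]:
    """One empty CA-Analysis row per (CA, UCA-Context, UCA type) triple."""
    return [{"ca": ca, "context": ctx, "uca_type": t,
             "result": "", "justification": ""}
            for ca, ctx, t in product(control_actions, contexts, UCA_TYPES)]

rows = ca_analysis_rows(["C1.1"], ["Ctx-1: system is in manual mode"])
print(len(rows))  # 4: one row per UCA type for this CA/context pair
```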
5. Devise Controller (Functional) Constraints
Devise Constraints from the UCA results: falsifiable statements about an element of the Control Structure that must be satisfied in order to prevent or avoid each of the UCA.
- Work through the UCA table to consider each UCA in turn.
- The generated constraint text and WHEN-type statement are a prompt that should be used to suggest a more meaningful constraint statement.
- Devise (or identify an existing) Constraint (or Constraints) that either achieves the constraint prompt objective, or provides a mitigation.
- Record these in the Constraints table, with a type of `CFC`.
- Where more than one Constraint of a given type is needed to address a UCA or Causal Scenario, define a 'parent' constraint (e.g. CFC-3) and group the sub-constraints under it (e.g. CFC-3.1, CFC-3.2, etc.).
Review criteria
- Each row in the UCA table links to a corresponding CFC constraint.
- The Constraint text meets the following criteria:
- Communicates what needs to be true in order to avoid a negative outcome, rather than how that is to be accomplished.
- Uses the indicative mood, rather than the imperative mood.
- Uses the same context and terminology as the associated UCA.
6. Identify Control Loops and Sequences
These are groups of control actions and feedback that inform or trigger one another, and which realise a discrete set of behaviours associated with a particular feedback control loop.
- Add loops to the Control-Loops table
- Each control loop should have a description to make its role clear.
- What is the controlled process?
- How and why does the loop control it?
- What are the associated SLC?
- Add steps for each of the loops in the CL-Sequences table.
- Select the interaction id for each step in a loop sequence.
- Interactions may be control actions, feedback, or other interactions, as all are relevant.
- Interactions may be included in more than one Control Loop:
- This is a factor to consider as part of Causal Scenario Analysis for the repeated Interactions, as it may lead to conflicting, unintended or out of sequence interactions.
- Record for each step (or N/A with a justification):
- Provider Process Model or state: What information does the Provider use to inform this interaction? If the provider is a Controlled process providing feedback, then this will be its state.
- Provider logic: What logic does the provider use to determine when to provide the interaction, and what to provide? What interfaces are used to provide the Control Action or Feedback? What protocols and/or lower-level components are involved in the Control Path or the Feedback Path?
- Target behaviour: What is the Target expected to do as a result of the provided interaction?
- Note that provider logic may be very simple ("Return from a Control Action").
Review criteria
- Each Control Loop has a description, which identifies:
- The controlled process involved
- The overall objective(s) of the control mechanisms
- Each Control Loop is linked to an SLC.
- Each row in the Interactions table is linked to at least one row in the CL-Sequences table.
- Each CL-Sequence row documents Provider Process model, Provider Logic, and Expected target behaviour, or `N/A` with a note justifying why this is not applicable.
- Provider Logic clearly identifies the interfaces, protocols and/or components involved in providing and communicating the Interaction.
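The sketch below models a single CL-Sequences row as described above; the column names mirror the review criteria but are assumptions, not the official workbook schema:

```python
# Hedged sketch of a CL-Sequences record. None stands in for an N/A cell,
# which must be justified in the notes. Field names are assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class CLSequenceStep:
    loop_id: str                            # e.g. "CL1"
    interaction_id: str                     # e.g. "C1.1" or "F1.2"
    provider_process_model: Optional[str]   # None means N/A (justify in notes)
    provider_logic: Optional[str]
    target_behaviour: Optional[str]
    notes: str = ""

step = CLSequenceStep(
    "CL1", "C1.1",
    "Pump state: running/stopped",
    "Send 'stop' via the control interface when overheat feedback is received",
    "Pump controller halts the motor",
)
```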
7. Identify Causal Scenarios
Identify Causal Scenarios (factors that can lead to Unsafe Control Actions, or directly to Hazards)
- Use the Control Loops to identify Causal Scenarios.
- For each control loop step, add a full set of rows in the Causal-Scenarios table for the defined set of CSTypes.
- Note: Feedback has different applicable CS types (CS2- and CS4-) than those applicable to Control Actions (CS1-, CS3-, CS4-).
- Where the CS Type is not applicable, the CS Result should be set to N/A.
- For each step/type pair that is applicable, write a description of how this step might lead to either a UCA or Hazard in the Causal Scenario Definition column. Include examples or clarifying notes in the Notes column.
- Identify which UCAs or which Hazards may result.
Review criteria
- Each CL-Sequence row is linked to at least one set of Scenarios rows.
- Each set of Scenarios rows includes all 15 of the CSType categories, with a Result from the CSResults table.
- Each applicable Scenario row includes a Causal Scenario Definition.
- Clarifying examples and/or notes are recorded in the Notes column as required.
- Each applicable Scenario row is linked to at least one UCA or Hazard.
- Scenarios involving Interactions consider the role of the interfaces, protocols and/or components used to provide or communicate the Control Action or Feedback.
- Scenarios involving Conflicting Control and Controller or Process Disturbance consider whether there is sufficient isolation between potential interfering elements.
- Scenarios consider reasonably foreseeable misuse of the system as a possible causal factor for unintended actions or feedback.
8. Devise Causal Scenario Constraints
- Devise (or identify an existing) Constraint (or Constraints) that either prevents or avoids this scenario, or provides a mitigation:
  - Add new items to the Constraints table with a type of `CSC`.
  - Link them to the Scenario(s).
  - If linking to a CFC-type Constraint, add an explanation to justify how and why the Constraint addresses the Causal Scenario in its Notes column.
Review criteria
- Each Causal-Scenario row is linked to at least one CSC or CFC Constraint.
- If a CFC Constraint, the Scenarios Notes column justifies how and why the CFC Constraint addresses the Causal Scenario.
- New Constraints have a type of `CSC` and are linked to the associated Causal Scenario(s).
- Constraint text meets the following criteria:
- Communicate what needs to be true in order to avoid a negative outcome, rather than how that is to be accomplished.
- Use the indicative mood, rather than the imperative mood.
- Use the same terminology as the associated Causal Scenario, UCA and/or Hazards.
9. Specify Misbehaviours and Expectations
Based on the results of the analysis, devise and specify:
- Misbehaviours identified or considered in the Causal Scenario analysis:
- These identify a class of fault, failure mode or other misbehaviour involving the SUA that can lead to a UCA or a Hazard as part of a Causal Scenario.
- These are prompts for Fault Inductions.
- When we find an issue through testing, it should be compared to this list, to see if it defines a new class of Misbehaviour.
- If so, then the STPA should be revisited.
- Expectations for the SUA, where it is responsible for preventing or mitigating a risk (Hazard, UCA or Causal Scenario) or Misbehaviour:
- These might be directly provided by Constraints.
- Constraints might need to be decomposed into a set of Expectations, or rewritten to reflect specific system components' names instead of the element names used in the Control Structure.
- Expectations may cover all, or part of, more than one Constraint.
- Assumptions for other elements of a system, or for integrators or designers of a system incorporating the SUA, where these are responsible for preventing or mitigating a risk or misbehaviour:
- Again these might be directly provided by Constraints, or require some decomposition or aggregation.
Review criteria
- Misbehaviours are recorded in artifacts under change control:
- Recorded Misbehaviours include references linking them to the corresponding STPA result artifacts (UCA, Causal Scenarios).
- Expectations are recorded in artifacts under change control:
- Recorded expectations include references linking them to the corresponding STPA result artifacts (Constraints).
- Assumptions are recorded in artifacts:
- Recorded assumptions include references linking them to the corresponding STPA result artifacts (Constraints).
10. Review STPA results
- Review your own analysis results and extend or clarify where appropriate.
- e.g. New SLC may be suggested by the defined Constraints, Expectations or Assumptions, and re-considering the CA-Analysis in the light of these new SLC may identify new UCA.
- Have an independent STPA practitioner review the analysis results.
- Verifies that the process was followed correctly and the results were recorded coherently.
- Checks the recorded results using the Review criteria for each step and the guidance from the Glossary.
- Have an independent subject matter expert review the analysis results.
- Verifies that the technical inputs to the process are appropriate, clearly described, and sufficient.
- Verifies that the process resulted in appropriate, coherent, applicable outputs (Step 9).
Review criteria
- Analyst review findings are documented.
- Independent STPA practitioner review findings are documented.
- Independent SME review findings are documented.
RAFIA STPA glossary
This glossary provides definitions and guidance for the key concepts in System Theoretic Process Analysis (STPA) as applied in the accompanying procedure.
Note that RAFIA's use of STPA is specifically focussed on its application for software systems; refer to the works in References section for more general, system-oriented guidance.
References
Where indicated ("from the STPA Handbook"), the quoted text sections in this glossary are extracts from the STPA Handbook, which is © 2018 Nancy Leveson and John Thomas.
RAFIA-specific terms
Software under analysis (SUA)
The software under analysis (SUA) is the software project, product or component that is the focus of the team or organisation applying RAFIA. This is normally analogous to the XYZ placeholder (metasyntactic variable) used by the Eclipse Trustable Software Framework.
Element
An Element is a component, subsystem or collaborating set of components in a System, which has an identifiable purpose within that system. STPA principally deals with those Elements that have a Controller or Controlled Process role, but it can also be useful to consider Elements that have neither of these roles, or both, as part of the analysis.
Scope of analysis
The scope of a given analysis using STPA is defined by the System that will be analysed, as delineated by a boundary (or boundaries) that divide or distinguish it from its Environment.
System
from STPA Handbook
A system is a set of components that act together as a whole to achieve some common goal, objective, or end. A system may contain subsystems and may also be part of a larger system
The System is an Abstraction that defines the scope of our analysis. It must, by definition, have a purpose or set of goals.
A System is defined by a Boundary and may have inputs and outputs that cross this boundary. Factors that are external to the system, but which may nevertheless influence its state, are referred to as its Environment.
System boundary
The boundary of the System is an arbitrary construct that we define for the purposes of analysis. It may or may not correspond to a physical boundary, or a concrete separation in a software context (e.g. between process or threads of operation, or between binary components), and may only indicate a division of responsibility.
from STPA Handbook
The most useful way to define the system boundary for analysis purposes is to focus on those parts of the system over which the system designers have some control
Environment
Factors that are external to the System, but which may nevertheless influence its state, are referred to as its Environment. This might be a physical environment (e.g. if the system under analysis is a vehicle), but it can describe anything outside the defined boundary that may nevertheless be relevant to the goals of the System.
For a software component, this might include the processor hardware that it executes upon, or other hardware devices it interacts with, or the operating system software if the system under analysis is an application.
from STPA Handbook
The environment is usually defined as the set of components (and their properties) that are not part of the system but whose behavior can affect the system state.
The concept of an environment implies that there is a boundary between the system and its environment.
Abstraction levels
from STPA Handbook
...When talking about a system, it is always necessary to specify the purpose of the system that is being considered.
A system is an abstraction, that is, a model conceived by the viewer.
The observer may see a different system purpose than the designer or focus on different relevant properties. Specifications, which include the purpose of the system, are critical in system engineering. They ensure consistency of mental models among those designing, using, or viewing a system, and they enhance communication.
STPA involves the use of abstractions, which can be confusing unless the contributors to an analysis (including later consumers of its results) have a clearly-defined common frame of reference.
The risk of confusion can increase if we refer to 'levels of abstraction', which may refer to different perspectives or focussed analyses within a given abstraction.
For the purposes of RAFIA, we define levels of abstraction in relation to the SUA and the set of Hazards and Losses defined for an analysis.
-
The entry level abstraction should be one where the focus is on the role of the SUA as a whole (or Elements representing discrete functional aspects thereof) in a System, with respect to a defined set of Losses and Hazards.
-
A higher level abstraction is one at which the SUA is not distinguished as an Element, but where it has a defined role or responsibilities as part of one or more Controllers or Controlled Processes.
When a software product or project is the focus of STPA, the Losses defined for an analysis may only have meaning at a higher level of abstraction. This is because software alone does not typically lead to losses; it must necessarily be executed in a given context, and this context may determine the set of applicable Losses.
For example, if the SUA is a software component intended for use in a subsystem of a vehicle, and the Losses relate to safety ("Loss of life or injury to humans"), then the relationship between a software failure or misbehaviour and consequent harm to a human may not be 'visible' at the entry level.
- A lower level abstraction is one at which the focus of analysis is on the functionality of a component or a specific aspect of the SUA, rather than the role of the SUA in a system.
Consideration of a lower level abstraction may be valuable as part of Causal Scenario analysis, but this could be conducted using a different form of software, system or safety analysis.
Purpose of analysis
The purpose of a given STPA is defined by a set of Losses and associated Hazards - negative outcomes associated with the System - that the analysts wish to prevent or mitigate. Key outputs of the methodology are a set of Constraints, which describe the conditions that must exist in order to accomplish this goal.
Loss
from STPA Handbook
A loss involves something of value to stakeholders. Losses may include a loss of human life or human injury, property damage, environmental pollution, loss of mission, loss of reputation, loss or leak of sensitive information, or any other loss that is unacceptable to the stakeholders.
Losses represent outcomes that we (or other stakeholders) wish to avoid.
They should be at the highest level of abstraction and focus on the most critical aspects of the System. For safety, these will normally focus on loss of life or human injury, but they may also include losses relating to other system design goals, such as security, performance, reliability or usability.
from STPA Handbook
Example Losses
- L-1: Loss of life or injury to people
- L-2: Loss of or damage to vehicle
- L-3: Loss of or damage to objects outside the vehicle
- L-4: Loss of mission (e.g. transportation mission, surveillance mission, scientific mission, defense mission, etc.)
- L-5: Loss of customer satisfaction
from STPA Handbook
Tips to prevent common mistakes when identifying losses
- Losses should not reference individual components or specific causes
- Losses may involve aspects of the Environment over which the system designer or operator has only partial control or no control at all.
- You should also document any special considerations or assumptions made, such as losses that are explicitly excluded.
Hazard
from STPA Handbook
A hazard is a system state or set of conditions that, together with a particular set of worst-case environmental conditions, will lead to a loss
Hazards are states or conditions of the System, as opposed to states of the Environment or individual Element failures. These are states or conditions that need to be prevented, not states that the system must normally be in to accomplish its goals.
This does not mean that we can exclude external factors from our analysis; rather, we should focus upon how the System under analysis is involved in managing or controlling the associated risk. Understanding how the System can detect and respond to an external factor may be a key part of this.
If we cannot control or manage a Hazard as the designer or operator of our System, then it may be out of scope for our specific analysis. However:
-
It is still important to document such Hazards, as the associated risk may need to be managed by some other means
-
Even if a Hazard cannot be prevented, it may still be possible to reduce the severity of the consequent Loss(es) or the probability of it occurring. (See Mitigation).
It is important not to confuse hazards with failures, or with system functions that have not been implemented as specified. Hazards may result from a failure of a component, or from a flaw in its design or implementation, but they may also arise when all of a system’s components perform exactly as specified.
from STPA Handbook
Examples of system-level hazards
- H-1: Aircraft violate minimum separation standards in flight [L-1, L-2, L-4, L-5]
- H-2: Aircraft airframe integrity is lost [L-1, L-2, L-4, L-5]
- H-3: Aircraft leaves designated taxiway, runway, or apron on ground [L-1, L-2, L-5]
- H-4: Aircraft comes too close to other objects on the ground [L-1, L-2, L-5]
- H-5: Satellite is unable to collect scientific data [L-4]
- H-6: Vehicle does not maintain safe distance from terrain and other obstacles [L-1, L-2, L-3, L-4]
- H-7: UAV does not complete surveillance mission [L-4]
- H-8: Nuclear power plant releases dangerous materials [L-1, L-4, L-7, L-8]
Hazards are not inevitable. There must always be a worst-case environment in which hazards will lead to a Loss, but a given hazard may not necessarily lead to a loss in all cases.
from STPA Handbook
Tips to prevent common mistakes when identifying hazards
- Should not refer to individual components of the system: all hazards should refer to the overall system and system state
- Must be traceable to one or more losses
- Should refer to factors that can be controlled or managed by the system designers and operators
- Should describe system-level conditions to be prevented, not failures or deviations from specified system functions
- The number of hazards should be relatively small, usually no more than 7 to 10
- Should not include ambiguous or recursive words like "unsafe", "unintended", "accidental", etc.
Sub-hazards
System-level Hazards can also be refined into sub-hazards for more complex Systems, which may be helpful when identifying more granular System Level Constraints, or to guide the definition and allocation of Controller Responsibilities in the Control Structure.
from STPA Handbook
Example sub-hazards
| Sub-hazards derived from H-4 | Example constraints |
|---|---|
| H-4.1: Deceleration is insufficient upon landing, rejected takeoff, or during taxiing | SC-6.1: Deceleration must occur within TBD seconds of landing or rejected takeoff at a rate of at least TBD m/s |
| H-4.2: Asymmetric deceleration maneuvers aircraft toward other objects | SC-6.2: Asymmetric deceleration must not lead to loss of directional control or cause aircraft to depart taxiway, runway, or apron |
| H-4.3: Deceleration occurs after V1 point during takeoff | SC-6.3: Deceleration must not be provided after V1 point during takeoff |
Constraints
Constraints define conditions that must exist (or be maintained) in order to avoid a Hazard, or to provide a Mitigation when it cannot be avoided. This may mean adding or refining Behaviours to:
- Eliminate a Misbehaviour at a system or component design level
- Proactively control a Misbehaviour, to avoid the conditions that cause it
- Reactively control a Misbehaviour, to avoid its consequences
- Proactively or reactively control a Misbehaviour's consequences, to reduce their impact, or delay it long enough for another mitigation to be activated.
Constraints can also specify new or refined development processes, such as:
- Using specific tests or types of test to detect Misbehaviours
- Using specific analysis methodologies to detect Misbehaviours
More than one Constraint of each type may be defined for a given Misbehaviour. Where more than one Constraint of a given type is needed to address a UCA or Causal Scenario, define a 'parent' constraint (e.g. CFC-3) and group the sub-constraints under it (e.g. CFC-3.1, CFC-3.2, etc.).
General tips for constraints
- A good constraint clearly communicates what needs to be true in order to avoid a negative outcome, rather than how that is to be accomplished.
- Avoid using terms like 'must', 'should' or 'shall': use the indicative mood, rather than the imperative mood.
- Frame the constraints using the context and the terminology used in the associated Hazards, UCA or Causal Scenarios, and using the element names from the Control Structure
The accompanying procedure defines three different types of constraint:
- SLC : System Level Constraints
- CFC : Controller (Functional) Constraints
- CSC : Causal Scenario Constraints
Note that different sets of constraints may be defined and apply at different Abstraction Levels. If a constraint is difficult to frame in specific terms at the current abstraction level, then that is an indication that further analysis is required at a lower or higher level.
Such analysis at a lower level could involve:
- Completing a new STPA with a different control structure and Hazards / Losses
- Specifying a set of system use cases for the SUA, to clarify the role(s) it is expected to take
- Improving the specification of the SUA, or the other documentation (about components or integrating systems) that is used to inform the analysis (e.g. by adding UML sequence diagrams for specific interactions)
- Using the results of Causal Scenario analysis to:
- Direct a design or code analysis of components
- Define a test campaign to determine control or feedback paths
- Confirm predicted misbehaviour (and detection / responses) using stress, soak and fault induction tests
- Add observability or monitoring features to the SUA or an integrating system to gather more information or drive Mitigations
Mitigation
If possible, Constraints should eliminate a risk (Hazard, UCA or Scenario) e.g. by explicitly designing and implementing the system to make it impossible. However, this will not always be feasible, and even if a risk can be realistically 'designed out', STPA directs us to consider worst case scenarios where e.g. the design has not been correctly implemented, or has an undetected flaw.
If a risk cannot be eliminated, then it may be possible to:
- proactively control it, to detect conditions that may lead to it, and take actions to avoid the risk manifesting
- reactively control it, by detecting when it manifests and taking actions to return the system to a safe state
- reduce its impact, by taking actions to limit the negative outcomes
Multiple mitigation strategies may be employed to address a single risk, which can significantly increase overall confidence in the solution or reduce the overall probability of an undetected and/or unmitigated fault. However, each mitigation represents another set of behaviours to analyse, and may introduce new risks.
System Level Constraint (SLC)
from STPA Handbook
A system-level constraint specifies system conditions or behaviors that need to be satisfied to prevent hazards (and ultimately prevent losses)
System-level constraints are the criteria that we use during our analysis to determine whether a given set of conditions can lead to a Hazard.
Note
In some STPA training material, you may see this type of constraint referred to as a High Level Safety Constraint (HLSC).
SLC document the high-level properties that a system must exhibit in order to achieve its goals, whether these relate to safety, security or some other critical objective, as expressed by the Losses and Hazards.
These are not intended to be verifiable requirements: rather, they help us to clearly specify the safety goals for the System as a whole. Violating a system-level constraint is what leads to Hazards. We use them to identify UCAs and Causal Scenarios; from these we derive Controller Constraints and Causal Scenario Constraints which are verifiable, and provide us with the basis for testing and fault injection.
There are two common patterns of SLC:
- Specifying how a Hazard may be prevented, by inverting the condition of the hazard
  - e.g. "X happens, leading to Y" becomes "X must not happen"
- Specifying how the consequences of a Hazard can be reduced, by identifying initiating condition(s) and mitigating action(s)
  - e.g. "If hazard occurs, then this mitigating action must occur"
from STPA Handbook
System level constraint examples
| Hazard | System-Level Constraint |
|---|---|
| H-1: Aircraft violate minimum separation standards [L-1, L-2, L-4, L-5] | SC-1: Aircraft must satisfy minimum separation standards from other aircraft and objects [H-1] |
| H-2: Aircraft airframe integrity is lost [L-1, L-2, L-4, L-5] | SC-2: Aircraft airframe integrity must be maintained under worst-case conditions [H-2] |
from STPA Handbook
Tips to avoid common mistakes with system-level constraints
- These constraints relate to the system, rather than individual components
- Should not specify a particular solution or implementation
- Can often be derived by simply inverting the condition of a hazard
- Must be traceable to a hazard, but this need not be one-to-one
- A single constraint might be used to prevent more than one hazard
- Multiple constraints may be related to a single hazard
- Each hazard could lead to one or more losses
- Subsequent stages of analysis will systematically identify scenarios that can violate these constraints
Controller Constraint
from STPA Handbook
A controller constraint specifies the controller behaviors that need to be satisfied to prevent UCAs
Controller constraints identify the criteria that must be satisfied by a Controller in order for UCA to be avoided. They must always relate to a UCA, and are typically derived by inverting the conditions of the UCA. However, a number of controller constraints may be required to prevent a UCA, or one controller constraint may address several UCAs. They may also be defined to reduce the harmful effects of a UCA that has led to a hazardous condition, or to prevent a UCA identified for another Element.
Be careful not to confuse controller constraints with particular design features or mitigations that the Controller must implement in order to satisfy the constraint. They describe the criteria that must be satisfied, not how or by what means this is to be achieved.
Control Structure
from STPA Handbook
A hierarchical control structure is a system model that is composed of feedback control loops. An effective control structure will enforce constraints on the behavior of the overall system.
We use control structures to model the behaviour of the System under analysis, to help us determine how specific behaviour, in combination with worst-case system and/or environmental conditions, may lead to the violation of one or more System-level Constraints, and potentially lead to Losses.
Control structures contain at least five types of elements:
A control structure diagram consists of boxes representing Elements (Controllers and Controlled processes), which are arranged in a hierarchy, and directional arrows, representing interactions between Elements (Control Actions, Feedback, Other inputs and outputs).
from STPA Handbook
The vertical axis in a hierarchical control structure is meaningful. It indicates control and authority within the system... Downward arrows represent control actions (commands) while the upward arrows represent feedback
Control structures are an abstraction of the System that we can use to reason about its behaviour.
This is an example of a control structure from the STPA Handbook.
Common points of confusion with control structures
1) A control structure is not a physical model
- The hierarchical control structure used in STPA is a functional model, not a physical model like a physical block diagram, a schematic, or a piping and instrumentation diagram.
- The connections show information that can be sent, such as commands and feedback — they do not necessarily correspond to physical connections.
2) A control structure is not an executable model
- Instead, STPA can be used to carefully derive the necessary behavioral constraints, requirements, and specifications needed to enforce the desired system properties.
3) A control structure does not assume obedience
- Just because a controller sends a control action does not mean that in practice it will always be followed.
- Likewise, just because a feedback path is included in a control structure does not mean that in practice the feedback will always be sent when needed or that it will be accurate.
4) Use abstraction to manage complexity
- Control structures use abstraction in several ways to help manage complexity.
- The principle of abstraction can also be applied to the command and feedback paths in the control structure.
- Even if details are known and design decisions have been made, it can be helpful to apply STPA at a higher level of abstraction first, to provide quicker results and identify broader issues before analyzing more detailed control structure models.
Controlled process
A controlled process is involved in some way in the system state(s) that can lead to Hazards and Losses.
from STPA Handbook
A hazard is defined in terms of the state of the controlled process at the bottom of the figure, e.g., the attitude of the aircraft, the speed of the automobile, the position of the robot. States are made up of components or variables. As the goal of STPA is to identify how the controlled process gets into a hazardous state, we need to look at the ways the controlled process can change state.
In more complex systems, there may be controlled processes at higher levels in the hierarchy, or at lower (more detailed) levels of abstraction, which relate to system functions that are indirectly involved in Hazards.
Controller
A controller controls some aspect of a controlled process. In order to do this, it must have:
- a goal or goals (responsibilities), which may include maintaining constraints on the behaviour of the controlled process
- some way to affect the behaviour of the controlled process, via control actions
- a model of the state of the controlled process (process model)
- a control algorithm that is used to determine when control actions are required
- some source(s) of feedback relating to the controlled process
Note that in some control structures, controllers may represent a human interacting with the system. Where this is the case, concepts such as Control Action, Control Algorithm and Process Model are replaced with equivalents that apply to humans (e.g. expected action, instructions / procedure, mental model).
Control action
Control Actions are interactions that a Controller provides to control the behaviour of a Controlled Process or another controller, in order to meet a defined set of goals. These goals should relate to the System Level Constraints.
"Tips for Control Actions
- Do not be tempted to list every possible interaction between elements as a Control Action.
- One Control Action may represent a series of software-level operations
Control algorithm
Controllers typically have a control algorithm, which determines the Control Action that they may provide, based on 'beliefs' that the controller has about the state of the Controlled Process, which may be stored in a Process Model and/or informed by Feedback.
from STPA Handbook
The automated control algorithm has two primary functions: (1) generate control actions and (2) maintain accurate information (models) about the state of the controlled process and external system components and environment that can impact the generation of control actions.
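To make these concepts concrete, here is a minimal Python sketch of a hypothetical temperature controller. The example and all names in it are invented for illustration (it is not drawn from the STPA Handbook); it shows a control algorithm using a process model, updated from feedback, to decide when a control action is required:

```python
# Illustrative only: a hypothetical temperature controller. All names here
# are invented for this sketch; it is not an STPA artefact.

class HeaterController:
    def __init__(self, setpoint: float):
        self.setpoint = setpoint       # goal / responsibility
        self.believed_temp = None      # process model: belief about the process state

    def update_process_model(self, measured_temp: float) -> None:
        """Feedback: revise the controller's belief about the controlled process."""
        self.believed_temp = measured_temp

    def control_algorithm(self) -> str:
        """Decide which control action (if any) to provide."""
        if self.believed_temp is None:
            return "no_action"         # no feedback yet, so the belief is unknown
        if self.believed_temp < self.setpoint - 0.5:
            return "heater_on"         # control action
        if self.believed_temp > self.setpoint + 0.5:
            return "heater_off"        # control action
        return "no_action"

controller = HeaterController(setpoint=20.0)
controller.update_process_model(18.2)  # feedback arrives from a sensor
print(controller.control_algorithm())  # -> "heater_on"
```

Note that a missing or stale `believed_temp` (an inaccurate process model) could cause the controller to provide, or fail to provide, a control action in the wrong context; this is exactly the kind of causal factor that the later steps of STPA look for.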
Feedback
Feedback is information that a Controller needs to help determine when a Control Action is required, either as a direct input to a Control Algorithm, or to inform the controller’s 'beliefs' about a Controlled Process.
Feedback relating to one process may be received indirectly, from other controllers or processes, and may relate to (or be inferred from) the Environment.
"Tips for Feedback
- In software, a Control Action frequently has implicit Feedback (e.g. if it is a function call), which means that feedback is not always a discrete signal.
- It is best to think about Feedback as the data that a Controller needs in order to correctly perform a Control Action, rather than as a single interaction or communicated 'packet'
from STPA Handbook
Feedback can be derived from the control actions and responsibilities by first identifying the process models that controllers will need to make decisions.
Process model
A process model is an abstraction of the Controller's 'beliefs' about the state of a Process that it is responsible for controlling, which are inputs to its Control Algorithm and may inform the Control Actions that it provides.
This model is based on the feedback that the controller receives and/or historical information that it has stored. Issues with the design, fidelity and/or timely update of this model may be a causal factor for UCA and consequent Hazards.
Other interactions
Controllers and Controlled Processes may also have inputs and outputs that are neither Control Actions nor Feedback, but which may nevertheless be relevant in an analysis (e.g. because they may affect the state of an Element, or its provision of required control actions or feedback).
Controlled Processes in particular will often have inputs and outputs that are not Control Actions or Feedback, but rather aspects of a process that we wish to control.
You may include these as I-type interactions in the Control Structure, as this can be useful as a reminder of possible interference mechanisms between Elements in Step 7.
Controller Responsibility
from STPA Handbook
Responsibilities are a refinement of the system-level constraints — what does each entity need to do so that together the system-level constraints will be enforced?
Once Controllers have been identified, they can be assigned responsibilities. These reflect the goal(s) or purpose of the controllers, and should relate to the System-level Constraints and the System role(s) that controllers are required to play in maintaining these constraints.
from STPA Handbook
Tips to prevent common mistakes in a control structure
- Ensure labels describe functional information that is sent, not a specific physical implementation.
- Avoid ambiguous and vague labels like simply "Command" or "Feedback" when the type of information is known.
- Check that every controlled physical process is controlled by one or more controllers (not always required, but often indicates a mistake).
- Review responsibilities (including traceability) for conflicts and gaps.
- Check that control actions needed to satisfy the responsibilities are included.
- Check that feedback needed to satisfy the responsibilities is included. (optional if applied early in concept development when feedback is unknown; later steps can identify missing feedback)
UCA
from STPA Handbook
An Unsafe Control Action (UCA) is a control action that, in a particular context and worst-case environment, will lead to a hazard.
The Handbook describes four ways in which a Control Action can be unsafe:
from STPA Handbook
Categories of Unsafe Control Action
- Not providing the control action leads to a hazard
- Providing the control action leads to a hazard
- Providing a potentially safe control action, but too early, too late, or in the wrong order
- The control action lasts too long or is stopped too soon
The Handbook also notes that sub-categories may be distinguished for each category:
from STPA Handbook
Example sub-categories for 'Providing' UCAs
- Consider contexts in which the control action may never be safe
- Consider contexts in which the control action has an incorrect parameter (e.g. setting an incorrect emergency frequency on a radio)
- Consider contexts in which an insufficient or excessive control action may be unsafe (e.g. providing insufficient or excessive braking commands)
- Consider contexts in which the direction of the control action may be unsafe (e.g. providing turn left instead of turn right commands)
- Consider contexts in which the control action has already been provided (repetitive control actions)
A UCA must have 5 parts:
| Part | Source | Type | Action | Context | Hazard |
|---|---|---|---|---|---|
| Content | Controller | UCA Type | Control Action | UCA Context | Hazards |
| Example | BSCU Autobrake | provides | Brake command | during a normal takeoff | H-4.3 |
Structuring a UCA so that it includes or references all of these elements makes it clearer and helps to prevent some of the common mistakes summarised below.
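As an illustration, the five parts could be captured as structured data so that the full UCA statement is always generated consistently. This is a minimal Python sketch using the example from the table above; the field names are an assumption for the sketch, not part of STPA:

```python
# Illustrative only: capturing the five parts of a UCA as structured data so
# that the full statement can be generated consistently. The field names are
# an assumption for this sketch, not part of STPA.

from dataclasses import dataclass

@dataclass
class UCA:
    source: str          # Controller
    uca_type: str        # UCA Type keyword, e.g. "provides"
    action: str          # Control Action
    context: str         # UCA Context
    hazards: list[str]   # linked Hazard ids

    def definition(self) -> str:
        return (f"{self.source} {self.uca_type} {self.action} "
                f"{self.context} [{', '.join(self.hazards)}]")

uca = UCA("BSCU Autobrake", "provides", "Brake command",
          "during a normal takeoff", ["H-4.3"])
print(uca.definition())
# -> BSCU Autobrake provides Brake command during a normal takeoff [H-4.3]
```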
from STPA Handbook
Tips to prevent common mistakes when identifying UCAs
- Ensure every UCA specifies the context that makes the control action unsafe.
- Ensure UCA contexts specify the actual states or conditions that would make the control action unsafe, not potential beliefs about the actual states.
- Ensure the UCA contexts are defined clearly.
- Ensure the UCA contexts are included and not replaced by future effects or outcomes.
- Ensure traceability is documented to link every UCA with one or more hazards.
- Review any control action types assumed to be N/A, and verify they are not applicable.
- For any continuous control actions with a parameter, ensure that excessive, insufficient, and wrong direction of the parameters are considered.
- Ensure any assumptions or special reasoning behind the UCAs are documented.
UCA Type
The nine types of UCA used for RAFIA are as follows:
- Not Provided
- Provided
- Magnitude (too little)
- Magnitude (too much)
- Duration (too short)
- Duration (too long)
- Timing (too early)
- Timing (too late)
- Sequence / Order
Note that these categories are provided for guidance only and may not apply in all contexts, or to all types of control action.
For example, the Magnitude and Duration categories are relevant for continuous control actions, such as applying pressure to an actuator or emitting a warning tone, but may not apply to discrete ones, such as activating a relay or sending a message.
UCA Context
The context of a UCA describes the actual System conditions that make the Control Action unsafe. A control action may be safe in one context and unsafe in another.
For example, a UCA associated with a rear-facing camera system in a vehicle may only apply while the vehicle is reversing, while another UCA only applies while the vehicle is not reversing.
Causal Scenario
from STPA Handbook
A loss scenario describes the causal factors that can lead to the unsafe control actions and to hazards.
Causal scenarios (which are called loss scenarios in the STPA Handbook) provide more specific context for a UCA or Hazard. UCAs identify the actual system states or conditions that may lead to a hazard, but should not attempt to explain why the associated Control Action is provided, not provided, or provided incorrectly; explaining those reasons is the role of the causal scenario.
The STPA Handbook describes two classes and four types of causal scenario:
a) Scenarios leading to UCAs
- Due to unsafe controller behaviour
- Due to missing or inadequate feedback or input
b) Scenarios in which control actions are improperly executed or not executed
- Involving the control path
- Related to the controlled process
These can be further broken down into the following categories:
- Controller: Problems with the Controller itself
- Control Algorithm: Problems with the logic (specification, design or implementation) of the controller's Control Algorithm
- Unsafe Control Input: Problems relating to unsafe control inputs from other controllers, including UCAs
- Process Model: Problems with the process model (or mental model) of the controller
- Controller Disturbance: Problems arising from other factors that may affect the Controller
- Feedback: Problems with the data / information that needs to be communicated
- Feedback Path: Problems with the communication / transmission of the data / information
- Unsafe Data / Information: Problems with the data / information contributing to the Feedback, which may result from another UCA
- Control Action: Problems with the control action itself, including Unsafe / Insecure Control Actions
- Control Path: Problems with the communication of the Control Action to its target
- Process: Problems with the controlled process itself
- Conflicting Control: Interference from other controllers
- Process Inputs: Other information or actions which may affect the process
- Process Outputs: Other information or actions which may result from the control action or the operation of the Controlled Process
- Process Disturbance: Anything else outside the process that may affect it
from STPA Handbook
Common mistakes when identifying Loss Scenarios
The most common mistake is to identify individual causal factors rather than a scenario... The problem with listing individual factors outside the context of a scenario is that it’s easy to overlook how several factors interact with each other, you can overlook non-trivial and non-obvious factors that indirectly lead to UCAs and hazards, and you may not consider how combinations of factors can lead to a hazard. Considering single factors essentially reduces to a FMEA where only single component failures are considered.
Control Path
The control path refers to the means by which a Control Action is communicated from a Controller to a Controlled Process.
Feedback Path
The feedback path refers to the means by which Feedback is communicated from a Controlled Process to a Controller.
Causal Scenario Constraints
Causal Scenario Constraints specify criteria that must be satisfied to prevent, or mitigate the effects of, a specific Causal Scenario. Like Controller Constraints, they are verifiable and provide a basis for testing and fault injection, and they must be traceable to the Causal Scenario(s) that they address.
STPA results schema
The results of applying STPA should be recorded as structured textual data, stored using plain-text file formats such as CSV, YAML or markdown. This data is tabular, and can be worked on in a spreadsheet or a relational database.
This schema is illustrated by the workbook template. This consists of three types of sheet:
- Text sheets (README, Scope): Providing guidance and context - not exported
- Workbook tables: Recording results - exported as CSV
- Category tables: For reference - not exported
The data stored in the latter two types of table are described in the following sections.
Data types
The following data types are used in the table descriptions:
- UID: Locally unique alphanumeric identifier
- Number: Integer value
- Markdown text: A block of (multiline) text with markdown formatting.
- Text: Plain text
- Text array: An array of text items
- Ref-name: Name used to refer to a Reference
- Reference (type): Text matching a constrained set of values defined by type
- Reference array (type-name): An array of Reference items
- Link (table): The UID of a record in table
- Link array (table): An array of UIDs for records in table
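As an informal illustration only, these data types might be expressed as Python type aliases along the following lines (this mapping is an assumption for the sketch, not part of the schema):

```python
# Illustrative mapping of the schema's data types onto Python type aliases.
# This mapping is an assumption for the sketch, not part of the TSF schema.

UID = str                  # locally unique alphanumeric identifier
Number = int
MarkdownText = str         # multiline text with markdown formatting
Text = str
TextArray = list[str]
RefName = str              # name used to refer to a Reference
Reference = str            # must match a value defined by the named category
ReferenceArray = list[str]
Link = str                 # the UID of a record in the named table
LinkArray = list[str]      # UIDs of records in the named table
```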
Workbook tables
These are the exported tables, which contain all of the structured data recording the STPA results.
Losses
The set of Losses for this analysis.
| Column | Data type | Notes |
|---|---|---|
| Loss Id | UID | |
| Loss description | Text | |
| Loss category | Reference (LCategory) | Categories are for guidance only |
Hazards
The set of Hazards for this analysis.
| Column | Data type | Notes |
|---|---|---|
| Hazard Id | UID | |
| Hazard description | Text | |
| Link to loss(es) | Link Array (Losses) | Each Hazard must link to at least one Loss |
| Notes | Markdown text |
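Rules such as "each Hazard must link to at least one Loss" can be checked mechanically on the exported data. Here is a minimal Python sketch, assuming the table is exported as `hazards.csv` with the column names above and with links stored as a semicolon-separated list (both details are assumptions):

```python
# Illustrative only: check that every exported Hazard links to at least one
# Loss. Assumes a CSV export named "hazards.csv" with the column names shown
# above, and Loss links stored as a semicolon-separated list; both of these
# details are assumptions for this sketch.

import csv

with open("hazards.csv", newline="") as f:
    for row in csv.DictReader(f):
        losses = [x for x in row["Link to loss(es)"].split(";") if x.strip()]
        if not losses:
            print(f"Hazard {row['Hazard Id']} has no linked Loss")
```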
Constraints
Constraints for Hazards (SLC), UCA (Controller Constraints) and/or Causal Scenarios.
| Column | Data type | Notes |
|---|---|---|
| Constraint Id | UID | |
| Description | Text | |
| Constraint Type | Reference (CType) | Determines the type(s) of links |
| Link to Constraint(s) | Link Array(Constraints) | Links to other Constraints (e.g. for sub-constraints) |
| Link to Hazard(s) | Link Array(Hazards) | SLC and CSC |
| Links to UCA | Link Array(UCA) | CFC and CSC |
| Links to CS | Link Array(Causal Scenarios) | CSC only |
| Links to TSF | Text Array | UID of associated Statements in an associated TSF Specification |
Elements
The elements of the Control Structure defined for this analysis.
| Column | Data type | Notes |
|---|---|---|
| Element Id | UID | |
| Element name | Text | |
| Responsibilities | Text array | Responsibilities of the Element |
| Roles | Reference array (ERoleType) | |
| Notes | Markdown text |
Interactions
Interactions between the elements of the Control Structure defined for this analysis.
| Column | Data type | Notes |
|---|---|---|
| Interaction Id | UID | |
| Diagram Label | Text | |
| Interaction description | Text | |
| Type | Reference (IType) | |
| Provider Id | Link (Elements) | |
| Receiver Id | Link (Elements) | |
| Category | Reference (ICategory) | |
| Notes | Markdown text |
CA-Analysis
Analysis of the Control Actions (only) in the Control Structure defined for this analysis.
| Column | Data type | Notes |
|---|---|---|
| CA Analysis ID | UID | |
| CA Id | Link (Interactions) | |
| UCA Type | Reference (UCAType) | |
| UCA Context | Link (UCA-Contexts) | |
| Analysis Result | Reference (CAResult) | |
| Hazard(s) | Link Array (Hazards) | If Analysis Result is UCA, must link to at least one Hazard |
| Justification | Text | Description or example of UCA, or justification for the result |
UCA-Contexts
The UCA Contexts used in the UCA for this analysis.
| Column | Data type | Notes |
|---|---|---|
| Context Id | UID | |
| Unsafe Context | Text | A context in which one or more control actions may be unsafe. |
| Notes | Markdown text | Description or clarification of the context |
UCA
The UCA identified in this analysis.
| Column | Data type | Notes |
|---|---|---|
| UCA Id | UID | |
| CA | Link (Interactions) | |
| UCA Type | Reference (UCAType) | |
| UCA Context | Link (UCA-Contexts) | |
| UCA Definition | Text | Structured definition of UCA using STPA keywords |
| UCA Description | Text | Description or example of the UCA |
| Constraint Id | Link array (Constraints) |
Control-Loops
Control Loops for Controlled Processes.
| Column | Data type | Notes |
|---|---|---|
| Loop Id | UID | |
| Control Loop Description | Text | |
| Controlled Process | Link (Elements) | |
| Linked SLC(s) | Link array (Constraints) | Should only include SLC |
CL-Sequences
Control Loop sequences, describing how sets of Interactions are involved in implementing control loops.
| Column | Data type | Notes |
|---|---|---|
| CL-Sequence Id | UID | |
| Loop | Link (Control-Loops) | The control loop for this step |
| Step | Number | A numerical identifier for a sequential step in the control loop |
| Interaction Id | Link (Interactions) | The interaction that this step involves |
| Provider process model or state | Text | The Process Model of the Provider, or its state if a Controlled process |
| Provider logic | Text | The logic used by the Provider to inform this interaction |
| Expected Receiver behaviour | Text | How the Provider expects the Receiver to behave |
Scenarios
Causal Scenarios to explain how causal factors affecting the Interactions in each of the CL-Sequences may lead to UCA or Hazards.
| Column | Data type | Notes |
|---|---|---|
| Scenario Id | UID | |
| Seq Ref | Link (CL-Sequences) | |
| CS Type | Reference (CSType) | |
| Causal Scenario Prompt | Text | Constructed prompt text for the Causal Scenario |
| Analysis Result | Reference (CSResult) | |
| Causal Scenario Definition | Text | Description of how this interaction might lead to a UCA or a Hazard (or both) |
| Links to UCA | Link array (UCA) | |
| Links to Hazard(s) | Link array (Hazards) | |
| Constraint Id | Link array (Constraints) | |
| Notes | Markdown text | Example(s) of the Causal Scenario and other explanatory notes |
Category tables
These tables provide a constrained set of values for specific columns in the workbook tables. They are used to populate dropdown selectors and construct prompt text in the workbook template, and are not exported. The standard sets of categories used in the template are included here for reference, but these may be adapted or extended as required.
LCategory
Categories of Losses (for information and grouping of associated Hazards, UCAs, etc)
| Loss Category | Description |
|---|---|
| Assets | Losses relating to stakeholder's physical assets, equipment, property, etc |
| Commercial | Losses relating to a stakeholder organisation's commercial costs or benefits |
| Safety | Losses relating to the physical well-being of a human stakeholder |
| Security | Losses relating to a stakeholder's confidential information or intellectual property |
| User | Losses relating to a user's goals, convenience, time, desires, etc |
CType
Type identifiers and descriptions for Constraints
| CType | Description |
|---|---|
| SLC | System Level Constraint |
| CFC | Controller (Functional) Constraint |
| CSC | Causal Scenario Constraint |
ERoleType
Types of role for Elements in the control structure
| ERoleType | Responsibilities / Involvement |
|---|---|
| Controller | Provides control actions to a Controlled Process or another Controller |
| Controlled Process | Implements (part of) the behaviour that needs to be controlled |
| Actuator | Mechanisms by which a Controller acts upon a Controlled Process |
| Sensor | Mechanisms by which a Controller senses Feedback from a Controlled process |
| Interference | May interfere with the correct functioning of the Control Structure |
| Control Path | Communicates a Control Action from a Controller to a Controlled Process |
| Feedback Path | Communicates Feedback from a Controlled Process to a Controller |
| Out of Scope | Element is out of scope for this analysis, but has an assumed role |
Note
Elements may have more than one role in the control structure.
The Interference, Control Path and Feedback Path roles are added to better characterise software-specific interactions.
The Out of Scope role should only be used when an Element has another defined role in the control structure.
IType
Type identifiers and descriptions for interactions
| IType | Description |
|---|---|
| C | CONTROL ACTION |
| F | FEEDBACK |
| I | Information (other than Feedback) |
| P | Controlled Process input or output |
| X | Other interaction (e.g. interference, disturbance) |
Note
The I, P and X types of interaction are not a formal part of the control structure, but may be included in a diagram or structure definition as a prompt for Causal Analysis.
ICategory
Categories of Interactions (e.g. Continuous or Discrete). This is an optional characteristic that may be meaningful for some types of interaction.
| ICategory | Description |
|---|---|
| C | Continuous |
| D | Discrete |
UCAType
Type identifiers and descriptions for UCA.
| UCAType | Description | Keyword | Constraint keyword |
|---|---|---|---|
| NP | NP - Not Provided | DOES NOT PROVIDE | MUST PROVIDE WHEN REQUIRED |
| PR | PR - Provided | PROVIDES | MUST CORRECTLY PROVIDE WHEN REQUIRED |
| ML | ML - Magnitude (less than) | PROVIDES (TOO LITTLE) | MUST NOT PROVIDE TOO LITTLE |
| MM | MM - Magnitude (more than) | PROVIDES (TOO MUCH) | MUST NOT PROVIDE TOO MUCH |
| DS | DS - Duration (too short) | PROVIDES (TOO SHORT) | MUST PROVIDE FOR LONG ENOUGH |
| DL | DL - Duration (too long) | PROVIDES (TOO LONG) | MUST NOT PROVIDE FOR TOO LONG |
| TE | TE - Timing (too early) | PROVIDES (TOO EARLY) | MUST NOT PROVIDE TOO EARLY |
| TL | TL - Timing (too late) | PROVIDES (TOO LATE) | MUST NOT PROVIDE TOO LATE |
| SO | SO - Sequence / Order | PROVIDES (OUT OF SEQUENCE) | MUST NOT PROVIDE OUT OF SEQUENCE |
Note
The Keyword and Constraint keyword fields are used to construct UCA and constraint prompts in the workbook template, as illustrated below.
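To illustrate how these keyword fields might be combined into prompt text, here is a minimal Python sketch; the exact template wording used by the workbook is an assumption:

```python
# Illustrative only: constructing UCA and constraint prompt text from the
# Keyword and Constraint keyword fields above. The exact template wording
# used by the workbook is an assumption for this sketch.

KEYWORDS = {
    "NP": ("DOES NOT PROVIDE", "MUST PROVIDE WHEN REQUIRED"),
    "TL": ("PROVIDES (TOO LATE)", "MUST NOT PROVIDE TOO LATE"),
    # ... remaining UCAType entries, as per the table above
}

def uca_prompt(controller: str, uca_type: str, action: str, context: str) -> str:
    keyword, _ = KEYWORDS[uca_type]
    return f"{controller} {keyword} {action} {context}"

def constraint_prompt(controller: str, uca_type: str, action: str, context: str) -> str:
    _, keyword = KEYWORDS[uca_type]
    return f"{controller} {keyword} {action} {context}"

print(uca_prompt("BSCU Autobrake", "TL", "Brake command",
                 "when braking is required"))
# -> BSCU Autobrake PROVIDES (TOO LATE) Brake command when braking is required
```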
CAResult
Results of CA-Analysis.
| Result | Description |
|---|---|
| UCA | Unsafe Control Action |
| Safe | Applicable, but does not result in UCA |
| N/A | Not Applicable |
| TBD | Not yet analysed |
CSType
Type identifiers, descriptions and prompts for Scenarios
| Label | Type | Description |
|---|---|---|
| CS1-C | Controller (itself) | Problems with the Controller itself |
| CS1-A | Control Algorithm(s) | Problems with the logic (specification, design or implementation) of the controller's algorithm |
| CS1-I | Unsafe Control input | Problems relating to unsafe control inputs (e.g. from other controllers), which may result from another UCA |
| CS1-M | Process Model | Problems with the process model (or mental model) of the Controller |
| CS1-D | Controller Disturbance | Problems arising from other factors that may affect the Controller |
| CS2-F | Feedback (itself) | Problems with the data / information that needs to be communicated |
| CS2-P | Feedback Path | Problems with the communication / transmission of the data / information |
| CS2-U | Unsafe Data / Information | Problems with the data / information contributing to the feedback, which may result from another UCA |
| CS3-A | Control Action (itself) | Problems with the Control Action itself, including Unsafe / Insecure Control Actions |
| CS3-P | Control Path | Problems with the communication of the Control Action to its target |
| CS4-P | Process (itself) | Problems with the Controlled Process |
| CS4-C | Conflicting Control | Interference from other Controllers |
| CS4-I | Process Inputs | Other information or actions that may affect the Controlled Process |
| CS4-O | Process Outputs | Other information or actions that may result from the Control Action |
| CS4-D | Process Disturbance | Anything else outside the Controlled Process that may affect it |
CSResult
Results of Scenario Analysis.
| Result | Description |
|---|---|
| UCA | Link to UCA |
| Hazard | Link to Hazard |
| Both | Link to UCA and Hazard |
| OOS | Out of scope for this analysis |
| SAF | Scenario already found |
| N/A | Not Applicable |
| TBD | Not yet analysed |
Managing Statements for safety-related Software
The Trustable Methodology can be used to track requirements for safety-related software. This document describes our current understanding of how data in a Trustable Graph should be managed to ensure the underlying argument possesses the following desirable safety properties:
- Completeness
- Correctness
- Freedom from Intrinsic Logical Faults
- Understandability
- Precision
- Modularity
- Fault Detection
- Defined responsibility
Warning
Completeness is not yet fully addressed by the process in this document. This is because we need weights to track completeness.
Contributors
Managing a Trustable Graph involves three classes of Contributor:
- The Author(s)
- The Software Reviewer
- The Trustable Reviewer
who are collectively responsible for protecting the integrity and quality of the following activities and their outcomes:
- Adding Links
- Removing Links
- Adding Items
- Editing Items
- Removing Items
- Reviewing Items
- Clearing Suspect Links
- Updating Scores
The remainder of this document is dedicated to describing the specific responsibilities of Contributors in ensuring the aforementioned properties are achieved and maintained.
Danger
In practice, change requests often reflect the undertaking of several activities in parallel. For instance, often a change to the contents of an item will also include clearing the resulting Suspect links.
Each Contributor should always consider the responsibilities arising from all the activities that are undertaken by a change, whether this is explicit or implied.
Configuration
To apply the Trustable Methodology to a project, the following sets of Contributors must be identified:
| Group | Definition |
|---|---|
| Authors | All Contributors to XYZ |
| Software Reviewers | A subset of Contributors with proven competence in an area of the project. The collective competence of Software Reviewers should cover all aspects of XYZ. |
| Trustable Reviewers | A subset of Contributors competent in the application of the Trustable Methodology and the use of XYZ. |
The assignment of these roles must be documented and enforced. Within the scope of a single change request, separate people should perform each role.
Desirable properties
This procedure contributes to the property of Defined Responsibilities.
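As an illustration of how the role-separation rule above might be checked mechanically (the data structure here is invented for the sketch, and is not part of the methodology):

```python
# Illustrative only: a minimal check that the three roles in a change request
# are held by different people. The data structure is invented for this sketch.

def roles_are_separated(change_request: dict) -> bool:
    """Expects keys 'author', 'software_reviewer' and 'trustable_reviewer'."""
    people = [change_request["author"],
              change_request["software_reviewer"],
              change_request["trustable_reviewer"]]
    return len(set(people)) == len(people)

print(roles_are_separated({"author": "alice",
                           "software_reviewer": "bob",
                           "trustable_reviewer": "carol"}))  # -> True
```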
General Principles
This section details some general principles that all Contributors should consider when proposing or reviewing a change.
-
Suspect items are indicative of a Fault or Change in Evidence. Merge requests (MRs) that introduce a Suspect item should never be accepted, unless they apply to an Evidence Item and are justified by reference to a fault or change in an Artifact.
Proper uses of the review status
- Contributor A discovers a bug in a Validator. They open an MR to mark all Evidence items using that Validator as Suspect, preventing inaccurate scores being used until the bug is fixed.
- Contributor B pushes a change that causes changes to a generated Artifact. This causes the linked Statement to be marked as Suspect.
Improper uses of the review status
- Contributor C makes a Statement X and thinks it might support Statement Y. They link it to Statement Y and mark it as Suspect so other Contributors know they are unsure.
- Contributor D makes a Statement Z. They identify it as Evidence, but being unable to find a suitable Artifact, they mark it as Suspect.
-
Suspect Links are Indicative of a Fault or Change in the argument. Changes that introduce a Suspect link should never be accepted, unless they are justified by significant and unavoidable changes to a Request that must be resolved at multiple levels in the Argument.
Proper use of a Suspect link
The Consumer decides they want to pivot immediately to new hardware. Contributor A changes Expectation X to reflect this, but leaves the links to it (from Assertions Y and Z) as Suspect. This then enables Contributors B and C to iteratively rework their argument (in Y and Z) for the new hardware.
Improper use of a Suspect link
Contributor D fixes a spelling mistake in an Assertion. They clear the resulting Suspect links to other Statements they own, but leave links to Contributor E's Statements as Suspect, expecting Contributor E to clear them later.
Desirable properties
These principles contribute to the properties of Fault Detection and Freedom from Intrinsic Logical Faults.
Actions
This section sets out the responsibilities of Contributors involved in a change to the Trustable Graph. All changes should be submitted through a merge request. Reviewers indicate that they have completed their responsibilities and are satisfied with the changes by approving the request.
A remark on scope
The scope of this document is restricted to ensuring proper implementation of the Trustable Methodology. This process should complement normal software engineering practice, not replace it. Where a Contributor has no specific responsibilities in the Trustable process, they are still responsible for the overall quality of the changes.
All Actions discussed below can be done either directly in a text editor, or by using trudag.
Adding Items
When a new item is added, Contributors have the following responsibilities.
Author
- Must submit a complete change that does not require a response from Reviewers.
Software Reviewer
- Confirms the new item relates to XYZ.
- Confirms any referenced Artifacts are related to the item's Statement.
- Confirms the new item is marked as reviewed.
Trustable Reviewer
- Confirms the new item is a genuine Statement related to XYZ.
- Confirms the new item is marked as reviewed.
Desirable Properties
This procedure contributes to the properties of Precision, Understandability and Freedom from Intrinsic Logical Faults.
Adding links
When a new link is added, Contributors have the following responsibilities.
Author
- Must submit a complete change that does not require a response from Reviewers.
Software Reviewer
- Confirms the destination item is a necessary, but not sufficient, condition for the source item.
- Confirms the new link is not marked as Suspect.
Trustable Reviewer
- Confirms the destination item is a necessary, but not sufficient, condition for the source item.
- Confirms the link does not adversely affect the modular structure of the Graph; that is, it does not connect two previously unconnected subgraphs without justification (see the sketch at the end of this section).
- Confirms the new link is not marked as Suspect.
Desirable properties
This procedure contributes to the properties of Correctness and Modularity.
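The modularity check above, confirming that a new link does not silently connect two previously unconnected subgraphs, can be mechanised. Here is a minimal Python sketch, assuming the Graph is represented as a mapping from item id to the ids it links to (an assumed representation, not the trudag data model):

```python
# Illustrative only: check whether a proposed link would join two previously
# unconnected subgraphs, treating links as undirected for reachability.
# The graph representation (dict of item id -> linked ids) is an assumption.

from collections import deque

def connected(graph: dict[str, set[str]], start: str, goal: str) -> bool:
    """Breadth-first search, treating links as undirected for reachability."""
    undirected = {k: set(v) for k, v in graph.items()}
    for src, dests in graph.items():
        for dest in dests:
            undirected.setdefault(dest, set()).add(src)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in undirected.get(node, set()) - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

# Two unconnected subgraphs: {A, B} and {C, D}.
graph = {"A": {"B"}, "B": set(), "C": {"D"}, "D": set()}

# Before accepting a proposed link A -> C, check whether A and C are already
# connected; if not, the link joins two subgraphs and needs justification.
print(connected(graph, "A", "C"))  # -> False
```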
Removing links
When a link is removed, Contributors have the following responsibilities.
Author
- Must submit a complete change that does not require a response from Reviewers.
Software Reviewer & Trustable Reviewer
-
Confirm they agree it is possible for the destination item to be False and the source item to be True.
OR
-
Confirm the destination item is True if and only if a different combination of existing destination items is True.
Desirable properties
This procedure contributes to the properties of Completeness and Freedom from Intrinsic Logical Faults.
Removing Items
When an item is removed, Contributors have the following responsibilities.
Author
- Must submit a complete change that does not require a response from Reviewers.
Software Reviewer
- Confirms the item is not an Expectation for XYZ.
Trustable Reviewer
- Confirms the item has no links, Suspect or otherwise.
Desirable properties
This procedure contributes to the property of Completeness.
Updating scores
When a score is added or updated, Contributors have the following responsibilities.
Author
- Must submit a complete change that does not require a response from Reviewers.
Software Reviewer
- If a Validator is used, confirms the action of that Validator corresponds to the Statement made in the Evidence item.
Trustable Reviewer
- If a Validator is used, confirms it correctly uses the dotstop plugin system.
- If a Validator is used, confirms the calculated value can be interpreted as a probability. That is, it is non-dimensional and is confined to the interval \([0,1]\) (see the sketch at the end of this section).
- If an SME score is provided, confirms each of the named SMEs is an Author.
Desirable properties
This procedure contributes to the properties of Correctness, Precision, and Freedom from Intrinsic Logical Faults.
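For example, the probability check could be mechanised as a simple guard. This sketch deliberately does not reproduce the real trudag/dotstop Validator interface:

```python
# Illustrative only: a guard that checks a Validator's output can be
# interpreted as a probability. This sketch does not use the actual
# trudag/dotstop Validator interface, which is not reproduced here.

import math

def is_valid_score(score: float) -> bool:
    """A score must be a finite, dimensionless number in [0, 1]."""
    return (isinstance(score, (int, float))
            and math.isfinite(score)
            and 0.0 <= score <= 1.0)

assert is_valid_score(0.87)
assert not is_valid_score(1.2)            # outside [0, 1]: reject
assert not is_valid_score(float("nan"))   # not finite: reject
```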
Editing Items
When an item is edited, Contributors have the following responsibilities.
Author
- Must submit a complete change that does not require a response from Reviewers.
Software Reviewer & Trustable Reviewer
-
Confirm the changes result in a logically equivalent Statement.
OR
-
Confirm the changes result in a new Statement, and consider this as equivalent to the following actions in sequence: removing the original item, then adding a new item (following the relevant procedures for each).
Desirable properties
This procedure contributes to the properties of Correctness, Completeness and Freedom from Intrinsic Logical Faults.
Clearing Suspect links
All Contributors should consider the change as equivalent to adding a new link, then follow the relevant procedure.
Reviewing Items
All Contributors should consider the change as equivalent to adding a new item and (if applicable) adding scores, then follow the relevant procedure.
Modifying SME Scores
Sometimes you may need to re-review an item, but one or more of the original SMEs are unavailable to confirm or adjust their scores. In such cases, it is acceptable to exclude the unavailable SME as an Author and then set their score to zero.
Assigning zero is a conservative approach that prevents inflating the item score beyond what is supported by available SME input. Always include a clear note explaining the reason for the score change.
Alternatively, scores may be updated by new SME Authors drawn from the active maintainers. These changes must follow the same review and change-management rules as adding or deleting items, with the new Authors' names and scores replacing those of the original individuals. Any differences in assessment are recorded in the project history and should converge over time as the project matures and calibration improves.
However, you may find it is no longer possible for an existing SME to continue to re-evaluate their score. At this point, it is at the discretion of a Trustable reviewer to determine if it is acceptable to remove the existing SME scores. The Trustable reviewer should evaluate the necessity of the SME score removal, with the aim that existing SME scores are not discarded to circumvent the original contributor's evaluations.
The following provides some guidance for certain scenarios:
-
If an SME is no longer working on the project, but is still available for comment, they should be notified of the action and given the opportunity to express any concerns about the proposed scoring changes (for example, if in their opinion an SME was scoring too high, this should be discussed, and all SMEs involved should come to a common understanding of the context of the statement and the reasons why it might not be true).
-
If an SME is no longer working on the project, and is not available for comment for any reason, then it is up to the Trustable reviewer to decide the appropriate action to take. A Trustable reviewer may seek input from several other SMEs to provide scoring if they feel this is needed to ensure the score is accurate.