When Testspace is used during the development cycle, data generated from testing is automatically collected, stored, and continuously analyzed. This mined data is used to generate Insights:
metrics for assessing and making process decisions about the quality of the software development process.
The following is an overview of the three indicators:

- Results Strength – measures the stability of results and infrastructure
- Test Effectiveness – measures whether tests are effectively capturing side-effects
- Failure Resolution – measures whether failures are being resolved quickly and efficiently
Insights are input into decision-making and require interpretation based on the Project’s specific process.
Descriptions of the numbered areas:

1. Time Period Selection – with the number of complete result sets and the total number of test cases processed.
2. Results Strength Indicator – with associated metrics.
3. Test Effectiveness Indicator – with associated metrics.
4. Failure Resolution Indicator – with associated metrics.
5. Results vs. File Churn – comparative charts of Results Health and File Churn.
6. Passed Percentage Chart – with results counter for the selected period.
7. Regressions Chart – unique vs. recurring regressions, with counter for the selected period.
Results Strength – assesses the stability of results and infrastructure
By tracking the average Pass Rate and Health Rate – the percentage of passing test cases and healthy results – the Results Strength Indicator provides insight into the collective strength of all active Branches/Spaces (depending on Project Type) for the selected time period.
Health Rate is based on the status of all result sets, each of which exists in one of three states.
Test case failures can be exempted from the determination of results health; refer to How-to: Manage Health Status for more information.
| Status | Description |
|---|---|
| Healthy | 0 nonexempt test failures, and all metric criteria met |
| Unhealthy | 1 or more nonexempt test failures, or unmet metric criteria |
| Invalid | Excluded from the calculation of Health Rate; see How-to: Manage Health Status for details about an invalid result |
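The health determination above can be sketched as a small function. This is an illustrative sketch only; the field names (`nonexempt_failures`, `metrics_met`, `invalid`) are assumptions, not Testspace API names.

```python
# Hypothetical sketch of the health determination for a single result set,
# following the Healthy/Unhealthy/Invalid table above.

def health_status(nonexempt_failures: int, metrics_met: bool, invalid: bool) -> str:
    """Classify a result set as Healthy, Unhealthy, or Invalid."""
    if invalid:
        return "Invalid"      # excluded from the Health Rate calculation
    if nonexempt_failures == 0 and metrics_met:
        return "Healthy"      # 0 nonexempt failures, all metric criteria met
    return "Unhealthy"        # 1+ nonexempt failures, or unmet metric criteria

print(health_status(0, True, False))   # Healthy
print(health_status(2, True, False))   # Unhealthy
```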
The Results Strength Indicator is derived from the Pass Rate, Health Rate, and Invalid Rate as defined in the following table.
| Indicator | Pass Rate (PR) and/or Health Rate (HR) | Invalid Rate (IR) |
|---|---|---|
| Undetermined | | IR > 15% |
| Strong | PR > 95%, or HR > 80% | |
| Moderate | PR = 80 to 95%, or HR = 65 to 80% | |
| Weak | PR < 80%, and HR < 65% | |
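The thresholds above can be expressed as a simple classifier. The strong/moderate/weak level names here are an assumption about how the unlabeled indicator rows are ordered; only the PR/HR/IR thresholds come from the table.

```python
# Illustrative mapping of the Results Strength table to code.
# Rates are percentages (0-100).

def results_strength(pass_rate: float, health_rate: float, invalid_rate: float) -> str:
    if invalid_rate > 15:
        return "undetermined"    # too many invalid results to judge
    if pass_rate > 95 or health_rate > 80:
        return "strong"
    if 80 <= pass_rate <= 95 or 65 <= health_rate <= 80:
        return "moderate"
    return "weak"                # PR < 80% and HR < 65%

print(results_strength(97.0, 70.0, 5.0))   # strong
```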
The metrics associated with Results Strength are defined as follows:
| Metric | Description |
|---|---|
| Pass Rate | The average % of tests that passed. |
| Health Rate | The % of results that were healthy. Invalid results are not counted. |
| Health Recovery Time | The average time, in days, required for unhealthy results to turn healthy. |
| Invalid Rate | The % of invalid results, caused by missing metrics or a significant drop in test suite/case count. |
Note: due to rounding, the Pass Rate may show 100% even though regressions have occurred.
Test Effectiveness – measures whether tests are effective at capturing side-effects
This indicator is based on the premise that the purpose of automated testing is to capture commits that result in regressions. Tracking test regressions, especially for developer-focused changes, is one of the primary indicators of how effective CI-based testing is.
Rules of Regression: a failure is considered resolved only after subsequent non-failing statuses (providing a level of hysteresis).
The Test Effectiveness Indicator is derived from the Effective Regression Rate – the percentage of result sets with unique regressions – and the Results Strength, as defined in the table below.
| Indicator | Effective Regression (EF) |
|---|---|
| Undetermined | EF < 1% |
| Strong | EF > 5% to 30% |
| Moderate | EF = 1 to 5%, or EF > 30% to 50% |
| Weak | EF > 50% |
As with all indicators, Test Effectiveness should be viewed in the context of code churn. The Metrics published under Test Effectiveness are described below.
| Metric | Description |
|---|---|
| Effective Regression | The % of results with new test case failures (unique regressions), including invalid results. |
| Unique Regressions | The % of regressions that contain one or more new test failures. |
| New Failures | The total number of new test failures. |
| Spaces/Branches that Regressed | The % of Spaces (or Branches, depending on Project Type) that regressed at least once. |
Failure Resolution – measures whether test failures are being resolved quickly
The effects of letting failures drift during Continuous Integration are well understood. Failure Resolution provides a macro view of how timely new test failures are being resolved.
High Frequency (HF) Failures – test cases that failed three or more times over five consecutive results – are divided by the average Failures per Regression to calculate the HF Failure Rate.
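The "three or more times over five consecutive results" criterion can be sketched with a sliding window. The window-based implementation is an assumption about how this is computed; only the 3-of-5 rule comes from the text.

```python
# Hedged sketch: flag a test case as a High Frequency (HF) failure if it
# failed three or more times within any window of five consecutive results.

def is_hf_failure(statuses: list) -> bool:
    """statuses: per-result outcome strings for one test case, oldest first."""
    for i in range(max(1, len(statuses) - 4)):
        window = statuses[i:i + 5]               # up to 5 consecutive results
        if sum(1 for s in window if s == "fail") >= 3:
            return True
    return False

print(is_hf_failure(["pass", "fail", "fail", "pass", "fail"]))  # True
print(is_hf_failure(["pass", "fail", "pass", "pass", "fail"]))  # False
```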
| Indicator | Resolved Failures (RF) and Resolution Time (RT) | Total Failures (TF) |
|---|---|---|
| Undetermined | | TF = 0 |
| Strong | RF > 80%, and RT < 8 days | TF > 0 |
| Moderate | RF = 60 to 80%, and RT >= 8 days | TF > 0 |
| Weak | RF < 60% | TF > 0 |
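A sketch of these thresholds follows. The level names are assumptions for the unlabeled rows, and because the table's strong and moderate conditions leave some RF/RT combinations uncovered, the middle band below is treated as a catch-all between strong and weak; that closure is an assumption, not the documented rule.

```python
# Illustrative classifier for Failure Resolution. Rates are percentages.

def failure_resolution(rf: float, rt_days: float, tf: int) -> str:
    if tf == 0:
        return "undetermined"    # no failures in the selected period
    if rf > 80 and rt_days < 8:
        return "strong"
    if rf >= 60:
        return "moderate"        # e.g. RF 60-80% with RT of 8+ days
    return "weak"                # RF < 60%

print(failure_resolution(90.0, 5.0, 10))   # strong
```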
The Metrics published under Failure Resolution are described below.
| Metric | Description |
|---|---|
| Resolved Failures | The % of failures that were resolved. |
| Resolution Time | The average time, in days, required for test case failures to be resolved. |
| Failures per Regression | The average % of test failures per regression. |
| HF Failure Rate | The % of test failures per regression that are marked high frequency. |
The Resolved Failures rate (resolved/unresolved) may include unresolved failures that first occurred prior to the selected time period.
The timeline column chart compares collective results status – healthy, unhealthy and invalid – against file churn as a quantitative measurement of change. The counts are based on all result sets for all Spaces active during the selected time period. The chart provides a 12-month view of the Project with the selected time period highlighted.
The results counter reports the total number of result sets analyzed from all Spaces that were active during the selected time period.
The chart provides a proportional view of the average passing percentage for all result sets analyzed.
The percentage of passing test cases – with the rate of change shown below it – is reflected in the calculation of the Project's Results Strength Indicator.
The regressions counter reports the total number of regressions analyzed from all Spaces that were active during the selected time period.
The chart provides a proportional view of the two types of regressions – unique vs. recurring – for all result sets analyzed.
- A unique regression occurs when there are one or more new test case failures from the previous result set.
- A recurring regression occurs when all test case failures have recurred from the previous result set.
Regressions do not include failing metrics. The percentage of results with unique vs. recurring regressions is reflected in the Project's Test Effectiveness Indicator as defined above.
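The unique vs. recurring distinction above amounts to comparing the failing test cases of consecutive result sets. A minimal sketch, using hypothetical sets of failing test case names:

```python
# Classify the current result set's regression by comparing its failing
# test cases against the previous result set's failures.

def classify_regression(prev_fail: set, curr_fail: set) -> str:
    """Return 'unique', 'recurring', or 'none' for the current result set."""
    if not curr_fail:
        return "none"        # no failing test cases, so no regression
    if curr_fail - prev_fail:
        return "unique"      # one or more new test case failures
    return "recurring"       # all failures recurred from the previous set

print(classify_regression({"t1"}, {"t1", "t2"}))  # unique
print(classify_regression({"t1", "t2"}, {"t1"}))  # recurring
```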
Space/Branch Insights (depending on Project Type) publish rates and metrics calculated for each Space active at some point during the selected period.
The metrics help when assessing the readiness of the software associated with each Space/Branch. The meaning of readiness is specific to the Project – anything from "is a feature or bug fix ready to be merged?" to "is a product ready for customer release?"
All Space/Branch metrics must be viewed in the context of change activity, as measured by the number of Merged Pull Requests and File Churn.
Quality improvements can be tracked through Code Coverage Growth and decreases in Static Analysis Issues.
The metrics published for each active Space are defined below:
| Metric | Description |
|---|---|
| Pass Rate | The average passing % for the selected period. |
| Health Rate | The % of healthy results. Invalid results are not counted. |
| Results | The total number of results, including invalid. |
| File Churn | The total number of files changed. |
| Issues Decrease | The net decrease in static analysis issues (issues removed minus issues added). |
| Test Growth | The growth in new test cases. |
| Code Coverage (lines) Growth | The growth in line coverage. |
| Merged Pull Requests | The number of merged pull requests. |