Interpreting Results

Dashboard

Your dashboard is intended to give you a view of everything happening in your experiments at a glance.

Think of the metrics displayed here as the vital signs of your experiments:

[Image: Splitforce iOS, Android, and Unity testing dashboard]

Clicking the Top Performing Variation for any dashboard item takes you to that experiment’s Full Report. Reports display raw data, as well as metrics related to test performance and significance:

[Image: Splitforce dashboard showing mobile app test results]

Raw Data

Conversions is the total number of users who converted.

Events is the total number of events users completed.

Time is the total time users spent in a session, on a screenview, etc.

Quantity is the total quantity of something occurring in the app.

Users is the number of users included in the experiment to date.

Performance Metrics

Conversion Rate is a measure for Conversion Goals and is calculated as:

Unique Goals Reached / Users

Average Events per User is a measure for Event Goals and is calculated as:

Total Events / Users

Average Time per User is a measure for Timing Goals and is calculated as:

Total Time / Users

Average Quantity per User is a measure for Quantity Goals and is calculated as:

Total Quantity / Users
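
The four performance metrics above share one shape: a raw total divided by the number of users in the variation. A minimal Python sketch of that calculation follows. The function name `per_user_metric` and the sample numbers are illustrative assumptions, not part of the Splitforce SDK, and returning 0.0 for a variation with no users is a choice made here rather than documented behavior.

```python
def per_user_metric(total: float, users: int) -> float:
    """Return total / users for one variation.

    Assumption: a variation with no users yet reports 0.0 rather than
    raising an error. The source does not say how this case is handled.
    """
    if users <= 0:
        return 0.0
    return total / users


# Hypothetical raw data for a single variation.
conversion_rate = per_user_metric(120, 1000)      # Unique Goals Reached / Users
avg_events_per_user = per_user_metric(3400, 1000)  # Total Events / Users

print(conversion_rate, avg_events_per_user)
```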

Observed Improvement is the percent change of each variation’s performance compared to the baseline. It is calculated as:

((New Variation’s Performance - Baseline Variation’s Performance) / Baseline Variation’s Performance) * 100
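
As a worked illustration of Observed Improvement, the following Python sketch computes the percent change of a variation relative to the baseline. The function name `observed_improvement` is hypothetical, and raising an error for a zero baseline is an assumption made here, since the percent change is undefined in that case.

```python
def observed_improvement(variation_perf: float, baseline_perf: float) -> float:
    """Percent change of a variation's performance relative to the baseline."""
    if baseline_perf == 0:
        # Assumption: the source does not describe this case, so fail loudly.
        raise ValueError("baseline performance must be non-zero")
    return (variation_perf - baseline_perf) / baseline_perf * 100


# Hypothetical example: baseline converts at 10%, the new variation at 12%.
improvement = observed_improvement(0.12, 0.10)
print(f"{improvement:.1f}%")
```

A negative result means the variation is performing below the baseline.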

Significance Metrics

Confidence Intervals represent the range in which there is a 95% probability that the variation’s true performance lies. They are shown in numerical form underneath the variation performance metric, and graphically by the syringe graph to the right of it. Each interval is the variation’s observed performance plus or minus a margin, calculated as:

For rate goals, where p is the variation’s observed rate and n is its number of users:

+/- 1.96 * sqrt(p * (1 - p) / n)

For non-rate goals, where stdev is the standard deviation of the per-user values and n is the number of users:

+/- 1.96 * (stdev / sqrt(n))
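
The two margin formulas above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that p is the variation's observed rate, n is its number of users, and stdev is the standard deviation of the per-user values. The constant 1.96 is the normal critical value for a 95% interval.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence interval


def rate_margin(p: float, n: int) -> float:
    """Margin for rate goals: 1.96 * sqrt(p * (1 - p) / n)."""
    return Z_95 * math.sqrt(p * (1 - p) / n)


def mean_margin(stdev: float, n: int) -> float:
    """Margin for non-rate goals: 1.96 * (stdev / sqrt(n))."""
    return Z_95 * stdev / math.sqrt(n)


def confidence_interval(performance: float, margin: float) -> tuple[float, float]:
    """Lower and upper bounds: performance minus and plus the margin."""
    return (performance - margin, performance + margin)


# Hypothetical example: a 12% conversion rate observed over 1,000 users.
margin = rate_margin(0.12, 1000)
low, high = confidence_interval(0.12, margin)
print(f"{low:.4f} to {high:.4f}")
```

Because n sits under a square root, halving the width of an interval takes roughly four times as many users.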

Chance to Beat Baseline is the probability that the Observed Improvement is statistically significant. It is calculated as the p-value associated with the following z-statistic:

(New Variation’s Performance - Baseline Variation’s Performance) / sqrt(Standard Error of New Variation^2 + Standard Error of Baseline Variation^2)
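
The z-statistic can be computed directly from the two performance values and their standard errors. The Python sketch below does that and then converts z to a probability. The conversion step is an assumption: it uses the one-sided standard normal cumulative distribution, a common way to report a "chance to beat" figure, and the source does not confirm exactly how Splitforce derives the probability from z. The function names are hypothetical.

```python
import math


def z_statistic(new_perf: float, base_perf: float,
                se_new: float, se_base: float) -> float:
    """Difference in performance divided by the combined standard error."""
    return (new_perf - base_perf) / math.sqrt(se_new ** 2 + se_base ** 2)


def chance_to_beat_baseline(z: float) -> float:
    """Assumed conversion: one-sided standard normal CDF evaluated at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example: 12% vs. 10% with standard errors of 0.010 and 0.009.
z = z_statistic(0.12, 0.10, 0.010, 0.009)
print(f"z = {z:.2f}, chance = {chance_to_beat_baseline(z):.1%}")
```

Under this assumption, a z of 0 gives a 50% chance, and a z of about 1.645 corresponds to the 95% threshold.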

Estimated Users to Completion is an estimate of the number of new test participants needed to arrive at a statistically significant result, where the top-performing variation’s Chance to Beat Baseline is greater than or equal to 95%.