
Are you making things worse? Better results with mindful design and statistical trials.

Updated: Nov 16, 2023


Introduction

Today, it's far too common for highly touted solutions from well-intentioned teams to turn out, in hindsight, to be limited successes or outright failures.

Why does this happen? And why did these failed ideas sound so good at the time?

There are three recurring reasons why ideas can sound better than they actually are: inherent bias, modern system complexity, and how difficult it can be to measure the full effects of a change.

First, let's talk about bias.

No matter how smart and open-minded people are, they tend to protect their stated ideas and see what they're already looking for (or have been trained to see). As Abraham Maslow once said, “If the only tool you have is a hammer, you will start treating all of your problems like a nail”.

For an example, let's stick a toe into the turbulent waters of politics. Studies show that depending on a stated liberal or conservative stance, reasonable and intelligent people (not extremists) can be shown the exact same problem and will predictably favor opposing solutions. Why? Affirming group belonging sets a public expectation of alignment with that group, and groups often have favored information sources, which means the new information people consume is really reinforcement training in how to see the world.

This bias can easily cause us to select and justify bad or narrow ideas.

Second, let's talk about system complexity.

Modern problems are surrounded by complex processes, systems and services. Where we once could make targeted improvements in a controlled environment (like an old linear assembly line) while ignoring externalities (like worker safety, equality, or environmental harm), we now find ourselves struggling to improve areas of high complexity, dependency and distribution while also having to minimize externalities and harm.

This complexity requires a whole new way of thinking, which Jay Forrester named “systems thinking” in 1956. Systems thinking means looking at our problems in terms of relationships between elements and not as individual parts. Elements in complex systems interact with each other, which also means changing one element will change another, and often with unexpected and unintended results.

A great example of a complex system is the environment. Sure, you can dump pesticide into a pond to get rid of mosquito larvae, but as we all know now (or should know), that won't be the only thing that happens.

Combining our bias with complex system problems, let's explore a popular question - "why are people with opposing political views so certain of their different solutions to the same problem?"

Bias is one cause, but a not-so-obvious second cause is that each side is looking at individual ‘elements’ of the problem and not the ‘system’. At the element level, opposing groups can have high degrees of certainty because they are choosing to look at different areas of the problem space. (e.g. Imagine everyone wants a better pond to live beside, but one group sees a mosquito problem in the pond, while the other sees a declining fish, plant and wildlife population).

In today's media it is all too common to see "solutions" to complex problems that focus (often with fervor) on one element of the problem and promote simple and biased solutions ("let's dump pesticide in the pond - we all hate mosquitos!"). Sure, these over-simplified short-term solutions may address a targeted element of the problem, but step back and you can see how many of these 'solutions' will cause problems that are not being discussed.

As a personal thought exercise, take a moment and think about a complex human problem that matters to you - now think about the ideas that come to mind to fix it. Maybe you can spot your own biased focus on individual elements of the problem space and not the system?

For innovators driving change (especially those attempting bold changes) it's very important to design solutions with minimal personal bias and maximum understanding of the systems we are changing. Without this, our well-intended actions can cause more harm than good.

For my third and last point, I want to quickly discuss the challenges with measuring change in complex environments.

I'll start by stating the obvious: it's very hard to measure the broad effects of change on complex systems because of the scale of factors (the number of system elements) and the ripple effects that significant changes introduce across elements of the system (externalities).

And this is exactly why we built a powerful statistical trials engine directly into the DMS.

We wanted to empower our clients with the ability to test their innovative solutions to complex problems directly from their business systems (in this case DMS).

This fully integrated solution has two key advantages: it enables our clients to test in-system changes while collecting data specific to the change, and it allows them to identify and measure effects on the entire system through the DMS business intelligence and reporting.

In this article, I’ll describe my personal approach for solution design that minimizes bias and maximizes positive system-wide outcomes, and describe how you can build and leverage a statistical trials solution to test and measure your innovations and know that you are really making things better.

Let’s get started.


Significant innovations require ambitious goals; these were ours:

We could have designed our trial system to support the testing of a specific innovation or change (a one-off solution), but our more ambitious goal was to build a comprehensive solution that could be used to trial virtually anything in the system at any time. Our solution goals included:

  • The ability to trial any change to the DMS system or associated business process on any site, targeting any participant or staff member, and interactions with any feature or process.

  • The ability to run traditional systems and processes in parallel with changes for comparison (without time separation bias), and to run more than one trial at the same time in separate areas of the system or process so that there is no technical limitation on running trials in the organization.

  • The ability to measure the effect of changes (positive or negative outcomes) at the dispute file level, the process level, the participant level, and the interaction/feature level. To maximize measurement, this includes the ability to evaluate statistical (quantitative) outcomes, the delivery of custom surveys, and the gathering of user feedback and ratings.

  • The ability to fully control trials without code deployments or system downtime. This includes the ability to turn trial changes on and off immediately (manually), to set time-trial durations that automatically start and end trials, or to have trials automatically end when they reach defined, statistically relevant volumes.

  • The ability to combine trial data and reporting (including participants, interventions and outcomes) with any existing DMS data source. This enables existing business intelligence and reporting to be used to evaluate system-wide impacts and unexpected effects across the entire business (positive or negative externalities).

How do you design and validate the best solutions (and be ready for a potential trial)?

Before we describe our trials technology, I would like to share my approach for innovative change in complex systems.

When working on innovative solutions, it can be hard to define, select and refine the best possible solution without introducing bias. It can also be very hard to measure the positive and negative effects on the entire system, especially when this has not been well planned for in advance.

Although I always adapt this approach to the specific problem, the following is the general approach I have had consistent success with, both from a design and a measurement perspective. While listed sequentially here, this process can be run iteratively or in parallel depending on the scale of the solution and the availability and maturity of the team.

  • Define the core problem area you are trying to improve in terms of affected groups, affected systems, and affected processes. Be comprehensive and address as much of the system that will be affected as possible.

    • Tip: Include people who have different experience, priorities and perspectives in the problem space - but who also have a shared passion for finding the best solutions and outcomes. Avoid personalities that are closed-minded and dominant, as they tend to shut down necessary idea exploration and debate. People who are opinionated and can deeply defend their opinions can be great, and should not be confused with dominant people who try to enforce unbacked opinions or shut down exploration with self-serving motives.

    • Tip: Avoid jumping to solutions (a dangerous practice I outlined in a previous article) and work through the complexity of the problem space first. It's too easy to get attached to specific solutions which you then end up trying to make fit the problem ...when all you have is a hammer, every problem looks like a nail.

  • Define potential solution ideas as general 'what if we...' statements that also include short 'why' statements (expected benefits) and short 'how' statements (expected changes).

    • Tip: Start with the 'what if we...' and 'why' statements to capture ideas and intention, and then add in the 'how'. It's easy to fall into the trap of getting caught up in 'how' and to stop creating ideas that come from 'what if we...' and 'why'.

    • Tip: If a 'what if we...' idea is suggested but it is a verifiably bad idea or something that can never be done, move it into a 'probably a bad idea' list so it doesn't get the same attention as workable ideas.

  • From your list of affected groups, systems, processes, and solution ideas, create a comprehensive list of solution goals and ensure they explain as many intended positive outcomes across the problem space as possible.

    • Tip: Make this a living list and refer back to these goals often; you will be surprised how many goals are competing and how important these opposing goals can be for solution refinement. Competing goals might be things like: we should make submissions as fast and simple as possible for citizens, versus, we should make sure all submissions are complete and all special cases are addressed.

    • Tip: If you can, sequence your goals from most important to least important; this will help ensure that the things that matter most are never at the mercy of things that matter least.

  • Define a comprehensive list of solution risks and available mitigations. Spend the time to think of and list as many negative outcomes to as many elements and actors as possible. This is an important investment to avoid externalities (unintended harm).

    • Tip: Make this a living list and refer back to these risks often; you will be surprised how often variations of the same risks keep popping up, and how, with small refinements to the mitigations in your solution, you will end up with a greatly improved final solution.

  • Design a draft final solution (conceptual design) using all of the exploration information above. While drafting the solution, continually refine it for maximization of your stated goals and the minimization of your stated risks.

    • Tip: Always try to 'show' and don't just 'tell' - create UI mockups, process diagrams, and sequenced use case stories from the perspective of different actors to show the future as clearly as possible. Avoid long technical documents and excessive use of text or you will lose the opportunity for understanding and rapid feedback.

    • Tip: Where there are 'options' in the solution, make sure that you include them (and their benefits/risks) in your conceptual design.

  • Review the solution with subject matter experts. When reviewing your solution, document any new risks, ideas, or goals that are mentioned, and as necessary incorporate these into a refined final solution.

    • Tip: Set two expectations for evaluation teams: 'anyone can refute anything as long as they can back up their arguments' and 'as soon as the best idea is identified, it wins'. The ability to dispute anything (with justification) makes sure errors in the solution are caught and it can really help to reduce bias. As soon as a best idea becomes clear to the group, you should move inferior ideas aside.

    • Tip: If your first solution review exposes a lot of risks, ideas and goals, update your draft solution and review it again with the subject matter experts with a focus on where you have updated the solution based on their feedback.

  • And lastly, where there is high complexity and uncertainty with the proposed solution - design and run a statistical trial to measure the effects of the solution.

    • Note: Although there are many ways to conduct trials, our preferred method is to run the status quo systems and processes in parallel with improved systems and processes (A & B). This approach eliminates time-based bias that can occur in sequential trials (A then B). Our trials engine was designed to allow either approach, as some changes are too expensive and complex to run in parallel.

    • Note: As trial design and execution is a very complex subject in its own right, I am not going to focus on it here. To keep this article focused, I will provide a description of our integrated trials engine and an example of a real trial that we ran.

What are the key features and components of our integrated statistical trials engine?

For the trials engine that we built into the DMS, we landed on the following features. Directly after this description, I will provide an example of how these features were used to conduct a real world trial that was proposed by a Behavioral Insights Group.

  • Participant opt-in engine: Although not relevant to all trials, providing participants with the choice to participate (opt in or opt out) is important where:

    • The participant will experience a change that may not be as beneficial to them as previous methods (e.g. where there is a large expected organizational benefit that justifies the change, but it is not known whether the participant will benefit, so they should agree to the potential for a reduced personal benefit).

    • Extra time, rules adherence, and effort are required of participants compared to non-trial methods, and agreement to the additional effort is necessary.

    • Personal information will be gathered as part of the process and this needs to be agreed to prior to the trial.

  • For those that don't opt in, it's also important that your solution keeps opt-out participants (and opt-out dispute files) as a comparison group so that you can check for selection bias (i.e. to ensure that participants who choose to opt in are not actually different from those who do not, which would skew your trial results).

  • Randomization engine: Whether or not your trial uses opt-in, you still need a way to randomize exposure to the improvement (intervention) so that you can maintain a control group for comparison. This is done by randomly allocating dispute files or participants into treatment and control groups. This needs to be done in a way that is not visible to participants, to avoid problems like the Hawthorne effect (where individuals modify their behavior out of awareness that they are being observed or tracked). Once randomized, our participants and dispute files are categorized as follows (a minimal code sketch of this assignment appears after the list):

    • Standard (default for opt-out)

    • Treatment (opt-in and randomized to experience the change or intervention)

    • Control (opt-in and randomized to not experience the change or intervention)
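
To make that categorization concrete, here is a minimal sketch of how opt-in answers and random assignment could produce the standard, treatment and control groups. The names and the simple 50/50 split are assumptions for illustration, not the actual DMS implementation:

```python
import random

# Hypothetical group labels; the real DMS identifiers may differ.
STANDARD, TREATMENT, CONTROL = "standard", "treatment", "control"

def assign_trial_group(opted_in: bool, rng: random.Random) -> str:
    """Assign a dispute file (or participant) to a trial group.

    Opt-out files stay 'standard' so they can still be compared against the
    opt-in population (a guard against selection bias). Opt-in files are split
    50/50 between treatment and control, invisibly to the participant
    (a guard against the Hawthorne effect).
    """
    if not opted_in:
        return STANDARD
    return TREATMENT if rng.random() < 0.5 else CONTROL

# Example: assign a small batch of new applications reproducibly.
rng = random.Random(42)  # fixed seed purely for the illustration
applications = [("D-1001", True), ("D-1002", False), ("D-1003", True)]
assignments = {dispute_id: assign_trial_group(opted_in, rng)
               for dispute_id, opted_in in applications}
print(assignments)
```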

  • Trial controls: We wanted a high level of control over each trial. To enable this we included the following (a small configuration sketch follows the list):

    • The ability to target trials at internal (staff), external (consumers of the service) or internal & external audiences

    • The ability to set any trial to active, on hold or to end a trial at any time

    • The ability to set an automated start and end date for time-based trials

    • The ability to set a threshold for statistical relevance after which the trial automatically ends

    • The ability to shut off new trial participations (i.e. having no new disputes or participants added), but still allowing interventions, changes and outcomes associated to the trial to be completed by those already in the trial.
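
As a rough illustration of how these controls could fit together, this sketch models a trial's manual status, automated start and end dates, a pre-computed statistically relevant volume, and a switch for closing new participation. All names, fields and thresholds are assumptions for the example, not the actual DMS configuration:

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional

class TrialStatus(Enum):
    ACTIVE = "active"
    ON_HOLD = "on hold"
    ENDED = "ended"

@dataclass
class TrialControls:
    status: TrialStatus = TrialStatus.ACTIVE    # manual switch: active / on hold / ended
    start_date: Optional[date] = None           # optional automated start
    end_date: Optional[date] = None             # optional automated end
    target_volume: Optional[int] = None         # pre-computed statistically relevant volume
    accepting_new_participants: bool = True     # can be closed while in-flight trial work completes

    def effective_status(self, today: date, enrolled: int) -> TrialStatus:
        """Resolve the live status from the manual switch, dates and volume threshold."""
        if self.status is TrialStatus.ENDED:
            return TrialStatus.ENDED
        if self.end_date is not None and today > self.end_date:
            return TrialStatus.ENDED
        if self.target_volume is not None and enrolled >= self.target_volume:
            return TrialStatus.ENDED
        if self.start_date is not None and today < self.start_date:
            return TrialStatus.ON_HOLD
        return self.status

controls = TrialControls(start_date=date(2023, 1, 1), target_volume=800)
print(controls.effective_status(date(2023, 6, 1), enrolled=450))  # TrialStatus.ACTIVE
```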

  • Intervention (improvement) tracking: A critical component of any trial is the ability to have the treatment group experience the associated change that you are making. This means that you not only need participants to experience the change instead of the standard methods, but you also need to identify and target the change to specific participants and know who experienced it. To enable intervention tracking, we implemented the following capabilities (a small targeting sketch follows the list):

    • The ability to target and track interventions based on participant roles (including internal staff, and external consumers of services)

    • The ability to target and track interventions to everyone interacting with a dispute file or specific DMS feature

    • The ability to implement changes (treatments) to any DMS features and on any site while maintaining the regular methods.

    • The ability to control (limit) how many times a treatment is experienced by a participant, a target group, or on a dispute file for repetitive dispute file or feature accesses.
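
The sketch below illustrates the targeting and exposure-limiting ideas above: an intervention is shown only to treatment-group participants in a targeted role, and repeat exposures on the same feature are capped. The class and names are hypothetical, not the DMS API:

```python
from collections import defaultdict

class InterventionTracker:
    """Decide whether a participant should see a trial intervention and
    record who experienced it (illustrative names, not the DMS API)."""

    def __init__(self, target_roles, max_exposures=1):
        self.target_roles = set(target_roles)  # e.g. {"applicant"} or {"staff"}
        self.max_exposures = max_exposures     # cap for repeated file/feature access
        self.exposures = defaultdict(int)      # (participant_id, feature) -> times shown

    def should_show(self, participant_id, role, group, feature):
        # Only the treatment group, in a targeted role, under the exposure cap.
        if group != "treatment" or role not in self.target_roles:
            return False
        return self.exposures[(participant_id, feature)] < self.max_exposures

    def record_exposure(self, participant_id, feature):
        self.exposures[(participant_id, feature)] += 1

tracker = InterventionTracker(target_roles={"applicant"}, max_exposures=1)
if tracker.should_show("P-17", "applicant", "treatment", "evidence_upload"):
    tracker.record_exposure("P-17", "evidence_upload")  # shown once, then suppressed
```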

  • Outcome tracking: To ensure that we can measure the outcomes of the interventions, we included three methods of tracking: statistical trial outcomes (data and analysis based), surveys and questionnaires (subjective user feedback), and user ratings (quantitative rating-based feedback).

    • Note: It's important to keep outcome tracking separate from intervention tracking so the two sources can be flexibly deployed (e.g. a user experiences a change and provides immediate personal feedback, vs. a user experiences a number of changes and then staff provide feedback on the net outcome of those changes).

  • To enable full outcome tracking, we implemented the following capabilities:

    • The ability to display surveys or obtain ratings at any time, independent of the participant role, site or feature they are experiencing.

    • The ability to link surveys and ratings to specific participants, interventions, features and sites (e.g. at the last step of online intake by the applicant)

    • The ability to link surveys and ratings to participants and roles that did not receive interventions (e.g. for internal staff in the main DMS system to rate the online intake submitted by the applicant).

    • The ability to limit surveys and ratings to one per participant, user role (i.e. internal staff or external applicant) or dispute file, so that additional accesses to the same site, file or features do not trigger the unnecessary gathering of feedback that has already been provided (a minimal sketch of this rule follows below).
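
Here is a minimal sketch of that one-response rule, keyed on participant, role and dispute file so that repeat visits do not prompt for feedback twice. The names are illustrative rather than the actual DMS API:

```python
class OutcomeStore:
    """Collect survey answers and ratings, limited to one response per
    (participant, role, dispute file). Illustrative only."""

    def __init__(self):
        self.responses = {}  # (participant_id, role, dispute_id) -> feedback

    def needs_feedback(self, participant_id, role, dispute_id):
        return (participant_id, role, dispute_id) not in self.responses

    def record(self, participant_id, role, dispute_id, rating, survey_answers):
        key = (participant_id, role, dispute_id)
        if key in self.responses:
            return  # feedback already gathered; don't ask again on later visits
        self.responses[key] = {"rating": rating, "survey": survey_answers}

store = OutcomeStore()
if store.needs_feedback("P-17", "applicant", "D-1001"):
    store.record("P-17", "applicant", "D-1001", rating=4,
                 survey_answers={"ease_of_use": "The checklist was easy to follow"})
```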

  • Integrated statistical reporting: The last component of our integrated trials solution is probably the most important: the ability to report on your findings. To do this, we included the following core areas of statistical reporting (a small reporting sketch follows the list):

    • The ability to create custom reports using trial information for the management and measurement of the trial.

    • The ability to report on trial information merged with systems data so that you can contextualize the change from the perspective of treatment and control, with important business data.

    • The ability to use existing organization-wide business intelligence (optionally combining trial datasets) to identify system-wide effects and unexpected externalities.
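
As a simple illustration of the reporting approach, the sketch below joins a made-up anonymized trial export onto a slice of operational data using a shared dispute key, then compares groups side by side. The column names and values are invented for the example:

```python
import pandas as pd

# Hypothetical anonymized trial export: one row per dispute in the trial.
trial = pd.DataFrame({
    "dispute_key": ["a91f", "b22c", "c7d0"],
    "group": ["treatment", "control", "standard"],
})

# Hypothetical slice of existing DMS operational data, keyed the same way.
business = pd.DataFrame({
    "dispute_key": ["a91f", "b22c", "c7d0"],
    "evidence_files": [12, 31, 28],
    "hearing_minutes": [45, 60, 60],
})

# Merge trial context onto operational data, then compare groups side by side.
merged = trial.merge(business, on="dispute_key", how="left")
print(merged.groupby("group")[["evidence_files", "hearing_minutes"]].mean())
```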

Can you provide a real-world example of a trial that used these features in DMS?

One of the most complex areas of modern dispute resolution is evidence. This is mainly caused by how easy it is to create and provide evidence today.

The ability to take pictures, recordings, and screenshots of communications (texts, emails, chats, posts) makes it easy for a person in an emotional dispute to provide too much evidence of how they are being wronged. Add the 24/7 ability to submit evidence and this problem can get pretty bad. This convenience has created new risks around evidence volumes and quality (remember, with 'systems thinking' a positive change in one area like ease of creating and submitting evidence can cause problems in other areas).

Although the DMS has an advanced and comprehensive suite of digital evidence features that were designed to address the challenges of modern evidence submission (which I plan to cover in a future comprehensive design series article), our client was looking for behavioral 'nudges' that could further improve the evidence submitted by disputants.

For this trial, the client partnered with a behavioral insights group to come up with potential ideas (interventions) to improve submitted evidence. When their initial ideas were formed, we worked with the group to refine the trial based on extended goals and identified risks. We then worked to remove our bias and expand our understanding of possible network effects on the system. With the intervention designs finalized, we validated how we would measure the effects of the intervention and designed the associated reporting. We then built the intervention and deployed it as a parallel trial (A/B) using our integrated trials engine.

  • The trial goals: To improve 3 core areas of evidence submission.

    • Improved evidence quality: Poor quality evidence is defined as the right evidence, but provided in a way that makes it either unusable or difficult to use. Examples of poor quality evidence include: low-resolution images, blurry photos, long videos with no timestamps indicating the important sections, and documents with missing page numbers.

    • Improved evidence quantity: There is a Goldilocks zone for evidence: not too much and not too little. This zone depends on the complexity and type of issue being resolved. Too little evidence can cause negative resolution outcomes (failure to prove your point) and too much evidence creates extra work for everyone trying to make sense of it all. An example might be extensive damage to a rented apartment. A single photo is probably not enough to win an award for the full damages, but 200 photos and 4 long walkthrough videos is probably unnecessary and will just add a lot of extra time and overhead for resolution staff.

    • Improved evidence organization: Well-organized evidence helps achieve better resolution outcomes and improves resolution efficiency. By well-organized evidence we mean files that have a good name, a good description, are well structured, and are associated to a specific issue and a specific claim being made. For an example of well organized evidence let's return to the rented apartment damages example. Instead of just submitting a big stack of pictures, documents and receipts, imagine each separate area of damage having its own pictures of before the damage, after the damage, proof the tenant caused the damage, and a reputable company's estimate to repair the damage. This well organized evidence makes a very strong case that is very easy to navigate.

  • The theory being tested: The behavioral insights group theorized that people would provide better evidence if they knew that they only get one hearing, that the hearing has a fixed duration (if time runs out they may need to re-apply for issues that could not be dealt with), and that once their matters are decided they cannot get another hearing on the same matter (without appeal). The group believed that disputants are largely unaware of this reality, and that given this information, combined with some simple instructions on evidence quality, quantity and organization, they would improve their evidence out of a desire to get a favorable decision.

  • The selection of trial participants: The disputes and participants included in the trial were selected using the following methods:

    • With the trial turned on and the intake opt-in option enabled, all users submitting a new application are shown a question in the first step of their application asking if they want to opt in to the trial. The trial system stores this answer and does not ask again on subsequent applications, limiting the trial to one application per user (so that users with multiple applications do not skew the trial results).

    • For each application where the user opts in, the DMS trials engine randomly (and without the user knowing) assigns the dispute as 'treatment' (will get the interventions) or 'control' (won't get the interventions). Disputes that did not opt in are also included in the trial and identified as standard disputes. Tracking these files ensured that the trial was not biased by differences between applicants who decided to opt in and those who did not.

  • The system interventions implemented in DMS: The following interventions were included in the trial:

    • After the step in our intake where the applicant selects their dispute issues and moves on to provide their issue information and evidence, the applicant is shown the first intervention. The intervention paused their application activities and communicated that:

      • They will be assigned one of 40 arbitrators and get a hearing with a maximum duration of one hour

      • This arbitrator will consider relevant evidence only

      • Duplicated or unnecessary evidence will not help their case

      • Each piece of evidence submitted should provide new relevant information

      • Each piece of evidence should be given a descriptive name

    • When additional evidence was being submitted to the file prior to the hearing, additional participants on the treatment dispute file (not the applicant) were shown a similar but slightly modified intervention that communicated through a checklist visual that:

      • They will be assigned one of 40 arbitrators and get a hearing with a maximum duration of one hour, and they should make sure their evidence is:

        • Relevant and related directly to the case they are making

        • New and not duplicating information already provided

        • Descriptively named

  • The methods used for measuring trial outcomes: Although the integrated trial data was merged with DMS data for quantitative analysis, there was an additional need for subjective assessments. For this trial we included two methods for recording subjective outcomes:

    • An applicant survey capturing a rating of the user experience (obtained from the applicant at the end of the intake submission process). This was gathered from applicants for all control, treatment and opt-out dispute applications.

    • A resolution staff assessment of the associated evidence (after the hearing was completed), recorded as three separate ratings of quality, quantity, and organization. This was gathered for all control, treatment and opt-out dispute files.

  • The core trial reporting: There were a number of reporting and analysis datasets that were included in the trial. They addressed trial management, trial results, dependent datasets and external business intelligence and operational reporting.

    • The trial management report: This allows the behavioral insights group to monitor and manage the trial. At a high level this included:

      • The counts of opt-ins and opt-outs in the trial (to validate participation rates)

      • The counts of treatment, standard and control disputes in the trial (to validate expected trial volumes and to identify when statistically relevant volumes had been hit and the trial can be ended)

      • The status of each trial object and the associated intervention and outcome statuses (used to validate trial completeness rates)

    • The trial results report: This allows the behavioral insights group to test and refine trial data analysis methods and to analyze early outcomes of the trial. It also is the final dataset for the full analysis of the completed trial. At a high level this included:

      • All participants (and/or files) included in the trial, with relational system identifiers for joining this data with production system datasets.

      • All interventions the participants received, and all relevant information about those interventions

      • All outcomes the participants provided, and all relevant information about these outcomes (i.e. ratings, survey answers, unstructured feedback).

      • This data was all anonymized so that it could be used for analysis by external teams without any information privacy risks.

    • The dependent dataset reports: This operational data included all of the identified and anticipated areas of positive or negative change in the business. This data included relational identifiers for joining this data with the trial datasets.

      • This data was all anonymized so that it could be used for analysis by external teams without any information privacy risks.

    • Business intelligence and operational reporting: Used to analyze the effect of the trial on the broader business and look for externalities or unexpected effects. This data should provide good coverage of any service and process with even the slightest chance of being affected by the trial.

      • This data was used by the business for its own internal analysis due to the fluid nature of this work, and the business understanding required to conduct the analysis.

  • The trial result: I hate to do this to those who put in the effort to read this entire article, but the result is... a cliffhanger. At the time of writing, this trial is almost, but not yet, complete. The detailed analysis will start in the near future; the sketch below shows one way the staff ratings could be compared once the data is in.
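
For illustration only, here is one way the post-hearing staff quality ratings could be compared between treatment and control once the trial completes. The ratings below are placeholders, not trial results, and the choice of a rank-based test is simply a reasonable default for ordinal 1-5 ratings:

```python
from scipy.stats import mannwhitneyu

# Placeholder ratings only -- the real trial was not yet analyzed at the time
# of writing, so these values are invented purely to show the mechanics.
treatment_quality = [4, 5, 3, 4, 4, 5, 3, 4]
control_quality = [3, 3, 4, 2, 3, 4, 3, 3]

# Staff quality ratings are ordinal (1-5), so a rank-based test is a
# reasonable default for comparing the treatment and control groups.
stat, p_value = mannwhitneyu(treatment_quality, control_quality,
                             alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")
```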


In Closing

In closing, I'll admit this approach may seem like a lot of effort. That said, for any substantial innovation in a significantly complex high-volume system, doing it right the first time can lead to immediate benefits and much better long-term outcomes. Initial success also creates a foundation for additional improvements that push the benefits even further.

It's probably obvious, but I'll say it anyway. Failures in complex systems don't come without cost. We should all try to avoid the extra cost (and frustration) of having to abandon poorly designed solutions, or, even more regrettable, implementing harmful changes only to realize (too late to remove them) that the system is now permanently worse than if nothing had been done at all.

Some of you might be thinking 'if you aren't failing you aren't trying hard enough', which is a common innovation mantra, but recurring failures should only be seen as a positive when an innovation justifies a high failure rate (like inventing the light bulb).

Our world needs real solutions to complex problems. Although we can't expect to win every time, it's very important to use approaches that allow us to win more than we lose. Not only do these wins create much-needed progress, but they reward and energize our teams and businesses with the realization that our good intentions can become real and beneficial outcomes.

I hope this article helps you define better solutions for your complex systems without bias, and gives you new ideas for testing the solutions to know if you are really making things better. If you enjoyed this and are looking for other articles, check out our designing the future blog. Until our next article, keep on innovating and sharing your working solutions!


Mike Harlow

Solution Architect

Hive One Justice Systems, Hive One Collaborative Systems

©2022


Where can you get more information?

If you have questions specific to this article or would like to share your thoughts or ideas with us, you can reach us through the contact form on our web site.

If you are interested in the open source feature-rich DMS system that is the basis of the innovation in this article, visit the DMS section of our web site.

If you want to learn more about this solution, we offer live sessions and lectures that provide much greater detail (including diagrams, statistics and feature demonstrations) and allow breakout discussion around key areas of audience interest. To learn more about booking a live session, please contact us through our web site.

If you are an organization that is seeking information to support your own transformation, we provide the following services:

  • Current state analysis: where we evaluate your existing organizational processes and systems and provide a list of priority gaps, recommendations, opportunities and solutions for real and achievable organizational improvement.

  • Future state design: where we leverage our extensive technology, design, creativity and sector experience to engage in the "what is" and "what if" discussions that will provide you with a clear, compelling and achievable future state that you can use to plan your transformation.




