The “lean” movement has swept the enterprise world, but countless obstacles still keep product teams from adopting experiment-driven practices and making decisions based on customer data. To help teams make that transition, we work with Fortune 500 product teams, tracking their entire experiment process from hypothesis development to customer insight. After reviewing 2,000 experiments, we have learned a great deal about corporate culture, the nature of user research, and product management processes.

Today, our research clients include forward-thinking product teams from AT&T, Capital One, PwC, Aetna, and many more. Those 2,000 experiments produced 660 prototypes and feedback from nearly 400,000 users, including 46,000 minutes of video interviews, 6,500 charts, and hundreds of undoubtedly wise and useful product decisions.

We combed through the experiment data and interviewed clients to validate what we learned along the way. Here are seven of the most meaningful and actionable insights:

Change is hard.

As the old saying goes, keeping it real is hard work. As a startup, we have to keep a close eye on the market at all times, so talking to customers is an organic part of what we do. But, as we know, that rarely happens in a large organization.

Despite having long believed in the value of rapid prototyping and experimentation, Fortune 500 product managers typically juggle multiple projects competing for priority. User research is expensive and is performed monthly or quarterly by an internal team or an outside firm. Completing user research on a weekly or even daily basis is unheard of for them.

While “on-demand user insight” sounds appealing, in practice it challenges the conventions of many companies. One of the most ingrained is the bias toward heavy up-front planning. When the research cycle spans several months, it is crucial that every aspect be carefully crafted and vetted. But once you’ve accelerated the process to a few hours or days, iterations no longer have to be completely planned out.

Our data show how difficult it is to change mindsets and behaviors. At full capacity, each product team runs approximately 8-12 experiments on the platform. Even with internal working groups and outside help, clients still need three to six months to reach that cadence. Part of that time is spent learning how to quickly turn data into decisions, but much of it is the cultural cost of the product team’s transition from waterfall to agile, from plan-everything-up-front research to iteration-based research. A research plan outlined over two weeks is inevitably flawed, and it cannot match six iterations run in the same time frame.

Cindy Alvarez, Director of User Experience at Yammer, has repeatedly stressed the practical application of “lean” and “customer development” in large organizations. She urges people to stop drafting empty plans and talk directly to customers instead, because that is the most effective way to learn.

There’s no doubt she’s right, and that’s what we advocate too. We start new clients with baseline research, benchmarking customer insight about the competitiveness and usability of their respective products. So far, this has been a useful spur for product teams to start iterating.

Sometimes formal beats informal.

Continuing the theme of the previous insight: even at full speed, large companies can’t match the ad-hoc pace of experimentation that startups sustain. Our platform was initially designed so that any stakeholder could easily submit an impromptu experiment, which we assumed was the desired state of affairs. Instead of running experiments on the fly, though, clients usually submit experiment plans on a weekly cadence.

It turns out there’s a good reason for that. While ad-hoc experimentation makes sense in startups, it often doesn’t work in large organizations, perhaps because of their different project structures and internal stakeholders. Product managers actively consult stakeholder teams as they interpret customer feedback and determine next steps, and that usually requires a predictable, periodic rhythm to keep everyone on the same page.

That’s why concepts like the “design sprint” have come to the fore: they set aside time for coordination among the teams involved. We have come to accept the role that formality plays here, and we encourage regular “experiment sessions” across the organization in which testable hypotheses are drafted.

Product testing can be divided into distinct categories.

Before we could build a platform and workflows to speed up user research, we first had to better understand the types of user research product teams need. That’s why, before development, we manually ran 500 or so experiments using third-party tools.

We found that user research experiments involving prototypes (outside a production environment) fall into six basic categories, of which usability testing is the most widely known. Although our definitions of these experiment types aren’t universal, we were surprised by how well “rules of thumb” and configurable experiment templates worked for each type. You can read the details, including an overview of each category and how frequently each type of test is run, in our Prototype Guide.

We still have plenty of research to do, but this taxonomy of experiments helps user researchers take almost any client request and turn it into an executable study in a matter of minutes.
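The “configurable experiment template” idea can be sketched as a simple lookup that turns a stakeholder request into a ready-to-run study outline. This is a minimal illustration, not the platform’s actual system; apart from usability testing, every category name, sample size, and question below is an invented placeholder, since the six categories are not enumerated here.

```python
from dataclasses import dataclass, field

@dataclass
class ExperimentTemplate:
    """A reusable recipe that turns a stakeholder request into a runnable study."""
    category: str
    sample_size: int            # rule-of-thumb default, tunable per study
    questions: list = field(default_factory=list)

# Only "usability" is named in the article; the other category is a hypothetical stand-in.
TEMPLATES = {
    "usability": ExperimentTemplate(
        category="usability",
        sample_size=8,
        questions=["Try to complete the task. Where did you get stuck?"],
    ),
    "concept-preference": ExperimentTemplate(   # hypothetical category
        category="concept-preference",
        sample_size=50,
        questions=["Which of these two concepts would you choose, and why?"],
    ),
}

def build_study(request_category: str, hypothesis: str) -> dict:
    """Instantiate a study from a template in minutes instead of weeks."""
    template = TEMPLATES[request_category]
    return {
        "hypothesis": hypothesis,
        "category": template.category,
        "sample_size": template.sample_size,
        "questions": list(template.questions),
    }

study = build_study("usability", "New users can finish onboarding unaided")
print(study["category"], study["sample_size"])   # prints: usability 8
```

The point of the pattern is that the researcher supplies only a category and a hypothesis; everything else comes from pre-vetted defaults, which is what makes a minutes-long turnaround possible.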


All studies are biased.

Our platform mainly runs experiments in what we call “simulated environments.” Users who provide feedback know they are part of a study and are deliberately setting aside time for it. They interact with high-fidelity prototypes, knowing the products are not on the market.

We focus on this type of experiment because product teams can extract a lot of data from it while staying within the organization’s existing processes and risk tolerance. It doesn’t tie up in-house design and development resources, valuable customers aren’t turned into guinea pigs for half-finished products, and legal and compliance reviews aren’t triggered. Of course, the resulting data may not be as accurate as data from released products.

All studies, including ours, carry some bias. But that’s no excuse to avoid thorough user research. Embrace doing more experimental research while working to minimize bias, or you might miss the forest for the trees.

One of the core tenets of the scientific method is replicability: the results of any single experiment should be reproducible by another experiment. Yet we often see product teams use a single “statistically significant” result to confirm a pet hunch or pet project. There are factors that can, and almost always do, bias test results without any intentional error: asking leading questions or undersampling a segment of the target audience can skew the results of an individual test.
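The replication point can be made concrete with a quick simulation. When there is no real difference between two prototype variants, a conventional significance test will still flag roughly one comparison in twenty at p < 0.05. The sketch below is illustrative; the test statistic, sample sizes, and conversion rate are assumptions, not numbers from the article.

```python
import math
import random

def two_proportion_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates (pooled z-test)."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    # Two-sided p-value from the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(42)
TRUE_RATE = 0.10      # both variants convert at 10%: no real effect exists
N_PER_ARM = 200       # a small sample, typical of a quick prototype test
RUNS = 1000

false_positives = 0
for _ in range(RUNS):
    a = sum(random.random() < TRUE_RATE for _ in range(N_PER_ARM))
    b = sum(random.random() < TRUE_RATE for _ in range(N_PER_ARM))
    if two_proportion_pvalue(a, N_PER_ARM, b, N_PER_ARM) < 0.05:
        false_positives += 1

print(f"'Significant' results with no real effect: {false_positives}/{RUNS}")
```

Run many small tests and “significant” flukes are guaranteed to appear; replication is what filters them out.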


To derive value from individual experiments and customer data points, product teams need to practice validating data through iteration. Even if the results of any given experiment are biased or anomalous, they are offset by a steady stream of user research. The way to avoid chasing spurious results, if you will, is to be careful not to treat data as an actionable insight until a pattern has been rigorously established.

That’s why we made sure almost every experiment is studied qualitatively as well as quantitatively. We also strive to provide comparative insights; knowing how a user feels about a prototype in isolation is rarely meaningful. In the real world, users have a range of options to satisfy their needs, so feedback on a solution should always be framed against an alternative. These combined practices greatly reduce bias and yield enough data to identify patterns and insights. Finally, we also emphasize combining other data inputs, such as traditional market research and in-app analytics.

The feedback continues to amaze us.

You might think that after generating and analyzing vast amounts of user data, we would have “seen it all” in feedback and insights. That’s definitely not true: every day we are surprised anew by the huge difference between what users say and what they do.

People are notoriously bad at predicting their own future behavior, and the psychology behind this dynamic has been studied extensively. Even so, it was quite surprising to find nearly unanimous support for a feature in our research, followed by no user interest at all in the subsequent prototype. Putting a visual stimulus in front of the target market is absolutely necessary to validate results.

User research gets genuinely emotional.

Market trends change rapidly, and product teams scramble to keep up. Few things make them stop and simply watch the way a video of a user interview does. Some people laugh, some cry, some are delighted, some are shocked; user research really is an emotional roller coaster.

Passionate validation.

Companies often ask, “How do we know when a product concept has been validated with customers?” While we don’t have hard and fast rules, we half-jokingly apply the “Pokémon GO benchmark.” Having watched feedback on hundreds of mobile games, we know what enthusiasm looks like: players give detailed feedback on open-ended questions, spend a lot of time with prototypes, and come back regularly to use new features. Obviously, not every product needs to be a runaway hit, but Pokémon GO serves as a useful yardstick.
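The benchmark idea can be sketched as a simple scoring rule that compares a prototype’s engagement signals with those of a product known to generate real enthusiasm. All signal names and thresholds below are invented for illustration; the article provides no actual numbers.

```python
# Illustrative engagement signals; none of these thresholds come from the article.
BENCHMARK = {
    "avg_minutes_with_prototype": 12.0,   # time spent interacting
    "avg_open_answer_words": 40.0,        # detail in open-ended feedback
    "return_rate": 0.5,                   # share of testers who come back
}

def enthusiasm_score(observed: dict, benchmark: dict = BENCHMARK) -> float:
    """Average of each signal relative to the benchmark, capped at 1.0 per signal."""
    ratios = [min(observed[k] / benchmark[k], 1.0) for k in benchmark]
    return sum(ratios) / len(ratios)

lukewarm = {
    "avg_minutes_with_prototype": 3.0,
    "avg_open_answer_words": 10.0,
    "return_rate": 0.1,
}
print(f"score: {enthusiasm_score(lukewarm):.2f}")   # well below 1.0
```

A score near 1.0 would indicate Pokémon GO-level enthusiasm; most concepts will, and should be allowed to, land well below it.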


The point is, even when we think we understand a user segment well, the results are rarely predictable or obvious. You simply cannot overstate how difficult, and how rewarding, this work can be.

Shorter iterations unlock deeper insights.

When our early customers finally began running rapid experiments on beta builds, it became clear why meaningful customer insight had been so elusive for companies whose research took months to execute. Speed itself is key.

But once we accelerated the research process, clients were no longer satisfied simply knowing a concept was validated. They now had the time and energy to ask why a prototype was seen as more valuable than earlier iterations and alternatives. To keep up, we had to build a broadly adaptable qualitative workflow, so that we could go back to the user sample from a pilot study and ask open-ended follow-up questions. That is what unlocks deeper insights.

We define a deep insight as an understanding of the customer whose value extends beyond the individual project the product team is working on; it is useful to everyone in the organization focused on delivering value to the same market. Understanding how customers weigh an expensive one-time payment against a relatively cheap monthly subscription, for example, is not just a finding about one product. It is a meaningful insight that can be applied to other products across the organization’s portfolio, and it is exactly the kind of insight that rapid iteration makes possible.

Data is a means, not an end.

It’s easy to get lost in the noise instead of doing the hard work of discovering and improving ROI. Building the platform quickly forced us to examine what product teams actually expect from being “data-driven.”

Initially, we assumed that the data generated would translate directly into better product decisions. To some extent it does, and that matters to the organization as a whole. But when we really dug in, we found that being “data-driven” is not what product managers truly want or need.

We listened closely to how clients conveyed the value of their experiments to colleagues in other departments. More often than not, they mentioned how experimentation let their team rally around hypotheses rather than opinions. A team would spend 15 minutes framing a hypothesis and 15 minutes evaluating and interpreting the results, instead of wasting two hours in fruitless debate. One product manager described using experiments simply because the data helped him give weekly updates to his supervisor. Another iterated and learned through experimentation and gained a great deal from it. Data is vital to achieving these goals, but it is only a means, not an end.

Translated by Mili @ Shouting Technology. Original article: https://medium.com/pminsider/after-running-2-000-experiments-for-fortune-500-product-teams-heres-what-we-learned-c4123fb207c8#.6gi1n8h9u