Use cases
Experiments
Running experiments like A/B Testing and Multivariate Testing is a powerful technique in product development for continuous learning and iterating based on feedback. Featurevisor can help manage those experiments with a strong governance model in your organization.
What is an A/B Test?
An A/B test, also known as split testing or bucket testing, is a controlled experiment used to compare two or more variations of a specific feature to determine which one performs better. It is commonly used in fields like web development, marketing, user experience design, and product management.
It is common practice to call the default/existing behaviour the `control` variation, and the new/experimental behaviour the `treatment` variation.
Why run A/B Tests?
The primary goal of an A/B test is to measure the impact of the variations on predefined metrics or key performance indicators (KPIs). These metrics can include conversion rates, click-through rates, engagement metrics, revenue, or any other measurable outcome relevant to the experiment.
By comparing the performance of the different variants, statistical analysis is used to determine if one variant outperforms the others with statistical significance. This helps decision-makers understand which variant is likely to have a better impact on the chosen metrics.
Process of running an A/B Test
A/B testing follows a structured process that typically involves the following steps:
- Research and identify: Find a customer or business problem and turn it into a testable hypothesis by determining the specific element, such as a webpage, design element, pricing model, or user interface component, that will be subjected to variation.
- Power analysis: Determine if there's enough traffic or users to run the experiment and achieve statistical significance (a rough sample-size sketch follows this list).
- Create variations: Develop multiple versions of the element, ensuring they are distinct and have measurable differences.
- Split traffic or users: Randomly assign users or traffic into separate groups, with each group exposed to a different variant.
- Run the experiment: Implement the variants and collect data on the predefined metrics for each group over a specified period.
- Analyze the results: Use statistical analysis to compare the performance of the variants and determine if any differences are statistically significant.
- Make informed decisions: Based on the analysis, evaluate which variation performs better and whether it should be implemented as the new default or further optimized.
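As a rough illustration of the power analysis step, the sketch below estimates how many users each variation needs in a conversion-rate experiment. The 95% confidence / 80% power z-scores and the example numbers are common conventions used for illustration, not anything provided by Featurevisor.

```js
// Rough sample-size estimate per variation for comparing two conversion rates.
// Assumes a two-sided test at 5% significance and 80% power.
function sampleSizePerVariation(baselineRate, minDetectableEffect) {
  const zAlpha = 1.96; // z-score for 95% confidence (two-sided)
  const zBeta = 0.84; // z-score for 80% power

  const p1 = baselineRate;
  const p2 = baselineRate + minDetectableEffect;

  const variance = p1 * (1 - p1) + p2 * (1 - p2);

  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / minDetectableEffect ** 2);
}

// e.g. 5% baseline conversion rate, detecting an absolute lift of 1%
console.log(sampleSizePerVariation(0.05, 0.01)); // ≈ 8,146 users per variation
```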
What about Multivariate Testing?
A multivariate test is an experimentation technique that allows you to simultaneously test multiple variations of multiple elements or factors within a single experiment.
Unlike A/B testing, which focuses on comparing two or more variants of a single element, multivariate testing involves testing combinations of elements and their variations to understand their collective impact on user behavior or key metrics.
Difference between A/B Tests and Multivariate Tests
Oftentimes, A/B tests with 3 or more variations are referred to as A/B/n tests. We treat them both as A/B tests in this guide.
| | A/B Tests | Multivariate Tests |
|---|---|---|
| Purpose | Compare two or more variants of a single element | Simultaneously test multiple elements and their variations |
| Variants | Two or more variants (control and treatment) | Multiple variants for each element being tested |
| Scope | Focuses on one element at a time | Tests combinations of elements and their variations |
| Complexity | Relatively simpler to set up and analyze | More complex to set up and analyze |
| Statistical significance | Typically requires fewer samples to achieve significance | Requires larger sample sizes to achieve significance |
| Insights | Provides insights into the impact of individual changes | Provides insights into the interaction between changes |
| Test duration | Generally shorter duration | Often requires longer duration to obtain reliable results |
| Examples | Ideal for testing isolated changes like UI tweaks, copy variations | Useful for testing multi-factor changes like page redesigns, interactions between multiple elements |
Our application
For this guide, let's say our application consists of a landing page containing these elements:
- Hero section: The main section of the landing page, which includes:
  - headline
  - subheading, and
  - call-to-action (CTA) button
We now want to run both A/B Tests and Multivariate Tests using Featurevisor.
Understanding the building blocks
Before going further, we recommend learning about the building blocks of Featurevisor to understand the concepts used in this guide:
- Attributes: building block for conditions
- Segments: conditions for targeting users
- Features: feature flags and variables with rollout rules
- SDKs: how to consume datafiles in your applications
The quick start can be very handy as a summary.
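As a tiny refresher, attributes are plain YAML files describing the keys you later use in segment conditions and in the SDK context. A minimal sketch, assuming the attribute names used throughout this guide:

```yml
# attributes/deviceId.yml (illustrative)
type: string
description: Anonymous device identifier used for bucketing

# attributes/country.yml (illustrative)
type: string
description: Two-letter country code of the user

# attributes/deviceType.yml (illustrative)
type: string
description: Device type, such as iphone, android, or desktop
```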
A/B Test on CTA button
Let's say we want to run an A/B Test on the CTA button in the Hero section of your landing page.
The two variations for a simple A/B test experiment would be:
- control: The original CTA button with the text "Sign up"
- treatment: The new CTA button with the text "Get started"
We can express that in Featurevisor as follows:
```yml
# features/ctaButton.yml
description: CTA button
tags:
  - all

bucketBy: deviceId

variations:
  - value: control
    description: Original CTA button
    weight: 50

  - value: treatment
    description: New CTA button that we want to test
    weight: 50

environments:
  production:
    rules:
      - key: "1"
        segments: "*" # everyone
        percentage: 100 # 100% of the traffic
```
We just set up our first A/B test experiment that is:
- rolled out to 100% of our traffic (everyone)
- with a 50/50 split between the `control` and `treatment` variations
- to be bucketed against the `deviceId` attribute (since we don't have the user logged in yet)
Importance of bucketing
Featurevisor relies on bucketing to make sure the same user or anonymous visitor always sees the same variation no matter how many times they go through the flow in your application.
This is important to make sure the user experience is consistent across devices (if user's ID is known) and sessions.
You can read further about bucketing in the Featurevisor documentation.
The `deviceId` attribute can be a unique UUID generated and persisted on the client side, where the SDK evaluates the features.
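For example, in a browser you could generate and persist such an identifier once and reuse it in the SDK context on every evaluation. A minimal sketch, assuming localStorage is an acceptable place to keep it:

```js
// Generate a stable anonymous identifier per device/browser (sketch, not part of the SDK)
function getDeviceId() {
  const storageKey = "deviceId";
  let deviceId = localStorage.getItem(storageKey);

  if (!deviceId) {
    deviceId = crypto.randomUUID();
    localStorage.setItem(storageKey, deviceId);
  }

  return deviceId;
}
```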
If we wanted a more targeted rollout, we could use segments to target specific users or groups of users:
```yml
# features/ctaButton.yml
# ...

environments:
  production:
    rules:
      - key: "2"
        segments:
          - netherlands
          - iphoneUsers
        percentage: 100 # enabled for iPhone users in NL only

      - key: "1"
        segments: "*"
        percentage: 0 # disabled for everyone else
```
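For completeness, the two segments referenced above might be defined along these lines; the conditions are illustrative assumptions based on the attributes used in this guide.

```yml
# segments/netherlands.yml (illustrative)
description: Users from the Netherlands
conditions:
  - attribute: country
    operator: equals
    value: nl

# segments/iphoneUsers.yml (illustrative)
description: Users browsing on an iPhone
conditions:
  - attribute: deviceType
    operator: equals
    value: iphone
```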
You can read further about how segments are used in a feature's rules in the Features documentation.
Evaluating feature with SDKs
Now that we have defined our feature, we can use Featurevisor SDKs to evaluate the CTA button variation at runtime, assuming we have already built and deployed the datafiles to our CDN.
For Node.js and browser environments, install the JavaScript SDK:
```
$ npm install --save @featurevisor/sdk
```
Then, initialize the SDK in your application:
```js
import { createInstance } from "@featurevisor/sdk";

const f = createInstance({
  datafile: "https://cdn.yoursite.com/datafile.json",
  onReady: () => console.log("Datafile has been fetched and SDK is ready")
});
```
Now we can evaluate the `ctaButton` feature wherever we need to render the CTA button:
```js
const featureKey = "ctaButton";
const context = {
  deviceId: "device-123",
  country: "nl",
  deviceType: "iphone"
};

const ctaButtonVariation = f.getVariation(featureKey, context);

if (ctaButtonVariation === "treatment") {
  // render the new CTA button
  return "Get started";
} else {
  // render the original CTA button
  return "Sign up";
}
```
Here we see only two variation cases, but we could have had more than two variations in our A/B test experiment.
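For example, with a hypothetical third variation called `treatment2` (which would also need a matching entry in the feature's variations), a switch statement keeps the rendering logic readable:

```js
const ctaButtonVariation = f.getVariation(featureKey, context);

switch (ctaButtonVariation) {
  case "treatment":
    return "Get started";
  case "treatment2": // hypothetical third variation
    return "Join now";
  default:
    // control, or feature could not be evaluated
    return "Sign up";
}
```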
Multivariate Test on Hero element
Let's say we want to run a Multivariate Test on the Hero section of your landing page.
Previously we only ran an A/B test on the CTA button's text, but now we want to run a Multivariate Test on the Hero section affecting some or all its elements. We can map our requirements in a table below:
| Variation | Headline | CTA button text |
|---|---|---|
| control | Welcome | Sign up |
| treatment1 | Welcome | Get started |
| treatment2 | Hello there | Sign up |
| treatment3 | Hello there | Get started |
Instead of creating a separate feature per element, we can create a single feature for the Hero section and define multiple variables for each element.
The relationship can be visualized as:
- one feature
- having multiple variations
- each variation having its own set of variable values
```yml
# features/hero.yml
description: Hero section
tags:
  - all

bucketBy: deviceId

# define a schema of all variables
# scoped under `hero` feature first
variablesSchema:
  - key: headline
    type: string
    defaultValue: Welcome

  - key: ctaButtonText
    type: string
    defaultValue: Sign up

variations:
  - value: control
    weight: 25

  - value: treatment1
    weight: 25
    variables:
      # we only define variables inside variations,
      # if the values are different than the default values
      - key: ctaButtonText
        value: Get started

  - value: treatment2
    weight: 25
    variables:
      - key: headline
        value: Hello there

  - value: treatment3
    weight: 25
    variables:
      - key: headline
        value: Hello there

      - key: ctaButtonText
        value: Get started

environments:
  production:
    rules:
      - key: "1"
        segments: "*"
        percentage: 100
```
We just set up our first Multivariate test experiment that is:
- rolled out to 100% of our traffic to everyone
- with an even 25% split among all its variations
- with each variation having different values for the variables
Evaluating variables
In your application, you can access the variables of the `hero` feature as follows:
```js
const featureKey = "hero";
const context = { deviceId: "device-123" };

const headline = f.getVariable(featureKey, "headline", context);
const ctaButtonText = f.getVariable(featureKey, "ctaButtonText", context);
```
Use the values inside your hero element (component) when you render it.
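For example, a hero component might consume the evaluated values like this. This is a sketch assuming a plain function that returns HTML; the markup and class names are placeholders, so adapt it to your UI framework:

```js
// Sketch: render the hero section using the evaluated variable values.
function renderHero(f, context) {
  const featureKey = "hero";

  const headline = f.getVariable(featureKey, "headline", context);
  const ctaButtonText = f.getVariable(featureKey, "ctaButtonText", context);

  return `
    <section class="hero">
      <h1>${headline}</h1>
      <p>Subheading copy goes here</p>
      <button type="button">${ctaButtonText}</button>
    </section>
  `;
}
```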
Tracking
We have seen how to create features for defining simple A/B tests as well as more complex multivariate tests using variables in Featurevisor, and how to evaluate them at runtime in our applications when we need those values.
But we also need to track the performance of our experiments to understand which variation is performing better than the others.
This is where the `activate()` method of the SDK comes in handy. Before we call the method, let's first set up our activation event handler in the SDK initialization:
```js
import { createInstance } from "@featurevisor/sdk";

const f = createInstance({
  datafile: "https://cdn.yoursite.com/datafile.json",
  onReady: () => console.log("Datafile has been fetched and SDK is ready"),
  onActivation: (featureKey, variation, context, captureContext) => {
    // send the event to your analytics platform
    // or any other third-party service
  }
});
```
In the `onActivation` handler, we know which feature was activated and which variation was computed for the current user or device. From here, we are in full control of sending the event to our analytics platform or any other third-party service for further analysis.
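For instance, the handler could forward the exposure to whatever analytics client you already use. A minimal sketch, assuming a hypothetical `analytics.track()` API and an event name of your choosing:

```js
import { createInstance } from "@featurevisor/sdk";
import { analytics } from "./analytics"; // your own analytics client (hypothetical)

const f = createInstance({
  datafile: "https://cdn.yoursite.com/datafile.json",
  onActivation: (featureKey, variation, context, captureContext) => {
    // forward the exposure so results can later be analyzed per variation
    analytics.track("experiment_activated", {
      feature: featureKey,
      variation,
      deviceId: context.deviceId
    });
  }
});
```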
As an example, you can refer to the Google Tag Manager guide for tracking purposes.
Featurevisor is not an analytics platform
It is important to understand that Featurevisor is not an analytics platform. It is a feature management tool that helps you manage your features and experiments with a Git-based workflow, and evaluate them in your applications with its SDKs.
Activation
From the application side, we need to take the responsibility of activating the feature when we are sure that the user has been exposed to the feature.
For example, in the case of our CTA button experiment, we can activate the feature when the user sees the CTA button on their screen:
```js
const featureKey = "ctaButton";
const context = { deviceId: "device-123" };

f.activate(featureKey, context);
```
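One way to activate only when the user has actually seen the button is to wait until it scrolls into view. A minimal sketch using the browser's IntersectionObserver, reusing the `f`, `featureKey`, and `context` from above and assuming the button is rendered with a hypothetical `#cta-button` id:

```js
const ctaButtonElement = document.querySelector("#cta-button");

const observer = new IntersectionObserver((entries) => {
  if (entries.some((entry) => entry.isIntersecting)) {
    // the button is visible on screen now
    f.activate(featureKey, context);

    // activate only once per page view
    observer.disconnect();
  }
});

observer.observe(ctaButtonElement);
```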
Mutually exclusive experiments
Oftentimes when we are running multiple experiments together, we want to make sure that they are mutually exclusive. This means that a user should not be bucketed into multiple experiments at the same time.
In plainer words, the same user should only be exposed to one experiment at a time, avoiding any overlap between them.
One example: if User X is exposed to feature `hero`, which is running our multivariate test, then the same User X should not be exposed to feature `wishlist`, which is running some other A/B test in the checkout flow of the application.
For those cases, we recommend looking at the Groups functionality of Featurevisor, which helps you achieve exactly that without requiring any extra code changes in your applications.
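As an illustration, a group making our two features mutually exclusive could look roughly like the sketch below; the file name and slot percentages are assumptions, so refer to the Groups documentation for the exact format and constraints.

```yml
# groups/landingAndCheckout.yml (illustrative)
description: Mutually exclusive experiments on landing and checkout
slots:
  - feature: hero
    percentage: 50

  - feature: wishlist
    percentage: 50
```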
Further reading
We highly recommend reading and understanding the building blocks of Featurevisor, which will help you make the most out of this tool:
- Attributes: building block for conditions
- Segments: conditions for targeting users
- Features: feature flags and variables with rollout rules
- Groups: mutually exclusive features
- SDKs: how to consume datafiles in your applications
Conclusion
We learned how to use Featurevisor for:
- creating both simple A/B tests and more complex multivariate tests
- evaluating them at runtime in our applications
- tracking the performance of our experiments
- activating the features when we are sure that the user has been exposed to them
- making multiple experiments mutually exclusive if we need to
Featurevisor can be a powerful tool in your experimentation toolkit, and can help you run experiments with a strong governance model in your organization, given that every change goes through a Pull Request in your Git repository and nothing gets merged without reviews and approvals.