In user relationship management, there are some questions that go straight to the heart:

  • What is the value of this group of users?

  • Why intervene with this measure rather than another?

  • Why intervene with one group of users and not another, and what are the criteria?

These questions arise because the nature of customer value is poorly understood and users are not segmented effectively.

Fine-grained user operations have huge value

With the demographic dividend disappearing and growth peaking, there is an urgent need to extract more value from the existing user pool. The broad paid-acquisition ("buying volume") strategy of the past no longer works: first, the cost of acquisition keeps rising; second, the loyalty of users acquired this way is very low. Segmenting existing customer groups and directing traffic between them has become a top priority. The industry slang "washing users" refers to this strategy.

Different functions have different ideas about how to segment users. Product has a product view, which may be based on feature preference; operations has an operations view, defined by the mechanics of its various campaigns; even leadership has its own view. Whatever the angle of entry, though, segmentation must hold tightly to the core of the business.

What is the essence of business? Profit. We therefore evaluate and segment the user life cycle by monetary value.

User life cycle value is nothing new in academia; the theory was put forward in the 1980s. In the Internet industry, however, very little material is available. There are two likely reasons. First, the Internet boomed over the past 20 years and money was easy to make. Second, strategies differ inside each company, so no consistent definition could form.

But none of this is a reason not to use it. Here, we strip away the complex business logic and analyze user life cycle value and user state from the transaction point of view.

User Life Cycle Value (CLV)

As fine-grained operations roll out, users acquired through the broad paid-acquisition approach of the past no longer buy in. The minimum level of service that each user will accept varies. Using resources effectively according to user value, with maximum leverage, has become a matter of life and death for companies.

In the past, there was no unified theory for Internet applications or games. With some interdisciplinary thinking, however, it turns out that marketing research has already studied the problem and offers a modeling method that is both accurate and interpretable.

This method is called Customer Lifetime Value, or CLV (also LTV) for short.

What is CLV

User life cycle value is a way of characterizing users. It is generally used to solve two types of problems:

  1. How much a user is worth, in order to measure the input-output ratio.

  2. After intervening with users, how to optimize resource allocation according to the change in user life cycle value.

These are the two core issues of user management: the value of users and the effectiveness of policies.

Note that CLV as used here requires a non-contractual product. The most typical contractual product in China is the contract mobile phone plan. Contractual relationships are relatively rare among general Internet products.

CLV's user base must have already made a transaction; non-paying users are not considered. Of course, the concept can be transferred: if payments are replaced by activity or content consumption, the model can handle that as well.

What questions does CLV answer

Whether a user is active or lost, how much payment potential the user has, and whether the user will buy again at some point in the future: these are the three questions that user life cycle value can answer.

How to introduce CLV into your product

Application scenarios

  • Determine the phase of the user’s life cycle

  • Predict the probability of purchase within the specified period

  • Predict the life cycle value of users

  • Predict future payments using historical payment data

Definition of activity and loss

Definition:

Users are active when they interact

Loss occurs when users do not interact for a period of time
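
A rule-based version of this definition can be sketched in a few lines. The 30-day window and the function name `is_lost` are assumptions for illustration only; the right window depends on the product.

```python
from datetime import date


def is_lost(last_active: date, today: date, inactivity_days: int = 30) -> bool:
    """Return True when the user has not interacted for more than `inactivity_days`."""
    return (today - last_active).days > inactivity_days


today = date(2014, 12, 31)
print(is_lost(date(2014, 12, 20), today))  # False: last active 11 days ago
print(is_lost(date(2014, 10, 1), today))   # True: silent for 91 days
```

A fixed cutoff like this treats every user the same way, which is the rigidity the probabilistic model in the following sections avoids.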

Introducing the Lifetimes toolkit

Install the Python toolkit:

pip install lifetimes

CLV data mining

Three metrics are required to determine a user's life cycle state:

  1. frequency: the number of repeat active periods, that is, the number of days on which the user logged in or purchased, minus the first one.

  2. recency: the time between the user's first activity and most recent activity.

  3. T: the user's age, from first activity to the end of the observation period.

For payment forecasts, you also need the average amount of money a user pays.
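
As a rough sketch of how these quantities relate to a raw transaction log, the following computes them in plain Python. The sample rows, the `summarize` helper and the day-level granularity are assumptions for illustration, and monetary value is taken here as the simple mean order amount.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction log: (customer_id, purchase date, amount)
transactions = [
    ("a", date(2014, 1, 1), 10.0),
    ("a", date(2014, 1, 1), 5.0),    # same day as the row above: one purchase day
    ("a", date(2014, 1, 11), 20.0),
    ("a", date(2014, 2, 10), 30.0),
    ("b", date(2014, 3, 1), 8.0),
]
observation_end = date(2014, 3, 31)


def summarize(transactions, observation_end):
    """Build per-customer frequency, recency, T (all in days) and monetary_value."""
    by_customer = defaultdict(list)
    for customer_id, day, amount in transactions:
        by_customer[customer_id].append((day, amount))

    summary = {}
    for customer_id, rows in by_customer.items():
        days = sorted({day for day, _ in rows})  # distinct purchase days
        first, last = days[0], days[-1]
        summary[customer_id] = {
            "frequency": len(days) - 1,              # repeat purchase days
            "recency": (last - first).days,          # age at latest purchase
            "T": (observation_end - first).days,     # age at end of observation
            "monetary_value": sum(a for _, a in rows) / len(rows),
        }
    return summary


for customer_id, row in summarize(transactions, observation_end).items():
    print(customer_id, row)
```

Customer "b" bought only once, so frequency and recency are both 0 while T still grows with the observation window.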

Data acquisition

Fetch from the database

SELECT
      customer_id,
      COUNT(distinct date(transaction_at)) - 1 as frequency,
      datediff('day', MIN(transaction_at), MAX(transaction_at)) as recency,
      AVG(total_price) as monetary_value,
      datediff('day', MIN(transaction_at), CURRENT_DATE) as T
    FROM orders
    GROUP BY customer_id

Python processing

from lifetimes import BetaGeoFitter
from lifetimes.datasets import load_transaction_data
from lifetimes.utils import summary_data_from_transaction_data

transaction_data = load_transaction_data()
print(transaction_data.head())
"""
                  date  id
0  2014-03-08 00:00:00   0
1  2014-05-21 00:00:00   1
2  2014-03-14 00:00:00   2
3  2014-04-09 00:00:00   2
4  2014-05-21 00:00:00   2
"""

summary = summary_data_from_transaction_data(transaction_data, 'id', 'date', observation_period_end='2014-12-31')
print(summary.head())
"""
    frequency  recency      T
id
0         0.0      0.0  298.0
1         0.0      0.0  224.0
2         6.0    142.0  292.0
3         0.0      0.0  147.0
4         2.0      9.0  183.0
"""

# fit the BG/NBD model on the per-customer summary table
bgf = BetaGeoFitter(penalizer_coef=0.0)
bgf.fit(summary['frequency'], summary['recency'], summary['T'])


from lifetimes.datasets import load_cdnow_summary
data = load_cdnow_summary(index_col=[0])

print(data.head())
"""
    frequency  recency      T
ID
1           2    30.43  38.86
2           1     1.71  38.86
3           0     0.00  38.86
4           0     0.00  38.86
5           0     0.00  38.86
"""

BG/NBD model

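BG/NBD is an improved version of the classical Pareto/NBD model. For the detailed mathematical derivation, see: A Note on Deriving the Pareto/NBD Model and Related Expressions

The model has the following assumptions:

  • While active, a user's transactions follow a Poisson process with rate λ.

  • λ varies across users according to a gamma distribution with shape r and rate α.

  • After each transaction, a user becomes inactive with probability p.

  • p varies across users according to a beta distribution with parameters a and b.

  • λ and p vary independently across users.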

Four parameters were obtained by model fitting.

from lifetimes import BetaGeoFitter

# similar API to scikit-learn and lifelines.
bgf = BetaGeoFitter(penalizer_coef=0.0)
bgf.fit(data['frequency'], data['recency'], data['T'])
print(bgf)

# table of the fitted parameters with their standard errors
print(bgf.summary)
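
To see what the four fitted parameters stand for, the purchase process that BG/NBD assumes can be simulated directly. This is a minimal sketch under the standard BG/NBD description (Poisson purchasing while active, a chance of dropping out after each purchase, gamma and beta heterogeneity across users), not the library's code. The parameter values are made up rather than fitted, and `simulate_customer` is a hypothetical helper.

```python
import random


def simulate_customer(r, alpha, a, b, T, rng):
    """Simulate one customer's repeat purchases over (0, T] under BG/NBD assumptions."""
    lam = rng.gammavariate(r, 1.0 / alpha)  # purchase rate ~ Gamma(shape=r, rate=alpha)
    p = rng.betavariate(a, b)               # dropout probability ~ Beta(a, b)
    t = 0.0
    purchase_times = []
    while True:
        t += rng.expovariate(lam)           # Poisson purchasing: exponential gaps
        if t > T:
            break                           # observation window ended
        purchase_times.append(t)
        if rng.random() < p:
            break                           # customer drops out after this purchase
    frequency = len(purchase_times)
    recency = purchase_times[-1] if purchase_times else 0.0
    return frequency, recency, T


rng = random.Random(42)  # fixed seed so the run is repeatable
# Illustrative (not fitted) parameters: r, alpha, a, b
for _ in range(5):
    frequency, recency, T = simulate_customer(0.25, 4.0, 0.8, 2.5, T=38.0, rng=rng)
    print(frequency, round(recency, 2), T)
```

Each simulated row has the same shape as the frequency, recency, T summary that the fitter consumes, which is why those three columns are enough to estimate the parameters.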



Effect visualization

from lifetimes.plotting import plot_probability_alive_matrix

plot_probability_alive_matrix(bgf)



The bottom-right corner holds the best customers: high transaction frequency and a long span between first and latest purchase. Customers in the upper-right corner made many transactions within a short time and then went quiet, so they are most likely lost.
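
The probability plotted in this matrix has a closed form in the BG/NBD model. The following is a minimal sketch, assuming the standard P(alive) expression for BG/NBD; `probability_alive` is a hypothetical stand-alone function rather than the library call, and the parameter values are illustrative, not fitted.

```python
def probability_alive(frequency, recency, T, r, alpha, a, b):
    """P(alive) under BG/NBD for a customer with the given purchase history."""
    if frequency == 0:
        # In BG/NBD, dropout can only happen right after a repeat purchase
        return 1.0
    ratio = (a / (b + frequency - 1)) * ((alpha + T) / (alpha + recency)) ** (r + frequency)
    return 1.0 / (1.0 + ratio)


# Illustrative (not fitted) parameters
params = dict(r=0.25, alpha=4.0, a=0.8, b=2.5)

# Same number of purchases, very different recency
steady = probability_alive(frequency=20, recency=35, T=38, **params)
burst = probability_alive(frequency=20, recency=5, T=38, **params)
print(steady, burst)
```

The first customer kept buying until shortly before the end of the observation window and scores far higher than the second, who bought just as often but stopped early.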

Predict the purchase behavior of individual users

t = 10  # predict purchases in 10 periods
individual = summary.iloc[20]
# The function below is an alias of `bgf.conditional_expected_number_of_purchases_up_to_time`
bgf.predict(t, individual['frequency'], individual['recency'], individual['T'])
# 0.0576511

Life cycle value forecasting

To predict value, you need a fourth input: the average value of a user's transactions.

This model rests on an important premise: purchase frequency and purchase amount are uncorrelated. For details, see The Gamma-Gamma Model of Monetary Value

from lifetimes.datasets import load_cdnow_summary_data_with_monetary_value

summary_with_money_value = load_cdnow_summary_data_with_monetary_value()
summary_with_money_value.head()
# keep only customers with at least one repeat purchase
returning_customers_summary = summary_with_money_value[summary_with_money_value['frequency'] > 0]

print(returning_customers_summary.head())
"""
             frequency  recency      T  monetary_value
customer_id
1                    2    30.43  38.86           22.35
2                    1     1.71  38.86           11.77
6                    7    29.43  38.86           73.74
7                    1     5.00  38.86           11.77
9                    2    35.71  38.86           25.55
"""

Correlation test

returning_customers_summary[['monetary_value', 'frequency']].corr()
"""
                monetary_value  frequency
monetary_value        1.000000   0.113884
frequency             0.113884   1.000000
"""
# the correlation is weak, so the independence premise roughly holds

from lifetimes import GammaGammaFitter

ggf = GammaGammaFitter(penalizer_coef = 0)
ggf.fit(returning_customers_summary['frequency'],
        returning_customers_summary['monetary_value'])
print(ggf)
# <lifetimes.GammaGammaFitter: fitted with 946 subjects, p: 6.25, q: 3.74, v: 15.45>

Average transaction value estimate

print(ggf.conditional_expected_average_profit(
        summary_with_money_value['frequency'],
        summary_with_money_value['monetary_value']
    ).head(3))
"""
customer_id
1    24.658619
2    18.911489
3    35.170981
"""

Total value estimate

Finally, a discounted cash flow (DCF) calculation turns the predicted future payments into a present-value estimate of total user value.

# refit the BG model to the summary_with_money_value dataset
bgf.fit(summary_with_money_value['frequency'], summary_with_money_value['recency'], summary_with_money_value['T'])

print(ggf.customer_lifetime_value(
    bgf,  # the model to use to predict the number of future transactions
    summary_with_money_value['frequency'],
    summary_with_money_value['recency'],
    summary_with_money_value['T'],
    summary_with_money_value['monetary_value'],
    time=12,  # months
    discount_rate=0.01  # monthly discount rate ~ 12.7% annually
).head(3))
"""
customer_id
1    140.096211
2     18.943467
3     38.180574
Name: clv, dtype: float64
"""
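
The discounting step behind this estimate can be sketched on its own. This is a simplified illustration, not the library's implementation: the per-month expected purchase counts and the average order value are made-up inputs, and `discounted_value` is a hypothetical helper.

```python
def discounted_value(expected_purchases_per_month, avg_order_value, monthly_discount_rate):
    """Sum each month's expected revenue, discounted back to the present."""
    total = 0.0
    for month, purchases in enumerate(expected_purchases_per_month, start=1):
        total += purchases * avg_order_value / (1 + monthly_discount_rate) ** month
    return total


# A 1% monthly rate compounds to roughly 12.7% per year
annual_rate = (1 + 0.01) ** 12 - 1
print(round(annual_rate, 4))

# Hypothetical forecast: expected purchases in each of the next 3 months
print(discounted_value([0.5, 0.4, 0.3], avg_order_value=25.0, monthly_discount_rate=0.01))
```

Later months contribute less for two reasons: the model expects fewer purchases as the chance of the user still being active falls, and each unit of revenue is discounted more heavily.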

Conclusion

The user life cycle value model differs from other approaches: it models each user individually instead of applying a rigid cutoff on days since last activity, which makes it very flexible. Once each user's life cycle phase and life cycle value are captured, the next step is application.

Application scenarios vary, but getting upstream and downstream teams on board still takes a compelling reason. The advice here is to backtest on historical data and let the data demonstrate the effect.
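
One way to run such a backtest is to fit on an earlier calibration window and compare predictions with what users actually did afterwards. The sketch below shows only the split and the comparison in plain Python; the sample rows, the predicted values and both helper names are hypothetical. The lifetimes package ships a helper of this kind, `calibration_and_holdout_data` in `lifetimes.utils`.

```python
from collections import Counter
from datetime import date


def split_for_backtest(transactions, calibration_end):
    """Split (customer_id, date) rows into calibration rows and holdout purchase counts."""
    calibration = [(c, d) for c, d in transactions if d <= calibration_end]
    known = {c for c, _ in calibration}
    # Count distinct holdout purchase days, only for customers seen during calibration
    holdout_days = {(c, d) for c, d in transactions if d > calibration_end and c in known}
    counts = Counter(c for c, _ in holdout_days)
    return calibration, {c: counts.get(c, 0) for c in known}


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and observed holdout purchases."""
    return sum(abs(predicted[c] - actual[c]) for c in actual) / len(actual)


# Hypothetical transaction log
transactions = [
    ("a", date(2014, 1, 5)), ("a", date(2014, 2, 1)), ("a", date(2014, 7, 1)),
    ("b", date(2014, 3, 1)),
    ("c", date(2014, 8, 1)),  # first seen after the cutoff, so excluded
]
calibration, actual = split_for_backtest(transactions, date(2014, 6, 30))

# Hypothetical model predictions for the holdout window
predicted = {"a": 0.5, "b": 0.25}
print(mean_absolute_error(predicted, actual))
```

Reporting an error like this on past data, next to the error of the rule currently in use, gives other teams something concrete to evaluate.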
