• 2013 Real Time Bid Optimization with Smooth Budget Delivery in Online Advertising
  • The pictures and data are from the original text, and the formula numbers are consistent with the text for convenience
  • The cover is from Visual China

1 background

Why Budget control (Budget Pacing)?

In general, using a budget smoothing consumption constraint (which means buying no more than a certain percentage of exposure of interest before a set time) prevents an advertiser from burning through the budget prematurely or avoiding explosive spending rates.

Typically, a smooth budget delivery constraint, expressed as not buying more than a set fraction of the impressions of interest before a set time, is used to prevent the campaign from finishing the budget prematurely or avoiding a bursty spending rate.

This limitation often helps advertisers to have a sustained impact on their ads, avoid heavy advertising at peak times (with potential performance degradation), and explore a wider audience.

Budget control can avoid two things:

  • Campaign stops early: The advertiser wants his campaign to spend its budget early in the day to avoid missing the opportunity for the rest of the day, as shown in Figure 1 (a)

  • Spending fluctuations: Advertisers expect to be able to analyze the effectiveness of their advertising campaigns on a regular basis, and high budget fluctuations make the consistency of the results questionable, as shown in Figure 1 (b)

If the advertising budget for a day is divided equally over time, as shown in Figure 1 (c), two problems arise:

  • Traffic Issues: Online transactions can vary greatly throughout the day, depending on the target audience. For example, in the first half of the day can receive more relevant traffic than in the second half of the day, if the budget is not uniformly allocated, may not be able to consume the budget on the day or buy low-quality traffic in the second half of the day. Adopting uniform budget controls for traffic can solve this problem to some extent, as shown in Figure 1 (d)

  • Performance issues: The quality of online traffic can change throughout the day for different audience segments. Whether measured by CPC(cost-per-click), CPA(cost-per-Action), CTR(click-through-rate) or AR(Action-rate), Budget control algorithms should allocate most of their budget to high-quality hours of the day. As shown in Figure 1 (e), and there are usually few options for high-quality cycles. This may lead to large fluctuations and may violate budget smoothing consumption limits

The article contribution

The contribution of this paper is to propose an online method to optimize performance metrics while meeting the budget smoothing consumption limit of each AD, as follows

  • The control feedback loop is used to estimate the future expenditure rate iteratively to achieve the budget smoothing consumption constraint

  • Payout rates are used to select high-quality exposures and adjust bids based on previous effect allocations to maximize effect targets

2 methods

The paper carries out budget control respectively for the two kinds of advertising, and the two kinds of advertising are

  • Flat CPM Campaigns: Bid is a fixed price and the advertising evaluation index is CTR or AR

  • Dynamic CPM (dCPM) Campaigns: Optimize advertising bid based on different traffic. The advertising evaluation metrics are eCPC and eCPA

The unified optimization framework is as follows


min C T R . A R . e C P C  or eCPA   s.t.  t T s ( t ) B Or less ϵ s ( t ) b t Or less Delta t. t t { 1 . . T } e C P M Or less M ( 2 ) \begin{array}{ll}\min & -\mathrm{CTR},-\mathrm{AR}, \mathrm{eCPC} \text { or eCPA } \\ \text { s.t. } & \left|\sum_{t}^{T} \mathbf{s}(t)-B\right| \leq \epsilon \\ & \left|\mathrm{s}(t)-b_{t}\right| \leq \delta_{t} \quad \forall t \in\{1, \ldots, T\} \\ & e C P M \leq M\end{array} \quad(2)

The first constraint is the constraint of the total budget of the day. The time of the day is divided into T parts, s(T) S (T)s(T) is the budget consumed in time period T. The second constraint is the constraint of budget smoothing consumption, that is, the constraint of budget consumption at time T; The third constraint is that the eCPM of the day cannot exceed the upper limit M, that is, cannot overbid.

In this formula, the optimization parameter is B (t)b(t)b(t) because the total budget BBB and average exposure cost upper limit MMM are set by the advertiser.

2.1 Budget smoothing consumption

The original idea for budget control was to take the daily budget as input and calculate the consumption schedule for each AD in real time. Based on the attrition plan, the DSP will attempt to spread out the exposure acquisition activities over a day. Divide the day into T time slots and allocate a budget for each AD during each slot.

To see the relationship between the bid optimization system’s burn rate (pacing rate) and other parameters, refer to the following equation:


s ( t ) = j I t c j x j imps ( t ) reqs ( t ) bids ( t ) reqs ( t ) imps ( t ) bids ( t ) reqs ( t ) pacing_rate ( t ) win_rate ( t ) ( 3 ) \begin{aligned} \mathbf{s}(t) & =\sum_{j \in \mathbb{I}_{t}} c_{j} x_{j} \propto \operatorname{imps}(t) \\ & \propto \operatorname{reqs}(t) \frac{\operatorname{bids}(t)}{\operatorname{reqs}(t)} \frac{\operatorname{imps}(t)}{\operatorname{bids}(t)} \\ & \propto \operatorname{reqs}(t) \cdot \operatorname{ pacing\_rate }(t) \cdot \operatorname{win\_rate }(t) \end{aligned} \quad(3)

Where S (t) S (t) S (t)s(t) is the budget consumed at time t, reqs(t)reqs(t) is the number of advertising requests that meet the directional condition at t moment, so that the number of bids(t) is the number of advertising requests received at T moment. Imps (t) IMps (t) is the number of exposures at time t. In the above way, pacing rate and Win rate are defined.

Note: Reqs (t) reQS (t)reqs(t) can be replaced by a constant if it is assumed that those AD requests that satisfy an advertising target audience are evenly distributed among all incoming AD requests.

To make progress, it is desirable to adopt a dynamic sequential approach, in which feedback is obtained from the previous interval expenditure and the consumption rate is adjusted for the next interval. By calculating the ratio (3), the consumption rate for the next interval t+1t+1t+1 can be obtained by simple recursion.

pacing_rate
( t + 1 ) (t+1)

= = = pacing_rate ⁡ (t) s (t + 1) s ⁡ (t) reqs ⁡ (t) reqs ⁡ (t + 1) win_rate ⁡ (t) win_rate ⁡ (t + 1) (t) \frac{\operatorname{s}(t+1)}{\operatorname{s}(t)} \frac{\operatorname{reqs}(t)}{\operatorname{reqs}(t+1)} \frac{\operatorname{win\_rate }(t)}{\operatorname{win\_rate }(t+1)}(t)s(t)s(t+1)reqs(t+1)reqs(t)win_rate(t+1)win_rate(t)

= = = pacing_rate ⁡ bt + 1 s (t) (t) reqs ⁡ (t) reqs ⁡ (t + 1) win_rate ⁡ (t) win_rate ⁡ (t + 1) and (4) (t) \ frac {b_ {t + 1}} {\ operatorname} {s (t)} \frac{\operatorname{reqs}(t)}{\operatorname{r e q s}(t+1)} \frac{\operatorname{w i n \_ r a t e}(t)}{\operatorname{w i n \_ r a t e}(t+1)} \quad(4)(t)s(t)bt+1reqs(t+1)reqs(t)win_rate(t+1)win_rate(t)(4)

Where REps (t+1) REps (t+1) REps (t+1) and win_rate(t+1)win\_rate(t+1)win_rate(t+1) are the number of requests and winning rate predicted for the next time interval. You can make predictions using historical data and only care about the ratio of these parameters to their values, not necessarily their absolute values.

Set the future expenditure in (4), i.e. S (t+1)s(t+1)s(t+1) s(t+1) to be equal to the ideal expected expenditure at the time interval t+1 bt+1b_{t+1}bt+1 in order to impose budget constraints. Different options for BT +1b_{t+1} BT +1 introduce different strategies for budget control. For example, a simple strategy, called Uniform control (uniform pacing), tries to average out the budget for a given activity over the course of a day. By defining future expenditure BT +1ub_{t+1}^ubt+ 1U (u stands for unity) as


b t + 1 u = ( B m = 1 t s ( m ) ) L ( t + 1 ) m = t + 1 T L ( m ) ( 5 ) b_{t+1}^{u}=\left(B-\sum_{m=1}^{t} \mathbf{s}(m)\right) \frac{L(t+1)}{\sum_{m=t+1}^{T} L(m)} \quad(5)

Where, the first item represents the remaining budget, and the ratio of the second period t+1t+1t+1 to the remaining time, if the time interval is equal, there is the following relationship


b t + 1 u = ( B m = 1 t s ( m ) ) 1 T t ( 6 ) b_{t+1}^{u}=\left(B-\sum_{m=1}^{t} \mathbf{s}(m)\right) \frac{1}{T-t} \quad(6)

The proposed strategy is to spend more money on an AD during periods when there is a greater chance of getting an event of interest (click or convert). To do this, look at historical campaign data and measure the performance of your ads at each time period. On this basis, a discrete probability density function is established, which is described by a series of click or transition probabilities: P0… , pTp_0… , p_Tp0… , pT assumes that there are T intervals every day, ∑tTpt=1\sum_t^{T}p_t=1∑tTpt=1. Now at each interval, calculate the ideal payout for the next interval bt+ 1Pb ^p_{t+1}bt+1p (p denotes probability), as shown below


b t + 1 p = ( B m = 1 t s ( m ) ) p t + 1 L ( t + 1 ) m = t + 1 T p m L ( m ) ( 7 ) b_{t+1}^{p}=\left(B-\sum_{m=1}^{t} \mathbf{s}(m)\right) \frac{p_{t+1} \cdot L(t+1)}{\sum_{m=t+1}^{T} p_{m} \cdot L(m)} \quad(7)

If the time interval is equal, the relationship is as follows


b t + 1 p = ( B m = 1 t s ( m ) ) p t + 1 m = t + 1 T p m ( 8 ) b_{t+1}^{p}=\left(B-\sum_{m=1}^{t} \mathbf{s}(m)\right) \frac{p_{t+1}}{\sum_{m=t+1}^{T} p_{m}} \quad(8)

Note: If for some JJIs pj=0p_j = 0Pj =0, then the AD will not spend money in that time frame, and therefore, it will never explore opportunities that arise in the JJJ time frame. To avoid this situation, bt+1ub_{t+1}^ubt+ 1U and BT + 1Pb ^p_{t+1}bt+1p can be combined.

2.2 Select high quality AD requests

After calculating the bid rate, each AD can use this information to adaptively select certain high-quality traffic to bid on and adjust the bid price to maximize revenue. Flat CPM campaigns and Dynamic CPM (dCPM) campaigns will be discussed below.

Flat CPM campaigns

Bid at a fixed price C ∗ C ^{*}c∗ without knowing the probability of success (no information on CTR or AR). In order to consume the smoothing constraint, the exposure needs to satisfy the following constraint (the generalized second bid is not considered, and the charge is considered as bid)


i m p s ( t ) = s ( t ) c \operatorname{imps^{*}(t)} = \frac{s(t)}{c^{*}}

When so many exposures are generated, the following constraints are met


b i d s ( t ) = i m p s ( t ) win_rate(t) \operatorname{bids^{*}(t)} = \frac{\operatorname{imps^{*}(t)}}{\operatorname{win\_rate(t)}}

Similarly, the need to obtain so many bidding opportunities to meet the following constraints


r e q s ( t ) = b i d s ( t ) pacing_rate(t) \operatorname{reqs^{*}(t)} = \frac{\operatorname{bids^{*}(t)}}{\operatorname{pacing\_rate(t)}}

These AD requests are selected from a set of incoming AD requests that have a high click-through or conversion rate. To this end, an empirical histogram of the CTR or AR distribution QT (x) q_T (x)qt(x) is constructed based on the historical data of each activity, where QT (x) q_T (x)qt(x) represents the number of AD requests in interval T that are considered to have CTR or AR of X, for example, see Figure 2.

The on-line algorithm in this paper finds a threshold τ(t)\tau(t)τ(t) in the interval t to filter AD requests with low CTR or AR in qt(x) q_T (x)qt(x) region, thus achieving smooth control (to calculate CTR and AR in real time). The threshold can be expressed as


tau ( t ) = arg min x x 1 q t ( s ) d s reqs ( t ) \tau(t)=\arg \min _{x}\left|\int_{x}^{1} q_{t}(s) d s-\operatorname{reqs}^{*}(t)\right|

Qt (x) q_T (x) QT (x) Is calculated based on historical data. If the current AD request distribution does not meet qt(x) q_T (x)qt(x), it will cause oscillations in the interval. To prevent this from happening, the confidence interval of the threshold parameter τ(t)\tau(t)τ(t) is evaluated. The first to use the following method to update the threshold tau (t) \ tau tau (t) (t) average mu tau (t) \ mu_ {\ tau} (t) mu tau sigma tau 2 (t) and variance (t) \ sigma_ {\ tau} ^ {2} sigma tau 2 (t) (t).


mu tau ( t ) = mu tau ( t 1 ) + 1 t ( tau ( t ) mu tau ( t 1 ) ) \mu_{\tau}(t)=\mu_{\tau}(t-1)+\frac{1}{t}\left(\tau(t)-\mu_{\tau}(t-1)\right)


sigma tau 2 ( t ) = t 1 t sigma tau 2 ( t 1 ) + 1 t ( tau ( t ) mu tau ( t 1 ) ) ( tau ( t ) mu tau ( t ) ) ( 10 ) \sigma_{\tau}^{2}(t)=\frac{t-1}{t} \sigma_{\tau}^{2}(t-1)+\frac{1}{t}\left(\tau(t)-\mu_{\tau}(t-1)\right)\left(\tau(t)-\mu_{\tau}(t)\right) \quad(10)

τ(t)\ Tau (t)τ(t) satisfies a Gaussian distribution, So tau (t) \ tau tau (t) the upper and lower bounds of the (t) for mu tau (t) + gamma sigma tau (t) (d) \ mu_ {\ tau} (t) + \ gamma \ frac {\ sigma_ (t)} {\ tau} {\ SQRT (d)} mu tau (t) + gamma sigma tau (d) and (t) Mu tau (t) – gamma sigma tau (t) (d) \ mu_ {\ tau} (t) – \ gamma \ frac {\ sigma_ (t)} {\ tau} {\ SQRT (d)} mu tau (t) – gamma sigma tau (d) (t), where d is to look at history to statistic the number of days.

The critical value γ=1.96\gamma=1.96γ=1.96 provides a 95% confidence interval. Update the upper and lower CTR or AR thresholds at each interval.

When an AD request reaches the system, its CTR or AR value is first estimated by a statistical model. If the predicted value is greater than the upper limit of the threshold, the AD request is retained and the fixed bid V ∗ V ^{*} V ∗ is bid.

If the predicted value is less than the lower limit of the threshold, the AD request is simply discarded without further processing. If the predicted value is between the upper and lower bounds, AD requests will be randomly selected with a probability equal to the consumption rate pacing_rate(t)pacing\_rate(t)pacing_rate(t). This scheme, while approximate, ensures that opportunity exploration continues on the high – and low-quality AD request boundaries while satisfying the budget smoothing consumption constraints.

Dynamic CPM Campaigns

With dCPM, bids can be dynamically modified for each AD with the goal of getting more quality exposure at a lower cost. Firstly, a bidding histogram is constructed, and the average value of cic_ICI bid in t interval is denoted as C ∗ C ^{*} C ∗, which corresponds to the bid in Flat CPM campaigns.

Then, starting with a basic bid price, take into account the current pacing_rate(t)pacing\_rate(t)pacing_rate(t), raise or lower the bid price appropriately to meet budget smoothing. Take CPA billing as an example to discuss

Note: pacing_rate(t)pacing\_rate(t) Pacing_rate (t) controls the strategy of this bid

If you don’t bid high enough, you may not win the bid; If the price is too high, it will lead to a rise from CPA. To adjust the bid, set two thresholds for pacing_rate(t)pacing\_rate(t)pacing_rate(t) 0≤β1≤β2≤10 \leq \beta_1 \leq \beta_2 \leq 10≤β1≤β2≤1

Consider the three intervals of pacing_rate(t)pacing\_rate(t)pacing_rate(t)

  • Safe zone: when pacing_rate(t)≤β1pacing\_rate(t) \leq \beta_1pacing_rate(t)≤β1, there is no situation that can’t be used up due to audience orientation (budget is small enough, relative concept)

  • Key region: when β1

  • Danger zone: When β2

The typical BIDDING objective for dCPM is to meet or exceed the CPA target G(as compared to eCPA). Use this target value to define a base bid price UI =AR×Gu_i =AR \times Gui=AR×G, where AR is the predicted result of the model for the current request. If the bid is in the key area, use c^ I = UI \hat{c}_i = u_IC ^ I = UI as the bid, without much adjustment of the bid.

Lower the bid in the safe zone, assuming that in the event the actual bid is CIC_ICI, and a histogram of θ=cic^ I \theta = \ FRAc {c_i}{hat{c} _I}θ=c^ ICI can be established, which can be found at 1% to 2% of the bottom of the histogram, Bid c^ I =θ∗ UI \hat{c} _I = \theta^{*} u_IC ^ I =θ∗ UI

In the danger zone, there are reasons why consumption cannot be achieved:

  • Audience targeting is too restrictive, so there aren’t enough AD requests to bid on

  • Even with frequent bidding, the bids were not high enough to win a public auction

For the second case, raise the bid and consider a bid cap C. The bid price is increased by using the parameter ρ∗≥1\rho^{*} \geq 1ρ∗≥1, and with the increase of pacing_rate(t)pacing\_rate(t)pacing_rate(t), a linear increase method is used


rho = 1 + C / c 1 1 Beta. 2 ( pacing rate ( t ) Beta. 2 ) ( 11 ) \rho^{*}=1+\frac{C / c^{*}-1}{1-\beta_{2}}\left(\operatorname{pacing}_{-} \operatorname{rate}(t)-\beta_{2}\right) \quad(11)

Final bid for c ^ I = rho ∗ UI \ hat _i = {c} \ rho ^ {*} u_ic ^ I = rho ∗ UI, c ∗ c ^ {*} c ∗ is the average of the bid, history and dynamic change (change slower).

3 Practical Problems

3.1 Cold start Problem

Click or convert events take some time to be fed back to the system when a new activity starts, so there is not enough information to perform CTR or AR estimation and bid optimization. This is called the cold start problem. Following a similar line of thinking, recommend lists of high quality sites and audience groups by deducing similarities between existing activities and applying content characteristics such as user and publisher attributes. In addition, epsilon greedy strategy based on context is adopted in online bidding optimization process. If the incoming AD request is within one of the recommended publishers or audience groups, a higher bid will be made. Otherwise, AD requests will randomly select a default bid price to explore unseen sites and users. As time goes on and data accumulates, the activity of online search will decrease, and conventional prediction models will play a major role in bidding optimization.

3.2 Prevent overexpenditure

Since budget expenditures are speed-controlled, total daily expenditures may exceed the allocated daily budget B if a sudden flood of advertising requests occurs. To overcome this problem, several monitoring procedures have been implemented to regularly check the total daily budget expenditures and to interval expenditures in each time period T. If the total cost exceeds the daily budget B, the activity will stop completely.

If the interval exceeds BT +δ b_T +\deltabt+δ, bidding will be temporarily suspended until the next time slot T +1.

3.3 Distributed Architecture

Figure 3 shows a simplified algorithm flow chart for each individual AD request. Note that this workflow of submitting bids takes less than 50 milliseconds to complete and processes nearly a million AD requests a second. Thus, the entire BID optimization is implemented and parallelized across multiple distributed computing clusters across multiple data centers. The offline training process uses R, Pig, and Hadoop to generate layered CTR and AR estimates and train ad-specific prediction models over a large number of ads. The online process streams incoming AD requests to a number of servers and evaluates bids through the real-time message bus.

4 experimental

4.1 Control Policy Comparison

Offline experiment, refer to the original text for details. Compare the uniform control strategy (Shown in Formula 6), the proposed control strategy (shown in Formula 8) and the ideal budget control respectively. Comparing the uniform control strategy with the ideal budget control, the results are shown below, and the error is less than 1% of the total budget.

By comparing the proposed control strategy with the ideal budget control, the results are shown as follows, and the error is 2.3% of the total budget.

4.2 Online experimental results

See the original text for experimental Settings, and the results are shown in the following three tables

Conclusions are as follows:

  • Click-through rates are up. They’re up dramatically

  • CPC goes up, CPA goes down

5 to discuss

  • When selecting high quality traffic to bid, adjust according to pacing_rate(t)pacing\_rate(t)pacing_rate(t), However, pacing_rate(t)pacing\_rate(t) Pacing_rate (t) is allocated according to the distribution of traffic on the day. When the interval traffic is high, pacing_rate(t)pacing\_rate(t)pacing_rate(t) is also high. Is it unreasonable to lower the bid price at this time without considering the competition between advertisers to make bid landscape prediction?

  • The upper and lower bounds of τ(t)\tau(t)τ(t) are used for smoothing control, which is reasonable, but the t time interval budget itself is not directly adjusted

This article is participating in the “Nuggets 2021 Spring Recruitment Campaign”, click to see the details of the campaign