The mission of Pinterest's ads engineering team is to provide the best service experience for our advertising partners, and ad overdelivery is one of the problems we are working to solve. At Pinterest we use Kafka Streams, which can propagate predicted ad spend data to thousands of ad-serving nodes within seconds. This article first explains what overdelivery is, and then shares how we used Kafka Streams to build a prediction system that provides near-real-time spend forecasts to reduce overdelivery.

About overdelivery

When an advertiser's budget runs out, the platform can no longer charge for that advertiser's ads if it keeps serving them, a phenomenon known as overdelivery. Overdelivery reduces the opportunity for other advertisers with remaining budget to reach potential customers with their products and services.

To reduce the overdelivery rate, we need to address two aspects:

  1. Real-time spend calculation: Information from ad impressions should be fed back to the ads system within seconds, so that the system can shut off campaigns that have exhausted their budget.
  2. Spend forecasting: Besides relaying spend that has already occurred in a timely way, the system should also be able to forecast future spend. When certain campaigns are expected to hit their budget ceiling, they should be slowed down so that they approach the ceiling smoothly. An ad that has already been served stays in the user interface, and the user can still act on it later; this behavioral lag makes it hard to measure ad spend accurately over a short time span. The delay is unavoidable, and the only thing we know for certain is the ad insertion event itself.

Let's illustrate with an example. Suppose there is an internet company that serves ads, and advertiser X buys ad service from the company at a price of $0.10 per impression with a budget of $100 per day. That means the ad can be shown at most $100 / $0.10 = 1,000 times per day.

The company quickly implemented a simple and straightforward serving system for its advertisers:

When a new ad display opportunity appears on the site, the front end requests an ad from the ads inventory. The inventory decides whether to serve advertiser X's ad based on the remaining budget. If the budget is still sufficient, it tells the front end to show the ad (for example, in an ad slot in the client app). When the user views the ad, an impression event is sent to the billing system.

However, when the company examined its earnings, it found that things had not gone as expected.

Advertiser X's ad was actually shown 1,100 times, but because the budget was only $100, the average price per impression came out to just $100 / 1,100 ≈ $0.09. The extra 100 impressions were given away for free and could have been used to show other advertisers' ads. This is the industry's well-known overdelivery problem.

So why does overdelivery happen? In this example, the cause is that the billing system's response time is too long: suppose the system takes 5 minutes to process an impression, and that delay produces the overdelivery. The internet company therefore adopted some optimizations to improve system performance, and succeeded in making an extra $9, because 90 of the 100 wasted impressions could now go to other advertisers with budget remaining. That reduced the overdelivery rate to 10/1,000 = 1%.

Soon after, another advertiser, Y, contacted the company and wanted to buy ads with a budget of $100 per day at $2.00 per click (for example, a user clicking an ad link that leads to advertiser Y's own website), i.e., at most 50 clicks per day. The company added advertiser Y to its ad serving flow and added click-event tracking to its system.

After one day, the company's ads system had overdelivered again.

When the bill was settled, advertiser Y had unexpectedly received 10 free clicks! The internet company realized that even with a billing system that was fast enough, there was no way to predict whether an ad would be clicked, and without this information about future spend, overdelivery could never be avoided.

The heroes of this example finally found a clever solution: calculate a predicted spend for each advertiser. Predicted (inflight) spend is money that has effectively been committed by ads already served but not yet billed. If actual spend + predicted spend > daily budget, stop serving that advertiser's ads.
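To make that rule concrete, here is a minimal sketch in Java; the names (BudgetGate, shouldServe) are hypothetical, standing in for whatever check a real serving system would run before each ad insertion:

```java
public final class BudgetGate {
    /** Returns true if the advertiser may keep serving ads under its daily budget. */
    public static boolean shouldServe(double actualSpend,
                                      double predictedSpend,
                                      double dailyBudget) {
        // Stop serving once confirmed spend plus predicted (inflight) spend
        // would reach or exceed the daily budget.
        return actualSpend + predictedSpend < dailyBudget;
    }

    public static void main(String[] args) {
        // Advertiser Y: $100/day budget, $92 already billed, $10 of not-yet-billed
        // clicks predicted in flight -> stop serving now rather than overdeliver.
        System.out.println(shouldServe(92.0, 10.0, 100.0)); // false
    }
}
```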

Building a predictive system

Motivation

Our users browse Pinterest every day for new ideas, from personalized recommendations, to search, to related-content suggestions. We need to build a reliable and scalable ads system to serve ads and make sure that every dollar of our advertisers' budgets is well spent.

Requirements

We set out to design a spend prediction system with the following objectives:

  • Handle different ad types (impressions, clicks)
  • Process tens of thousands of events per second
  • Broadcast updates to more than 1,000 consumers
  • Keep end-to-end latency under 10 seconds
  • Guarantee 100% uptime
  • Keep the engineering as lightweight and maintainable as possible

Why Kafka Streams

We evaluated different stream-processing technologies, including Spark and Flink. All of them would work for us in terms of data scale, but Kafka Streams has some particular advantages:

  • Millisecond latency: Kafka Streams offers millisecond-level latency guarantees, which Spark and Flink cannot
  • Lightweight: a Kafka Streams application is just a Java application with no heavy external dependencies (such as a dedicated cluster), which reduces our maintenance costs

Implementation

The diagram below shows the system architecture at a high level after adding spend prediction:

  • Ads Serving: serves ads to the client, records ad insertions, and fetches predicted spend data from the Inflight Spend service.
  • Spend system: aggregates ad events and tells the ads-serving system each advertiser's current spend.
  • Inflight Spend:
    • Input: ad insertion records of the form {key: adgroupId, value: inflight_spend}
      • adgroupId is the ID of a group of ads that share the same budget constraint
      • inflight_spend = price * impression_rate * action_rate (sketched in code after this list), where:
        • price: the bid for the current ad
        • impression_rate: the historical rate at which an insertion of the ad converts into an impression; note that not every served ad is necessarily seen by the user
        • action_rate: for pay-per-click advertisers, the probability that the user clicks the ad; for pay-per-impression advertisers, this value is 1
    • Consume Aggregator: subscribes to the input topic and uses Kafka Streams to aggregate spend data per adgroup, computing each adgroup's predicted spend over a 10-second window (a topology sketch also follows this list). Results are written to the output topic, which the serving system consumes; whenever new data arrives, the serving system updates its predicted spend figures.
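The inflight_spend formula is simple enough to sketch directly. The record type and field names below are hypothetical, for illustration only, not Pinterest's actual code:

```java
// One ad insertion event; inflight_spend = price * impression_rate * action_rate.
public record AdInsertion(
        String adgroupId,      // ads sharing one budget constraint
        double price,          // bid for the current ad
        double impressionRate, // historical P(insertion -> impression)
        double actionRate) {   // P(click) for CPC ads; 1.0 for CPM ads

    /** Predicted spend committed by serving this ad once. */
    public double inflightSpend() {
        return price * impressionRate * actionRate;
    }
}
```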
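And here is a minimal sketch of what a Consume Aggregator topology like the one described could look like, written against a recent Kafka Streams API. The topic names ("inflight-spend-input", "inflight-spend-output") and serde choices are assumptions; Pinterest's actual topology is not public:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class ConsumeAggregator {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "inflight-spend-aggregator");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // Input: one record per ad insertion, keyed by adgroupId,
        // valued by that insertion's inflight_spend.
        KStream<String, Double> inflightSpend = builder.stream(
                "inflight-spend-input",
                Consumed.with(Serdes.String(), Serdes.Double()));

        inflightSpend
                // Sum inflight spend per adgroup...
                .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
                // ...over non-overlapping 10-second tumbling windows.
                .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(10)))
                .reduce(Double::sum)
                .toStream()
                // Drop the window metadata, keeping the plain adgroupId as key.
                .map((windowedKey, total) -> KeyValue.pair(windowedKey.key(), total))
                .to("inflight-spend-output",
                        Produced.with(Serdes.String(), Serdes.Double()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}
```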

In practice, the accuracy of our spend prediction is very high. After the whole budget forecasting system went online, our overdelivery rate dropped significantly. Below is a sample test result comparing actual spend with predicted spend.

Note: The horizontal axis in the figure is time, in 3-minute units; the vertical axis is spend per unit time. The blue line is predicted spend and the green line is actual spend.

Lessons learned

  1. Windowing can seriously hurt performance if done poorly. We got an 18-fold performance improvement by switching from hopping windows to tumbling windows. Our initial implementation used hopping windows to compute predicted spend over 3 minutes. With a 3-minute window size and a 10-second advance, each event falls into 180 s / 10 s = 18 open windows, so every event processed by Kafka Streams triggered updates to 18 windows at once, causing a lot of unnecessary computation. To fix this, we replaced the hopping windows with tumbling windows. Unlike hopping windows, tumbling windows do not overlap, so each incoming event updates exactly one window. Because updates per event dropped from 18 to 1, this window-type change increased overall throughput 18-fold (see the window sketch after this list).
  2. Message compression: To reduce the amount of data broadcast to consumers, adgroup IDs are delta-encoded and spend values are stored via a lookup table. After compression, we cut the transmitted message size to a quarter of the original (a delta-encoding sketch also follows).
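For reference, this is how the two window types from point 1 compare in a recent Kafka Streams API (durations as described above; a standalone fragment, not our production code):

```java
import java.time.Duration;
import org.apache.kafka.streams.kstream.TimeWindows;

public class WindowComparison {
    public static void main(String[] args) {
        // Hopping: 3-minute windows advancing every 10 seconds. Each event falls
        // into 180 s / 10 s = 18 overlapping windows -> 18 updates per event.
        TimeWindows hopping = TimeWindows
                .ofSizeWithNoGrace(Duration.ofMinutes(3))
                .advanceBy(Duration.ofSeconds(10));

        // Tumbling: non-overlapping 10-second windows. Each event falls into
        // exactly one window -> 1 update per event.
        TimeWindows tumbling = TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(10));

        System.out.println("windows per event, hopping: "
                + (hopping.sizeMs / hopping.advanceMs)); // 18
        System.out.println("windows per event, tumbling: 1"
                + " (" + tumbling.sizeMs + " ms each)");
    }
}
```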
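And a minimal sketch of the delta-encoding idea from point 2. The method names are hypothetical and Pinterest's actual wire format is not public; the point is that gaps between sorted IDs are much smaller numbers than the raw IDs, so they compress better when broadcast to consumers:

```java
import java.util.Arrays;

public class AdgroupIdCodec {
    // Encode a sorted list of adgroup IDs as gaps between consecutive IDs.
    static long[] deltaEncode(long[] sortedIds) {
        long[] deltas = new long[sortedIds.length];
        long prev = 0;
        for (int i = 0; i < sortedIds.length; i++) {
            deltas[i] = sortedIds[i] - prev;
            prev = sortedIds[i];
        }
        return deltas;
    }

    // Recover the original IDs by accumulating the gaps.
    static long[] deltaDecode(long[] deltas) {
        long[] ids = new long[deltas.length];
        long prev = 0;
        for (int i = 0; i < deltas.length; i++) {
            prev += deltas[i];
            ids[i] = prev;
        }
        return ids;
    }

    public static void main(String[] args) {
        long[] ids = {1000001L, 1000005L, 1000042L};
        long[] deltas = deltaEncode(ids);
        System.out.println(Arrays.toString(deltas));            // [1000001, 4, 37]
        System.out.println(Arrays.toString(deltaDecode(deltas))); // original IDs
    }
}
```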

Conclusion

Using Apache Kafka Streams to build a predictive spend system was a new approach for our ads infrastructure, and the system has proven efficient, stable, highly fault-tolerant, and scalable. We plan to keep exploring Kafka 1.0 and Confluent's KSQL and apply them to our system designs in the future.

Medium.com/pinterest_…

Thanks to Dongyu for correcting this article.