Dynamic programming introduction and basic solution ideas

In the New Year, we should keep up our spirits. From now on, aim for an article a week to motivate yourself to keep growing.

Talk about your recent favorite dynamic programming, written in JavaScript, and future articles will probably focus on the front end as well.

An overview of the

Dynamic programming is one of the four classical algorithm ideas. It is not a specific algorithm, but provides a way to solve the problem. Many people find it difficult to get started with dynamic programming, the most difficult of the four ideas to understand. I also felt that way at first, but after mastering it, I felt ok. Dynamic programming is hard to understand because it’s not the way we’re used to thinking, and it’s confusing to start with all the complicated nouns.

So in this article, I want to completely throw away the definition and use the most popular language and examples to explain, which may be more user-friendly for beginners to understand. After suggesting you read nevertheless, still go looking for other article more in-depth systematic understanding.

What is dynamic programming

For the moment, let’s use an easy to understand but inaccurate language to describe it: dynamic programming is to decompose a big problem into a number of small steps with the same idea sequentially, and each step is to pursue the optimal solution, step by step to the end, you can get the desired result.

Here’s the simplest example:

We are standing at the bottom of a flight of stairs. What should we do if we want to know how many floors there are in a staircase?

The idea of dynamic programming is to focus on how the two neighboring states change. For this example, we just focus on moving from the current ladder to the next one. Let’s say our initial position is the 0th stair, and then we can easily imagine that every time we take a step up, we are in a position that is +1 on the previous level.

Here, we can get an equation:

let cur = 0; // Current floor
let next = cur + 1; // The next floor
Copy the code

If we follow this equation, we’ll know how many steps there are by the time we get to the top. In code:

const stairs = 10; // Suppose there are 10 layers
function getStairs() {
  let cur = 0; // Current floor
  while(cur < stairs) {
    let next = cur + 1; // Get the next state from the current state
    cur = next; // Change the current state to the next state
  }
  return cur;
}
Copy the code

This example may be too simple to get a feel for. Let’s make it a little bit more complicated.

Example 1: Fight monsters by climbing stairs

Let’s say there’s a monster on each floor of this staircase, and you have to kill that monster in order to go up, and killing that monster costs different amounts of health. We have an array of how much health it would take to kill the monster, subscript I for the I + 1 stair.

After killing the monster, we can choose to climb one or two steps. What is the minimum health cost to climb to the top (note that the stairs are after the health cost)?

// For example, if the subscript 1 is 15, it will cost 15 HP to kill the tier 2 monster
const costs = [10.15.20];Copy the code

You might want to do it the greedy way, which is to choose the one that requires less effort from within two runways, but that’s a mistake. For example, the array [0, 1, 2, 3] looks greedy like this:

Choose 0 between 0 and 1, and then 1 and 2
Choose 1 between 1 and 2, and then 2 and 3
Choose 2 between 2 and 3
to

That would take the 0-1-2 route to the top, but the 0-2 route would be the least expensive.

Here’s how you do it with dynamic programming.

We can switch gears, we can work backwards from the top. When we reach the top, we have only two options, either to go up the first to last staircase or to go up the second to last staircase, and the least expensive option, of course, is to choose the less expensive of the two stairs. So for other stairs, does this also apply to the idea? The answer is yes.

According to the above thinking, the problem can be thought of as we climb to the I level and choose the one that costs less health between i-1 and I-2, so that we can find the rules of progress (i.e. state transition equations) :

dp[i] = Math.min(
  dp[i - 1] + costs[i - 1].// Add the amount of health spent on that layer before coming up
  dp[i - 2] + costs[i - 2])Copy the code

At the same time, we can use an array to store the optimal solution of each step, so that when we solve for the optimal solution of the next step, we will evaluate the optimal solution of the previous step, so that each step is the optimal solution.

var minCostClimbingStairs = function(cost) {
  // Use an array to store the total cost of reaching each level
  const dp = []; 
  // The first two layers can not be derived by equation, manual initialization
  // The first two tiers can be climbed from their original positions without killing monsters, so the cost is 0
  dp[0] = 0;
  dp[1] = 0;

  let i = 2;
  // Note that you must be on top of all the steps to reach the top
  while(i < cost.length + 1) {
      dp[i] = Math.min(dp[i - 1] + cost[i - 1], dp[i - 2] + cost[i - 2]);
      i++;
  }
  
  // Return the cost at the top
  return dp[dp.length - 1];
};
Copy the code

Remember that the key to dynamic programming is to break the big problem down into smaller ones, figure out how to derive the equation of the current state from the previous state — the state transition equation — and make sure that the subsequent steps follow that equation, which leads to the right result.

Later I will analyze some leetcode dynamic programming problems, you can try to do it on the website to find out.

Example 2: Trading stocks with fees (2d dynamic programming)

714. The best time to buy or sell stocks includes fees

Again, we first define the dynamic transfer equation. Obviously, in this problem, the current state moves to the next state, corresponding to the time in the question has passed one day, but due to the limitation of the question, we can not determine whether to buy the stock today, so how to define the state transition equation?

Looking closely at the question, we actually have two states on any given day: owning stocks and not owning stocks, and we can take a stupid approach to preserve both states. Specific storage methods are as follows:

A two-dimensional array dp is used for storage. Dp [I][0] means no stock is held on day I. Dp [I][1] means stock is held on day I

There are four possible outcomes in a state transition (when a stock is traded from day one to day two)

Not held the previous day, not held the next day, no operation
One day hold stocks, the next day also hold stocks, no operation;
The day before did not hold, hold the next day, explained to buy today’s stock;
Held one day, not held the next day; It says the stock was sold today;

For the first two cases, state transition equations can be listed:

// No operation is performed, just a simple assignment
dp[i][0] = dp[i - 1] [0]
dp[i][1] = dp[i - 1] [1];
Copy the code

The latter two cases are slightly more complicated:

// Do not own the next day; It says the stock was sold today;
// When the stock is sold, our income increases (plus that day's stock price), minus the fees, which is today's income
dp[i][0] = dp[i - 1] [1] + prices[i] - fee;
// I bought today's stock.
// When we buy stocks, our income decreases (minus the day's stock price)
dp[i][1] = dp[i - 1] [0] - prices[i];
Copy the code

Although there are two possibilities for DP [I][0] and DP [I][1], we know from the analysis of the problem that we need to pursue the maximum profit, so we only need to take the larger value of the two possibilities. In this way, the final state transition equation can be obtained:

dp[i][0] = Math.max(dp[i - 1] [0], dp[i - 1] [1] + prices[i] - fee );
dp[i][1] = Math.max(dp[i - 1] [1], dp[i - 1] [0] - prices[i]);
Copy the code

The rest is simple:

var maxProfit = function(prices, fee) {
  let dp = Array.from({length:prices.length}).map(i= >[]);
  // dp[I][j] I: day I, j: 0 no hold, 1 hold
  dp[0] [0] = 0;
  dp[0] [1] = -prices[0];

  for(let i = 1; i < prices.length; i++) {
    let pre = dp[i - 1];
    dp[i][0] = Math.max(pre[0], pre[1] + prices[i] - fee );
    dp[i][1] = Math.max(pre[0] - prices[i], pre[1]);
  }
  
  // Return to the status of no stock held on the last day, which must be the maximum profit
  // Why else do you keep the stock?
  return dp[dp.length - 1] [0];
}
Copy the code

Example 3: Limit trading times for buying and selling stocks (3D dynamic programming)

188. The best time to buy and sell stocks IV

This problem and the previous problem is not very different in nature, the difficulty is to increase the number of transactions of the limit. For days and trade status (held and not held), you can use the same method definition as above. For the number of transactions, you can add a one-dimensional array that represents how many times the current transaction has occurred.

Ps. A separate buy and sell is a transaction.

Dp [I][0][j] indicates that it does not own the stock on day I and trades j times

While there may be many ways to trade j times on day I, there must be only one way to be most profitable. We trade day I 0, 1, 2… Save all k maximum profits, and then go through all the trades on the last day to see which one made the most profit, and you’ll get the correct answer.

So with that in mind, how do we design the state transition equation? You can think like the above question:

If you don’t own the stock the next day and you make j trades, there are only two possibilities the day before;

If you owned the stock the day before, you sold the stock today, so the number of trades the day before is J-1;

The previous day did not hold the stock, no operational changes, trading times is J;

If you own the stock the next day and you make j trades, then there are only two possibilities the day before;

Hold the stock the day before, no operational changes, trading times is J;

If you did not hold stocks on the previous day, it means that you bought stocks on the next day. However, the number of transactions is still J, because simply buying does not count as a complete transaction.

The state transition equation can be analyzed by taking the more profitable of the two possibilities:

dp[i][0][j] = Math.max(dp[i - 1] [0][j], dp[i - 1] [1][j - 1] + prices[i]);
dp[i][1][j] = Math.max(dp[i - 1] [1][j], dp[i - 1] [0][j] - prices[i]);
Copy the code

The state transition equation is there, but in the final practical implementation, there are two important things to note.

Since the number of transactions is limited by days and a complete transaction takes at least two days, the actual number of transactions is less than or equal to math.min (k, days /2).
Sometimes an illegal number of transactions occurs (for example, there can be no number of transactions on day 0), so for an illegal number of transactions, an infinitesimal value needs to be initialized to prevent errors.
Zero transactions per day requires special handling.

Finally, attach the complete code:

var maxProfit = function(k, prices) {
    if(! prices.length)return 0;

    const n = prices.length;
    // Maximum number of transactions
    const x = Math.min(k, n / 2);
    const dp = Array.from({length: n}).map(
        i= > Array.from({length:2}).map(i= >[])
    )

    dp[0] [0] [0] = 0;
    dp[0] [1] [0] = -prices[0];
    
    // Initialize the number of illegal transactions with an infinitesimal number
    for(let i = 1; i < x + 1; i++) {
        dp[0] [0][i] = dp[0] [1][i] = -Infinity;
    }

    for(let i = 1; i < n; i++) {
    	// The case of 0 transactions requires special handling
        dp[i][0] [0] = 0; // If the array is initialized with 0, omit this line of code
        dp[i][1] [0] = Math.max(dp[i - 1] [1] [0], -prices[i])
        for(let j = 1; j < x + 1; j++) {
            dp[i][0][j] = Math.max(dp[i - 1] [0][j], dp[i - 1] [1][j - 1] + prices[i])
            dp[i][1][j] = Math.max(dp[i - 1] [1][j], dp[i - 1] [0][j] - prices[i]); }}// Returns the highest value of all trades on the last day without owning a stock
    return Math.max(... dp[n -1] [0])};Copy the code

Three-dimensional arrays are complex and error-prone to write, so we can optimize the code a bit. For example, there are only two cases of whether a stock is held or not, and we can directly substitute two variables for the few cases. Detailed code can refer to the official solution.

conclusion

This paper briefly introduces the concept of dynamic programming, but dynamic programming also has its limitations, is not suitable for all problems. For a detailed theory on this, please refer to geek time teacher Wang Zheng’s article.

Here is a brief introduction of my personal summary of dynamic programming solution routines:

Think about the problem, identify the data in the problem that might be involved in the state change, and define a dynamically programmed array accordingly
Find the state transition equation (how to deduce the state from the previous state)
Define initial values (for example, dp[0] is usually self-defined)

Based on personal experience, step 2 is often the most difficult, followed by step 1. Even with a rudimentary grasp of dynamic programming, I was confused when I started doing my own problems because I couldn’t figure out the state transition equation. Think of it this way:

So let’s figure out what the variables are at each stage of the state. In the case of stocks, for example, the variables of each day are profit, number of days, number of trades, etc. According to the number of variables, the dynamic programming requiring several dimensions can be determined basically.
Using general logic, there are several possible scenarios for each phase (day) of the state. For example, in the problem of two-dimensional dynamic programming, there are only two cases, holding stocks and not holding stocks; In the problem of 3d dynamic programming, there are k(number of transactions) * 2 (hold and not hold) possibilities.
Think about how each state has changed from the previous one. For example, each state of the stock is transferred from two states of the previous day.
Prune all the possibilities of the previous phase. In the stock problem, for example, we seek maximum profit by taking the greater of the two possibilities of the previous day.

In the early stage of dynamic programming, the problem cannot be solved. At this time, you don’t have to stop thinking, but you can understand others’ answers thoroughly. After doing more, you will gradually get used to the thinking mode of dynamic planning.

Dynamic programming introduction and basic solution ideas

An overview of the

What is dynamic programming

Example 1: Fight monsters by climbing stairs

Example 2: Trading stocks with fees (2d dynamic programming)

Example 3: Limit trading times for buying and selling stocks (3D dynamic programming)

conclusion

Related Posts

LeetCode journey for front-end Engineers — Two-pointer Tips

How to create a Google Extension

[LeetCode] Remove duplicates from an ordered array – fast or slow Pointers