Since this is a hands-on post, we keep things light: a plain-language look at the core principles, then a focus on practice.

streamlit_prophet is an open source project that works right out of the box and further lowers the barrier to using Prophet, even for operations and business staff.

Introduction

Time series are influenced by four components:

  • Trend: macro, long-term, persistent forces;
  • Cycle: for example, the price of a commodity fluctuating around an average over a short period of time;
  • Seasonality: a relatively fixed pattern of change with periodic characteristics. A "season" is not necessarily yearly; regular patterns across the days of the week or the hours of the day also count as seasonal;
  • Random: stochastic uncertainty, also described as a stochastic process.

The four components combine to shape the whole time series, through either an additive model or a multiplicative model:

  • Additive model: the four components are relatively independent and have little influence on each other;
  • Multiplicative model: the components influence each other more noticeably.
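The difference between the two models can be sketched with toy data. This is only an illustration: the component values below are invented, the cycle component is left out for brevity, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical toy components over two years of daily data (illustrative values only).
t = np.arange(730)
trend = 100 + 0.05 * t                           # slow, persistent growth
seasonal = 10 * np.sin(2 * np.pi * t / 365.25)   # yearly pattern
rng = np.random.default_rng(0)
noise = rng.normal(0, 1, size=t.size)            # random component

# Additive model: components are summed, so the seasonal swing keeps the same size.
additive = trend + seasonal + noise

# Multiplicative model: seasonality and noise scale with the trend,
# so the swings grow as the level of the series rises.
multiplicative = trend * (1 + seasonal / 100) * (1 + noise / 100)
```

In the additive series the seasonal swing stays at about 10 units throughout, while in the multiplicative series it grows along with the trend.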

Earlier articles in this series introduced AR, MA, ARIMA and other traditional time series models. They are cumbersome to use, because the three parameters p/d/q need constant tuning. A "brute-force" random search with cross-validation can find suitable parameters, but the process is not smooth, and even good data analysts find it a headache (ha ha, experts please go easy on me).

Prophet is built for exactly this scenario: you adjust the components directly through parameters, so it is easy to use and works well.

Using Prophet

1. Installation

There are many pitfalls here (especially when installing on Windows), and plenty of troubleshooting write-ups online, but the simplest way is to install with Conda:

conda install pystan
conda install prophet

2. Usage

Basic usage

Prophet's API follows a style similar to scikit-learn:

from prophet import Prophet
import pandas as pd

df = pd.read_csv('example_wp_log_peyton_manning.csv')

# 0. Basic usage
# Create the model and fit it to the data
m = Prophet()
m.fit(df)

# Build the dataframe of future dates to predict (365 days ahead)
future = m.make_future_dataframe(periods=365)
future.tail()

# Predict at the future dates; the forecast contains the predicted value and its uncertainty interval
forecast = m.predict(future)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()

# Plot the forecast
fig1 = m.plot(forecast)
# Plot the components
fig2 = m.plot_components(forecast)
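Prophet expects the input dataframe to have a date column named ds and a value column named y, which the example CSV already provides. For your own data, a small helper along these lines may be enough; the function name and the column names "date" and "visits" are hypothetical.

```python
import pandas as pd

def to_prophet_frame(raw, date_col, value_col):
    """Return a new frame with the two columns Prophet expects: ds and y."""
    out = raw[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])  # parse strings into timestamps
    return out.sort_values("ds").reset_index(drop=True)

# Toy data with made-up column names and values.
raw = pd.DataFrame({"date": ["2024-01-02", "2024-01-01"], "visits": [12.0, 10.0]})
df_example = to_prophet_frame(raw, "date", "visits")
```

The resulting frame can then be passed to m.fit() in the same way as df above.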

Adding holiday effects

# 1. Model holidays and special events
# Pass a holidays dataframe to the model through the holidays parameter
playoffs = pd.DataFrame({
  'holiday': 'playoff',
  'ds': pd.to_datetime(['2008-01-13', '2009-01-03', '2010-01-16',
                        '2010-01-24', '2010-02-07', '2011-01-08',
                        '2013-01-12', '2014-01-12', '2014-01-19',
                        '2014-02-02', '2015-01-11', '2016-01-17',
                        '2016-01-24', '2016-02-07']),
  'lower_window': 0,
  'upper_window': 1,
})
superbowls = pd.DataFrame({
  'holiday': 'superbowl',
  'ds': pd.to_datetime(['2010-02-07', '2014-02-02', '2016-02-07']),
  'lower_window': 0,
  'upper_window': 1,
})
holidays = pd.concat((playoffs, superbowls))

# Pass the holiday dates in when building the model
m = Prophet(holidays=holidays)
forecast = m.fit(df).predict(future)
fig = m.plot_components(forecast)
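The lower_window and upper_window columns extend each holiday to the days around it; with lower_window=0 and upper_window=1, each event also covers the following day. The sketch below lists the covered days under that reading of the two columns. The helper expand_holiday_windows is hypothetical and is not part of Prophet.

```python
import pandas as pd

def expand_holiday_windows(holidays):
    """List every calendar day covered by each holiday's [lower_window, upper_window]."""
    rows = []
    for row in holidays.itertuples(index=False):
        # Offsets run from lower_window (zero or negative) to upper_window, inclusive.
        for offset in range(row.lower_window, row.upper_window + 1):
            rows.append({"holiday": row.holiday,
                         "ds": row.ds + pd.Timedelta(days=offset)})
    return pd.DataFrame(rows)

superbowls_example = pd.DataFrame({
    "holiday": "superbowl",
    "ds": pd.to_datetime(["2010-02-07", "2014-02-02"]),
    "lower_window": 0,
    "upper_window": 1,
})
covered = expand_holiday_windows(superbowls_example)
```

Here covered holds four rows: each Super Bowl date plus the day after it.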

Adding public holiday effects

# 2. Add built-in country holidays with the add_country_holidays method
m = Prophet(holidays=holidays)
m.add_country_holidays(country_name='US')  # built-in US holidays; pass 'CN' for China's official holidays
m.fit(df)

forecast = m.predict(future)
fig = m.plot_components(forecast)

Seasonal adjustment

Prophet estimates seasonality with partial Fourier sums:

# 3. Change the yearly seasonality parameter
# Fourier order: yearly seasonality defaults to 10, weekly seasonality to 3
# More Fourier terms let the seasonality follow faster-changing cycles, but may also lead to overfitting
from prophet.plot import plot_yearly
m = Prophet(yearly_seasonality=20).fit(df)
a = plot_yearly(m)
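To make the Fourier order concrete, here is a minimal sketch of how a partial Fourier sum turns time into sine and cosine features: an order of N gives 2N columns. This is a standalone illustration with a hypothetical function name, not Prophet's internal code.

```python
import numpy as np

def fourier_features(t_days, period, order):
    """Build the 2*order sine/cosine columns of a partial Fourier sum."""
    t = np.asarray(t_days, dtype=float)
    columns = []
    for k in range(1, order + 1):
        angle = 2.0 * np.pi * k * t / period
        columns.append(np.sin(angle))  # k-th harmonic, sine term
        columns.append(np.cos(angle))  # k-th harmonic, cosine term
    return np.column_stack(columns)

# Yearly seasonality with order 10 gives 20 feature columns.
X_yearly = fourier_features(np.arange(730), period=365.25, order=10)
```

Raising the order adds higher-frequency harmonics, which is why a larger yearly_seasonality value can follow faster changes and can also overfit.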

By default Prophet fits weekly and yearly seasonality. Use the add_seasonality method to add others, such as monthly, quarterly, or hourly:

# 4. Specify a custom seasonality
# Add monthly seasonality
m = Prophet(weekly_seasonality=False)
m.add_seasonality(name='monthly', period=30.5, fourier_order=5)
forecast = m.fit(df).predict(future)
fig = m.plot_components(forecast)

If the holiday effects turn out to be overfitting, lower the holidays_prior_scale parameter (default 10) to dampen them. Likewise, the seasonality_prior_scale parameter dampens the seasonal effects.

m = Prophet(holidays=holidays, holidays_prior_scale=0.05).fit(df)
forecast = m.predict(future)
forecast[(forecast['playoff'] + forecast['superbowl']).abs() > 0][
    ['ds', 'playoff', 'superbowl']][-10:]

Prophet has other features that are not covered here. See the code examples on my GitHub, which are taken from the official Prophet documentation.

Streamlit_prophet deployment

The project's official description:

Deploy a Streamlit app to train, evaluate and optimize a Prophet forecasting model visually.

streamlit_prophet is a Streamlit application for building time series forecasting models with Prophet. Judging by the official video introduction, it looks very easy to use, so let's get it running.

1. Deployment

As with installing Prophet locally, there are many pitfalls, but installing through Conda works fine.

Create a virtual environment:

conda create -n streamlit_prophet python=3.8
conda activate streamlit_prophet

Install streamlit_prophet:

conda install pystan 

pip install -U streamlit_prophet

#Start the service 
streamlit_prophet deploy dashboard

Open http://127.0.0.1:8080/ locally. If the dashboard page loads, the deployment succeeded.

2. Usage overview

The app ships with some official sample datasets, so you can get started quickly. Unfortunately, for your own data only CSV files can be uploaded (200M maximum); it cannot connect directly to a data warehouse.
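Since only CSV uploads are supported, one workaround is to export a query result to a CSV file first. The sketch below uses an in-memory SQLite database as a stand-in for a real data warehouse. The table name, the query, the helper name, and the reading of "200M" as 200 × 1024 × 1024 bytes are all assumptions.

```python
import csv
import os
import sqlite3
import tempfile

MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # assumed interpretation of the 200M limit

def export_query_to_csv(conn, query, path):
    """Write a query result (with header row) to a CSV file and return its size in bytes."""
    cursor = conn.execute(query)
    header = [col[0] for col in cursor.description]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(cursor)
    return os.path.getsize(path)

# Stand-in for a warehouse: a tiny in-memory table with made-up values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (ds TEXT, y REAL)")
conn.executemany("INSERT INTO daily_sales VALUES (?, ?)",
                 [("2024-01-01", 10.0), ("2024-01-02", 12.5)])

csv_path = os.path.join(tempfile.mkdtemp(), "daily_sales.csv")
size = export_query_to_csv(conn, "SELECT ds, y FROM daily_sales ORDER BY ds", csv_path)
fits_upload_limit = size <= MAX_UPLOAD_BYTES
```

With a real warehouse you would swap the SQLite connection for your own database client; the rest of the flow stays the same.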

For the data columns, there are modules for selection, filtering, resampling and cleaning:

Adjust the changepoint, seasonality, and holiday effect parameters:

Model validation and forecasting:

3. Model results

Click the prediction option in the upper left corner to start model prediction. If the parameters on the left are adjusted, the model will automatically rerun:

Model performance indicators:

Model error analysis. In the second figure, the farther a point is from the red line, the larger the prediction error. This helps with later model tuning: look at the points with the largest errors and consider whether to remove them as anomalous data:
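Outside the dashboard, the same idea of ranking points by error can be sketched with pandas. The numbers are invented and the helper largest_errors is hypothetical; the column names y and yhat follow Prophet's naming for actual and predicted values.

```python
import pandas as pd

def largest_errors(actual_vs_forecast, n=3):
    """Rank rows by absolute forecast error, largest first."""
    out = actual_vs_forecast.copy()
    out["abs_error"] = (out["y"] - out["yhat"]).abs()
    return out.sort_values("abs_error", ascending=False).head(n)

# Toy actuals and forecasts; the third day is a deliberate outlier.
toy = pd.DataFrame({
    "ds": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]),
    "y": [10.0, 12.0, 30.0, 11.0],
    "yhat": [10.5, 11.0, 12.0, 11.0],
})
worst = largest_errors(toy, n=2)
```

The rows at the top of worst are the candidates to inspect before deciding whether they are genuine anomalies.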

The blue shade is the confidence interval (80%) of the model’s predicted results, and the red line is the trend estimated by the model:

For more details, try it out yourself; I won't show everything here.

Summary

Streamlit_prophet is a great data product that greatly simplifies time series forecasting. It is very user-friendly for data analysts, but the biggest disadvantage is that it cannot be directly connected to the data store for forecasting.

However, to make good use of such products (this one or any other), the key is to understand the business, have a deep understanding of time series models, and know how Prophet models data.


Reference:

  1. github.com/artefactory…
  2. facebook.github.io/prophet/doc…
  3. github.com/xihuishawpy…

You are welcome to follow my personal WeChat official account: DS number said