Original link: tecdat.cn/?p=18782

Original source: Tecdat (拓端数据部落) WeChat public account

 

In this article we discuss the calculation of life expectancy. The starting point for demographic models is the mortality table. However, a single static table is biased, because it assumes that living conditions will not improve. To do better, we use more complete data, in which the number of deaths is recorded by age x and by calendar year t.


DE=read.table("DE.txt",skip = 3,header=TRUE)
EXPS=read.table("EXPS.txt",skip = 3,header=TRUE)

We write $D_{x,t}$ for deaths and $E_{x,t}$ for exposures. So, for someone aged x at date t, the probability of dying within the year is $q_{x,t} = D_{x,t} / E_{x,t}$. The data are stored in a matrix for visualization and in a data frame for regression.


QF[QF==0]=NA
QH[QH==0]=NA

Zeros have to be replaced to avoid problems, because (i) we take a ratio and (ii) we then take its logarithm. We can visualize $\log q_{x,t}$ as a function of x and t.

persp(log(QF))

or


library(rgl)  # provides persp3d
persp3d(ages,annees,log(QH),col="light blue")
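The ratio-then-log step can be sketched on made-up numbers. The matrices `D` and `E` below are hypothetical stand-ins for the death and exposure data (they are not the article's files); the only purpose is to show why zeros are set to `NA` before taking logarithms.

```r
# Hypothetical deaths (D) and exposures (E): rows = ages, columns = years
ages   <- 0:2
annees <- 2000:2001
dn <- list(as.character(ages), as.character(annees))
D <- matrix(c(5, 0, 2,
              4, 1, 0), nrow = 3, dimnames = dn)
E <- matrix(c(1000, 900, 800,
              1100, 950, 850), nrow = 3, dimnames = dn)

Q <- D / E            # empirical death probabilities q[x, t]
Q[Q == 0] <- NA       # log(0) is -Inf, so cells without deaths become missing
logQ <- log(Q)        # finite wherever at least one death was observed
print(round(logQ, 2))
```

Plotting functions such as `persp` simply leave a hole where a value is `NA`, whereas `-Inf` would break the axis range.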

 

To model the evolution of $q_{x,t}$, we can draw on the model of Lee & Carter (1992), which assumes $\log q_{x,t} = A_x + B_x \cdot K_t$. Here $A = (A_0, A_1, \dots, A_{110})$ is, roughly, the average level of $\log q_{x,t}$ at each age. $K = (K_{1816}, K_{1817}, \dots, K_{2015})$ captures the fact that improving living conditions reduce the probability of dying within a year. These improvements are not uniform across ages, so $B = (B_0, B_1, \dots, B_{110})$ makes the improvement depend on age.

To estimate the parameters A, B, and K, we could use a binomial model, $D_{x,t} \sim \mathcal{B}(E_{x,t}, q_{x,t})$, which is the basic model of life insurance. Under the Lee & Carter assumption, $D_{x,t} \sim \mathcal{B}(E_{x,t}, \exp[A_x + B_x K_t])$.

Another route is to use the law of small numbers, which states that when the probability is low (as is the case for the probability of dying within a year), the binomial distribution can be approximated by a Poisson distribution. A Poisson regression is used here, with age x and year t as explanatory variables and the exposure as an offset. The only difficulty is that this is not a generalized linear model: because of the product $B_x K_t$, the model is nonlinear, with $\mathbb{E}[D_{x,t}] = \exp[\log E_{x,t} + A_x + B_x K_t]$.


library(gnm)  # generalized nonlinear models; provides Mult()
reg = gnm(DH ~ offset(log(EH)) + as.factor(age) +
  Mult(as.factor(age), as.factor(annee)),
  family = poisson(link = "log"))
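The Poisson approximation invoked above can be checked numerically. The sketch below compares the two probability mass functions for a hypothetical exposure `n` and death probability `q`; both values are made up for illustration.

```r
# Compare Binomial(n, q) with Poisson(n * q) for a small death probability
n <- 10000      # hypothetical exposure
q <- 0.001      # hypothetical one-year death probability
k <- 0:30       # possible death counts

p_binom <- dbinom(k, size = n, prob = q)
p_pois  <- dpois(k, lambda = n * q)

# Largest pointwise gap between the two probability mass functions
max_gap <- max(abs(p_binom - p_pois))
print(max_gap)
```

The gap shrinks further as `q` gets smaller with `n * q` held fixed, which is the regime of one-year death probabilities at most ages.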

We now have the estimated coefficients $\hat{A}$, $\hat{B}$, and $\hat{K}$.


Ax=reg$coefficients[2:111]
Bx=reg$coefficients[112:222]
Kt=reg$coefficients[223:length(reg$coefficients)]
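For readers without the `gnm` package, the structure of the three coefficient vectors can be illustrated with a different estimator: the classical least-squares fit of the Lee & Carter model via a singular value decomposition. This is a sketch on synthetic data, not the Poisson fit used in this article, and the normalization $\sum_x B_x = 1$, $\sum_t K_t = 0$ is an assumption made here to pin the parameters down.

```r
# Synthetic log death rates following log q[x,t] = A[x] + B[x] * K[t] exactly
A_true <- c(-5, -6, -4, -2)            # hypothetical age effects
B_true <- c(0.1, 0.2, 0.3, 0.4)        # sums to 1
K_true <- c(10, 5, 0, -5, -10)         # sums to 0
logq <- A_true + outer(B_true, K_true) # rows = ages, columns = years

# Least-squares Lee-Carter fit via SVD (not the Poisson fit used in the text)
A_hat <- rowMeans(logq)                # age profile averaged over years
Z <- sweep(logq, 1, A_hat)             # centred surface, close to B[x] * K[t]
s <- svd(Z)
sc <- sum(s$u[, 1])
B_hat <- s$u[, 1] / sc                 # normalised so that sum(B_hat) = 1
K_hat <- s$d[1] * s$v[, 1] * sc        # compensating scale keeps B * K fixed

print(round(cbind(B_true, B_hat), 4))
print(round(rbind(K_true, K_hat), 4))
```

On real data the surface is not exactly of rank one, so the first singular triple only gives the best rank-one approximation in the least-squares sense.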

We can plot the three sets of coefficients. First, $\hat{A}$, which gives the average level of log-mortality at each age:

plot(ages[-1],Ax)

 

We can also plot $\hat{K}$ as a function of time.

 

 

Note that the model is not identifiable: the sign of K on its own means nothing, since replacing (B, K) with (−B, −K) leaves the fit unchanged. We can plot $-\hat{K}$ instead, which has the advantage of reading as an improvement in living conditions. Finally, we plot $-\hat{B}$.
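That the sign and scale of K are not pinned down by the model can be seen directly on the product term. The vectors below are hypothetical values chosen only for illustration.

```r
# Hypothetical parameters
B <- c(0.1, 0.2, 0.3, 0.4)
K <- c(10, 5, 0, -5, -10)

fit_original <- outer(B, K)          # B[x] * K[t] term of the model
fit_flipped  <- outer(-B, -K)        # both signs reversed
fit_rescaled <- outer(B * 2, K / 2)  # scale moved from K to B

print(all.equal(fit_original, fit_flipped))
print(all.equal(fit_original, fit_rescaled))
```

All three parameterizations give the same fitted surface, which is why plotting K or minus K is purely a matter of reading convenience.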

 

 

The difficulty is that, in order to predict life expectancy, we need $q_{x,t}$ for large values of t, which have not yet been observed. For example, we might want $q_{50,2020}$ (for someone born in 1970). We would like to use $\hat{q}_{50,2020} = \exp[\hat{A}_{50} + \hat{B}_{50} \hat{K}_{2020}]$. The problem is that $\hat{K}_{2020}$ is not part of the estimated vector $\hat{K}$.

The idea, originally proposed by Lee & Carter (1992), is to extrapolate the series $\hat{K}$. We could try an exponential model or a linear model (fitted to the $\hat{K}$ series after 1950).


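# exponential trend: regress log(Kt) on the year (idx selects the years used for the fit)
lm(log(Kt[idx])~ann[idx])
futur=2016:2125

# linear trend on the same years
lm(Kt[idx]~ann[idx])

# pr: extrapolated values of Kt over futur
points(futur,pr,col="blue")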
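The linear extrapolation can be sketched end to end on a synthetic series. The `Kt` values below are simulated (a noisy downward trend), and the data frame `d` and model name `fit` are hypothetical; only `ann`, `futur`, and `pr` echo names used in the snippet above.

```r
# Hypothetical K series: a noisy downward trend over 1951-2015
set.seed(1)
ann <- 1951:2015
Kt  <- 50 - 1.5 * (ann - 1951) + rnorm(length(ann), sd = 2)

# Linear trend fitted on the observed years
d   <- data.frame(ann = ann, Kt = Kt)
fit <- lm(Kt ~ ann, data = d)

# Extrapolate to the future years used in the text
futur <- 2016:2125
pr <- predict(fit, newdata = data.frame(ann = futur))

print(coef(fit))
print(head(pr))
```

Fitting with a `data` argument, rather than indexing vectors inside the formula, is what lets `predict` accept new years through `newdata`.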

 

We can then build a set of predictions for past dates, $\hat{q}_{x,t} = \exp[\hat{A}_x + \hat{B}_x \hat{K}_t]$, and for future dates, $\tilde{q}_{x,t} = \exp[\hat{A}_x + \hat{B}_x \tilde{K}_t]$.
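As a minimal sketch of these two formulas, the fitted and projected surfaces can be built with `outer`. All numbers below are hypothetical, and `K_til` stands for the extrapolated values of K.

```r
# Hypothetical estimates for 3 ages and 4 years (2 observed, 2 forecast)
A_hat <- c(-6, -5, -3)
B_hat <- c(0.5, 0.3, 0.2)
K_hat <- c(1, 0)          # observed years
K_til <- c(-1, -2)        # extrapolated years

q_hat <- exp(A_hat + outer(B_hat, K_hat))  # fitted probabilities, past
q_til <- exp(A_hat + outer(B_hat, K_til))  # projected probabilities, future

q_all <- cbind(q_hat, q_til)
dimnames(q_all) <- list(age = as.character(0:2), year = as.character(2014:2017))
print(signif(q_all, 3))
```

Adding the vector `A_hat` to the matrix recycles it down the columns, so each row (age) receives its own age effect.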

For past dates, for example, these are the fitted probabilities of death by age in 1880:


plot(BASE$x[BASE$t==1880],BASE$pred[BASE$t==1880],
log="y")

 

 

Again, we use both models for the future (in this case, 2050)


BASE2$Qpred1=exp(cste+BASE2$Ax+BASE2$Bx*BASE2$Kt1)


plot(BASE2$x[BASE2$t==2050],BASE2$Qpred1[BASE2$t==
2050],log="y")

 

 

This uses the exponential forecast.

With the linear forecast, for someone born in 1968, we obtain the probability of dying within the year at each age:


# for each row i of sbase: use the fitted value up to 2015, the forecast afterwards
if(sbase$t[i] <= 2015)
{vq[i]=BASE[(BASE$x==sbase$x[i]) & (BASE$t==sbase$t[i]),"Qpred"]}
if(sbase$t[i] > 2015)
{vq[i]=BASE2[(BASE2$x==sbase$x[i]) & (BASE2$t==sbase$t[i]),"Qpred2"]}
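The lookup above follows one birth cohort diagonally through the age-by-year surface: a person born in year b is aged x during year b + x. A minimal sketch of that diagonal extraction, on a hypothetical matrix `Q` with made-up values and a made-up birth year, could look like this:

```r
# Hypothetical surface of death probabilities: ages 0-2, years 2014-2019.
# Columns up to 2015 play the role of fitted values, later ones of forecasts.
ages  <- 0:2
years <- 2014:2019
Q <- matrix((1:18) / 1000, nrow = 3,
            dimnames = list(as.character(ages), as.character(years)))

# A person born in `birth` is aged x during year birth + x
birth <- 2014
t_cohort <- birth + ages

# Two-column character index: one (age, year) pair per row
vq <- Q[cbind(as.character(ages), as.character(t_cohort))]
print(vq)
```

Indexing with a two-column matrix picks one cell per row of the index, which avoids the explicit loop over i.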

 

 

On the left are our model’s estimates, and on the right are our predictions.

To calculate life expectancy at birth, we use the following code

sum(cumprod(exp(-vq[1:110])))
[1] 77.62047

We can then write a function to visualize how life expectancy evolves with the year of birth:


vP = cumprod(exp(-(sbase$vq[1:110])))
sum(vP)}
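The core of that computation can be isolated in a small self-contained helper. The function name `life_expectancy` and the constant 2% probability are hypothetical; the survival factor `exp(-q)` mirrors the snippet above.

```r
# Hypothetical helper: expected number of further years lived, given a vector
# q of one-year death probabilities along a cohort's path (age 0, 1, 2, ...)
life_expectancy <- function(q) {
  surv <- cumprod(exp(-q))  # probability of still being alive after each year
  sum(surv)
}

# Constant 2% yearly death probability over 110 years
e0 <- life_expectancy(rep(0.02, 110))
print(e0)
```

Summing the survival probabilities counts, on average, how many year-ends the person lives to see, which is the discrete version of life expectancy used here.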

 


ANN =1930:2010
plot(ANN ,E2)

 

 

Looking at the slope, life expectancy gains (about) 0.25 years per year of birth.

 

 

 

On the other hand, if we use the forecast that assumes an exponential change in $K_t$, we obtain

 

 

This result is not realistic; it accounts less well for the change in curvature.

 

 

