Original link:tecdat.cn/?p=10278

Original source:Tuo End number according to the tribe public number


 

Part 1: Introduction to survival analysis

This presentation will introduce survival analysis, refer to:

 

Clark, T., Bradburn, M., Love, S., & Altman, D. (2003). Survival analysis part I: Basic concepts and first analyses. 232-238. ISSN 0007-0920.

 

Some of the packages we’ll be using today include:

  • lubridate
  • survival
  • survminer
library(survival)
library(survminer)
library(lubridate)
 
Copy the code

 

What is survival data?

Event time data consists of different start and end times.

The example of cancer

  • Time from surgery to death
  • Time from the beginning of treatment to progress
  • Time from response to relapse

 

Examples from other areas

Event timing data is common in many areas, including but not limited to

  • Time from HIV infection to AIDS development
  • The time of the heart attack
  • When substance abuse occurs
  • Machine failure time

 

Survival analysis alias

Since survival analysis is common in many other fields, it also has other names

  • Reliability analysis
  • Duration analysis
  • Event history analysis
  • Event time analysis

 

Lung data set

Data included patients with advanced lung cancer from the North Central Cancer treatment group. Some of the variables we will use to demonstrate methods today include

  • Time: Survival time in days
  • Status: Review status 1 = review, 2 = invalid
  • Gender: Male = 1 Female = 2

 

What is censorship?

 

RICH JT, NEELY JG, PANIELLO RC, VOELKER CCJ, NUSSENBAUM B, WANG EW. A practical guide to understanding Kaplan-Meier curves. Otolaryngology – Head and Neck surgery: official journal of the American Academy of Otolaryngology – Head and Neck Surgery. 2010; 143 (3) : 331-336. Doi: 10.1016 / j.otohns.2010.05.007.

 

Review the types

A topic may be reviewed for the following reasons:

  • Subsequent loss
  • Withdrew from the study
  • Fixed no activity before end of study period

Specifically, these are examples of censorship.

 

Review survival data

 

In this example, how would we calculate the percentage of 10 years without events?

Subjects 2, 3, 5, 6, 8, 9, and 10 were event-free at year 10. Subjects 4 and 7 had this event 10 years earlier. Topic 1 was reviewed 10 years ago, so we don’t know if they had this event 10 years ago – how do we include this topic in our estimates?

 

Assign follow-up time

  • The subject under review still provides information and therefore must be appropriately included in the analysis
  • The distribution of follow-up times is skewed and may vary between patients undergoing examinations and those with events

 

Component of survival data

For Topic II:

  1. Activity time TiTi

  2. Review time CiCi

  3. Event indicator δ I δ I:

    • 1. If the observed event (i.e. Ti≤CiTi≤Ci)
    • If checked, 0 (i.e. Ti>CiTi>Ci)
  4. Observation time Yi=min(Ti,Ci)Yi=min(Ti,Ci)

Observation time and event indication are provided in lung data

  • Time: Survival time in days (YiYi)
  • Status: Review Status 1 = Review, 2 = Dead (δ I δ I)

 

 

Process dates in R

Data typically comes with a start date and an end date, rather than a pre-calculated lifetime. The first step is to make sure that these formats are set to dates in R.

Let’s create a small sample data set where sx_Date contains variables from the date of surgery and last_FUp_Date from the last follow-up.

date_ex <- 
  tibble(
    sx_date = c("2007-06-22"."2004-02-13"."2010-10-27"), 
    last_fup_date = c("2017-04-15"."2018-07-04"."2016-10-31")
    )

date_ex
Copy the code
## # A tibble: 3 x 2
##   sx_date    last_fup_date
##   <chr>      <chr>        
## 1 2007-06-22 2017-04-15   
## 2 2004-02-13 2018-07-04   
## 3 2010-10-27 2016-10-31
Copy the code

We see that they are character variables, which is usually the case, but we need to format them as dates.

 

Format date – radix R

date_ex %>% 
  mutate(
    sx_date = as.Date(sx_date, format = "%Y-%m-%d"), 
    last_fup_date = as.Date(last_fup_date, format = "%Y-%m-%d"))Copy the code
## # A tibble: 3 x 2
##   sx_date    last_fup_date
##   <date>     <date>       
## 1 2007-06-22 2017-04-15   
## 2 2004-02-13 2018-07-04   
## 3 2010-10-27 2016-10-31
Copy the code
  • Please note that,RThe format must contain delimiters and symbols. For example, if your date format is M/D/Y, yesformat = "%m/%d/%Y"
  •  

Format date – LubriDate package

We can also use the Lubridate package to format dates. In this case, use the YMD function

date_ex %>% 
  mutate(
    sx_date = ymd(sx_date), 
    last_fup_date = ymd(last_fup_date)
    )
Copy the code
## # A tibble: 3 x 2
##   sx_date    last_fup_date
##   <date>     <date>       
## 1 2007-06-22 2017-04-15   
## 2 2004-02-13 2018-07-04   
## 3 2010-10-27 2016-10-31
Copy the code
  • Please note that with the basicsRDifferent options do not require a delimiter
  •  

Calculate survival time

Now that the date is formatted, we need to calculate the difference between the start time and the end time in some units (usually months or years). In base, R, used for DiffTime calculates the number of days between two dates, which is then converted to a numeric value using as.numeric. Then divide the average number of days by 365.25 years and convert to years.

date_ex %>% 
  mutate(
    os_yrs = 
      as.numeric(
        difftime(last_fup_date, 
                 sx_date, 
                 units = "days")) / 365.25
    )
Copy the code
## # A tibble: 3 x 3 ## sx_date last_fuP_date os_YRS ## <date> <date> < DBL > ## # 3 2010-10-27 2016-10-31 6.01Copy the code

 

Calculate survival time

The operator can specify a time interval %–%, and then use the method to convert the time interval to the number of elapsed seconds as.duration, and finally divide by dyears(1) to convert it to the number of years to obtain the number of seconds in a year.

## # A tibble: 3 x 3 ## sx_date last_fuP_date os_YRS ## <date> <date> < DBL > ## # 3 2010-10-27 2016-10-31 6.02Copy the code

Events marking the

For the component of survival data, I mentioned event indicators:

Event indicator δ I δ I:

  • 1. If the observed event (i.e. Ti≤CiTi≤Ci)
  • If checked, 0 (i.e. Ti>CiTi>Ci)

In lung data, we have:

  • Status: Review status 1 = review, 2 = invalid

 

Survival function

The probability that the subject will survive beyond the specified time

 

S (t) = (t > t) Pr = 1 – (t) S (t) = F (t > t) Pr = 1 – F (t)

 

S(t)S(t) : survival function F(t)=Pr(t ≤t)F(t)=Pr(t ≤t) : cumulative distribution function

Theoretically, the survival function is smooth; In practice, we observe events on discrete time scales.

 

The survival probability

  • The probability of survival at some time, S(t), S(t), is the conditional probability of surviving beyond that time, given that the individual has survived just before that time.
  • It can be estimated as the number of follow-up patients who were alive at the time but had no loss divided by the number of patients who were alive at the time
  • Kaplan-meier estimates of survival probabilities are the product of these conditional probabilities
  • At time 0, the probability of survival is 1, i.e. S(t0)=1S(t0)=1

 

Create a living object

The Kaplan-Meier method is the most commonly used method for estimating survival time and probability. This is a nonparametric method that produces a step function that drops each time an event occurs.

  • Create a survival object. For each topic, there will be one entry as the survival time,+If the subject is censored, it is followed by one. Let’s take a look at the top 10 observations:
##  [1]  306   455  1010+  210   883  1022+  310   361   218   166
Copy the code

 

Kaplan-meier method was used to estimate the survival curve

  • thesurvfitThe function creates a survival curve based on the formula. Let’s generate an overall survival curve for the entire homogeneous group and assign it to Objectf1And then checknamesOf this object:
names(f1)
Copy the code
##  [1] "n"          "time"       "n.risk"     "n.event"    "n.censor"  
##  [6] "surv"       "std.err"    "cumhaz"     "std.chaz"   "start.time"
## [11] "type"       "logse"      "conf.int"   "conf.type"  "lower"     
## [16] "upper"      "call"
Copy the code

Some of the key components that the SurvFit object will use to create a survival curve include:

  • timeWhich contains the start and end of each interval
  • surv, which contains each corresponding survival probabilitytime

 

Kaplan and Meier

Now, draw the object to get the Kaplan-Meier diagram.

plot(survfit(Surv(time, status) ~ 1, data = lung), 
     xlab = "Days", 
     ylab = "Overall survival probability")
Copy the code

 

 

  • baseRThe default graph in shows a step function (solid line) with a correlation confidence interval (dotted line)
  • The horizontal line represents the lifetime of the interval
  • The interval is terminated by an event
  • The height of the vertical line shows the change in cumulative probability
  • Reviewed observations with scale lines reduce the cumulative survival between intervals.

 

Kaplan and Meier

Build on ggplot2 and can be used to create Kaplan-Meier diagrams.

  • The default graph has a step function (solid line) with a correlation confidence band (shaded area).
  • By default, the scale line for the patient being examined is displayed. In this example, the scale line itself is somewhat blurred and can be unmarked using the optioncensor = FALSE

 

Estimated survival of xx years

A quantity that often needs attention in survival analysis is the probability of surviving beyond a certain number of (xx) years.

For example, estimate the likelihood of surviving to 11 years

## Call: survfit(formula = Surv(time, status) ~ 1, Data = lung) ## ## time n.risk n.vent survival std.err Lower 95% CI Upper 95% CI ## 365 65 121 0.409 0.0358 0.345 0.486Copy the code

We found that the 11-year survival rate in this study was 41%.

The upper and lower limits of 95% confidence intervals are also shown.

 

Survival rate and survival curve at xx years

The 11-year survival probability is the point on the Y-axis that corresponds to the 11-year X-axis survival curve.

 

Xx year survival rates are often misestimated

What if we use “naive” estimates?

121 of 228 patients died by year 1, so:

 

– 1-1 – year survival was misestimated when ignoring the fact that 42 patients were examined 1 year earlier.

 

  • The correct estimated probability of survival – 41 per cent per year.

 

Ignore the impact of censorship on xx years of survival

  • Imagine two studies, each with 228 topics. There were 165 deaths in each study. One was not examined (orange line) and 63 patients were examined by another (blue line)
  • Ignoring the review results in an overestimate of the overall probability of survival, as the reviewed subjects provide information only for part of the follow-up period and then fall outside the risk range, reducing the cumulative probability of survival

 

Estimated median survival time

Another quantity that often needs to be looked at in survival analysis is the mean time to survival, which we quantify using the median. The predicted survival time is not normally distributed, so an average is not an appropriate summary.

 

## Call: Survfit (formula = Surv(time, status) ~ 1, data = lung) ## ## n events median 0.95lCL 0.95UCL ## 228 165 310 285 363Copy the code

We see a median survival of 310 days. It also shows the upper and lower bounds of the 95% confidence interval.

 

Median survival time and survival curve

The median survival time was 0.50

Median survival rates are often misestimated

The median survival time of 165 patients who died was summarized

##   median_surv
## 1         226
Copy the code
  • When ignoring the fact that the patients examined also contributed to follow-up, an incorrect median estimated survival time of 226 days was obtained.
  • The correct median survival estimate is 310 days.

 

The impact of the review on median survival was ignored

  • Ignoring the review results in an artificially low survival curve by excluding the follow-up time contributed by the reviewed patients (purple line)
  • The real survival curve of the datalungShown in blue for comparison

Survival time between groups was compared

  • We can use the logarithmic rank test for inter-group importance tests
  • Log-rank test is the most common method for comparing survival time between groups by averaging observations over the entire follow-up period
  • Depending on the study question, some versions may place more emphasis on early or late follow-up and may be more appropriate

We use the function to obtain the logarithmic rank P value. For example, we can test for survival time differences based on gender in lung data

## Call: ## ## N Observed Expected (O-E)^2/E (O-E)^2/V ## sex=1 138 112 91.6 4.55 10.3 ## sex=2 90 53 73.4 5.68 10.3 ## ## Chisq= On 1 degrees of freedom, P = 0.001Copy the code

 

Extract information from the Survdiff object

P values are extracted from the results

1 - pchisq(sd$chisq, length(sd$n) - 1)
Copy the code
# # 0.001311165 [1]Copy the code

Returns the formatted p value

# # 0.001 [1]Copy the code

 

Cox regression model

We may want to quantify the effect size of a single variable, or include multiple variables in the regression model to account for the effect of multiple variables.

Cox regression model is a semi-parametric model, which can be used to fit univariate and multivariate regression models with survival outcomes.

 

 

H (t) H (t) : instantaneous rate at which a hazard or event occurs H0 (t) H0 (t) : baseline hazard

Some key assumptions of the model:

  • Non-information censorship
  • Proportion of dangerous

Note: Parametric regression models for survival outcomes can also be used, but these models will not be covered in this training.

We can fit the regression model of survival data using the CoxPH function, which uses an object on the left and has the standard syntax R for regression formulas on the right.

## Call: ## ## coef exp(coef) se(coef) z p ## Sex-0.5310 0.5880 0.1672-3.176 0.00149 ## ## Likelihood ratio test=10.63 on 1 df, P =0.001111 ## n= 228, number of events= 165Copy the code

 

Format Cox regression results

We can see the clean version of broom output:

 

Or use

Dangerous than

  • The number of concerns from the Cox regression model is the hazard ratio (HR). HR represents the risk ratio between the two groups at any given point in time.
  • HR is interpreted as the instantaneous incidence of events of interest that are still at event risk.
  • If you have a regression parameter ββ (fromestimateOur columncoxph), HR = experience value (β) experience value ⁡ (β).
  • HR <1 indicates a decreased risk of death, while HR> 1 indicates an increased risk of death.
  • Thus, our HR = 0.59 means that at any given time, about 0.6 times as many women as men die.

 

Part 2: Landmark analysis and time-dependent covariates

In Part 1, we introduced the use of logarithmic rank test and Cox regression to examine associations between covariates of interest and survival outcomes.

  

Example: tumor reaction

Example: Overall survival was measured from the start of treatment, focusing on the association between complete response to treatment and survival.

  • Anderson et al. (JCO, 1983) described why traditional methods (such as logarithmic rank test or Cox regression) were biased in favor of responders in this case and proposed a revolutionary approach.
  • The null assumption in the boundary marker approach is that the process of survival from the boundary marker is independent of the response state of the boundary marker.

Anderson, J., Cain, K., & Gelber, R. (1983). Analysis of survival by tumor response. Journal of Clinical Oncology : Official Journal of the American Society of Clinical Oncology, 1(11), 710-9.

 

Other examples

Some other possible covariables that may not have been looked at in cancer research include:

  • Transplant failure
  • Graft versus host disease
  • Secondary excision
  • Adjuvant therapy
  • compliance
  • Adverse events

 

The sample data

Data from 137 bone marrow transplant patients. Variables include:

  • T1Time of death or last follow-up (days)
  • delta1Mortality indicators; 1 0 living death
  • TADuration of acute graft-versus-host disease (days)
  • deltaAAcute graft-versus-host disease index; 1- Develops acute graft-versus-host disease, 0- never develops acute graft-versus-host disease

Let’s load the data for use throughout the example

 

The landmark law

  1. Select the fixed time after the baseline as the marker time.Pay attention to: Should be performed on the basis of clinical information prior to examining data
  2. A subset of those populations tracked at least to milestone times.Pay attention to: Please be sure to report numbers excluded due to concerns or censorship prior to landmark time.
  3. The milestone times were calculated and the traditional logarithmic rank test or Cox regression was applied

Of interest in the BMT data is the association between acute graft-versus-host disease (aGVHD) and survival. But aGVHD was evaluated after transplantation, which is our baseline, which is when follow-up started.

Step 1 Select a landmark time

Typically, aGVHD occurs in the first 90 days after transplantation, so we use a 90-day marker.

There is interest in the relationship between acute graft-versus-host disease (aGVHD) and survival. But aGVHD was evaluated after transplantation, which is our baseline, which is when follow-up started.

Step 2: Track at least a subset of the population prior to the milestone time

This reduced our sample size from 137 to 122.

  • All 15 excluded patients died before the 90-day milestone

There is interest in the relationship between acute graft-versus-host disease (aGVHD) and survival. But aGVHD was evaluated after transplantation, which is our baseline, which is when follow-up started.

Step 3 Follow up time was calculated according to landmarks and traditional methods were applied.

Examples of Cox regression benchmarks using BMT data

In Cox regression, subset option COxPH in can be used to exclude patients who were not followed up during the signature period

Time dependent covariate

An alternative approach to boundary marker analysis is to combine time-dependent covariables. This might be a better fit

  • The value of a covariable varies over time
  • There are no obvious milestones

Time dependent covariable data Settings

Analysis of time-dependent covariates R requires the establishment of special data sets.

There is no ID variable in the BMT data, which is required to create a special dataset, so create a variable named my_id.

The TMERGE function is used with the Event and function TDC to create special data sets.

  • tmergeCreate a long data set with multiple time intervals for different covariate values for each patient
  • eventCreate a new event indicator to match the newly created interval
  • tdcCreate a time-dependent covariable indicator to match the newly created time interval

 

Time-dependent covariate – single patient

To see how this works, let’s look at the data from the first five patients.

 ##   my_id   T1 delta1   TA deltaA
## 1     1 2081      0   67      1
## 2     2 1602      0 1602      0
## 3     3 1496      0 1496      0
## 4     4 1462      0   70      1
## 5     5 1433      0 1433      0
Copy the code

New data sets of these same patients

## my_id T1 delta1 id tstart tstop death agvhd ## 1 1 2081 0 1 0 67 0 0 ## 2 1 2081 0 1 67 2081 0 1 ## 3 2 1602 0 2 0 1602 0 0 ## 4 1496 0 3 0 1496 0 0 ## 5 1462 0 4 0 70 0 0 ## 6 1462 0 4 70 1462 0 1 ## 7 5 1433 0 5 0 1433 0 0Copy the code

 

Time dependent covariable-Cox regression

Now, we can analyze this time-dependent covariance as usual using Cox regression with CoxPH

Abstract

We found no significant association between acute graft-versus-host disease and death using landmark analyses or time-dependent covariates.

In general, one would want to visualize single covariates using landmark analysis and univariate and multivariate modeling using Cox regression with time-dependent covariates.

 

Part 3: Competitive risk

 

 

 

 

What is competitive risk?

When multiple possible events occur on an object in the event occurrence time setting

Example:

  • recurrence
  • Death from disease
  • Died of other causes
  • Response to treatment

All of these (or some of them and others) may be possible events in any given study.

 

So what’s the problem?

Unobserved dependencies between event times are fundamental issues that lead to special considerations.

For example, it is conceivable that patients who relapse are more likely to die, so the time of recurrence and the time of death would not be independent events.

 

The background of competitive risk

Two methods of analysis when there are multiple potential outcomes:

  1. Cause-specific hazard for a given event: This represents the incidence of events per unit time in an event that did not fail due to another event
  2. Cumulative incidence of a given event: This represents the incidence of events per unit of time and the impact of competing events

Each of these approaches may illuminate only one important aspect of the data and may make others difficult to understand, so the method chosen should depend on the question of interest.

 

 

 

Example of melanoma data

It contains variables:

  • timeSurvival time is measured in days and may be reviewed.
  • statusOne died of melanoma, two survived, and three died of other causes.
  • sex1 = male, 0 = female.
  • ageAge.
  • yearOperation.
  • thicknessTumor thickness (mm).
  • ulcer1 = present, 0 = absent.

 

Cumulative incidence of melanoma data

Estimate cumulative incidence in the context of competitive risk.

 

## Estimates and Variances: # # $est # # # # 1 1000 2000 3000 4000 5000 0.12745714 0.23013963 0.30962017 0.3387175 0.3387175 # # 1 3 0.03426709 0.05045644 0.05811143 0.1059471 0.1059471 ## ## $var # 1000 2000 3000 4000 5000 ## 1 0.0005481186 0.0009001172 0.0013789328 0.001690760 0.001690760 ## 13 0.0001628354 0.0002451319 0.0002998642 0.001040155 0.001040155Copy the code

 

Plot cumulative incidence – radix R

Generate a base graph of default values.


plot(ci_fit)
Copy the code

 

 

Plot cumulative incidence

 

  •  

The cumulative incidence between groups was compared

Used for inter-group testing.

For example, Melanoma was compared based on the presence or absence of ulcer disease. The test results can be found in Tests.


ci_ulcer[["Tests"]]
Copy the code
1 ## # 3 0.158662 6.903913e-01 1 ## # 3Copy the code

 

Plot cumulative incidence by group

Plot cumulative incidence by group – manually

Please note that I personally find the GGcompetinGrisks function lacking in custom functionality, especially compared to GGSurvPlot. I usually do my own diagrams, first creating a clean data set of cuminc fitting results, and then plotting the results. For more information on the underlying code, see the

Draws a single event type

Usually, only one type of event is of interest, although we still have to consider competing events. In that case, the events of interest can be drawn separately. Again, I do this manually by first creating a clean dataset of the cumINC fitting results, and then drawing the results. For details on the underlying code, see the source code for this presentation.

 

Add numbers to the risk table

You might want to add the number of risk tables to the cumulative incidence chart, and as far as I know, there is no easy way to do this. See the source code for this presentation for an example


Competitive risk regression

Two methods:

  1. Specific cause risk

    • Instantaneous incidence of a given event type in subjects with no current event
    • Cox regression was used for estimation
  2. Subdistribution Subdistributes risks

    • The instantaneous incidence of a given type of event in subjects who have not experienced such an event
    • Fine-gray regression was used for estimation

 

Competitive risk regression in melanoma data — Subdistribution

Suppose we were interested in studying the effect of age and sex on melanoma deaths, whereas deaths from other causes are competing events.

 

  • crrYou need to specify covariables as matrices
  • Can be used if multiple events are concernedfailcodeOption requests the results of other events, which are returned by defaultfailcode = 1


shr_fit
Copy the code
## convergence: TRUE # coefficients: ## sex age ## 0.58840 0.01259 ## standard errors: ## sided p-values: ## sex age ## sidedCopy the code

In the previous example, both sex and age were encoded as numeric variables. If character variables exist, model.matrix must be used

 

Format the results from CRR

Or output that is not currently supported by CRR.

Competitive risk regression-causal analysis in melanoma data

Review all subjects that are not of concern, in this case due to melanoma death, and use coXPH as usual. As a result, cause-specific risk assessment methods are now applied to patients who die from other causes in response to competing risks.

 

Part 4: Advanced topics

 

Content covered

  • Basic knowledge of survival analysis, including Kaplan-Meier survival function and Cox regression
  • Landmark analysis and time-dependent covariates
  • Cumulative incidence and regression of competitive risk analysis

 

What else?

There can be a lot of bits and pieces:

  1. Evaluate proportional risk assumptions
  2. Survival a smooth survival chart XX was drawn
  3. Conditional survival

 

Assess proportional risk

One of the assumptions of the Cox proportional risk regression model is that the risk is proportional at each time point throughout the follow-up. How do we check if the data fits this hypothesis?

Use the features in the Cox.zPH survival package. The result is two things:

  1. A hypothesis test of whether the effect of each covariable changes over time, and a global test of all covariables.

    • This is done by proving the interaction between the covariable and log (time)
    • Significant p values indicate a violation of the proportional risk hypothesis
  2. While the residual figure

    • Evidence of deviations from the zero slope line indicates a violation of the proportional risk hypothesis

print(cz)
Copy the code
## GLOBAL chisq p ## GLOBAL CHISq P ## GLOBAL CHISq P ## GLOBAL CHISq P ## GLOBAL CHISq P ## GLOBAL CHISq P ## GLOBAL CHISq P ## GLOBAL CHISq P ## GLOBAL CHISq P ##Copy the code
plot(cz)
Copy the code

 

Smooth survival graph – Survival quantile

Sometimes you might want to visualize survival estimates in terms of continuous variables. Find the quantile of survival data. The default quantile is P = 0.5 median survival.

  • X stands for event

  • O stands for review

  • The line is a smoothed estimate of the mean survival rate based on age

 

 

Conditions for survival

Sometimes it makes sense to generate estimates of survival in patients who have survived for some time.

 

Zabor, E., Gonen, M., Chapman, P., & Panageas, K. (2013). Dynamic prognostication using conditional survival estimates. Cancer, 119(20), 3589-3592.

 

Conditional survival estimation

Let’s set survival at six months


map_df(
  prob_times, 
  ~conditional_surv_est(
    basekm = fit1, 
    t1 = 182.625, 
    t2 = .x) 
  ) %>% 
  mutate(months = round(prob_times / 30.4)) %>% 
  select(months, everything()) %>% 
  kable()
Copy the code

Conditional survival diagram

We can also visualize conditional survival data based on different survival time lengths.

 

The resulting curve has a survival curve every time we make a conditioning adjustment. In this case, the first line is the overall survival curve because it is adjusted for time 0.

 

 


reference

1.R language multiple Logistic Logistic regression application case

2. Panel smooth transfer regression (PSTR) analysis case implementation

3. Partial least squares regression (PLSR) and principal component regression (PCR) in MATLAB

4.R language Poisson regression model analysis cases

5. Hosmer-lemeshow goodness of fit test in R language regression

6. Implementation of LASSO regression, Ridge regression and Elastic Net model in R language

7. Realize Logistic Logistic regression in R language

8. Python predicts stock prices using linear regression

9. How to calculate IDI and NRI indices for R language in survival analysis and Cox regression