Title: | A Multistate Life Table (MSLT) Methodology Based on Bayesian approach |
---|---|
Description: | Create life tables with a Bayesian approach, which can be very useful for modelling a complex health process when considering multiple predisposing factors and multiple coexisting health conditions. |
Authors: | Emma Zang [aut, cph], Xuezhixing Zhang [aut, cre], Scott Lynch [aut, cph] |
Maintainer: | Xuezhixing Zhang <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0.0 |
Built: | 2024-11-21 03:35:01 UTC |
Source: | https://github.com/dewelve/bayesmlogit |
A Bayesian Multistate Life Table Method for survey data, developed by Lynch and Zang (2022), allowing for large state spaces with quasi-absorbing states (i.e., structural zeros in a transition matrix).
bayesmlogit( y, X, file_path = NA, samp = 1000, burn = 500, verbose = 100, thin = 5, trace.plot = FALSE )
y |
A vector of state transitions, which can be created either manually or with |
X |
A matrix of covariates. Note that |
file_path |
The file path for outputs. If a path is specified, the result will also be saved in the given file path. You can find two result files in the specified file: |
samp |
Number of posterior samples. For efficiency purposes, if you need a large sample (e.g., |
burn |
'burn-in' period. Default is 500. |
verbose |
Progress report. Default is 100, which means this function will report the current progress for every 100 posterior samples. |
thin |
The thinning strategy to reduce autocorrelation. For example, if |
trace.plot |
If TRUE, this function will create a new directory under given |
This function is adapted from the deprecated bayeslogit package, which fits Bayesian multinomial logistic regressions using Polya-Gamma latent variables (Polson et al. 2013). It should be used together with the mlifeTable() function, which generates life tables from the regression estimates.
A list that contains two arrays:
out: An array that contains all posterior samples generated.
outwstepwidth: An array generated by selecting one sample from every thin samples in out.
The number of columns in both arrays is determined by the number of covariates in X and the number of unique transitions in y. For example, with 12 covariates in X and 36 unique transitions in y, the result will contain (12 + 1) * (36 - 1) = 455 columns in total.
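The relationship between the two arrays and the column count can be sketched in base R. This is an illustration only: `out_sim` is a simulated matrix standing in for real output, and which rows the package keeps when thinning (here, every 5th row) is an assumption.

```r
# Illustration only: simulated draws stand in for real bayesmlogit() output.
set.seed(1)
n_covariates  <- 12    # covariates in X
n_transitions <- 36    # unique transitions in y
samp <- 1000           # number of posterior samples
thin <- 5              # thinning interval

# One column per coefficient: (covariates + intercept) * (transitions - 1).
n_cols <- (n_covariates + 1) * (n_transitions - 1)

out_sim <- matrix(rnorm(samp * n_cols), nrow = samp, ncol = n_cols)

# Keep one draw out of every `thin` draws (assumed: rows 5, 10, 15, ...).
keep <- seq(thin, samp, by = thin)
out_thinned <- out_sim[keep, , drop = FALSE]

dim(out_sim)      # 1000 455
dim(out_thinned)  # 200 455
```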
mlifeTable, lifedata, CreateTrans
## Not run:
data <- lifedata
y <- data[, 1]
X <- data[, -1]

# This example will take about 30 mins.
out <- bayesmlogit(y, X, samp = 1000, burn = 500, verbose = 10)
## End(Not run)
A function used to create transition vectors from data in long format. It requires the dplyr package.
CreateTrans(ID, Age, State, Death, states)
ID |
A vector that specifies the ID for each subject. |
Age |
A vector that indicates each subject's age at this visit. |
State |
A vector or a factor that indicates the state for each subject at this visit. |
Death |
A vector that indicates whether the subject died or not at this visit. |
states |
The total number of states in our data. |
The rules for creating transitions can be found with ?lifedata. In essence, arrange the data in long format, with one row per subject per visit that records the subject's state at time t. The function then pairs each subject's state at time t-1 with the state at time t to produce the transition for that row.
A vector that contains all transitions.
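The pairing of states at t-1 and t can be sketched in base R. The helper `make_trans()` below is hypothetical and is not the package's CreateTrans() implementation; it assumes integer-coded states, that death is the last state, and that each subject's first visit has no transition (NA).

```r
# Hypothetical helper, not the package's CreateTrans() implementation.
make_trans <- function(ID, Age, State, Death, states) {
  ord <- order(ID, Age)          # sort visits within each subject by age
  id  <- ID[ord]

  # Assumption: a death at this visit is coded as the last state.
  cur <- ifelse(Death[ord] == 1, states, State[ord])

  # Previous state within the same subject; NA at each subject's first visit.
  prev <- c(NA, cur[-length(cur)])
  first_visit <- c(TRUE, id[-1] != id[-length(id)])
  prev[first_visit] <- NA

  # Row-major index: (previous state - 1) * number of states + current state.
  trans_sorted <- (prev - 1) * states + cur

  # Return the transitions in the original row order.
  trans <- rep(NA_real_, length(ID))
  trans[ord] <- trans_sorted
  trans
}

ID    <- c(1, 1, 1, 2, 2)
Age   <- c(50, 52, 54, 60, 62)
State <- c(1, 2, 2, 2, 1)
Death <- c(0, 0, 1, 0, 0)
res <- make_trans(ID, Age, State, Death, states = 3)
res  # NA 2 6 NA 4
```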
ID <- rep(1:50, each = 5)
Age <- rep(31:35, times = 50)
State <- sample(1:5, size = 250, replace = TRUE)
Death <- rep(c(0, 0, 0, 0, 1), times = 50)
Example <- data.frame(ID, Age, State, Death)
Example$trans <- CreateTrans(Example$ID, Example$Age, Example$State, Example$Death, states = 6)
A function for comparing the life expectancies of subgroups. This function will, by default, calculate the percentage of samples in your reference group with a higher (or lower) life expectancy (or proportion of total life expectancy) than other groups.
life_compare( file_path, file = paste(file_path, "/mplotResults", sep = ""), state.include = 0, states, ref.var, ref.level, index.matrix, prop = TRUE, criterion = ">", state.names = NA )
file_path |
The file path for data reading. It can be inherited from |
file |
The file path for outputs. Default is |
state.include |
The states we aim to compare. It can be a number or a vector. Default is 0, which means we'll consider all states. It can be inherited from |
states |
The total number of states in data. It can be inherited from |
ref.var |
A vector containing all covariates used as comparison factors for each subgroup. |
ref.level |
A vector that declares the reference value of each reference variable. |
index.matrix |
A matrix generated in |
prop |
If TRUE, this function will output the comparison results of life expectancy proportions in addition to original comparison results. Default is TRUE. It can be inherited from |
criterion |
The criterion for comparison, which can be either ">" or "<". Default is ">". |
state.names |
A vector used to specify names of each state except death. It can be inherited from |
A .csv file with comparison results.
## Not run:
# Setting the parameter 'compare' in mlifeTable_plot() to TRUE
# calls this function directly.
mlifeTable_plot(X = lifedata[, -1], state.include = 3,
                groupby = c("male", "black", "hispanic"),
                cred = 0.84,
                states = 3,
                file_path = ".",
                compare = TRUE,
                ref.var = c("black", "hispanic"),
                ref.level = c(0, 0))
## End(Not run)
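The default comparison, the percentage of samples in the reference group with a higher (or lower) life expectancy than another group, can be sketched as follows. The draws are simulated, the variable names are hypothetical, and comparing the draws pairwise by sample index is an assumption rather than a description of how life_compare() works internally.

```r
set.seed(42)

# Simulated posterior draws of life expectancy (in years) for two subgroups.
le_ref   <- rnorm(400, mean = 30, sd = 1)  # reference group
le_other <- rnorm(400, mean = 28, sd = 1)  # comparison group

# Percentage of paired draws in which the reference group is higher
# (criterion ">") or lower (criterion "<") than the other group.
pct_higher <- mean(le_ref > le_other) * 100
pct_lower  <- mean(le_ref < le_other) * 100

c(higher = pct_higher, lower = pct_lower)
```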
Data extracted and processed from The Health and Retirement Study (HRS).
lifedata
A data frame with 8198 rows and 16 variables:
Transitions recorded in the original data. This data set contains 6 kinds of transitions in total.
Age for each subject.
Sex for each subject. male=1, female=0.
Dummy variables for race.
Marital status.
Dummy variables for education level.
Birth cohort, which is birth year minus 1900.
Dummy variables for birth regions.
Dummy variables for residential regions.
To use this package with your data, please make sure your data have a vector for transitions. The transitions can be manually created following the example below:
In lifedata, each subject can be in one of 3 states: 1: health; 2: unhealthiness; 3: death.
Thus there are 6 kinds of possible transitions: 1: health to health; 2: health to unhealthiness; 3: health to death; 4: unhealthiness to health; 5: unhealthiness to unhealthiness; 6: unhealthiness to death. To check the transition for each subject, please use lifedata[,1].
When creating transitions yourself, please follow the order below:
Previous \ Current | Health | Unhealthiness | Death |
---|---|---|---|
Health | 1 | 2 | 3 |
Unhealthiness | 4 | 5 | 6 |
Death | - | - | - |
where the first column indicates the previous state of subjects and the first row indicates the current state that subjects are in. The numbers indicate the index of each transition.
For impossible transitions like death to death, you can also label them following the above order, which won't change the results. If transitions are not created in this order, the computation may encounter an error. One can also use CreateTrans() to create the transition vector.
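The ordering rule amounts to numbering the cells of the state-by-state table row by row. A minimal base-R sketch of that lookup table is given here; the object names are hypothetical and the code is independent of the package.

```r
states <- c("Health", "Unhealthiness", "Death")
k <- length(states)

# Fill row by row, so that index = (previous - 1) * k + current.
index <- matrix(seq_len(k * k), nrow = k, byrow = TRUE,
                dimnames = list(previous = states, current = states))

index["Health", "Death"]           # 3
index["Unhealthiness", "Health"]   # 4

# The Death row (7, 8, 9) covers the impossible transitions out of death.
index
```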
A Bayesian Multistate Life Table Method for survey data, developed by Lynch and Zang (2022), allowing for large state spaces with quasi-absorbing states (i.e., structural zeros in a transition matrix).
mlifeTable( y, X, trans, states, file_path, groupby = NA, no_control = NA, values = NA, status = 0, startages = 0, endages = 110, age.gap = 1, nums = dim(trans)[1], mlifeTable_plot = FALSE, state.names = NA, ... )
y |
A vector of transitions. |
X |
A matrix of covariates. Note that |
trans |
The posterior samples generated using |
states |
The total number of states in data. |
file_path |
The file path for outputs. |
groupby |
A vector that contains the covariates for subgroup comparisons. Default is NA, which means that we won't make subgroups. |
no_control |
The covariates that we don't want to control in subgroup analysis. Default is NA, which means we will control all covariates in X. As an example, in Lynch and Zang's study (2022), they incorporated education into the multinomial logit model. However, in the life table calculation, if one does not want to control for education, one could opt to use its region-specific mean rather than the sample mean using no_control. |
values |
A list that specifies values for covariates. Default is NA. If both no_control and values are specified, the option values takes precedence. |
status |
A numeric value. This option allows producing status-based life tables. Default is 0, which produces population-based life tables. |
startages |
Start age of the life table. Default is 0. |
endages |
End age of the life table. Default is 110. |
age.gap |
This option allows users to specify the age interval of the life table. Default is 1. For example, if the survey data were sampled every 2 years, users can specify the age interval to be 2 in the life table. |
nums |
Number of life tables generated for each subgroup. Default is the size of posterior samples we used. |
mlifeTable_plot |
If TRUE, this option will create a new directory |
state.names |
A vector used to specify names of each state except death. You can also specify them in the output files. |
... |
Extra parameters for |
This function generates life tables based on the estimates from the Bayesian multinomial logit regressions, which can be obtained using the bayesmlogit() function. The values in the generated life table represent the expected remaining years to be spent in each state conditional on a given age. The current version is designed to generate life tables only from data with a death state.
Life tables for each subgroup.
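To make "expected remaining years in each state" concrete, here is a deliberately simplified sketch. It is not the package's algorithm: it assumes a single hypothetical transition matrix `P` that does not change with age, whereas the package derives transition probabilities from the regression estimates, and it counts time at the end of each interval with no half-interval adjustment.

```r
# Hypothetical transition probabilities per age interval
# (rows: previous state, columns: current state; death is absorbing).
state_names <- c("health", "unhealthiness", "death")
P <- matrix(c(0.85, 0.10, 0.05,
              0.20, 0.65, 0.15,
              0.00, 0.00, 1.00),
            nrow = 3, byrow = TRUE,
            dimnames = list(state_names, state_names))

start_age <- 50
end_age   <- 110
age_gap   <- 1
n_steps   <- (end_age - start_age) / age_gap

# Everyone starts in the health state at the start age.
occupancy <- c(1, 0, 0)
years <- c(health = 0, unhealthiness = 0)

for (i in seq_len(n_steps)) {
  # Move the cohort forward by one age interval.
  occupancy <- as.vector(occupancy %*% P)
  # Accumulate the time spent alive in each non-death state.
  years <- years + occupancy[1:2] * age_gap
}

years        # expected remaining years in each living state
sum(years)   # total remaining life expectancy under these assumptions
```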
## Not run:
# The life tables generated in this example have 3 columns, which correspond to 3 states:
# 1: health; 2: unhealthiness; 3: death.
data <- lifedata
y <- data[, 1]
X <- data[, -1]

# This example will take about 30 mins.
out <- bayesmlogit(y, X, samp = 1000, burn = 500, verbose = 10)
trans <- out$outwstepwidth

mlifeTable(y, X, trans = trans,
           groupby = c("male", "black", "hispanic"),
           no_control = "mar",
           startages = 50, age.gap = 1, states = 3,
           file_path = ".")

# To name each subgroup, try the subgroup.names option.
mlifeTable(y, X, trans = trans,
           groupby = c("male", "black", "hispanic"),
           no_control = "mar",
           states = 3, startages = 50, age.gap = 1,
           file_path = ".",
           subgroup.names = c("F-W", "M-W", "M-B", "F-B", "F-H", "M-H"))

# To generate plots, try the mlifeTable_plot option.
mlifeTable(y, X, trans = trans,
           groupby = c("male", "black", "hispanic"),
           no_control = "mar",
           states = 3, startages = 50, age.gap = 1, nums = 400,
           file_path = ".",
           subgroup.names = c("F-W", "M-W", "M-B", "F-B", "F-H", "M-H"),
           mlifeTable_plot = TRUE,
           cred = 0.84)

# To fix a variable at a value other than its mean, try the "values" option.
mlifeTable(y, X, trans = trans,
           groupby = c("male", "black", "hispanic"),
           no_control = "mar",
           values = list("cohort" = 36),
           states = 3, startages = 50, age.gap = 1, nums = 400,
           file_path = ".",
           subgroup.names = c("F-W", "M-W", "M-B", "F-B", "F-H", "M-H"),
           mlifeTable_plot = TRUE,
           cred = 0.84)
## End(Not run)
A function for plotting posterior means and their credible intervals. It can also be used as a subfunction in mlifeTable().
mlifeTable_plot( state.include = 0, groupby, file_path, X, cred = 0.84, states, prop = TRUE, subgroup.names = NULL, state.names = NA, compare = FALSE, midpoint.type = "mean", ... )
state.include |
A vector or a number used to specify the states whose life expectancies are of interest. Default is 0, which means we'll generate plots for all states. When multiple states are specified, the output includes the life expectancy for each state and their sum. |
groupby |
A vector that contains covariates for subgroup comparisons. It can be inherited from |
file_path |
The file path for outputs. It can be inherited from |
X |
A matrix of covariates. Note that X must include age as a covariate. It can be inherited from |
cred |
Credible level. For example, if |
states |
The total number of states in data. It can be inherited from |
prop |
If TRUE, this function will output life expectancy proportion plots and tables in addition to original life expectancy plots. Default is TRUE. |
subgroup.names |
A vector that contains names of each subgroup. You can also specify them in the output files. |
state.names |
A vector used to specify names of each state except death. It can be inherited from |
compare |
If TRUE, this function will call |
midpoint.type |
A character used to specify the midpoint type for credible interval plots. Can be either "mean" or "median". Default is "mean", which means the plots will use mean values as the middle point. |
... |
Extra parameters for |
Plots and tables for posterior means and credible intervals of each subgroup.
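The quantities being plotted can be sketched from a vector of posterior draws. The draws here are simulated, and the use of an equal-tailed quantile interval is an assumption; the package may construct its intervals differently.

```r
set.seed(7)

# Simulated posterior draws of life expectancy (years) for one subgroup.
draws <- rnorm(400, mean = 25, sd = 2)

cred  <- 0.84
alpha <- 1 - cred

# Equal-tailed credible interval: the 8th and 92nd percentiles for cred = 0.84.
interval <- quantile(draws, probs = c(alpha / 2, 1 - alpha / 2))

# Candidate midpoints, matching midpoint.type = "mean" or "median".
midpoint_mean   <- mean(draws)
midpoint_median <- median(draws)

interval
c(mean = midpoint_mean, median = midpoint_median)
```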
## Not run:
# Generate plots and corresponding tables only.
mlifeTable_plot(X = lifedata[, -1], state.include = 0,
                groupby = c("male", "black", "hispanic"),
                cred = 0.84,
                states = 3,
                file_path = ".")

# Additionally generate the comparison results against the reference level.
mlifeTable_plot(X = lifedata[, -1], state.include = 0,
                groupby = c("male", "black", "hispanic"),
                cred = 0.84,
                states = 3,
                file_path = ".",
                compare = TRUE,
                ref.var = c("black", "hispanic"),
                ref.level = c(0, 0))
## End(Not run)