Using data from the National Health and Nutrition Examination Survey Data I Epidemiologic Follow-up Study (nhefs), we will assess whether quitting smoking leads to a decrease in the risk of death. We'll also account for potential confounding factors: age, sex, blood pressure, diabetes, and exercise.
To begin, let’s pare down the data frame and assign column labels.
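As a rough illustration of what "paring down and labelling" involves, the sketch below uses base R only. The tiny `raw` data frame, the 0/1 coding of `qsmk`, and the "Death" label are assumptions made for the example; the other column names and labels follow the ones that appear later in this document.

```r
# Hypothetical stand-in for the raw data; the real data set has many more columns.
raw <- data.frame(
  seqn  = 1:6,                          # an ID column that is not needed
  qsmk  = c(0, 1, 0, 0, 1, 0),          # assumed coding: 1 = quit smoking
  death = c(0, 0, 1, 0, 1, 0),
  age   = c(42, 55, 61, 38, 47, 50),
  sbp   = c(120, 135, 150, 110, 128, 141)
)

# Pare down: keep only the analysis columns.
keep <- c("qsmk", "death", "age", "sbp")
df_small <- raw[, keep]

# Recode the exposure as a factor; the first level acts as the reference group.
df_small$qsmk <- factor(df_small$qsmk, levels = c(0, 1),
                        labels = c("Did not Quit", "Quit"))

# Store column labels as "label" attributes, which gtsummary uses for display.
labels <- c(qsmk = "Quit Smoking", death = "Death", age = "Age", sbp = "Systolic BP")
for (v in names(labels)) attr(df_small[[v]], "label") <- labels[[v]]

str(df_small)
```

With dplyr and labelled installed, the same steps are often written with `select()` and `set_variable_labels()`; the base R version is shown here only to avoid extra dependencies.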
Prepare a table describing the cohort by whether or not the participant quit smoking. Do not include death in the summary table. Consider using the gtsummary functions that build on a summary table.
1 Wilcoxon rank sum test; Pearson's Chi-squared test
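One way such a cohort table could be built is sketched below. This is not the original solution code: `df_sim` is a simulated stand-in for `df_nhefs` with made-up values, and the choice of `add_overall()` and `add_p()` as the "functions that build on a summary table" is an assumption.

```r
library(gtsummary)

# Simulated stand-in for `df_nhefs` (hypothetical values, not the real cohort).
set.seed(42)
n <- 200
df_sim <- data.frame(
  qsmk  = factor(sample(c("Did not Quit", "Quit"), n, replace = TRUE)),
  death = rbinom(n, 1, 0.2),
  age   = round(rnorm(n, 45, 12)),
  sex   = factor(sample(c("Female", "Male"), n, replace = TRUE))
)

tbl_cohort <- df_sim |>
  tbl_summary(by = qsmk, include = c(age, sex)) |>  # death is deliberately left out
  add_overall() |>                                  # column for the full cohort
  add_p()                                           # compare the two groups
tbl_cohort
```

Listing the variables in `include` explicitly keeps the outcome out of the descriptive table, as the exercise asks.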
Exercise 3
Is there a difference in death rates by smoking status?
Prepare an unadjusted and adjusted rate difference in the table.
```r
# Visit https://www.danieldsjoberg.com/gtsummary/reference/tests.html for a listing of included tests.
# Tests that make use of the `add_difference(adj.vars=)` argument are adjusted analyses.
gts_death_unadjusted <- df_nhefs %>%
  tbl_summary(
    by = qsmk,
    include = death,
    statistic = all_categorical() ~ "{p}%"
  ) %>%
  add_difference() %>%
  add_stat_label()
gts_death_unadjusted
```
1 Regression least-squares adjusted mean difference
2 CI = Confidence Interval
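To see what an unadjusted rate difference amounts to, it can be cross-checked by hand. The sketch below assumes the difference is a plain two-sample difference in proportions (first group minus second) and uses base R's `prop.test()`. The counts are hypothetical, not the NHEFS values.

```r
# Hypothetical counts of deaths and group sizes (not the NHEFS values).
deaths <- c(did_not_quit = 40, quit = 30)
totals <- c(did_not_quit = 250, quit = 120)

# Death rate in each group, and the difference (did not quit minus quit).
rates <- deaths / totals
rate_diff <- unname(rates["did_not_quit"] - rates["quit"])

# prop.test() returns a confidence interval for p1 - p2 and a chi-squared p-value.
res <- prop.test(x = deaths, n = totals)

round(100 * rate_diff, 1)      # difference in percentage points
round(100 * res$conf.int, 1)   # 95% CI in percentage points
res$p.value
```

A negative difference here means the first group (those who did not quit) had the lower death rate, so the sign depends on the order of the groups.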
Exercise 4
Build a logistic regression model with death as the outcome. Include smoking and the other variables as covariates. Summarize your regression model in a table, reporting the odds ratios, confidence intervals, and p-values.
```r
mod <- glm(
  death ~ qsmk + age + sex + sbp + dbp + exercise,
  data = df_nhefs,
  family = binomial
)
gts_mod <- tbl_regression(mod, exponentiate = TRUE) %>%
  add_global_p() %>%
  bold_p() %>%
  italicize_levels() %>%
  modify_caption("**Logistic regression model predicting death**") %>%
  modify_header(label = "**Variable** (N = {N})")
gts_mod
```
Logistic regression model predicting death

| Variable (N = 1548) | OR | 95% CI | p-value |
|---|---|---|---|
| Quit Smoking | | | 0.5 |
| *Did not Quit* | — | — | |
| *Quit* | 0.90 | 0.64, 1.26 | |
| Age | 1.11 | 1.10, 1.13 | <0.001 |
| Sex | | | <0.001 |
| *Female* | — | — | |
| *Male* | 1.78 | 1.31, 2.43 | |
| Systolic BP | 1.02 | 1.01, 1.03 | <0.001 |
| Diastolic BP | 1.00 | 0.98, 1.01 | 0.6 |
| Exercise Level | | | 0.3 |
| *Little or no exercise* | — | — | |
| *Moderate exercise* | 0.78 | 0.56, 1.08 | |
| *Much exercise* | 0.81 | 0.52, 1.25 | |

OR = Odds Ratio, CI = Confidence Interval
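The odds ratios in a table like this are exponentiated logistic regression coefficients. The base R sketch below shows the idea on simulated data; `sim` and its values are hypothetical. It uses Wald intervals from `confint.default()`, which may differ slightly from the intervals a regression table reports.

```r
# Simulated data (hypothetical; not the NHEFS cohort).
set.seed(123)
n <- 500
sim <- data.frame(
  age  = round(runif(n, 25, 75)),
  qsmk = factor(sample(c("Did not Quit", "Quit"), n, replace = TRUE))
)
# Simulate death with risk rising with age on the log-odds scale.
sim$death <- rbinom(n, 1, plogis(-6 + 0.1 * sim$age))

fit <- glm(death ~ qsmk + age, data = sim, family = binomial)

# Exponentiate coefficients and Wald limits to move from log-odds to odds ratios.
or_table <- cbind(
  OR = exp(coef(fit)),
  exp(confint.default(fit))
)
round(or_table[-1, ], 2)   # drop the intercept row
```

An odds ratio above 1 for `age` means the odds of death rise with each additional year; the row for `qsmkQuit` compares quitters with the reference level, "Did not Quit".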
Exercise 5
Write a brief summary of the results above using inline_text() to report values from the tables directly into the markdown report.
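A minimal sketch of how `inline_text()` pulls values out of gtsummary tables is shown below. It uses the `trial` data set that ships with gtsummary as a stand-in for `df_nhefs`, so the variable names (`trt`, `age`, `grade`, `response`) are those of the example data, not of this analysis.

```r
library(gtsummary)

# `trial` ships with gtsummary; it stands in for `df_nhefs` here.
tbl_sum <- trial |>
  tbl_summary(by = trt, include = age) |>
  add_p()

fit <- glm(response ~ age + grade, data = trial, family = binomial)
tbl_reg <- tbl_regression(fit, exponentiate = TRUE)

# Pull single cells out of each table as formatted strings.
txt_median <- inline_text(tbl_sum, variable = age, column = "Drug A")
txt_p      <- inline_text(tbl_sum, variable = age, column = "p.value")
txt_or     <- inline_text(tbl_reg, variable = grade, level = "III")

txt_median
txt_p
txt_or
```

In an R Markdown or Quarto report, the same calls are placed in inline code, for example `` `r inline_text(tbl_reg, variable = grade, level = "III")` ``, so the text updates whenever the table changes.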
The analysis assessing the relationship between quitting smoking and subsequent death within the next 20 years included 1548 participants. The median age among those who quit was higher compared to those who did not (46 vs 42; p<0.001).
On univariate analysis, participants who did not quit smoking had lower rates of death than those who quit (difference -4.9%; 95% CI -9.7%, -0.07%; p=0.037). However, on multivariable analysis, the relationship was no longer significant (odds ratio 0.90; 95% CI 0.64, 1.26; p=0.5).