Using the National Health and Nutrition Examination Survey Data (nhefs), we will assess whether quitting smoking leads to a decrease in the risk of death. We’ll also investigate potential confounding factors: age, sex, blood pressure, diabetes, and exercise.

To begin, let’s pare down the data frame and assign column labels.
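One possible way to do this step is sketched below in base R. The raw data frame here is simulated (it is not the real NHEFS data), the `id` and `extra` columns are hypothetical stand-ins for variables the analysis does not need, and the "Death" label is an assumption. The other labels are taken from the regression table later in this section. gtsummary uses a column's `label` attribute when naming table rows.

```r
# Simulated stand-in for the raw data (NOT real NHEFS values)
set.seed(2024)
n <- 300
raw <- data.frame(
  id = seq_len(n),                      # hypothetical column to be dropped
  death = rbinom(n, 1, 0.2),
  qsmk = rbinom(n, 1, 0.3),             # assumed coding: 1 = quit, 0 = did not quit
  age = sample(25:74, n, replace = TRUE),
  sex = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  sbp = round(rnorm(n, 128, 19)),
  dbp = round(rnorm(n, 77, 10)),
  exercise = factor(sample(
    c("Little or no exercise", "Moderate exercise", "Much exercise"),
    n, replace = TRUE
  )),
  extra = rnorm(n)                      # hypothetical column to be dropped
)

# Pare down to the analysis variables and keep complete cases only
keep <- c("death", "qsmk", "age", "sex", "sbp", "dbp", "exercise")
df_toy <- na.omit(raw[, keep])
df_toy$qsmk <- factor(df_toy$qsmk, levels = 0:1,
                      labels = c("Did not Quit", "Quit"))

# Attach a "label" attribute to each column
var_labels <- c(
  death = "Death", qsmk = "Quit Smoking", age = "Age", sex = "Sex",
  sbp = "Systolic BP", dbp = "Diastolic BP", exercise = "Exercise Level"
)
for (v in names(var_labels)) {
  attr(df_toy[[v]], "label") <- var_labels[[v]]
}

str(df_toy)
```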

Prepare a table describing the cohort by whether or not the participant quit smoking. Do not include death in the summary table. Consider using the gtsummary functions that build on a summary table.
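A minimal sketch of one way to build such a table follows. It runs on a simulated data frame, `df_toy`, that only mimics the column names used in this section (it is not the real NHEFS data). With the exercise data you would pass `df_nhefs` instead. `add_p()` and `add_overall()` are examples of functions that build on a `tbl_summary()` object.

```r
library(gtsummary)

# Simulated stand-in data (NOT real NHEFS values)
set.seed(2024)
n <- 300
df_toy <- data.frame(
  death = rbinom(n, 1, 0.2),
  qsmk = factor(rbinom(n, 1, 0.3), levels = 0:1,
                labels = c("Did not Quit", "Quit")),
  age = sample(25:74, n, replace = TRUE),
  sex = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  sbp = round(rnorm(n, 128, 19)),
  dbp = round(rnorm(n, 77, 10)),
  exercise = factor(sample(
    c("Little or no exercise", "Moderate exercise", "Much exercise"),
    n, replace = TRUE
  ))
)

# Describe the cohort by quit status; death is left out via `include`
tbl_cohort <- df_toy %>%
  tbl_summary(
    by = qsmk,
    include = c(age, sex, sbp, dbp, exercise)
  ) %>%
  add_overall() %>%  # adds a column summarizing the whole cohort
  add_p() %>%        # adds a p-value comparing the two groups
  bold_labels()

tbl_cohort
```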

^{1} Wilcoxon rank sum test; Pearson's Chi-squared test

Exercise 3

Is there a difference in death rates by smoking status?

Report both the unadjusted and the adjusted rate difference in a table.

# Visit https://www.danieldsjoberg.com/gtsummary/reference/tests.html for a listing of included tests.
# Tests that make use of the `add_difference(adj.vars=)` argument are adjusted analyses.
gts_death_unadjusted <-
  df_nhefs %>%
  tbl_summary(
    by = qsmk,
    include = death,
    statistic = all_categorical() ~ "{p}%"
  ) %>%
  add_difference() %>%
  add_stat_label()

gts_death_unadjusted
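The code for the adjusted difference is not shown here. The sketch below is one way it could be done, assuming the `"emmeans"` test (which requires the emmeans package to be installed) and assuming the adjustment set is age, sex, blood pressure, and exercise. It runs on simulated stand-in data, `df_toy`, rather than the real `df_nhefs`.

```r
library(gtsummary)

# Simulated stand-in data (NOT real NHEFS values)
set.seed(2024)
n <- 300
df_toy <- data.frame(
  death = rbinom(n, 1, 0.2),
  qsmk = factor(rbinom(n, 1, 0.3), levels = 0:1,
                labels = c("Did not Quit", "Quit")),
  age = sample(25:74, n, replace = TRUE),
  sex = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  sbp = round(rnorm(n, 128, 19)),
  dbp = round(rnorm(n, 77, 10)),
  exercise = factor(sample(
    c("Little or no exercise", "Moderate exercise", "Much exercise"),
    n, replace = TRUE
  ))
)

# Adjusted rate difference: the covariates go in `adj.vars`
gts_death_adjusted <- df_toy %>%
  tbl_summary(
    by = qsmk,
    include = death,
    statistic = all_categorical() ~ "{p}%"
  ) %>%
  add_difference(
    test = death ~ "emmeans",
    adj.vars = c(age, sex, sbp, dbp, exercise)
  ) %>%
  add_stat_label()

gts_death_adjusted
```

The unadjusted and adjusted tables could then be placed side by side with `tbl_merge()`.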

^{1} Regression least-squares adjusted mean difference

^{2} CI = Confidence Interval

Exercise 4

Build a logistic regression model with death as the outcome. Include smoking and the other variables as covariates. Summarize your regression model in a table, reporting the odds ratios, confidence intervals, and p-values.

mod <- glm(
  death ~ qsmk + age + sex + sbp + dbp + exercise,
  data = df_nhefs,
  family = binomial
)

gts_mod <-
  tbl_regression(mod, exponentiate = TRUE) %>%
  add_global_p() %>%
  bold_p() %>%
  italicize_levels() %>%
  modify_caption("**Logistic regression model predicting death**") %>%
  modify_header(label = "**Variable** (N = {N})")

gts_mod

Logistic regression model predicting death

| Variable (N = 1548) | OR^{1} | 95% CI^{1} | p-value |
|---|---|---|---|
| Quit Smoking | | | 0.5 |
| *Did not Quit* | — | — | |
| *Quit* | 0.90 | 0.64, 1.26 | |
| Age | 1.11 | 1.10, 1.13 | <0.001 |
| Sex | | | <0.001 |
| *Female* | — | — | |
| *Male* | 1.78 | 1.31, 2.43 | |
| Systolic BP | 1.02 | 1.01, 1.03 | <0.001 |
| Diastolic BP | 1.00 | 0.98, 1.01 | 0.6 |
| Exercise Level | | | 0.3 |
| *Little or no exercise* | — | — | |
| *Moderate exercise* | 0.78 | 0.56, 1.08 | |
| *Much exercise* | 0.81 | 0.52, 1.25 | |

^{1} OR = Odds Ratio, CI = Confidence Interval
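To make the odds ratio column concrete: `exponentiate = TRUE` reports exponentiated model coefficients, so an odds ratio is `exp()` of a log-odds coefficient and the interval limits are transformed the same way. The base R sketch below illustrates this on simulated stand-in data, `df_toy` (not the real NHEFS data). It uses `confint()`, on the assumption that this matches the kind of interval `tbl_regression()` reports by default.

```r
# Simulated stand-in data (NOT real NHEFS values)
set.seed(2024)
n <- 300
df_toy <- data.frame(
  death = rbinom(n, 1, 0.2),
  qsmk = factor(rbinom(n, 1, 0.3), levels = 0:1,
                labels = c("Did not Quit", "Quit")),
  age = sample(25:74, n, replace = TRUE),
  sex = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  sbp = round(rnorm(n, 128, 19)),
  dbp = round(rnorm(n, 77, 10)),
  exercise = factor(sample(
    c("Little or no exercise", "Moderate exercise", "Much exercise"),
    n, replace = TRUE
  ))
)

mod_toy <- glm(
  death ~ qsmk + age + sex + sbp + dbp + exercise,
  data = df_toy,
  family = binomial
)

# Exponentiate the coefficients and their interval limits
or_table <- exp(cbind(OR = coef(mod_toy), suppressMessages(confint(mod_toy))))

# Drop the intercept row, which is not an odds ratio of interest
round(or_table[-1, ], 2)
```

The first level of each factor (for example "Did not Quit" and "Female") is the reference category, which is why those rows show no estimate in the table.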

Exercise 5

Write a brief summary of the results above using inline_text() to report values from the tables directly into the markdown report.
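A minimal sketch of how `inline_text()` can pull values out of gtsummary tables is shown below. It runs on simulated stand-in data, `df_toy` (not the real NHEFS data), and the object names `tbl_age` and `tbl_mod` are hypothetical. In an R Markdown report, each call would sit inside an inline chunk within the sentence.

```r
library(gtsummary)

# Simulated stand-in data (NOT real NHEFS values)
set.seed(2024)
n <- 300
df_toy <- data.frame(
  death = rbinom(n, 1, 0.2),
  qsmk = factor(rbinom(n, 1, 0.3), levels = 0:1,
                labels = c("Did not Quit", "Quit")),
  age = sample(25:74, n, replace = TRUE),
  sex = factor(sample(c("Female", "Male"), n, replace = TRUE))
)

tbl_age <- df_toy %>%
  tbl_summary(by = qsmk, include = age) %>%
  add_p()

tbl_mod <- glm(death ~ qsmk + age + sex, data = df_toy, family = binomial) %>%
  tbl_regression(exponentiate = TRUE)

# Median age in each group ("stat_1" and "stat_2" are the two by-columns)
age_no_quit <- inline_text(tbl_age, variable = age, column = "stat_1",
                           pattern = "{median}")
age_quit <- inline_text(tbl_age, variable = age, column = "stat_2",
                        pattern = "{median}")
age_p <- inline_text(tbl_age, variable = age, column = "p.value")

# Odds ratio and confidence interval for the "Quit" level of qsmk
or_text <- inline_text(
  tbl_mod,
  variable = qsmk,
  level = "Quit",
  pattern = "odds ratio {estimate}; 95% CI {conf.low}, {conf.high}"
)

c(age_no_quit, age_quit, age_p, or_text)
```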

The analysis assessing the relationship between quitting smoking and subsequent death within the next 20 years included 1548 participants. The median age among those who quit was higher compared to those who did not (46 vs 42; p<0.001).

On univariate analysis, participants who quit smoking had higher rates of death than those who did not (difference, did not quit minus quit, -4.9%; 95% CI -9.7%, -0.07%; p=0.037). However, on multivariable analysis, the relationship was no longer significant (odds ratio 0.90; 95% CI 0.64, 1.26).