| ID | Gender | Age | Age.of.onset | EDSS | Does.the.time.difference.between.MRI.acquisition.and.EDSS…two.months | Types.of.Medicines | Presenting.Symptom | Dose.the.patient.has.Co.moroidity | Pyramidal | Cerebella | Brain.stem | Sensory | Sphincters | Visual | Mental | Speech | Motor.System | Sensory.System | Coordination | Gait | Bowel.and.bladder.function | Mobility | Mental.State | Optic.discs | Fields | Nystagmus | Ocular.Movement | Swallowing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | F | 56 | 43 | 3.0 | No | Gelenia | Motor | No | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
| 2 | F | 29 | 19 | 1.5 | No | Gelenia | Sensory | No | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | F | 15 | 8 | 4.0 | No | Tysabri | Motor | No | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 4 | F | 24 | 20 | 6.0 | No | Tysabri | Sensory | No | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | F | 33 | 31 | 0.0 | No | Avonex | Pain | No | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | F | 44 | 40 | 5.0 | No | Avonex | Motor | No | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 7 | M | 43 | 40 | 3.5 | No | Betaferon | Motor & Visual | No | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 8 | F | 32 | 30 | 1.0 | No | Gelenia | Visual | No | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | F | 36 | 33 | 6.0 | No | Gelenia | Motore | No | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 10 | F | 39 | 35 | 3.0 | No | Betaferon | Motor & Behavioural | No | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
gtsummary - Your New Go-To for Tables
I thought I should bring this excellent package to your attention in case you weren’t aware of it, as I have taken gtsummary somewhat for granted in the few years since it first appeared on CRAN. I’m prompted in part by a research student who recently had to remake several “Table 1”-style tables for a manuscript after a data change, and who was going to redo them manually. When they realised how much time gtsummary could save them, I think they were fairly impressed. So today I’m just going to show you a couple of basic functionalities of this package. It is extremely extensible, and if you can’t find answers for your own customisation needs on the homepage or in the vignettes, I have found that googling the issue often brings an answer. The developer is also quite active on stackoverflow.com. The homepage can be found at:
https://www.danieldsjoberg.com/gtsummary/index.html
We’re going to use a publicly available MS dataset, so if you want to run the code yourself you will first need to download the data from:
This dataset contains the demographic and clinical data on 60 patients (MRI data in accompanying datasets available at link).
1 Load and Inspect the Data
Let’s have a look at the first few lines:
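The loading step itself is not shown here, so the following is only a minimal sketch. It assumes the download is a CSV file read with base R's read.csv(). The temporary file, its headers (written with a space to show how the dotted column names could arise) and the three rows copied from the data preview are stand-ins for the real file.

```r
# Stand-in for the downloaded file (an assumption: the real file name and
# layout are not given in the post). Three rows are copied from the preview.
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("ID,Gender,Age,Age of onset,EDSS",
    "1,F,56,43,3.0",
    "2,F,29,19,1.5",
    "7,M,43,40,3.5"),
  tmp
)

# read.csv() uses check.names = TRUE by default, which turns spaces and
# symbols in headers into dots (e.g. "Age of onset" becomes Age.of.onset).
dat <- read.csv(tmp)

head(dat, 10)  # first few lines
str(dat)       # column types at a glance
```

With the real file you would pass its path to read.csv() instead of `tmp`.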
2 Summary Table
Let’s say you want to create a summary table showing descriptive statistics of the various demographic and clinical characteristics, stratified by DMT (Types.of.Medicines). In the first instance, this can be a basic call of tbl_summary() specifying Types.of.Medicines as the stratifying variable. We want to specify medians (IQR) and n’s (%’s) as the summary statistics.
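The call that produced the first table is not included in the text, so here is a hedged reconstruction rather than the original code. The "{n}/{N} ({p}%)" pattern is inferred from the "n/N (%)" footnote of that table. The small `dat` is a toy stand-in built from five columns of the ten preview rows, and the `type` argument is only needed because the toy data has so few distinct values.

```r
library(dplyr)      # select()
library(gtsummary)  # tbl_summary(), add_overall()

# Toy stand-in for the full dataset: five columns of the ten preview rows.
dat <- data.frame(
  ID = 1:10,
  Gender = c("F", "F", "F", "F", "F", "F", "M", "F", "F", "F"),
  Age = c(56, 29, 15, 24, 33, 44, 43, 32, 36, 39),
  Age.of.onset = c(43, 19, 8, 20, 31, 40, 40, 30, 33, 35),
  EDSS = c(3.0, 1.5, 4.0, 6.0, 0.0, 5.0, 3.5, 1.0, 6.0, 3.0),
  Types.of.Medicines = c("Gelenia", "Gelenia", "Tysabri", "Tysabri", "Avonex",
                         "Avonex", "Betaferon", "Gelenia", "Gelenia", "Betaferon")
)

tbl <- dat |>
  select(-ID) |>
  tbl_summary(
    by = Types.of.Medicines,  # stratify by DMT
    # Toy-data workaround: numeric columns with fewer than 10 distinct values
    # are otherwise summarised as categorical.
    type = list(c(Age, Age.of.onset, EDSS) ~ "continuous"),
    statistic = list(all_continuous() ~ "{median} ({p25},{p75})",
                     all_categorical() ~ "{n}/{N} ({p}%)"),
    digits = all_continuous() ~ 1
  ) |>
  add_overall()  # prepend the Overall column

tbl
```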
| Characteristic | Overall, N = 60¹ | Avonex, N = 5¹ | Betaferon, N = 24¹ | Gelenia, N = 9¹ | Rebif, N = 14¹ | Tysabri, N = 8¹ |
|---|---|---|---|---|---|---|
| Gender | ||||||
| F | 46/60 (77%) | 5/5 (100%) | 15/24 (63%) | 9/9 (100%) | 10/14 (71%) | 7/8 (88%) |
| M | 13/60 (22%) | 0/5 (0%) | 8/24 (33%) | 0/9 (0%) | 4/14 (29%) | 1/8 (13%) |
| N | 1/60 (1.7%) | 0/5 (0%) | 1/24 (4.2%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Age | 33.0 (20.0,42.3) | 24.0 (23.0,33.0) | 37.5 (23.8,43.0) | 42.0 (36.0,52.0) | 32.5 (18.5,38.0) | 20.5 (15.0,24.3) |
| Age.of.onset | 30.5 (19.8,40.0) | 20.0 (20.0,31.0) | 35.0 (23.0,41.0) | 40.0 (30.0,42.0) | 31.0 (18.5,37.0) | 17.0 (16.3,21.3) |
| EDSS | 2.0 (1.0,3.5) | 1.5 (1.0,4.0) | 2.3 (1.0,3.1) | 3.0 (1.5,3.0) | 1.3 (1.0,2.4) | 3.0 (1.4,4.3) |
| Does.the.time.difference.between.MRI.acquisition.and.EDSS...two.months | 26/60 (43%) | 0/5 (0%) | 10/24 (42%) | 3/9 (33%) | 11/14 (79%) | 2/8 (25%) |
| Presenting.Symptom | ||||||
| Balance | 4/60 (6.7%) | 0/5 (0%) | 2/24 (8.3%) | 0/9 (0%) | 2/14 (14%) | 0/8 (0%) |
| Balance &Motor | 1/60 (1.7%) | 0/5 (0%) | 0/24 (0%) | 0/9 (0%) | 0/14 (0%) | 1/8 (13%) |
| Motor | 10/60 (17%) | 1/5 (20%) | 3/24 (13%) | 1/9 (11%) | 3/14 (21%) | 2/8 (25%) |
| Motor & Behavioural | 1/60 (1.7%) | 0/5 (0%) | 1/24 (4.2%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Motor & Sensory | 1/60 (1.7%) | 0/5 (0%) | 1/24 (4.2%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Motor & Visual | 2/60 (3.3%) | 0/5 (0%) | 2/24 (8.3%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Motore | 1/60 (1.7%) | 0/5 (0%) | 0/24 (0%) | 1/9 (11%) | 0/14 (0%) | 0/8 (0%) |
| Pain | 1/60 (1.7%) | 1/5 (20%) | 0/24 (0%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Sensory | 19/60 (32%) | 0/5 (0%) | 8/24 (33%) | 3/9 (33%) | 7/14 (50%) | 1/8 (13%) |
| Sensory & Visual | 1/60 (1.7%) | 0/5 (0%) | 1/24 (4.2%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Sensory & Motor | 1/60 (1.7%) | 0/5 (0%) | 1/24 (4.2%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Sensory & Visual | 1/60 (1.7%) | 0/5 (0%) | 0/24 (0%) | 1/9 (11%) | 0/14 (0%) | 0/8 (0%) |
| Sensory & Visual ,Balance , Motor, Sexual,Fatigue | 1/60 (1.7%) | 0/5 (0%) | 1/24 (4.2%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Sensory &Motor | 1/60 (1.7%) | 0/5 (0%) | 0/24 (0%) | 0/9 (0%) | 0/14 (0%) | 1/8 (13%) |
| Visual | 14/60 (23%) | 3/5 (60%) | 4/24 (17%) | 2/9 (22%) | 2/14 (14%) | 3/8 (38%) |
| Visual & Balance | 1/60 (1.7%) | 0/5 (0%) | 0/24 (0%) | 1/9 (11%) | 0/14 (0%) | 0/8 (0%) |
| Dose.the.patient.has.Co.moroidity | 13/60 (22%) | 0/5 (0%) | 8/24 (33%) | 3/9 (33%) | 2/14 (14%) | 0/8 (0%) |
| Pyramidal | 31/60 (52%) | 2/5 (40%) | 14/24 (58%) | 5/9 (56%) | 4/14 (29%) | 6/8 (75%) |
| Cerebella | 17/60 (28%) | 1/5 (20%) | 8/24 (33%) | 2/9 (22%) | 3/14 (21%) | 3/8 (38%) |
| Brain.stem | 5/60 (8.3%) | 1/5 (20%) | 1/24 (4.2%) | 0/9 (0%) | 1/14 (7.1%) | 2/8 (25%) |
| Sensory | 18/60 (30%) | 1/5 (20%) | 8/24 (33%) | 3/9 (33%) | 3/14 (21%) | 3/8 (38%) |
| Sphincters | 9/60 (15%) | 0/5 (0%) | 5/24 (21%) | 0/9 (0%) | 2/14 (14%) | 2/8 (25%) |
| Visual | 17/60 (28%) | 3/5 (60%) | 6/24 (25%) | 2/9 (22%) | 2/14 (14%) | 4/8 (50%) |
| Mental | 2/60 (3.3%) | 0/5 (0%) | 2/24 (8.3%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Speech | 6/60 (10%) | 0/5 (0%) | 4/24 (17%) | 0/9 (0%) | 1/14 (7.1%) | 1/8 (13%) |
| Motor.System | 35/60 (58%) | 3/5 (60%) | 14/24 (58%) | 5/9 (56%) | 6/14 (43%) | 7/8 (88%) |
| Sensory.System | 19/60 (32%) | 0/5 (0%) | 8/24 (33%) | 4/9 (44%) | 4/14 (29%) | 3/8 (38%) |
| Coordination | 17/60 (28%) | 2/5 (40%) | 6/24 (25%) | 2/9 (22%) | 2/14 (14%) | 5/8 (63%) |
| Gait | 17/60 (28%) | 2/5 (40%) | 7/24 (29%) | 1/9 (11%) | 4/14 (29%) | 3/8 (38%) |
| Bowel.and.bladder.function | 9/60 (15%) | 1/5 (20%) | 2/24 (8.3%) | 1/9 (11%) | 3/14 (21%) | 2/8 (25%) |
| Mobility | 4/60 (6.7%) | 0/5 (0%) | 2/24 (8.3%) | 1/9 (11%) | 1/14 (7.1%) | 0/8 (0%) |
| Mental.State | 3/60 (5.0%) | 0/5 (0%) | 2/24 (8.3%) | 0/9 (0%) | 1/14 (7.1%) | 0/8 (0%) |
| Optic.discs | 22/60 (37%) | 2/5 (40%) | 8/24 (33%) | 3/9 (33%) | 4/14 (29%) | 5/8 (63%) |
| Fields | 0/60 (0%) | 0/5 (0%) | 0/24 (0%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| Nystagmus | 7/60 (12%) | 1/5 (20%) | 3/24 (13%) | 2/9 (22%) | 0/14 (0%) | 1/8 (13%) |
| Ocular.Movement | 2/60 (3.3%) | 0/5 (0%) | 0/24 (0%) | 1/9 (11%) | 0/14 (0%) | 1/8 (13%) |
| Swallowing | 3/60 (5.0%) | 0/5 (0%) | 3/24 (13%) | 0/9 (0%) | 0/14 (0%) | 0/8 (0%) |
| ¹ n/N (%); Median (25%,75%) | ||||||
That’s a pretty good start. However, repeating the column total as the denominator in every cell is just clutter, so let’s remove that. We’ll also include an argument that labels the missing-value row, should any missing data exist. Additionally, we want to tidy up some of the variable names - I’ll just do Age, Age.of.onset and the somewhat convoluted Does.the.time.difference.between.MRI.acquisition.and.EDSS...two.months for now. For the latter we’ll use a short name and add a footnote that expands on the variable description.
# dplyr supplies select(); gtsummary supplies tbl_summary() and friends
library(dplyr)
library(gtsummary)

dat |>
  select(-ID) |>
  tbl_summary(
    by = Types.of.Medicines,
    # n (%) without the column total as denominator
    statistic = list(all_continuous() ~ "{median} ({p25},{p75})",
                     all_categorical() ~ "{n} ({p}%)"),
    digits = all_continuous() ~ 1,
    # label used for the missing-value row, shown only if data are missing
    missing_text = "(Missing)",
    label = c(Age ~ "Age, yrs - median (IQR)",
              Age.of.onset ~ "Age onset, yrs - median (IQR)",
              Does.the.time.difference.between.MRI.acquisition.and.EDSS...two.months ~ "Time difference < 2 months")) |>
  # attach the full variable description as a footnote to the shortened label
  modify_table_styling(columns = label,
                       rows = label == "Time difference < 2 months",
                       footnote = "Does the time difference between MRI acquisition and EDSS < two months") |>
  add_overall()

| Characteristic | Overall, N = 60¹ | Avonex, N = 5¹ | Betaferon, N = 24¹ | Gelenia, N = 9¹ | Rebif, N = 14¹ | Tysabri, N = 8¹ |
|---|---|---|---|---|---|---|
| Gender | ||||||
| F | 46 (77%) | 5 (100%) | 15 (63%) | 9 (100%) | 10 (71%) | 7 (88%) |
| M | 13 (22%) | 0 (0%) | 8 (33%) | 0 (0%) | 4 (29%) | 1 (13%) |
| N | 1 (1.7%) | 0 (0%) | 1 (4.2%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Age, yrs - median (IQR) | 33.0 (20.0,42.3) | 24.0 (23.0,33.0) | 37.5 (23.8,43.0) | 42.0 (36.0,52.0) | 32.5 (18.5,38.0) | 20.5 (15.0,24.3) |
| Age onset, yrs - median (IQR) | 30.5 (19.8,40.0) | 20.0 (20.0,31.0) | 35.0 (23.0,41.0) | 40.0 (30.0,42.0) | 31.0 (18.5,37.0) | 17.0 (16.3,21.3) |
| EDSS | 2.0 (1.0,3.5) | 1.5 (1.0,4.0) | 2.3 (1.0,3.1) | 3.0 (1.5,3.0) | 1.3 (1.0,2.4) | 3.0 (1.4,4.3) |
| Time difference < 2 months² | 26 (43%) | 0 (0%) | 10 (42%) | 3 (33%) | 11 (79%) | 2 (25%) |
| Presenting.Symptom | ||||||
| Balance | 4 (6.7%) | 0 (0%) | 2 (8.3%) | 0 (0%) | 2 (14%) | 0 (0%) |
| Balance &Motor | 1 (1.7%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (13%) |
| Motor | 10 (17%) | 1 (20%) | 3 (13%) | 1 (11%) | 3 (21%) | 2 (25%) |
| Motor & Behavioural | 1 (1.7%) | 0 (0%) | 1 (4.2%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Motor & Sensory | 1 (1.7%) | 0 (0%) | 1 (4.2%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Motor & Visual | 2 (3.3%) | 0 (0%) | 2 (8.3%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Motore | 1 (1.7%) | 0 (0%) | 0 (0%) | 1 (11%) | 0 (0%) | 0 (0%) |
| Pain | 1 (1.7%) | 1 (20%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Sensory | 19 (32%) | 0 (0%) | 8 (33%) | 3 (33%) | 7 (50%) | 1 (13%) |
| Sensory & Visual | 1 (1.7%) | 0 (0%) | 1 (4.2%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Sensory & Motor | 1 (1.7%) | 0 (0%) | 1 (4.2%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Sensory & Visual | 1 (1.7%) | 0 (0%) | 0 (0%) | 1 (11%) | 0 (0%) | 0 (0%) |
| Sensory & Visual ,Balance , Motor, Sexual,Fatigue | 1 (1.7%) | 0 (0%) | 1 (4.2%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Sensory &Motor | 1 (1.7%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (13%) |
| Visual | 14 (23%) | 3 (60%) | 4 (17%) | 2 (22%) | 2 (14%) | 3 (38%) |
| Visual & Balance | 1 (1.7%) | 0 (0%) | 0 (0%) | 1 (11%) | 0 (0%) | 0 (0%) |
| Dose.the.patient.has.Co.moroidity | 13 (22%) | 0 (0%) | 8 (33%) | 3 (33%) | 2 (14%) | 0 (0%) |
| Pyramidal | 31 (52%) | 2 (40%) | 14 (58%) | 5 (56%) | 4 (29%) | 6 (75%) |
| Cerebella | 17 (28%) | 1 (20%) | 8 (33%) | 2 (22%) | 3 (21%) | 3 (38%) |
| Brain.stem | 5 (8.3%) | 1 (20%) | 1 (4.2%) | 0 (0%) | 1 (7.1%) | 2 (25%) |
| Sensory | 18 (30%) | 1 (20%) | 8 (33%) | 3 (33%) | 3 (21%) | 3 (38%) |
| Sphincters | 9 (15%) | 0 (0%) | 5 (21%) | 0 (0%) | 2 (14%) | 2 (25%) |
| Visual | 17 (28%) | 3 (60%) | 6 (25%) | 2 (22%) | 2 (14%) | 4 (50%) |
| Mental | 2 (3.3%) | 0 (0%) | 2 (8.3%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Speech | 6 (10%) | 0 (0%) | 4 (17%) | 0 (0%) | 1 (7.1%) | 1 (13%) |
| Motor.System | 35 (58%) | 3 (60%) | 14 (58%) | 5 (56%) | 6 (43%) | 7 (88%) |
| Sensory.System | 19 (32%) | 0 (0%) | 8 (33%) | 4 (44%) | 4 (29%) | 3 (38%) |
| Coordination | 17 (28%) | 2 (40%) | 6 (25%) | 2 (22%) | 2 (14%) | 5 (63%) |
| Gait | 17 (28%) | 2 (40%) | 7 (29%) | 1 (11%) | 4 (29%) | 3 (38%) |
| Bowel.and.bladder.function | 9 (15%) | 1 (20%) | 2 (8.3%) | 1 (11%) | 3 (21%) | 2 (25%) |
| Mobility | 4 (6.7%) | 0 (0%) | 2 (8.3%) | 1 (11%) | 1 (7.1%) | 0 (0%) |
| Mental.State | 3 (5.0%) | 0 (0%) | 2 (8.3%) | 0 (0%) | 1 (7.1%) | 0 (0%) |
| Optic.discs | 22 (37%) | 2 (40%) | 8 (33%) | 3 (33%) | 4 (29%) | 5 (63%) |
| Fields | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Nystagmus | 7 (12%) | 1 (20%) | 3 (13%) | 2 (22%) | 0 (0%) | 1 (13%) |
| Ocular.Movement | 2 (3.3%) | 0 (0%) | 0 (0%) | 1 (11%) | 0 (0%) | 1 (13%) |
| Swallowing | 3 (5.0%) | 0 (0%) | 3 (13%) | 0 (0%) | 0 (0%) | 0 (0%) |
| ¹ n (%); Median (25%,75%) | ||||||
| ² Does the time difference between MRI acquisition and EDSS < two months | ||||||
If you want to save the created table, you can do this in one of two ways. The first is to save it directly as a .docx file, which should work most of the time. However, if you notice any formatting issues, change the file extension of the save target to .html, then open that file in Word and you should be OK as well. An important point is to first assign the table to an object in your R script, e.g.
tbl <- dat |> tbl_summary(...
The command to save the table as a Word (or HTML) file is then:
gt::gtsave(as_gt(tbl), filename = "summary_table.docx", path = "...your_path.../")
3 Regression Table
gtsummary’s other strength is in making regression tables, and the relevant workhorse function here is tbl_regression().
Let’s say we’re interested in the association between Age onset and the presence of Sensory symptoms (I don’t really know whether this makes sense or not, but it’s just to run a regression). The outcome variable here is binary, so we’ll need to specify a logistic regression model. We can do that as follows in R, and we obtain the standard (fairly bland from the point of view of presentation/collaboration) output:
Call:
glm(formula = Sensory ~ Age.of.onset, family = "binomial", data = dat)
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.75743    0.87101  -2.018   0.0436 *
Age.of.onset  0.02987    0.02641   1.131   0.2581
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 73.304 on 59 degrees of freedom
Residual deviance: 71.994 on 58 degrees of freedom
AIC: 75.994
Number of Fisher Scoring iterations: 4
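The output above can be produced by a call of the following shape; the formula and family are those printed in its Call line. Because the dataset is not bundled with this post, the sketch simulates 60 stand-in observations (`dat_sim` is a hypothetical name), using the printed coefficients only to generate plausible values. Its estimates will therefore differ from the ones shown.

```r
# Simulated stand-in for the real data (an assumption, not the MS dataset):
# 60 onset ages and a binary Sensory indicator.
set.seed(2024)
dat_sim <- data.frame(Age.of.onset = round(runif(60, min = 8, max = 55)))
dat_sim$Sensory <- rbinom(60, size = 1,
                          prob = plogis(-1.75 + 0.03 * dat_sim$Age.of.onset))

# Logistic regression: binary outcome, binomial family (logit link by default)
mod <- glm(Sensory ~ Age.of.onset, family = "binomial", data = dat_sim)

# Standard base R model summary
summary(mod)
```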
Let’s pretty this up by passing the model results through tbl_regression():
| Characteristic | log(OR)¹ | 95% CI¹ | p-value |
|---|---|---|---|
| Age.of.onset | 0.03 | -0.02, 0.08 | 0.3 |
| ¹ OR = Odds Ratio, CI = Confidence Interval | |||
Not bad, but we’d like the output to be in terms of odds-ratios rather than log odds-ratios. That’s actually quite simple to do:
| Characteristic | OR¹ | 95% CI¹ | p-value |
|---|---|---|---|
| Age.of.onset | 1.03 | 0.98, 1.09 | 0.3 |
| ¹ OR = Odds Ratio, CI = Confidence Interval | |||
What if you want to include some model fit statistics as well?
| Characteristic | OR¹ | 95% CI¹ | p-value |
|---|---|---|---|
| Age.of.onset | 1.03 | 0.98, 1.09 | 0.3 |
| Null deviance = 73.3; Null df = 59.0; Log-likelihood = -36.0; AIC = 76.0; BIC = 80.2; Deviance = 72.0; Residual df = 58; No. Obs. = 60 | |||
| ¹ OR = Odds Ratio, CI = Confidence Interval | |||
That’s all great, but I’ve just noticed that the predictor variable isn’t formatted so well, so let’s change that.
| Characteristic | OR¹ | 95% CI¹ | p-value |
|---|---|---|---|
| Age onset | 1.03 | 0.98, 1.09 | 0.3 |
| Null deviance = 73.3; Null df = 59.0; Log-likelihood = -36.0; AIC = 76.0; BIC = 80.2; Deviance = 72.0; Residual df = 58; No. Obs. = 60 | |||
| ¹ OR = Odds Ratio, CI = Confidence Interval | |||
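The calls behind the four regression tables in this section are not shown, so what follows is a sketch of how such a progression could be written, not the original code. It again uses simulated stand-in data (`dat_sim` and the table object names are hypothetical). The use of add_glance_source_note() is an inference from the look of the fit-statistics note in the tables.

```r
library(gtsummary)

# Simulated stand-in data (an assumption, not the MS dataset)
set.seed(2024)
dat_sim <- data.frame(Age.of.onset = round(runif(60, min = 8, max = 55)))
dat_sim$Sensory <- rbinom(60, size = 1,
                          prob = plogis(-1.75 + 0.03 * dat_sim$Age.of.onset))
mod <- glm(Sensory ~ Age.of.onset, family = "binomial", data = dat_sim)

# 1. Default table: coefficients on the log odds-ratio scale
tbl_log_or <- tbl_regression(mod)

# 2. Exponentiate the coefficients to report odds ratios
tbl_or <- tbl_regression(mod, exponentiate = TRUE)

# 3. Append model fit statistics as a source note
tbl_fit <- tbl_or |>
  add_glance_source_note()

# 4. Same table with a reader-friendly label for the predictor
tbl_final <- tbl_regression(mod,
                            exponentiate = TRUE,
                            label = Age.of.onset ~ "Age onset") |>
  add_glance_source_note()

tbl_final
```

Building each step as a separate object makes it easy to print and compare the intermediate tables; in practice only the last call is needed.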
tbl_regression() supports most common model types, including lm(), glm(), survival::coxph() and mixed models fitted with lme4.
4 Last Word
I hope you find both of these functions useful in your day-to-day coding and data analysis - they are great additions to your R toolkit, not only for their time-saving capabilities, but also the fantastic improvements to the visual style of results formatting that you can achieve, for which base R often falls far short.