Exploring longitudinal treatment sequences

A brief R + {reactable} pattern that makes drug or treatment sequences quickly searchable

16 minute read Published: 11 Feb, 2021

This post demonstrates a brief R pattern, paired with reactable, that makes drug or treatment patterns quickly searchable by an end user via an .html report.

# Install packages if you need to
install.packages(c("tidyverse", "reactable"))

library(tidyverse)
library(reactable)

Objective

I often need to summarize what lines of therapy were received by patients from large electronic health record (EHR) data. End users of these summaries (clinicians or research investigators) typically want to quickly know how many patients received a combination of drugs in certain sequences. I’ve found the following pattern of arranging treatment strings into a searchable reactable table gets the job done.

Simulated data for demonstration

Suppose you have data across 1000 patients with an average of 5 rows of therapy for each. The data are arranged in long format, and the first 15 rows look like this:

set.seed(8675309)

# number of patients
n <- 1000
# number of rows (several per patient)
n_rows <- n*5

ehr_data <- tibble(
  patient_id = sample(12345:(12345+n-1), size = n_rows, replace = TRUE),
  treatment = sample(
    c("Abiraterone",
      "Enzalutamide",
      "Docetaxel",
      "Lupron",
      "Prednisone"
      ),
    size = n_rows, replace = TRUE, prob = c(.3, .3, .1, .1, .2)
  ),
  treatment_date = sample(
    seq(as.Date("2016-01-01"), as.Date("2018-12-31"), by = "day"),
    size = n_rows, replace = TRUE
  )
)

# print the first 15 rows
ehr_data %>%
  arrange(patient_id, treatment_date) %>%
  head(15)

## # A tibble: 15 x 3
##    patient_id treatment    treatment_date
##         <int> <chr>        <date>        
##  1      12345 Prednisone   2016-02-05    
##  2      12345 Abiraterone  2017-10-02    
##  3      12345 Enzalutamide 2018-06-21    
##  4      12345 Docetaxel    2018-11-05    
##  5      12346 Enzalutamide 2016-01-02    
##  6      12346 Enzalutamide 2016-05-27    
##  7      12346 Abiraterone  2016-08-01    
##  8      12346 Abiraterone  2017-11-15    
##  9      12346 Lupron       2018-02-08    
## 10      12346 Enzalutamide 2018-03-18    
## 11      12346 Abiraterone  2018-08-12    
## 12      12346 Lupron       2018-09-27    
## 13      12347 Abiraterone  2017-04-30    
## 14      12347 Abiraterone  2018-06-25    
## 15      12348 Abiraterone  2016-08-11

Treatment sequences by patient

One view of the data that may be requested is of the type: “I want to know which patients received abiraterone in combination with other drugs. I also want to look at patterns of care for patients who received enzalutamide.” The following reactable table does this job: type abi or enza in the search filter box for treatment sequence, and boom!

ehr_data %>% 
  group_by(patient_id) %>% 
  arrange(treatment_date) %>% 
  summarise(
    treatment_sequence = paste(unique(treatment), collapse = ", "), 
    .groups = "drop_last"
  ) %>% 
  rename(
    "Treatment sequence" = treatment_sequence,
    "Patient ID" = patient_id
  ) %>%
  reactable(
    columns = list(
      "Patient ID" = colDef(minWidth = 100),
      "Treatment sequence" = colDef(minWidth = 300)
    ),
    filterable = TRUE
  )
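To see what the summarise() step produces before it reaches reactable, here is a minimal sketch on a toy data set. The `toy` tibble and its values are made up for illustration; it sorts by date explicitly before grouping, which gives the same within-patient ordering as the pattern above.

```r
library(dplyr)

# Hypothetical toy data: two patients, rows deliberately out of date order
toy <- tibble(
  patient_id = c(1, 1, 1, 2, 2),
  treatment = c("Enzalutamide", "Abiraterone", "Abiraterone", "Docetaxel", "Lupron"),
  treatment_date = as.Date(
    c("2017-03-01", "2016-01-15", "2016-06-01", "2018-02-01", "2016-05-01")
  )
)

toy_sequences <- toy %>%
  # order each patient's rows by date so unique() keeps first receipt order
  arrange(patient_id, treatment_date) %>%
  group_by(patient_id) %>%
  summarise(
    # unique() drops repeat drugs; paste() collapses the rest into one string
    treatment_sequence = paste(unique(treatment), collapse = ", "),
    .groups = "drop"
  )

toy_sequences
```

Patient 1 ends up with a single string listing Abiraterone before Enzalutamide, even though the Enzalutamide row came first in the raw data and Abiraterone appeared twice.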

Frequency of treatment sequences

Another common request is “What are the 5 most common treatment sequences?” or “Which treatment sequences have sufficient patient numbers that I can study them?” Adding a count() to the above code pattern and sorting the n column in the reactable table gets that job done quite nicely.

ehr_data %>% 
  group_by(patient_id) %>% 
  arrange(treatment_date) %>% 
  summarise(
    treatment_sequence = paste(unique(treatment), collapse = ", "), 
    .groups = "drop_last"
  ) %>%
  count(treatment_sequence, sort = TRUE) %>%
  rename("Treatment sequence" = treatment_sequence) %>%
  reactable(
    columns = list(
      "Treatment sequence" = colDef(minWidth = 400),
      n = colDef(minWidth = 100)
    ),
    sortable = TRUE, filterable = TRUE
  )
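If you need the same answer outside the browser (say, to quote a number in an email), a rough static equivalent of typing abi into the filter box might look like the sketch below. The `sequences` tibble is hypothetical and stands in for the per-patient summarise() output; the case-insensitive substring match is my assumption about how the interactive text filter behaves, so treat it as an approximation.

```r
library(dplyr)
library(stringr)

# Hypothetical per-patient sequences, standing in for the summarise() output
sequences <- tibble(
  patient_id = 1:5,
  treatment_sequence = c(
    "Abiraterone, Prednisone",
    "Enzalutamide",
    "Abiraterone, Prednisone",
    "Docetaxel, Abiraterone",
    "Enzalutamide"
  )
)

# Number of patients per distinct sequence, most common first
sequence_counts <- sequences %>%
  count(treatment_sequence, sort = TRUE)

# Keep sequences that mention abiraterone anywhere, ignoring case
abi_counts <- sequence_counts %>%
  filter(str_detect(treatment_sequence, regex("abi", ignore_case = TRUE)))

abi_counts

# Total patients with abiraterone somewhere in their sequence
sum(abi_counts$n)
```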

Final note

One thing to note is that concurrent or repeat treatments are not handled separately in the above code patterns, and the tables only print a given drug once, according to its first date of receipt. For example, if a patient’s regimen was Abiraterone + Prednisone → Enzalutamide → Abiraterone, that patient’s treatment sequence would show up as Abiraterone, Prednisone, Enzalutamide. There are certainly more rigorous ways to do this (list columns come to mind); however, I’ve found this sort of view goes a long way towards assessing study feasibility or focusing attention on those patients with treatment regimens relevant to a particular research study.
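As one possible refinement, and only a sketch, the example in this final note could be preserved by treating drugs recorded on the same date as concurrent and by dropping a step only when it repeats the step immediately before it. The same-date rule is an assumption that may not hold in real EHR data, and the `regimen` tibble below is made up to mirror that example.

```r
library(dplyr)

# Hypothetical regimen: Abiraterone + Prednisone, then Enzalutamide
# (recorded twice in a row), then Abiraterone again
regimen <- tibble(
  patient_id = 1,
  treatment = c(
    "Abiraterone", "Prednisone", "Enzalutamide", "Enzalutamide", "Abiraterone"
  ),
  treatment_date = as.Date(
    c("2016-01-01", "2016-01-01", "2017-01-01", "2017-02-01", "2018-01-01")
  )
)

regimen_sequence <- regimen %>%
  # Assumption: drugs recorded on the same date were given concurrently
  group_by(patient_id, treatment_date) %>%
  summarise(
    step = paste(sort(unique(treatment)), collapse = " + "),
    .groups = "drop"
  ) %>%
  arrange(patient_id, treatment_date) %>%
  group_by(patient_id) %>%
  # Drop a step only when it is identical to the step right before it
  filter(is.na(lag(step)) | step != lag(step)) %>%
  summarise(
    treatment_sequence = paste(step, collapse = " -> "),
    .groups = "drop"
  )

regimen_sequence$treatment_sequence
```

Here the later return to Abiraterone survives as its own step, while the back-to-back Enzalutamide rows collapse into one. The resulting string column can be passed to reactable in the same way as before.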

Full code at: https://github.com/tgerke/treatment-sequences

Published by Travis Gerke on Thursday, February 11, 2021 in R and tagged Tips using 3212 words.