R - Loading data from Google Drive

Recently, a colleague asked me to help another colleague who was having trouble loading data stored in a Google Drive account into R. I had never thought about using Google Drive as a place to store data and then load it into the R environment. Normally, I store and load data from GitHub, but there are some limitations, particularly when the dataset is very large. Google Drive might be an easy workaround for this limitation, so I decided to figure out how to make this work.

I posted this tutorial on my RPubs site.

Presentations with R Markdown - Part 3: Changing font colors

I continue my series on constructing a presentation using R Markdown with a new addition (Part 3) on font colors. We use colors to highlight text and draw attention to it. In this article, I provide code showing how to do this in an R Markdown presentation built with revealjs.

Part 3 of this series is available on my RPubs site.

Survival Analysis - Immortal Time Bias with Stata

I wrote a tutorial on how to handle immortal time bias in survival analysis using Stata. In the tutorial, I used a time-varying predictor for the grouping variable and assigned the period before exposure to the control group. This was inspired by the paper by Redelmeier and Singh, “Survival in Academy Award-Winning Actors and Actresses.” There was a lot of debate about the rigor of their analyses, and Sylvestre and colleagues re-analyzed the data with immortal time bias in mind. This tutorial uses data from Sylvestre and colleagues to re-create their results.
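The idea of assigning pre-exposure time to the control group can be sketched outside of Stata. The Python snippet below is only an illustration under my own assumptions: the function name `split_follow_up`, the `Interval` fields, and the example subjects are hypothetical and are not taken from the tutorial or from the Sylvestre data. It splits each subject's follow-up at the exposure time so that the period before exposure counts as unexposed person-time.

```python
from typing import List, NamedTuple, Optional


class Interval(NamedTuple):
    subject: str
    start: float
    stop: float
    exposed: int  # 1 = exposed during this interval, 0 = unexposed
    event: int    # 1 = the event occurred at `stop`, 0 = otherwise


def split_follow_up(subject: str, follow_up: float, event: int,
                    exposure_time: Optional[float] = None) -> List[Interval]:
    """Split one subject's follow-up at the time of exposure (hypothetical helper)."""
    # Never exposed during follow-up: one unexposed interval carrying the event.
    if exposure_time is None or exposure_time >= follow_up:
        return [Interval(subject, 0.0, follow_up, 0, event)]
    rows = []
    # Time before exposure is credited to the unexposed (control) group.
    if exposure_time > 0:
        rows.append(Interval(subject, 0.0, exposure_time, 0, 0))
    # Time after exposure is credited to the exposed group and carries the event.
    rows.append(Interval(subject, exposure_time, follow_up, 1, event))
    return rows


# Hypothetical subjects: A is exposed at time 4, B is never exposed.
for row in split_follow_up("A", 10.0, 1, 4.0) + split_follow_up("B", 6.0, 1):
    print(row)
```

Each output row corresponds to one start-stop record, which is the general shape of data that a survival model with a time-varying covariate expects.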

The tutorial is on my RPubs page. Data used for the tutorial is located on my GitHub page.

To load the data, you can use the Stata import command:

import delimited "https://raw.githubusercontent.com/mbounthavong/Survival-analysis-and-immortal-time-bias/main/Data/data1.csv"

Constructing a Markov model for cost-effectiveness analysis using Excel: A tutorial

I wrote a tutorial on how to construct a Markov model using Excel, which is available on my RPubs site (link). This was meant to complement a workshop that I am preparing for trainees interested in pharmacoeconomics.

The Markov model is a versatile mathematical model that allows researchers to simulate a chronic disease over many years. Its defining feature is a set of disease states, each of which can carry its own costs and benefits.
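To make the idea of disease states with attached costs and benefits concrete, here is a minimal cohort simulation sketched in Python rather than Excel. Everything in it is an assumption for illustration: the three states, the transition probabilities, and the cost and utility values are hypothetical and are not the numbers used in the tutorial. Discounting and half-cycle correction are left out to keep it short.

```python
# Hypothetical three-state model; all values are illustrative only.
STATES = ["Healthy", "Sick", "Dead"]

# P[i][j] = probability of moving from state i to state j in one cycle.
P = [
    [0.85, 0.10, 0.05],
    [0.00, 0.80, 0.20],
    [0.00, 0.00, 1.00],
]

COST = [500.0, 3000.0, 0.0]   # cost per cycle spent in each state
UTILITY = [0.95, 0.60, 0.0]   # utility per cycle spent in each state


def run_cohort(start, n_cycles):
    """Return the Markov trace: the share of the cohort in each state per cycle."""
    trace = [list(start)]
    for _ in range(n_cycles):
        current = trace[-1]
        nxt = [
            sum(current[i] * P[i][j] for i in range(len(STATES)))
            for j in range(len(STATES))
        ]
        trace.append(nxt)
    return trace


# Everyone starts in the Healthy state; simulate 20 cycles.
trace = run_cohort([1.0, 0.0, 0.0], 20)

# Accumulate undiscounted costs and utility over cycles 1..20.
total_cost = sum(sum(p * c for p, c in zip(row, COST)) for row in trace[1:])
total_qalys = sum(sum(p * u for p, u in zip(row, UTILITY)) for row in trace[1:])

print(f"Total cost per person: {total_cost:.2f}")
print(f"Total QALYs per person: {total_qalys:.2f}")
```

The list `trace` plays the role of the Markov trace table in a spreadsheet: one row per cycle, one column per state, with each row summing to 1.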

I posted files associated with this tutorial on my GitHub Markov Model Tutorial repository (link). These include some readings to provide sufficient background and the Excel file with the Markov model example. To properly download these files, make sure to go to the “Raw” file, right click on the “Raw” option, and then select “Save link as” to save it onto your computer. There is a detailed explanation in the RPubs tutorial.

This is a work in progress, so expect some updates in the future.

MEPS Tutorial - Some of my helpful notes

There are a lot of lessons that I’ve learned from using the Medical Expenditure Panel Survey (MEPS) data from the Agency for Healthcare Research and Quality (AHRQ). Some of these I learned after I made some mistakes and some I learned from other people. Overall, it’s a short but evolving note of the things that I’ve learned about MEPS and its nuances. I plan on updating this in the future as I expect to learn more new things. But for those who are interested in learning what I’ve learned, you can read my notes on my RPubs page, which is here.

MEPS tutorial on interrupted time series analysis in R

I wrote a short tutorial on how to perform an interrupted time series analysis in R. I had a challenging time working on this because I wasn’t familiar with all the nuances of the ITSA. More importantly, I wasn’t able to leverage my Stata skills to do this in R. I’m used to the Stata margins command, which is great for creating contrasts. R has its own version of the margins command, but it lacks some of Stata’s features such as pwcompare, which I use a lot in Stata. However, I found a workaround with linear splines, and I have uploaded this to my RPubs site (link). I hope you find this useful. I also saved my R Markdown code on my GitHub site (link).

MEPS tutorials on linkage files and trend analysis

I created two MEPS tutorials recently. One is on the use of condition-event linkage files to capture disease-specific costs. I used migraine as a motivating example. In this tutorial, I go through the steps to identify migraine-related costs associated with office-based visits and inpatient night stays. In the second tutorial, I review how to perform simple trend analysis with linear regression models. I pooled MEPS data from 2016 to 2021 and applied the appropriate primary sampling units and strata from the pooled file.

The first tutorial is located on my RPubs page (MEPS Tutorial 4 - Using condition-event link (CLNK) file: A case study with migraine). The R Markdown code to create the tutorial is located in my GitHub repository (link).

The second tutorial is also located on my RPubs page (MEPS Tutorial 5 - Simple Trend Analysis with Linear Models). The R Markdown code to create the tutorial is located in my GitHub repository (link).

Interrupted time series analysis (ITSA) with Stata

Interrupted time series analysis (ITSA) is a study design used to study the effects of an intervention across time. An important feature of the ITSA is the time when the intervention occurs. The periods before and after the intervention are of interest because we want to see whether the trends are similar or different. Additionally, we want to visualize the change immediately after the intervention is implemented. I call this period the index date.

In this article, I’ll review the single-group ITSA and the multiple-group ITSA. Then I’ll review how to perform an ITSA in Stata.
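A single-group ITSA is commonly set up as a segmented regression with three terms: a running time counter, an indicator for the post-intervention period, and the time elapsed since the intervention. The Python sketch below only builds those three variables. The function name `itsa_terms`, the 12-period example, and the choice to code time-since-intervention as 0 at the index date are my assumptions for illustration, not necessarily the coding used in the tutorial.

```python
def itsa_terms(n_periods, index_period):
    """Build (time, post, time_since) rows for a single-group ITSA (hypothetical helper).

    time       : 1, 2, ..., n_periods
    post       : 1 from the index period onward, 0 before it
    time_since : periods elapsed since the index period (0 before and at the index)
    """
    rows = []
    for t in range(1, n_periods + 1):
        post = 1 if t >= index_period else 0
        time_since = t - index_period if post else 0
        rows.append((t, post, time_since))
    return rows


# Hypothetical example: 12 periods with the intervention starting at period 7.
for time, post, time_since in itsa_terms(12, 7):
    print(time, post, time_since)
```

In a model of the form outcome = b0 + b1*time + b2*post + b3*time_since, b1 is the pre-intervention trend, b2 is the immediate level change at the index date, and b3 is the change in trend after the intervention.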

You can view the complete tutorial on my RPubs site.

Exact matching using R - MatchIt package

Recently, I was asked to help create a matching algorithm for a retrospective cohort study. The request was to perform an exact match on a single variable using a 2 to 1 ratio (unexposed to exposed). Normally, I would use a propensity score match (PSM) approach, but the data did not have enough variables for each unique subject. With PSM, I tend to build a logit (or probit) model using variables that would be theoretically associated with the treatment assignment. However, this approach requires enough observable variables to construct these PSM models. For this request, there were only a few variables for each subject; the only variables available were the unique identifier, site, and a continuous variable.

This problem led to a tutorial on how to perform an exact match using the MatchIt package in R, which can be viewed on my RPubs page.

In this tutorial, you will learn how to perform an exact match with a single variable using a hypothetical dataset with 30 subjects.
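For readers who want to see the logic of a 2 to 1 exact match without any package, here is a plain Python sketch. It is not how MatchIt works internally and it is not the code from the tutorial. The function name `exact_match`, the seven made-up subjects, and the choice of site as the matching variable are all assumptions for illustration. Unexposed subjects are used at most once, and exposed subjects without enough unexposed partners are left unmatched.

```python
from collections import defaultdict


def exact_match(subjects, ratio=2):
    """Match each exposed subject to `ratio` unexposed subjects with the same value.

    subjects: list of (subject_id, exposed, match_value) tuples (hypothetical layout).
    Returns {exposed_id: [unexposed_id, ...]} for subjects that could be fully matched.
    """
    # Pool the unexposed subjects by the value of the matching variable.
    pool = defaultdict(list)
    for subject_id, exposed, value in subjects:
        if not exposed:
            pool[value].append(subject_id)

    matches = {}
    for subject_id, exposed, value in subjects:
        # Match only when enough unexposed subjects remain with the same value.
        if exposed and len(pool[value]) >= ratio:
            # Matching without replacement: remove the chosen subjects from the pool.
            matches[subject_id] = [pool[value].pop(0) for _ in range(ratio)]
    return matches


# Hypothetical data: (id, exposed, site).
subjects = [
    (1, True, "A"), (2, False, "A"), (3, False, "A"), (4, False, "B"),
    (5, True, "B"), (6, False, "A"), (7, True, "A"),
]

print(exact_match(subjects))
```

With these made-up subjects, only subject 1 finds two unexposed partners at the same site; subjects 5 and 7 stay unmatched because fewer than two unexposed subjects remain at their sites.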