Predicting the number of patients discharged to Skilled Nursing Facilities (SNFs) could be very useful for stakeholders such as nursing home administrators and Medicare budget legislators. Narrowing the samples down by county lets SNF administrators forecast the number of patients they might receive each year. I built a linear regression model in R using ED disposition data from across California together with the California Department of Finance’s population projections through 2050. After filtering the disposition data by county and for SNF dispositions, and filtering the population data by county and for the population above the age of 65, I performed an inner join on the two datasets. I found a very strong correlation of 0.942 between the elderly population and discharges to skilled nursing facilities in LA County. I also tested for the same correlation in Sacramento County and obtained 0.903.
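The filter, join, and correlation steps can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frames, column names (county, year, disposition, visits, age_group, population), and every number are made up for the example and are not the real California data, and merge() stands in for whichever join function was actually used.

```r
# Toy stand-ins for the two datasets. Every column name and number here is
# hypothetical and does not come from the real California data.
dispositions <- data.frame(
  county      = c(rep("Los Angeles", 4), rep("Sacramento", 2)),
  year        = c(2015, 2016, 2017, 2017, 2015, 2016),
  disposition = c("SNF", "SNF", "SNF", "Home", "SNF", "SNF"),
  visits      = c(1200, 1290, 1430, 9000, 310, 335)
)
pop_proj <- data.frame(
  county     = c(rep("Los Angeles", 4), rep("Sacramento", 2)),
  year       = c(2015, 2016, 2017, 2017, 2015, 2016),
  age_group  = c("65+", "65+", "65+", "under 65", "65+", "65+"),
  population = c(100000, 104000, 109000, 800000, 20000, 21000)
)

# Keep one county, SNF dispositions only, and the 65+ population only.
snf_la <- subset(dispositions,
                 county == "Los Angeles" & disposition == "SNF",
                 select = c(year, visits))
pop_la <- subset(pop_proj,
                 county == "Los Angeles" & age_group == "65+",
                 select = c(year, population))

# merge() keeps only the years present in both tables, i.e. an inner join.
joined <- merge(snf_la, pop_la, by = "year")

# Pearson correlation between the 65+ population and SNF discharges.
r <- cor(joined$population, joined$visits)
print(joined)
print(r)
```

Swapping "Los Angeles" for "Sacramento" in the two subset() calls repeats the same check for another county.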
Using ggplot2, I built a scatter plot with SNF dispositions on the y-axis and the population aged over 65 on the x-axis. The relationship was strong and positive, in line with the correlation above. I fit a linear model to the two variables and summarized it with the summary() function. The p-value shows that we can reject the null hypothesis in this instance. The coefficient of determination of 0.8883 tells us that about 89 percent of the variation in SNF discharges can be explained by the 65+ population.
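A minimal sketch of the plot and model fit looks like this. The data frame df, its columns pop_65_plus and snf_discharges, and all values are hypothetical placeholders, not the figures behind the results reported here. The plot is drawn only if ggplot2 is installed, so the rest of the script still runs without it.

```r
# Hypothetical yearly values for one county; not the real figures.
df <- data.frame(
  pop_65_plus    = c(100000, 104000, 109000, 113000, 118000, 124000),
  snf_discharges = c(1200, 1290, 1430, 1400, 1560, 1610)
)

# Scatter plot with the fitted regression line (skipped without ggplot2).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x = pop_65_plus, y = snf_discharges)) +
    ggplot2::geom_point() +
    ggplot2::geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    ggplot2::labs(x = "Population aged 65+", y = "SNF dispositions")
  print(p)
}

# Simple linear regression of SNF discharges on the 65+ population.
fit <- lm(snf_discharges ~ pop_65_plus, data = df)
fit_summary <- summary(fit)
print(fit_summary)

# Pull out the two numbers discussed in the text.
r_squared <- fit_summary$r.squared
slope_p   <- coef(fit_summary)["pop_65_plus", "Pr(>|t|)"]

# With a single predictor, R-squared equals the squared Pearson correlation.
r <- cor(df$pop_65_plus, df$snf_discharges)
stopifnot(isTRUE(all.equal(r_squared, r^2)))
```

The last check reflects a general property of one-predictor regression: squaring a correlation of 0.942 gives roughly 0.887, close to the reported coefficient of determination.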
This gives me confidence to run predictions, since we have good population estimates for the next 30 years. However, it is safest not to extrapolate too far beyond the range of data used to fit the model. The model predicts 16,024 patients discharged to SNFs in 2019 and 17,277 in 2020 for Los Angeles County.
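The prediction step, and the reason for caution about extrapolating, can be illustrated with predict(). As before, this is a sketch on made-up numbers: df, future, and their columns are hypothetical, and the output does not reproduce the 2019 and 2020 figures above.

```r
# Hypothetical training data for one county; not the real figures.
df <- data.frame(
  pop_65_plus    = c(100000, 104000, 109000, 113000, 118000, 124000),
  snf_discharges = c(1200, 1290, 1430, 1400, 1560, 1610)
)
fit <- lm(snf_discharges ~ pop_65_plus, data = df)

# Hypothetical projected 65+ populations for a near, a later, and a far year.
future <- data.frame(
  year        = c(2021, 2022, 2040),
  pop_65_plus = c(128000, 133000, 210000)
)

# Point predictions plus 95% prediction intervals (columns fit, lwr, upr).
pred   <- predict(fit, newdata = future, interval = "prediction", level = 0.95)
future <- cbind(future, pred)

# How far each projection sits beyond the largest population the model saw,
# and how wide the resulting interval is.
future$ratio_to_max_observed <- future$pop_65_plus / max(df$pop_65_plus)
future$interval_width        <- future$upr - future$lwr

print(future)
```

The interval width grows as the projected population moves away from the observed range, which is one concrete way to see why far-future predictions deserve less trust than next year's.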