Katie Kruzan


Andrew Downs

Data Scientist

Predictive Model Shows 8.45% of Tennessee Patients Are Trending Towards an Opioid Clinical Event

Data science reveals the potential for care managers to intervene with addiction patients before they are diagnosed, giving providers a head start on intervention and reducing the chance that dependence turns into a life-threatening situation.

Executive Summary

The United States is currently facing an epidemic of opioid abuse, dependence, and overdose.

  • In 2016, the number of overdose deaths due to prescription and illegal opioids was 5 times higher than in 1999.
  • From 1999 to 2016, more than 350,000 people died from an overdose involving opioids, including prescription and illegal opioids (1).
  • On average, 130 Americans die every day from an opioid overdose (2).

Figure 1: Estimated peak death counts by type compared to current drug overdose (OD) death counts (3).

The epidemic affects millions of citizens, and early identification of opioid dependency opens up better treatment options. Predictive modeling can identify early indicators of the disease. If dependence is predicted before it reaches the point of diagnosis, care management can be deployed and, ultimately, a Care Network can be optimized to save lives through early detection.

Using a Random Forest model on a dataset of patients from Tennessee, we predicted that a further 8.45% of patients show symptoms but remain undiagnosed and are trending towards opioid dependence.


Opioid dependence is a problem that affects millions of citizens and impacts stakeholders across the healthcare sector, including private, employer-sponsored, and public health plans. Predictive modeling can be used to intervene earlier, improving the opportunities for successful case management and optimizing patient care.

Morphine and codeine are natural opioids. Oxycodone, hydromorphone, hydrocodone, and oxymorphone are semisynthetic opioids. Methadone, tramadol, and fentanyl are synthetic opioids. Heroin is an illegal opioid that is synthesized from morphine.

Analyzing Care Pathways

Opioid dependence is characterized by significant levels of tolerance, withdrawal after abrupt discontinuation of opioid substances, and an inability to quit (4).

The current issues of opioid abuse, dependence, and overdose can be traced back to the increased use of prescription opioids for pain management in the 1990s, an increase in heroin deaths in 2010, and a significant increase in overdose deaths involving synthetic opioids in 2013.

To build the dataset, we selected all patients with an ICD-10 diagnosis code related to opioid dependence (ICD-10-CM category F11) from a provider in Tennessee after January 1, 2016. The entire claim history of these patients, from the time they first appear in the data, was used. A second, comparable dataset was created from patients who did not have an ICD-10 code related to opioid dependence.
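The cohort selection described above can be sketched in a few lines of Python. This is only an illustration: the record layout, the sample claims, and the helper names (`is_index_claim`, `build_cohorts`) are hypothetical, and restricting the comparison group to Tennessee patients is an assumption that the text does not state.

```python
from datetime import date

# Hypothetical claim records: (patient_id, provider_state, service_date, icd10_code)
claims = [
    ("p1", "TN", date(2016, 3, 14), "F11.20"),
    ("p1", "TN", date(2015, 7, 2), "M54.5"),
    ("p2", "TN", date(2017, 1, 9), "J45.909"),
    ("p3", "TN", date(2015, 11, 30), "F11.10"),
    ("p4", "KY", date(2018, 5, 21), "F11.23"),
    ("p5", "TN", date(2016, 8, 1), "G89.4"),
]

CUTOFF = date(2016, 1, 1)

def is_index_claim(state, service_date, code):
    """True for a Tennessee claim after the cutoff with an F11-category code."""
    return state == "TN" and service_date > CUTOFF and code.startswith("F11")

def build_cohorts(claims):
    # Patients with at least one qualifying opioid-dependence claim.
    cases = {pid for pid, state, day, code in claims
             if is_index_claim(state, day, code)}
    # Comparison group: Tennessee patients with no F11 code anywhere in
    # their history (assumption: the source does not define this group further).
    ever_f11 = {pid for pid, _, _, code in claims if code.startswith("F11")}
    tn_patients = {pid for pid, state, _, _ in claims if state == "TN"}
    controls = tn_patients - ever_f11
    # Keep the full claim history for every selected patient.
    history = {}
    for pid, _, day, code in claims:
        if pid in cases or pid in controls:
            history.setdefault(pid, []).append((day, code))
    return cases, controls, history

cases, controls, history = build_cohorts(claims)
print(sorted(cases), sorted(controls))
```

In this toy data only `p1` qualifies as a case, because `p3` was coded before the cutoff and `p4` was seen outside Tennessee. Both of p1's claims are kept, including the one that predates the diagnosis.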

To obtain features for the Random Forest model, the ICD-10 diagnosis codes and CPT procedure codes with the highest correlations to opioid dependence were first determined. A Random Forest model uses an ensemble of decision trees to make a decision. Each tree is grown by repeatedly splitting the data in two on a selected feature, and the forest's prediction is obtained by averaging the predictions of the individual trees. The model was trained on the two datasets described above. Model confidence was verified using 10-fold cross-validation and by validating on patients from a different state.
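To make the mechanics concrete, here is a minimal, dependency-free sketch of a random forest on binary code-indicator features, checked with 10-fold cross-validation. It is not the model used in this study. The toy data, the tree count, the depth limit, the Gini split criterion, and all function names are illustrative assumptions, since the text does not specify the implementation or its settings.

```python
import random

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def grow_tree(rows, labels, depth, n_try, rng):
    """Split repeatedly on the binary feature that lowers impurity the most."""
    if depth == 0 or len(set(labels)) < 2:
        return sum(labels) / len(labels)              # leaf: share of positives
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):  # random feature subset
        left = [y for x, y in zip(rows, labels) if x[f] == 0]
        right = [y for x, y in zip(rows, labels) if x[f] == 1]
        if not left or not right:
            continue                                  # feature is constant here
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return sum(labels) / len(labels)
    f = best[1]
    lo = [(x, y) for x, y in zip(rows, labels) if x[f] == 0]
    hi = [(x, y) for x, y in zip(rows, labels) if x[f] == 1]
    return (f,
            grow_tree([x for x, _ in lo], [y for _, y in lo], depth - 1, n_try, rng),
            grow_tree([x for x, _ in hi], [y for _, y in hi], depth - 1, n_try, rng))

def tree_score(tree, x):
    """Walk from the root to a leaf and return that leaf's share of positives."""
    while isinstance(tree, tuple):
        f, lo, hi = tree
        tree = hi if x[f] == 1 else lo
    return tree

def fit_forest(rows, labels, n_trees=25, depth=3, seed=0):
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], depth, n_try, rng))
    return forest

def forest_score(forest, x):
    """Average the per-tree scores to get the ensemble prediction."""
    return sum(tree_score(t, x) for t in forest) / len(forest)

def cross_validate(rows, labels, k=10, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    accuracies = []
    for fold in range(k):
        test_idx = order[fold::k]
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        forest = fit_forest([rows[i] for i in train_idx],
                            [labels[i] for i in train_idx], seed=fold)
        hits = sum((forest_score(forest, rows[i]) >= 0.5) == bool(labels[i])
                   for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return accuracies

# Toy data: 4 hypothetical code indicators; the label follows the first one.
data_rng = random.Random(42)
rows = [[data_rng.randint(0, 1) for _ in range(4)] for _ in range(200)]
labels = [x[0] for x in rows]
accuracies = cross_validate(rows, labels, k=10)
print(sum(accuracies) / len(accuracies))
```

Each tree sees a bootstrap sample of patients and only a random subset of features at every split, which is what distinguishes a random forest from a single decision tree. In practice an established library implementation would normally be used instead of hand-written trees.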

This model was then used to predict opioid dependence in the patients and non-patients selected from Tennessee.

Opioid Model: Researching the claims in Tennessee.

1). The two selected datasets had a non-patient to patient ratio of 52/48. Given the amount of data, the records were divided into training and validation sets in a 70/30 split. The area under the ROC curve (AUC) was calculated as 0.896, and the Random Forest model was accepted with a model confidence value of 0.816.

2). For patients with opioid dependence, the top predictive features were identified by service line.

3). In a random sample of patients in Tennessee, 90.25% of patients show no symptoms, 8.45% show symptoms but remain undiagnosed, 0.15% have an opioid dependence diagnosis but do not show any symptoms, and 0.74% are diagnosed with opioid dependence.
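The 70/30 split and the AUC reported in the first finding can be illustrated as follows. This is a sketch only: `split_70_30` and `roc_auc` are hypothetical helper names, and the labels and scores are made-up toy values, not data from this study.

```python
import random

def split_70_30(records, seed=0):
    """Shuffle a copy of the records and cut it 70% training / 30% validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form equals the area under
    the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy validation set: 1 = opioid dependence, 0 = comparison patient.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

train, valid = split_70_30(range(10))
auc = roc_auc(toy_labels, toy_scores)
print(len(train), len(valid), round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.896 indicates that the model ranks a patient with opioid dependence above a comparison patient in roughly nine out of ten pairs.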

By using CARE™ Disease Prediction, we can predict the onset of a clinical event before diagnosis. By researching the claims in Tennessee, we predicted that out of 10,000 patients, of whom 9,911 had no diagnosis and 89 had an opioid dependence diagnosis, another 845 patients are trending towards opioid dependence.
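The four groups reported above arise from crossing the model's prediction with the presence of a diagnosis. The sketch below shows that bookkeeping on a made-up sample of 20 patients and then repeats the per-10,000 arithmetic from the text. It assumes that "showing symptoms" means the model flags the patient, and the function names and group labels are hypothetical.

```python
from collections import Counter

def label_group(predicted, diagnosed):
    """Map (model flag, diagnosis on record) to one of the four groups."""
    if diagnosed:
        return ("diagnosed, symptoms detected" if predicted
                else "diagnosed, no symptoms detected")
    return "undiagnosed, trending" if predicted else "no symptoms"

def breakdown(patients):
    """Percentage of patients in each group."""
    counts = Counter(label_group(p, d) for p, d in patients)
    total = sum(counts.values())
    return {group: 100.0 * n / total for group, n in counts.items()}

# Toy sample of (predicted, diagnosed) pairs, not the study data.
patients = ([(False, False)] * 16 + [(True, False)] * 2
            + [(False, True)] * 1 + [(True, True)] * 1)
shares = breakdown(patients)
print(shares)

# Arithmetic from the text: shares of a 10,000-patient sample.
trending_per_10k = round(10_000 * 8.45 / 100)             # 845 patients
diagnosed_per_10k = round(10_000 * (0.15 + 0.74) / 100)   # 89 patients
print(trending_per_10k, diagnosed_per_10k)
```

The "undiagnosed, trending" group is the one of interest for care management, because it contains patients the model flags but who have no opioid dependence diagnosis on record.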

Summary: We all win when the patient’s well-being is put at the center of the care delivery model.

As new technologies are introduced in the healthcare sector, artificial intelligence and predictive analytics appear to hold significant advantages for improving care pathways. As we have demonstrated by predicting the onset of a clinical event related to opioids, a significant number of lives are at stake.


(1) Seth P, Scholl L, Rudd RA, Bacon S. Increases and Geographic Variations in Overdose Deaths Involving Opioids, Cocaine, and Psychostimulants with Abuse Potential – United States, 2015-2016. MMWR Morb Mortal Wkly Rep. ePub: 29 March 2018.
(2) NCHS, National Vital Statistics System. Estimates for 2017 and 2018 are based on provisional data.
(3) Katz, J. (June 2017). Drug Deaths in America Are Rising Faster Than Ever. New York Times.
(4) American Psychiatric Association: Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition. Washington, DC, American Psychiatric Association, 1994.