By Wendy S. Meyer
(April 2021) One of the most remarkable inventions born of the pandemic is the National COVID Cohort Collaborative (N3C).
The scientists and researchers behind the N3C plan to turn massive amounts of already available data into new knowledge to study COVID-19 and identify potential treatments. In just over six months, the N3C launched, became available to biomedical researchers, and produced its first publication.
Tell Bennett, MD, associate professor of pediatrics in the University of Colorado School of Medicine and director of Informatics in the Colorado Clinical and Translational Sciences Institute (CCTSI), has been helping to lead the N3C nationally. He is also the first author on the first paper to be published from N3C data, “The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction.”
How the Study Worked
The N3C Data Enclave is a secure cloud-based data and computing environment designed to facilitate virtual access to clinical data provided by hospitals nationwide. The Anschutz Medical Campus contributes data from Children’s Hospital Colorado and UCHealth. Data are updated up to two times per week and have been standardized and harmonized to one common data model to generate efficient and minimally biased results.
Using the N3C Data Enclave, researchers analyzed electronic medical record data from more than 1.9 million patients from 34 medical centers nationwide. Bennett says the enclave now holds data from approximately 3 million patients and will continue to grow as more patient data are added. In this retrospective cohort study, Bennett and his co-authors focused on more than 174,000 adults with COVID-19. They stratified patients using a World Health Organization COVID-19 severity scale and demographics. They then used multivariable logistic regression to evaluate differences between groups over time, and they characterized vital signs and laboratory values among COVID-19 patients of different severities, providing the foundation for predictive analytics.
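For readers unfamiliar with the technique, multivariable logistic regression estimates how several patient characteristics jointly relate to the odds of a yes/no outcome. The following is a minimal sketch on synthetic data, not the study's code or data: the variable names (`age_z`, `low_spo2`), the simulated effect sizes, and the gradient-descent fitting routine are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit a multivariable logistic regression by batch gradient descent.

    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = sigmoid(z) - yi          # gradient of the log-loss w.r.t. z
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Synthetic, hypothetical cohort: standardized age and a low-oxygen flag.
# These numbers are simulated and are NOT drawn from N3C.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    age_z = rng.gauss(0, 1)
    low_spo2 = 1.0 if rng.random() < 0.3 else 0.0
    p_severe = sigmoid(-1.0 + 1.0 * age_z + 1.5 * low_spo2)
    X.append([age_z, low_spo2])
    y.append(1.0 if rng.random() < p_severe else 0.0)

weights = fit_logistic(X, y)
# exp(coefficient) is the odds ratio for a one-unit change in that variable,
# holding the other variable fixed.
odds_ratios = {name: math.exp(w) for name, w in zip(["age_z", "low_spo2"], weights[1:])}
print(odds_ratios)
```

An odds ratio above 1 means the variable is associated with higher odds of the severe outcome after adjusting for the other variables in the model, which is what "multivariable" adds over looking at one factor at a time.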
Bennett says there were three main goals of the paper: “The first was to characterize the N3C cohort to introduce people to it. The second was to show the richness of data available in N3C about hospitalized patients. And last, we used rich inpatient data and machine learning (ML) to build a severity prediction model from the first day they [patients] were in the hospital.”
What They Found
Of the patients with a positive COVID test, 32,472 (18.6%) were hospitalized. The median length of hospital stay was 5 days. Mortality (including discharge to hospice) was 11.6% among hospitalized patients. Others have reported that inpatient mortality has decreased over time. The study confirmed this: inpatient mortality decreased from 16.4% in March and April of 2020 to 8.6% in September and October of 2020. Their data also showed that, as the pandemic progressed, clinical severity shifted toward less use of invasive mechanical ventilation and/or ECMO. Moreover, the ML predictions held up when tested against the actual data.
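The figures above can be cross-checked with simple arithmetic. This short sketch uses only the numbers quoted in the article; the derived values (the implied cohort size and the relative drop in mortality) are back-of-the-envelope calculations, not figures reported by the study.

```python
# Numbers quoted in the article.
hospitalized = 32_472
hospitalized_share = 0.186      # 18.6% of COVID-positive adults
mortality_early = 16.4          # percent, March and April
mortality_late = 8.6            # percent, September and October

# Cohort size implied by the hospitalization count and share.
# Because 18.6% is rounded, this is only approximate.
implied_cohort = hospitalized / hospitalized_share

# Absolute drop (percentage points) and relative drop in inpatient mortality.
absolute_drop = mortality_early - mortality_late
relative_drop = absolute_drop / mortality_early

print(f"Implied COVID-positive cohort: about {implied_cohort:,.0f}")
print(f"Absolute drop: {absolute_drop:.1f} percentage points")
print(f"Relative drop: {relative_drop:.1%}")
```

The implied cohort lands just above 174,000, consistent with the "more than 174,000 adults" described earlier, and the mortality figures correspond to a drop of 7.8 percentage points, or a little under half in relative terms.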
Promise for the Future
Bennett says the ML models have the potential to be useful to clinicians treating patients in the hospital when paired with electronic health records. “These models tell us about the most powerful predictors of severity. If health systems decided to implement these models in the background, they could be surfaced and made available to physicians in the electronic health record,” he says. The ML models could also provide clinicians with a ranked list of the variables that predict severity for each patient, which could help inform clinical decisions.
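To make the idea of a per-patient ranked list concrete, here is a minimal sketch. It assumes a simple linear model in which each variable's contribution is its coefficient times the patient's standardized value; the article does not say how N3C's models would produce such a ranking, and the variable names and all numbers below are made up for illustration only.

```python
def rank_contributions(coefficients, patient_values):
    """Rank variables by the size of their contribution to one patient's score.

    Assumes a linear model: contribution = coefficient * standardized value.
    """
    contributions = {
        name: coefficients[name] * patient_values[name] for name in coefficients
    }
    # Largest absolute contribution first, whether it raises or lowers risk.
    return sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical coefficients and one hypothetical patient (standardized values).
# These are NOT results from the N3C paper.
coefficients = {"age_z": 0.8, "resp_rate_z": 0.5, "creatinine_z": 0.3}
patient = {"age_z": 0.5, "resp_rate_z": 2.0, "creatinine_z": -0.2}

ranked = rank_contributions(coefficients, patient)
for name, contribution in ranked:
    print(f"{name}: {contribution:+.2f}")
```

In this toy example the respiratory rate ranks first for this patient even though age has the larger coefficient, because the patient's respiratory rate is far from typical. That is the distinction between a model-wide list of strong predictors and a list tailored to each patient.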
“The N3C project is exciting to me because it merges the two halves of my work life. My ICU experience and direct experience taking care of patients with COVID-19 has been important to making sure the work I was doing in N3C was relevant and clinically meaningful. With a cloud-based enclave and very large data and complex data structures, it takes informaticists to do effective work in that space. Having a foot in both camps has been really useful,” Bennett says.
Bennett hopes his colleagues at CU Anschutz will take advantage of N3C. Current projects run the gamut from social determinants of health to machine learning on laboratory results. He says: “People are approaching the data from different angles and clinical domains. As examples, there are teams working on the effect of COVID-19 on people who are immunosuppressed, those who have cancer and those who have diabetes.”
He says the next project he is eager to tackle relates to children and COVID-19. “We are waiting for a little more data to accumulate, but I think that a national level analysis of the effects of COVID-19 on kids will be an important contribution.”