Posted by Alan Carr on May 19th, 2020
Posted in: Announcements, Data, Electronic Health Records, Emergency Preparedness and Response, Funding, NLM Products, Public Health, PubMed
Tags: CARES Act, COVID19
Dr. Patti Brennan has announced that NLM has received $10 million as part of the Coronavirus Aid, Relief, and Economic Security (CARES) Act, which provides emergency funding for federal agencies to combat the coronavirus outbreak. The funding is being used to support activities that improve the quality of clinical data for research and care; accelerate research, including phenotyping, image analysis, and real-time surveillance; and enhance access to COVID-19 literature and molecular data resources. The following activities highlight many of the investments that NLM is making with this emergency funding.
The novel coronavirus is driving a need for standardized COVID-19 terminology and data exchange that will allow clinicians and scientists to communicate more effectively and consistently. NLM will use the supplemental funds to support the addition of codes for COVID-19-related laboratory tests within LOINC (Logical Observation Identifiers Names and Codes) and to provide implementation guidelines and training in use of the standards. NLM is also enabling sharing of COVID-19 terminology updates through the Value Set Authority Center (VSAC), which makes available value sets and clinical terminologies. Value sets are lists of codes drawn from standard terminologies that represent specific concepts or conditions; they are used in electronic clinical quality measures and to define patient cohorts, classes of interventions, or patient outcomes. This important work will facilitate the analysis of electronic health record data and support effective and interoperable health information exchange.
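To make the idea of a value set concrete, here is a minimal Python sketch of how a list of codes might be used to define a patient cohort. Everything in it is an assumption for illustration: the `Coding` type, the `select_cohort` helper, the patient identifiers, and the `EXAMPLE-…` codes are hypothetical placeholders, not real LOINC codes or a VSAC interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coding:
    """A code paired with the terminology it comes from."""
    system: str
    code: str


# Hypothetical value set: placeholder codes standing in for
# COVID-19-related laboratory tests. These are NOT real LOINC codes.
COVID_LAB_TESTS = frozenset({
    Coding("LOINC", "EXAMPLE-0001"),
    Coding("LOINC", "EXAMPLE-0002"),
})


def select_cohort(observations, value_set):
    """Return sorted patient IDs with at least one observation in the value set."""
    return sorted({
        patient_id
        for patient_id, coding in observations
        if coding in value_set
    })


# Made-up observations: (patient ID, coding of the recorded lab test).
observations = [
    ("patient-a", Coding("LOINC", "EXAMPLE-0001")),
    ("patient-b", Coding("LOINC", "EXAMPLE-9999")),  # not in the value set
    ("patient-c", Coding("LOINC", "EXAMPLE-0002")),
    ("patient-a", Coding("LOINC", "EXAMPLE-0002")),  # same patient, second test
]

cohort = select_cohort(observations, COVID_LAB_TESTS)
print(cohort)
```

Because membership is decided by the shared code list rather than by free-text test names, two sites using the same value set would select patients by the same criteria, which is the interoperability benefit described above.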
NLM is updating terminology for coronavirus-related drugs and chemicals through resources such as the Medical Subject Headings (MeSH), used for indexing and cataloging biomedical literature, and ChemIDplus, a dictionary of over 400,000 chemicals (names, synonyms, and structures). This work aligns terminology to facilitate the identification of chemicals and drugs used to treat, detect, and prevent COVID-19 and other coronavirus-related infections, including severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS).
NLM’s intramural research program is using virus genomics, health data, and social media data to identify community spread of COVID-19. Researchers are applying machine learning and artificial intelligence techniques to chest X-rays to differentiate viral pneumonia from bacterial pneumonia – expanding knowledge of the process of the SARS-CoV-2 viral infection and assisting in the identification of best practices for diagnosis and care of COVID-19 patients. NLM research in natural language processing contributed to development of LitCovid, a curated literature hub for tracking scientific publications about the novel coronavirus. It provides centralized access to more than 13,500 relevant articles in PubMed, categorizes them by research topic and geographic location, and is updated daily.
NLM’s extramural research program is focusing on novel informatics and data science methods to rapidly improve the understanding of SARS-CoV-2 infection and of COVID-19. In April, NLM issued two Notices of Special Interest (NOT-LM-010 and NOT-LM-011) seeking applications (due in June) in these areas: the mining of clinical data for ‘deep phenotyping’ (gathering details about how a disease presents itself in an individual, fine-grained way) to identify or predict the presence of COVID-19; and public health surveillance methods that mine genomic, viromic, health, and environmental data, or data from other pertinent sources such as social media, to identify the spread and impact of SARS-CoV-2.
NLM is also improving access to published coronavirus literature via PubMed Central (PMC). Science and technology advisors from a dozen countries called on publishers and scholarly societies to make their COVID-19 and coronavirus-related publications, along with the available supporting data, immediately accessible in PMC. In response, nearly 50 publishers have deposited more than 46,000 coronavirus-related articles in PMC with licenses that allow re-use and secondary analysis. Articles in the collection have been accessed more than 8 million times since March 18. NLM will use supplemental funds to improve the article-submission system to better accommodate publisher submissions and accelerate release of these critically important articles. On the PubMed side of literature offerings, NLM supplemental funds will support integrating LitCovid metadata. Novel sensors are being developed to leverage LitCovid metadata when directing users to curated COVID-19 content. The new infrastructure will permit PubMed to rapidly add more disease-specific sensors in the future.
As of May 7, NLM’s GenBank resource has 3,893 publicly available SARS-CoV-2 sequences from 42 different countries. NLM created a special site, the “Severe acute respiratory syndrome coronavirus 2 data hub,” where people can search, retrieve, and analyze sequences of the virus that have been submitted to the GenBank database. In late March, NLM joined the CDC-led SPHERES consortium, a national genomics consortium that aims to coordinate U.S. SARS-CoV-2 sequencing efforts and make data publicly available in NLM’s GenBank and Sequence Read Archive (SRA) and in other appropriate repositories. Supplemental funds will allow GenBank to further enhance the submission workflow, establish and promote use of metadata sample standards, and develop a fully automated SARS-CoV-2 submission workflow that incorporates quality checks, as well as ‘automated curation’, to provide standardized annotation of the SARS-CoV-2 genomes submitted to GenBank.
SRA is positioned as a ready-made computational environment for public health surveillance pipelines and tool development. SRA metagenomic datasets from both environmental samples and patients diagnosed with COVID-19 can reveal patterns of co-occurring pathogens, newly emerging outbreaks, and viral evolution. NLM supplemental funds are being used to prototype SRA cloud-based analysis tools to search the entirety of the SRA database. These tools can provide efficient search for SARS-CoV-2, identify genetic patterns, and monitor newly submitted data for specific viral patterns.
NLM supplemental funding also supports the identification and selection of web and social media content documenting COVID-19 as part of NLM’s Global Health Events web archive collection. This content documents life in quarantine, prevention measures, the experiences of health care workers, patients, and more. NLM is also participating as an institutional contributor to a broader International Internet Preservation Consortium (IIPC) Novel Coronavirus outbreak web archive collection.