Posted by Alan Carr on March 24th, 2020
The National Library of Medicine is working on multiple fronts to improve researchers’ understanding of SARS-CoV-2 (the novel coronavirus) and aid in the response to COVID-19 (the disease caused by the novel coronavirus). By enhancing access to relevant data and information, NLM is demonstrating how libraries can contribute in real time to research and response efforts during this crisis.
NLM is using PubMed Central®, its digital archive of peer-reviewed biomedical and life sciences journal literature currently providing access to nearly 6 million full-text journal articles, to expand access to full-text articles related to coronavirus. These activities build on recent requests from the White House Office of Science and Technology Policy (OSTP) and science policy leaders of other nations calling on the global publishing community to make all COVID-19-related research publications and data immediately available to the public in forms that support automated text-mining.
NLM has stepped up its collaboration with publishers and scholarly societies to increase the number of coronavirus-related journal articles in PMC, along with the available data supporting them. NLM is adapting its standard procedures for depositing articles into PMC to make it easier and faster to submit articles in machine-readable formats. NLM is also engaging with journals and publishers that do not participate in PMC but whose publications are within the scope of the Library’s collection. A growing number of publishers and societies are taking advantage of these flexibilities. Submitted publications are being made available as quickly as possible after publication for discovery in PMC and through the PMC Text Mining Collections for machine analysis, secondary analysis, and other types of reuse. A list of participating publishers and journals is available.
This enhanced collection of text-minable content enables AI and machine-learning researchers to develop and apply novel text-mining approaches that can help answer some of the many questions about coronavirus. Along these lines, NLM and leaders across the technology sector and academia joined OSTP on Monday, March 16, to announce the COVID-19 Open Research Dataset (CORD-19). Hosted by the Allen Institute for AI, CORD-19 is a free and growing resource that was launched with more than 29,000 scholarly articles about COVID-19 and the coronavirus family of viruses. CORD-19 represents the most extensive machine-readable coronavirus literature collection available for text mining to date. This dataset enables researchers to apply novel AI and machine-learning strategies to identify new knowledge to help end the pandemic.
NLM’s other important resources in these efforts include: