The increased attention to managing research data during the past few years, along with the adoption of national and international policies for data archiving and sharing, has made data curation a hot topic! The latest webinar from the National Information Standards Organization (NISO), Metadata for Managing Scientific Research Data, provides an overview of why metadata is important, what to consider when selecting a metadata standard, and tips for getting involved with data curation for the first time.
Jane Greenberg, Professor at the School of Information and Library Science (SILS) at the University of North Carolina at Chapel Hill, and Director of the SILS Metadata Research Center, was the guest presenter for the one-hour webinar. Dr. Greenberg has researched and published extensively in areas such as metadata best practices, ontology research, the semantic web, data repositories, and scientific data curation, with funding from the Institute of Museum and Library Services, the National Science Foundation, and the National Institutes of Health.
The following are highlights from Dr. Greenberg’s webinar. Visit the NISO website for additional event information, including the presentation slides.
Why is metadata for data important?
Dr. Greenberg highlighted the rapid growth of scientific data and its increasingly important role in the scientific process. What started as an issue of interest to a select few has captured national attention through the popular media: The New York Times and The Economist have both published essays on “big data” and its wide-ranging effects. Policy organizations are also placing increasing pressure on scientists to document and share their research data. For example, the National Science Foundation and the National Institutes of Health have policies requiring data management plans and documentation. In many ways, information professionals are taking the lead in facilitating this recent emphasis on data management.
Lots of metadata schemes – How do I choose the best one for my needs?
Data curation requires a metadata standard that meets the needs of the data, the institution, and the repository. Dr. Greenberg acknowledged that there are myriad metadata standards, ranging from the simple (Darwin Core, Access to Biological Collections Data, and Ecological Metadata Language) to the complex (FGDC, DDI), and that there is no absolute formula for selecting the best standard for any particular collection. Generally speaking, the simpler the metadata standard, the narrower the range of future purposes the metadata can support. Dr. Greenberg recommended setting aside adequate time to research the options and ultimately pursuing a two-pronged approach: choose a standard that works for your data collection, but also further the common good by considering interoperability with outside data standards.
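To make the interoperability point concrete, here is a minimal sketch of a "crosswalk" that renames fields from a local record to terms from a shared standard. It is an illustration only and was not part of the webinar: the local field names, the sample record, and the `crosswalk` function are all hypothetical, and the targets are a handful of Darwin Core term names chosen as an example.

```python
# Hypothetical local field names mapped to Darwin Core term names.
# The mapping itself is an assumption made for illustration.
LOCAL_TO_DARWIN_CORE = {
    "species": "scientificName",
    "collected_on": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "collector": "recordedBy",
}


def crosswalk(record, mapping):
    """Split a record into fields that map to the shared standard
    (renamed to the standard's terms) and fields that stay local."""
    mapped = {mapping[key]: value for key, value in record.items() if key in mapping}
    unmapped = {key: value for key, value in record.items() if key not in mapping}
    return mapped, unmapped


# A made-up record using the local field names.
local_record = {
    "species": "Quercus alba",
    "collected_on": "2011-05-14",
    "lat": 35.91,
    "lon": -79.05,
    "freezer_shelf": "B3",  # meaningful only inside this collection
}

shared, local_only = crosswalk(local_record, LOCAL_TO_DARWIN_CORE)
print(shared)
print(local_only)
```

Fields with no counterpart in the shared standard (here, `freezer_shelf`) are kept separately rather than discarded, which reflects the two-pronged idea: the collection keeps what it needs locally while exposing a record that outside systems can read.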
How do I stay informed?
Health sciences librarians have the opportunity to play a leading role in scientific data curation by joining working groups, attending conferences, and reviewing the literature. Dr. Greenberg concluded the webinar by reminding us that standards are guidelines, not police, and that metadata standards should be kept as simple as possible to aid reuse and foster greater connectivity with the wider world of collected data.