Posted by Derek Johnson on April 30th, 2018
Posted in: Data Science
Tags: Big Data, Data Management, Data Science
In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? Where should librarians get involved, if you think they should? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.
Written by: Patricia L. Smith, Impact and Dissemination Librarian at Galter Health Sciences Library, Northwestern University, Chicago, IL
Big data in healthcare is a booming area with many facets and ample opportunities for library involvement. The question is not should librarians get involved, but how can librarians get involved? Librarians are natural stewards for big data—we have unique skills that we can leverage to assist researchers, particularly in citing data, data management, information ethics, and data visualization.
The most natural, and perhaps easiest, segue into big data for librarians is in the area of data citation. Researchers are expected to cite their sources, but what about data sets? Data sets inform practice and are integral parts of the research process, but it is not yet standard practice to cite them. Because of this gap, it is very difficult to trace how data sets are used, which hinders the overall research process. Librarians are already embedded in citation support. We teach classes on EndNote, RefWorks, and other bibliographic management software, and answer questions about citation styles and bibliographies. We are already poised to start conversations about the importance of citing data. Librarians can take the initiative to create guides, classes, and other promotional material about how to cite data and why it is important. Furthermore, promoting the citation of data would help us track metrics and provide invaluable information about the impact, resonance, and reach of our researchers’ work. This is also an opportunity to promote depositing data sets in institutional repositories when appropriate. Finally, we also have relationships with vendors and publishers, which could open up additional conversations about indexing data sets in various databases.
Librarians are also increasingly getting involved in research data management. Metadata librarians, electronic resources librarians, and data librarians are uniquely positioned to collect and appraise data, manage data collections, add appropriate metadata, and preserve data. We can help researchers with best practices for data structure, vocabularies, formats, and more.
Big data is not without controversy when it comes to privacy and ethics. Librarians have a history of exhibiting passion in the area of information ethics, so this seems like a natural partnership! Librarians can take the initiative to start conversations with the public about big data—what it is, what it is not, and why it could raise the proverbial ethical eyebrows. On the flip side, librarians can also have conversations with researchers about the public’s concerns surrounding big data. Researchers probably have the best intentions when it comes to using big data, but they need to be aware of why people might have concerns with privacy. Some hold the belief that “patients have a moral obligation to contribute to the common purpose of improving the quality and value of clinical care in the system.”[1] While I concur that participation in healthcare is crucial to moving the science forward, the phrase “moral obligation” might not be the best choice of words, especially from the perspective of skeptical patients, patients concerned with privacy, or patients from racial or ethnic groups that have historically been mistreated by the medical community. Librarians might be able to liaise between the public and researchers to help strengthen these partnerships, and help researchers communicate in the most effective ways.
Another way librarians can get involved in big data is by learning more about data visualization. Not all librarians have to learn R, or Python, or JavaScript, but having a basic knowledge of programming and speaking the language of data scientists will only help our position. There are many free tools for learning data visualization, such as Sci2, Tableau Public, and VOSviewer. Presenting data in a visual format is a valued skill, and librarians can learn some basic skills to get a seat at the table.
Overall, there are many ways librarians can and should get involved in big data in healthcare. We must be confident about the skills we already possess and how they can translate to big data, and we must be proactive in marketing our knowledge.
References