SEA Currents May 30th, 2020

Big Data Science: What Librarians Offer

Posted on September 25th, 2018. Posted in: Data Science

In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? If you think they should, where should they get involved? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.

Written by: Margaret Ansell, Nursing and Consumer Health Liaison Librarian, George A. Smathers Libraries, University of Florida, Gainesville, FL

Throughout the history of the profession, librarians have questioned the scope and breadth of their role. With every new technology comes an opportunity for new services and a threat to old ones. An example: thanks to the advent of electronic resources and searchable databases, librarians now spend much less time retrieving materials for patrons, and more time training patrons to retrieve materials themselves. Each time a disruptive technology makes itself known, librarians have to collectively decide how to accommodate it. Whether such accommodation is considered an evolution of the profession, or a mutation, depends very much on your perspective. Faced with the disruption created by big data technologies, librarians, and medical librarians in particular, must decide how to accommodate it, and in what ways big data is both an opportunity for and a threat to our services.

Many librarians choose to view big data technologies less as a disruptive technology and more as the same techniques and technologies currently in use, simply at a larger scale. Data management has always been an essential research skill; big data just makes the necessity more evident. And while data management is a newer part of the average library’s service repertoire, it is generally well understood as a natural part of the library’s expertise, if you consider data as just another type of material that libraries can collect, organize, and preserve. While the specific tools and techniques used to manage data require computer science skills beyond those of most public service librarians, they are not outside the expertise of many technical services librarians and library information technology staff, who, in collaboration with an institution’s researchers, can create tools, repositories, and templates that ease the burden of the data management process. The California Digital Library’s DMP Tool is perhaps the strongest example of what such collaborations can create.

However, I think that viewing big data technologies only through the lens of data management ignores entirely new opportunities for service and outreach. As library data scientists like Lisa Federer demonstrate, big data is not simply the result of researchers using the same methods on a larger scale, but truly a new type of science, with new challenges. It is similar to the revolution in evidence synthesis that occurred when systematic reviews emerged as a premier methodology: to conceive of systematic reviews as simply a more expansive kind of narrative review is to misunderstand fundamental differences in their nature. Some examples of issues raised by big data approaches include: the creation and management of searchable, multi-institutional data repositories to support big data techniques; the ethics of the kind of surveillance and data gathering techniques required to create big data (this latest report on Fitbit heart rate data is a prime example, particularly because it is not published in any academic journal); and whether current statistical methods are appropriate for the kinds of heterogeneous data sets common to big data. Now, I don’t think that library science has the answer to all of these questions, or that we should be held responsible for answering even one of them. What I am saying is that the values of librarianship (accessibility, transparency, and accuracy/rigor) give librarians an important perspective on big data initiatives that expertise in Python or R won’t necessarily bring.

Sadly, while I believe our perspective is valuable even without expertise in big data research techniques, I fear that if our perspectives are the only things we have to offer, researchers and administrators are likely to dismiss the voice of librarians as irrelevant while repositories, tools, and analysis techniques are being established. Tangible skillsets and resources, of recognizable value to stakeholders in the big data process, may be the only way we will be given a seat at the big data decision-making table. If nothing else, librarians must learn the language of big data in order to be part of the conversation.


Funded under cooperative agreement number UG4LM012340 with the University of Maryland, Health Sciences and Human Services Library, and awarded by the DHHS, NIH, National Library of Medicine.

NNLM and NATIONAL NETWORK OF LIBRARIES OF MEDICINE are service marks of the US Department of Health and Human Services.