The MARquee July 15th, 2020

Reflections on Librarianship and Big Data

Posted on November 1st, 2017. Posted in: Data Science

In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? If you think they should, where should they get involved? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.

Written By Margaret (Peg) Burnette, Assistant Professor & Biomedical Sciences Librarian, University of Illinois at Urbana-Champaign

The world of librarianship is changing at what seems to be an ever-increasing rate. The librarian’s role has evolved from information organization and access to the provision of specialized services related to information and data quality, management, analysis, and application. Big data is here to stay and permeates both our professional and personal lives. In the era of digital content and libraries without walls, librarians grapple with new challenges in order to remain productive and relevant. And while users may no longer need help finding information, many likely need help with evaluation and management of increasingly large amounts of information and data.

In many ways, the demands of big data are the same as for small data. These demands afford opportunities that naturally complement librarians’ expertise. Traditional organization and classification skills are still needed to help researchers find, wrangle, and share research and data products of all kinds. More specialized skills, such as statistical or analytical expertise, subject or technical expertise, or advanced computer skills (coding, etc.), enhance the ability to provide highly sought-after services that complement the research and education enterprise.

Despite these opportunities, librarians often lack the skills necessary to support research data in a holistic way. Libraries need to plan carefully to match services with librarian competencies and implement strategies to fill gaps. The research and data lifecycles may provide useful frameworks for determining and developing services. For example, one institution might decide to focus on the identification, procurement, and application of existing data. Another might focus on infrastructure for data storage, which can be a huge challenge for researchers, particularly for big data initiatives. Data analysis and data visualization are additional support areas that researchers clamor for. SPSS and R are familiar tools, but few librarians have the skills necessary to provide robust support. The immersion that is necessary for mastery of tools like these is simply not realistic for librarians, who often wear multiple hats.

A second framework that librarians might consider is big data’s five “Vs”. Managing the Volume of data being produced can benefit from librarian expertise in the areas of organization, security, and storage options. Libraries that are not equipped to offer storage solutions can nonetheless provide information about the options and their respective implications. Velocity affords opportunities for librarian expertise in the areas of organization, access, and retrieval. For example, librarians can leverage expertise in controlled vocabularies and metadata for data mining projects. Additionally, librarians can apply organizational acumen to help wrangle the Variety of data, both structured and unstructured. Veracity of information is a mainstay of librarianship, and data quality is no different. And finally, librarian contributions to data management, curation, and sharing strategies can add significantly to the Value of that data.

Ultimately, with all of these opportunities, it is vital to consider data services within the larger institutional context. Some of the services that libraries consider may already be provided by other entities, such as offices of research or IT units. Coordination is essential to ensure seamless and integrated service streams, shared and complementary responsibilities, and unified goals.

About Hannah Sinemus
Hannah Sinemus is the Web Experience Coordinator for the Middle Atlantic Region (MAR). Although she updates the MAR web pages, blog, newsletter and social media, Hannah is not the sole author of this content. If you have questions about a MARquee or MAReport posting, please contact the Middle Atlantic Region directly at nnlmmar@pitt.edu.

This project is funded by the National Library of Medicine, National Institutes of Health, Department of Health and Human Services, under Cooperative Agreement Number UG4LM012342 with the University of Pittsburgh, Health Sciences Library System.
