Written by: Caroline Marshall, MLS, AHIP, Senior Medical Librarian, Public Services, Cedars-Sinai Medical Library, Los Angeles, CA
There is a great deal of discussion about Big Data. We all think other people are doing it, we think we should be doing it, but we are not sure how to get involved (Tattersall & Grant, 2016).
There have been Calls to Action (Martin, 2016) about Big Data and an affirmation in several studies that librarians should get involved. It is almost as if we are going to miss the Big Data train if we don’t jump on board right away. Big Data is not going away but we, as librarians, need to ascertain how involved we can get depending on staffing and time.
Librarian skills for Big Data have been identified more or less along the following bullet points
Librarians are no strangers to Big Data, and we often use these skills already. We use usage data in journal evaluation and renewals. We look at interlibrary loan data to ascertain how quickly we are turning requests around and as an indication of what journals we should purchase. We work with medical staff on citation management software, teaching them how to manage, organize and share large quantities of citations for their publications. Librarians perform information curation, such as creating digital archives, assigning metadata that will provide access points, or cataloging different types of materials for easy retrieval. In-depth searching is something most of us do every day; defining the question or query to retrieve data is a common skill for many librarians.
Learning other skills such as data visualization, especially for librarians who are mid-career, will mean outside workshops (Burton & Lyon, 2017) that take time away from our “regular” work. There is also the question of whether leadership will want to take us in this direction.
Burton & Lyon (2017) suggest librarians should be ‘Data Savvy’, but this is not a skill that can be taught. We cannot push roles onto staff who do not have the knowledge or the desire. Future Master of Library Science programs can incorporate more specific courses to create the data scientist librarian who can be part of the research team, but how will this look? How many projects can one person be embedded in, especially in an institution that has multiple research projects ongoing? Will that librarian be part of the library or employed by the research team?
I see the librarian’s role not as being embedded in a research team but more in a collaborative, instructional, and facilitation role. This includes teaching classes on statistical or visualization software, and giving guidance on designing the query or on creating a database that will need to answer not just the immediate queries, but also queries the researcher has not yet thought of that may come up in the future. We can also identify data repositories that researchers can use that are in our own institutions but are not gathered in any one place, or provide advice on digitization and preservation. We can act as sounding boards in a more consultative manner, as opposed to just teaching classes.
We cannot do everything, and we need to be aware of staff, skills and time. Some of us are just getting our feet wet offering classes and so forth, but before scaling up to an institutional level we need to ascertain what we can offer and support.
Burton, M., & Lyon, L. (2017). Data Science in Libraries. Research Data and Preservation (RDAP) Review. Bulletin of the Association for Information Science and Technology, 43(4), 33-35.
Martin, E. R. (2016). The Role of the Librarian in Data Science: A Call to Action. Journal of eScience Librarianship, 4(2), E1092.
Tattersall, A., & Grant, M. J. (2016). Big Data – What is it and why it matters. Health Info Libr J, 33(2), 89-91. doi:10.1111/hir.12147