Region 5 Blog April 26th, 2024


Love Your Data Week, Day 5: Rescuing Unloved Data

Posted by Ann Glusker on February 17th, 2017. Posted in: Data Science, News From NNLM PNR


How do data become unloved?  We data users don’t love data that are messy, poorly documented, incomplete, or unwieldy, to name just a few frustrations.  However, one important way that data become unloved is that they are just plain old.  Older data tend not to be machine-readable, which can pretty much be the kiss of death.  Digitization, while it’s improving, is still somewhat labor-intensive and costly, and so unless a data set is obviously worth the trouble, it may languish.

However, researchers are starting to explore whether there may be some hidden gems worth rescuing. One area in which this is happening is climate data, and a great example is the Glacier Photograph Collection from the National Snow and Ice Data Center (NSIDC). Before this collection was digitized, users had to travel to the NSIDC in Colorado, ask staff to find physical images or microfilm for them in the collection, and then deal with those physical artefacts. Not surprisingly, the collection had few users. However, digitizing these photographs (which can be considered data sources, as they contain information that can be analyzed) has made them not only accessible, but an important resource for documenting changes in glacier size and coverage. The digitized older photographs also suggest locations where repeat photographs can be taken from the same vantage point, so that changes over time can be seen by comparing the images.

PHOTO: Left: William O. Field, 1941; Right: Bruce F. Molnia, 2004. Muir Glacier: From the Glacier Photograph Collection. Boulder, Colorado USA: National Snow and Ice Data Center. Digital media.

But using the above example is cheating a little bit; these photographs were unloved because they were undigitized, but it was clear that they were worth digitizing. In fact, it was so clear that NSIDC was able to get funding and enter into partnerships to get that work done. So, what if a researcher has a great idea, but needs sheer person-power to bring it to fruition? These days, crowd-sourcing may do the trick! Check out the Swiss project Data Rescue @ Home, in which citizen-volunteers are entering German climate data collected during WWII, and have also finished entering data collected at a weather station in the Solomon Islands in the early to mid-1900s. By January 2014, they reported having digitized 1.3 million values! They note: “The old data are expected to be very useful for different international research and reanalysis projects…[for example,] historical weather data from the Azores Islands are particularly valuable since the islands are located at the southern node of the most important climatic variability mode in the North Atlantic-European region, the so-called North Atlantic Oscillation (NAO), and there are not much other historical data available from the larger region.”

PHOTO: Example of data collected in the Solomon Islands, entered electronically by citizen-volunteers of the Data Rescue @ Home project (Accessed 2-13-17).

Interested in getting involved in a citizen-science project yourself? Here’s a list of possibilities!  And, if you really get hooked, you may want to dive into some collections of older non-digitized data and consider starting your own project, to rescue the unloved data and give them new life.

OK, I’m off now to figure out how to get on the project where I can hang out on the beach in New Jersey and count horseshoe crabs!

 

About the author: Ann Glusker


Developed resources reported in this program are supported by the National Library of Medicine (NLM), National Institutes of Health (NIH) under cooperative agreement number UG4LM012343 with the University of Washington.
