Region 5 Blog November 18th, 2024


Love Your Data Week!

Posted by Ann Glusker on February 14th, 2017. Posted in: Data Science, News From NNLM PNR


Welcome to Love Your Data Week 2017!  This “5-day international event to help researchers take better care of their data” has participants from all over the United States and also abroad, with everyone posting and tweeting about data (best practices, resources, etc.).  The PNR will be posting on our Facebook and Twitter pages, as well as here on the Dragonfly blog, about data issues and trends you may want to know about, whether or not you work directly with researchers.

Today’s topic is “Documenting, Describing and Defining Data” and we are pleased to re-post a behind-the-scenes look at how researchers define data quality, from the University of Washington Libraries’ Data Services “Data@Libs” blog. Enjoy!

“Today we’re highlighting the work of a University of Washington research lab to demonstrate how one group of researchers defines data quality.

Loma, Kaeli, and Jorge from the Avian Conservation Laboratory in the UW’s School of Environmental and Forest Sciences kindly agreed to answer a few questions about data quality in their field of research. Let us know your experiences with data quality by tweeting with the hashtag #LYD17 to @UWLibsData.

Provide a brief introduction to yourself and your lab/team:

Kaeli: “I study the behavior of crows around dead crows (ethology/thanatology). Most other people in my lab also work on birds, but our individual studies, areas of research, and methodologies vary greatly.”

Jorge: “I’m an international student from Chile working in the Avian Conservation Lab of John Marzluff at the School of Environmental and Forest Sciences.”

What does data look like in your area of research?

Kaeli: “My data is generally measurements of time (x seconds spent doing a particular thing or in a particular place), binary measurements (did or didn’t something occur), and count data such as the number of birds present or the number of times an action occurred.”

Jorge: “I have many different kinds of data. I have spatial data that includes locations and attributes of what the individual animals I studied did in those places. I also have data on the abundance of different bird species in the greater Seattle area.”

The message for today is: “Data quality is the degree to which data meets the purposes and requirements of its use. Depending on the uses, good quality data may refer to complete, accurate, credible, consistent or ‘good enough’ data.” How would you define quality data in your field? Are there any standards for assuring data quality? How do you and your fellow researchers distinguish between quality data and questionable data?

Loma: “I’ve never thought of this before. I would assume that directly observable quantitative data would be considered better quality than qualitative data.”

Kaeli: “This is actually a really hard question. It would probably be really difficult for me to just look at someone’s data and determine if it was of poor quality. Perhaps if I was looking at their raw data sheets and noticed a lot of missing information, but otherwise the devil is in the methodological approach, not necessarily the data itself. So I would question the data if, say, all but two data points were collected at a very specific time of day. Any standards for collecting quality data really come from both your field of study and what statistical methods you plan to use.”

Jorge: “For me, quality data is representative and unbiased. The typical standards have to do with the quantity of data to be able to perform relevant statistical tests, and the training of the people that collected the data.”
“For me, it’s not intuitive to detect bad data. Sometimes you see patterns emerge that don’t match what is expected, and that may help, but otherwise it is not that easy.”

How did you decide what to measure and how to gather the data in your research?

Loma: “I created a hypothesis for the question I was trying to answer, then thought about what I could measure that would allow me to refute or fail to refute that hypothesis. For example, I’m currently trying to figure out what certain vocalizations mean to a crow, so I measured a number of behaviors that are indicative of agitation, fear, aggression, and curiosity. That way, I can compare how often a crow gives those behaviors both before and after I play a certain call through a loudspeaker.”

Kaeli: “I mostly make it up as I go along. Which is kind of a joke and kind of not. Often I design an experiment based on what I think the most meaningful or robust measure of my question will be, but then once I get into the field I find out that doing it that way is actually impractical or impossible, so I need to change it. So often in wildlife studies the answer to that question is that we try our best to guess what will work, but ultimately we’re at the mercy of our study animal and the elements.”

Jorge: “I asked what would be relevant to measure for the biological questions I was going to ask and what was feasible to collect, given my logistical and budgetary constraints.”

Do you have processes in place for maintaining your data for future use and sharing?

Loma: “I have my data backed up on two hard drives and the department cloud storage, and I’m willing to share it with anyone who asks, so long as they convince me that I’d be included as an author/contributor on whatever they’re working on.”

Kaeli: “Yes, but I don’t really use them, [to be honest]. I back up all my data on my computer, 3 hard drives, and in Dropbox. We’re supposed to also back them up on our lab’s server, but I hardly ever do this!”

Jorge: “I keep my data in several places (like the data sheets where I collected it, and different hard drives) to ensure its safety. I’m not planning on sharing my data at this time.”

Thank you Loma, Kaeli, and Jorge for sharing your experience with data quality. We wish you the best of luck in your research!”
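
For readers who want to try this on their own data sheets, the two warning signs Kaeli mentions (a lot of missing information, and nearly all data points collected at one time of day) can be sketched as simple checks. This is only an illustrative sketch in Python, not the lab's method: the function names, the field names (`hour`, `seconds_near_stimulus`, `birds_present`), and the sample records are all made up for the example.

```python
from collections import Counter


def missing_fraction(rows, fields):
    """Fraction of expected field values that are blank or absent."""
    total = len(rows) * len(fields)
    if total == 0:
        return 0.0
    missing = sum(
        1 for row in rows for field in fields if row.get(field) in (None, "")
    )
    return missing / total


def dominant_hour_share(rows, hour_field="hour"):
    """Return (hour, share) for the most common collection hour."""
    hours = [row[hour_field] for row in rows if row.get(hour_field) not in (None, "")]
    if not hours:
        return None, 0.0
    hour, count = Counter(hours).most_common(1)[0]
    return hour, count / len(hours)


# Made-up observations for illustration only; not data from the lab.
observations = [
    {"hour": 7, "seconds_near_stimulus": 42, "birds_present": 3},
    {"hour": 7, "seconds_near_stimulus": 15, "birds_present": None},
    {"hour": 7, "seconds_near_stimulus": "", "birds_present": 5},
    {"hour": 7, "seconds_near_stimulus": 60, "birds_present": 2},
    {"hour": 14, "seconds_near_stimulus": 8, "birds_present": 1},
]

fields = ["seconds_near_stimulus", "birds_present"]
print("Share of missing values:", missing_fraction(observations, fields))
print("Most common hour and its share:", dominant_hour_share(observations))
```

What counts as "too much" missing information or "too concentrated" a collection time is a judgment call that, as the researchers note, depends on the field of study and the planned statistical methods, so the sketch reports the numbers without applying a threshold.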

 

 

 

About the author: Ann Glusker
Developed resources reported in this program are supported by the National Library of Medicine (NLM), National Institutes of Health (NIH) under cooperative agreement number UG4LM012343 with the University of Washington.
