Posted by karencoghlan on November 28th, 2018
Posted in: Commentary, Data
Tags: Data, data_science, eScience, RDM
We are living in the age of big data. Vast amounts of data are collected on everyone, every day. The data may come from your phone, which tracks where you are as it tries to connect to the closest tower. It might come from security cameras, or from the software a child uses to write a paper for a teacher's class project.
Assessment, according to Merriam-Webster’s Dictionary, is the action or an instance of making a judgment about something. There is a growing trend to use data for assessment, as a way to come to a conclusion. However, this should be done with caution. Interpretation has bias. It depends on the circumstances and the prior point of view of the interpreter. If a human does the interpreting, whatever that person learned in the past can color their present view. A computer can also be biased, depending on how it was programmed.
A recent BBC article was titled “The Trouble with Big Data? It’s Called the ‘Recency Bias’.” The article does an excellent job of describing recency bias, which is the tendency to assume that future events will closely resemble recent experience. As the article puts it, “It’s the tendency to base your thinking disproportionately on whatever comes most easily to mind.” As more and more data is collected on every person with each new device, the analysis becomes overwhelming. The moment you start looking backwards to analyze the bigger view, there is far too much recent data and far too little of the old to compare it to. Short-sightedness is built into the structure of the analysis, in the form of an overwhelming tendency to overestimate short-term trends at the expense of history and what has been accomplished in the past.
All this data makes research data management crucial. One goal in the National Library of Medicine’s Strategic Plan is to “accelerate discovery and advance health by providing the tools for data-driven research.” Many data sets, such as gene sequences and demographic data, are most useful when their descriptions are complete. They need to be findable, accessible, and in a usable format. In an era of bigger and bigger data, we need to choose carefully. Just collecting data without managing it will overwhelm us. When overcome with so much information, we fall back on our biases and on what is easy, which sometimes leads to selective amnesia. This brings us to the final point in the article: “that what you choose not to know matters just as much as what you do.”
References:
Chatfield, Tom. “Future – The Trouble with Big Data? It’s Called the ‘Recency Bias’.” BBC News, BBC, 5 June 2016, www.bbc.com/future/story/20160605-the-trouble-with-big-data-its-called-the-recency-bias?ocid=ww.social.link.email.
“A Platform for Biomedical Discovery and Data-Powered Health.” U.S. National Library of Medicine, National Institutes of Health, 26 Mar. 2018, www.nlm.nih.gov/pubs/plan/lrp17/NLM_StrategicReport2017_2027.html.