DataFlash: Data Indexers

Posted by Ann Madhavan on April 2nd, 2018. Posted in: Data Science

The Institute for Health Metrics and Evaluation (IHME) is “an independent population health research center at UW Medicine, part of the University of Washington, that provides rigorous and comparable measurement of the world’s most important health problems and evaluates the strategies used to address them.” Their mission is to improve the health of the world’s populations by providing the best information on population health, and to do so, IHME enlists the expertise of countless individuals, including researchers, data analysts, data scientists, and thirteen data indexers. What is a data indexer? Lyla Medeiros, a data indexer at IHME, shares more about her essential role below…

What is a data indexer? And how long have you been in the role?

Data indexers are part of a team responsible for providing librarian services to IHME. Data indexers not only catalog data for inclusion in the Global Health Data Exchange (GHDx), they also organize and maintain data files, provide reference services to IHME researchers, and search for and acquire new data sources. Data indexers are also responsible for creating documentation on cataloging practices, implementing improvements to processes and workflows, reporting and testing technical issues that pop up in the GHDx for the Drupal development team, and managing controlled vocabularies and taxonomies, which includes researching and adding terms. I’ve been working as a data indexer for four years and three months.

What is your education/occupational background?

I earned a BA in Dance Studies and Art History at the State University of New York, Empire State College and a Master of Library Science at Indiana University, Bloomington. Before becoming a librarian, I trained to become a classical ballet dancer and teacher. I’ve taught ballet in New York, New Mexico and here in Washington.

Who do you work with at IHME?

Outside of the data services team, I work with public health researchers, data analysts, Drupal developers, and student assistants.

IHME US Map Data Visualization

What types of data do you work with?

The data that IHME uses to create global health estimates comes in file formats such as .dta, .dbf, and .sav, as well as Excel tables, Word documents, text files, .pdf documents and Access databases. When necessary, we digitize books and sometimes even microfiche. Right now, I primarily catalog health and demographic survey datasets and their related geospatial data. In the past, I’ve also worked on cataloging health statistics reports, epidemiological surveillance, and serial publications. Some other types of data we collect and catalog include vital registration, hospital discharges, censuses, disease registries and government health budgets.

What do you enjoy most about your job?

I most enjoy the variety of work. For example, today I did research on stroke in order to create new keywords and planned out how to retroactively apply the new keywords to existing records, searched for and cataloged new survey data, contacted a survey provider about missing variables in a data file, and worked on a presentation I’ll be giving on our keyword taxonomy.

What advice would you give other librarians interested in working with data/in the field of data librarianship?

I am forever thankful for the classes I took in graduate school that focused on representation and organization, metadata and semantics, indexing, creating ontologies in RDF/RDFS (Resource Description Framework/Resource Description Framework Schema) and cataloging in XML. Those classes provided me with a solid foundation for the type of work I do as a data indexer.

I would like to sincerely thank Lyla for providing us with insight into a librarian role that is quite unique, and quite essential. If you would like to learn more about IHME, the GHDx, and many of their groundbreaking projects and visualizations, please visit healthdata.org.


Developed resources reported in this program are supported by the National Library of Medicine (NLM), National Institutes of Health (NIH) under cooperative agreement number UG4LM012343 with the University of Washington.
