The NIH Big Data to Knowledge (BD2K) program is pleased to announce The BD2K Guide to the Fundamentals of Data Science, a series of online lectures given by experts from across the country on a range of topics in data science. The course is an introductory overview that assumes no prior knowledge of data science.
This is a joint effort of the BD2K Training Coordinating Center (TCC), the BD2K Centers Coordination Center (BD2KCCC), and the NIH Office of the Associate Director for Data Science.
When: Fridays at noon Eastern Time (9 a.m. Pacific), beginning September 9, 2016.
Please join from your computer, tablet or smartphone: https://attendee.gotowebinar.com/register/341938597813942273 (updated 11/15/16)
You may also dial in using your phone.
United States: +1 (872) 240-3311
Access Code: 786-506-213
For up-to-date information about the series and to see archived presentations, go to: http://www.bigdatau.org/data-science-seminars.
9/9/16: Introduction to big data and the data lifecycle (Mark Musen, Stanford)
9/16/16: SECTION 1: DATA MANAGEMENT OVERVIEW (Bill Hersh, Oregon Health & Science University)
9/23/16: Finding and accessing datasets, Indexing and Identifiers (Lucila Ohno-Machado, UCSD)
9/30/16: Data curation and Version control (Pascale Gaudet, Swiss Institute of Bioinformatics)
10/7/16: Ontologies (Michel Dumontier, Stanford)
10/14/16: Metadata standards (Zachary Ives, Penn)
10/21/16: Provenance (Suzanne Sansone, Oxford)
10/28/16: SECTION 2: DATA REPRESENTATION OVERVIEW (Anita Bandrowski, UCSD)
11/4/16: Databases and data warehouses, Data: structures, types, integrations (Chaitan Baru, NSF)
11/11/16: No lecture (Veterans Day)
11/18/16: Social networking data (TBD)
12/2/16: Data wrangling, normalization, preprocessing (Joseph Picone, Temple)
12/9/16: Exploratory Data Analysis (Brian Caffo, Johns Hopkins)
12/16/16: Natural Language Processing (Noemie Elhadad, Columbia)
1/6/17: SECTION 3: COMPUTING OVERVIEW (Dates tentative)
1/20/17: Programming and software engineering; API; optimization
1/27/17: Cloud, Parallel, Distributed Computing, and HPC
2/3/17: Commons: lessons learned, current state
2/10/17: SECTION 4: DATA MODELING AND INFERENCE OVERVIEW (Dates tentative)
2/17/17: Smoothing, Unsupervised Learning/Clustering/Density Estimation
2/24/17: Supervised Learning/prediction/ML, dimensionality reduction
3/3/17: Algorithms, incl. Optimization
3/10/17: Multiple testing, False Discovery rate
3/17/17: Data issues: Bias, Confounding, and Missing data
3/24/17: Causal inference
3/31/17: Data Visualization tools and communication
4/7/17: Modeling Synthesis
SECTION 5: ADDITIONAL TOPICS
4/14/17: Open science
4/21/17: Data sharing (including social obstacles)
4/28/17: Ethical Issues
5/5/17: Extra considerations/limitations for clinical data
5/19/17: SUMMARY and NIH context