PDSG Speaker Series: Reproducible Workflows

Thursday, 9/7/2017, 6:00 - 7:00 PM

Towne 337

Click here to RSVP

We have invited Brett Beaulieu-Jones and Jeremy Leipzig to talk about their cutting-edge research. Come learn about their contributions to data science! Food will be provided.

6:00 Jeremy Leipzig, Sr. Scientist at Cytovas, LLC. and PhD student at the Metadata Research Center, College of Computing and Informatics, Drexel University

Abstract: Reproducible computational research (RCR) provides the keystone to the scientific method, packaging the transformation of raw data to published results in a manner that can be communicated to others. Developing RCR standards has been a growing concern of statisticians, data scientists, and informatics professionals. Metadata provides context and provenance to raw data, and is essential to both discovery and validation in RCR. This presentation will give an overview of emerging metadata standards in data, analysis, pipeline tools, and publications.

6:30 Brett Beaulieu-Jones, Co-Founder of SyncOnSet and PhD student in Genomics and Computational Biology at UPenn

Abstract: Reproducibility and data sharing can greatly accelerate scientific discovery. The necessity for privacy can hamper both reproducibility and data sharing. This presentation will discuss how to address both concerns: 1) how to design reproducible workflows (even with private data) using Docker and continuous integration, and 2) how to generate synthetic data that closely resembles the original data and can be used to train machine learning algorithms without sacrificing privacy.