08-17, 15:00–15:50 (US/Eastern), Little Theatre
Over the past several decades, the scientific process has relied more and more on computational analysis of data to produce digital artifacts. Fields like molecular biology, neuroscience, linguistics, and astrophysics, to name a few, have been revolutionized by this trend to the point that computational workflows are ubiquitous. Although most of these workflows are very similar at a high level - collecting data, analyzing it with code, and publishing the resulting figures - implementation details differ widely.
While there exist standards such as the FAIR Guiding Principles for organizing and sharing data, there are no widely adopted standards for reliably regenerating analyses from that data, especially across compute environments. This talk presents an open framework for archived, reproducible, and transparent science (ARTS) that aims to do exactly this - by packaging data, code, and figures in containers and uploading them to a persistent, trusted, and accessible archive.
Scientific results (especially those funded by the public!) should be accessible, not just to scientists, but to a wider audience as well. Papers with tables and figures have long been the main output of scientific efforts, but recently the importance of sharing the data and code required to generate these outputs has become clear. The ARTS open framework gives guidance on how to package scientific results and archive them in a way that makes it as easy as possible for anyone to reproduce a paper, even decades after publication.
Sabar Dasgupta is an electrical and software engineer based in Queens, New York. They currently volunteer at Stanford University, where they build data collection and automation tools for researchers. They are interested in self-hosting, hybrid cloud infrastructure, and sharing repair skills.