Deep Learning Reuse For Science


Written by Nicholas M. Synovic

Preface

Computers enable scientists to perform their research at unprecedented speed and depth. To take advantage of this technology, computational scientists need to apply software engineering processes, methods, and techniques to develop, modify, and incorporate the necessary scientific packages into their experiments. Existing literature has identified that computational scientists are not formally trained software engineers and that the latest software engineering methods are often excluded from their work. This “communication chasm” only continues to grow with the popularization of deep learning models and methods.

The field of deep learning seems to accelerate with each passing day, making it difficult for engineers and scientists alike to incorporate the latest advancements and discoveries. Deep learning reuse methods, techniques, and libraries are therefore often leveraged to incorporate these findings into software products. While deep learning has existed since the 1980s, the aforementioned popularization of the technology has only happened within the past ten years. As a result, the application of deep learning within the fields of computational Natural Science is nascent.

However, there is emerging evidence that computational Natural Scientists are embracing deep learning to further investigate biology, chemistry, physics, and environmental science. With the 2024 Nobel Prizes for both Chemistry and Physics being awarded to scientists for the creation of deep learning models and methods, respectively, the scientific community is expected to leverage these models to advance our knowledge. But as computational Natural Scientists are not software engineers, and software engineers are not computational Natural Scientists, the communication chasm that divides the two parties makes it difficult to convey effective methods for reusing deep learning methods and models to save time, cost, and energy. This book is an attempt at a living document that addresses this chasm and builds a bridge between the two parties, enabling both to meet each other’s needs.

This book is for those looking to derive value from existing deep learning models: students learning how to approach and apply deep learning methods in their work, educators teaching their students how to leverage deep learning to solve scientific problems, and researchers looking to further interrogate their data with the latest technology.

Acknowledgements

I’d like to acknowledge my fellow Ph.D. cohort members, whose advice has only made my writing better; my Ph.D. advisors and committee members for their support and mentorship; my parents and siblings for their constant cheerleading; and my wife Melissa for her love and support.

Authors

Nicholas M. Synovic does not like writing in the third person and will henceforth be writing in the first person throughout the remainder of the document. I’m a Ph.D. student at Loyola University Chicago (LUC) studying Software Engineering of AI for Science. I’ve published work on software engineering process metrics, reusing deep learning methods within software engineering, the security of the deep learning supply chain, and datasets on the proliferation of deep learning models within open-source software.

You can read more of my writing at my website: https://nicholassynovic.github.io