Preface

Science is one of humanity’s greatest inventions. Academia, on the other hand, is not. It is remarkable how successful science has been, given the often chaotic habits of scientists. In contrast to other fields, like say landscaping or software engineering, science as a profession is largely unprofessional - apprentice scientists are taught less about how to work responsibly than about how to earn promotions. This results in ubiquitous and costly errors. Software development has become indispensable to scientific work. I want to playfully ask how it can become even more useful by transferring some aspects of its professionalism, the day-to-day tracking and back-tracking and testing that is especially part of distributed, open-source software development. Science, after all, aspires to be distributed, open-source knowledge development.

“Science as Amateur Software Development” Richard McElreath (2020)

https://youtu.be/zwRdO9_GGhY

Inspired by McElreath’s words, this book aims to describe programming good practices and introduce common tools used in software development to guarantee the reproducibility of analysis results. We want to make scientific research an open-source knowledge development.

The book is available online at https://arca-dpss.github.io/manual-open-science/.

A PDF copy is available at https://arca-dpss.github.io/manual-open-science/manual-open-science.pdf.

Book Summary

In the book, we will learn to:

  • Share our materials using the Open Science Framework
  • Organize project files and data in a well structured and documented Repository
  • Write readable and maintainable code using a Functional Style approach
  • Use Git and GitHub for tracking changes and managing collaboration during the development
  • Use dedicated tools for managing the Analysis Workflow pipeline
  • Use dedicated tools for creating Dynamic Documents
  • Manage project requirements and dependencies using Docker

As most researchers have no formal training in programming and software development, we provide a very gentle introduction to many programming concepts and tools without assuming any previous knowledge.

Examples and specific applications are based on the R programming language. However, this book provides recommendations and guidelines useful for any programming language.

About the Authors

During our careers, we both moved into the field of Data Science after a Ph.D. in Psychological Sciences. This book is our attempt to bring back into scientific research what we have learned outside of academia.

  • Claudio Zandonella Callegher (). During my Ph.D., I fell in love with data science. Understanding the complex phenomena that affect our lives by exploring data, formulating hypotheses, building models, and validating them. I find this whole process extremely challenging and motivating. Moreover, I am excited about new tools and solutions to enhance the replicability and transparency of scientific results.
  • Davide Massidda (). I am a data scientist specialized in psychometrics, trained at the University of Padua with a Ph.D. in experimental psychology. After my academic career and several years as a freelancer, I currently work in the private field. I deal with descriptive and predictive models using statistics and machine learning, developing web-based systems for human interaction with data, and automating statistical processing.

ARCA

ARCA courses are advanced and highly applicable courses on modern tools for research in Psychology. They are organised by the Department of Developmental and Social Psychology at the University of Padua.

Contribute

If you think something is missing, something should be described better, or something is wrong, please, feel free to contribute to this book. Anyone is welcome to contribute to this book.

This is the hearth of open-source: contribution. We will understand the real value of this book not by the number of people that will read it but by the number of people who will invest their own time trying to improve it.

For typos (the probability of typos per page is always above 1) just send a pull request with all the corrections. Instead, if you like to add new chapters or paragraphs to include new arguments or discuss more in detail some aspects, open an issue so we can find together the best way to organize the structure of the book.

View book source at GitHub repository https://github.com/arca-dpss/manual-open-science.

Cite

For attribution, please cite this work as:

Zandonella Callegher, C., & Massidda, D. (2022). The Open Science Manual: Make Your Scientific Research Accessible and Reproducible. Available at https://arca-dpss.github.io/manual-open-science/. https://doi.org/10.5281/zenodo.6521850

BibTeX citation:

@book{zandonellaMassiddaOpenScience2022,
  title = {The Open Science Manual: Make Your Scientific Research Accessible and Reproducible},
  author = {Zandonella Callegher, Claudio and Massidda, Davide},
  date = {2022},
  url = {https://arca-dpss.github.io/manual-open-science/},
  doi = {10.5281/zenodo.6521850}
}

License

This book is released under the CC BY-NC-SA 4.0 License.

This book is based on the ARCA Bookown Template released under CC BY-SA 4.0 License.

The icons used belong to rstudio4edu-book and are licensed under under CC BY-NC 2.0 License.

References

Richard McElreath (Director). (2020, September 26). Science as Amateur Software Development. https://www.youtube.com/watch?v=zwRdO9_GGhY