The FAIR principles have laid a foundation for sharing and publishing digital assets, data in particular. They emphasize machine accessibility and hold that all digital assets should be Findable, Accessible, Interoperable, and Reusable. Workflows encode the methods by which the scientific process is conducted and by which data are created. It is thus important that workflows both support the creation of FAIR data and themselves adhere to the FAIR principles.
In this working group, we seek to:
This working group is open to anyone interested. Feel free to join the Workflows Community Initiative or attend one of our calls.
Adapted from the article FAIR Computational Workflows https://doi.org/10.1162/dint_a_00033:
Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products.
Workflows can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance.
These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right.
We argue that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.
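As a rough illustration of how a workflow can create metadata and record provenance while it processes data, the following minimal Python sketch wraps each step and logs checksums and timestamps. The function names (`run_step`, `run_workflow`), the record layout, and the two toy steps are assumptions made for this example only; they are not taken from the article or from any workflow system or provenance standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of a byte string as hex."""
    return hashlib.sha256(data).hexdigest()


def run_step(name, func, data: bytes):
    """Run one workflow step and return (output, provenance record).

    The record layout is hypothetical, not a standard such as W3C PROV.
    """
    started = datetime.now(timezone.utc).isoformat()
    output = func(data)
    ended = datetime.now(timezone.utc).isoformat()
    record = {
        "step": name,
        "started": started,
        "ended": ended,
        "input_sha256": sha256_hex(data),
        "output_sha256": sha256_hex(output),
    }
    return output, record


def run_workflow(steps, data: bytes):
    """Chain steps, collecting one provenance record per step."""
    provenance = []
    for name, func in steps:
        data, record = run_step(name, func, data)
        provenance.append(record)
    return data, provenance


# Hypothetical two-step workflow: strip whitespace, then uppercase.
steps = [("strip", bytes.strip), ("upper", bytes.upper)]
result, provenance = run_workflow(steps, b"  acgt  ")
print(result.decode())
print(json.dumps(provenance, indent=2))
```

Because each record stores both an input and an output checksum, the output of one step can be matched to the input of the next, which gives a simple, machine-readable trace of how a data product was derived.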
This group is a gathering of community resources and literature on FAIR Computational Workflows. Feel free to suggest a change to help improve this page!
Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020):
FAIR Computational Workflows.
Data Intelligence 2(1):108–121
https://doi.org/10.1162/dint_a_00033
Events, presentations and activities are also available on WorkflowHub under the FAIR Computational Workflows team.
Related past and upcoming events:
2023-02-21: FAIRPoints ‘Ask Me Anything’ webinar on Software & Workflows
Registries:
Related projects and initiatives supporting FAIR Computational Workflows aims:
Note that the similarly named project Implementing FAIR Workflows is about making the overall research process FAIR, not necessarily related to computational workflows.
Related standards for FAIR computational workflows:
Articles below are published as Open Access, or with green open access preprints where gold open access is not possible. Please let us know if you are unable to access any of our publications. To add to this list, please suggest a change.
Marine Djaffardjy, George Marchment, Clémence Sebe, Raphael Blanchet, Khalid Belhajjame, Alban Gaignard, Frédéric Lemoine, Sarah Cohen-Boulakia (2023):
Developing and reusing bioinformatics data analysis pipelines using scientific workflow systems.
Computational and Structural Biotechnology Journal 21
https://doi.org/10.1016/j.csbj.2023.03.003
Hirotaka Suetake, Tsukasa Fukusato, Takeo Igarashi, Tazro Ohta (2023):
Workflow sharing with automated metadata validation and test execution to improve the reusability of published workflows.
GigaScience 12:giad006
https://doi.org/10.1093/gigascience/giad006
Line Pouchard (2023):
FAIR enabling reuse of data-intensive workflows and scientific reproducibility.
ICPE ’23 Companion: Companion of the 2023 ACM/SPEC International Conference on Performance Engineering
https://doi.org/10.1145/3578245.3586012
Fadoua Rafii, Horacio Gonzalez-Velez, Adriana E. Chis (2023):
Automatic FAIR Provenance Collection and Visualization for Time Series.
ICPE ’23 Companion: Companion of the 2023 ACM/SPEC International Conference on Performance Engineering
https://doi.org/10.1145/3578245.3585026
Raül Sirvent, Javier Conejero, Francesc Lordan, Jorge Ejarque, Laura Rodríguez-Navas, José M. Fernández, Salvador Capella-Gutiérrez, Rosa M. Badia (2022):
Automatic, Efficient and Scalable Provenance Registration for FAIR HPC Workflows.
2022 IEEE/ACM Workshop on Workflows in Support of Large-Scale Science (WORKS)
https://doi.org/10.1109/WORKS56498.2022.00006
[preprint]
[slides]
Rudolf Wittner, Cecilia Mascia, Matej Gallo, Francesca Frexia, Heimo Müller, Markus Plass, Jörg Geiger, Petr Holub (2022):
Lightweight Distributed Provenance Model for Complex Real–world Environments.
Scientific Data 9(1)
https://doi.org/10.1038/s41597-022-01537-6
J. Machicao, A. Ben Abbes, L. Meneguzzi, P. L. P. Corrêa, A. Specht, R. David, G. Subsol, D. Vellenich, R. Devillers, S. Stall, N. Mouquet, M. Chaumont, L. Berti‐Equille, D. Mouillot (2022):
Mitigation Strategies to Improve Reproducibility of Poverty Estimations From Remote Sensing Images Using Deep Learning.
Earth and Space Science 9(8)
https://doi.org/10.1029/2022ea002379
Mahmood Shad, Ana Trisovic, Sonia Barbosa, Gustavo Durand, Katherine McNeill, Ceilyn Boyd, Robin Wendler, Julian Gautier, Tania Schlatter, Krista Valladares (2022):
Supporting FAIR Workflows at Harvard Data Commons.
17th International Digital Curation Conference (IDCC 2022), 2022-06-13/16.
https://doi.org/10.5281/zenodo.6640461
Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Carole Goble, CWL Community (2022):
Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language.
Communications of the ACM 65(6)
https://doi.org/10.1145/3486897
Peter Wittenburg, Alex Hardisty, Amirpasha Mozaffari, Limor Peer, Nikolay Skvortsov, Alessandro Spinuso, Zhiming Zhao (2022):
Editors’ Note: Special Issue on Canonical Workflow Frameworks for Research.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_e_00122
Stian Soiland-Reyes, Genís Bayarri, Pau Andrio, Robin Long, Douglas Lowe, Ania Niewielska, Adam Hospital, Paul Groth (2022):
Making Canonical Workflow Building Blocks interoperable across workflow languages.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00135
Alex Hardisty, Paul Brack, Carole Goble, Laurence Livermore, Ben Scott, Quentin Groom, Stuart Owen, Stian Soiland-Reyes (2022):
The Specimen Data Refinery: A canonical workflow framework and FAIR Digital Object approach to speeding up digital mobilisation of natural history collections.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00134
Amirpasha Mozaffari, Michael Langguth, Bing Gong, Jessica Ahring, Adrian Rojas Campos, Pascal Nieters, Otoniel José Campos Escobar, Martin Wittenbrink, Peter Baumann, Martin G. Schultz (2022):
HPC-oriented Canonical Workflows for Machine Learning Applications in Climate and Weather Prediction.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00131
Peter Wittenburg, Alex Hardisty, Yann Le Franc, Amirpasha Mozaffari, Limor Peer, Nikolay A. Skvortsov, Zhiming Zhao, Alessandro Spinuso (2022):
Canonical Workflows to Make Data FAIR.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00132
Beatriz Serrano-Solano, Anne Fouilloux, Ignacio Eguinoa, Matúš Kalaš, Björn Grüning, Frederik Coppens (2022):
Galaxy: A Decade of Realising CWFR Concepts.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00136
Thomas Jejkal, Sabrine Chelbi, Andreas Pfeil, Peter Wittenburg (2022):
Evaluation of Application Possibilities for Packaging Technologies in Canonical Workflows.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00137
Hendrik Nolte, Philipp Wieder (2022):
Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00141
Nikolay A. Skvortsov, Sergey A. Stupnikov (2022):
A Semantic Approach to Workflow Management and Reuse for Research Problem Solving.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00142
Alessandro Spinuso, Mats Veldhuizen, Daniele Bailo, Valerio Vinciarelli, Tor Langeland (2022):
SWIRRL. Managing Provenance-aware and Reproducible Workspaces.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00129
Christian Ohmann, Romain David, Mónica Cano Abadia, Florence Bietrix, Jan-Willem Boiten, Steve Canham, Maria Luisa Chiusano, Walter Dastrù, Arnaud Laroquette, Dario Longo, Michaela Theresia Mayrhofer, Maria Panagiotopoulou, Audrey Richard, Pablo Emilio Verde (2022):
Pilot Study on the Intercalibration of a Categorisation System for FAIRer Digital Objects Related to Sensitive Data in the Life Sciences.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00126
Stian Soiland-Reyes, Peter Sefton, Mercè Crosas, Leyla Jael Castro, Frederik Coppens, José M. Fernández, Daniel Garijo, Björn Grüning, Marco La Rosa, Simone Leo, Eoghan Ó Carragáin, Marc Portier, Ana Trisovic, RO-Crate Community, Paul Groth, Carole Goble (2022):
Packaging research artefacts with RO-Crate.
Data Science 5(2)
https://doi.org/10.3233/DS-210053
Paul Brack, Peter Crowther, Stian Soiland-Reyes, Stuart Owen, Douglas Lowe, Alan R Williams, Quentin Groom, Mathias Dillen, Frederik Coppens, Björn Grüning, Ignacio Eguinoa, Philip Ewels, Carole Goble (2022):
10 Simple Rules for making a software tool workflow-ready
PLOS Computational Biology 18(3):e1009823
https://doi.org/10.1371/journal.pcbi.1009823
Neil P. Chue Hong, Daniel S. Katz, Michelle Barker; Anna-Lena Lamprecht, Carlos Martinez, Fotis E. Psomopoulos, Jen Harrow, Leyla Jael Castro, Morane Gruenpeter, Paula Andrea Martinez, Tom Honeyman; Alexander Struck, Allen Lee, Axel Loewe, Ben van Werkhoven, Catherine Jones, Daniel Garijo, Esther Plomp, Francoise Genova, Hugh Shanahan, Joanna Leng, Maggie Hellström, Malin Sandström, Manodeep Sinha, Mateusz Kuzak, Patricia Herterich, Qian Zhang, Sharif Islam, Susanna-Assunta Sansone, Tom Pollard, Udayanto Dwi Atmojo; Alan Williams, Andreas Czerniak, Anna Niehues, Anne Claire Fouilloux, Bala Desinghu, Carole Goble, Céline Richard, Charles Gray, Chris Erdmann, Daniel Nüst, Daniele Tartarini, Elena Ranguelova, Hartwig Anzt, Ilian Todorov, James McNally, Javier Moldon, Jessica Burnett, Julián Garrido-Sánchez, Khalid Belhajjame, Laurents Sesink, Lorraine Hwang, Marcos Roberto Tovani-Palone, Mark D. Wilkinson, Mathieu Servillat, Matthias Liffers, Merc Fox, Nadica Miljković, Nick Lynch, Paula Martinez Lavanchy, Sandra Gesing, Sarah Stevens, Sergio Martinez Cuesta, Silvio Peroni, Stian Soiland-Reyes, Tom Bakker, Tovo Rabemanantsoa, Vanessa Sochat, Yo Yehudi, FAIR4RS WG (2022):
FAIR Principles for Research Software version 1.0 (FAIR4RS Principles v1.0).
Research Data Alliance
https://doi.org/10.15497/RDA00068
Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Ilkay Altintas, Rosa M Badia, Bartosz Balis, Tainã Coleman, Frederik Coppens, Frank Di Natale, Bjoern Enders, Thomas Fahringer, Rosa Filgueira, Grigori Fursin, Daniel Garijo, Carole Goble, Dorran Howell, Shantenu Jha, Daniel S. Katz, Daniel Laney, Ulf Leser, Maciej Malawski, Kshitij Mehta, Loïc Pottier, Jonathan Ozik, J. Luc Peterson, Lavanya Ramakrishnan, Stian Soiland-Reyes, Douglas Thain, Matthew Wolf (2021):
A Community Roadmap for Scientific Workflows Research and Development.
arXiv:2110.02168 [cs.DC]
2021 IEEE Workshop on Workflows in Support of Large-Scale Science (WORKS), pp 81–90.
https://doi.org/10.1109/WORKS54523.2021.00016
Robin A Richardson, Remzi Celebi, Sven van der Burg, Djura Smits, Lars Ridder, Michel Dumontier, Tobias Kuhn (2021):
User-friendly Composition of FAIR Workflows in a Notebook Environment.
The Eleventh International Conference on Knowledge Capture (K-Cap2021).
https://arxiv.org/abs/2111.00831
Jeremy Leipzig, Daniel Nüst, Charles Tapley Hoyt, Karthik Ram, Jane Greenberg (2021):
The role of metadata in reproducible computational research
Patterns 2(1):100322
https://doi.org/10.1016/j.patter.2021.100322
Sveinung Gundersen, Sanjay Boddu, Salvador Capella-Gutierrez, Finn Drabløs, José M. Fernández, Radmila Kompova, Kieron Taylor, Dmytro Titov, Daniel Zerbino, Eivind Hovig (2021):
Recommendations for the FAIRification of genomic track metadata [version 1; peer review: 2 approved].
F1000Research 10(ELIXIR):268
https://doi.org/10.12688/f1000research.28449.1
Anna-Lena Lamprecht, Magnus Palmblad, Jon Ison, Veit Schwämmle, Mohammad Sadnan Al Manir, Ilkay Altintas, Christopher J. O. Baker, Ammar Ben Hadj Amor, Salvador Capella-Gutierrez, Paulos Charonyktakis, Michael R. Crusoe, Yolanda Gil, Carole Goble, Timothy J. Griffin, Paul Groth, Hans Ienasescu, Pratik Jagtap, Matúš Kalaš, Vedran Kasalica, Alireza Khanteymoori, Tobias Kuhn, Hailiang Mei, Hervé Ménager, Steffen Möller, Robin A. Richardson, Vincent Robert, Stian Soiland-Reyes, Robert Stevens, Szoke Szaniszlo, Suzan Verberne, Aswin Verhoeven, Katherine Wolstencroft (2021):
Perspectives on automated composition of workflows in the life sciences [version 1; peer review: 2 approved].
F1000Research 10:897
https://doi.org/10.12688/f1000research.54159.1
Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Dan Laney, Dong Ahn, Shantenu Jha, Carole Goble, Lavanya Ramakrishnan, Luc Peterson, Bjoern Enders, Douglas Thain, Ilkay Altintas, Yadu Babuji, Rosa Badia, Vivien Bonazzi, Taina Coleman, Michael Crusoe, Ewa Deelman, Frank Di Natale & Paolo Di Tommaso (2021):
Workflows Community Summit: Bringing the Scientific Workflows Community Together.
Workflows RI Technical Report. arXiv:2103.09181
https://doi.org/10.5281/zenodo.4606958
Carole Goble, Stian Soiland-Reyes, Finn Bacall, Stuart Owen, Alan Williams, Ignacio Eguinoa, Bert Droesbeke, Simone Leo, Luca Pireddu, Laura Rodriguez-Navas, José Mª Fernández, Salvador Capella-Gutierrez, Hervé Ménager, Björn Grüning, Beatriz Serrano-Solano, Philip Ewels, Frederik Coppens (2021):
Implementing FAIR Digital Objects in the EOSC-Life Workflow Collaboratory.
Zenodo
https://doi.org/10.5281/zenodo.4605654
Daniel S. Katz, Morane Gruenpeter, Tom Honeyman, Lorraine Hwang, Mark D. Wilkinson, Vanessa Sochat, Hartwig Anzt, Carole Goble, FAIR4RS Subgroup 1 (2021):
A Fresh Look at FAIR for Research Software.
arXiv:2101.10883 [pdf]
Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020):
FAIR Computational Workflows.
Data Intelligence 2(1):108–121
https://doi.org/10.1162/dint_a_00033
Janno Harjes, Anton Link, Tanja Weibulat, Dagmar Triebel, Gerhard Rambold (2020):
FAIR digital objects in environmental and life sciences should comprise workflow operation design data and method information for repeatability of study setups and reproducibility of results.
Database 2020:baaa059
https://doi.org/10.1093/database/baaa059
Anna-Lena Lamprecht, Leyla Garcia, Mateusz Kuzak, Carlos Martinez, Ricardo Arcila, Eva Martin Del Pico, Victoria Dominguez Del Angel, Stephanie Van De Sandt, Jon Ison, Paula Andrea Martinez, Peter Mcquilton, Alfonso Valencia, Jennifer Harrow, Fotis Psomopoulos, Josep Ll. Gelpi, Neil Chue Hong, Carole Goble, Salvador Capella-Gutierrez (2020):
Towards FAIR principles for research software.
Data Science 3(1) pp. 37–59.
https://doi.org/10.3233/DS-190026
Farah Zaib Khan, Stian Soiland-Reyes, Richard O. Sinnott, Andrew Lonie, Carole Goble, Michael R. Crusoe (2019):
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv.
GigaScience 8(11):giz095
https://doi.org/10.1093/gigascience/giz095
Jeffrey M. Perkel (2019):
Workflow systems turn raw data into scientific knowledge
Nature 573
https://doi.org/10.1038/d41586-019-02619-z
Jeffrey M. Perkel (2019):
That’s the way we flow.
Nature 573:149–150
https://doi.org/10.1038/d41586-019-02619-z
Natalie J Stanford, Finn Bacall, Fatemeh Zamanzad Ghavidel, Martin Golebiewski, Inge Jonassen, Rune Kleppe, Olga Krebs, Hadas Leonov, Stuart Owen, Kjell Petersen, Maja Rey, Stian Soiland-Reyes, Kidane Tekle, Andreas Weidemann, Alan Williams, Ulrike Wittig, Katy Wolstencroft, Anders Goksøyr, Jacky L. Snoep, Jon Olav Vik, Wolfgang Müller, Carole Goble (2018):
FAIR Bioinformatics computation and data management: FAIRDOM and the Norwegian Digital Life initiative.
NETTAB 2018 Network Tools and Applications in Biology.
[preprint]
[preprint server]
Gil Alterovitz, Dennis A Dean II, Carole Goble, Michael R Crusoe, Stian Soiland-Reyes, Amanda Bell, Anais Hayes, Anita Suresh, Charles Hadley S King IV, Dan Taylor, KanakaDurga Addepalli, Elaine Johanson, Elaine E Thompson, Eric Donaldson, Hiroki Morizono, Hsinyi Tsang, Jeet K Vora, Jeremy Goecks, Jianchao Yao, Jonas S Almeida, Konstantinos Krampis, Krista Smith, Lydia Guo, Mark Walderhaug, Marco Schito, Matthew Ezewudo, Nuria Guimera, Paul Walsh, Robel Kahsay, Srikanth Gottipati, Timothy C Rodwell, Toby Bloom, Yuching Lai, Vahan Simonyan, Raja Mazumder (2018):
Enabling Precision Medicine via standard communication of NGS provenance, analysis, and results.
PLOS Biology 16(12):e3000099
https://doi.org/10.1371/journal.pbio.3000099
(bioRxiv:191783)
Pablo Carbonell, Adrian J. Jervis, Christopher J. Robinson, Cunyu Yan, Mark Dunstan, Neil Swainston, Maria Vinaixa, Katherine A. Hollywood, Andrew Currin, Nicholas J. W. Rattray, Sandra Taylor, Reynard Spiess, Rehana Sung, Alan R. Williams, Donal Fellows, Natalie J. Stanford, Paul Mulherin, Rosalind Le Feuvre, Perdita Barran, Royston Goodacre, Nicholas J. Turner, Carole Goble, George Guoqiang Chen, Douglas B. Kell, Jason Micklefield, Rainer Breitling, Eriko Takano, Jean-Loup Faulon, Nigel S. Scrutton (2018):
An automated Design-Build-Test-Learn pipeline for enhanced microbial production of fine chemicals.
Communications Biology 1:66
https://doi.org/10.1038/s42003-018-0076-9
Stephen J Eglen, Ben Marwick, Yaroslav O Halchenko, Michael Hanke, Shoaib Sufi, Padraig Gleeson, R Angus Silver, Andrew P Davison, Linda Lanyon, Mathew Abrams, Thomas Wachtler, David J Willshaw, Christophe Pouzat, Jean-Baptiste Poline (2017):
Toward standard practices for sharing computer code and programs in neuroscience.
Nature Neuroscience 20, 770–773.
https://doi.org/10.1038/nn.4550 [bioRxiv preprint]
Steffen Möller, Stuart W. Prescott, Lars Wirzenius; Petter Reinholdtsen, Brad Chapman, Pjotr Prins, Stian Soiland-Reyes, Fabian Klötzl, Andrea Bagnacani, Matúš Kalaš, Andreas Tille, Michael R. Crusoe (2017):
Robust cross-platform workflows: how technical and scientific communities collaborate to develop, test and share best practices for data analysis.
Data Science and Engineering 2:232–244
https://doi.org/10.1007/s41019-017-0050-4
The FAIR Computational Workflows working group is composed of 13 members.