FAIR Computational Workflows

The FAIR principles have laid a foundation for sharing and publishing digital assets, and data in particular. They emphasize machine accessibility and state that all digital assets should be Findable, Accessible, Interoperable, and Reusable. Workflows encode the methods by which the scientific process is conducted and by which data are created. It is therefore important that workflows both support the creation of FAIR data and themselves adhere to the FAIR principles.


Goals

In this working group, we seek to:

  • Define FAIR principles for computational workflows that consider the complex lifecycle from specification to execution and data products
  • Define metrics to measure the FAIRness of a workflow
  • Define recommendations for FAIR workflow developers and systems
  • Define processes to automate FAIRness in workflows by recording necessary provenance data
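
To make the idea of a FAIRness metric more concrete, the following is a minimal sketch of a checklist-style score computed over a workflow's metadata record. The field names, their grouping by FAIR facet, and the `fairness_report` function are all hypothetical illustrations. They are not metrics defined by this working group.

```python
# Hypothetical checklist: the field names and their grouping are illustrative
# assumptions, not metrics defined by the working group.
EXPECTED_FIELDS = {
    "findable": ["identifier", "title"],
    "accessible": ["download_url"],
    "interoperable": ["workflow_language"],
    "reusable": ["license", "authors"],
}


def fairness_report(metadata: dict) -> dict:
    """Return, per FAIR facet, the fraction of expected fields that are non-empty."""
    report = {}
    for facet, fields in EXPECTED_FIELDS.items():
        present = sum(1 for name in fields if metadata.get(name))
        report[facet] = present / len(fields)
    return report


# Placeholder metadata for an imaginary workflow (no download URL, no authors).
example = {
    "identifier": "https://example.org/workflows/1",
    "title": "Example workflow",
    "workflow_language": "CWL",
    "license": "Apache-2.0",
}

for facet, score in fairness_report(example).items():
    print(f"{facet}: {score:.2f}")
```

A per-facet report, rather than a single number, keeps it visible which aspect of the record is incomplete. A real metric would likely also need to check that identifiers resolve and that licenses are machine-readable, which this sketch does not attempt.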

What are FAIR Computational Workflows?

Adapted from the article FAIR Computational Workflows https://doi.org/10.1162/dint_a_00033:

Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products.

Workflows can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance.
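
As a rough illustration of how a workflow step might track and record data provenance, the sketch below wraps a single step and emits a small record with timestamps and content hashes of its input and output. The function `run_step_with_provenance` and the record layout are assumptions made for this example. They do not follow W3C PROV, CWLProv, or any other established provenance format.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Content hash used to identify the exact bytes a step consumed or produced."""
    return hashlib.sha256(data).hexdigest()


def run_step_with_provenance(step_name, func, input_bytes):
    """Run one workflow step and return its output with a provenance record.

    The record layout is a hypothetical, simplified structure for illustration.
    """
    started = datetime.now(timezone.utc).isoformat()
    output_bytes = func(input_bytes)
    ended = datetime.now(timezone.utc).isoformat()
    record = {
        "step": step_name,
        "started": started,
        "ended": ended,
        "input_sha256": sha256_of(input_bytes),
        "output_sha256": sha256_of(output_bytes),
    }
    return output_bytes, record


# Toy step: upper-case a short sequence string.
output, prov = run_step_with_provenance("uppercase", bytes.upper, b"acgt")
print(json.dumps(prov, indent=2))
```

Because the record is produced as a side effect of running the step, the metadata is created during processing rather than reconstructed afterwards.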

These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right.

We argue that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.

This group is a gathering of community resources and literature on FAIR Computational Workflows. Feel free to suggest a change to help improve this page!

Cite FAIR Computational Workflows

Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020):
FAIR Computational Workflows.
Data Intelligence 2(1):108–121 https://doi.org/10.1162/dint_a_00033

Events and talks

Events, presentations and activities are also available on WorkflowHub, in the FAIR Computational Workflows Team.

Related past and upcoming events:

Projects and initiatives

Registries:

Related projects and initiatives supporting FAIR Computational Workflows aims:

Note that the similarly named project Implementing FAIR Workflows is about making the overall research process FAIR and is not necessarily related to computational workflows.

Related standards for FAIR computational workflows:

References

Articles below are published as Open Access, or with green open access preprints where gold open access is not possible. Please let us know if you are unable to access any of our publications. To add to this list, please suggest a change.

2022

Hirotaka Suetake, Tsukasa Fukusato, Takeo Igarashi, Tazro Ohta (2022):
Workflow sharing with automated metadata validation and test execution to improve the reusability of published workflows.
bioRxiv
https://doi.org/10.1101/2022.07.08.499265

Rudolf Wittner, Cecilia Mascia, Matej Gallo, Francesca Frexia, Heimo Müller, Markus Plass, Jörg Geiger, Petr Holub (2022):
Lightweight Distributed Provenance Model for Complex Real–world Environments.
Scientific Data 9(1)
https://doi.org/10.1038/s41597-022-01537-6

J. Machicao, A. Ben Abbes, L. Meneguzzi, P. L. P. Corrêa, A. Specht, R. David, G. Subsol, D. Vellenich, R. Devillers, S. Stall, N. Mouquet, M. Chaumont, L. Berti‐Equille, D. Mouillot (2022):
Mitigation Strategies to Improve Reproducibility of Poverty Estimations From Remote Sensing Images Using Deep Learning.
Earth and Space Science 9(8)
https://doi.org/10.1029/2022ea002379

Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Carole Goble, CWL Community (2022):
Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language.
Communications of the ACM 65(6)
https://doi.org/10.1145/3486897

Peter Wittenburg, Alex Hardisty, Amirpasha Mozzafari, Limor Peer, Nikolay Skvortsov, Alessandro Spinuso, Zhiming Zhao (2022):
Editors’ Note: Special Issue on Canonical Workflow Frameworks for Research.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_e_00122

Stian Soiland-Reyes, Genís Bayarri, Pau Andrio, Robin Long, Douglas Lowe, Ania Niewielska, Adam Hospital, Paul Groth (2022):
Making Canonical Workflow Building Blocks interoperable across workflow languages.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00135

Alex Hardisty, Paul Brack, Carole Goble, Laurence Livermore, Ben Scott, Quentin Groom, Stuart Owen, Stian Soiland-Reyes (2022):
The Specimen Data Refinery: A canonical workflow framework and FAIR Digital Object approach to speeding up digital mobilisation of natural history collections.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00134

Amirpasha Mozaffari, Michael Langguth, Bing Gong, Jessica Ahring, Adrian Rojas Campos, Pascal Nieters, Otoniel José Campos Escobar, Martin Wittenbrink, Peter Baumann, Martin G. Schultz (2022):
HPC-oriented Canonical Workflows for Machine Learning Applications in Climate and Weather Prediction.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00131

Peter Wittenburg, Alex Hardisty, Yann Le Franc, Amirpasha Mozaffari, Limor Peer, Nikolay A. Skvortsov, Zhiming Zhao, Alessandro Spinuso (2022):
Canonical Workflows to Make Data FAIR.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00132

Beatriz Serrano-Solano, Anne Fouilloux, Ignacio Eguinoa, Matúš Kalaš, Björn Grüning, Frederik Coppens (2022):
Galaxy: A Decade of Realising CWFR Concepts.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00136

Thomas Jejkal, Sabrine Chelbi, Andreas Pfeil, Peter Wittenburg (2022):
Evaluation of Application Possibilities for Packaging Technologies in Canonical Workflows.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00137

Hendrik Nolte, Philipp Wieder (2022):
Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00141

Nikolay A. Skvortsov, Sergey A. Stupnikov (2022):
A Semantic Approach to Workflow Management and Reuse for Research Problem Solving.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00142

Alessandro Spinuso, Mats Veldhuizen, Daniele Bailo, Valerio Vinciarelli, Tor Langeland (2022):
SWIRRL. Managing Provenance-aware and Reproducible Workspaces.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00129

Christian Ohmann, Romain David, Mónica Cano Abadia, Florence Bietrix, Jan-Willem Boiten, Steve Canham, Maria Luisa Chiusano, Walter Dastrù, Arnaud Laroquette, Dario Longo, Michaela Theresia Mayrhofer, Maria Panagiotopoulou, Audrey Richard, Pablo Emilio Verde (2022):
Pilot Study on the Intercalibration of a Categorisation System for FAIRer Digital Objects Related to Sensitive Data in the Life Sciences.
Data Intelligence 4(2)
https://doi.org/10.1162/dint_a_00126

Stian Soiland-Reyes, Peter Sefton, Mercè Crosas, Leyla Jael Castro, Frederik Coppens, José M. Fernández, Daniel Garijo, Björn Grüning, Marco La Rosa, Simone Leo, Eoghan Ó Carragáin, Marc Portier, Ana Trisovic, RO-Crate Community, Paul Groth, Carole Goble (2022):
Packaging research artefacts with RO-Crate.
Data Science 5(2)
https://doi.org/10.3233/DS-210053

Paul Brack, Peter Crowther, Stian Soiland-Reyes, Stuart Owen, Douglas Lowe, Alan R Williams, Quentin Groom, Mathias Dillen, Frederik Coppens, Björn Grüning, Ignacio Eguinoa, Philip Ewels, Carole Goble (2022):
10 Simple Rules for making a software tool workflow-ready.
PLOS Computational Biology 18(3):e1009823 https://doi.org/10.1371/journal.pcbi.1009823

Neil P. Chue Hong, Daniel S. Katz, Michelle Barker; Anna-Lena Lamprecht, Carlos Martinez, Fotis E. Psomopoulos, Jen Harrow, Leyla Jael Castro, Morane Gruenpeter, Paula Andrea Martinez, Tom Honeyman; Alexander Struck, Allen Lee, Axel Loewe, Ben van Werkhoven, Catherine Jones, Daniel Garijo, Esther Plomp, Francoise Genova, Hugh Shanahan, Joanna Leng, Maggie Hellström, Malin Sandström, Manodeep Sinha, Mateusz Kuzak, Patricia Herterich, Qian Zhang, Sharif Islam, Susanna-Assunta Sansone, Tom Pollard, Udayanto Dwi Atmojo; Alan Williams, Andreas Czerniak, Anna Niehues, Anne Claire Fouilloux, Bala Desinghu, Carole Goble, Céline Richard, Charles Gray, Chris Erdmann, Daniel Nüst, Daniele Tartarini, Elena Ranguelova, Hartwig Anzt, Ilian Todorov, James McNally, Javier Moldon, Jessica Burnett, Julián Garrido-Sánchez, Khalid Belhajjame, Laurents Sesink, Lorraine Hwang, Marcos Roberto Tovani-Palone, Mark D. Wilkinson, Mathieu Servillat, Matthias Liffers, Merc Fox, Nadica Miljković, Nick Lynch, Paula Martinez Lavanchy, Sandra Gesing, Sarah Stevens, Sergio Martinez Cuesta, Silvio Peroni, Stian Soiland-Reyes, Tom Bakker, Tovo Rabemanantsoa, Vanessa Sochat, Yo Yehudi, FAIR4RS WG (2022):
FAIR Principles for Research Software version 1.0 (FAIR4RS Principles v1.0).
Research Data Alliance
https://doi.org/10.15497/RDA00068

2021

Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Ilkay Altintas, Rosa M Badia, Bartosz Balis, Tainã Coleman, Frederik Coppens, Frank Di Natale, Bjoern Enders, Thomas Fahringer, Rosa Filgueira, Grigori Fursin, Daniel Garijo, Carole Goble, Dorran Howell, Shantenu Jha, Daniel S. Katz, Daniel Laney, Ulf Leser, Maciej Malawski, Kshitij Mehta, Loïc Pottier, Jonathan Ozik, J. Luc Peterson, Lavanya Ramakrishnan, Stian Soiland-Reyes, Douglas Thain, Matthew Wolf (2021):
A Community Roadmap for Scientific Workflows Research and Development.
2021 IEEE Workshop on Workflows in Support of Large-Scale Science (WORKS), pp 81–90. arXiv:2110.02168 [cs.DC]
https://doi.org/10.1109/WORKS54523.2021.00016

Robin A Richardson, Remzi Celebi, Sven van der Burg, Djura Smits, Lars Ridder, Michel Dumontier, Tobias Kuhn (2021):
User-friendly Composition of FAIR Workflows in a Notebook Environment.
The Eleventh International Conference on Knowledge Capture (K-Cap2021).
https://arxiv.org/abs/2111.00831

Jeremy Leipzig, Daniel Nüst, Charles Tapley Hoyt, Karthik Ram, Jane Greenberg (2021):
The role of metadata in reproducible computational research.
Patterns 2(1):100322
https://doi.org/10.1016/j.patter.2021.100322

Sveinung Gundersen, Sanjay Boddu, Salvador Capella-Gutierrez, Finn Drabløs, José M. Fernández, Radmila Kompova, Kieron Taylor, Dmytro Titov, Daniel Zerbino, Eivind Hovig (2021):
Recommendations for the FAIRification of genomic track metadata [version 1; peer review: 2 approved].
F1000Research 10(ELIXIR):268
https://doi.org/10.12688/f1000research.28449.1

Anna-Lena Lamprecht, Magnus Palmblad, Jon Ison, Veit Schwämmle, Mohammad Sadnan Al Manir, Ilkay Altintas, Christopher J. O. Baker, Ammar Ben Hadj Amor, Salvador Capella-Gutierrez, Paulos Charonyktakis, Michael R. Crusoe, Yolanda Gil, Carole Goble, Timothy J. Griffin, Paul Groth, Hans Ienasescu, Pratik Jagtap, Matúš Kalaš, Vedran Kasalica, Alireza Khanteymoori, Tobias Kuhn, Hailiang Mei, Hervé Ménager, Steffen Möller, Robin A. Richardson, Vincent Robert, Stian Soiland-Reyes, Robert Stevens, Szoke Szaniszlo, Suzan Verberne, Aswin Verhoeven, Katherine Wolstencroft (2021):
Perspectives on automated composition of workflows in the life sciences [version 1; peer review: 2 approved].
F1000Research 10:897
https://doi.org/10.12688/f1000research.54159.1

Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Dan Laney, Dong Ahn, Shantenu Jha, Carole Goble, Lavanya Ramakrishnan, Luc Peterson, Bjoern Enders, Douglas Thain, Ilkay Altintas, Yadu Babuji, Rosa Badia, Vivien Bonazzi, Taina Coleman, Michael Crusoe, Ewa Deelman, Frank Di Natale & Paolo Di Tommaso (2021):
Workflows Community Summit: Bringing the Scientific Workflows Community Together.
Workflows RI Technical Report. arXiv:2103.09181
https://doi.org/10.5281/zenodo.4606958

Carole Goble, Stian Soiland-Reyes, Finn Bacall, Stuart Owen, Alan Williams, Ignacio Eguinoa, Bert Droesbeke, Simone Leo, Luca Pireddu, Laura Rodriguez-Navas, José Mª Fernández, Salvador Capella-Gutierrez, Hervé Ménager, Björn Grüning, Beatriz Serrano-Solano, Philip Ewels, Frederik Coppens (2021):
Implementing FAIR Digital Objects in the EOSC-Life Workflow Collaboratory.
Zenodo
https://doi.org/10.5281/zenodo.4605654

Daniel S. Katz, Morane Gruenpeter, Tom Honeyman, Lorraine Hwang, Mark D. Wilkinson, Vanessa Sochat, Hartwig Anzt, Carole Goble, FAIR4RS Subgroup 1 (2021):
A Fresh Look at FAIR for Research Software.
arXiv:2101.10883

2020

Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020):
FAIR Computational Workflows.
Data Intelligence 2(1):108–121 https://doi.org/10.1162/dint_a_00033

Janno Harjes, Anton Link, Tanja Weibulat, Dagmar Triebel, Gerhard Rambold (2020):
FAIR digital objects in environmental and life sciences should comprise workflow operation design data and method information for repeatability of study setups and reproducibility of results.
Database 2020:baaa059
https://doi.org/10.1093/database/baaa059

Anna-Lena Lamprecht, Leyla Garcia, Mateusz Kuzak, Carlos Martinez, Ricardo Arcila, Eva Martin Del Pico, Victoria Dominguez Del Angel, Stephanie Van De Sandt, Jon Ison, Paula Andrea Martinez, Peter Mcquilton, Alfonso Valencia, Jennifer Harrow, Fotis Psomopoulos, Josep Ll. Gelpi, Neil Chue Hong, Carole Goble, Salvador Capella-Gutierrez (2020):
Towards FAIR principles for research software.
Data Science 3(1) pp. 37–59.
https://doi.org/10.3233/DS-190026

2019

Farah Zaib Khan, Stian Soiland-Reyes, Richard O. Sinnott, Andrew Lonie, Carole Goble, Michael R. Crusoe (2019):
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv.
GigaScience 8(11):giz095
https://doi.org/10.1093/gigascience/giz095

Jeffrey M. Perkel (2019):
Workflow systems turn raw data into scientific knowledge.
Nature 573
https://doi.org/10.1038/d41586-019-02619-z

2018

Jeffrey M. Perkel (2018):
That’s the way we flow.
Nature 573 149-150.
https://doi.org/10.1038/d41586-019-02619-z

Natalie J Stanford, Finn Bacall, Fatemeh Zamanzad Ghavidel, Martin Golebiewski, Inge Jonassen, Rune Kleppe, Olga Krebs, Hadas Leonov, Stuart Owen, Kjell Petersen, Maja Rey, Stian Soiland-Reyes, Kidane Tekle, Andreas Weidemann, Alan Williams, Ulrike Wittig, Katy Wolstencroft, Anders Goksøyr, Jacky L. Snoep, Jon Olav Vik, Wolfgang Müller, Carole Goble (2018):
FAIR Bioinformatics computation and data management: FAIRDOM and the Norwegian Digital Life initiative.
NETTAB 2018 Network Tools and Applications in Biology.
[preprint] [preprint server]

Gil Alterovitz, Dennis A Dean II, Carole Goble, Michael R Crusoe, Stian Soiland-Reyes, Amanda Bell, Anais Hayes, Anita Suresh, Charles Hadley S King IV, Dan Taylor, KanakaDurga Addepalli, Elaine Johanson, Elaine E Thompson, Eric Donaldson, Hiroki Morizono, Hsinyi Tsang, Jeet K Vora, Jeremy Goecks, Jianchao Yao, Jonas S Almeida, Konstantinos Krampis, Krista Smith, Lydia Guo, Mark Walderhaug, Marco Schito, Matthew Ezewudo, Nuria Guimera, Paul Walsh, Robel Kahsay, Srikanth Gottipati, Timothy C Rodwell, Toby Bloom, Yuching Lai, Vahan Simonyan, Raja Mazumder (2018):
Enabling Precision Medicine via standard communication of NGS provenance, analysis, and results.
PLOS Biology. 16(12):e3000099
https://doi.org/10.1371/journal.pbio.3000099 (bioRxiv:191783)

Pablo Carbonell, Adrian J. Jervis, Christopher J. Robinson, Cunyu Yan, Mark Dunstan, Neil Swainston, Maria Vinaixa, Katherine A. Hollywood, Andrew Currin, Nicholas J. W. Rattray, Sandra Taylor, Reynard Spiess, Rehana Sung, Alan R. Williams, Donal Fellows, Natalie J. Stanford, Paul Mulherin, Rosalind Le Feuvre, Perdita Barran, Royston Goodacre, Nicholas J. Turner, Carole Goble, George Guoqiang Chen, Douglas B. Kell, Jason Micklefield, Rainer Breitling, Eriko Takano, Jean-Loup Faulon, Nigel S. Scrutton (2018):
An automated Design-Build-Test-Learn pipeline for enhanced microbial production of fine chemicals.
Communications Biology 1:66
https://doi.org/10.1038/s42003-018-0076-9

2017

Stephen J Eglen, Ben Marwick, Yaroslav O Halchenko, Michael Hanke, Shoaib Sufi, Padraig Gleeson, R Angus Silver, Andrew P Davison, Linda Lanyon, Mathew Abrams, Thomas Wachtler, David J Willshaw, Christophe Pouzat, Jean-Baptiste Poline (2017):
Toward standard practices for sharing computer code and programs in neuroscience.
Nature Neuroscience 20, 770–773. https://doi.org/10.1038/nn.4550 [bioRxiv preprint]

Steffen Möller, Stuart W. Prescott, Lars Wirzenius; Petter Reinholdtsen, Brad Chapman, Pjotr Prins, Stian Soiland-Reyes, Fabian Klötzl, Andrea Bagnacani, Matúš Kalaš, Andreas Tille, Michael R. Crusoe (2017):
Robust cross-platform workflows: how technical and scientific communities collaborate to develop, test and share best practices for data analysis.
Data Science and Engineering 2:232 pp 232–244.
https://doi.org/10.1007/s41019-017-0050-4