www-from-model/content/publications/a2de1888-b547-4587-938c-ef9e7ecc8c67/_index.md
2026-04-22 12:30:13 +00:00

title persons topics params
FAIRly big: A framework for computationally reproducible processing of large-scale data
malgorzata-wierzba
michael-hanke
simon-eickhoff
adina-wagner
alex-waite
laura-waite
benjamin-poldrack
research-software-engineering
neuroimaging
research-data-management
distributed-systems
high-throughput-computing
graphRootNodePID pid doi date title description kind author topic
xyzrins:publications/a2de1888-b547-4587-938c-ef9e7ecc8c67 xyzrins:publications/a2de1888-b547-4587-938c-ef9e7ecc8c67 10.1038/s41597-022-01163-2 2022-03-11 FAIRly big: A framework for computationally reproducible processing of large-scale data Large-scale datasets present unique opportunities to perform scientific investigations with unprecedented breadth. However, they also pose considerable challenges for the findability, accessibility, interoperability, and reusability (FAIR) of research outcomes due to infrastructure limitations, data usage constraints, or software license restrictions. Here we introduce a DataLad-based, domain-agnostic framework suitable for reproducible data processing in compliance with open science mandates. The framework attempts to minimize platform idiosyncrasies and performance-related complexities. It affords the capture of machine-actionable computational provenance records that can be used to retrace and verify the origins of research outcomes, as well as be re-executed independent of the original computing infrastructure. We demonstrate the framework's performance using two showcases: one highlighting data sharing and transparency (using the studyforrest.org dataset) and another highlighting scalability (using the largest public brain imaging dataset available: the UK Biobank dataset). bibo:AcademicArticle
pid given_name family_name
xyzrins:persons/malgorzata-wierzba Małgorzata Wierzba
pid given_name family_name
xyzrins:persons/michael-hanke Michael Hanke
pid given_name family_name
xyzrins:persons/simon-eickhoff Simon Eickhoff
pid given_name family_name
xyzrins:persons/adina-wagner Adina Wagner
pid given_name family_name
xyzrins:persons/alex-waite Alex Waite
pid given_name family_name
xyzrins:persons/laura-waite Laura Waite
pid given_name family_name
xyzrins:persons/benjamin-poldrack Benjamin Poldrack
pid display_label
xyzrins:topics/research-software-engineering Research software engineering (RSE)
pid display_label
xyzrins:topics/neuroimaging Neuroimaging
pid display_label
xyzrins:topics/research-data-management Research data management (RDM)
pid display_label
xyzrins:topics/distributed-systems Distributed systems
pid display_label
xyzrins:topics/high-throughput-computing High-throughput computing