www-from-model/content/publications/a2de1888-b547-4587-938c-ef9e7ecc8c67/_index.md
2026-04-22 12:30:13 +00:00

---
title: 'FAIRly big: A framework for computationally reproducible processing of large-scale
  data'
persons:
- malgorzata-wierzba
- michael-hanke
- simon-eickhoff
- adina-wagner
- alex-waite
- laura-waite
- benjamin-poldrack
topics:
- research-software-engineering
- neuroimaging
- research-data-management
- distributed-systems
- high-throughput-computing
params:
  graphRootNodePID: xyzrins:publications/a2de1888-b547-4587-938c-ef9e7ecc8c67
  pid: xyzrins:publications/a2de1888-b547-4587-938c-ef9e7ecc8c67
  doi: 10.1038/s41597-022-01163-2
  date: '2022-03-11'
  title: 'FAIRly big: A framework for computationally reproducible processing of large-scale
    data'
  description: "Large-scale datasets present unique opportunities to perform scientific\
    \ investigations with unprecedented breadth. However, they also pose considerable\
    \ challenges for the findability, accessibility, interoperability, and reusability\
    \ (FAIR) of research outcomes due to infrastructure limitations, data usage constraints,\
    \ or software license restrictions. Here we introduce a DataLad-based, domain-agnostic\
    \ framework suitable for reproducible data processing in compliance with open science\
    \ mandates. The framework attempts to minimize platform idiosyncrasies and performance-related\
    \ complexities. It affords the capture of machine-actionable computational provenance\
    \ records that can be used to retrace and verify the origins of research outcomes,\
    \ as well as be re-executed independent of the original computing infrastructure.\
    \ We demonstrate the framework\u2019s performance using two showcases: one highlighting\
    \ data sharing and transparency (using the studyforrest.org dataset) and another\
    \ highlighting scalability (using the largest public brain imaging dataset available:\
    \ the UK Biobank dataset)."
  kind: bibo:AcademicArticle
  author:
  - pid: xyzrins:persons/malgorzata-wierzba
    given_name: "Ma\u0142gorzata"
    family_name: Wierzba
  - pid: xyzrins:persons/michael-hanke
    given_name: Michael
    family_name: Hanke
  - pid: xyzrins:persons/simon-eickhoff
    given_name: Simon
    family_name: Eickhoff
  - pid: xyzrins:persons/adina-wagner
    given_name: Adina
    family_name: Wagner
  - pid: xyzrins:persons/alex-waite
    given_name: Alex
    family_name: Waite
  - pid: xyzrins:persons/laura-waite
    given_name: Laura
    family_name: Waite
  - pid: xyzrins:persons/benjamin-poldrack
    given_name: Benjamin
    family_name: Poldrack
  topic:
  - pid: xyzrins:topics/research-software-engineering
    display_label: Research software engineering (RSE)
  - pid: xyzrins:topics/neuroimaging
    display_label: Neuroimaging
  - pid: xyzrins:topics/research-data-management
    display_label: Research data management (RDM)
  - pid: xyzrins:topics/distributed-systems
    display_label: Distributed systems
  - pid: xyzrins:topics/high-throughput-computing
    display_label: High-throughput computing
---