

File-based localization of numerical perturbations in data analysis pipelines.

Author(s): Salari A, Kiar G, Lewis L, Evans AC, Glatard T

Article GUID: 33269388


Title: File-based localization of numerical perturbations in data analysis pipelines.
Authors: Salari A, Kiar G, Lewis L, Evans AC, Glatard T
Link: https://www.ncbi.nlm.nih.gov/pubmed/33269388
DOI: 10.1093/gigascience/giaa106
Category: Gigascience
PMID: 33269388
Dept Affiliation: ENCS
1 Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada.
2 Department of Biomedical Engineering, McGill University, Montreal, QC, Canada.
3 Montreal Neurological Institute, McGill University, Montreal, QC, Canada.

Description:

Gigascience. 2020 Dec 02; 9(12):

Abstract

BACKGROUND: Data analysis pipelines are known to be affected by computational conditions, presumably owing to the creation and propagation of numerical errors. While this process could play a major role in the current reproducibility crisis, the precise causes of such instabilities and the path along which they propagate in pipelines are unclear.

METHOD: We present Spot, a tool to identify which processes in a pipeline create numerical differences when executed in different computational conditions. Spot leverages system-call interception through ReproZip to reconstruct and compare provenance graphs without pipeline instrumentation.
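ILLUSTRATION: The idea of file-based localization can be sketched in a few lines of Python. This is not Spot's code or API. It is a minimal sketch that assumes each process has already been recorded with the files it read and wrote (the kind of information a provenance graph holds), and that the file contents from two computational conditions are available side by side. The names ProcessRecord, localize, and all file and process names are hypothetical. The classification rule is also an assumption made for illustration: a process is taken to create a difference when all of its inputs are identical across conditions but at least one output is not; a process whose inputs already differ is treated as merely propagating a difference.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessRecord:
    """One node of a (hypothetical) provenance graph."""
    name: str
    inputs: tuple   # paths of files the process read
    outputs: tuple  # paths of files the process wrote


def checksum(data: bytes) -> str:
    """Content hash used to decide whether two files are identical."""
    return hashlib.sha256(data).hexdigest()


def localize(processes, files_a, files_b):
    """Return names of processes that create differences between conditions.

    files_a and files_b map each path to its content in condition A and B.
    """
    def differs(path):
        return checksum(files_a[path]) != checksum(files_b[path])

    creators = []
    for proc in processes:
        inputs_identical = not any(differs(p) for p in proc.inputs)
        outputs_differ = any(differs(p) for p in proc.outputs)
        if inputs_identical and outputs_differ:
            creators.append(proc.name)
    return creators


# Toy pipeline with made-up file contents (assumption, not real data).
processes = [
    ProcessRecord("brain_extract", ("t1.nii",), ("brain.nii",)),
    ProcessRecord("linear_reg", ("brain.nii",), ("affine.mat",)),
    ProcessRecord("resample", ("brain.nii", "affine.mat"), ("out.nii",)),
]
files_a = {"t1.nii": b"raw", "brain.nii": b"brain",
           "affine.mat": b"1.0000001", "out.nii": b"x"}
files_b = {"t1.nii": b"raw", "brain.nii": b"brain",
           "affine.mat": b"1.0000002", "out.nii": b"y"}

creators = localize(processes, files_a, files_b)
print(creators)
```

In this toy example only linear_reg is reported: its input is identical in both conditions while its output differs. The resample step also produces a different output, but one of its inputs already differs, so under the stated rule it propagates the perturbation rather than creating it.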

RESULTS: By applying Spot to the structural pre-processing pipelines of the Human Connectome Project, we found that linear and non-linear registration are the cause of most numerical instabilities in these pipelines, which confirms previous findings.

PMID: 33269388 [PubMed - in process]