<!doctype html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no">
<!-- Edit me start! -->
<title>Open research software infrastructure in Neuro-Medicine</title>
<meta name="description" content=" Open research software infrastructure in Neuro-Medicine ">
<meta name="author" content=" Adina Wagner ">
<!-- Edit me end! -->
<link rel="stylesheet" href="../reveal.js/dist/reset.css">
<link rel="stylesheet" href="../reveal.js/dist/reveal.css">
<link rel="stylesheet" href="../reveal.js/dist/theme/beige.css">
<link rel="stylesheet" href="../css/main.css">
<!-- Theme used for syntax highlighted code -->
<link rel="stylesheet" href="../reveal.js/plugin/highlight/monokai.css">
</head>
<body>
<div class="reveal">
<div class="slides">
<!--...Datalad Basics...-->
<section>
<section>
<h2>Open research software infrastructure <br>
in Neuro-Medicine</h2>
<div style="margin-top:1em;text-align:center">
<table style="border: none;">
<tr>
<td style="border: none;">Adina Wagner
<br><small>
<a href="https://mas.to/@adswa" target="_blank">
<img data-src="../pics/mastodon.svg" style="height:30px;margin:0px" />
mas.to/@adswa</a></small></td>
<td style="border: none;">
<br></td>
</tr>
<tr>
<td style="border: none; vertical-align:top">
<small><a href="http://psychoinformatics.de" target="_blank">Psychoinformatics lab</a>,
<br> Institute of Neuroscience and
Medicine, Brain &amp; Behavior (INM-7)<br>
Research Center Jülich</small><br>
</td>
<td><img style="height:100px;margin-right:10px" data-src="../pics/fzj_logo.png" /></td>
</tr>
</table>
</div>
<p style="z-index: 100;position: fixed;background-color:#ede6d5;font-size:35px;box-shadow: 10px 10px 8px #888888;margin-top:0px;margin-bottom:100px;margin-left:1000px">
<img src="../pics/qr_zi.png" height="200">
</p>
<br><br><small>
Slides: <a href="https://doi.org/10.5281/zenodo.10149349" target="_blank">
DOI 10.5281/zenodo.10149349</a> (Scan the QR code) <br>
<a href="https://files.inm7.de/adina/talks/html/zimannheim.html"
target="_blank">files.inm7.de/adina/talks/html/zimannheim.html</a>
</small>
</section>
</section>
<!--- INM-7 --->
<section>
<section>
<h2>Open Science and Open Software go hand in hand</h2>
<ul>
<li class="fragment">Science has <b>specific requirements</b>;
research software from within science ("from scientists,
for scientists") can fulfill them. Open formats, protocols, and
code allow re-use, interoperability, and customization.</li> <br>
<li class="fragment">Open and reproducible science has specific needs for
<b>transparency</b> and <b>accessibility</b>: Open source software provides the necessary auditability.</li><br>
<li class="fragment">Creating software becomes
<b>increasingly possible for scientists</b>:
The San Francisco Declaration on Research Assessment
(<a href="https://sfdora.org" target="_blank">DORA</a>; signed by FZJ),
the Agreement on Reforming Research Assessment
(<a href="https://coara.eu" target="_blank">CoARA</a>),
and the DFG recognize software as academic output.</li>
</ul>
</section>
<section data-transition="None">
<h2>The Institute of Neuroscience and Medicine (INM-7)</h2>
<div class="r-stack">
<img src="../pics/inm7-homepage.png">
</div>
</section>
<section data-transition="None">
<h2>The Institute of Neuroscience and Medicine (INM-7)</h2>
<ul>
<li>Interdisciplinary institute with 11 research groups</li>
<li>Research foci:</li>
<ul>
<li>Infrastructure and method development: Digital biomarkers,
machine learning, meta-analysis,
research data management</li>
<li>Basic research in human brain mapping: Connectomics, genetic gradients,
in-vivo brain mapping, multimodal integration</li>
<li>AI Applications in medical research: Cognition,
Personality, Aging & neurodegenerative disease, Schizophrenia</li>
<li>Ethical implications of medical AI: Bias in AI applications, medical AI and society,
individualized predictions</li>
</ul>
</ul>
</section>
<section>
<h2>Software @ INM-7</h2>
<ul>
<li>The institute has a history of open source software, starting with the
<a href="https://github.com/inm7/jubrain-anatomy-toolbox" target="_blank">
SPM Anatomy Toolbox (Eickhoff, 2005)</a></li>
<li>Multiple groups develop and maintain open source research software for their
respective subdomains</li>
<li>Recent integration efforts connect our open software
stack to open research software infrastructure for neuro-medicine</li>
</ul>
</section>
</section>
<!--- DATALAD --->
<section>
<section data-transition="None">
<img style="height:150px;margin-bottom:30px" data-src="../pics/datalad_logo_wide.svg">
<br>
<ul style="font-size:37px">
<li class="fragment fade-in-then-semi-out" data-fragment-index="1">Domain-agnostic data management tool
(<strong>command-line</strong> + <strong>graphical user interface</strong>),
built on top of <a href="https://git-scm.com/" target="_blank">Git</a>
& <a href="https://git-annex.branchable.com/" target="_blank">Git-annex</a></li>
<li class="fragment fade-in-then-semi-out" data-fragment-index="2">10+ year open source project (100+ contributors), available for all major OS</li>
<li class="fragment" data-fragment-index="3">Born from rethinking data:</li>
<ul>
<li class="fragment" data-fragment-index="4">Just like code, <b>data is not static</b>.</li>
<li class="fragment" data-fragment-index="4">Just like code, <b>data is subject to collaboration</b>.
Streamlined workflows for sharing and collaborating should be possible, mirroring those in software development. </li>
<li class="fragment" data-fragment-index="4"><b>Provenance</b> of data is essential for reproducible, trustworthy, and FAIR science</li>
<li class="fragment" data-fragment-index="4">Flexibility and <b>interoperability with existing tools</b> is the key to sustainability and ease of use</li>
</ul>
</ul>
</section>
<section data-transition="None">
<img style="height:150px;margin-bottom:30px" data-src="../pics/datalad_logo_wide.svg"><br>
<ul style="font-size:37px">
<li>Domain-agnostic <strong>command-line tool</strong> (+ <strong>graphical user interface</strong>),
built on top of <a href="https://git-scm.com/" target="_blank">Git</a>
& <a href="https://git-annex.branchable.com/" target="_blank">Git-annex</a></li>
<li>10+ year open source project (100+ contributors), available for all major OS</li>
<li>Major features:</li>
<dt>Version-controlling arbitrarily large content </dt>
<dd>Version control data & software alongside your code!</dd>
<dt>Transport mechanisms for sharing, updating & obtaining data </dt>
<dd>Consume & collaborate on data (analyses) like software</dd>
<dt>(Computationally) reproducible data analysis</dt>
<dd>Track and share provenance of all digital objects</dd>
<dt>(... and <i>much</i> more) </dt>
<br>
</ul>
</section>
<section data-markdown data-transition="none"><script type="text/template">
## Exhaustive tracking of research components
![](../pics/vamp_0_start.png)<!-- .element: width="100%" -->
Well-structured datasets (using community standards), and portable computational environments &mdash; and their evolution &mdash; are the precondition for reproducibility
<table width=100% style="padding:0px">
<tr><td style="padding:0px">
<code><pre>
# turn any directory into a dataset
# with version control
% datalad create &lt;directory&gt;
</pre></code>
</td><td style="padding:0px">
<code><pre>
# save a new state of a dataset with
# file content of any size
% datalad save
</pre></code>
</td></tr></table>
Note:
- link to prev. statements on description standards
- your community could be really small (your lab); when data are precious, resources
will be spent to understand them, but information must be captured to make this possible
</script></section>
<section data-markdown data-transition="none"><script type="text/template">
## Capture computational provenance
![](../pics/vamp_1_provcapture.png)<!-- .element: width="100%" -->
Which data was needed at which version, as input into which code, running with what parameterization in which
computational environment, to generate an outcome?
<table width=100% style="padding:0px">
<tr><td style="padding:0px">
<code><pre>
# execute any command and capture its output
# while recording all input versions too
% datalad run --input ... --output ... &lt;command&gt;
</pre></code>
</td></tr></table>
Note:
The missing link: even when everything is shared, we still don't know how to start.
README is minimum, but executable prov-records are much better.
</script></section>
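<section data-markdown data-transition="none"><script type="text/template">
## Provenance capture, sketched

To make the idea behind `datalad run` and `datalad rerun` more concrete, here is a minimal Python sketch using only the standard library. It is not DataLad's implementation: the helper names (`run_and_record`, `rerun_matches`), the record layout, and the file names are assumptions made for illustration. The sketch records a command together with checksums of its inputs and outputs, then re-executes it and checks that nothing changed.

```python
import hashlib
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def sha256(path):
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_and_record(cmd, inputs, outputs, cwd):
    """Run cmd in cwd; return the command plus input and output checksums."""
    cwd = Path(cwd)
    record = {
        "cmd": cmd,
        "inputs": {name: sha256(cwd / name) for name in inputs},
    }
    subprocess.run(cmd, cwd=cwd, check=True)
    record["outputs"] = {name: sha256(cwd / name) for name in outputs}
    return record


def rerun_matches(record, cwd):
    """Re-execute a recorded command; True if inputs and outputs are unchanged."""
    fresh = run_and_record(
        record["cmd"], list(record["inputs"]), list(record["outputs"]), cwd
    )
    return fresh == record


with tempfile.TemporaryDirectory() as workdir:
    Path(workdir, "raw.txt").write_text("3 1 2")
    # Hypothetical analysis step: sort the values in raw.txt into sorted.txt
    script = (
        "from pathlib import Path; "
        "values = sorted(Path('raw.txt').read_text().split()); "
        "Path('sorted.txt').write_text(' '.join(values))"
    )
    cmd = [sys.executable, "-c", script]
    record = run_and_record(cmd, ["raw.txt"], ["sorted.txt"], workdir)
    print(json.dumps(record["outputs"], indent=2))
    reproduced = rerun_matches(record, workdir)
    print("reproduced:", reproduced)
```

A record like this is machine-actionable: anyone holding the same inputs can re-execute the command and compare checksums instead of following a README by hand.
</script></section>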
<section data-markdown data-transition="none"><script type="text/template">
## Exhaustive capture enables portability
![](../pics/vamp_2_pushtocloud.png)<!-- .element: width="100%" -->
Precise identification of data and computational environments
combined with provenance records form a comprehensive and portable
data structure, capturing all aspects of an investigation.
<table width=100% style="padding:0px">
<tr><td style="padding:0px">
<code><pre>
# transfer data and metadata to other sites and services
# with fine-grained access control for dataset components
% datalad push --to &lt;site-or-service&gt;
</pre></code>
</td></tr></table>
Note:
Does it fly? Can you give it to someone? Or can you take it with you to your new lab?
</script></section>
<section data-markdown data-transition="none"><script type="text/template">
## Reproducibility strengthens trust
![](../pics/vamp_3_reproduce.png)<!-- .element: width="100%" -->
Outcomes of computational transformations can be validated by authorized 3rd-parties. This enables audits, promotes accountability, and streamlines automated "upgrades" of outputs
<table width=100% style="padding:0px">
<tr><td style="padding:0px">
<code><pre>
# obtain dataset (initially only identity,
# availability, and provenance metadata)
% datalad clone &lt;url&gt;
</pre></code>
</td><td style="padding:0px">
<code><pre>
# immediately actionable provenance records
# full abstraction of input data retrieval
% datalad rerun &lt;commit|tag|range&gt;
</pre></code>
</td></tr></table>
Note:
Goal is automated reproducibility, enables assessment of robustness and benchmarking algorithmic developments
</script></section>
<section data-markdown data-transition="none"><script type="text/template">
## Ultimate goal: (re-)usability
![](../pics/vamp_4_reuse.png)<!-- .element: width="100%" -->
Verifiable, portable, self-contained data structures that track all aspects of an investigation exhaustively can be (re-)used as modular components in larger contexts &mdash; propagating their traits
<table width=100% style="padding:0px">
<tr><td style="padding:0px">
<code><pre>
# declare a dependency on another dataset and
# re-use it at a particular state in a new context
% datalad clone -d &lt;superdataset&gt; &lt;url&gt; &lt;path-in-dataset&gt;
</pre></code>
</td></tr></table>
Note:
With these in place, re-usability is a small(er) step
</script>
</section>
<section>
<h3>DataLad use cases</h3>
<div class="r-stack">
<li data-fragment-index="1" class="fragment fade-in-then-out"> <b>Publish or consume datasets</b>
via GitHub, GitLab, OSF, the European Open Science Cloud, or similar services</li>
<li data-fragment-index="2" class="fragment fade-in-then-out">
Behind-the-scenes <b>infrastructure component for data transport and versioning</b>
(e.g., used by <a href="https://openneuro.org/" target="_blank"> OpenNeuro</a>,
<a href="https://brainlife.io/" target="_blank"> brainlife.io </a>,
the <a href="https://conp.ca/" target="_blank">Canadian Open Neuroscience Platform (CONP)</a>,
<a href="https://mcin.ca/technology/cbrain/" target="_blank"> CBRAIN</a>)</li>
<li data-fragment-index="3" class="fragment fade-in-then-out"><b>Central data management</b> and archival system</li>
<li data-fragment-index="4" class="fragment fade-in-then-out"><b>Decentral data and metadata catalog</b></li>
<li data-fragment-index="5" class="fragment fade-in-then-out"> <b>Creating and sharing reproducible, open science</b>: Sharing data, software, code, and provenance </li>
<li data-fragment-index="6" class="fragment fade-in-then-out"> <b>Reproducibility at the largest scale</b>: Framework for computationally reproducible processing </li>
</div>
<div class="r-stack">
<img data-fragment-index="1" height="700" class="fragment fade-in-then-out" src="../pics/getdata_studyforrest.gif" alt="a screenrecording of cloning studyforrest data from github">
<img height="700" class="fragment fade-in-then-out" data-fragment-index="2" src="../pics/openneuro_new_2.gif" alt="a screenrecording of browsing open neuro">
<img height="700" data-fragment-index="3" class="fragment fade-in-then-out" src="../pics/centralmanagement2.gif">
<img height="1000" data-fragment-index="4" class="fragment fade-in-then-out" src="../pics/sfb-catalog.gif">
<img height="700" class="fragment fade-in" data-fragment-index="5" src="../pics/remodnavpaper_2.gif" alt="a screenrecording of cloning REMODNAV paper dataset from github">
<img height="900" class="fragment fade-in" data-fragment-index="6" src="../pics/fairly-big-paper.png" alt="a screenshot of the FAIRly big paper">
</div>
</section>
<section>
<h2>Acknowledgements</h2>
<table>
<tr style="vertical-align:middle">
<td style="vertical-align:middle">
<dl>
<dt style="margin-top:20px">DataLad software <br>
& ecosystem</dt>
<dd style="margin-left:5px!important">
<ul style="margin-left:5px!important">
<li>Psychoinformatics Lab, <br>
Research Center Jülich</li>
<li>Center for Open <br>
Neuroscience, <br>
Dartmouth College</li>
<li>Joey Hess (git-annex)</li>
<li><em>>100 additional contributors</em></li>
</ul>
</dd>
<dt style="margin-top:20px">DataLad Office Hour </dt>
<dd style="margin-left:5px!important">
<ul style="margin-left:5px!important">
Every Tuesday, 4pm. <br>Join the <a href="https://matrix.to/#/!NaMjKIhMXhSicFdxAj:matrix.org?via=matrix.waite.eu&via=matrix.org&via=inm7.de" target="_blank">
Matrix Chatroom!
</a>
</ul></dd>
</td>
<td style="vertical-align:middle">
<div style="margin-bottom:-20px;text-align:center"><strong>Funders</strong></div>
<img style="height:150px;margin-right:50px" data-src="../pics/nsf.png" />
<img style="height:150px;margin-right:50px;margin-left:50px" data-src="../pics/binc.png" />
<img style="height:150px;margin-left:50px" data-src="../pics/bmbf.png" />
<div style="margin-top:-20px">
<img style="height:80px;margin-top:-40px;margin-left:40px" data-src="../pics/fzj_logo.svg" />
<img style="height:60px;margin-left:50px;margin-bottom:25px" data-src="../pics/dfg_logo.png" />
</div>
<div style="margin-top:-20px">
<img style="height:60px;margin-right:20px" data-src="../pics/erdf.png" />
<img style="height:60px;margin-right:20px" data-src="../pics/cbbs_logo.png" />
<img style="height:60px" data-src="../pics/LSA-Logo.png" />
</div>
<div style="margin-top:40px;margin-bottom:20px;text-align:center"><strong>Collaborators</strong></div>
<div style="margin-top:-20px">
<img style="height:100px;margin:20px" data-src="../pics/hbp_logo.png" />
<img style="height:100px;margin:20px" data-src="../pics/conp_logo.png" />
<img style="height:120px;margin:10px" data-src="../pics/openneuro_logo.png" />
</div>
<div style="margin-top:-40px">
<img style="height:100px;margin:20px" data-src="../pics/ebrains-logo.png"/>
<img style="height:100px;margin:0px" data-src="../pics/gin-logo.png" />
<img style="height:120px;margin:10px" data-src="../pics/sfb1451_logo.png" />
</div>
<div style="margin-top:-40px;align:middle">
<img style="height:140px;margin:10px" data-src="../pics/brainlife_logo.png" />
<img style="height:100px;margin:0px" data-src="../pics/cbrain_logo.png" />
<img style="height:100px;margin:20px" data-src="../pics/vbc_logo.png" />
</div>
</td>
</tr>
</table>
</section>
</section>
<!-------- JTRACK
/login/?next=/
https://preprints.jmir.org/preprint/51689
https://www.mdpi.com/1424-8220/22/13/4975
https://jtrack.readthedocs.io/en/latest/JTrack_Dashboard.html
https://www.frontiersin.org/articles/10.3389/fpubh.2021.763621/full
----------->
<section>
<section>
<h2>JTrack: Digital biomarkers from your smartphone</h2>
<ul>
<li><b>Objective</b>: Close monitoring of patients/participants in non-clinical settings</li>
<li>Modern smartphones contain a variety of sensors
for passive monitoring and active acquisition:</li>
<table>
<tr>
<td style="vertical-align:top">
<div class="r-stack">
<img src="../pics/jtrack/jtrack1.png">
</div>
</td>
<td>
<ul style="vertical-align:center">
<li>gyroscope</li>
<li>accelerometer</li>
<li>location</li>
<li>human activity recognition</li>
<li>application usage</li>
<li>screen time</li>
<li>microphone</li>
</ul>
</td>
</tr>
</table>
</ul>
</section>
<section data-transition="None">
<h2>JTrack</h2>
<img src="../pics/jtrack/hospital.png">
<ul>Flexible components for different users:
<li><b>JTrack Social</b>: Smartphone app for participants</li>
<li><b>JTrack EMA</b>: Smartphone app for participants</li>
<li><b>JDash</b>: Monitoring and analytics tool for study owners</li>
</ul>
</section>
<section data-transition="None">
<h2>JTrack components: JTrack Social</h2>
<ul>
Smartphone App for active labeling and passive monitoring
<table>
<tr>
<td style="vertical-align:top">
<ul><br>
<li>Sensor data (passive <br>
collection mode default: <br>
Accelerometer & Gyroscope) </li>
<li>Application usage statistics</li>
<li>Human activity recognition <br>
(e.g., walking, running, driving)</li>
<li>Location information (anonymized)</li>
<li>Active recording, e.g., free-speech <br>
generation tasks</li>
</ul>
</td>
<td >
<div class="r-stack">
<img class="fragment fade-out" height="600px" src="../pics/jtrack/jtrack_social_3.png">
<img height="600px" src="../pics/jtrack/jtrack_social_1.png">
</div>
</td>
</tr>
</table>
</ul>
<small>Available for Android + iPhone</small>
</section>
<section data-transition="None">
<h2>JTrack components: JTrack Social</h2>
<img src="../pics/jtrack/image7.gif" height="600px">
<img src="../pics/jtrack/image9.png">
</section>
<section>
<h2>JTrack components: JTrack EMA</h2>
<ul>
Smartphone App for Ecological Momentary Assessment
<table>
<tr>
<td style="vertical-align:top">
<ul><br>
<li>Binary Questions</li>
<li>Date and Time <br>
Questions</li>
<li>Sliding Questions</li>
<li>Multiple/Single <br>
choice questions</li>
</ul>
</td>
<td >
<div class="r-stack">
<img class="fragment fade-in-then-out" height="600px" src="../pics/jtrack/jtrack_ema_1.png">
<img class="fragment fade-in-then-out" height="600px" src="../pics/jtrack/jtrack_ema_2.png">
<img class="fragment fade-in-then-out" height="600px" src="../pics/jtrack/jtrack_ema_3.png">
<img class="fragment fade-in-then-out" height="600px" src="../pics/jtrack/jtrack_ema_4.png">
<img class="fragment fade-in-then-out" height="600px" src="../pics/jtrack/jtrack_ema_5.png">
</div>
</td>
</tr>
</table>
</ul>
<small>Available for Android + iPhone</small>
</section>
<section>
<h2>JTrack components: JDash</h2>
<ul>
Dashboard for Study Administration
<table>
<tr>
<td style="vertical-align:top">
<ul><br>
<li>Investigator's <br>
study & user <br>
management</li>
<li>Data Quality <br>
Control</li>
<li>Notification <br>
Center</li>
</ul>
</td>
<td >
<div class="r-stack">
<img class="fragment fade-in-then-out" height="600px" src="../pics/jtrack/jtrack_jdash_desktop.png">
<img class="fragment fade-in-then-out" height="500px" src="../pics/jtrack/jdash_overview2.png">
</div>
</td>
</tr>
</table>
</ul>
</section>
<section>
<h2>Behind the scenes</h2>
<img src="../pics/jtrack/jtrack_backend.png">
<ul>
<li>Servers in Germany</li>
<li>Data versioning via DataLad</li>
<li>Authenticated data access via JDash</li>
<li>Data transfer via HTTPS</li>
</ul>
</section>
<section data-transition="None">
<h2>Participant's / Investigator's point of view</h2>
<table>
<tr>
<td style="vertical-align:top">
<ul>
<li class="fragment" data-fragment-index="1">Install JTrack</li>
<li class="fragment" data-fragment-index="2">Scan QR code and give permissions to App</li>
<li class="fragment" data-fragment-index="3">Resume daily life</li>
<div class="r-stack">
<img class="fragment fade-in-then-out" data-fragment-index="1" height="600px" src="../pics/jtrack/jtrack_social.png">
<img style="horizontal-align:center" class="fragment fade-in-then-out" data-fragment-index="2" src="../pics/jtrack/jtrack_qr.png">
</div>
</ul>
</td>
<td style="vertical-align:top">
<ul>
<li class="fragment" data-fragment-index="1">Install JTrack with Participant</li>
<li class="fragment" data-fragment-index="2">Provide study- and subject-specific QR code from JDash</li>
<li class="fragment" data-fragment-index="3">Monitor study and communicate with participants via JDash</li>
<div class="r-stack">
<img class="fragment fade-in-then-out" data-fragment-index="1" height="600px" src="../pics/jtrack/jtrack_social.png">
<img class="fragment fade-in-then-out" height="400px" style="horizontal-align:center" data-fragment-index="2" src="../pics/jtrack/jtrack_jdash_tablet.png">
<img style="horizontal-align:center" data-fragment-index="3" class="fragment" src="../pics/jtrack/jtrack_jdash_monitor.png">
</div>
</ul>
</td>
</tr>
</table>
</section>
<section>
<h2>Advantages</h2>
<ul>
<li>Easy-to-deploy, free environment for collecting
real-world data (RWD)</li>
<li>Standardized data collection across centers</li>
<li>High-density longitudinal data with fully customizable data
collection</li>
<li>Opportunity for citizen science</li>
</ul>
</section>
<section>
<h2>Acknowledgements</h2>
<table>
<tr>
<td>
<ul>
<b>Publications</b>:
<li>JTrack Social: <a href="https://doi.org/10.3389/fpubh.2021.763621" target="_blank">Sahandi Far et al. 2021</a></li>
<li>JTrack EMA: <a href="https://preprints.jmir.org/preprint/51689" target="_blank">Sahandi Far et al. 2023</a></li>
<br>
<b>Contact</b>:<br>
<li>via JDash: <a href="https://jdash.inm7.de/login/?next=/" target="_blank">jdash.inm7.de</a></li>
<br>
<li><b>JTrack Hour</b> (open to everyone)<br>
Every second Tuesday at 1PM (odd-numbered calendar weeks) - <br>
Join the <a href="https://fz-juelich-de.zoom.us/j/65477322046?pwd=K1JPbG0wSGdTL0lIV1h4WjVtc25MQT09" target="_blank">
Zoom meeting</a>!</li>
</ul>
</td>
<td style="vertical-align: top">
<b>Team</b>:
<ul>
<img src="../pics/jtrack/jtrack_team.png">
</ul>
</td>
</tr>
</table>
<ul>
</ul>
</section>
</section>
<!-------- JULEARN ---------->
<section>
<section>
<img src="../pics/julearn/julearn_logo.png">
<ul>
<li>Open source Python library for easy-to-use ML pipelines, built upon scikit-learn</li>
<li>Domain-general, but aims to simplify entry into ML for domain scientists with built-in
guarantees against most common pitfalls:</li>
<ul>
<li>Data leakage</li>
<li>Overfitting of hyperparameters</li>
</ul>
</ul>
</section>
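<section data-markdown><script type="text/template">
## Data leakage, sketched

As a small illustration of the data leakage pitfall named above, the following standard-library Python sketch compares two ways of z-scoring a held-out value. It is not julearn code: the helper names (`fit_zscore`, `apply_zscore`) and the toy numbers are assumptions for illustration. Estimating the scaling parameters on training and test data together lets the test value shape its own preprocessing; estimating them on the training fold only avoids that.

```python
from statistics import mean, pstdev


def fit_zscore(values):
    """Estimate center and scale from the given data."""
    return mean(values), pstdev(values)


def apply_zscore(values, params):
    """Scale values with previously fitted parameters."""
    center, scale = params
    return [(v - center) / scale for v in values]


train = [1.0, 2.0, 3.0, 4.0]
test = [10.0]

# Leaky: parameters estimated on train + test together
leaky = apply_zscore(test, fit_zscore(train + test))

# Leakage-free: parameters estimated on the training fold, then applied to the test fold
clean = apply_zscore(test, fit_zscore(train))

print("leaky test z-score:", round(leaky[0], 2))
print("clean test z-score:", round(clean[0], 2))
```

In the leaky variant the unusual test value looks far less extreme than it is, because it has already pulled the mean and spread towards itself.
</script></section>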
<section>
<ul>
<table>
<tr>
<td class="fragment" data-fragment-index="1">
<b>The problem</b>: Expensive AI mistakes
</td>
<td style="vertical-align:top" class="fragment" data-fragment-index="2">
<b>A solution</b>: User-friendly <br>solutions to common <br>complex use cases
</td>
</tr>
<tr>
<td class="fragment" data-fragment-index="1">
<img src="../pics/julearn/nature.png" >
</td>
<td style="vertical-align:top" class="fragment" data-fragment-index="2">
<ul style="font-size:35px">
<li>Simplifies common use cases <br>
for supervised ML pipelines, <br>
with features such as:</li>
<ul>
<li>Automatic usage of nested cross-<br>validation
for proper evaluation in hyperparameter tuning </li>
<li>Preprocessing based on feature <br>
types, incl. confound removal</li>
<li>Built-in visualization for model <br>
inspection and comparison</li>
</ul>
<li>Plug-and-play with scikit-learn transformers</li>
</ul>
</td>
</tr>
</table>
</ul>
</section>
<section>
<h2>Visualization</h2>
Interactive "Scores Viewer" for easier model comparison
<img src="../pics/julearn/scores_viewer.png" height="800px">
</section>
<section data-transition="None">
<h2>Julearn vs scikit-learn</h2>
Simple CV pipeline
<pre style="margin-left: 0;"><code data-trim class="language-python" >from julearn import run_cross_validation
run_cross_validation(
X=X, y=y, data=data,
preprocess=["zscore"], model="svm",
problem_type="classification",
X_types={"continuous": X}, # X_types optional here
)</code></pre>
<pre style="margin-left: 0;"><code data-trim class="language-python" >from sklearn.model_selection import cross_validate
from sklearn.svm import SVC # SVR in case of regression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
pipeline = make_pipeline(StandardScaler(), SVC())
cross_validate(X=data.loc[:,X], y=data.loc[:,y], estimator=pipeline)</code></pre>
</section>
<section data-transition="None">
<h2>Julearn vs scikit-learn</h2>
Nested CV with hyperparameter tuning
<pre style="margin-left: 0;"><code data-trim class="language-python" >from julearn import run_cross_validation, PipelineCreator
creator=PipelineCreator(problem_type="classification")
creator.add("zscore", with_mean=[True, False])
creator.add("pca", n_components=2)
creator.add("svm", C=[1,2], degree=[3,4])
# X_types optional
run_cross_validation(
X=X, y=y, data=data, model=creator, X_types={"continuous": X})</code></pre>
<pre style="margin-left: 0;"><code data-trim class="language-python" >from sklearn.model_selection import cross_validate, GridSearchCV
from sklearn.svm import SVC # SVR in case of regression
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
pipeline = make_pipeline(StandardScaler(), PCA(), SVC())
param_grid = {
"standardscaler__with_mean": [True, False],
"pca__n_components": [2],
"svc__C": [1,2],
"svc__degree": [3, 4]
}
grid_pipeline = GridSearchCV(estimator=pipeline, param_grid=param_grid)
cross_validate(X=data.loc[:,X], y=data.loc[:,y], estimator=grid_pipeline)</code></pre>
</section>
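<section data-markdown><script type="text/template">
## Nested cross-validation, sketched

The two snippets on the previous slide both rely on nested cross-validation. To show what that means without any library, here is a minimal standard-library Python sketch. It is neither julearn's nor scikit-learn's implementation: the toy model (a shrunken training mean), the helper names, and the data are assumptions for illustration. The hyperparameter is selected on inner folds of the outer training data only, and the tuned model is scored on outer test folds that played no part in the selection.

```python
def k_fold(indices, k):
    """Split a list of indices into k (train, test) pairs."""
    folds = [indices[i::k] for i in range(k)]
    return [
        ([idx for j, fold in enumerate(folds) if j != i for idx in fold], folds[i])
        for i in range(k)
    ]


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_predict(y_train, n_test, shrink):
    """Toy model: predict the shrunken training mean for every test sample."""
    prediction = shrink * sum(y_train) / len(y_train)
    return [prediction] * n_test


def nested_cv(y, grid, outer_k=3, inner_k=2):
    """Tune `shrink` on inner folds only; score the tuned model on outer test folds."""
    outer_scores, chosen = [], []
    for outer_train, outer_test in k_fold(list(range(len(y))), outer_k):

        def inner_score(shrink):
            scores = []
            for inner_train, inner_val in k_fold(outer_train, inner_k):
                pred = fit_predict([y[i] for i in inner_train], len(inner_val), shrink)
                scores.append(mse([y[i] for i in inner_val], pred))
            return sum(scores) / len(scores)

        best = min(grid, key=inner_score)  # selection never sees outer_test
        pred = fit_predict([y[i] for i in outer_train], len(outer_test), best)
        outer_scores.append(mse([y[i] for i in outer_test], pred))
        chosen.append(best)
    return outer_scores, chosen


y = [2.0, 2.2, 1.9, 2.1, 2.0, 1.8, 2.3, 2.0, 1.9]
scores, chosen = nested_cv(y, grid=[0.5, 1.0])
print("outer-fold MSE:", [round(s, 3) for s in scores])
print("selected shrink per outer fold:", chosen)
```

Scoring the tuned model on the same folds that selected the hyperparameter would give an optimistic estimate; keeping the outer folds untouched is what the nesting buys.
</script></section>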
<section>
<h2>Acknowledgements</h2>
<br><br>
<ul>
<li>Preprint: <a href="https://arxiv.org/pdf/2310.12568v1.pdf" target="_blank">
Hamdan et al., 2023</a></li>
<li>Documentation: <a href="https://juaml.github.io/julearn/" target="_blank">juaml.github.io/julearn</a> </li>
<li>Source Code: <a href="https://github.com/juaml/julearn" target="_blank">github.com/juaml/julearn</a> </li>
</ul>
<br><br>
<table>
<tr>
<td>
<img src="../pics/julearn/fede.png" height="130px">
<small>Fede Raimondo</small>
</td>
<td>
<img src="../pics/julearn/sami.png" height="130px">
<small>Sami Hamdan</small>
</td>
<td>
<img src="../pics/julearn/kaustubh.png" height="130px">
<small>Kaustubh Patil</small>
</td>
<td>
<img src="../pics/julearn/shammi.png" height="130px">
<small>Shammi More</small>
</td>
<td>
<img src="../pics/julearn/vera.png" height="130px">
<small>Vera Komeyer</small>
</td>
<td>
<img src="../pics/julearn/synchon.png" height="130px">
<small>Synchon Mandal</small>
</td>
<td>
<img src="../pics/julearn/leo.png" height="130px">
<small>Leonard Sasse</small>
</td>
</tr>
</table>
<br>
<div>
<b>ML Hours</b> (open to everyone)<br>
Consultancy on Machine-Learning, every second Thursday, 2-4pm<br>
Chat: <a href="https://matrix.to/#/#ml:inm7.de" target="_blank">
https://matrix.to/#/#ml:inm7.de</a>
</div>
</section>
</section>
<!------- ABCD-J --------->
<section>
<section>
<h2>Even better together</h2>
<iframe src="https://giphy.com/embed/26gR2f01UTynjCPNS" width="680" height="560" frameBorder="0" class="giphy-embed" allowFullScreen></iframe>
</section>
<section>
<h2>The ABCD-J platform</h2>
<h4>An open source platform for digital biomarkers for neuro-medicine in NRW</h4>
<br>
<ul>
A collaboration between clinical, academic, and industry partners:
<table>
<tr>
<td style="vertical-align:top; font-size:30px">
<br>
Research Center Jülich
<br>
<br>
RWTH Aachen <br>
University Bonn <br>
University Cologne <br>
HHU Düsseldorf <br>
<br>
LVR Clinics <br>
DZNE <br>
<br>
PeakProfiling <br>
CanControl<br>
IXP<br>
<br>
(open to future additions)
</td>
<td style="vertical-align:middle">
<img src="../pics/abcdj/logo.png">
</td>
</tr>
</table>
</ul>
</section>
<section>
<h2>Goals</h2>
<ul>
<li><b>Social:</b> Promote and facilitate collaboration between
multiple centers </li>
<li><b>Technical:</b> Accelerate research through homogenization
of workflows and processes, with emphasis on digital biomarker
development; Elevate existing open technical solutions for
research practice adoption</li>
</ul>
</section>
<section>
<h2>Open research infrastructure</h2>
<table>
<tr>
<td><b>Clinicians' point of view</b></td>
<td><b>Patients' point of view</b></td>
</tr>
<tr>
<td style="vertical-align:top">
<ul>
<li class="fragment" data-fragment-index="1">"Deep phenotyping": ecologically valid, multimodal data</li>
<li class="fragment" data-fragment-index="2">Decentral data acquisition, standardized and reproducible</li>
<li class="fragment" data-fragment-index="3">Focus on patient well-being and optimal treatment</li>
</ul>
</td>
<td style="vertical-align:top">
<ul>
<li class="fragment" data-fragment-index="4">Accurate diagnosis and optimal treatment</li>
<li class="fragment" data-fragment-index="5">Strict data protection</li>
<li class="fragment" data-fragment-index="6">Individual patient is central</li>
<li class="fragment" data-fragment-index="7">Minimal disturbance in daily life</li>
</ul>
</td>
</tr>
<tr>
<td>
<div class="r-stack">
<img class="fragment fade-in" data-fragment-index="1" height="300px" src="../pics/abcdj/multimodal_data.png">
</div>
</td>
<td>
<div class="r-stack">
<img class="fragment fade-in" data-fragment-index="2" height="250px" src="../pics/abcdj/decentral_monitoring.png">
</div>
</td>
</tr>
</table>
</section>
<section>
<h2>Open Research Infrastructure</h2>
<table>
<tr>
<td><b>Research data management's point of view</b></td>
</tr>
<tr>
<td style="vertical-align:top">
<ul>
<li class="fragment" data-fragment-index="1">Technical solutions exist, but need elevation for adoption in research practice </li>
<li class="fragment" data-fragment-index="2">Homogenization of workflows and processes fosters collaboration across sites</li>
<li class="fragment" data-fragment-index="3">Decentralized approach with centralized services and web-based multi-center integration</li>
</ul>
</td>
</tr>
<tr>
<td>
<div class="r-stack">
</div>
</td>
<td>
<div class="r-stack">
</div>
</td>
</tr>
</table>
</section>
<section>
<h2>Front-end and Back-end</h2>
<ul style="font-size:35px">
<li>JTrack for decentral, ecologically valid acquisitions, complementing in-clinic assessments</li>
<li>JDash for study management, participant management,
and analytics overview (derived study data at subject & group level)</li>
<li>Central data overview and analytics at FZJ</li>
<ul>
<li>Provenance-tracked analysis and modeling</li>
<li>Automated meta-data extraction for data discoverability</li>
<li>Result overview for clinical decision making</li>
</ul>
</ul>
<table>
<tr>
<td>
<img src="../pics/abcdj/backend.png" height="500px">
</td>
<td>
<img src="../pics/jtrack/jdash_overview.png" height="500px" width="500px">
</td>
</tr>
</table>
</section>
<section>
<h2>Opportunities</h2>
<b>Software improves with its use cases</b>
<ul style="font-size:30px">
<li>JTrack integration into different types of wearables</li>
<li>JTrack integration of cognitive tasks and feedback to participants </li>
<li>Julearn integration into JDash</li>
<li>More meta-data extractors for DataLad</li>
<li>...</li>
</ul>
<img src="../pics/abcdj/timeline.png">
</section>
<section>
<h2>Current (first) steps</h2>
<ul>
<b>Data cataloging</b>
<li>Leveraging legacy data via data census and meta-data catalog </li>
<ul>
<li>improved discovery without direct data transfer</li>
<li>homogenization of access request procedures</li>
<li>establishing a legal basis for (re-)use</li>
<li>Example: <a href="http://data.sfb1451.de" target="_blank">data.sfb1451.de</a></li>
</ul>
<li>Demonstrator for data infrastructure based on §21 data
(standardized and anonymized performance data of hospitals, legally required,
submitted yearly to InEK by all hospitals)</li>
<br>
<b>Feasibility/Proof-of-concept study</b>
<li>Recommendations for common digital tools and workflows for common tasks and processes</li>
<li>Selection of digital measures from clinical routine for first trials</li>
</ul>
</section>
</section>
<section>
<section>
<h2>Summary</h2>
<ul>
<li class="fragment">Open Source (Research) Software aids in various domain-general or
-specific applications.</li>
<li class="fragment">Open Science needs Open Source: For transparency and reproducibility, for
science-specific requirements, for open formats, for re-use, and to
enable interoperability across tools.</li>
<li class="fragment">Collaboration across clinical and research settings is a technical,
social, and legal challenge. Technical solutions alone won't save us,
but they are a good first step.</li>
<li class="fragment">We build clinical research infrastructure on open tools, for better science</li>
</ul>
</section>
<section>
<h2>Thanks!</h2>
<br><br><br>
Questions? <br><br>
Inputs, Resources, Synergies <br><br>
</section>
</section>
</div>
</div>
<script src="../reveal.js/dist/reveal.js"></script>
<script src="../reveal.js/plugin/notes/notes.js"></script>
<script src="../reveal.js/plugin/markdown/markdown.js"></script>
<script src="../reveal.js/plugin/highlight/highlight.js"></script>
<script src="../custom_functions.js"></script>
<script>
// More info about initialization & config:
// - https://revealjs.com/initialization/
// - https://revealjs.com/config/
Reveal.initialize({
hash: true,
// The "normal" size of the presentation, aspect ratio will be preserved
// when the presentation is scaled to fit different resolutions. Can be
// specified using percentage units.
width: 1280,
height: 960,
// Factor of the display size that should remain empty around the content
margin: 0.3,
// Bounds for smallest/largest possible scale to apply to content
minScale: 0.2,
maxScale: 1.0,
controls: true,
progress: true,
history: true,
center: true,
slideNumber: 'c',
pdfSeparateFragments: false,
pdfMaxPagesPerSlide: 1,
pdfPageHeightOffset: -1,
transition: 'slide', // none/fade/slide/convex/concave/zoom
// Learn about plugins: https://revealjs.com/plugins/
plugins: [ RevealMarkdown, RevealHighlight, RevealNotes ]
});
</script>
</body>
</html>