datalad-handbook/docs/code_from_chapters/02_reproducible_execution_code.rst
2020-01-09 08:35:18 +01:00

Code from chapter: 02_reproducible_execution
--------------------------------------------
Code snippet 40::
cd DataLad-101 && mkdir code && tree -d
Code snippet 41::
cat << EOT > code/list_titles.sh
for i in recordings/longnow/Long_Now__Seminars*/*.mp3; do
# get the filename
base=\$(basename "\$i");
# strip the extension
base=\${base%.mp3};
# date as yyyy-mm-dd
printf "\${base%%__*}\t" | tr '_' '-';
# name and title without underscores
printf "\${base#*__}\n" | tr '_' ' ';
done
EOT
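The parameter expansions in the script above can be tried on a single name. This is a minimal sketch: the file name is an illustrative stand-in that assumes the ``yyyy_mm_dd__Speaker__Title.mp3`` pattern, and the ``date_part`` and ``title_part`` variables are introduced here for readability only. The dollar signs are not escaped because the commands run directly rather than being written through a here-document.

```shell
# Illustrative file name (assumed pattern: yyyy_mm_dd__Speaker__Title.mp3)
file="recordings/longnow/Long_Now__Seminars/2003_11_15__Brian_Eno__The_Long_Now.mp3"

# get the filename
base=$(basename "$file")
# strip the extension
base=${base%.mp3}
# %%__* removes the longest suffix starting at a double underscore: the date remains
date_part=$(printf '%s' "${base%%__*}" | tr '_' '-')
# #*__ removes the shortest prefix ending in a double underscore: name and title remain
title_part=$(printf '%s' "${base#*__}" | tr '_' ' ')

printf '%s\t%s\n' "$date_part" "$title_part"
```

Note that the double underscore between speaker and title turns into two spaces, because ``tr`` replaces every underscore individually.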
Code snippet 42::
datalad status
Code snippet 43::
datalad save -m "Add short script to write a list of podcast speakers and titles"
Code snippet 44::
datalad run -m "create a list of podcast titles" "bash code/list_titles.sh > recordings/podcasts.tsv"
Code snippet 45::
git log -p -n 1
Code snippet 46::
datalad run -m "Try again to create a list of podcast titles" "bash code/list_titles.sh > recordings/podcasts.tsv"
Code snippet 47::
git log --oneline
Code snippet 48::
less recordings/podcasts.tsv
Code snippet 49::
cat << EOT >| code/list_titles.sh
for i in recordings/longnow/Long_Now*/*.mp3; do
# get the filename
base=\$(basename "\$i");
# strip the extension
base=\${base%.mp3};
# date as yyyy-mm-dd
printf "\${base%%__*}\t" | tr '_' '-';
# name and title without underscores
printf "\${base#*__}\n" | tr '_' ' ';
done
EOT
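The only functional change in the rewritten script is the glob: ``Long_Now__Seminars*`` becomes ``Long_Now*``. The sketch below shows the effect in a scratch directory. The two directory names and the ``.mp3`` file names are assumptions made for illustration; nothing here touches the real dataset.

```shell
# Scratch directory with two stand-in podcast directories (names are illustrative)
tmp=$(mktemp -d)
mkdir -p "$tmp/longnow/Long_Now__Seminars_About_Long_term_Thinking" \
         "$tmp/longnow/Long_Now__Conversations_at_The_Interval"
touch "$tmp/longnow/Long_Now__Seminars_About_Long_term_Thinking/talk.mp3" \
      "$tmp/longnow/Long_Now__Conversations_at_The_Interval/chat.mp3"

# Count the files each pattern expands to
set -- "$tmp"/longnow/Long_Now__Seminars*/*.mp3
narrow=$#
set -- "$tmp"/longnow/Long_Now*/*.mp3
wide=$#

echo "Long_Now__Seminars*/*.mp3 matches $narrow file(s)"
echo "Long_Now*/*.mp3 matches $wide file(s)"

rm -rf "$tmp"
```

With this layout the narrower pattern only sees the seminar directory, while the wider one picks up both.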
Code snippet 50::
datalad status
Code snippet 51::
datalad save -m "BF: list both directories content" code/list_titles.sh
Code snippet 52::
git log -n 2
Code snippet 53::
echo "$ datalad rerun $(git rev-parse HEAD~1)" && datalad rerun $(git rev-parse HEAD~1)
Code snippet 54::
git log -n 1
Code snippet 55::
datalad diff --to HEAD~1
Code snippet 56::
git diff HEAD~1
Code snippet 57::
cat << EOT >> notes.txt
There are two useful commands to display changes between two
states of a dataset: "datalad diff -f/--from COMMIT -t/--to COMMIT"
and "git diff COMMIT COMMIT", where COMMIT is the shasum of a commit
in the history.
EOT
Code snippet 58::
datalad save -m "add note on datalad diff and git diff"
Code snippet 59::
git log -- recordings/podcasts.tsv
Code snippet 60::
cat << EOT >> notes.txt
The datalad run command can record the impact a script or command has on a dataset.
In its simplest form, datalad run only takes a commit message and the command that
should be executed.
Any datalad run command can be re-executed by passing its commit shasum as an argument
to "datalad rerun CHECKSUM". DataLad will take the information from the run record of the
original commit and re-execute the command. If a rerun changes nothing, it will not be
written to history. Note: you can also rerun a datalad rerun command!
EOT
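The rule that a rerun without changes leaves no trace in the history can be mimicked in plain shell. This is a sketch of the idea only, not of how DataLad implements it: ``record_run``, ``history.log``, and the checksum comparison are hypothetical stand-ins for the run record and the commit history.

```shell
# Sketch only: an entry is recorded only when the output actually changes.
workdir=$(mktemp -d)
out="$workdir/podcasts.tsv"
log="$workdir/history.log"
: > "$log"

# record_run LABEL COMMAND [ARGS...]  (hypothetical helper)
record_run() {
    label=$1
    shift
    # checksum of the previous output, empty if there is none yet
    if [ -f "$out" ]; then
        before=$(cksum < "$out")
    else
        before=""
    fi
    # execute the command and capture its output
    "$@" > "$out"
    after=$(cksum < "$out")
    # write a history entry only if the output differs from before
    if [ "$before" != "$after" ]; then
        echo "$label" >> "$log"
    fi
}

record_run "run: create list" printf 'a\tb\n'
record_run "rerun: same result" printf 'a\tb\n'
record_run "rerun: new result" printf 'a\tb\nc\td\n'

entries=$(wc -l < "$log" | tr -d ' ')
echo "history entries: $entries"
rm -rf "$workdir"
```

Only the first and the third call add an entry; the second produces identical output and is skipped.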
Code snippet 61::
datalad save -m "add note on basic datalad run and datalad rerun"
Code snippet 62::
ls recordings/longnow/.datalad/feed_metadata/*jpg
Code snippet 63::
datalad run -m "Resize logo for slides" \
"convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_salt.jpg recordings/salt_logo_small.jpg"
Code snippet 64::
datalad run --input "recordings/longnow/.datalad/feed_metadata/logo_salt.jpg" "convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_salt.jpg recordings/salt_logo_small.jpg"
Code snippet 65::
datalad run --input "recordings/longnow/.datalad/feed_metadata/logo_salt.jpg" "convert -resize 450x450 recordings/longnow/.datalad/feed_metadata/logo_salt.jpg recordings/salt_logo_small.jpg"
Code snippet 66::
datalad unlock recordings/salt_logo_small.jpg
Code snippet 67::
datalad status
Code snippet 68::
convert -resize 450x450 recordings/longnow/.datalad/feed_metadata/logo_salt.jpg recordings/salt_logo_small.jpg
Code snippet 69::
datalad save -m "resized picture by hand"
Code snippet 70::
datalad run --input "recordings/longnow/.datalad/feed_metadata/logo_interval.jpg" --output "recordings/interval_logo_small.jpg" "convert -resize 450x450 recordings/longnow/.datalad/feed_metadata/logo_interval.jpg recordings/interval_logo_small.jpg"
Code snippet 71::
cat << EOT >> notes.txt
You should specify all files that a command takes as input with an -i/--input flag. These
files will be retrieved prior to the command execution. Any content that is modified or
produced by the command should be specified with an -o/--output flag. Upon a run or rerun
of the command, the contents of these files will get unlocked so that they can be modified.
EOT
Code snippet 72::
datalad run -m "Resize logo for slides" \
--input "recordings/longnow/.datalad/feed_metadata/logo_interval.jpg" \
--output "recordings/interval_logo_small.jpg" \
"convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_interval.jpg recordings/interval_logo_small.jpg"
Code snippet 73::
datalad status
Code snippet 74::
datalad save -m "add additional notes on run options"
Code snippet 75::
datalad run -m "Resize logo for slides" \
--input "recordings/longnow/.datalad/feed_metadata/logo_interval.jpg" \
--output "recordings/interval_logo_small.jpg" \
"convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_interval.jpg recordings/interval_logo_small.jpg"
Code snippet 76::
cat << EOT >> notes.txt
Important! If the dataset is not "clean" (that is, datalad status reports unsaved
modifications instead of an empty output), datalad run will not work - you will have
to save the modifications present in your dataset first.
EOT
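The "clean" requirement can be illustrated with plain Git in a throwaway repository. This is a hedged sketch: ``run_if_clean`` is a hypothetical helper and not part of DataLad or Git, and it assumes that an empty ``git status --porcelain`` output is a reasonable stand-in for an empty ``datalad status``.

```shell
# run_if_clean COMMAND [ARGS...]  (hypothetical helper)
run_if_clean() {
    # git status --porcelain prints nothing when there is nothing to save
    if [ -n "$(git status --porcelain)" ]; then
        echo "repository is not clean: save or discard changes first" >&2
        return 1
    fi
    "$@"
}

# Throwaway repository with one saved file
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
echo "first note" > notes.txt
git add notes.txt
git -c user.name="Example" -c user.email="example@example.com" commit -q -m "add notes"

# Clean state: the command is executed
if run_if_clean echo "clean: command executed"; then clean_rc=0; else clean_rc=$?; fi

# Unsaved modification: the command is refused
echo "unsaved note" >> notes.txt
if run_if_clean echo "should not run"; then dirty_rc=0; else dirty_rc=$?; fi

echo "clean_rc=$clean_rc dirty_rc=$dirty_rc"
cd / && rm -rf "$repo"
```

Saving the modification first, as the note says, would bring the repository back to a state in which the command is accepted.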
Code snippet 77::
datalad run -m "Resize logo for slides" \
--input "recordings/longnow/.datalad/feed_metadata/logo_salt.jpg" \
--output "recordings/salt_logo_small.jpg" \
--explicit \
"convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_salt.jpg recordings/salt_logo_small.jpg"
Code snippet 78::
datalad status
Code snippet 79::
cat << EOT >> notes.txt
A suboptimal alternative is the --explicit flag,
used to record only those changes made
to the files listed with --output flags.
EOT
Code snippet 80::
datalad save -m "add note on clean datasets"
Code snippet 81::
git log -p -n 2