datalad-handbook/docs/basics/101-103-modify.rst
2023-11-09 15:44:42 +01:00

188 lines
7 KiB
ReStructuredText

.. _modify:
Modify content
--------------
So far, we've only added new content to the dataset. And we have not done
much to that content up to this point, to be honest. Let's see what happens if
we add content, and then modify it.
For this, in the root of ``DataLad-101``, create a plain text file
called ``notes.txt``. It will contain all of the notes that you take
throughout the course.
Let's write a short summary of how to create a DataLad dataset from scratch:
"One can create a new dataset with 'datalad create
[--description] PATH'. The dataset is created empty".
This is meant to be a note you would take in an educational course.
You can take this note and write it to a file with an editor of your choice.
The code snippet, however, contains this note within the start and end part of a
`heredoc <https://en.wikipedia.org/wiki/Here_document>`_.
You can also copy the full code snippet, starting
from ``cat << EOT > notes.txt``, including the ``EOT`` in the last line, in your
terminal to write this note from the terminal (without any editor) into ``notes.txt``.
.. index:: here-document, heredoc
.. find-out-more:: How does a heredoc (here-document) work?
The code snippet makes sure to write lines of text into a
file (that so far does not exist) called ``notes.txt``.
To do this, the content of the "document" is wrapped in between
*delimiting identifiers*. Here, these identifiers are *EOT* (short
for "end of text"), but naming is arbitrary as long as the two identifiers
are identical. The first "EOT" identifies the start of the text stream, and
the second "EOT" terminates the text stream.
The characters ``<<`` redirect the text stream into
`"standard input" (stdin) <https://en.wikipedia.org/wiki/Standard_streams#Standard_input_(stdin)>`_,
the standard location that provides the *input* for a command.
Thus, the text stream becomes the input for the
`cat command <https://en.wikipedia.org/wiki/Cat_(Unix)>`_, which takes
the input and writes it to
`"standard output" (stdout) <https://en.wikipedia.org/wiki/Standard_streams#Standard_output_(stdout)>`_.
Lastly, the ``>`` character takes ``stdout`` can creates a new file
``notes.txt`` with ``stdout`` as its contents.
It might seem like a slightly convoluted way to create a text file with
a note in it. But it allows to write notes from the terminal, enabling
this book to create commands you can execute with nothing other than your terminal.
You are free to copy-paste the snippets with the heredocs,
or find a workflow that suites you better. The only thing important is that
you create and modify a ``.txt`` file over the course of the Basics part of this
handbook.
Running this command will create ``notes.txt`` in the
root of your ``DataLad-101`` dataset:
.. index:: heredoc
pair: heredoc; on Windows in a terminal
.. windows-wit:: Heredocs don't work under non-Git-Bash Windows terminals
.. include:: topic/heredoc-windows.rst
.. index::
pair: create heredoc; in a terminal
.. runrecord:: _examples/DL-101-103-101
:language: console
:workdir: dl-101/DataLad-101
:cast: 01_dataset_basics
:notes: Let's find out how we can modify files in dataset. Lets create a text file with notes about the DataLad commands we learned. (maybe explain here docs)
$ cat << EOT > notes.txt
One can create a new dataset with 'datalad create [--description] PATH'.
The dataset is created empty
EOT
.. index::
pair: check dataset for modification; with DataLad
Run :dlcmd:`status` to confirm that there is a new, untracked file:
.. runrecord:: _examples/DL-101-103-102
:language: console
:workdir: dl-101/DataLad-101
:cast: 01_dataset_basics
:notes: As expected, there is a new file in the dataset. At first the file is untracked. We can save without a path specification because it is the only existing modification
$ datalad status
.. index::
pair: save dataset modification; with DataLad
Save the current state of this file in your dataset's history. Because it is the only modification
in the dataset, there is no need to specify a path.
.. runrecord:: _examples/DL-101-103-103
:language: console
:workdir: dl-101/DataLad-101
:cast: 01_dataset_basics
$ datalad save -m "Add notes on datalad create"
But now, let's see how *changing* tracked content works.
Modify this file by adding another note. After all, you already know how to use
:dlcmd:`save`, so write a short summary on that as well.
Again, the example uses Unix commands (``cat`` and redirection, this time however
with ``>>`` to *append* new content to the existing file)
to accomplish this, but you can take any editor of your choice.
.. runrecord:: _examples/DL-101-103-104
:language: console
:workdir: dl-101/DataLad-101
:cast: 01_dataset_basics
:notes: Now let's start to modify this text file by adding more notes to it. Think about this being a code file that you add functions to:
$ cat << EOT >> notes.txt
The command "datalad save [-m] PATH" saves the file (modifications) to
history.
Note to self: Always use informative, concise commit messages.
EOT
Let's check the dataset's current state:
.. runrecord:: _examples/DL-101-103-105
:language: console
:workdir: dl-101/DataLad-101
:cast: 01_dataset_basics
$ datalad status
and save the file in DataLad:
.. runrecord:: _examples/DL-101-103-106
:language: console
:workdir: dl-101/DataLad-101
:cast: 01_dataset_basics
:notes: The modification can be saved as well
$ datalad save -m "add note on datalad save"
Let's take another look into our history to see the development of this file.
We are using :gitcmd:`log -p -n 2` to see last two commits and explore
the difference to the previous state of a file within each commit.
.. runrecord:: _examples/DL-101-103-107
:language: console
:workdir: dl-101/DataLad-101
:lines: 1-28
:emphasize-lines: 6, 25
:cast: 01_dataset_basics
:notes: An the history gives an accurate record of what happened to this file
$ git log -p -n 2
We can see that the history can not only show us the commit message attached to
a commit, but also the precise change that occurred in the text file in the commit.
Additions are marked with a ``+``, and deletions would be shown with a leading ``-``.
From the dataset's history, we can therefore also find out *how* the text file
evolved over time. That's quite neat, isn't it?
.. index::
pair: log; Git command
pair: get help; with Git
pair: filter history; with Git
.. find-out-more:: 'git log' has many more useful options
``git log``, as many other ``Git`` commands, has a good number of options
which you can discover if you run ``git log --help``. Those options could
help to find specific changes (e.g., which added or removed a specific word
with ``-S``), or change how ``git log`` output will look (e.g.,
``--word-diff`` to highlight individual word changes).
.. only:: adminmode
Add a tag at the section end.
.. runrecord:: _examples/DL-101-103-108
:language: console
:workdir: dl-101/DataLad-101
$ git branch sct_modify_content