This acts on the report by `tool/link_consolidator.py`.

This changeset is large but conceptually simple:

- no http:// links anymore, unless there is no https:// alternative
- no redundant trailing slashes
- no redundant "latest" flavors in doc links
- resolve permanent redirects
- if there is a simpler/shorter URL that points to the same information, go for that one
- reduce duplicate specifications of URLs in a single file

With all changes applied, the link consolidator still reports the following issues:

- `docs/glossary.rst`:
  - *line 163*: `permanent-redirect` [https://try.github.io - permanently to https://docs.github.com/en/get-started/quickstart/set-up-git]
- `docs/basics/101-127-yoda.rst`:
  - *line 453*: `permanent-redirect` [https://the-turing-way.netlify.app - permanently to https://the-turing-way.netlify.app/index.html]
- `docs/basics/101-180-FAQ.rst`:
  - *line 128*: `none-https` [http://www.humanconnectomeproject.org]
- `docs/beyond_basics/101-173-contributing.rst`:
  - *line 34*: `none-https` [http://source.git-annex.branchable.com/?p=source.git;a=summary]
- `docs/usecases/HCP_dataset.rst`:
  - *line 63*: `none-https` [http://www.humanconnectomeproject.org/wp-content/uploads/2010/01/HCP_Data_Agreement.pdf]

The non-https URLs have no alternative. The perma-redirects make the URLs needlessly long. So I decided to keep them.

.. _summaryshare:

Summary
-------

Even without access to the same computational infrastructure, you can share your
DataLad datasets with friends and collaborators by leveraging third-party
services. DataLad integrates well with a variety of free or commercial services,
and the many available service options give you freedom in deciding where
you store your data and thus who can get access.

- An easy, free, and fast option is :term:`GIN`, a
  web-based repository store for scientific data management. If you are registered
  and have SSH authentication set up, you can create a new, empty repository,
  add it as a sibling to your dataset, and publish all dataset contents -- including
  annexed data, as GIN supports repositories with an annex.

- Other repository hosting services such as GitHub and GitLab [#f1]_ do not support
  an annex. If a dataset is shared via one of those platforms, annexed data needs
  to be published to an external data store. The published dataset stores
  information about where to obtain annexed file contents, so that
  :dlcmd:`get` works.

- The external data store can be any of a variety of third-party hosting providers.
  To enable data transfer to and from this service, you may need to configure an
  appropriate :term:`special remote`, and configure a publication dependency. The
  section :ref:`sharethirdparty` walked you through how this can be done with
  `Dropbox <https://www.dropbox.com>`_.

- The ``--data`` and ``--force`` options of :dlcmd:`push` allow you to override
  automatic decision making on to-be-published contents. If they aren't specified,
  DataLad will attempt to figure out by itself which dataset contents shall be
  published, and how. With a path to files, directories, or subdatasets you
  can also publish only selected contents' data.

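
As a rough illustration of the last point, the following Python sketch assembles
``datalad push`` command lines instead of running them. The helper
``push_command``, the sibling name ``gin``, and the path ``inputs/raw`` are made
up for this example. The listed option values are an assumption based on the
help text of recent DataLad releases; ``datalad push --help`` shows what your
installed version accepts.

.. code-block:: python

   import shlex

   # Option values as listed by ``datalad push --help`` in recent DataLad
   # releases (assumption: verify against your installed version).
   DATA_MODES = ("anything", "nothing", "auto", "auto-if-wanted")
   FORCE_MODES = ("all", "gitpush", "checkdatapresent")


   def push_command(sibling, data=None, force=None, paths=()):
       """Return a ``datalad push`` call as an argument list (hypothetical helper)."""
       if data is not None and data not in DATA_MODES:
           raise ValueError(f"unknown --data mode: {data!r}")
       if force is not None and force not in FORCE_MODES:
           raise ValueError(f"unknown --force mode: {force!r}")
       cmd = ["datalad", "push", "--to", sibling]
       if data is not None:
           # Override the automatic decision on which file contents to transfer.
           cmd += ["--data", data]
       if force is not None:
           # Override selected checks that push performs by default.
           cmd += ["--force", force]
       # Optional: restrict the push to selected files, directories, or subdatasets.
       cmd += list(paths)
       return cmd


   # "gin" is a placeholder sibling name; "inputs/raw" is a made-up path.
   print(shlex.join(push_command("gin")))
   print(shlex.join(push_command("gin", data="nothing")))
   print(shlex.join(push_command("gin", data="anything", paths=["inputs/raw"])))

Leaving out ``data`` and ``force`` yields a plain ``datalad push --to gin``, which
lets DataLad decide what to publish. To actually publish, the returned list could
be passed to ``subprocess.run`` from within a dataset.
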
.. figure:: ../artwork/src/going_up.svg
   :width: 40%

Now what can I do with it?
^^^^^^^^^^^^^^^^^^^^^^^^^^

Finally, you can share datasets and their annexed contents with others without the
need for a shared computational infrastructure. It remains your choice where to
publish your dataset -- considerations of data access, safety, or potential
costs will likely influence your choice of service.

.. rubric:: Footnotes

.. [#f1] Older versions of :term:`GitLab` provide a git-annex configuration, but it is disabled
         by default, and to enable it you would need to have administrative
         access to the server and client side of your GitLab instance.