Dockerizing the analysis #20
TL;DR: Versions did not matter, time did not matter, but it matters who compiles numpy.
Five years after we did this analysis, I am trying to build a Docker-based environment. I can successfully build the stats and figures in a wide variety of configurations. However, there are small differences.
I am collecting some notes here, trying to narrow down on a setup that reproduces the stats exactly:
Debian buster PY3.7
pip freeze gives:
After running the analysis, the following diff occurs:
Debian bullseye PY3.9
pip freeze gives:
The diff of the statistical scores is identical to the one from the buster container. The same SVGs are also modified (and the changes look identical inside).
Ubuntu focal PY3.8
pip freeze gives:
The diff of the statistical scores is identical to the ones from the bullseye and buster containers. The same SVGs are also modified (and the changes look identical inside).
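Since the comparison across containers rests on the pip freeze listings, a small helper can make the version differences between two environments explicit. This is only a sketch: the function names are hypothetical, and the two sample listings are made-up placeholders, not the actual freeze output of any container discussed here.

```python
def parse_freeze(text):
    """Parse `pip freeze` output into a {package: version} dict.

    Only plain `name==version` lines are considered; comments, blank
    lines, and other requirement forms (e.g. `name @ url`) are skipped.
    """
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def diff_freezes(freeze_a, freeze_b):
    """Return {package: (version_a, version_b)} for every package that differs."""
    pins_a, pins_b = parse_freeze(freeze_a), parse_freeze(freeze_b)
    return {
        name: (pins_a.get(name), pins_b.get(name))
        for name in sorted(set(pins_a) | set(pins_b))
        if pins_a.get(name) != pins_b.get(name)
    }


# Placeholder listings for illustration only (not taken from the containers above).
env_a = "examplepkg==1.0.0\notherpkg==2.3.4\n"
env_b = "examplepkg==1.1.0\notherpkg==2.3.4\nextrapkg==0.1\n"

for name, (a, b) in diff_freezes(env_a, env_b).items():
    print(f"{name}: {a} -> {b}")
```

A package missing from one environment shows up with None on that side, which helps spot dependencies that were pulled in by only one of the setups.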
Conclusions
reproduced the same diff as you with the python3.7 Docker image
For comparison, I tried a virtualenv, going with the latest versions that are still API-compatible with the code.
The previously pinned seaborn 0.10.1 is incompatible with numpy 1.26, and had to be unpinned.
Pandas had to be pinned to the last 1.x release. Pandas 2.1.1 incompatibility:
pip freeze gives:
This REPRODUCES all stats exactly!!!
The remaining diff is in the SVGs
Trying to drill down on the SVG diff. I had the hunch that pinning the seaborn version is probably a more important aspect than being able to upgrade numpy. And indeed:
gives an environment that fully reproduces the stats, and the full remaining diff is:
Visually, this is the part of the figure that is different:
Closeups of the two versions of the figure at the difference (it is the height of the bar in the middle).
Here is the relevant code that is responsible for this plot:
It is plain matplotlib. We know the matplotlib version that was originally used, because it is included in the file's RDF metadata:
We have that exact version installed, but this obviously does not mean that we have the exact same binary running. Still, it is weird to have this be the only difference.
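For checking which tool and version wrote a given SVG, the RDF metadata block can be read programmatically. The sketch below assumes the creator string is stored in a Dublin Core `dc:title` element inside the SVG's metadata; the helper name and the sample document (with its "ExampleTool" string) are hypothetical stand-ins, not the actual figure files.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, as commonly used inside SVG RDF metadata.
DC_TITLE = "{http://purl.org/dc/elements/1.1/}title"


def svg_dc_titles(svg_text):
    """Return the text of every dc:title element found in an SVG document."""
    root = ET.fromstring(svg_text)
    return [el.text.strip() for el in root.iter(DC_TITLE) if el.text]


# Made-up sample with the assumed metadata layout.
sample_svg = """<svg xmlns="http://www.w3.org/2000/svg">
 <metadata>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:cc="http://creativecommons.org/ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
   <cc:Work>
    <dc:creator>
     <cc:Agent>
      <dc:title>ExampleTool v0.0.0</dc:title>
     </cc:Agent>
    </dc:creator>
   </cc:Work>
  </rdf:RDF>
 </metadata>
</svg>"""

print(svg_dc_titles(sample_svg))
```

Running this over the original and the regenerated SVG would confirm that both report the same creator string, which, as noted, still says nothing about whether the underlying binaries are the same.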
With this success, I am back in Docker land. Clearly the virtualenv has an impact. So let's try to put a (superfluous) virtualenv inside the docker container.
Knowing that we can use much more recent software, I am basing this on Debian bookworm and using the versions from the previous non-Docker exploration:
And indeed! It also arrives at the minimal diff shown in https://github.com/psychoinformatics-de/paper-remodnav/issues/20#issuecomment-1757462683
The image compresses down to 625MB.
I can now confirm that the presence or absence of a virtualenv is irrelevant (as it should be). Here is another configuration that achieves the diff from https://github.com/psychoinformatics-de/paper-remodnav/issues/20#issuecomment-1757462683 without any virtualenv:
Probably the final post in this saga: the trigger for the reproducibility issue is who compiles numpy.
It depends on whether I am using a pip-compiled installation or one downloaded from Debian.
Either of these leads to reproducible results on its own, across a wide range of versions. But there is a noticeable difference in results between these two means of compiling the sources.
Below is a complete Dockerfile for anyone interested in digging deeper. The key line is the specification of the numpy version. Whenever it is different from the numpy version provided by the respective Debian release (and it does not matter which one), pip will compile it, and it will reproduce the results published many years ago. So change
a version that is not in Debian bookworm to
a version that is in Debian bookworm, and the results will not reproduce. Make it
1.24.1 and they will reproduce again, because it will also be compiled locally.
Even when I set up a system like it would have existed at the time of publication (Debian buster), the results do not reproduce, unless pip compiles numpy.
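To see which route a numpy installation took inside a given container, the packaging metadata can be inspected with the standard library. This is a sketch under assumptions: the helper name is hypothetical, and whether the INSTALLER and WHEEL files are present (and what they contain) depends on how the package was installed, so missing values should be expected for some distribution-provided packages.

```python
from importlib import metadata


def describe_install(name):
    """Return basic provenance facts for an installed distribution, or None.

    Reads the version, the INSTALLER record (usually "pip" for pip installs),
    the wheel tags from the WHEEL record, and the install location.
    """
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    installer = dist.read_text("INSTALLER")  # None if the file is absent
    wheel = dist.read_text("WHEEL") or ""
    tags = [
        line.split(":", 1)[1].strip()
        for line in wheel.splitlines()
        if line.startswith("Tag:")
    ]
    return {
        "version": dist.version,
        "installer": installer.strip() if installer else None,
        "wheel_tags": tags,
        "location": str(dist.locate_file("")),
    }


# Prints None when numpy is not installed in the current environment.
print(describe_install("numpy"))
```

The install location alone is often telling: a path under the system's dist-packages directory points to the Debian package, while a site-packages path points to a pip installation. The wheel tags, where present, additionally hint at whether the build was fetched prebuilt or produced locally.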