Commit d96d20b4 authored by Daniel Scheffler

Fix slow conda environment creation.


Signed-off-by: Daniel Scheffler <danschef@gfz-potsdam.de>
parent d4d98e33
Pipeline #27376 passed with stages in 2 minutes and 45 seconds
@@ -57,7 +57,7 @@ test_geoarray_install:
  - mamba update -n base -c conda-forge --all
  # create geoarray environment from environment_geoarray.yml
- - conda env create -v -v --name geoarray_test -f tests/CI_docker/context/environment_geoarray.yml
+ - mamba env create --name geoarray_test -f tests/CI_docker/context/environment_geoarray.yml
  - conda activate geoarray_test
  # install geoarray
......
@@ -4,8 +4,10 @@ FROM ci_base_centos:0.2
  COPY *.yml /root/
  # update the ci_env environment (that already contains all packages installed via 'docker_pyenvs' repo)
+ # NOTE: The pkgs directory (cache) is deleted because otherwise conda env create takes hours within a docker container.
  RUN /bin/bash -i -c "\
  source /root/mambaforge/bin/activate ; \
- mamba update -n base -c conda-forge --all;\
+ mamba update -n base -c conda-forge --all; \
  conda activate ci_env; \
- mamba env update -n ci_env -f /root/environment_geoarray.yml"
+ mamba env update -n ci_env -f /root/environment_geoarray.yml; \
+ rm -rf /root/mambaforge/pkgs"
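
The cleanup step appended in the Dockerfile hunk above can also be sketched as a standalone shell function, which makes it easier to try outside an image build. This is only an illustration: the function name `remove_pkgs_cache` is hypothetical and not part of the commit, and the default path is simply the one used in the diff.

```shell
# Hypothetical helper (not part of the commit): delete a conda/mamba package
# cache directory and report how much space it occupied.
remove_pkgs_cache() {
    # Default path is the one used in the Dockerfile hunk above.
    local pkgs_dir="${1:-/root/mambaforge/pkgs}"

    if [ ! -d "$pkgs_dir" ]; then
        echo "no package cache at $pkgs_dir, nothing to do"
        return 0
    fi

    # Print the cache size first, then remove the whole directory.
    echo "removing package cache: $(du -sh "$pkgs_dir" | cut -f1) in $pkgs_dir"
    rm -rf -- "$pkgs_dir"
}
```

Calling `remove_pkgs_cache` without an argument targets `/root/mambaforge/pkgs`; passing a different directory lets the same step be tried against a scratch copy first.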
  • @nbohn @romulo I am currently moving all my CI environments from Miniconda to Mambaforge because of license issues with Miniconda and because the package solver in Mambaforge is much faster.

    I think I found a way to keep conda env create ... (used, e.g., in the CI job that tests the package installation) from hanging for ages at "Executing transaction" when it runs in a Docker container. BTW, it does not matter whether I use conda env create or mamba env create: both get stuck.

    The problem seems to be related to Conda's package cache. If I remove the package cache directory completely at the end of the Dockerfile, Conda has to re-download everything at runtime, but the CI job finishes successfully within a few minutes. The job took around 1 hour before I deleted the pkgs directory and 00:04:30 afterwards (including all the downloads). I think the issue could be related to the user permissions within the Docker container, which would explain why we don't have the problem when running on a native Linux machine.

    Conda may hang because of the cached files of a single package (according to the verbose output, boost-cpp took most of the time in my case), but so far I could not find a practical way to avoid that without deleting the entire package cache. Conda also has an official conda clean --all command, but it did not completely remove the cache files and had no effect.

    Edited by Daniel Scheffler
  • @danschef Thanks for the information! I have also been dealing with the slow Miniconda package solver. Building a new Docker image on mefe2 took hours yesterday, so I finally aborted the process and switched to Mambaforge, which worked much faster. I have likewise replaced each Miniconda call in the SICOR CI jobs with Mambaforge.

    So far, I haven't had execution time issues of the magnitude you mention. Using mamba create (I think you can omit the env here) took about 16 minutes. However, reducing this to 4 or 5 minutes would of course be nice. I'll try the solution you describe above.

  • @danschef I just implemented and tested your solution by adding rm -rf /root/mambaforge/pkgs to the Dockerfile, and it works perfectly! The runtime of a complete pipeline drops from about 30-40 minutes to 5-10 minutes. So, thanks again for this hint!
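
The runtimes quoted in this thread are easiest to compare when the environment creation step is timed the same way before and after the cache removal. A minimal sketch, assuming a POSIX-like shell with `date +%s`; the helper name `time_cmd` is hypothetical and appears neither in the commit nor in the CI configuration:

```shell
# Hypothetical helper: run a command, print its wall-clock duration in
# whole seconds to stderr, and pass the command's exit status through.
time_cmd() {
    local start end rc
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s for: $*" >&2
    return "$rc"
}

# Example call, using the command from the CI job in the diff:
# time_cmd mamba env create --name geoarray_test -f tests/CI_docker/context/environment_geoarray.yml
```

Because the exit status is passed through, a failing environment creation still fails the CI job when it is wrapped this way.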
