I use Anaconda for managing my computational software environments. It’s flexible, easy to install (even if you don’t have sudo access), and is well-supported with computational biology tools thanks to bioconda. Here are some pragmatic tips for making conda environments easier to deal with.
Anaconda is, at heart, a Python installation, and is written in Python. To solve which packages and versions are required, it uses the pycosat Python package. While excellent, it can be slow for large repositories like bioconda and conda-forge. This can make solving packaging problems difficult and slow.
That’s where Mamba comes in.
It is a C++ application that uses libsolv, a C++ satisfiability solver that manages dependencies for Red Hat, Debian, and other Linux operating systems.
This makes mamba fast.
It also has other advantages, like multi-threaded downloads and better repository indexing.
Check out mamba’s announcement post for some more details.
Once installed, mamba is a drop-in replacement for almost every conda ... command.
Installing it is easy.
conda install -c conda-forge mamba
And you’re ready to start using it.
You can substitute mamba for conda in almost every command in the rest of this post.
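If you share scripts with people who may not have mamba installed, one option is to pick the command at run time. This is only a sketch: the SOLVER variable name is an arbitrary choice, and the commented-out install line uses samtools from bioconda purely as an example.

```shell
# Prefer mamba when it is on the PATH, otherwise fall back to conda.
if command -v mamba >/dev/null 2>&1; then
    SOLVER=mamba
else
    SOLVER=conda
fi
echo "Using $SOLVER to manage packages"

# Example use (uncomment to run):
# "$SOLVER" install -c conda-forge -c bioconda samtools
```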
Undo changes with revisions
Anaconda stores the state of an environment before and after each installation of new packages. This lets you roll back to the earlier set of packages if something goes wrong during a new installation. You can see the revision history of your environment with
conda list --revisions
and roll back to a previous set of installed packages with
conda install --revision <N>
It’s a great little feature for when you accidentally update your R or Python version and need to update almost every other package in your environment. I rarely see this feature mentioned, but when it is I only ever see this blog post. It’s worth a read and expands a bit more on what I’ve listed here.
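The two commands above can be wrapped into a small helper that targets a named environment instead of the active one. This is a sketch rather than a recommended tool: the function name rollback_env is made up, and it assumes the -n option, which both conda list and conda install accept.

```shell
# rollback_env ENV_NAME REVISION
# Show the revision history of a named environment, then roll it back.
rollback_env() {
    env_name="$1"
    revision="$2"
    # Print the history first so the chosen revision can be double-checked.
    conda list -n "$env_name" --revisions
    # conda asks for confirmation before changing anything.
    conda install -n "$env_name" --revision "$revision"
}

# Example use (hypothetical environment name and revision number):
# rollback_env my-analysis 2
```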
Exporting and importing environments
To help others with reproducible builds of your work, listing out packages and versions is useful.
An easy way to do this is by pairing conda env export with conda env create.
After installing necessary packages via the command line, save the environment with
conda env export --no-builds > environment.yaml
The --no-builds flag means that you don’t export build strings, which are tied to a specific operating system, like Windows or Mac.
It makes things a bit more flexible, but still keeps the rest of the package version number.
With the environment.yaml file, a collaborator (or just yourself on another computer) can recreate the environment with
conda env create -n <NAME> -f environment.yaml
This will build an environment from the environment.yaml file on the new computer.
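For reference, a file exported with --no-builds looks roughly like the one written below. The file name, environment name, channels, and pinned versions are all made up for illustration. A real export lists every package in the environment and also records a prefix: line.

```shell
# Write a small, hand-made environment file (contents are illustrative only).
cat > example-environment.yaml <<'EOF'
name: example-env
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.9.7
  - samtools=1.14
EOF

# Without --no-builds, each entry would carry a third field,
# e.g. python=3.9.7=<build string>.
# Recreate the environment elsewhere with:
# conda env create -n example-env-copy -f example-environment.yaml
```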
Let’s say you’re on a system that doesn’t have external internet access, like a private cluster partition. Not a problem! Installing packages can be done in offline mode, as long as the packages are locally available, with
conda install --offline
Conda will check its package cache, the $CONDA_PREFIX/pkgs directory of the base installation, for available packages and solve dependencies using only what it finds there.
If the package can’t be installed because it requires a dependency that isn’t locally available, it will tell you.
You can copy and paste packages into that folder, like from a USB key, to build up your available packages.
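Copying packages over can be scripted. The helper below is a sketch under a few assumptions: the function name stage_packages and the USB path are hypothetical, and it only copies files with the two package extensions conda uses (.tar.bz2 and .conda).

```shell
# stage_packages SOURCE_DIR PKGS_DIR
# Copy conda package files from SOURCE_DIR into the package cache PKGS_DIR.
stage_packages() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"
    for f in "$src"/*.tar.bz2 "$src"/*.conda; do
        # Skip the literal pattern when nothing matched.
        if [ -e "$f" ]; then
            cp "$f" "$dest"/
        fi
    done
}

# Example use (hypothetical USB mount point and package name):
# stage_packages /media/usb/conda-pkgs "$(conda info --base)/pkgs"
# conda install --offline samtools
```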
Periodically clean your conda folder
Over time, the $CONDA_PREFIX/pkgs folder will fill up with dozens or hundreds of packages.
Conda uses these packages when solving dependencies, so if this folder becomes very cluttered, it can take an installation command a long time to finish.
Periodic cleaning with conda clean can remove downloaded packages from this folder, making future installation commands faster.
Obviously, don’t do this if you’re only using conda install --offline.
The clean subcommand can be used in a few ways to remove different types of files.
# remove only the package tarballs
conda clean -t
# remove the extracted packages
conda clean -p
# remove the index of packages to force a reindexing
conda clean -i
# remove everything
conda clean -a
I do this when I find my install commands taking a long time. If you want to, you could make a cron job out of it so you get the benefits automatically without thinking about it.
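A minimal sketch of what an automated clean-up could look like is below. The function name clean_conda_cache, the schedule, and the /path/to/conda placeholder are assumptions. The --dry-run and --yes flags are the long forms of conda clean's -d and -y options.

```shell
# clean_conda_cache
# Preview what would be removed, then remove it without prompting.
clean_conda_cache() {
    conda clean --all --dry-run
    conda clean --all --yes
}

# A hypothetical crontab entry (edit with `crontab -e`) that cleans at 03:00
# on the first day of every month. Cron runs with a minimal PATH, so point it
# at the full path of your conda executable:
# 0 3 1 * * /path/to/conda clean --all --yes
```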
Build your own conda packages
Sometimes you find a Python package, or something from another language, that isn’t available from the default conda channels or conda-forge.
No worries, you can often build your own packages pretty easily with conda build.
Even better, if the package you want to make a conda package out of is on a major repository like PyPI or CRAN, you can use conda skeleton.
This builds the conda package using structured information from the repository itself.
In many cases, you don’t have to do anything special for making a conda package.
Sometimes you’ll have to modify the resultant recipe files, such as meta.yaml, build.sh, or bld.bat, created by the conda skeleton command, if the original package requires special build commands or libraries.
I do this occasionally and have a few special conda packages in my own git repo.
I then store these conda packages on the Anaconda.org hosting service with anaconda-client and add my personal channel to my conda configuration with
conda config --add channels <my_anaconda.org_username>
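Put together, the skeleton, build, and upload steps might look like the sketch below. It assumes that conda-build and anaconda-client are installed, that you have already run anaconda login, and that the recipe builds a single package. The function name build_from_pypi and the click example are hypothetical.

```shell
# build_from_pypi PACKAGE
# Generate a recipe from PyPI metadata, build it, and upload the result.
# Requires: conda install conda-build anaconda-client
build_from_pypi() {
    pkg="$1"
    # Writes a recipe directory named after the package (assumed lowercase).
    conda skeleton pypi "$pkg"
    conda build "./$pkg"
    # --output prints the path of the built package without rebuilding it.
    anaconda upload "$(conda build "./$pkg" --output)"
}

# Example use (hypothetical package choice):
# build_from_pypi click
```

If the recipe needs manual changes, run the conda skeleton step on its own, edit the generated files, and then run the build and upload steps.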
Anaconda is an extremely useful tool to help you get things done. I hope some of these tips will prove useful for people trying it out for the first time, or for those hoping to find new ways to use it more effectively.