Building Conda Packages

Published: March 28, 2019

Conda is a useful environment manager that takes care of packages and libraries across multiple languages. It has strong community support, and thousands of packages are available from the conda-forge and bioconda channels, which makes it easy to incorporate conda packages into your development environment. But occasionally you will come across vital R or Python packages that you need for your work and that aren’t available as conda packages. Or you’re developing your own package and want to be able to add it to your conda environment.

To fill this gap, you can make your own conda package. The official documentation [1] for building your own packages is thorough and useful, but it doesn’t serve as an easy start-up guide. This blog post is a higher-level overview, with some step-by-step instructions, of how to convert your favourite bit of code into a conda package. The goal is to make it easy for people new to conda packages to understand what they are and how to make one.

Structure of a conda package

In short, conda packages are just zipped folders with installation instructions.

At a bare minimum, conda packages require:

  • The files you want to be included in the package
  • Metadata information (contained in a meta.yaml file)
  • Build instructions (contained in a build.sh file for macOS/Linux, or bld.bat for Windows)

These files are all bundled into a bzip2-compressed tar archive (.tar.bz2), and that archive is what gets downloaded and installed when you run a conda install <package> command.
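To make the three ingredients concrete, here is a minimal sketch of a recipe for a toy package. The package name hello-conda, the directory layout, and the hello.sh script are made up for illustration; the meta.yaml and build.sh file names, the recipe keys, and the $PREFIX variable are the parts that come from conda itself.

```shell
# Hypothetical toy package: one script that should end up on the PATH.
mkdir -p hello-conda/recipe hello-conda/src

# 1. The file we want to ship.
cat > hello-conda/src/hello.sh <<'EOF'
#!/usr/bin/env bash
echo "hello from conda"
EOF

# 2. The metadata. The source path is relative to the recipe directory.
cat > hello-conda/recipe/meta.yaml <<'EOF'
package:
  name: hello-conda
  version: "0.1.0"

source:
  path: ../src

build:
  number: 0

test:
  commands:
    - hello

about:
  summary: Toy package used to illustrate the recipe layout
EOF

# 3. The build instructions. conda build runs this from the source
#    directory and sets PREFIX to the root of the build environment.
cat > hello-conda/recipe/build.sh <<'EOF'
#!/usr/bin/env bash
mkdir -p "${PREFIX}/bin"
cp hello.sh "${PREFIX}/bin/hello"
chmod 755 "${PREFIX}/bin/hello"
EOF
```

A Windows build would need a bld.bat alongside build.sh; it is left out here to keep the sketch short.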

Getting started

Conda packages are downloaded from conda repositories, called channels.

While you can set up your own self-hosted channels, Anaconda (the company behind conda) created Anaconda Cloud [2], a site where you can upload your packages to a variety of channels, manage favourites, share environments, and more.

Step 1: Create an Anaconda Cloud account

Think of Anaconda Cloud like GitHub for conda packages. Each channel is like a GitHub user or organization that manages their own packages and has their own set of rules. Each conda package is like an individual git repo that you can download, share, modify, etc.

Step 2: Create an API token

You’ll need this if you want to upload packages directly from the command line, instead of through the Anaconda Cloud website via a browser. After you’ve created and logged into your account, navigate to User > Settings > Access.

Give the token a memorable name (like Desktop Upload), check `Allow all operations on Conda repositories`, and set whatever expiration date you’d like, say, a year from now.

Save this token; you’ll need it later.
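One way to save it, sketched below, is a small file that only your user can read. The file location and the two helper names are hypothetical; nothing here is part of conda or Anaconda Cloud.

```shell
# Hypothetical location for the token; any private path works.
TOKEN_FILE="${TOKEN_FILE:-${HOME}/.anaconda_cloud_token}"

save_token() {
    # Read the token from standard input so it never appears in shell
    # history. The umask makes a newly created file owner-only.
    ( umask 077 && cat > "${TOKEN_FILE}" )
}

load_token() {
    # Print the stored token, or fail if none has been saved yet.
    [ -r "${TOKEN_FILE}" ] && cat "${TOKEN_FILE}"
}
```

Run save_token, paste the token, and press Ctrl-D. Later commands can then use "$(load_token)" wherever the token is needed.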

Step 3: Download tools

You can install all the tools you’ll need with the following command:

conda install conda-build conda-verify anaconda-client

Next, log into your Anaconda Cloud account via the command line. This will allow you to automatically upload your newly built packages.

To do this, run anaconda login and enter your credentials when prompted.
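Before moving on, it can help to confirm the tools actually landed on your PATH. The helper below is a small sketch; check_tools is a made-up name, and the executable names in the usage note are an assumption about what the three packages install.

```shell
check_tools() {
    # Report each tool as found or missing; return non-zero if any is missing.
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

With the environment from the install command active, you would call it as check_tools conda-build conda-verify anaconda.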

Creating a package skeleton

If this is your first conda package, you likely don’t want to make it from scratch. Thankfully, conda comes with the conda skeleton command to make building that first package easier.

Porting an existing package

If you’re porting an existing Perl package that exists on CPAN, for example, the conda skeleton command can import information for this package and create the metadata file and build scripts mentioned previously. At the time of writing, this can be done for packages on CPAN (Perl), CRAN (R), PyPI (Python), Luarocks (Lua), and RPM (generic).

For details on creating a skeleton from any of these repositories, see conda skeleton --help.
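The general shape of the command is conda skeleton <repository> <package>. The wrapper below is only a sketch that rejects repository names outside the list above before calling conda; make_skeleton is a hypothetical helper, and click is just an example package name.

```shell
make_skeleton() {
    repo="$1"   # one of: cpan, cran, pypi, luarocks, rpm
    pkg="$2"    # package name as it appears in that repository
    case "$repo" in
        cpan|cran|pypi|luarocks|rpm)
            conda skeleton "$repo" "$pkg"
            ;;
        *)
            echo "unsupported repository: $repo" >&2
            return 1
            ;;
    esac
}
```

For example, make_skeleton pypi click runs conda skeleton pypi click, which writes a recipe directory for that package into the current directory.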

Creating a new package

If you’re creating a brand new package, I’ve provided some empty templates in this GitHub Gist [3] that you can fill in as needed. These are very generic templates, and won’t suit every package someone may make. But they’re simple enough and contain most of the details you’ll need to at least get started.

You should grab the <language>_meta.yaml, <language>_build.sh, and <language>_bld.bat gists, for your desired language.

Alternatively, you can use the conda skeleton command like above for some similar or easy-to-install package, and then just remove that package’s information and replace it with your own.

You should now edit these build files as needed so that your package can be installed. This includes any compilation or make steps, adding libraries, etc. For details on what to put in these files, see the official documentation [1].
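As an illustration of a compilation step, here is a sketch of a build.sh for a source tree that uses the common configure and make workflow. It assumes your project ships a configure script; the example-recipe directory name is hypothetical. PREFIX and CPU_COUNT are environment variables that conda build sets for the build script.

```shell
mkdir -p example-recipe

cat > example-recipe/build.sh <<'EOF'
#!/usr/bin/env bash
# Stop at the first failing command.
set -euo pipefail

# Install into the build environment that conda build prepared.
./configure --prefix="${PREFIX}"

# Build in parallel; fall back to one job if CPU_COUNT is not set.
make -j"${CPU_COUNT:-1}"
make install
EOF
```

If your project uses a different build system, the same idea applies: whatever you install should end up under ${PREFIX}.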

Building the package

Building the package is accomplished with the conda build command. The simplest version of this command that takes into account everything mentioned so far is:

conda build --token <ANACONDA_CLOUD_API_TOKEN> --user <ANACONDA_CLOUD_USERNAME> meta.yaml

This command goes through the following steps:

  1. Reads the metadata.
  2. Downloads the source into a cache.
  3. Extracts the source into the source directory.
  4. Applies any patches.
  5. Re-evaluates the metadata, if source is necessary to fill any metadata values.
  6. Creates a build environment, and then installs the build dependencies there.
  7. Runs the build script. The current working directory is the source directory with environment variables set. The build script installs into the build environment.
  8. Performs some necessary post-processing steps, such as shebang and rpath.
  9. Creates a conda package containing all the files in the build environment that are new since step 6, along with the necessary conda package metadata.
  10. Tests the new conda package if the recipe includes tests:
    1. Deletes the build environment.
    2. Creates a test environment with the package and its dependencies.
    3. Runs the test scripts.
  11. Uploads the package to your Anaconda Cloud account.

This process tends to take a few minutes, but it doesn’t need any input from you while it runs.
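If you build often, the build command can be wrapped in a small function that reads the token and username from environment variables. This is only a sketch: build_and_upload is a made-up name, and the variable names simply mirror the placeholders used in the command.

```shell
build_and_upload() {
    recipe="${1:-.}"   # recipe directory; defaults to the current one

    # Refuse to run without credentials rather than fail late in the build.
    if [ -z "${ANACONDA_CLOUD_API_TOKEN:-}" ] || [ -z "${ANACONDA_CLOUD_USERNAME:-}" ]; then
        echo "set ANACONDA_CLOUD_API_TOKEN and ANACONDA_CLOUD_USERNAME first" >&2
        return 1
    fi

    conda build \
        --token "${ANACONDA_CLOUD_API_TOKEN}" \
        --user "${ANACONDA_CLOUD_USERNAME}" \
        "${recipe}"
}
```

After exporting both variables, build_and_upload path/to/recipe runs the same command as above.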

And that’s it, you’ve made your first conda package!

You can install it into your local environment by running:

conda install -c <ANACONDA_CLOUD_USERNAME> <PACKAGENAME>

Summary

This is a brief, but hopefully straightforward, overview of how to create your own conda package. It entails four main steps:

  • Creating an Anaconda Cloud account
  • Installing necessary build tools
  • Writing metadata and installation scripts
  • Building the package

The first two steps you only have to do once, and the last two steps are relatively simple if you have a simple codebase that doesn’t require a particularly finicky setup.

Hopefully, with this introduction, you’ll be able to create and distribute your own conda packages, and make it easier to create reproducible and portable computational environments for whatever kind of work you do.

Advanced tips and troubleshooting

Streamlining the upload process

If you don’t want to type out --token <ANACONDA_CLOUD_API_TOKEN> --user <ANACONDA_CLOUD_USERNAME> every time, you can add the following to your $HOME/.condarc configuration file:

anaconda_upload: true
conda-build:
  anaconda_token: <ANACONDA_CLOUD_API_TOKEN>

If you do run into issues uploading with conda build, try running conda build --no-anaconda-upload ..., then navigate to the output build directory (most likely ${CONDA_PREFIX}/conda-bld/<platform>/), and run anaconda upload <package>.
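That two-step fallback can be sketched as a function. Instead of navigating to the build directory by hand, it asks conda build where the package file is with the --output flag. The function name is hypothetical, and the sketch assumes the recipe produces a single package file.

```shell
build_then_upload() {
    recipe="${1:-.}"

    # Build without the automatic upload step.
    conda build --no-anaconda-upload "${recipe}" || return 1

    # --output prints the path of the package file for this recipe
    # without building it again.
    pkg="$(conda build "${recipe}" --output)" || return 1

    # Upload the built archive by hand.
    anaconda upload "${pkg}"
}
```

Call it as build_then_upload path/to/recipe after running anaconda login.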

Converting packages to multiple platforms

Conda packages will be created for the platform you create them on. So if you develop on a Linux machine, you’ll make Linux-based conda packages.

To make them portable to other platforms, you can use the conda convert command.

Start by navigating to the parent build directory where your package has been compiled. If your package has been created in ${CONDA_PREFIX}/conda-bld/<platform>/, navigate to ${CONDA_PREFIX}/conda-bld/. Then, run the convert command:

conda convert -p all -o . <current_platform>/<package>.tar.bz2

This will create packages for other platforms, such as osx-64 or win-32, in their respective platform folders. See conda convert --help for details.

You can then upload them all simultaneously to Anaconda Cloud with:

anaconda upload */<package>.tar.bz2

This tends to work best with simple Python-only packages, but it can work with more complicated ones. Since many R packages include compiled C code, you may need to force the conversion with the -f option. In other cases, you’ll get more stable package performance if you build the packages on your desired platform.
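The convert and upload steps can be combined into one helper, sketched below. convert_and_upload is a made-up name, and the function assumes you run it from the conda-bld directory, as described above.

```shell
convert_and_upload() {
    pkg="$1"   # e.g. <current_platform>/<package>.tar.bz2

    # Write converted packages into per-platform folders next to the original.
    conda convert -p all -o . "${pkg}" || return 1

    # Upload every platform's copy, including the original build.
    name="$(basename "${pkg}")"
    for built in */"${name}"; do
        [ -f "${built}" ] || continue
        anaconda upload "${built}"
    done
}
```

Uploading one file at a time means a failure for one platform does not stop the others from being uploaded.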

Continuous integration

The build and upload process can be included in your continuous integration pipelines, to streamline your development process. See my bed-jaccard [4] GitHub repo for an example.

PyPI integration

Conda build also plays nicely with PyPI. You can configure API tokens for your PyPI account so that conda build commands also automatically upload full Python packages to PyPI.

References & Footnotes