At CarbonPlan, we use two types of computational environments to support our data science work: local environments and Docker containers.
These environments support various activities across our workflow:
Environment Comparison

| Activity | Local Environment | Docker |
| --- | --- | --- |
| Development | ✅ Primary choice | ⚠️ Can be slower |
| Testing | ✅ Quick iterations | ✅ CI integration |
| Deployment | ❌ Not recommended | ✅ Best practice |
| Reproducibility | ⚠️ Limited | ✅ Excellent |
| Collaboration | ⚠️ Setup required | ✅ Consistent experience |
We primarily use two tools for managing local environments, depending on project needs.
Pixi is our recommended tool for managing local environments, especially for projects with complex geospatial dependencies (GDAL, rasterio, etc.).

Follow the installation instructions to set up Pixi:
# Install Pixi
curl -fsSL https://pixi.sh/install.sh | bash
# Initialize a new project
pixi init
# Add dependencies (pixi init sets conda-forge as the default channel)
pixi add numpy pandas xarray
pixi add gdal rasterio
# Run commands within the environment
pixi run python my_script.py
For projects that benefit from the broader conda ecosystem, you can use conda or its faster alternative, mamba.
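As a minimal sketch of the conda/mamba route, the snippet below prefers mamba when it is on the PATH and falls back to conda otherwise. The file name `environment.yml` and the variable names `CONDA_TOOL` and `ENV_FILE` are assumptions for illustration, not part of any CarbonPlan project.

```shell
# Prefer mamba when it is installed; otherwise fall back to conda.
if command -v mamba >/dev/null 2>&1; then
  CONDA_TOOL=mamba
else
  CONDA_TOOL=conda
fi

# Hypothetical environment file in the current directory.
ENV_FILE="environment.yml"

if [ -f "$ENV_FILE" ] && command -v "$CONDA_TOOL" >/dev/null 2>&1; then
  # Create the environment described by the file.
  "$CONDA_TOOL" env create -f "$ENV_FILE"
else
  echo "Skipping: need $CONDA_TOOL on PATH and $ENV_FILE in the current directory"
fi
```

Both tools accept the same `env create -f` invocation, so the rest of a project's setup instructions need not care which one was picked.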
We use Docker to create and manage containerized environments for deployment, testing, and reproducibility, and we publish our Docker images to Quay.io.
Docker is particularly valuable when an environment must behave the same way in CI, in deployment, and on every collaborator's machine.
We typically use repo2docker to create Docker images from GitHub repositories.
Setup: Create environment files in your repository:
- environment.yml for conda dependencies
- requirements.txt for pip dependencies
- apt.txt for system dependencies

Building locally:
python -m pip install jupyter-repo2docker
repo2docker --no-run path/to/your/repo
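To make the Setup step above concrete, here is a rough sketch that writes two of the environment files into a scratch directory. The directory, the environment name, and the package lists are all illustrative assumptions; a real repository would list its own dependencies, and pip-only dependencies would go in requirements.txt, one requirement per line.

```shell
# Scratch directory standing in for your repository (hypothetical path).
REPO_DIR="$(mktemp -d)"

# Conda dependencies; the environment name and packages are illustrative only.
cat > "$REPO_DIR/environment.yml" <<'EOF'
name: example-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - xarray
  - rasterio
EOF

# System packages installed with apt, one per line (illustrative).
cat > "$REPO_DIR/apt.txt" <<'EOF'
gdal-bin
EOF

# Show the files that repo2docker would pick up.
ls "$REPO_DIR"
```

Pointing the build command at this directory (`repo2docker --no-run "$REPO_DIR"`) would then build an image from these files.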
Automated builds: We use GitHub Actions to build and push images to Quay.io whenever changes are pushed to the repository. An example GitHub Actions workflow for building and pushing a Docker image can be found in the carbonplan/argo-docker repository.
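For orientation, the shell steps that such a workflow might run could look roughly like the sketch below. This is not the workflow from the carbonplan/argo-docker repository: the organization, image name, and the `QUAY_USERNAME` / `QUAY_PASSWORD` variables are placeholders assumed here, and the build and push only run when those credentials are set.

```shell
# All names below are placeholders, not CarbonPlan's actual configuration.
QUAY_ORG="example-org"
IMAGE_NAME="example-image"
# GITHUB_SHA is set by GitHub Actions; fall back to "latest" elsewhere.
TAG="${GITHUB_SHA:-latest}"
IMAGE="quay.io/${QUAY_ORG}/${IMAGE_NAME}:${TAG}"

if [ -n "${QUAY_USERNAME:-}" ] && [ -n "${QUAY_PASSWORD:-}" ]; then
  # Log in without putting the password on the command line.
  printf '%s' "$QUAY_PASSWORD" | docker login quay.io -u "$QUAY_USERNAME" --password-stdin
  # Build the image from the current repository without starting a container.
  repo2docker --no-run --image-name "$IMAGE" .
  # Publish the image to the registry.
  docker push "$IMAGE"
else
  echo "Credentials not set; would build and push $IMAGE"
fi
```

Tagging with the commit SHA keeps each published image traceable to the exact revision that produced it, which supports the reproducibility goal described earlier.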