Python for climate scientists

where to start

This page is a work in progress.

If you’re at the beginning of your Python experience, and you’re not totally sure where to begin, I’m writing this for you.

And I have to say, if this is your first foray into Python, I’m very excited for you! It’s easier now than ever before to manage Python on an operating system. For you, that means less time banging your head against a keyboard trying to get your libraries to cooperate with one another, and more time 👏 learning 👏 that 👏 language.

Are you excited yet?!


Note: Most of the tips here are based on experience with macOS. If you’re a Windows or Linux user, use this resource alongside documentation specific to your OS.


1. To get Python working, install Anaconda or Miniconda

I recommend using either Anaconda or Miniconda to install and maintain Python on your machine(s).

  • Anaconda and Miniconda are software packages that install the Python language, some other useful packages, and most importantly, conda.
  • conda itself is an open-source package manager that installs libraries and keeps their versions compatible with one another. (It was originally built for Python, but it’s technically language-agnostic. So if you often find yourself using other open-source languages like R or Julia, conda is a great way to maintain them.)

Should I download Anaconda or Miniconda?

If you’re brand new to Python, Anaconda will probably be the safer bet. It’s a little bulky and takes a little longer to install, but it also gives you the most options while you learn the language. Miniconda is a stripped-down version of Anaconda, so if you don’t have much disk space, go with that. Personally, I like Miniconda, since it’s more lightweight for laptops, shared computers, or login nodes where disk space is limited.

Why conda?

  1. I’m getting a little ahead of myself, but with conda you can install multiple parallel “environments” of Python (or whatever other languages you prefer). That means you can have one Python installation you use most of the time and an older one that works with some random chunk of code you inherited from someone who worked on an older version of Python.

  2. Another powerful aspect of conda is that you can use it to install non-Python software. For example, you can set up a single environment or separate environments for NCO (NetCDF Operators), CDO (Climate Data Operators), and NCL (NCAR Command Language). This, in my opinion, is what makes it so valuable; see the post on setting up these environments for more information.

  3. Anaconda/Miniconda are free (their developer, Anaconda, Inc., formerly Continuum Analytics, offers proprietary add-ons, but there’s no reason you’ll ever need those).
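
Once you have more than one environment, it helps to be able to ask Python which one it is running in. The snippet below is only a sketch of one way to do that: describe_environment is a hypothetical helper name, and it assumes conda has exported the CONDA_DEFAULT_ENV variable for the active environment (it reports "not set" otherwise).

```python
import os
import sys


def describe_environment():
    """Return basic facts about the Python interpreter that is running."""
    return {
        # Version of the interpreter as major.minor.micro
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # Path to the python executable in use
        "executable": sys.executable,
        # Root directory of the active installation or environment
        "prefix": sys.prefix,
        # Name of the active conda environment, if conda exported one
        # (assumption: conda sets CONDA_DEFAULT_ENV on activation)
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV", "not set"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it from two different environments and the executable and prefix lines should point to different directories.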

Alternatives to Anaconda/Miniconda

“So what about pip for installing Python packages? I know someone who seems to prefer that.” Sure, pip is great! But pip installs Python packages, not Python itself, and for what it’s worth, it comes with Anaconda and Miniconda, so you may as well go with one of those. They work well together.

“Hmm… ok, and what about Canopy? I think I met a ghost once who uses it!” Canopy seems like it could be great, but it’s not free to use the full distribution, and the Python community really seems to be gathering around conda these days.

2. Mess around with conda and get the hang of it

Alright, now that you’ve decided which one you want, install it and start learning about conda. I can’t do this as much justice as the half-hour “Getting started with conda” guide in the conda documentation. That will get you where you need to be.

3. Install the libraries you’ll need most

If you have Anaconda, some of these will already be on your system. If instead you went with Miniconda, you’ll likely need to grab a few extra things. The most useful libraries for any Python installation are below:

library       main use
numpy, scipy  core Python tools
matplotlib    plotting
jupyter       Jupyter Notebook and related tools

My favorites more specific to Earth science data analysis include:

library     main use
cartopy     plotting maps
pandas      loading/saving .csv, .txt, and other spreadsheet files; general panel data statistics package
xarray      pandas for 3+ dimensions; NetCDF and HDF input/output; quick plotting
netcdf4     NetCDF data analysis; great as secondary option to xarray
gdal        library and packages for the Geospatial Data Abstraction Library; useful for reading in HDF and GeoTIFF files (remote sensing data sets)
wrf-python  wrapper for Fortran functions that analyze WRF output
seaborn     more plotting options; has a nice color bar builder and interfaces with ColorBrewer
cmocean     really great color blind-friendly colormaps

You can install these one at a time:

conda install matplotlib

Or you can install a bunch at once, e.g.:

conda install numpy scipy matplotlib pandas xarray jupyter

If you install cartopy, I’d recommend going for the conda-forge channel option (per the recommendation of the folks over at SciTools, who develop it):

conda install -c conda-forge cartopy

Warning: I think cartopy is the best option for geospatial plotting, since it’s the replacement for the soon-to-be-retired basemap and will give you less grief down the line. But you may be inheriting code from others, or you may simply prefer basemap for one reason or another. In that case, be aware that the two don’t work well together (the short reason: cartopy uses a package called shapely, and basemap doesn’t work with it installed). So you’ll have to pick just one for your default environment.

4. Go forth and code

Make sure you install the four packages in the first table above (numpy, scipy, matplotlib, and jupyter). This will get you the main ingredients you need to get familiar with Python.
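
If you’re not sure whether those four are already on your system, a quick check along these lines can tell you. Treat it as a sketch: find_missing and CORE_PACKAGES are hypothetical names, and it assumes each library’s import name matches the name you install it under, which doesn’t hold for every library (netcdf4, for example, imports as netCDF4).

```python
import importlib.util

# The four packages from the first table above.
CORE_PACKAGES = ["numpy", "scipy", "matplotlib", "jupyter"]


def find_missing(packages):
    """Return the names in `packages` that Python cannot find here."""
    # find_spec returns None when no top-level module has that name
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(CORE_PACKAGES)
    if missing:
        print("Missing:", ", ".join(missing))
        print("Try: conda install " + " ".join(missing))
    else:
        print("All core packages are installed.")
```

Because it only looks the modules up without importing them, it runs quickly even in a large environment.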

Probably the easiest way to learn the language is with Jupyter Notebook, which runs Python through a browser window and gives you a great interface where you can add notes, images, and even LaTeX to document your workflow.

To start up a notebook, navigate to the directory on your computer where you’d like to save it, then type:

jupyter notebook

If all works smoothly, your default web browser will pop up with a window, and you’re good to go.

I recommend skimming the Jupyter Notebook documentation and going deeper on some tutorials from here. YouTube has plenty of good videos, and here are some website options:

5. Optional: Read about how to set up some useful conda environments

Check out my approach on setting up NCO, CDO, and NCL using conda alone.
