Python for Scientists
Latest revision as of 00:07, 5 December 2019
Python Course Outline
Note: This is an outline for a future course currently in development.
Introduction to Python
Installation
Not intended as a CMS support topic; this covers only how to get started quickly.
- Enthought (self-contained python/numpy/scipy/matplotlib installation)
- Distribution packages (apt-get, yum, MacPorts)
Starting Python
- Python's interactive shell
- Python scripting (the #!/usr/bin/env python shebang line)
- IPython, PyLab: Interactive Matlab-like environments
- Spyder: A python-based GUI to emulate Matlab (not recommended?)
Basics
Focus on the major differences from Matlab and other programming languages:
- Forced indentation
- Built-in data types: bool, int, float, complex, strings, lists, tuples, dicts, files, ...
- Indexing and Slicing (0..N-1, -1, -2, etc.)
- Aliasing: variables as labels
- Conditional and loop syntax (if/elif/else, while, etc.)
- Iterators: Looping without indexing ('for x in list:')
- Possibly some itertools examples?
- Functions
- Simple string manipulation
- Objects: Not OO design, just the pervasiveness of objects
- e.g. anything (variables, functions, etc.) can be function arguments
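A minimal sketch of several of the points above: zero-based indexing, negative indices, slicing, aliasing, looping without an index, and passing a function as an argument. All variable and function names here are illustrative only.

```python
# Zero-based indexing and slicing: valid indices run from 0 to N-1,
# and negative indices count back from the end.
values = [10, 20, 30, 40, 50]
first = values[0]        # first element
last = values[-1]        # last element
middle = values[1:4]     # elements 1, 2, 3 (the end index is excluded)

# Aliasing: assignment attaches a second label to the same list;
# it does not copy the data.
alias = values
alias[0] = 99                  # also changes values[0]
values_copy = list(values)     # an explicit (shallow) copy is independent
values_copy[1] = -1            # values[1] is unchanged

# Iteration without an index variable.
total = 0
for x in values:
    total += x

# Functions are objects too, so they can be passed as arguments.
def apply(func, data):
    """Call func on every item of data and return the results as a list."""
    return [func(item) for item in data]

doubled = apply(lambda v: 2 * v, middle)
```

Note that the slice `middle` is a new list, so it is not affected by the later change made through `alias`.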
Modules
- Using modules (import xyz, import xyz as x, from xyz import y)
- There are lots of modules for any given task. We should settle on a few recommendations, but not be afraid to change our minds in the future.
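The three import forms could be illustrated with standard library modules, as in the sketch below. The file name is hypothetical.

```python
import math                   # plain import: use as math.sqrt(...)
import os.path as osp         # import under a shorter alias
from math import pi           # import a single name into this namespace

radius = 2.0
area = pi * radius ** 2
root = math.sqrt(16.0)

# "model_output.nc" is a made-up file name used only for illustration.
ext = osp.splitext("model_output.nc")[1]
```

The plain and aliased forms keep the module name visible at each call, which makes it easier to see where a function comes from in a long script.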
Numpy
Numpy vs Matlab
- numpy arrays vs. Matlab matrices
- Array generation and manipulation (arange, linspace, zeros, meshgrid)
Review basic operations:
- Arithmetic on arrays
- built-in functions (sqrt, sin, etc.) and constants (pi)
- Numpy array indexing (syntax, slicing, etc.)
- Basic manipulations (concatenation, reshaping, tiling, vstack/hstack, etc.)
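A short sketch of the array generation and basic operations listed above. The array names and sizes are arbitrary choices for illustration.

```python
import numpy as np

# Array generation
i = np.arange(5)                         # integers 0, 1, 2, 3, 4
x = np.linspace(0.0, 1.0, 5)             # 5 evenly spaced points, endpoints included
z = np.zeros((2, 3))                     # 2 x 3 array of zeros
xx, yy = np.meshgrid(x, np.arange(3))    # 2D coordinate arrays, shape (3, 5)

# Elementwise arithmetic, built-in functions and constants
y = np.sin(np.pi * x) + np.sqrt(x)

# Indexing and slicing follow the same rules as Python lists
tail = x[-2:]                            # last two elements
row = xx[0, :]                           # first row of a 2D array

# Basic manipulations
col = i.reshape(5, 1)                    # column vector, shape (5, 1)
stacked = np.vstack([x, y])              # shape (2, 5)
joined = np.hstack([x, y])               # shape (10,)
same = np.concatenate([x, y])            # equivalent to hstack for 1D arrays
tiled = np.tile(i, 2)                    # 0..4 repeated twice
```

One difference worth stressing for Matlab users: `*` on numpy arrays is elementwise, and the matrix product is written `a @ b` or `np.dot(a, b)`.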
Performance
- Introduce numpy as a series of job submissions to the C libraries
- Vectorised arithmetic (avoiding index loops)
- An Intel MKL numpy is often several times faster than a gcc BLAS numpy (though I have never actually confirmed this)
- numexpr: An easy-to-use, high-performance alternative to numpy for certain tasks, includes limited parallelisation options.
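To make the vectorisation point concrete, the sketch below computes the same result with an index loop and with a whole-array expression. The function names are hypothetical. The vectorised form is typically much faster, but no timings are quoted here because they depend on the machine and the numpy build.

```python
import numpy as np

def loop_sum_of_squares(a, b):
    """Index loop: every iteration runs in the Python interpreter."""
    out = np.empty_like(a)
    for k in range(a.size):
        out[k] = a[k] ** 2 + b[k] ** 2
    return out

def vectorised_sum_of_squares(a, b):
    """Whole-array expression: the loops run inside compiled numpy code."""
    return a ** 2 + b ** 2

a = np.linspace(0.0, 1.0, 1000)
b = np.linspace(1.0, 2.0, 1000)
slow = loop_sum_of_squares(a, b)
fast = vectorised_sum_of_squares(a, b)
```

In a course setting, the two functions could be compared with the standard `timeit` module so that students measure the difference on their own machines.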
Masked arrays (numpy.ma)
(I have only some limited experience with ma, but it is very useful for land masks in CMIP ocean output)
- Mask creation (inc. logical numpy operators)
- Masked array manipulation
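A minimal sketch of a land mask, assuming a small made-up temperature field in which the value -999.0 marks land points. The field, the fill value, and the 19.0 threshold are all invented for illustration.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical 2 x 3 sea surface temperature field; -999.0 marks land.
fill = -999.0
sst = np.array([[15.0, fill, 17.0],
                [fill, 18.0, 20.0]])

# Mask creation from a logical expression (True means "masked out")
land = (sst == fill)
sst_masked = ma.masked_array(sst, mask=land)

# Masks can be combined with numpy's logical operators
warm_or_land = np.logical_or(land, sst > 19.0)
sst_cool = ma.masked_array(sst, mask=warm_or_land)

# Reductions skip masked points
ocean_mean = sst_masked.mean()        # mean of 15, 17, 18 and 20
n_ocean = sst_masked.count()          # number of unmasked points
filled = sst_masked.filled(np.nan)    # back to a plain array, NaN on land
```

Without the mask, the fill values would dominate the mean, which is the usual reason to mask land points before computing ocean statistics.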
SciPy
I use scipy in a very ad-hoc manner and don't have a comprehensive grasp of its features, but the following come to mind (with examples):
- Interpolation (scipy.interpolate)
- Earth grid generation
- Statistics (scipy.stats, numpy.random)
- Signal processing (scipy.signal)
- Time series analysis, filtering
- Linear Algebra
- SVD, EOF/PC analysis
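As one possible example for the SVD and EOF/PC item, the sketch below decomposes a synthetic time-by-space anomaly matrix. It uses numpy.linalg.svd; scipy.linalg.svd returns the same three factors. The data, sizes, and variable names are assumptions made for illustration, and no area weighting is applied.

```python
import numpy as np

# Synthetic data with hypothetical sizes: 50 time steps, 8 spatial points.
# One oscillating spatial pattern plus weak random noise.
rng = np.random.default_rng(0)
nt, nx = 50, 8
t = np.arange(nt)
pattern = np.sin(np.linspace(0.0, np.pi, nx))
data = np.outer(np.cos(2.0 * np.pi * t / 12.0), pattern)
data += 0.01 * rng.standard_normal((nt, nx))

# Remove the time mean at each point to form anomalies
anom = data - data.mean(axis=0)

# Singular value decomposition: anom = u @ diag(s) @ vt
u, s, vt = np.linalg.svd(anom, full_matrices=False)

eofs = vt                            # rows are spatial patterns (EOFs)
pcs = u * s                          # columns are principal component time series
var_frac = s ** 2 / np.sum(s ** 2)   # fraction of variance in each mode
```

Because the synthetic field contains a single pattern, almost all of the variance falls in the first mode, and `pcs @ eofs` reconstructs the anomalies.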
Image Analysis
Note: ANU GFD has a strong fluid dynamics laboratory, and CoE researcher Andy Hogg often supervises students in lab experiments.
Python Imaging Library (PIL)
- I have no experience with PIL, but it has been used by at least one student here. It is a natural alternative to Matlab's built-in image analysis tools.
- I don't regard image analysis as CMS work, but including it in a course would help promote collaboration here at ANU.
I/O
NetCDF
- scipy.io.netcdf:
- Good performance
- Included with scipy
- Imperfect NetCDF implementation (occasional garbage 'inf' data)
- netcdf4-python:
- Complete(?) NetCDF3/4 support
- Reduced performance
- Nontrivial installation
ASCII/raw text input
- I haven't done this in python, but there is still an occasional need for this (esp. for old data sets)
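One way to read simple column-based text data is numpy.loadtxt. The sketch below uses an in-memory string in place of a file; the column names and values are invented for illustration.

```python
import io
import numpy as np

# Stand-in for an old whitespace-delimited data file (hypothetical contents).
text = io.StringIO(
    "# year  temp  salt\n"
    "1990    14.2  35.1\n"
    "1991    14.5  35.0\n"
    "1992    14.1  35.3\n"
)

# loadtxt skips lines starting with '#' by default;
# unpack=True returns one array per column.
year, temp, salt = np.loadtxt(text, unpack=True)
```

For a real data set, a file path is passed in place of the StringIO object. numpy.genfromtxt is the usual alternative when the file has missing values.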
pydap
- Simplified interface to access netcdf files via OPeNDAP
PyTables
- HDF5 support, good performance
- (I haven't used this much)
Plotting
Matplotlib (2D plotting)
- Line/curve plots
- Scatter plots
- Field plots: Contours, image maps, etc.
- Subplots
- Frills (labels, legends, arrows, etc.)
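A compact sketch touching most of the items above: subplots, a line plot, a scatter plot, a filled contour plot, and a few labels. The plotted functions are arbitrary, and the non-interactive Agg backend is chosen only so that the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")    # non-interactive backend; no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 100)
xx, yy = np.meshgrid(x, x)

# Two side-by-side subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Line and scatter plots with an axis label and a legend
ax1.plot(x, np.sin(x), label="sin(x)")
ax1.scatter(x[::10], np.cos(x[::10]), label="cos(x) samples")
ax1.set_xlabel("x")
ax1.legend()

# Filled contour plot of a 2D field, with a colour bar and a title
cs = ax2.contourf(xx, yy, np.sin(xx) * np.cos(yy))
fig.colorbar(cs, ax=ax2)
ax2.set_title("sin(x) cos(y)")

# Write the figure to an in-memory PNG
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

In interactive use, the backend line is dropped and `plt.show()` replaces the save step; passing a file name to `savefig` writes the figure to disk.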
Mayavi (3D plotting)
- I have never used this, but it's the only option that I know of
Shell Interface (OS)
The ANU group uses python for job submissions of numerical models on vayu, so there may be some interest in how to run subprocesses and manipulate files through python scripts, as if it were a traditional shell script.
- os, sys, shutil
- subprocess
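The sketch below shows how these modules can stand in for a short shell script: creating directories, copying a file, running a command, and cleaning up. The file and directory names are hypothetical, and the Python interpreter itself is used as a placeholder for a model executable or job submission command.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Work in a temporary directory so the sketch leaves nothing behind.
workdir = tempfile.mkdtemp()
config = os.path.join(workdir, "input.nml")    # hypothetical file name
with open(config, "w") as f:
    f.write("&run\n  days = 10\n/\n")

# File manipulation, as in a shell script (mkdir -p, cp, ls)
rundir = os.path.join(workdir, "run01")
os.makedirs(rundir)
shutil.copy(config, rundir)
listing = sorted(os.listdir(rundir))

# Run a subprocess and capture its output. The Python interpreter
# stands in for a model executable or job submission command.
result = subprocess.run(
    [sys.executable, "-c", "print('model finished')"],
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,    # raise an exception if the command fails
)
output = result.stdout.strip()

# Clean up (rm -r)
shutil.rmtree(workdir)
```

Passing the command as a list avoids shell quoting problems, and `check=True` turns a non-zero exit status into a Python exception instead of a silent failure.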
Earth Science Tools and Modules
basemap (matplotlib)
- geographic plotting
- similar to m_map in Matlab
datetime, calendar
- Useful for calendar tracking
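A brief sketch of calendar tracking with the standard library, using an arbitrary start date and run length.

```python
import calendar
from datetime import datetime, timedelta

# Advance a clock by 60 days from 1 January 2000 (a leap year)
start = datetime(2000, 1, 1)
stop = start + timedelta(days=60)
elapsed = stop - start

leap = calendar.isleap(2000)
days_in_feb = calendar.monthrange(2000, 2)[1]    # number of days in February 2000
label = stop.strftime("%Y-%m-%d")                # formatted date string
```

These modules use the proleptic Gregorian calendar only, so model calendars such as no-leap or 360-day years need separate handling.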
gridding?
- Are there generic gridding modules in python (e.g. Mercator, cubed sphere, tripolar)?