Ensembles Open Notebook

Monday May 11, 2009

This is an open notebook science post about my recent work. My boss and I are interested in assessing how well coupled models from the ENSEMBLES project reproduce the feedback between sea surface temperatures (SST) in the South Atlantic Ocean and the South Atlantic Convergence Zone (SACZ).

Data

I’m using data from the stream 2 seasonal runs. The data is available on an OPeNDAP server, but I found it easier to download everything from their FTP server. I downloaded the 14-month hindcasts that start in November, since the SACZ is stronger during austral summer. Here’s the URL pattern for the data I’m using:

http://ensembles.ecmwf.int/download/ensembles/stream2/seasonal/atmospheric/monthly/${VAR}/FC_${VAR}_mon_[1960-2005]1101.nc

For precipitation, $VAR is 228 (see table). Each year from 1960 through 2005 is a separate file, and the integrations start on November 1st (hence the 1101 in the file name).
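
To make the pattern concrete, here is a minimal sketch that expands it into one URL per year. The helper name hindcast_urls and its arguments are hypothetical, introduced only for illustration; the base path and file naming are copied from the pattern above.

```python
# Base path copied from the URL pattern in the post.
BASE_URL = (
    "http://ensembles.ecmwf.int/download/ensembles/"
    "stream2/seasonal/atmospheric/monthly"
)


def hindcast_urls(code, first_year=1960, last_year=2005):
    """Return one URL per November-start hindcast (hypothetical helper)."""
    return [
        "%s/%s/FC_%s_mon_%d1101.nc" % (BASE_URL, code, code, year)
        for year in range(first_year, last_year + 1)
    ]


urls = hindcast_urls("228")  # 228 is the precipitation code
print(len(urls))
print(urls[0])
```

The list can then be handed to whatever download tool is convenient; the sketch only builds the strings and does not touch the network.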

After the data is downloaded I have 46 files for each variable. I wrote a Python script to concatenate them into a single 5D NetCDF file that GrADS can understand, to make my boss’s life easier. Since I use Python, even binary blobs would be fine for me. The script uses pynetcdf, since I was running out of memory with pupynere.

import sys

from numpy import swapaxes
# pupynere ran out of memory on these files, so pynetcdf is used instead.
##from pupynere import NetCDFFile as netcdf_file
from pynetcdf import NetCDFFile as netcdf_file

# dimensions
MEMBERS = 45
TIME = 46  # 1960--2005
INTEGRATION = 14
LATITUDE = 73
LONGITUDE = 144

VARS = {
        '228': 'prlr',
        '139': 'ts',
        '151': 'psl',
        '164': 'clt',
        '165': 'uas',
        '166': 'vas',
        '167': 'tas',
        '169': 'rsds',
        '175': 'rlds',
        '176': 'rss',
        '177': 'rls',
        }

years = range(1960, 2005+1)
code = sys.argv[1]
var = VARS[code]

out = netcdf_file('%s.nc' % var, 'w')

# Ensemble dimension.
out.createDimension('ensemble', MEMBERS)
ensemble = out.createVariable('ensemble', 'i', ('ensemble',))
ensemble[:] = range(MEMBERS)
ensemble.grads_dim = 'e'
ensemble.axis = 'e'

# Time axis. Data will be filled later.
out.createDimension('time', TIME)
time = out.createVariable('time', 'f', ('time',))
time.units = 'days since 1950-01-01 00:00:00'

# Other axes. The 14 hindcast months are stored on the z axis.
out.createDimension('z', INTEGRATION)
z = out.createVariable('z', 'i', ('z',))
z[:] = range(INTEGRATION)
z.units = 'level'
z.axis = 'z'
out.createDimension('latitude', LATITUDE)
out.createDimension('longitude', LONGITUDE)

data = out.createVariable(var, 'f', 
        ('ensemble', 'time', 'z', 'latitude', 'longitude'))

for i, year in enumerate(years):
    inp = netcdf_file('FC_%s_mon_%d1101.nc' % (code, year))

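    # Store the start date of this hindcast, then its data with the first
    # two axes swapped so that the ensemble member comes first.
    time[i] = inp.variables['reftime'][0]
    data[:,i,:,:,:] = swapaxes(inp.variables[var][:,:,:,:], 0, 1)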

    # Copy coordinate values and attributes from the first file only.
    if i == 0:
        latitude = out.createVariable('latitude', 'f', ('latitude',))
        latitude[:] = inp.variables['latitude'][:]
        for k in dir(inp.variables['latitude']):
            v = getattr(inp.variables['latitude'], k)
            if not hasattr(latitude, k): setattr(latitude, k, v)

        longitude = out.createVariable('longitude', 'f', ('longitude',))
        longitude[:] = inp.variables['longitude'][:]
        for k in dir(inp.variables['longitude']):
            v = getattr(inp.variables['longitude'], k)
            if not hasattr(longitude, k): setattr(longitude, k, v)

        for k in dir(inp.variables[var]):
            v = getattr(inp.variables[var], k)
            if not hasattr(data, k): setattr(data, k, v)

    inp.close()
out.close() 
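
The axis shuffle in the loop can be checked in isolation with plain NumPy. This is only a sketch: it assumes each yearly file is laid out as (month, member, latitude, longitude), which is inferred from the swapaxes call rather than confirmed against the files, and it uses tiny made-up sizes and tagged values instead of real data.

```python
import numpy as np

# Tiny stand-in sizes so the example runs instantly; the real files are
# assumed to hold (14 months, 45 members, 73 latitudes, 144 longitudes).
MONTHS, MEMBERS, YEARS, NLAT, NLON = 3, 2, 4, 5, 6

combined = np.zeros((MEMBERS, YEARS, MONTHS, NLAT, NLON), dtype="f")

for i in range(YEARS):
    # Fake one yearly file, tagging each value with its year, month and member.
    one_year = np.empty((MONTHS, MEMBERS, NLAT, NLON), dtype="f")
    for m in range(MONTHS):
        for e in range(MEMBERS):
            one_year[m, e] = 100 * i + 10 * m + e
    # Swap the first two axes so the member axis leads, as in the script.
    combined[:, i, :, :, :] = np.swapaxes(one_year, 0, 1)

print(combined.shape)
# Member 1, year 2, month 0 should carry the tag 201.
print(combined[1, 2, 0, 0, 0])
```

If the real files turn out to use a different axis order, the swap in the script would need to change accordingly.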

I’m currently downloading 11 variables for my analyses. Running the concatenation script on each variable combines its 46 files into a single 1.2 GB NetCDF file that GrADS can open.
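
As a sanity check on that file size, the dimensions declared at the top of the script can be multiplied out. The sketch assumes the 'f' type code means 4-byte floats and ignores the small overhead of the header and coordinate variables.

```python
# Dimension sizes taken from the constants in the script.
MEMBERS, TIME, INTEGRATION, LATITUDE, LONGITUDE = 45, 46, 14, 73, 144
BYTES_PER_VALUE = 4  # assumption: 'f' is a 32-bit float

n_values = MEMBERS * TIME * INTEGRATION * LATITUDE * LONGITUDE
n_bytes = n_values * BYTES_PER_VALUE

print(n_values)
print(n_bytes / 1e9)  # size of the main variable in gigabytes
```

The main variable alone accounts for roughly 1.2 GB, which matches the size of the combined file.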

Roberto De Almeida
