Functions for data definition and access

Except for the first three paragraphs, this section is for advanced use. As a first step, you should consider using the built-in data definitions described at projects. You may need to come back to this section for reference.

ds : define a dataset object (actually a front-end for `cdataset`)

climaf.classes.ds(*args, **kwargs)

Returns a dataset from its full Climate Reference Syntax string. Example:

>>> ds('CMIP5.historical.pr.[1980].global.monthly.CNRM-CM5.r1i1p1.mon.Amon.atmos.last')

Also a shortcut for cdataset(), when used with only keyword arguments. Example:

>>> ds(project='CMIP5', model='CNRM-CM5', experiment='historical', frequency='monthly',
...    simulation='r2i3p9', domain=[40,60,-10,20], variable='tas', period='1980-1989', version='last')

In that latter case, you may use e.g. period=’last_50y’ to get the last 50 years (or less) of data; but this will work only if no dataset attribute is ambiguous. ‘first_50y’ works similarly, and so does period=’*’.
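The "last 50 years (or less)" rule can be pictured with a small sketch. This is not CliMAF code: the helper name `resolve_relative_period` and the representation of the available data as a pair of inclusive years are assumptions made for illustration only.

```python
import re

def resolve_relative_period(spec, first_year, last_year):
    """Hypothetical helper: resolve 'last_Ny' or 'first_Ny' against the
    inclusive range of years [first_year, last_year] assumed to be available."""
    match = re.fullmatch(r"(last|first)_(\d+)y", spec)
    if match is None:
        raise ValueError("not a relative period: %r" % spec)
    which, count = match.group(1), int(match.group(2))
    # "or less": never ask for more years than the data actually covers
    count = min(count, last_year - first_year + 1)
    if which == "last":
        return (last_year - count + 1, last_year)
    return (first_year, first_year + count - 1)

print(resolve_relative_period("last_50y", 1850, 2005))   # (1956, 2005)
print(resolve_relative_period("first_50y", 1990, 2005))  # (1990, 2005)
```

The second call shows the "or less" case: only 16 years are available, so the whole range is returned.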

For further details, refer to the documentation of cdataset().

cdataset : define a dataset object

class climaf.classes.cdataset(**kwargs)

Create a CLIMAF dataset.

A CLIMAF dataset is a description of what the data is (rather than the data itself or a file). It is basically a set of attribute-value pairs. The list of attributes actually used to describe a dataset is defined by the project it refers to.

To display the attributes you may use for a given project, type e.g.:

>>> cprojects["CMIP5"]

For further details on projects, see cproject

None of the project’s attributes are mandatory arguments, because all attributes default to the value set by cdef() (which also applies when providing a None value for an attribute)

Some attributes have a special format or processing :

period : see init_period(). See also function climaf.classes.ds() for added flexibility in defining periods as the last or first set of years among available data
domain : allowed values are either ‘global’ or a list for latlon corners ordered as in : [ latmin, latmax, lonmin, lonmax ]
variable : name of the geophysical variable ; this should be :
- either a variable actually included in the datafiles,
- or a ‘derived’ variable (see derive() ),
- or, an aliased variable name (see calias() )

check : optional argument that drives the check of the period covered by the datafiles w.r.t. the period defined for the dataset; allowed values are True, False and “if_found”; the latter means: do check, except if there is no data file for the dataset; this is intended for cases where datafiles are no longer accessible while the user expects to get processed data from the cache. Default value is env.environment.data_check. An error is raised if check fails.
check_type : defines the extent of period check; default value is env.environment.period_check_type; allowed values are :
- ‘none’ : don’t check period
- ‘light’ : checks that the period indicated by dates in data filenames includes dataset’s period (see method light_check())
- ‘medium’ : checks that the period covered by data in files includes dataset’s period (see method check())
- ‘full’ : in addition to case ‘medium’, also checks for gaps in data, and for frequency (see method check())
If check is True (or “if_found”, and some datafile exists), an error is raised if the check fails.
in project CMIP5, for the attributes (frequency, simulation, period, table): if any is ‘fx’ (or ‘r0i0p0’ for simulation), the others are forced to ‘fx’ (resp. ‘r0i0p0’) too.
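Since the corner order of a domain list (latitudes first, then longitudes) is easy to get wrong, a tiny helper with named arguments can make calls more readable. The helper `make_domain` below is hypothetical and not part of CliMAF; it only builds the list in the documented order and checks the latitudes.

```python
def make_domain(latmin, latmax, lonmin, lonmax):
    """Hypothetical helper: return a domain list in the documented order
    [latmin, latmax, lonmin, lonmax]."""
    if not -90 <= latmin <= latmax <= 90:
        raise ValueError("expected -90 <= latmin <= latmax <= 90")
    # Longitudes are not ordered here, as a box may straddle the date line
    return [latmin, latmax, lonmin, lonmax]

domain = make_domain(latmin=40, latmax=60, lonmin=-10, lonmax=20)
print(domain)  # [40, 60, -10, 20]
```

The resulting list is what would be passed as the domain attribute of a dataset.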

Example, using no default value, and addressing some CMIP5 data

>>> cdataset(project='CMIP5', model='CNRM-CM5', experiment='historical', frequency='monthly',
...          simulation='r2i3p9', domain=[40,60,-10,20], variable='tas', period='1980-1989', version='last')

You may use wildcards (‘*’) in attribute values, and call explore() to have CliMAF match such attributes against the available data

cdataset.explore: explore data and periods, and match wildcard attributes

cdataset.explore(option='check_and_store', group_periods_on=None, operation='intersection', first=None)

Versatile datafile exploration for a dataset which possibly has wildcards (* and ? ) in attributes.

option can be :

‘choices’ for returning a dict whose keys are wildcard attributes and whose entries are lists of values

‘resolve’ for returning a NEW DATASET with instantiated attributes (if each has a unique value)

‘ensemble’ for returning AN ENSEMBLE based on multiple possible values of one or more attributes (tell which one is first in labels by using arg ‘first’)

‘check_and_store’ (or missing) for just identifying and storing dataset files list (while ensuring non-ambiguity check for wildcard attributes)

This feature works only for projects whose organization is of type ‘generic’

See further below, after the first examples, what can be done with a wildcard on ‘period’

Toy example

>>> rst=ds(project="example", simulation="*", variable="rst", period="1980-1981")
>>> rst
ds('example|*|rst|1980-1981|global|monthly')

>>> rst.explore('choices')
{'simulation': ['AMIPV6ALB2G']}

>>> instanciated_dataset=rst.explore('resolve')
>>> instanciated_dataset
ds('example|AMIPV6ALB2G|rst|1980-1981|global|monthly')

>>> my_ensemble=rst.explore('ensemble')
error    : "Creating an ensemble does not make sense because all wildcard attributes have a single possible
            value ({'simulation': ['AMIPV6ALB2G']})"

Real life example for options choices and ensemble

>>> rst=ds(project="CMIP6", model='*', experiment="*ontrol*", realization="r1i1p1f*", table="Amon",
...        variable="rsut", period="1980-1981")
>>> clog('info')
>>> rst.explore('choices')
info     : Attribute institute has matching value CNRM-CERFACS
info     : Attribute experiment has multiple values : set(['piClim-control', 'piControl'])
info     : Attribute grid has matching value gr
info     : Attribute realization has matching value r1i1p1f2
info     : Attribute mip has multiple values : set(['CMIP', 'RFMIP'])
info     : Attribute model has multiple values : set(['CNRM-ESM2-1', 'CNRM-CM6-1'])
{'institute': ['CNRM-CERFACS'], 'experiment': ['piClim-control', 'piControl'], 'grid': ['gr'],
'realization': ['r1i1p1f2'], 'mip': ['CMIP', 'RFMIP'], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1']}

>>> # Let us further select by setting experiment=piControl
>>> mrst=ds(project="CMIP6", model='*', experiment="piControl", realization="r1i1p1f*", table="Amon",
...         variable="rsut", period="1980-1981")
>>> mrst.explore('choices')
{'institute': ['CNRM-CERFACS'], 'mip': ['CMIP'], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'], 'grid': ['gr'],
 'realization': ['r1i1p1f2']}
>>> small_ensemble=mrst.explore('ensemble')
>>> small_ensemble
cens({
      'CNRM-ESM2-1':ds('CMIP6%%rsut%1980-1981%global%/cnrm/cmip%CNRM-ESM2-1%CNRM-CERFACS%CMIP%Amon%piControl%'
                       'r1i1p1f2%gr%latest'),
      'CNRM-CM6-1' :ds('CMIP6%%rsut%1980-1981%global%/cnrm/cmip%CNRM-CM6-1%CNRM-CERFACS%CMIP%Amon%piControl%'
                       'r1i1p1f2%gr%latest')
     })
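The kind of matching behind option ‘choices’ can be pictured with a self-contained sketch. It is not the CliMAF implementation: the function `choices`, the list of `available` records and the use of the standard `fnmatch` module are assumptions chosen to illustrate how * and ? wildcards select attribute values.

```python
from fnmatch import fnmatchcase

def choices(request, available):
    """Hypothetical sketch: for each wildcard attribute of `request`, list
    the values found among the `available` records matching all attributes."""
    matching = [rec for rec in available
                if all(fnmatchcase(rec[key], pattern)
                       for key, pattern in request.items())]
    wild = [key for key, pattern in request.items()
            if "*" in pattern or "?" in pattern]
    return {key: sorted({rec[key] for rec in matching}) for key in wild}

available = [
    {"model": "CNRM-CM6-1", "experiment": "piControl"},
    {"model": "CNRM-ESM2-1", "experiment": "piControl"},
    {"model": "CNRM-CM6-1", "experiment": "historical"},
]
print(choices({"model": "CNRM*", "experiment": "*ontrol*"}, available))
# {'model': ['CNRM-CM6-1', 'CNRM-ESM2-1'], 'experiment': ['piControl']}
```

An attribute with a single matching value, such as experiment here, is what allows ‘resolve’ to return an unambiguous dataset.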

When option=’choices’ and period= ‘*’, the period of all matching files will be either :

aggregated among all instances of all attributes with wildcards (default)

or, if argument group_periods_on provides an attribute name, aggregated after being sorted on that attribute and merged

The aggregation is governed by argument operation, which can be either :

‘intersection’ : which is the most useful case, and hence is the default

‘union’ : which makes little sense, except to know which periods are definitely not covered by any data

None : no aggregation occurs, and you get a dict of the merged periods, whose keys are the values of the grouping attribute

Attribute ‘period’ cannot include a * unless its whole value is ‘*’
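The difference between grouping, intersection and union can be made concrete with a small sketch. It is only an illustration under stated assumptions: periods are represented as inclusive (start_year, end_year) pairs, and the functions `merge_periods` and `intersect` as well as the sample file lists are hypothetical, not CliMAF internals.

```python
from functools import reduce

def merge_periods(periods):
    """Merge inclusive (start, end) year periods that overlap or are contiguous."""
    merged = []
    for start, end in sorted(periods):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def intersect(a, b):
    """Intersection of two lists of periods."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            start, end = max(s1, s2), min(e1, e2)
            if start <= end:
                out.append((start, end))
    return out

# Hypothetical file periods for two values of the grouping attribute
files = {"model_A": [(1850, 1899), (1900, 1949)],
         "model_B": [(1900, 1949), (1950, 1999)]}

per_model = {m: merge_periods(p) for m, p in files.items()}  # grouping + merge
common = reduce(intersect, per_model.values())               # 'intersection'
everything = merge_periods([p for ps in files.values() for p in ps])  # 'union'
ungrouped = reduce(intersect, ([p] for ps in files.values() for p in ps))

print(per_model)   # {'model_A': [(1850, 1949)], 'model_B': [(1900, 1999)]}
print(common)      # [(1900, 1949)]
print(everything)  # [(1850, 1999)]
print(ungrouped)   # []
```

The last line shows why intersecting without grouping is rarely informative when data files are split in time: individual files seldom share any period.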

Examples without grouping periods over any attribute

>>> # Let us use a kind of dataset whose data files are split in time,
>>> # and allow for various models, and use a wildcard for period
>>> so=ds(project="CMIP6", model='CNRM*', experiment="piControl", realization="r1i1p1f2",
... table="Omon", variable="so", period="*")

>>> # What is the overall period covered by the union of all datafiles
>>> # (but not necessarily by a single model!)
>>> so.explore('choices', operation='union')
{ 'period': [1850-2349], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'] .....}

>>> # What is the intersection of periods covered by each datafile
>>> so.explore('choices')
{ 'period': [None], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'] .....}

>>> # What is the list of periods covered by datafiles
>>> so.explore('choices', operation=None)
{ 'period': {None: [1850-1899, 1900-1949, 1950-1999, 2000-2049, 2050-2099,
                    2100-2149, 2150-2199, 2200-2249, 2250-2299, 2300-2349]},
   'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'] .....}

Examples using periods grouping over an attribute

>>> # What is the intersection of available periods after grouping them on the various values of 'model'
>>> so.explore('choices',group_periods_on='model')
{ 'period': [1850-2349], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'], ....}

>>> # Same, but with the default value made explicit
>>> so.explore('choices',group_periods_on='model',operation='intersection')
{ 'period': [1850-2349], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'], ....}

>>> # What are the aggregated periods for each value of 'model'
>>> so.explore('choices',group_periods_on='model',operation=None)
{ 'period':
    {'CNRM-ESM2-1': [1850-2349],
     'CNRM-CM6-1' : [1850-2349] },
  'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'], ...}

cdataset.glob: explore data and/or periods, and match wildcard attributes

cdataset.glob(what=None, ensure_period=True, merge_periods=True, split=None, use_frequency=False)

Datafile exploration for a dataset which possibly has wildcards (* and ?) in attributes/facets.

Returns info regarding matching datafile or directories:

if WHAT = ‘files’ , returns a string of all data filenames

otherwise, returns a list of facet/value dictionaries for matching data (or a pair of lists, see SPLIT below)

If ENSURE_PERIOD is True, returns only results where the requested data period is fully covered by the set of data files. Each returned period is then the same as the requested period

Otherwise, if MERGE_PERIODS is True, each returned period is actually a list of the intersections of the requested period and (merged) available data periods.

Otherwise, individual data file periods are returned.

if SPLIT is not None, a pair is returned instead of the dicts list :

first element is a dict with the facets whose values are the same among all cases

second element is the list of dicts as above, but in which facets with common values are discarded
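The split into common and varied facets can be sketched in a few lines. The function `split_facets` and the sample `cases` are hypothetical; they only illustrate the shape of the returned pair, not how CliMAF computes it.

```python
def split_facets(cases):
    """Hypothetical sketch: separate the facets shared by all cases from
    those that vary between cases."""
    if not cases:
        return {}, []
    common = {key: value for key, value in cases[0].items()
              if all(key in case and case[key] == value for case in cases)}
    varied = [{key: value for key, value in case.items() if key not in common}
              for case in cases]
    return common, varied

cases = [
    {"variable": "tos", "model": "CNRM-CM6-1", "grid": "gn"},
    {"variable": "tos", "model": "CNRM-CM6-1", "grid": "gr1"},
    {"variable": "tos", "model": "CNRM-ESM2-1", "grid": "gn"},
]
common_values, varied_values = split_facets(cases)
print(common_values)  # {'variable': 'tos'}
print(varied_values[0])  # {'model': 'CNRM-CM6-1', 'grid': 'gn'}
```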

Example :

>>> tos_data = ds(project='CMIP6', mip='CMIP', variable='tos', period='*',
...               table='Omon', institute='CNRM-CERFACS', model='CNRM*', realization='r1i1p1f2')

>>> common_values, varied_values = tos_data.glob(merge_periods=True, split=True)

>>> common_values
{'variable': 'tos', 'period': [1850-2014], 'root': '/bdd',
 'institute': 'CNRM-CERFACS', 'mip': 'CMIP', 'table': 'Omon',
 'experiment': 'historical', 'realization': 'r1i1p1f2', 'version': 'latest',
 'project': 'CMIP6'}

>>> varied_values
[{'model': 'CNRM-ESM2-1'  , 'grid': 'gn' },
 {'model': 'CNRM-ESM2-1'  , 'grid': 'gr1'},
 {'model': 'CNRM-CM6-1'   , 'grid': 'gn' },
 {'model': 'CNRM-CM6-1'   , 'grid': 'gr1'},
 {'model': 'CNRM-CM6-1-HR', 'grid': 'gn' } ]

cdataset.check: check time consistency of a dataset

cdataset.check(frequency=False, gap=False, period=True)

Check time consistency of first variable of a dataset or ensemble members:

if frequency is True : check if datafile frequency is consistent with facet frequency
if gap is True : check if file data have a gap
if period is True : check if period covered by data actually includes the whole of dataset period (regardless of possible gaps)

Default case is to check only period

Returns: True if every check is OK, False if one fails, None if any cannot be analyzed

For gap and period check, monthly data are processed quite empirically
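Because the documented return value has three states, a plain truth test cannot tell a failed check from an undetermined one (None is falsy in Python). The helper `describe_check` below is a hypothetical illustration of how a caller might distinguish the three cases.

```python
def describe_check(result):
    """Hypothetical helper: turn the three-state result (True, False, None)
    into a readable status."""
    if result is None:        # test identity first: `not None` is also True
        return "undetermined"
    return "ok" if result else "failed"

for value in (True, False, None):
    print(value, "->", describe_check(value))
```

Testing `result is None` before `if result` avoids reporting an unanalyzable dataset as a failure.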

cdataset.light_check: check time consistency of a dataset w.r.t. dates in data filenames

cdataset.light_check()

Check that dataset’s period is covered by the period deduced from the filenames of its datafiles. Filenames with non-date digits (e.g. initialization year) and which period has no end date may generate interpretation problems.

Return True if the period is covered

Nevertheless, data in files may show gaps; use dataset.check(gap=True) if you need a deeper check
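The idea of a filename-based check can be sketched as follows. This is an illustration only: it assumes CMIP-style names ending in `_YYYYMM-YYYYMM.nc`, and the functions `filename_period` and `covered` are hypothetical, not the CliMAF method.

```python
import re

def filename_period(filename):
    """Hypothetical parser: return (start, end) as YYYYMM integers from a
    name ending in '_YYYYMM-YYYYMM.nc', or None if no such dates are found."""
    match = re.search(r"_(\d{6})-(\d{6})\.nc$", filename)
    return (int(match.group(1)), int(match.group(2))) if match else None

def covered(requested, filenames):
    """True if the requested (start, end) lies within the overall span of the
    dated filenames. Gaps between or inside files are NOT detected."""
    spans = [p for p in map(filename_period, filenames) if p is not None]
    if not spans:
        return False
    first = min(start for start, _ in spans)
    last = max(end for _, end in spans)
    return first <= requested[0] and requested[1] <= last

names = ["tas_Amon_CNRM-CM5_historical_r1i1p1_185001-189912.nc",
         "tas_Amon_CNRM-CM5_historical_r1i1p1_190001-194912.nc"]
print(covered((188001, 191012), names))  # True
print(covered((194001, 195512), names))  # False
```

Such a check never opens the files, which is why it is cheap and also why it cannot see gaps in the data.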

cdataset.listfiles: returns the list of (local) files of a dataset

cdataset.listfiles(force=False, ensure_dataset=True)

Returns the list of (local or remote) files which include the data for the dataset

Use cached value unless called with arg force=True

If ensure_dataset is True, forbid ambiguous datasets

cdef : define some default values for datasets attributes

climaf.classes.cdef(attribute, value=None, project=None)

Set or get the default value for a CliMAF dataset attribute or facet (such as e.g. ‘model’, ‘simulation’ …), for use by next calls to cdataset() or to ds()

Argument ‘project’ allows restricting the use/query of the default value to the context of the given ‘project’. One can also set the (global) default value for attribute ‘project’

There is no actual check that ‘attribute’ is a valid keyword for a call to ds or cdataset

Example:

>>> cdef('project','OCMIP5')
>>> cdef('frequency','monthly',project='OCMIP5')
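One way to picture project-scoped defaults is a two-level lookup. The sketch below is not the CliMAF implementation; the names `set_default` and `get_default`, and the rule that a project-specific default takes precedence over the global one, are assumptions made for illustration.

```python
# Defaults are stored per project; key None holds the global defaults
_defaults = {None: {}}

def set_default(attribute, value, project=None):
    """Hypothetical setter: record a default, globally or for one project."""
    _defaults.setdefault(project, {})[attribute] = value

def get_default(attribute, project=None):
    """Hypothetical getter: assumed to prefer the project-specific default,
    then fall back to the global one (None if unset)."""
    if attribute in _defaults.get(project, {}):
        return _defaults[project][attribute]
    return _defaults[None].get(attribute)

set_default("frequency", "monthly")
set_default("frequency", "daily", project="my_project")
print(get_default("frequency", project="my_project"))     # daily
print(get_default("frequency", project="other_project"))  # monthly
```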

eds : define an ensemble of datasets

climaf.classes.eds(first=None, **kwargs)

Create a dataset ensemble using the same calling sequence as cdataset(), except that some facets are lists, which defines the ensemble members; these facets must be among the facets authorized for ensemble in the (single) project involved

Example:

>>> cdef("frequency","monthly") ;  cdef("project","CMIP5"); cdef("model","CNRM-CM5")
>>> cdef("variable","tas"); cdef("period","1860")
>>> ens=eds(experiment="historical", simulation=["r1i1p1","r2i1p1"])

Argument ‘first’ is used when multiple attributes are of list type, and tells which of these attributes appears first in member labels
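The expansion of list-valued facets into ensemble members, and the role of ‘first’ in member labels, can be sketched as below. The function `expand` is hypothetical, members are plain dicts rather than datasets, and joining label parts with "_" is an assumption, not necessarily the CliMAF label format.

```python
from itertools import product

def expand(first=None, **facets):
    """Hypothetical sketch: build one member per combination of the
    list-valued facets; `first` tells which facet leads the label."""
    list_keys = [key for key, value in facets.items() if isinstance(value, list)]
    if first in list_keys:
        list_keys.remove(first)
        list_keys.insert(0, first)
    members = {}
    for combo in product(*(facets[key] for key in list_keys)):
        label = "_".join(combo)  # assumed label format
        members[label] = dict(facets, **dict(zip(list_keys, combo)))
    return members

ens = expand(experiment="historical", simulation=["r1i1p1", "r2i1p1"])
print(list(ens))  # ['r1i1p1', 'r2i1p1']

ens2 = expand(first="simulation", model=["A", "B"], simulation=["r1", "r2"])
print(list(ens2))  # ['r1_A', 'r1_B', 'r2_A', 'r2_B']
```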

cens : define an ensemble of objects

class climaf.classes.cens(dic={}, order=None, sortfunc=None)

Function cens creates a CliMAF object of class cens, i.e. a dict of objects, whose keys are member labels, and whose members are ordered, using method set_order

In some cases, ensembles of datasets from the same project can also be built easily using eds()

When applying an operator to an ensemble, CliMAF will know, from the operator’s declaration (see cscript()), whether the operator ‘wishes’ to get the ensemble or, conversely, is not ‘ensemble-capable’ :

if the operator is ensemble-capable it will deliver it :

if it is a script : with a string composed by concatenating the corresponding input files; it will also provide the labels list to the script if its declaration calls for it with keyword ${labels} (see cscript())

if it is a Python function : with the dict of corresponding objects

if the operator is ‘ensemble-dumb’, CliMAF will loop applying it on each member, and will form a new ensemble with the results.
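The ‘ensemble-dumb’ case amounts to a loop over members that keeps labels and order. The sketch below uses a plain dict and a plain Python function in place of CliMAF objects; `apply_to_members` is a hypothetical name.

```python
def apply_to_members(operator, ensemble, order=None):
    """Hypothetical sketch: apply `operator` to each member and return a new
    labelled collection, following `order` when it is given."""
    labels = order if order is not None else list(ensemble)
    return {label: operator(ensemble[label]) for label in labels}

members = {"1980": 10.0, "1981": 12.0}  # stand-ins for datasets
doubled = apply_to_members(lambda x: 2 * x, members, order=["1981", "1980"])
print(doubled)  # {'1981': 24.0, '1980': 20.0}
```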

The dict keys must be label strings, which describe what is basically different among members. They are usually used by plot scripts to provide a caption allowing to identify each dataset/object, e.g. using various colors.

Examples (see also ../examples/ensemble.py) :

>>> cdef('project','example'); cdef('simulation',"AMIPV6ALB2G")
>>> cdef('variable','tas');cdef('frequency','monthly')
>>> #
>>> ds1980=ds(period="1980")
>>> ds1981=ds(period="1981")
>>> #
>>> myens=cens({'1980':ds1980 , '1981':ds1981 })
>>> ncview(myens)  # will launch ncview once per member
>>>
>>> myens=cens({'1980':ds1980 , '1981':ds1981 }, order=['1981','1980'])
>>> myens.set_order(['1981','1980'])
>>>
>>> # Add a member
>>> myens['abcd']=ds(period="1982")

Limitations : Even if an ensemble is a dict, some dict methods are not properly implemented (popitem, fromkeys) and function iteritems does not use member order

You can write an ensemble to a file using function efile()

fds : define a dataset from a data file

climaf.classes.fds(filename, simulation=None, variable=None, period=None, model=None)

fds stands for FileDataSet; it allows to create a dataset simply by providing a filename and optionally a simulation name , a variable name, a period and a model name.

For dataset attributes which are not provided, these defaults apply :

simulation : the filename basename (without suffix ‘.nc’)
variable : the set of variables in the data file
period : the period actually covered by the data file (if it has time_bnds)
model : the ‘model_id’ attribute if it exists, otherwise : ‘no_model’
project : ‘file’ (with separator = ‘|’)
frequency : the value of global attribute frequency in datafile, if it exists
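As a small illustration of the default for simulation (the file basename without the ‘.nc’ suffix), here is a sketch using only the standard library. The function name `default_simulation` and the sample path are hypothetical.

```python
import os

def default_simulation(filename):
    """Hypothetical sketch: basename of the file, without a trailing '.nc'."""
    base = os.path.basename(filename)
    return base[:-3] if base.endswith(".nc") else base

print(default_simulation("/data/run1/tas_1980.nc"))  # tas_1980
```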

The following restrictions apply to such datasets :

functions calias() and derive() cannot be used for project ‘file’

Results are unpredictable if all variables do not have the same time axis

Examples : See data_file.py

cprojects : dictionary of known projects

env.environment.cprojects = {None: ${project}.${simulation}.${variable}.${period}.${domain}}

Dictionary of declared projects (type is cproject)

env.environment.data_check = False

Should ds() calls be checked w.r.t. datafiles. “if_found” means yes if some relevant datafiles exist. Other allowed values are True and False. See the description of argument check in the documentation of class cdataset

env.environment.period_check_type = 'light'

On ds() calls, which level of check of the requested period w.r.t. datafiles. See the description of argument check_type in the documentation of class cdataset

dataloc : describe data locations for a series of simulations

class climaf.dataloc.dataloc(project='*', organization='generic', url=None, model='*', simulation='*', realm='*', table='*', frequency='*')

Create an entry in the data locations dictionary for an ensemble of datasets.

Parameters:

project (str,optional) – project name
model (str,optional) – model name
simulation (str,optional) – simulation name
frequency (str,optional) – frequency
organization (str) – name of the organization type, among those handled by selectFiles()
url (list of strings) – list of URLS for the data root directories, local or remote

Each entry in the dictionary allows to store :

a list of path or URLS (local or remote), which are root paths for finding some sets of datafiles which share a file organization scheme.

For remote data:

url is supposed to be in the format ‘protocol:user@host:path’, but ‘protocol’ and ‘user’ are optional. So, url can also be ‘user@host:path’ or ‘protocol:host:path’ or ‘host:path’. ftp is the default protocol (and the only one handled so far).

If ‘user’ is given:

if ‘host’ is in $HOME/.netrc file, CliMAF checks whether the corresponding ‘login’ == ‘user’. If it is, CliMAF gets the associated password; otherwise it will prompt the user for the password;

if ‘host’ is not present in $HOME/.netrc file, CliMAF will prompt the user for the password.

If ‘user’ is not given:

if ‘host’ is in $HOME/.netrc file, CliMAF gets the corresponding ‘login’ as ‘user’ and also gets the associated password;

if ‘host’ is not present in $HOME/.netrc file, CliMAF prompts the user for ‘user’ and ‘password’.

Remark: The .netrc file contains login and password used by the auto-login process. It generally resides in the user’s home directory ($HOME/.netrc). So, it is highly recommended to supply this information in .netrc file not to have to enter password in every request.

Warning: python netrc module does not handle multiple entries for a single host. So, if netrc file has two entries for the same host, the netrc module only returns the last entry.

We define two kinds of host: hosts with evolving files, e.g. ‘beaufix’; and the others.

For any file returned by function listfiles() which is found in cache:

in case of hosts with dynamic files, the file is transferred only if its date on server is more recent than that found in cache;

for other hosts, the file found in cache is used

the name for the corresponding data files organization scheme. The current set of known schemes is :

CMIP5_DRS : any datafile organized after the CMIP5 data reference syntax, such as on IPSL’s Ciclad and CNRM’s Lustre

EM : CNRM-CM post-processed outputs as organized using EM (please use a list of any one string for arg url)

generic : a data organization described by the user, using patterns such as described for selectGenericFiles(). This is the default

Please ask the CliMAF dev team for implementing further organizations. It is quite quick for data which are on the filesystem. Organizations considered for future implementations are :

NetCDF model outputs as available during an ECLIS or libIGCM simulation

ESGF

the set of attribute values identifying the simulations whose data are stored at those URLs with that organization

For remote files, the filename pattern must include ${varname}, which is instantiated by the variable name or by filenameVar (given via calias()), for the sake of efficiency. Please complain if this is inadequate

For the sake of brevity, each attribute can have the ‘*’ wildcard value; when using the dictionary, the most specific entries will be used (which means : the entry (or entries) with the lowest number of wildcards)
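The remote url format described above (‘protocol:user@host:path’, with optional ‘protocol’ and ‘user’) can be decomposed with a short sketch. The function `parse_remote_url` is hypothetical and is not how CliMAF parses urls; it also assumes the path itself contains no colon.

```python
def parse_remote_url(url, default_protocol="ftp"):
    """Hypothetical parser for '[protocol:][user@]host:path'."""
    parts = url.split(":")
    if len(parts) == 3:
        protocol, userhost, path = parts
    elif len(parts) == 2:
        protocol = default_protocol
        userhost, path = parts
    else:
        raise ValueError("unsupported url: %r" % url)
    # rpartition leaves user empty when there is no '@'
    user, _, host = userhost.rpartition("@")
    return {"protocol": protocol, "user": user or None, "host": host, "path": path}

print(parse_remote_url("ftp:vignonl@hendrix:/home/vignonl/data.nc"))
print(parse_remote_url("beaufix:/home/data.nc"))
```

When `user` comes back as None, the login would have to come from elsewhere, for instance the .netrc lookup described above.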

Example :

Declaring that all IPSLCM-Z-HR data for project PRE_CMIP6 are stored under a single root path and follow the organization named CMIP6_DRS:
>>> dataloc(project='PRE_CMIP6', model='IPSLCM-Z-HR', organization='CMIP6_DRS', url=['/prodigfs/esg/'])
and declaring an exception for one simulation (here, both location and organization are supposed to be different):
>>> dataloc(project='PRE_CMIP6', model='IPSLCM-Z-HR', simulation='my_exp', organization='EM',
...         url=['~/tmp/my_exp_data'])
and declaring a project to access remote data (on multiple servers):
>>> cproject('MY_REMOTE_DATA', ('frequency', 'monthly'), separator='|')
>>> dataloc(project='MY_REMOTE_DATA', organization='generic',
...         url=['beaufix:/home/gmgec/mrgu/vignonl/*/${simulation}SFX${PERIOD}.nc',
...              'ftp:vignonl@hendrix:/home/vignonl/${model}/${variable}_1m_${PERIOD}_${model}.nc'])
>>> calias('MY_REMOTE_DATA','tas','tas',filenameVar='2T')
>>> tas = ds(project='MY_REMOTE_DATA', simulation='AMIPV6ALB2G', variable='tas', frequency='monthly',
...          period='198101')
Please refer to the example section of the documentation for an example with each organization scheme
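The rule that the most specific entries win (those with the fewest wildcards) can be sketched with plain dicts. The function `most_specific` and the two sample entries, which mirror the first two declarations in the example above, are illustrative assumptions rather than the CliMAF data structures.

```python
def most_specific(entries, **facets):
    """Hypothetical sketch: among entries matching the given facets, keep
    those with the lowest number of '*' wildcards."""
    def matches(entry):
        return all(entry.get(key, "*") in ("*", value)
                   for key, value in facets.items())

    def wildcards(entry):
        return sum(1 for key in facets if entry.get(key, "*") == "*")

    candidates = [entry for entry in entries if matches(entry)]
    if not candidates:
        return []
    best = min(wildcards(entry) for entry in candidates)
    return [entry for entry in candidates if wildcards(entry) == best]

entries = [
    {"project": "PRE_CMIP6", "model": "IPSLCM-Z-HR", "simulation": "*",
     "organization": "CMIP6_DRS"},
    {"project": "PRE_CMIP6", "model": "IPSLCM-Z-HR", "simulation": "my_exp",
     "organization": "EM"},
]
hit = most_specific(entries, project="PRE_CMIP6", model="IPSLCM-Z-HR",
                    simulation="my_exp")
print([entry["organization"] for entry in hit])  # ['EM']
```

For any other simulation name, only the general entry matches, so the CMIP6_DRS location applies.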

derive : define a variable as computed from other variables

climaf.operators_derive.derive(project, derivedVar, Operator, *invars, **params)

Define that ‘derivedVar’ is a derived variable in ‘project’, computed by applying ‘Operator’ to input streams which are datasets whose variable names take the values in *invars and the parameter/arguments of Operator take the values in **params

‘project’ may be the wildcard : ‘*’

Example, assuming that operator ‘minus’ has been defined as

>>> cscript('minus','cdo sub ${in_1} ${in_2} ${out}')

which means that minus uses CDO for subtracting the two datasets; you may define, for a given project ‘CMIP5’, a new variable e.g. for cloud radiative effect at the surface, named ‘rscre’, using the difference of values of all-sky and clear-sky net radiation at the surface by:

>>> derive('CMIP5', 'rscre','minus','rs','rscs')

You may then use this variable name at any location you would use any other variable name
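The mechanism of a derived variable can be pictured as a registry that maps a name to an operator and its input variables, evaluated on demand. The sketch below is a toy model with plain numbers; `declare`, `evaluate`, the `operators` table and the sample values are hypothetical and unrelated to CliMAF internals.

```python
derived = {}

def declare(name, operator, *inputs):
    """Hypothetical registry entry: `name` is computed by `operator` from `inputs`."""
    derived[name] = (operator, inputs)

def evaluate(name, data, operators):
    """Return data[name] if present, else compute it from its declared inputs."""
    if name in data:
        return data[name]
    operator, inputs = derived[name]
    return operators[operator](*(evaluate(i, data, operators) for i in inputs))

operators = {"minus": lambda a, b: a - b}
declare("rscre", "minus", "rs", "rscs")

data = {"rs": 150.0, "rscs": 170.0}  # made-up values, for illustration only
print(evaluate("rscre", data, operators))  # -20.0
```

Because evaluation is recursive, a derived variable may itself be an input of another derived variable.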

Note : you may use wildcard ‘*’ for the project

Another example is rescaling or renaming some variable; here, let us define how variable ‘ta’ can be derived from ERAI variable ‘t’ :

>>> derive('erai', 'ta','rescale', 't', scale=1., offset=0.)

However, this is not the most efficient way to do that. See calias()

Expert use : argument ‘derivedVar’ may be a dictionary, whose keys are derived variable names and whose values are script output names; example

>>> cscript('vertical_interp', 'vinterp.sh ${in} surface_pressure=${in_2} ${out_l500} ${out_l850} method=${opt}')
>>> derive('*', {'z500' : 'l500' , 'z850' : 'l850'},'vertical_interp', 'zg', 'ps', opt='log')

calias : define a variable as computed, in a project, from another, single, variable

climaf.classes.calias(project, variable, fileVariable=None, scale=1.0, offset=0.0, units=None, missing=None, filenameVar=None, conditions=None)

Declare that in project, variable is to be computed by reading fileVariable, and applying scale and offset; (see first example erai below)

Arg conditions allows to restrict the effect, based on the value of some facets. It is a dictionary of applicable values or values’list, which keys are the facets (see example CMIP6 below)

Arg filenameVar allows to tell which fake variable name should be used when computing the filename for this variable in this project (for optimisation purpose); (see second example erai below)

Can tell that a given constant must be interpreted as a missing value (see 4th example, EM, below)

variable may be a list. In that case, fileVariable and filenameVar, if provided, should be parallel lists

variable can be a comma-separated list of variables, in which case this tells how variables are grouped in files (it makes sense to use filenameVar in that case, as this is a way to provide the label which is unique to this grouping of variables); scale, offset and missing args must be the same for all variables in that case

Example

>>> # scale and offset may be provided
>>> calias('erai','tas_degC','t2m',scale=1., offset=-273.15)
>>> calias('CMIP6','evspsbl',scale=-1., conditions={ 'model':'CanESM5' , 'version': ['20180103', '20190112'] })
>>> calias('erai','tas','t2m',filenameVar='2T')
>>> calias('EM',[ 'sic', 'sit', 'sim', 'snd', 'ialb', 'tsice'], missing=1.e+20)
>>> calias('data_CNRM','so,thetao',filenameVar='grid_T_table2.2')

NB: A wrapper with the same name as this function is defined in climaf.driver.calias(), and it is the one exported by module climaf.api. It allows using a list of variables.
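The effect of scale, offset and missing can be illustrated on a plain list of numbers. The sketch assumes the usual linear convention value * scale + offset, which is consistent with the Kelvin to Celsius example above but is not confirmed here as the exact CliMAF formula; `apply_alias` is a hypothetical name.

```python
def apply_alias(values, scale=1.0, offset=0.0, missing=None):
    """Hypothetical sketch: rescale values as value * scale + offset (assumed
    convention), mapping the `missing` constant to None."""
    return [None if (missing is not None and value == missing)
            else value * scale + offset
            for value in values]

print(apply_alias([273.15], offset=-273.15))              # [0.0]
print(apply_alias([2.0, 1e20], scale=-1.0, missing=1e20))  # [-2.0, None]
```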

cfreqs : declare non-standard frequency names, for a project

climaf.classes.cfreqs(project, dic)

Allow to declare a dictionary specific to project for matching normalized frequency values to project-specific frequency values

Normalized frequency values are : decadal, yearly, monthly, daily, 6h, 3h, fx and annual_cycle

When defining a dataset, any reference to a non-standard frequency will be left unchanged both in the dataset’s CRS and when trying to access corresponding datafiles

Examples:

>>> cfreqs('CMIP5',{'monthly':'mon' , 'daily':'day' })
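The translation declared this way can be pictured as a per-project dictionary lookup with a pass-through for undeclared values. The table `project_freqs` and the function `project_frequency` are hypothetical names used for illustration only.

```python
# Per-project translation tables, in the spirit of the declaration above
project_freqs = {"CMIP5": {"monthly": "mon", "daily": "day"}}

def project_frequency(project, frequency):
    """Hypothetical sketch: translate a normalized frequency into the
    project-specific one; undeclared values are returned unchanged."""
    return project_freqs.get(project, {}).get(frequency, frequency)

print(project_frequency("CMIP5", "monthly"))  # mon
print(project_frequency("CMIP5", "3hr"))      # 3hr (left unchanged)
```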

crealms : declare non-standard realm names, for a project

climaf.classes.crealms(project, dic)

Allow to declare a dictionary specific to project for matching normalized realm names to project-specific realm names

Normalized realm names are : atmos, ocean, land, seaice

When defining a dataset, any reference to a non-standard realm will be left unchanged both in the dataset’s CRS and when trying to access corresponding datafiles

Examples:

>>> crealms('CMIP5',{'atmos':'ATM' , 'ocean':'OCE' })
