Functions for data definition and access

Except for the first three paragraphs, this section is for advanced use. As a first step, you should consider using the built-in data definitions described at projects. You may need to come back to this section for reference.

ds : define a dataset object (actually a front-end for cdataset)

climaf.classes.ds(*args, **kwargs)[source]

Returns a dataset from its full Climate Reference Syntax string. Example

>>> ds('CMIP5.historical.pr.[1980].global.monthly.CNRM-CM5.r1i1p1.mon.Amon.atmos.last')

Also a shortcut for cdataset(), when used with keyword arguments only. Example

>>> cdataset(project='CMIP5', model='CNRM-CM5', experiment='historical', frequency='monthly',
...          simulation='r2i3p9', domain=[40,60,-10,20], variable='tas', period='1980-1989', version='last')

In the latter case, you may use e.g. period=’last_50y’ to get the last 50 years (or fewer) of available data; this works only if no dataset attribute is ambiguous. ‘first_50y’ works similarly, and so does period=’*’.
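
A sketch of such a call (assuming the remaining attributes resolve unambiguously against your local CMIP5 data):

>>> tas_last = ds(project='CMIP5', model='CNRM-CM5', experiment='historical', frequency='monthly',
...               simulation='r1i1p1', variable='tas', period='last_50y')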

Please refer to the documentation of cdataset().

cdataset : define a dataset object

class climaf.classes.cdataset(**kwargs)[source]

Create a CLIMAF dataset.

A CliMAF dataset is a description of the data (rather than the data itself or a file). It is basically a set of attribute-value pairs. The list of attributes actually used to describe a dataset is defined by the project it refers to.

To display the attributes you may use for a given project, type e.g.:

>>> cprojects["CMIP5"]

For further details on projects , see cproject

None of the project’s attributes are mandatory arguments, because every attribute defaults to the value set by cdef() (which also applies when providing a None value for an attribute)

Some attributes have a special format or processing :

  • period : see init_period(). See also function climaf.classes.ds() for added flexibility in defining periods as the last or first set of years among available data

  • domain : allowed values are either ‘global’ or a list of lat/lon corners ordered as : [ latmin, latmax, lonmin, lonmax ]

  • variable : name of the geophysical variable ; this should be :

    • either a variable actually included in the datafiles,
    • or a ‘derived’ variable (see derive() ),
    • or, an aliased variable name (see alias() )
  • in project CMIP5 , for the tuple (frequency, simulation, period, table) : if any of them is ‘fx’ (or ‘r0i0p0’ for simulation), the others are forced to ‘fx’ (resp. ‘r0i0p0’) too.

Example, using no default value, and addressing some CMIP5 data

>>> cdataset(project='CMIP5', model='CNRM-CM5', experiment='historical', frequency='monthly',
...          simulation='r2i3p9', domain=[40,60,-10,20], variable='tas', period='1980-1989', version='last')

You may use the wildcard ‘*’ in attribute values, and then use explore() to have CliMAF match such attributes against the available data in a sensible way

cdataset.explore: explore data and periods, and match wildcard attributes

cdataset.explore(option='check_and_store', group_periods_on=None, operation='intersection', first=None)[source]

Versatile datafile exploration for a dataset which possibly has wildcards (* and ? ) in attributes.

option can be :

  • ‘choices’ for returning a dict whose keys are the wildcard attributes and whose entries are the lists of possible values
  • ‘resolve’ for returning a NEW DATASET with attributes instantiated (if this can be done unambiguously)
  • ‘ensemble’ for returning AN ENSEMBLE based on multiple possible values of one or more attributes (tell which one comes first in member labels by using arg ‘first’)
  • ‘check_and_store’ (or missing) for just identifying and storing the dataset’s file list (while checking that wildcard attributes are non-ambiguous)

This feature works only for projects whose organization is of type ‘generic’

See further below, after the first examples, what can be done with a wildcard on ‘period’

Toy example

>>> rst=ds(project="example", simulation="*", variable="rst", period="1980-1981")
>>> rst
ds('example|*|rst|1980-1981|global|monthly')

>>> rst.explore('choices')
{'simulation': ['AMIPV6ALB2G']}

>>> instanciated_dataset=rst.explore('resolve')
>>> instanciated_dataset
ds('example|AMIPV6ALB2G|rst|1980-1981|global|monthly')

>>> my_ensemble=rst.explore('ensemble')
error    : "Creating an ensemble does not make sense because all wildcard attributes have a single possible
            value ({'simulation': ['AMIPV6ALB2G']})"

Real life example for options choices and ensemble

>>> rst=ds(project="CMIP6", model='*', experiment="*ontrol*", realization="r1i1p1f*", table="Amon",
...        variable="rsut", period="1980-1981")
>>> clog('info')
>>> rst.explore('choices')
info     : Attribute institute has matching value CNRM-CERFACS
info     : Attribute experiment has multiple values : set(['piClim-control', 'piControl'])
info     : Attribute grid has matching value gr
info     : Attribute realization has matching value r1i1p1f2
info     : Attribute mip has multiple values : set(['CMIP', 'RFMIP'])
info     : Attribute model has multiple values : set(['CNRM-ESM2-1', 'CNRM-CM6-1'])
{'institute': ['CNRM-CERFACS'], 'experiment': ['piClim-control', 'piControl'], 'grid': ['gr'],
'realization': ['r1i1p1f2'], 'mip': ['CMIP', 'RFMIP'], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1']}

>>> # Let us further select by setting experiment=piControl
>>> mrst=ds(project="CMIP6", model='*', experiment="piControl", realization="r1i1p1f*", table="Amon",
...         variable="rsut", period="1980-1981")
>>> mrst.explore('choices')
{'institute': ['CNRM-CERFACS'], 'mip': ['CMIP'], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'], 'grid': ['gr'],
 'realization': ['r1i1p1f2']}
>>> small_ensemble=mrst.explore('ensemble')
>>> small_ensemble
cens({
      'CNRM-ESM2-1':ds('CMIP6%%rsut%1980-1981%global%/cnrm/cmip%CNRM-ESM2-1%CNRM-CERFACS%CMIP%Amon%piControl%'
                       'r1i1p1f2%gr%latest'),
      'CNRM-CM6-1' :ds('CMIP6%%rsut%1980-1981%global%/cnrm/cmip%CNRM-CM6-1%CNRM-CERFACS%CMIP%Amon%piControl%'
                       'r1i1p1f2%gr%latest')
     })

When option=‘choices’ and period=‘*’, the period of all matching files will be either :

  • aggregated among all instances of all attributes with wildcards (default)
  • or, if argument group_periods_on provides an attribute name, aggregated after being sorted on that attribute and merged

The aggregation is governed by argument operation, which can be either :

  • ‘intersection’ : which is the most useful case, and hence is the default
  • ‘union’ : which does not make much sense, except to know which periods are definitely not covered by any data
  • None : no aggregation occurs, and you get a dict of the merged periods, whose keys are the values of the grouping attribute

Attribute ‘period’ cannot contain a ‘*’ unless its value is exactly ‘*’.

Examples without grouping periods over any attribute

>>> # Let us use a kind of dataset whose data files are split over time,
>>> # and allow for various models, and use a wildcard for period
>>> so=ds(project="CMIP6", model='CNRM*', experiment="piControl", realization="r1i1p1f2",
... table="Omon", variable="so", period="*")

>>> # What is the overall period covered by the union of all datafiles
>>> # (but not necessarily by a single model!)
>>> so.explore('choices', operation='union')
{ 'period': [1850-2349], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'] .....}

>>> # What is the intersection of periods covered by each datafile
>>> so.explore('choices')
{ 'period': [None], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'] .....}

>>> # What is the list of periods covered by datafiles
>>> so.explore('choices', operation=None)
{ 'period': {None: [1850-1899, 1900-1949, 1950-1999, 2000-2049, 2050-2099,
                    2100-2149, 2150-2199, 2200-2249, 2250-2299, 2300-2349]},
   'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'] .....}

Examples using periods grouping over an attribute

>>> # What is the intersection of available periods after grouping them on the various values of 'model'
>>> so.explore('choices',group_periods_on='model')
{ 'period': [1850-2349], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'], ....}

>>> # Same, but with the default value made explicit
>>> so.explore('choices',group_periods_on='model',operation='intersection')
{ 'period': [1850-2349], 'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'], ....}

>>> # What are the aggregated periods for each value of 'model'
>>> so.explore('choices',group_periods_on='model',operation=None)
{ 'period':
    {'CNRM-ESM2-1': [1850-2349],
     'CNRM-CM6-1' : [1850-2349] },
  'model': ['CNRM-ESM2-1', 'CNRM-CM6-1'], ...}

cdataset.glob: explore data and/or periods, and match wildcard attributes

cdataset.glob(what=None, periods=None, split=None, use_frequency=False)[source]

Datafile exploration for a dataset which possibly has wildcards (* and ?) in attributes/facets.

Returns info regarding matching datafiles or directories:

  • if WHAT = ‘files’ , returns a string holding all data filenames
  • otherwise, returns a list of facet/value dictionaries for matching data (or a pair, see below)

In the latter case, data file periods are not returned if arg PERIODS is None and the data search is optimized for the project. In that case, the globbing is done on data directories rather than on data files, which is much faster.

If PERIODS is not None, individual data file periods are merged among cases with the same facet values

If SPLIT is not None, a pair is returned instead of the list of dicts :

  • the first element is a dict of the facets whose values are the same among all cases
  • the second element is the list of dicts as above, but with the facets having common values discarded

Example :

>>> tos_data = ds(project='CMIP6', variable='tos', period='*',
...               table='Omon', model='CNRM*', realization='r1i1p1f*')
>>> common_keys, varied_keys = tos_data.glob(periods=True, split=True)
>>> common_keys
{'mip': 'CMIP', 'institute': 'CNRM-CERFACS', 'experiment': 'historical',
'realization': 'r1i1p1f2', 'table': 'Omon', 'variable': 'tos',
'version': 'latest', 'period': [1850-2014], 'root': '/bdd'}
>>> varied_keys
[{'model': 'CNRM-ESM2-1'  , 'grid': 'gn' },
 {'model': 'CNRM-ESM2-1'  , 'grid': 'gr1'},
 {'model': 'CNRM-CM6-1'   , 'grid': 'gn' },
 {'model': 'CNRM-CM6-1'   , 'grid': 'gr1'},
 {'model': 'CNRM-CM6-1-HR', 'grid': 'gn' } ]
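
For instance, to get the matching filenames themselves as a single string (the ‘files’ option described above):

>>> tos_data.glob('files')   # returns one string holding all matching data filenames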

cdataset.check: check time consistency of a dataset

cdataset.check(frequency=True, gap=True, period=True)[source]

Check time consistency of the first variable of a dataset or of ensemble members:

  • if frequency is True : check if the data frequency is consistent with the dataset frequency
  • if gap is True : check if the file data have a gap
  • if period is True : check if the period covered by the data actually includes the whole of the dataset period

Returns: True if every check is OK, False if one fails, None if analysis is not yet possible
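
A minimal sketch using the built-in ‘example’ project (the returned value depends on the actual data files):

>>> tas = ds(project="example", simulation="AMIPV6ALB2G", variable="tas", period="1980")
>>> tas.check()                 # run all three checks
>>> tas.check(frequency=False)  # skip the frequency check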

cdataset.listfiles: returns the list of (local) files of a dataset

cdataset.listfiles(force=False, ensure_dataset=True)[source]

Returns the list of (local or remote) files which include the data for the dataset

Use cached value unless called with arg force=True

If ensure_dataset is True, forbid ambiguous datasets
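
A minimal sketch, again with the built-in ‘example’ project (the returned paths depend on your installation):

>>> tas = ds(project="example", simulation="AMIPV6ALB2G", variable="tas", period="1980")
>>> tas.listfiles()            # list of matching datafile paths (the result is cached)
>>> tas.listfiles(force=True)  # re-run the file search, bypassing the cached value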

cdef : define some default values for datasets attributes

climaf.classes.cdef(attribute, value=None, project=None)[source]

Set or get the default value for a CliMAF dataset attribute or facet (such as e.g. ‘model’, ‘simulation’ …), for use by next calls to cdataset() or to ds()

Argument ‘project’ allows restricting the use/query of the default value to the context of the given ‘project’. One can also set the (global) default value for attribute ‘project’

There is no actual check that ‘attribute’ is a valid keyword for a call to ds or cdataset

Example:

>>> cdef('project','OCMPI5')
>>> cdef('frequency','monthly',project='OCMPI5')
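
Following the ‘set or get’ behaviour described above, calling cdef() without a value should return the current default (a sketch):

>>> cdef('frequency', project='OCMPI5')
'monthly'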

eds : define an ensemble of datasets

climaf.classes.eds(first=None, **kwargs)[source]

Create a dataset ensemble using the same calling sequence as cdataset(), except that some facets are lists, which define the ensemble members; these facets must be among the facets authorized for ensembles in the (single) project involved

Example:

>>> cdef("frequency","monthly") ;  cdef("project","CMIP5"); cdef("model","CNRM-CM5")
>>> cdef("variable","tas"); cdef("period","1860")
>>> ens=eds(experiment="historical", simulation=["r1i1p1","r2i1p1"])

Argument ‘first’ is used when multiple attributes are of list type, and tells which of these attributes appears first in member labels
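
A sketch with two list-type facets, assuming facet ‘model’ is also authorized for ensembles in project CMIP5 (set by the cdef calls above); ‘first’ puts the model name first in member labels:

>>> ens2 = eds(experiment="historical", model=["CNRM-CM5", "IPSL-CM5A-LR"],
...            simulation=["r1i1p1", "r2i1p1"], first="model")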

cens : define an ensemble of objects

class climaf.classes.cens(dic={}, order=None, sortfunc=None)[source]

Function cens creates a CliMAF object of class cens, i.e. a dict of objects whose keys are member labels and whose members are ordered, using method set_order

In some cases, ensembles of datasets from the same project can also be built easily using eds()

When applying an operator to an ensemble, CliMAF will know, from the operator’s declaration (see cscript()), whether the operator ‘wishes’ to get the whole ensemble or, on the contrary, is not ‘ensemble-capable’ :

  • if the operator is ensemble-capable, CliMAF will deliver the ensemble to it :
    • if it is a script : with a string composed by concatenating the corresponding input files; it will also provide the labels list to the script if its declaration calls for it with keyword ${labels} (see cscript())
    • if it is a Python function : with the dict of corresponding objects
  • if the operator is ‘ensemble-dumb’, CliMAF will loop applying it on each member, and will form a new ensemble with the results.

The dict keys must be label strings, which describe what is basically different among members. They are usually used by plot scripts to provide a caption allowing one to identify each dataset/object, e.g. using various colors.

Examples (see also ../examples/ensemble.py) :

>>> cdef('project','example'); cdef('simulation',"AMIPV6ALB2G")
>>> cdef('variable','tas');cdef('frequency','monthly')
>>> #
>>> ds1980=ds(period="1980")
>>> ds1981=ds(period="1981")
>>> #
>>> myens=cens({'1980':ds1980 , '1981':ds1981 })
>>> ncview(myens)  # will launch ncview once per member
>>>
>>> myens=cens({'1980':ds1980 , '1981':ds1981 }, order=['1981','1980'])
>>> myens.set_order(['1981','1980'])
>>>
>>> # Add a member
>>> myens['abcd']=ds(period="1982")

Limitations : Even if an ensemble is a dict, some dict methods are not properly implemented (popitem, fromkeys) and function iteritems does not use member order

You can write an ensemble to a file using function efile()

fds : define a dataset from a data file

climaf.classes.fds(filename, simulation=None, variable=None, period=None, model=None)[source]

fds stands for FileDataSet; it allows creating a dataset simply by providing a filename and, optionally, a simulation name, a variable name, a period and a model name.

For dataset attributes which are not provided, these defaults apply :

  • simulation : the filename basename (without suffix ‘.nc’)
  • variable : the set of variables in the data file
  • period : the period actually covered by the data file (if it has time_bnds)
  • model : the ‘model_id’ attribute if it exists, otherwise : ‘no_model’
  • project : ‘file’ (with separator = ‘|’)
  • frequency : the value of global attribute ‘frequency’ in the datafile, if it exists

The following restriction applies to such datasets :

Results are unforeseen if all variables do not have the same time axis

Examples : See data_file.py
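
A minimal sketch (the file path is hypothetical); attributes that are not provided take the defaults listed above:

>>> d = fds('/home/me/data/historical_1980.nc', variable='tas', period='1980')
>>> d2 = fds('/home/me/data/historical_1980.nc')   # variable, period and model read from the file itself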

cproject : declare a new project and its non-standard attributes/facets

class climaf.classes.cproject(name, *args, **kwargs)[source]

Declare a project and its facets/attributes in CliMAF (see below)

Parameters:
  • name (string) – project name; do not use the chosen separator in it (see below)
  • args (strings) – attribute names; they are free; do not use the chosen separator in them (see below); CliMAF will anyway add the attributes : project, simulation, variable, period, and domain
  • kwargs (dict) –

    can only be used with keywords :

    • sep or separator for indicating the symbol separating facets in the dataset syntax. Defaults to “.”.
    • ensemble for declaring a list of attribute names which are allowed for defining an ensemble in this project (‘simulation’ is automatically allowed)
    • use_frequency to declare that the frequency cannot be derived from the time bounds of the file. In this case, the facet frequency is mandatory for the project and a default value must be defined.

Returns : a cproject object, whose string representation is the pattern later used in the CliMAF Reference Syntax for representing datasets in this project

A ‘cproject’ is the definition of a set of attributes, or facets, whose values will completely define a ‘dataset’ as managed by CliMAF. Its name is one of the possible keys for describing data locations (see dataloc)

For instance, cproject CMIP5, following its Data Reference Syntax, has attributes : model, simulation (used for rip), experiment, variable, frequency, realm, table, version

A number of projects are built-in. See projects

A dataset in a cproject declared as

>>> cproject('MINE','myfreq','myfacet',sep='_')

will return

${project}_${simulation}_${variable}_${period}_${domain}_${myfreq}_${myfacet}

and will have datasets represented as e.g.:

'MINE_hist_tas_[1980-1999]_global_decadal_gabu'

while an example for built-in cproject CMIP5 will be:

'CMIP5.historical.pr.[1980].global.monthly.CNRM-CM5.r1i1p1.mon.Amon.atmos.last'

The attributes list should include all facets which are useful for distinguishing datasets from each other, and for computing datafile pathnames in the ‘generic’ organization (see dataloc)

A default value for a given facet can be specified by providing a tuple (facet_name, default_value) instead of the facet name, as sketched below. This default value is however of lower priority than the value set using cdef()
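
For instance, a variant of the ‘MINE’ example above where facet ‘myfreq’ gets the default value ‘monthly’:

>>> cproject('MINE', ('myfreq', 'monthly'), 'myfacet', sep='_')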

A project can be declared as having non-standard variable names in datafiles, or variables that should undergo re-scaling; see calias()

A project can be declared as having non-standard frequency names (this is used when accessing datafiles); see cfreqs())

derive_cproject : create a new project from an existing one by changing its name and possibly its facets

climaf.classes.derive_cproject(name, parent_name, new_project_facets=[])[source]

Create a new project named ‘name’ from the project ‘parent_name’, adding the facets listed in ‘new_project_facets’ if specified. The data location list is also derived from the parent project.

Parameters:
  • name – name of the new project
  • parent_name – name of the source project
  • new_project_facets – the list of the facets to add to the new project (could be already present in parent).
Returns:

the new project
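
A sketch (the project name ‘MY_CMIP6’ and facet ‘member_tag’ are hypothetical):

>>> derive_cproject('MY_CMIP6', 'CMIP6', ['member_tag'])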

cprojects : dictionary of known projects

env.environment.cprojects = {None: ${project}.${simulation}.${variable}.${period}.${domain}}

Dictionary of declared projects (type is cproject)

dataloc : describe data locations for a series of simulations

class climaf.dataloc.dataloc(project='*', organization='generic', url=None, model='*', simulation='*', realm='*', table='*', frequency='*')[source]

Create an entry in the data locations dictionary for an ensemble of datasets.

Parameters:
  • project (str,optional) – project name
  • model (str,optional) – model name
  • simulation (str,optional) – simulation name
  • frequency (str,optional) – frequency
  • organization (str) – name of the organization type, among those handled by selectFiles()
  • url (list of strings) – list of URLS for the data root directories, local or remote

Each entry in the dictionary allows storing :

  • a list of paths or URLs (local or remote), which are root paths for finding some sets of datafiles which share a file organization scheme.

    • For remote data:

      url is supposed to be in the format ‘protocol:user@host:path’, but ‘protocol’ and ‘user’ are optional. So, url can also be ‘user@host:path’ or ‘protocol:host:path’ or ‘host:path’. ftp is the default protocol (and, as a matter of fact, the only one managed yet).

      If ‘user’ is given:

      • if ‘host’ is in the $HOME/.netrc file, CliMAF checks whether the corresponding ‘login’ == ‘user’. If it is, CliMAF gets the associated password; otherwise it will prompt the user for a password;
      • if ‘host’ is not present in the $HOME/.netrc file, CliMAF will prompt the user for a password.

      If ‘user’ is not given:

      • if ‘host’ is in the $HOME/.netrc file, CliMAF gets the corresponding ‘login’ as ‘user’ and also gets the associated password;
      • if ‘host’ is not present in the $HOME/.netrc file, CliMAF prompts the user for ‘user’ and ‘password’.

      Remark: The .netrc file contains the login and password used by the auto-login process. It generally resides in the user’s home directory ($HOME/.netrc). So, it is highly recommended to supply this information in the .netrc file, so as not to have to enter a password for every request.

      Warning: The Python netrc module does not handle multiple entries for a single host. So, if the netrc file has two entries for the same host, the netrc module only returns the last entry.

      We define two kinds of host: hosts with evolving files, e.g. ‘beaufix’; and the others.

      For any file returned by function listfiles() which is found in cache:

      • in case of hosts with dynamic files, the file is transferred only if its date on server is more recent than that found in cache;
      • for other hosts, the file found in cache is used
  • the name for the corresponding data files organization scheme. The current set of known schemes is :

    • CMIP5_DRS : any datafile organized after the CMIP5 data reference syntax, such as on IPSL’s Ciclad and
      CNRM’s Lustre
    • EM : CNRM-CM post-processed outputs as organized using EM (please use a list containing any single string for arg url)
    • generic : a data organization described by the user, using patterns such as described for selectGenericFiles(). This is the default

    Please ask the CliMAF dev team for implementing further organizations. It is quite quick for data which are on the filesystem. Organizations considered for future implementations are :

    • NetCDF model outputs as available during an ECLIS or libIGCM simulation
    • ESGF
  • the set of attribute values of the simulations whose data are stored at those URLs and with that organization

    For remote files, the filename pattern must include ${varname}, which is instantiated with the variable name or with filenameVar (given via calias()), for the sake of efficiency. Please complain if this is inadequate

For the sake of brevity, each attribute can have the ‘*’ wildcard value; when using the dictionary, the most specific entries will be used (which means : the entry (or entries) with the lowest number of wildcards)

Example :

  • Declaring that all IPSLCM-Z-HR data for project PRE_CMIP6 are stored under a single root path and follow the organization named CMIP6_DRS:

    >>> dataloc(project='PRE_CMIP6', model='IPSLCM-Z-HR', organization='CMIP6_DRS', url=['/prodigfs/esg/'])
    
  • and declaring an exception for one simulation (here, both location and organization are supposed to be different):

    >>> dataloc(project='PRE_CMIP6', model='IPSLCM-Z-HR', simulation='my_exp', organization='EM',
    ...         url=['~/tmp/my_exp_data'])
    
  • and declaring a project to access remote data (on multiple servers):

    >>> cproject('MY_REMOTE_DATA', ('frequency', 'monthly'), separator='|')
    >>> dataloc(project='MY_REMOTE_DATA', organization='generic',
    ...         url=['beaufix:/home/gmgec/mrgu/vignonl/*/${simulation}SFX${PERIOD}.nc',
    ...              'ftp:vignonl@hendrix:/home/vignonl/${model}/${variable}_1m_${PERIOD}_${model}.nc'])
    >>> calias('MY_REMOTE_DATA','tas','tas',filenameVar='2T')
    >>> tas = ds(project='MY_REMOTE_DATA', simulation='AMIPV6ALB2G', variable='tas', frequency='monthly',
    ...          period='198101')
    

Please refer to the example section of the documentation for an example with each organization scheme

cdefault: set or get a default value for some data attribute/facet

climaf.classes.cdef(attribute, value=None, project=None)[source]

Set or get the default value for a CliMAF dataset attribute or facet (such as e.g. ‘model’, ‘simulation’ …), for use by next calls to cdataset() or to ds()

Argument ‘project’ allows restricting the use/query of the default value to the context of the given ‘project’. One can also set the (global) default value for attribute ‘project’

There is no actual check that ‘attribute’ is a valid keyword for a call to ds or cdataset

Example:

>>> cdef('project','OCMPI5')
>>> cdef('frequency','monthly',project='OCMPI5')

derive : define a variable as computed from other variables

climaf.operators_derive.derive(project, derivedVar, Operator, *invars, **params)[source]

Define that ‘derivedVar’ is a derived variable in ‘project’, computed by applying ‘Operator’ to input streams which are datasets whose variable names take the values in *invars and the parameter/arguments of Operator take the values in **params

‘project’ may be the wildcard : ‘*’

Example, assuming that operator ‘minus’ has been defined as

>>> cscript('minus','cdo sub ${in_1} ${in_2} ${out}')

which means that minus uses CDO for subtracting the two datasets; you may define, for a given project ‘CMIP5’, a new variable, e.g. for the cloud radiative effect at the surface, named ‘rscre’, using the difference of the all-sky and clear-sky net radiation values at the surface, by:

>>> derive('CMIP5', 'rscre','minus','rs','rscs')

You may then use this variable name at any location you would use any other variable name

Note : you may use wildcard ‘*’ for the project

Another example is rescaling or renaming some variable; here, let us define how variable ‘ta’ can be derived from ERAI variable ‘t’ :

>>> derive('erai', 'ta','rescale', 't', scale=1., offset=0.)

However, this is not the most efficient way to do that. See calias()

Expert use : argument ‘derivedVar’ may be a dictionary, whose keys are derived variable names and whose values are script output names; example

>>> cscript('vertical_interp', 'vinterp.sh ${in} surface_pressure=${in_2} ${out_l500} ${out_l850} method=${opt}')
>>> derive('*', {'z500' : 'l500' , 'z850' : 'l850'},'vertical_interp', 'zg', 'ps', opt='log')

calias : define a variable as computed, in a project, from another, single, variable

climaf.classes.calias(project, variable, fileVariable=None, scale=1.0, offset=0.0, units=None, missing=None, filenameVar=None, conditions=None)[source]

Declare that, in project, variable is to be computed by reading fileVariable and applying scale and offset (see the first erai example below)

Arg conditions allows restricting the effect, based on the value of some facets. It is a dictionary whose keys are the facets and whose entries are the applicable values or lists of values (see the CMIP6 example below)

Arg filenameVar allows telling which fake variable name should be used when computing the filename for this variable in this project (for optimisation purposes) (see the second erai example below)

Can tell that a given constant must be interpreted as a missing value (see 4th example, EM, below)

variable may be a list. In that case, fileVariable and filenameVar, if provided, should be parallel lists

variable can be a comma-separated list of variables, in which case this tells how variables are grouped in files (it makes sense to use filenameVar in that case, as this is a way to provide the label which is unique to this grouping of variables); scale, offset and missing args must be the same for all variables in that case

Example

>>> calias('erai','tas_degC','t2m',scale=1., offset=-273.15)  # scale and offset may be provided
>>> calias('CMIP6','evspsbl',scale=-1., conditions={ 'model':'CanESM5' , 'version': ['20180103', '20190112'] })
>>> calias('erai','tas','t2m',filenameVar='2T')
>>> calias('EM',[ 'sic', 'sit', 'sim', 'snd', 'ialb', 'tsice'], missing=1.e+20)
>>> calias('data_CNRM','so,thetao',filenameVar='grid_T_table2.2')

NB: A wrapper with the same name as this function is defined in climaf.driver.calias(), and it is the one exported by module climaf.api. It allows using a list of variables.

climaf.driver.calias(project, variable, fileVariable=None, **kwargs)[source]

See climaf.classes.calias()

Declare that, in project, variable is to be computed by reading fileVariable; it allows using a list of variables, given as a string where the variable names are separated by commas

cfreqs : declare non-standard frequency names, for a project

climaf.classes.cfreqs(project, dic)[source]

Allows declaring a dictionary, specific to project, for matching normalized frequency values to project-specific frequency values

Normalized frequency values are :
decadal, yearly, monthly, daily, 6h, 3h, fx and annual_cycle

When defining a dataset, any reference to a non-standard frequency will be left unchanged, both in the dataset’s CRS and when trying to access the corresponding datafiles

Examples:

>>> cfreqs('CMIP5',{'monthly':'mon' , 'daily':'day' })

crealms : declare non-standard realm names, for a project

climaf.classes.crealms(project, dic)[source]

Allow to declare a dictionary specific to project for matching normalized realm names to project-specific realm names

Normalized realm names are :
atmos, ocean, land, seaice

When defining a dataset, any reference to a non-standard realm will be left unchanged, both in the dataset’s CRS and when trying to access the corresponding datafiles

Examples:

>>> crealms('CMIP5',{'atmos':'ATM' , 'ocean':'OCE' })