Functions for processing data

Standard operators

For documented standard operators see Standard operators and functions

Functions returning CliMAF objects

For functions which look like CliMAF operators, see : Functions returning CliMAF objects

Functions for creating new processing functions, or tuning their behaviour

cscript : define a new CliMAF operator

Defining a new CliMAF operator also defines a new Python function, with the same name

class climaf.operators.cscript(name, command, format='nc', select=True, canOpendap=False, commuteWithTimeConcatenation=False, commuteWithSpaceConcatenation=False, canSelectVar=False, doCatTime=False, fatal=False, **kwargs)

Declare a script or binary as a ‘CliMAF operator’, and define a Python function with the same name

Parameters:
  • name (str) – name for the CliMAF operator.
  • command (str) – script calling sequence, according to the syntax described below.
  • format (str) – script output format: either ‘nc’, ‘png’, ‘pdf’, ‘eps’, ‘None’, ‘graph’ (which lets the user choose among three graphic output formats: ‘png’, ‘pdf’ or ‘eps’) or ‘txt’ (text output is not managed by CliMAF, only displayed; ‘txt’ allows using e.g. ‘ncdump -h’ from inside CliMAF); defaults to ‘nc’
  • select (bool, optional) – whether CliMAF should automatically perform data selection/transformation when this script is applied directly to some dataset(s) (i.e. selection on variable, time, domain, aliasing … according to the definition(s) of the input dataset(s)). Defaults to True
  • canOpendap (bool, optional) – is the script able to use OpenDAP URIs? Defaults to False
  • commuteWithTimeConcatenation (bool, optional) – can the operation commute with concatenation of time periods? Set it to True if the operator can be applied on time chunks separately, in order to allow for incremental computation / time chunking; defaults to False
  • commuteWithSpaceConcatenation (bool, optional) – can the operation commute with concatenation of space domains? Defaults to False (see commuteWithTimeConcatenation)
  • doCatTime (bool, optional) – does this script concatenate data over time? Defaults to False. See example in $CLIMAF/doc/operators_which_concatenate_over_time.html
  • fatal (bool, optional) – if False and the executable is not available, do not crash but print a warning
  • **kwargs – possible keyword arguments, with keys matching ‘<outname>_var’, for providing a format string allowing to compute the variable name for output ‘outname’ (see below).
Returns:

None

The script calling sequence pattern string (argument ‘command’) indicates how to build the system call which actually launches the script, by matching Python objects with formal arguments.

To introduce the syntax, consider this example, with the following commands:

>>> cscript('mycdo','cdo ${operator} ${in} ${out}')
>>> # define some dataset
>>> tas_ds = ds(project='example', simulation='AMIPV6ALB2G', variable='tas', period='1980-1981')
>>> # Apply operator 'mycdo' to dataset 'tas_ds', choosing a given 'operator' argument
>>> tas_avg = mycdo(tas_ds,operator='timavg')

CliMAF will later on launch this call behind the curtain:

$ cdo timavg /home/my/tmp/climaf_cache/8a/5.nc /home/my/tmp/climaf_cache/4e/4.nc

where :

  • the last filename is generated by CliMAF from the formal expression describing ‘tas_avg’, and will receive the result
  • the first filename provides a file generated by CliMAF which includes the data required for tas_ds
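
The substitution described above can be pictured with Python's standard string.Template, which happens to use the same ${name} notation. This is only a sketch of the idea, not CliMAF's actual implementation: the helper build_command is hypothetical, and the two filenames are simply those of the example call above.

```python
from string import Template

# The calling-sequence pattern from the example above.
pattern = Template('cdo ${operator} ${in} ${out}')


def build_command(pattern, user_args, in_file, out_file):
    """Fill a pattern (hypothetical helper, not part of CliMAF).

    Reserved keywords 'in' and 'out' receive filenames; every other
    argument takes the value given when the operator is called.
    """
    values = dict(user_args)
    values['in'] = in_file    # filename holding the data of the first input
    values['out'] = out_file  # filename that will receive the result
    return pattern.substitute(values)


command = build_command(
    pattern,
    {'operator': 'timavg'},
    '/home/my/tmp/climaf_cache/8a/5.nc',
    '/home/my/tmp/climaf_cache/4e/4.nc',
)
print(command)
# cdo timavg /home/my/tmp/climaf_cache/8a/5.nc /home/my/tmp/climaf_cache/4e/4.nc
```

In this sketch, an argument that appears in the pattern but is missing from the call makes substitute raise KeyError, which is one simple way to catch an incomplete call early.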

There are a number of examples declared in module standard_operators.

Detailed syntax:

  • formal arguments appear as : ${argument} (in the example : ${in}, ${out}, ${operator} )
  • except for reserved keywords, arguments in the pattern will be replaced by the values for corresponding keywords used when invoking the diagnostic operator:

    • in the example above : argument operator is replaced by value timavg, which is a keyword known to the external binary called, CDO
  • reserved argument keywords are :
  • in, in_<digit>, ins, ins_<digit>, mmin : they will be replaced by CliMAF-managed filenames for input data, as deduced from dataset description or upstream computation; these filenames can actually be remote URLs (if the script can use OpenDAP, see argument canOpendap), local ‘raw’ data files, or CliMAF cache filenames

    • in stands for the URL of the first dataset invoked in the operator call
    • in_<digit> stands for the next ones, in the same order
    • ins and ins_<digit> stand for the case where the script can select input from multiple input files or URLs (e.g. when the whole period to process spans multiple files); in that case, a single string (surrounded with double quotes) will carry multiple URLs
    • mmin stands for the case where the script accepts an ensemble of datasets as argument. CliMAF will replace the keyword by a string composed of the corresponding input filenames (not surrounded by quotes - please add them yourself in the declaration); see also labels below
  • var, var_<digit> : when a script can select a variable in a multi-variable input stream, this is declared by adding this keyword in the calling sequence; CliMAF will replace it by the actual variable name to process, but only if it has not already filtered data for that variable; ‘var’ stands for first input stream, ‘var_<digit>’ for the next ones;

    • in the example above, we assume that external binary CDO is not tasked with selecting the variable, and that CliMAF must feed CDO with a datafile where it has already performed the selection
    • if the script MUST receive the name of the variable in all circumstances, use keyword Var
  • period, period_<digit> : when a script can select a time period in the content of a file or stream, it should declare it by putting this keyword in the pattern, which will be replaced at call time by the period written as <date1>-<date2>, where each date is formatted as YYYYMMDD ;

    • time intervals must be interpreted as [date1, date2[
    • ‘period’ stands for the first input_stream,
    • ‘period_<n>’ for the next ones, in the order of actual call;
    • in the example above, this keyword is not used, which means that CliMAF has to select the period upstream of feeding CDO with the data
  • period_iso, period_iso_<digit> : as for period above, except that the date formatting fits CDO conventions :

    • date format is ISO : YYYY-MM-DDTHH:MM:SS
    • interval is [date1,date2_iso], where date2_iso is 1 minute before date2
    • separator between dates is : ,
  • domain, domain_<digit> : when a script can select a domain in the input grid, this is declared by adding this keyword in the calling sequence; CliMAF will replace it by the domain definition if needed, as ‘latmin,latmax,lonmin,lonmax’ ; ‘domain’ stands for first input stream, ‘domain_<digit>’ for the next ones :

    • in the example above, we assume that external binary CDO is not tasked with selecting the domain, and that CliMAF must feed CDO with a datafile where it has already performed the selection
  • out, out_<word> : CliMAF provides file names for output files (if there is no such field, the script will have only ‘side effects’, e.g. launching a viewer). The main output file must be created by the script with the name provided at the location of argument ${out}. Using arguments like ‘out_<word>’ tells CliMAF that the script provides some secondary output, which will be symbolically known in CliMAF syntax as an attribute of the main object; by default, the variable name of each output equals the name of the output (except for the main output, whose variable name is supposed to be the same as for the first input); for other cases, see argument **kwargs to provide a format string, used to derive the variable name from the first input variable name, as in e.g. : output2_var='std_dev(%s)' for the output labelled output2 (i.e. declared as ‘${out_output2}’) or _var='std_dev(%s)' for the default (main) output

    • in the example above, we just apply the convention used by CDO, which expects that you provide an output filename as last argument on the command line. See example mean_and_sdev in doc for advanced usage.
  • crs : will be replaced by the CliMAF Reference Syntax expression describing the first input stream; can be useful for plot title or legend

  • alias : used if the script can perform an on-the-fly re-scaling and renaming of a variable. Will be replaced by a string whose pattern is : ‘new_varname,file_varname,scale,offset’. The script should then transform on reading as new_varname = file_varname * scale + offset

  • units, units_<digit> : means that the script can set the units on-the-fly while reading one of the input streams

  • missing : means that the script can make an on-the-fly transformation of a given constant to missing values

  • labels : for scripts accepting ensembles, CliMAF will replace this keyword by a string bearing the labels associated with the ensemble, with delimiter $ as e.g. in: “CNRM-CM5 is fine$IPSL-CM5-LR is not bad$CCSM-29 is …”
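
To make the string shapes of some reserved keywords concrete, here is a minimal sketch of what a script receives for period, period_iso and alias, written from the descriptions in the list above. The three helper functions are hypothetical (they are not CliMAF functions), and the way values are stored in a dict is an assumption made only for the illustration.

```python
from datetime import datetime, timedelta


def period_string(date1, date2):
    """'period' keyword: <date1>-<date2>, dates as YYYYMMDD, interval [date1, date2[."""
    return '{:%Y%m%d}-{:%Y%m%d}'.format(date1, date2)


def period_iso_string(date1, date2):
    """'period_iso' keyword: ISO dates separated by a comma, closed interval
    ending 1 minute before date2."""
    fmt = '%Y-%m-%dT%H:%M:%S'
    date2_iso = date2 - timedelta(minutes=1)
    return date1.strftime(fmt) + ',' + date2_iso.strftime(fmt)


def apply_alias(alias, file_variables):
    """'alias' keyword: parse 'new_varname,file_varname,scale,offset' and apply
    new_varname = file_varname * scale + offset.

    file_variables is an assumed stand-in for file content:
    a dict mapping variable names to lists of numbers.
    """
    new_varname, file_varname, scale, offset = alias.split(',')
    values = [v * float(scale) + float(offset) for v in file_variables[file_varname]]
    return new_varname, values


start, end = datetime(1980, 1, 1), datetime(1982, 1, 1)
print(period_string(start, end))      # 19800101-19820101
print(period_iso_string(start, end))  # 1980-01-01T00:00:00,1981-12-31T23:59:00
```

A script that declares ${period} must therefore treat the second date as excluded, whereas one that declares ${period_iso} can treat both bounds as included.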

fixed_fields : when operators need auxiliary data fields (e.g. grid, mesh, mask)

You may also need to declare how an operator will receive some fixed fields ‘behind the curtain’ (in addition to the datasets, which are provided as arguments)

climaf.operators.fixed_fields(operator, *paths)

Declare that an operator (or a list of operators) needs fixed fields. CliMAF will provide them to the operator at execution time through symbolic links. This is a ‘set’ type of operation, not an ‘add’ one : only the last call is considered (it resets the list of fields)

Parameters:
  • operator (string, or list of strings) – name of the CliMAF operator.
  • paths (couples) – a number of couples, each composed of the filename as expected by the operator and a path for the data; the path may use placeholders : ${model}, ${project}, ${simulation}, ${realm} and ${grid}, which will be replaced by the corresponding facet values for the first operand of the target operator.
Returns:

None

Example

>>> fixed_fields('ccdftransport',
 ... ('mesh_hgr.nc','/data/climaf/${project}/${model}/ORCA1_mesh_hgr.nc'),
 ... ('mesh_zgr.nc','/data/climaf/${project}/${model}/ORCA1_mesh_zgr.nc'))
>>> fixed_fields('plot',
 ... ('coordinates.nc','/cnrm/ioga/Users/chevallier/chevalli/Partage/NEMO/eORCA_R025_coordinates_v1.0.nc'))

cmacro : define a macro

climaf.cmacro.macro(name, cobj, lobjects=[])

Define a CliMAF macro from a CliMAF compound object.

Transform a CliMAF object into a macro, replacing all datasets, and the objects of lobjects, by a dummy argument. Register it in dict cmacros, if name is not None

Parameters:
  • name (string) – the name you want to give to the macro; a Python function with the same name will be defined
  • cobj (CliMAF object, or string) – any CliMAF object, usually the result of a series of operators, that you would like to repeat using other input datasets; alternatively, you can provide the macro formula as a string (when accustomed to the syntax)
  • lobjects (list, optional) – for expert use: a list of objects, which are sub-objects of cobj, and which should become arguments of the macro
Returns:

a Python function is also defined in module cmacros and in the main namespace, and you may use it in the same way as a CliMAF operator. All the datasets involved in cobj become arguments of the macro, which allows you to redo the same computations and easily define objects similar to cobj

Return type:

a macro; the returned value is usually not used ‘as is’

Example:

>>> # First use and combine CliMAF operators to get some interesting result using some dataset(s)
>>> january_ta=ds(project='example',simulation='AMIPV6ALB2G',variable='ta',frequency='monthly',period='198001')
>>> ta_europe=llbox(january_ta,latmin=40,latmax=60,lonmin=-15,lonmax=25)
>>> ta_ezm=ccdo(ta_europe,operator='zonmean')
>>> fig_ezm=plot(ta_ezm)
>>> #
>>> # Using this result as an example, define a macro named 'eu_cross_section',
>>> # which arguments will be the datasets involved in this result
>>> cmacro('eu_cross_section',fig_ezm)
>>> #
>>> # You can of course apply a macro to another dataset(s) (even here to a 2D variable)
>>> pr=ds(project='example',simulation='AMIPV6ALB2G', variable='pr', frequency='monthly', period='198001')
>>> pr_ezm=eu_cross_section(pr)
>>> #
>>> # All macros are registered in dictionary climaf.cmacro.cmacros,
>>> # which is imported by climaf.api; you can list it by :
>>> cmacros

Note : macros are automatically saved in file ~/.climaf.macros, and can be edited

See also the more detailed explanations in the example at macro.py
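
As a way to picture what ‘replacing all datasets by a dummy argument’ means, here is a toy model in plain Python. It is not how CliMAF represents its objects: expressions are modelled as nested tuples, the functions to_macro and apply_macro are hypothetical, and every dataset leaf gets its own argument, which may differ from CliMAF's actual handling.

```python
# Toy expression model (an assumption for illustration only):
#   dataset  : ('ds', variable_name)
#   operator : ('op', operator_name, [operands], {parameters})


def to_macro(expr):
    """Replace every dataset leaf of expr by a numbered dummy argument."""
    count = 0

    def walk(node):
        nonlocal count
        if node[0] == 'ds':
            count += 1
            return ('ARG', count)
        _, name, operands, params = node
        return ('op', name, [walk(operand) for operand in operands], params)

    return walk(expr), count


def apply_macro(macro, *datasets):
    """Rebuild an expression by putting actual datasets back in place of the arguments."""
    def walk(node):
        if node[0] == 'ARG':
            return datasets[node[1] - 1]
        _, name, operands, params = node
        return ('op', name, [walk(operand) for operand in operands], params)

    return walk(macro)


# Same chain of operators as in the example above, in the toy model
january_ta = ('ds', 'ta')
ta_europe = ('op', 'llbox', [january_ta],
             {'latmin': 40, 'latmax': 60, 'lonmin': -15, 'lonmax': 25})
ta_ezm = ('op', 'ccdo', [ta_europe], {'operator': 'zonmean'})
fig_ezm = ('op', 'plot', [ta_ezm], {})

eu_cross_section, n_args = to_macro(fig_ezm)
pr_fig = apply_macro(eu_cross_section, ('ds', 'pr'))
print(n_args)  # 1
```

Applying the toy macro to the original dataset gives back the original expression, which is the property that makes a macro a reusable template of the computation.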