Internal functions - presented here for their documentation

init_period

This function should not be called directly; it is presented here mainly to document the syntax of the strings that describe a period of time.

climaf.period.init_period(dates)

Initialize a CliMAF ‘period’ object.

Parameters: dates (str) – must match r’YYYY[MM[DD[HH[MM]]]][(-|_)YYYY[MM[DD[HH[MM]]]]]’, or be ‘fx’ for fixed fields
Returns: the corresponding CliMAF ‘period’ object

When only a year (YYYY) is given, leading zeros may be omitted. Year 0000 cannot be handled.

Examples:

a one-year-long period: ‘1980’, or ‘1980-1980’
a decade: ‘1980-1989’
first millennium: ‘1-1000’ (leading zeros are required if you also want to specify a month)
first century: ‘1-100’
one month: ‘198005’
two months: ‘198003-198004’
one day: ‘17890714’
the same single day, written in a more complicated way: ‘17890714-17890714’
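
The syntax above can be checked with a regular expression. The following is a minimal sketch, not CliMAF's own code: the helper name looks_like_period is hypothetical, and it only tests the shape of the string (it does not reject year 0000, nor impossible months or days).

```python
import re

# One date: either a 4-digit year followed by MM, then optionally DD, HH
# and MM, or a bare year of 1 to 4 digits (leading zeros omitted).
_DATE = r"(?:\d{4}(?:\d{2}){1,4}|\d{1,4})"

# A period: 'fx', or one date, or two dates separated by '-' or '_'.
_PERIOD_RE = re.compile(rf"(?:fx|{_DATE}(?:[-_]{_DATE})?)")


def looks_like_period(text):
    """Return True if text has the shape of a period string (hypothetical helper)."""
    return _PERIOD_RE.fullmatch(text) is not None


for candidate in ("1980", "1-1000", "198003-198004", "17890714", "fx", "19805"):
    print(candidate, looks_like_period(candidate))
```

The last candidate, ‘19805’, is rejected because a month can only follow a full four-digit year.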

CliMAF internally handles date-time values with a 1-minute accuracy; it can provide date information to external scripts in two forms; see keywords ‘period’ and ‘period_iso’ in cscript()

selectFiles

This function should not be called directly; it is presented here mainly to document the list of organizations it can handle for function dataloc.

climaf.dataloc.selectFiles(return_wildcards=None, merge_periods_on=None, return_combinations=None, with_periods=None, use_frequency=False, **search_dict)

Returns the shortest list of (local or remote) files which include the data for the list of (facet,value) pairs provided

Method:

use the data locations indexed by dataloc() to identify the data organization and data store urls for these (facet,value) pairs
check that the data organization is a known one, i.e. is one of ‘generic’, ‘intake’, ‘CMIP5_DRS’ or ‘EM’
derive the relevant filename search function from the data organization scheme, such as climaf.dataloc.selectCmip5DrsFiles()
pass the urls and relevant facet values to this filename search function
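
The four steps above amount to a lookup followed by a dispatch. The sketch below illustrates that idea only; it is not the CliMAF implementation. The function select_files_sketch, the layout of the location records and the search_functions mapping are all assumptions made for the example, as is the rule that a facet missing from the request is compatible with any location.

```python
KNOWN_ORGANIZATIONS = ("generic", "intake", "CMIP5_DRS", "EM")


def select_files_sketch(locations, search_functions, **facets):
    """Collect files from every declared location compatible with facets.

    locations: list of dicts with keys 'facets', 'organization' and 'urls'
    (a hypothetical stand-in for what dataloc() records).
    search_functions: dict mapping an organization name to a callable
    taking (urls, **facets) and returning a list of file names.
    """
    files = []
    for location in locations:
        # Step 1: skip locations declared for other facet values.
        declared = location["facets"]
        if any(facets.get(key, value) != value for key, value in declared.items()):
            continue
        # Step 2: the organization must be a known one.
        organization = location["organization"]
        if organization not in KNOWN_ORGANIZATIONS:
            raise ValueError(f"unknown data organization: {organization!r}")
        # Steps 3 and 4: pick the search function and hand it urls and facets.
        search = search_functions[organization]
        files.extend(search(location["urls"], **facets))
    return files


def fake_generic_search(urls, **facets):
    # Stand-in for a real search function; it does not touch the filesystem.
    return [f"{url}/{facets['variable']}.nc" for url in urls]


locations = [
    {"facets": {"project": "P1"}, "organization": "generic", "urls": ["/data/P1"]},
    {"facets": {"project": "P2"}, "organization": "EM", "urls": ["/data/P2"]},
]
print(select_files_sketch(locations, {"generic": fake_generic_search},
                          project="P1", variable="tas"))
```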

selectGenericFiles

This function should not be called directly; it is presented here mainly to document the syntax of argument url of function dataloc when organization is set to generic.

climaf.dataloc.selectGenericFiles(urls, kwargs, return_combinations=None, use_frequency=False, return_wildcards=None, merge_periods_on=None)

Allows describing a generic file organization: the list of files returned by this function is composed of files which:

match the patterns in urls once these patterns are instantiated by the values in kwargs, and
contain the variable provided in kwargs, and
match the period provided in kwargs

kwargs can have entries which are lists; such an entry is then interpreted as:

a first element which is a pattern (i.e. which includes * or ?)
further elements which are the possible values, as diagnosed by some logic upstream

In the pattern strings, no keyword is mandatory. However, for remote files, the filename pattern must include ${varname}, which is instantiated by the variable name or by filenameVar (given via calias()); this is for the sake of efficiency (please complain if inadequate)

Example :

>>> selectGenericFiles(project ='my_project',model ='my_model', simulation ='lastexp', variable ='tas',
...                    period ='1980', urls =['~/DATA/${project}/${model}/*${variable}*${PERIOD}*.nc'])

/home/stephane/DATA/my_project/my_model/somefilewith_tas_Y1980.nc

In the pattern strings, the keywords that can be used in addition to the argument names (e.g. ${model}) are:

${variable}: use it if the files are split by variable and filenames do include the variable name, as this speeds up the search
${PERIOD}: use it for indicating the period covered by each file, if this is applicable in the file naming; this period can appear in filenames as YYYY, YYYYMM, YYYYMMDD, YYYYMMDDHHMM, either once only, or twice with separator ‘-’ or ‘_’
wildcards ‘?’ and ‘*’ for matching respectively one and any number of characters
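
Because the keywords use the ${name} syntax, instantiating a pattern can be illustrated with Python's string.Template. This is a sketch under assumptions, not the code CliMAF runs: the helper instantiate is hypothetical, ${PERIOD} is simply replaced by a ‘*’ wildcard, and matching is done against a plain list of names with fnmatchcase rather than against the filesystem.

```python
from fnmatch import fnmatchcase
from string import Template


def instantiate(pattern, **facets):
    """Replace ${facet} keywords by their values and ${PERIOD} by '*'."""
    values = dict(facets, PERIOD="*")
    # safe_substitute leaves any keyword without a value untouched.
    return Template(pattern).safe_substitute(values)


glob_pattern = instantiate(
    "/DATA/${project}/${model}/*${variable}*${PERIOD}*.nc",
    project="my_project", model="my_model", variable="tas",
)
print(glob_pattern)

candidates = [
    "/DATA/my_project/my_model/somefilewith_tas_Y1980.nc",
    "/DATA/my_project/my_model/somefilewith_pr_Y1980.nc",
]
print([name for name in candidates if fnmatchcase(name, glob_pattern)])
```

Wildcard matching alone cannot tell whether the digits in a file name cover the requested period; that needs a separate check on the text matched by ${PERIOD}.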

Summary of the algorithm:

A regular expression is built to match periods
Loop over the patterns of the list url:
- Instantiate the pattern with the provided facet values, and with “.*” for $PERIOD
- Run glob.glob
- Refine: keep only the values which match the period regexp (provided the pattern includes $PERIOD); if nothing is found, also try with filenameVar; this yields a list of files, lfiles
- Find out which values were encountered for each facet: build a regular expression (with groups) which captures the facet values (including PERIOD), and another one which captures the date only (is this still necessary?)
- Loop over the files of lfiles:
  if the pattern does not indicate that the date can be extracted:
    if the frequency indicates a fixed field, the file is retained
    otherwise, it is also retained, without filtering on the period
  if it does:
    extract the period
    check whether it is suitable (various cases …)
  if filtering on the variable was possible, or variable is “*” or multiple, or the file contains the right variable, possibly after renaming, then the file is retained
  Each time a file is retained, the values encountered for the attributes are added to the dict wildcard_facets
- As soon as one pattern of the list url has yielded matching files, the following patterns are not examined
At the end, the dictionary of facet values to be returned is formatted
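
The period-filtering part of this summary can be sketched as follows. This is an illustration under simplifying assumptions, not the actual implementation: the helper names are hypothetical, ${PERIOD} is assumed to occur at most once in the pattern, overlap is tested at year resolution only, and the checks on variable and frequency are left out.

```python
import re

_DATE = r"\d{4}(?:\d{2}){0,4}"
# Named group capturing one date, or two dates separated by '-' or '_'.
_PERIOD = rf"(?P<PERIOD>{_DATE}(?:[-_]{_DATE})?)"


def pattern_to_regex(pattern):
    """Turn a file name pattern into a regex that captures ${PERIOD}."""
    pieces = []
    for part in re.split(r"(\$\{PERIOD\}|\*|\?)", pattern):
        if part == "${PERIOD}":
            pieces.append(_PERIOD)
        elif part == "*":
            pieces.append(".*?")  # non-greedy, so that PERIOD gets all the digits
        elif part == "?":
            pieces.append(".")
        else:
            pieces.append(re.escape(part))
    return re.compile("".join(pieces))


def years_of(period):
    """Return (first_year, last_year) for a string such as '198001-198912'."""
    bounds = re.split(r"[-_]", period)
    return int(bounds[0][:4]), int(bounds[-1][:4])


def filter_by_period(filenames, pattern, wanted):
    """Keep the names matching pattern whose period overlaps wanted."""
    regex = pattern_to_regex(pattern)
    has_period = "${PERIOD}" in pattern
    want_first, want_last = years_of(wanted)
    kept = []
    for name in filenames:
        found = regex.fullmatch(name)
        if found is None:
            continue
        if not has_period:
            # The date cannot be extracted: keep the file without filtering.
            kept.append(name)
            continue
        first, last = years_of(found.group("PERIOD"))
        if first <= want_last and last >= want_first:
            kept.append(name)
    return kept


files = [
    "tas_197001-197912.nc",
    "tas_198001-198912.nc",
    "somefilewith_tas_Y1980.nc",
    "pr_198001-198912.nc",
]
print(filter_by_period(files, "*tas*${PERIOD}*.nc", "1980"))
```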
