YAML-based tests: design principles and implementation details
This document discusses the new YAML-based format used to represent physical results in the main output file. The connection with the ABINIT test suite and the YAML syntax used to define tolerances, parameters and constraints is also discussed.
The new infrastructure consists of a set of Fortran modules to output structured data in YAML format and Python code to parse the output files and analyze data.
Motivations
In ABINITv8 and previous versions, the ABINIT test suite is based on input files with associated reference output files. Roughly speaking, an automatic test consists of comparing the reference file with the output file in a line-oriented fashion, by computing differences between floating-point numbers without any knowledge of the meaning and importance of the numerical values. This approach is very rigid and rather limited because it obliges developers to use a single (usually large) tolerance to account for possibly large fluctuations in the intermediate results, whereas the tolerance criteria should ideally be applied only to the final results, which are (hopefully) independent of details such as hardware, optimization level and parallelism.
This limitation is clearly seen when comparing the results of iterative algorithms. The number of iterations required to converge may indeed depend on several factors, especially when the input parameters are far from convergence or when different parallelization schemes or stochastic methods are used. From the point of view of code validation, what really matters is the final converged value, plus the time to solution if performance is a concern. Unfortunately, any line-by-line comparison algorithm will fail miserably in such conditions because it insists on having the same number of iterations (lines) in the two calculations to consider the test successful. It is therefore clear that the approach used so far to validate new developments in Abinit cannot cope with the challenges posed by high-performance computing, and that smarter and more flexible approaches are needed to address these limitations. Ideally, we would like to be able to:
- use different tolerances for particular quantities that are selected by keyword
- replace the check on the absolute and relative difference with a threshold check (is this quantity smaller than the given threshold?)
- have some sort of syntax to apply different rules depending on the iteration state, e.g. the dataset index
- execute Python code (callbacks) that operates on the data to perform more advanced tests requiring some sort of post-processing
- provide an easy-to-use declarative interface that allows developers to define the logic used to compare selected quantities.
In what follows, we present this new infrastructure, the design principles and the steps required to define YAML-based tests.
Important
The new YAML-based test suite relies on libraries that are not provided by the Python standard library. To install these dependencies in user mode, use:
pip install numpy pyyaml pandas --user
If these dependencies are not available, the new test system will be disabled and a warning message will be printed to the terminal.
Implementation details
For reasons that will become clear later, implementing smart algorithms requires metadata and context. In other words, the Python code needs to have some basic understanding of the meaning of the numerical values extracted from the output file and must be able to locate a particular property by name or by its “position” inside the output file. For this reason, the most important physical results are now written in the main output file (ab_out) using machine-readable YAML documents.
A YAML document starts with three hyphens (---) followed by an optional tag beginning with an exclamation mark (e.g. !ETOT). Three periods (...) signal the end of the document. Following these rules, one can easily write a dictionary containing the different contributions to the total free energy using:
--- !EnergyTerms
comment : Components of total free energy (in Hartree)
kinetic : 4.323825067238190201E+01
hartree : 3.133994553363548619E+01
xc : -2.246182631222462689E+01
Ewald energy : -1.142203169790575572E+02
psp_core : 6.536925226004570710E+00
local_psp : -9.852521179991468614E+01
spherical_terms : 2.587718945528139081E+00
internal : -1.515045147136467563E+02
'-kT*entropy' : -3.174881766478041684E-03
total_energy : -1.515076895954132397E+02
total_energy_eV : -4.122733899322517573E+03
...
Further details about the meaning of tags, labels and their connection with the testing infrastructure will be given in the sections below. For the time being, it is sufficient to say that we opted for YAML because it is a human-readable data serialization language already used in the log file to record important events such as WARNINGs, ERRORs and COMMENTs (this is indeed the protocol employed by AbiPy to monitor the status of Abinit calculations). Many programming languages, including Python, provide support for YAML, hence it is relatively easy to implement post-processing tools based on well-established Python libraries for scientific computing such as NumPy, SciPy, Matplotlib and Pandas. Last but not least, writing YAML in Fortran does not represent an insurmountable problem, provided one keeps the complexity of the YAML documents at a reasonable level. Our Fortran implementation, indeed, supports only a subset of the YAML specification:
- scalars
- arrays with one or two dimensions
- dictionaries mapping strings to scalars
- tables in CSV format (this is an extension of the standard)
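For illustration, a document restricted to this subset might look like the following (a hypothetical document with made-up tag and field names):

--- !MiniDoc
comment : a scalar string
natom : 2
eigenvalues : [ -3.05, 0.21, 0.47 ]
overlap :
- [ 1.0, 0.0 ]
- [ 0.0, 1.0 ]
info : {"key": 1.0, "other": 2.0}
...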
Important
YAML is not designed to handle large amounts of data, therefore it should not be used to represent large arrays for which performance is critical and human-readability is lost by definition (do you really consider a YAML list with one thousand numbers human-readable?). Following this philosophy, YAML is supposed to be used to print the most important results in the main output file and should not be considered a replacement for binary netcdf files when it comes to storing large data structures with lots of metadata.
Note also that we do not plan to rewrite the main output file entirely in YAML syntax; we prefer to focus on the physical properties that will be used by the new test procedure to validate new developments. This approach, indeed, will facilitate the migration to the new YAML-based approach, as only selected portions of the output file will be ported to the new format, thus keeping the look and feel relatively close to the previous unstructured format.
YAML configuration file
How to activate the YAML mode
The parameters governing the execution of the test are specified in the TEST_INFO section located at the end of the input file. The options are given in the INI file format. The integration of the new YAML-based tests with the pre-existing infrastructure is obtained via two modifications of the current specifications. More specifically:
- the files_to_test section now accepts the optional argument use_yaml. The allowed values are:
    - “yes” → activate YAML mode
    - “no” → do not use YAML mode (default)
    - “only” → use YAML mode and deactivate the legacy fldiff algorithm
- a new optional section [yaml_test] has been added. This section contains two mutually exclusive fields:
    - file → path of the YAML configuration file. The path is relative to the input file. A natural choice is to use the same prefix as the input file, e.g. “./t21.yaml” is the configuration file associated with the input file “t21.in”.
    - test → multi-line string with the YAML specifications. This option may be used for really short configurations that rely heavily on the default values (see the sketch below).
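For instance, a short inline configuration using the test field might be embedded directly in the TEST_INFO section as follows (a sketch with a hypothetical tolerance; real tests usually point to an external file instead):

#%% [yaml_test]
#%% test =
#%%   Etot:
#%%     tol_abs: 1.0e-6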
An example of TEST_INFO section that activates the YAML mode can be found in paral[86]:
#%%<BEGIN TEST_INFO>
#%% [setup]
#%% executable = abinit
#%% [files]
#%% psp_files = 23v.paw, 38sr.paw, 8o.paw
#%% [paral_info]
#%% nprocs_to_test = 4
#%% max_nprocs = 4
#%% [NCPU_4]
#%% files_to_test =
#%% t86_MPI4.out, use_yaml = yes, tolnlines = 4, tolabs = 2.0e-2, tolrel = 1.0, fld_options = -easy;
#%% [extra_info]
#%% authors = B. Amadon, T. Cavignac
#%% keywords = DMFT, FAILS_IFMPI
#%% description = DFT+DMFT for SrVO3 using Hubbard I code with KGB parallelism
#%% topics = DMFT, parallelism
#%% [yaml_test]
#%% file = ./t86.yaml
#%%<END TEST_INFO>
with the associated YAML configuration file given by:
tol_eq: 1.0e-6
tol_vec: 1.0e-5

ks:
    # tolvrs 1.0e-7
    tol_abs: 5.0e-6
    tol_rel: 5.0e-6
    ResultsGS!:
        convergence:
            ceil: 1.0e-6
        cartesian_stress_tensor:
            tol_vec: 1.0e-7
    # EtotSteps:
    #     data:
    #         callback:
    #             method: last_iter
    #             tol_iter: 3
    #         Etot(hartree):
    #             tol: 1.0e-7
    #         deltaE(h):
    #             ceil: 2.0e-8
    #         residm:
    #             ceil: 5.0e-7
    #         nres2:
    #             ceil: 2.0e-8

dmft:
    tol_abs: 2.0e-8
    tol_rel: 5.0e-9
    ResultsGS!: # hard reset result_gs
        convergence:
            ceil: 1.0e-6
            deltae:
                ceil: 2.0e-3
            res2:
                ignore: true
        fermie:
            tol_abs: 1.0e-7
            tol_rel: 1.0e-8
        cartesian_stress_tensor:
            ignore: true
    EnergyTerms:
        total_energy_eV:
            tol_abs: 1.0e-4
            tol_rel: 1.0e-8
    EnergyTermsDC:
        tol_abs: 1.0e-7
        total_energy_dc_eV:
            tol_abs: 1.0e-4
            tol_rel: 1.0e-8
    # EtotSteps:
    #     data:
    #         callback:
    #             method: last_iter
    #             tol_iter: 3
    #         Etot(hartree):
    #             tol: 1.0e-7
    #         deltaE(h):
    #             ceil: 2.0e-4
    #         residm:
    #             ceil: 1.0e-12
    #         nres2:
    #             ceil: 2.0e-1

filters:
    ks:
        dtset: 1
    dmft:
        dtset: 2
Our first example of YAML configuration file
Let us start with a minimalistic example in which we compare the components of the total free energy in the Etot document with an absolute tolerance of 1.0e-7 Ha. The YAML configuration file will look like:
Etot:
tol_abs: 1.0e-7
The tol_abs keyword defines the constraint that will be applied to all the children of the Etot document. In other words, all the entries in the Etot dictionary will be compared with an absolute tolerance of 1.0e-7 and with the default value for the relative difference tol_rel, as this tolerance is not explicitly specified.
There are, however, cases in which we would like to specify different tolerances for particular entries instead of relying on the global tolerances. The Etot document, for example, contains the total energy in eV in the Total energy (eV) entry. To use a different absolute tolerance for this property, we specialize the rule with the syntax:
Etot:
tol_abs: 1.0e-7
Total energy (eV):
tol_abs: 1.0e-5
To change the default value for the relative difference, it is sufficient to specify the constraint outside of the document:
tol_rel: 1.0e-2
Etot:
tol_abs: 1.0e-7
Total energy (eV):
tol_abs: 1.0e-5
Basic concepts
In the previous section, we presented a minimal example of configuration file. In the next paragraphs, we will discuss in more detail how to implement more advanced tests but, before proceeding with the examples, we need to introduce some basic terminology to facilitate the discussion.
- Document tree
- The YAML document is a dictionary that can be treated as a tree whose nodes have a label and whose leaves are scalars or special data structures identified by a tag (note, however, that not all tags mark a leaf). The top-level nodes are the YAML documents themselves and their labels are the names of their tags.
- Config tree
- The YAML configuration also takes the form of a tree whose nodes are specializations and whose leaves represent parameters or constraints. Its structure matches the structure of the document tree, thus one can define rules (constraints and parameters) that will be applied to a specific part of the document tree.
- Specialization
- The rules defined under a specialization apply only to the matching node of the document tree and its children.
- Constraint
- A constraint is a condition one imposes for the test to succeed. Constraints can apply to leaves of the document tree or to nodes, depending on the nature of the constraint.
- Parameter
- A parameter is a value that can be used by the constraints to modify their behavior.
- Iteration state
- An iteration state describes how many iterations of each possible level are present in the run (e.g. idtset = 2, itimimage = not used, image = 5, time = not used). It gives information on the current state of the run. Documents are implicitly associated with their iteration state. This information is made available to the test engine through specialized YAML documents with the IterStart tag.
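For reference, the IterStart document announcing, say, the beginning of the second dataset could look like the following (a sketch; the exact field names are an assumption):

--- !IterStart
dtset: 2
...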
Tip
To get the list of constraints and parameters, run:
~abinit/tests/testtools.py explore
and type show *. You can then type, for example, show tol_eq to learn more about a specific constraint or parameter.
A more complicated example
The Etot document is the simplest possible document: it only contains fields with real values. Now we will have a look at the ResultsGS document, which represents the results stored in the corresponding Fortran datatype used in Abinit. The YAML document is given by:
--- !ResultsGS
comment : Summary of ground states results.
natom : 5
nsppol : 1
cut : {"ecut": 1.20000000000000000E+01, "pawecutdg": 2.00000000000000000E+01, }
convergence: {
"deltae": 2.37409381043107715E-09, "res2": 1.41518780109792898E-08,
"residm": 2.60254842131463755E-07, "diffor": 0.00000000000000000E+00,
}
etotal : -1.51507711707660150E+02
entropy : 0.00000000000000000E+00
fermie : 3.09658145725792422E-01
stress tensor: !Tensor
- [ 3.56483996349480498E-03, 0.00000000000000000E+00, 0.00000000000000000E+00, ]
- [ 0.00000000000000000E+00, 3.56483996349480151E-03, 0.00000000000000000E+00, ]
- [ 0.00000000000000000E+00, 0.00000000000000000E+00, 3.56483996349478416E-03, ]
cartesian forces: !CartForces
- [ -0.00000000000000000E+00, -0.00000000000000000E+00, -0.00000000000000000E+00, ]
- [ -0.00000000000000000E+00, -0.00000000000000000E+00, -0.00000000000000000E+00, ]
- [ -0.00000000000000000E+00, -0.00000000000000000E+00, -0.00000000000000000E+00, ]
- [ -0.00000000000000000E+00, -0.00000000000000000E+00, -0.00000000000000000E+00, ]
- [ -0.00000000000000000E+00, -0.00000000000000000E+00, -0.00000000000000000E+00, ]
...
This YAML document is more complicated as it contains scalar fields, dictionaries and even 2D arrays.
Still, the parser will be able to locate the entire document via its tag/label and to address all the entries by name. To specify the tolerance on the relative difference for all the scalar quantities in ResultsGS, we just add a new entry to the YAML configuration file, similarly to what we did for EnergyTerms:
EnergyTerms:
tol_abs: 1.0e-7
total_energy_eV:
tol_abs: 1.0e-5
tol_rel: 1.0e-10
ResultsGS:
tol_rel: 1.0e-8
At this point, we should stress that there are implicit top-level settings that are applied to all quantities not subject to a more specific rule. In the example above, for instance, ResultsGS is subject to the default absolute tolerance, whereas the default relative tolerance has been overridden.
Unfortunately, such a strict value for tol_rel becomes very problematic when we have to compare the residuals stored in the convergence dictionary! In this case, it makes more sense to check that all the residuals are below a certain threshold. This is what the ceil constraint is for:
ResultsGS:
tol_rel: 1.0e-8
convergence:
ceil: 3.0e-7
Now the test will fail if one of the components of the convergence dictionary is above 3.0e-7. Note that the ceil constraint automatically disables the checks for tol_rel and tol_abs inside convergence. In other words, all the scalar entries in ResultsGS will be compared with our tol_rel and the default tol_abs, whereas the entries in the convergence dictionary will be tested against ceil.
Tip
Within the explore shell, show ceil will list the constraints that are disabled by the use of ceil in the exclude field.
Up to now we have been focusing on scalar quantities, for which the concepts of relative and absolute difference are unambiguously defined, but how do we compare vectors and matrices? Fields with tags like !Tensor or !CartForces are leaves of the tree: the tester routine won't try to compare each individual coefficient with tol_rel. Still, we want to check that the array does not change too much. For that purpose we use the tol_vec constraint, which applies to all arrays derived from BaseArray (most arrays with a tag). BaseArray lets us use the capabilities of NumPy arrays with YAML-defined arrays. tol_vec checks the Euclidean distance between the reference array and the output array. Since we also want to apply this constraint to the cartesian forces, we define the constraint at the top level of ResultsGS:
ResultsGS:
tol_rel: 1.0e-8
tol_vec: 1.0e-5
convergence:
ceil: 3.0e-7
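In other words, the test passes when ||a_out - a_ref||_2 <= tol_vec. A minimal Python sketch of this criterion (an illustration, not the actual implementation) could read:

import numpy as np

def tol_vec_ok(ref, tested, tol_vec):
    # The check passes when the Euclidean distance between the
    # reference array and the tested array is below the tolerance.
    diff = np.asarray(tested, dtype=float) - np.asarray(ref, dtype=float)
    return np.linalg.norm(diff) <= tol_vec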
How to use filters to select documents by iteration state
Thanks to the syntax presented in the previous sections, one can customize tolerances for different documents and different entries. Note however that these rules will be applied to all the documents found in the output file. This means that we are implicitly assuming that all the different steps of the calculation have similar numerical stability. There are however cases in which the results of particular datasets are less numerically stable than the others. An example will help clarify.
The test paral[86] uses two datasets to perform two different computations. The first dataset computes the DFT density with LDA, while the second dataset uses the LDA density to perform a DMFT computation. The entire calculation is supposed to take less than ~3-5 minutes, hence the input parameters are severely underconverged and the numerical noise propagates quickly through the different steps. As a consequence, one cannot expect the DMFT results to have the same numerical stability as the LDA part. Fortunately, one can use filters to specify different convergence criteria for the two datasets.
A filter is a mechanism that allows one to associate a specific configuration with a set of iteration states. Filters are defined in a separate section of the configuration file under the filters node. Let's declare two filters with the syntax:
filters:
ks:
dtset: 1
dmft:
dtset: 2
Here we are simply saying that we want to associate the label ks with all documents created in the first dataset and the label dmft with all documents created in the second dataset. The choice of the names ks and dmft is absolutely arbitrary: pick anything that makes sense for your test. This is the simplest filter declaration possible (see the Filters API section below for more info on filter declarations). Now we can use our filters. First of all, we associate the configuration we already wrote with the ks filter so that we can have a different configuration for the second dataset. The YAML file now reads:
filters:
ks:
dtset: 1
dmft:
dtset: 2
ks:
EnergyTerms:
tol_abs: 1.0e-7
total_energy_eV:
tol_abs: 1.0e-5
tol_rel: 1.0e-10
ResultsGS:
tol_rel: 1.0e-8
tol_vec: 1.0e-5
convergence:
ceil: 3.0e-7
By inserting the configuration options under the ks node, we specify that these rules apply only to the first dataset. We then create a new dmft node and build its configuration following the same procedure as before. We end up with something like this:
filters:
ks:
dtset: 1
dmft:
dtset: 2
ks:
EnergyTerms:
tol_abs: 1.0e-7
total_energy_eV:
tol_abs: 1.0e-5
tol_rel: 1.0e-10
ResultsGS:
tol_rel: 1.0e-8
tol_vec: 1.0e-5
convergence:
ceil: 3.0e-7
dmft:
tol_abs: 2.0e-8
tol_rel: 5.0e-9
ResultsGS:
convergence:
ceil: 1.0e-6
diffor:
ignore: true
fermie:
tol_abs: 1.0e-7
tol_rel: 1.0e-8
stress tensor:
ignore: true
EnergyTerms:
total_energy_eV:
tol_abs: 1.0e-5
tol_rel: 1.0e-8
EnergyTermsDC:
tol_abs: 1.0e-7
total_energy_dc_eV:
tol_abs: 1.0e-5
tol_rel: 1.0e-8
Filters API
Filters provide a practical way to specify different configurations for different iteration states without having to rewrite everything from scratch.
Filter declaration
A filter can specify all currently known iterators: dtset, timimage, image, and time. For each iterator, a set of integers can be defined with three different methods:
- a single integer value, e.g. dtset: 1
- a YAML list of values, e.g. dtset: [1, 2, 5]
- a mapping with the optional members “from” and “to” specifying the boundaries (both included) of the integer interval, e.g. dtset: {from: 1, to: 5}. If “from” is omitted, the default is 1. If “to” is omitted, there is no upper boundary.
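The three forms can be freely combined within a single filter, e.g. (a hypothetical declaration):

filters:
    f1:
        dtset: [1, 2, 5]
        image: {from: 2, to: 4}
        time: 3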
Tip
The order is never relevant when parsing YAML (unless you are writing a list, of course). As a consequence, you can define the filters section wherever you want in the file.
Filter overlapping
Several filters can apply to the same document, i.e. filters may overlap. Note, however, that overlapping filters must have a trivial order of specificity: one filter must be a subset of the other one. The example below is fine because f2 is included in f1, i.e. f2 is more specific:
# this is fine
filters:
f1:
dtset:
from: 2
to: 7
image:
from: 4
f2:
dtset: 7
image:
- 4
- 5
- 6
whereas this second example will raise an error because f4 is not included in f3.
# this will raise an error
filters:
f3:
dtset:
from: 2
to: 7
image:
from: 4
f4:
dtset: 7
image:
from: 1
to: 5
When a test is defined, the default tree is overridden by the user-defined tree. When a filtered tree is used, it overrides the less specific tree. Trees are applied sequentially, from the most general to the most specific one. This overriding process is used often, so it is important to know how it works. By default, only what is explicitly specified in the file is overridden, which means that if a constraint is defined at a deeper level in the default tree than in the new tree, the original constraint is kept. For example, let f1 and f2 be two filters such that f2 is included in f1:
filters:
f1:
dtset: 1
f2:
dtset: 1
image: 5
f1:
ResultsGS:
tol_abs: 1.0e-6
convergence:
ceil: 1.0e-6
diffor:
1.0e-4
f2:
ResultsGS:
tol_rel: 1.0e-7
convergence:
ceil: 1.0e-7
When the tester reaches the fifth image of the first dataset, the config tree used will be the following:
ResultsGS:
    tol_abs: 1.0e-6 # this comes from the application of f1
    tol_rel: 1.0e-7 # this has been appended without modifying anything else when applying f2
    convergence:
        ceil: 1.0e-7 # this one has been overridden
        diffor:
            1.0e-4 # this one has been kept
If this is not the behavior you need, you can use the “hard reset marker”: append ! to the name of the specialization you want to override to replace it completely. Let the f2 tree be:
f2:
ResultsGS:
convergence!:
ceil: 1.0e-7
and now the resulting tree for the fifth image of the first dataset is:
ResultsGS:
    tol_abs: 1.0e-6
    convergence: # the whole convergence node has been overridden
        ceil: 1.0e-7
Tip
Here again, the explore shell can be of great help to see what is inherited from the other trees and what is overridden.
How to use equation and callback
equation and callback are special constraints because their actual effects are defined directly in the configuration file. They have been introduced to increase the flexibility of the configuration file without having to change the Python code.
equation takes a string as input. This string will be interpreted as a Python expression that must evaluate to a number. The absolute value of this number will be compared with the value of the tol_eq parameter: if it is smaller than tol_eq, the test succeeds. The expression can also evaluate to a NumPy array; in this case, the Euclidean norm of the array is compared with the tol_eq value.
A minimal example:
EnergyTerms:
tol_eq: 1.0e-6
equation: 'this["Etotal"] - this["Total energy(eV)"]/27.2114'
equations works exactly the same way but takes a list of strings as value. Each string is a different expression that will be tested independently of the others. In both cases, the tested document can be referred to as this and the reference document as ref.
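For instance, a hypothetical configuration with two independent expressions could read:

EnergyTerms:
    tol_eq: 1.0e-6
    equations:
        - 'this["kinetic"] - ref["kinetic"]'
        - 'this["hartree"] - ref["hartree"]'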
callback requires a bit of Python coding since it will invoke a method of the underlying structure. Suppose we have a tag !AtomSpeeds associated with a document and a corresponding class AtomSpeeds. The AtomSpeeds class has a method not_going_anywhere that checks that the atoms are not about to leave the box. We would like to pass some kind of tolerance d_min, the minimal distance at which atoms may approach the border of the box. The signature of the method has to be not_going_anywhere(self, tested, d_min=DEFAULT_VALUE) and it should return True, False or an instance of FailDetail (see Add a new constraint for explanations about those). Note that self will be the reference instance. We can then use it with the following configuration:
AtomSpeeds:
callback:
method: not_going_anywhere
d_min: 1.0e-2
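For completeness, here is a minimal sketch of what such a class might look like on the Python side (the geometry is purely illustrative; only the method signature is prescribed by the framework):

import numpy as np

class AtomSpeeds:
    # Hypothetical class registered for the !AtomSpeeds tag.

    def __init__(self, positions, box_size):
        self.positions = np.asarray(positions)  # (natom, 3) Cartesian positions
        self.box_size = box_size

    def not_going_anywhere(self, tested, d_min=1.0e-2):
        # self is the reference instance, tested is the document under test.
        # The check succeeds if every atom of the tested document stays
        # at least d_min away from the borders of the box.
        pos = np.asarray(tested.positions)
        return bool(np.all((pos >= d_min) & (pos <= self.box_size - d_min)))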
Command line interface
The ~abinit/tests/testtools.py script provides a command line interface that facilitates the creation of new tests and the exploration of the YAML configuration file. The syntax is:
./testtools.py COMMAND [options]
Run the script without arguments to get the list of possible commands and use:
./testtools.py COMMAND --help
to display the options supported by COMMAND.
The list of available commands is:
- fldiff
  Interface to the fldiff.py module. This command can be used to compare output and reference files without executing ABINIT. It is also possible to specify the YAML configuration file with the --yaml-conf option so that one can employ the same parameters as those used by runtests.py.
- explore
  This command allows the user to explore and validate a YAML configuration file. It provides a shell-like interface in which the user can explore the tree defined by the configuration file and print the constraints. It also provides documentation about constraints and parameters via the show command.