Tutorial on basic parallelism¶
Parallelism in ABINIT, generalities and environments.¶
There are many situations where a sequential code is not enough, often because it would take too much time to get a result. There are also cases where you just want things to go as fast as your computational resources allow. By using more than one processor, you might also gain access to more memory than with a single processor. To this end, ABINIT can be run in parallel on dozens, hundreds or even thousands of processors.
This tutorial offers you a quick guided tour inside the complex world that emerges as soon as you want to use more than one processor. From now on, we will suppose that you are already familiar with ABINIT and that you have gone through all four basic tutorials. If this is not the case, we strongly advise you to do so, in order to truly benefit from this tutorial.
We also strongly recommend that you acquaint yourself with some basic concepts of parallel computing, in particular Amdahl's law. This law rationalizes the fact that, beyond some number of processors, the inherently sequential parts of a code dominate the parallel parts and limit the maximal speedup that can be achieved.
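To make Amdahl's law concrete, here is a minimal sketch that evaluates the usual formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the run time that can be parallelized and n is the number of processors. The function name `amdahl_speedup` and the value p = 0.95 are illustrative assumptions, not measurements of ABINIT.

```sh
#!/bin/sh
# Amdahl's law: speedup on n processors when a fraction p of the
# run time can be parallelized. p = 0.95 below is an assumed,
# illustrative value, not an ABINIT measurement.
amdahl_speedup() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / n) }'
}

# Print the predicted speedup for a few processor counts.
for n in 2 8 64 1024; do
    printf '%5d processors -> speedup %s\n' "$n" "$(amdahl_speedup 0.95 "$n")"
done
```

With p = 0.95 the speedup can never exceed 1 / (1 - p) = 20, however many processors are used, which is the saturation effect described above.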
This tutorial describes only the basic possibilities of parallel computing with ABINIT. After completing it, you will likely benefit from reading the other tutorials related to parallelism in ABINIT. They are introduced later in this tutorial.
Generalities¶
With the broad availability of multi-core processors, everybody now has a parallel machine at hand. ABINIT can take advantage of several cores for most of its capabilities, be it ground-state calculations, molecular dynamics, linear response, or many-body perturbation theory.
Such tightly integrated multi-core processors (so-called SMP machines, for Symmetric Multi-Processing) can be interlinked within networks, based on Ethernet or other types of connections. The number of cores in such composite machines can easily exceed one hundred, and goes up to several millions these days. Most ABINIT capabilities can efficiently use several hundred computing cores. In some cases, even more than ten thousand computing cores can be used efficiently.
Before actually starting this tutorial and the associated ones, we strongly advise you to get familiar with your own parallel environment. It might be relatively simple for an SMP machine, but more difficult for very powerful machines. You will need at least to have MPI (see next section) installed on your machine. Take some time to determine how you can launch a job in parallel with MPI, which resources are available, and what the limitations are. Perhaps you will have to use a batch system (typically the qsub or sbatch command and an associated shell script). Do not hesitate to discuss with your system administrator if you feel that something is not clear to you.
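As an illustration of what such a batch script can look like, the sketch below writes a minimal job script for the SLURM scheduler (the one behind the sbatch command). Every value in it is a placeholder: the job name, the resources, the time limit and the input file name `input.abi` are assumptions, and your site may additionally require a partition, an account, module loading, or a different launcher. Treat it as a starting point to discuss with your system administrator, not as a recipe.

```sh
#!/bin/sh
# Write a minimal SLURM job script to abinit_job.sh.
# All values are placeholders to adapt to your own cluster.
cat > abinit_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=abinit_test
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=00:10:00

# input.abi is a hypothetical input file name.
# SLURM_NTASKS is set by SLURM to the number of tasks requested above.
mpirun -n "$SLURM_NTASKS" abinit input.abi > log 2> err
EOF

echo "Job script written to abinit_job.sh; submit it with: sbatch abinit_job.sh"
```

Keeping the resource request in the script header and reusing it on the launch line avoids asking the scheduler for one number of tasks and starting another.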
We will suppose in the following that you know how to run a parallel program and that you are familiar with the peculiarities of your system. Please remember that, as there is no standard way of setting up a parallel environment, we are not able to provide you with support beyond ABINIT itself.
Characteristics of parallel environments¶
Different software solutions can be used to benefit from parallelism. Most of ABINIT's parallelism is based on MPI, but significant additional speedup (or a better distribution of data, allowing bigger calculations to be run) can be obtained with OpenMP and multi-threaded libraries. At the time of writing, efforts also focus on Graphics Processing Units (GPUs), with CUDA and MAGMA. GPU usage is not described in the present tutorial.
MPI¶
MPI stands for Message Passing Interface. The goal of MPI, simply stated, is to develop a widely used standard for writing message-passing programs. As such the interface attempts to establish a practical, portable, efficient, and flexible standard for message passing.
The main advantages of establishing a message-passing standard are portability and ease of use. In a distributed memory communication environment in which the higher-level routines and/or abstractions are built upon lower-level message-passing routines, the benefits of standardization are particularly obvious. Furthermore, the definition of a message-passing standard provides vendors with a clearly defined base set of routines that they can implement efficiently, or in some cases provide hardware support for, thereby enhancing scalability (see http://mpi-forum.org).
At some point in its history, MPI reached a critical popularity level, and many projects sprang up. The tendency now is toward gathering and merging. For instance, Open MPI is a project combining technologies and resources from several other projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI) in order to build the best MPI library available. Open MPI is a completely new MPI-3.1-compliant implementation, offering advantages for system and software vendors, application developers and computer science researchers (see https://www.open-mpi.org).
OpenMP¶
The OpenMP Application Program Interface (API) supports multi-platform shared-memory parallel programming in C/C++ and Fortran on all architectures, including Unix platforms and Windows NT platforms. Jointly defined by a group of major computer hardware and software vendors, OpenMP is a portable, scalable model that gives shared-memory parallel programmers a simple and flexible interface for developing parallel applications for platforms ranging from the desktop to the supercomputer (http://www.openmp.org).
In ABINIT versions before 8.8, OpenMP was rarely used, and only for specific purposes. More recent versions benefit from the speedup provided by multi-threaded libraries such as MKL and FFTW3. Multithreading is still not mandatory, but on recent architectures it shows better performance than MPI, provided that a multi-threaded version of the linear algebra library is available.
ScaLAPACK¶
ScaLAPACK is the parallel version of the popular LAPACK library (for linear algebra). It can play some role in the parallelism of several parts of ABINIT, especially the LOBPCG algorithm in ground-state calculations, and the parallelism for the Bethe-Salpeter equation. Since ScaLAPACK is itself based on MPI, we will not discuss its use in ABINIT in this tutorial.
Warning
ScaLAPACK is not thread-safe in many versions. Combining OpenMP and ScaLAPACK can result in unpredictable behaviour.
Fast/slow communications¶
Characterizing the data-transfer efficiency between two computing cores (or the whole set of cores) is a complex task. At a quite basic level, one has to recognize that not only the quantity of data that can be transferred per unit of time matters, but also the time needed to initialize such a transfer (the so-called latency).
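A common first-order way to reason about these two quantities is to model the time of one transfer as latency plus message size divided by bandwidth. The sketch below evaluates that model. The function name `transfer_time_us` and the network figures (1 microsecond and 10000 MB/s for a fast interconnect, 50 microseconds and 100 MB/s for a slow one) are assumptions chosen only for illustration, not measurements of any real machine.

```sh
#!/bin/sh
# Simple transfer-time model: time = latency + size / bandwidth.
# Units: latency in microseconds, bandwidth in MB/s, size in bytes.
# Since 1 MB/s is 1 byte per microsecond, size / bandwidth is in microseconds.
transfer_time_us() {
    awk -v lat="$1" -v bw="$2" -v size="$3" \
        'BEGIN { printf "%.2f\n", lat + size / bw }'
}

# Hypothetical networks: "fast" (1 us, 10000 MB/s), "slow" (50 us, 100 MB/s).
for size in 8 8000000; do
    echo "message of $size bytes:"
    echo "  fast network: $(transfer_time_us 1 10000 "$size") us"
    echo "  slow network: $(transfer_time_us 50 100 "$size") us"
done
```

In this model, small messages are dominated by latency and large messages by bandwidth, which is why both numbers matter when judging a network.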
Broadly speaking, computers can be categorized according to the speed of their communications. On fast-communication machines, the latency is very low and the transfer time, once initialized, is very low too. For the parallelised parts of ABINIT, SMP machines and machines with a fast interconnect are usually limited not by their network characteristics, but by the existence of residual sequential parts. The tutorials that have been developed for ABINIT are based on fast-communication machines.
If the set of computing cores that you plan to use is not entirely linked by a fast network, but includes some connections based e.g. on Ethernet, you might not be able to benefit from the speed-up announced in the tutorials. You have to perform some tests on your actual machine to get to know it, and perhaps consider using multithreading.
What parts of ABINIT are parallel?¶
Parallelizing a code is a very delicate and complicated task, so do not expect that things will systematically go faster just because you are using more processors. Please also keep in mind that in some situations, parallelization is simply impossible. At present, the parts of ABINIT that have been parallelized, and for which a tutorial is available, include:
- parallelism over bands and plane waves,
- ground state with wavelets,
- molecular dynamics,
- parallelism on “images”,
- density-functional perturbation theory (DFPT),
- Many-Body Perturbation Theory.
Note that the tutorial on parallelism over bands and plane waves presents a complete overview of the parallelism for the ground state, including up to four levels of parallelisation and, as such, is rather complex. Of course, it is also quite powerful, and allows the use of several hundred processors.
By contrast, the two levels based on
- the treatment of k-points in reciprocal space;
- the treatment of spins, for spin-polarized collinear situations (nsppol = 2);
are quite easy to use. Examples of such parallelism will be given in the next sections.
A simple example of parallelism in ABINIT¶
Note
Supposing you made your own installation of ABINIT, the input files to run the examples are in the ~abinit/tests/ directory where ~abinit is the absolute path of the abinit top-level directory. If you have NOT made your own install, ask your system administrator where to find the package, especially the executable and test files.
In case you work on your own PC or workstation, to make things easier, we suggest you define some handy environment variables by executing the following lines in the terminal:
export ABI_HOME=Replace_with_absolute_path_to_abinit_top_level_dir # Change this line
export PATH=$ABI_HOME/src/98_main/:$PATH # Do not change this line: path to executable
export ABI_TESTS=$ABI_HOME/tests/ # Do not change this line: path to tests dir
export ABI_PSPDIR=$ABI_TESTS/Pspdir/ # Do not change this line: path to pseudos dir
Examples in this tutorial use these shell variables: copy and paste the code snippets into the terminal (remember to set ABI_HOME first!) or, alternatively, source the set_abienv.sh script located in the ~abinit directory:
source ~abinit/set_abienv.sh
The ‘export PATH’ line adds the directory containing the executables to your PATH so that you can invoke the code by simply typing abinit in the terminal instead of providing the absolute path.
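Before going further, it can be worth checking that the variables above are really set in your current shell. The sketch below is one possible way to do so. The helper name `missing_abi_vars` is hypothetical, and the script only reports what it finds without changing anything.

```sh
#!/bin/sh
# Print the names of the tutorial variables that are unset or empty.
# missing_abi_vars is a hypothetical helper name used for illustration.
missing_abi_vars() {
    missing=""
    for name in ABI_HOME ABI_TESTS ABI_PSPDIR; do
        # Look up the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="${missing:+$missing }$name"
        fi
    done
    printf '%s\n' "$missing"
}

unset_vars=$(missing_abi_vars)
if [ -n "$unset_vars" ]; then
    echo "Not set: $unset_vars"
else
    echo "ABI_HOME, ABI_TESTS and ABI_PSPDIR are set."
fi

# Check that the abinit executable can be found through PATH.
if command -v abinit >/dev/null 2>&1; then
    echo "abinit found at: $(command -v abinit)"
else
    echo "abinit not found in PATH"
fi
```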
To execute the tutorials, create a working directory (Work*) and copy the input files of the lesson there.
Most of the tutorials do not rely on parallelism (except specific tutorials on parallelism). However you can run most of the tutorial examples in parallel with MPI, see the topic on parallelism.
Running a job¶
Before starting, you might consider working in a different subdirectory than for the other tutorials. Why not Work_paral?
First, copy the input file tbasepar_1.abi from the $ABI_TESTS/tutorial/Input directory to your work directory:
cd $ABI_TESTS/tutorial/Input
mkdir Work_paral
cd Work_paral
cp ../tbasepar_1.abi .
#
# Lead crystal
#

#Definition of the unit cell
acell 10.0 10.0 10.0
rprim 0.0 0.5 0.5
      0.5 0.0 0.5
      0.5 0.5 0.0

#Definition of the atom types and pseudopotentials
ntypat 1
znucl 82
pp_dirpath "$ABI_PSPDIR"
pseudos "Psdj_nc_sr_04_pw_std_psp8/Pb.psp8"

#Definition of the atoms and atoms positions
natom 1
typat 1
xred 0.000 0.000 0.000

#Numerical parameters of the calculation : planewave basis set and k point grid
ecut 24.0
ngkpt 12 12 12
nshiftk 4
shiftk 0.5 0.5 0.5
       0.5 0.0 0.0
       0.0 0.5 0.0
       0.0 0.0 0.5
occopt 7
tsmear 0.01
nband 7

#Parameters for the SCF procedure
nstep 10
tolvrs 1.0d-10

##############################################################
# This section is used only for regression testing of ABINIT #
##############################################################
#%%<BEGIN TEST_INFO>
#%% [setup]
#%% executable = abinit
#%% [files]
#%% files_to_test = tbasepar_1.abo, tolnlines=0, tolabs=0.0, tolrel=0.0
#%% [paral_info]
#%% max_nprocs = 4
#%% [extra_info]
#%% authors = Unknown
#%% keywords = NC
#%% description = Lead crystal. Parallelism over k-points
#%%<END TEST_INFO>
You can immediately start a sequential run with
abinit tbasepar_1.abi > log 2> err &
to have a reference CPU time. On an Intel Xeon 20C 2.1 GHz, it runs in about 40 seconds.
The input file (*.abi) may need to be modified for parallel execution, because unnecessary network communications should be avoided. If every node has its own temporary or scratch directory (which is not the case on a single multi-core machine), you can achieve this by pointing the temporary files to a local disk with the tmpdata_prefix variable. Supposing each processor has access to a local temporary disk space named /scratch/user, you might add the following line to the input *.abi file:
tmpdata_prefix="/scratch/user/tbasepar_1"
Note that determining ahead of time the precise resources you will need for your run will save you a lot of time if you are using a batch queue system.
Also note that, for parallel runs, only the main log file is written. You can change this behaviour by creating a file named _LOG, which enforces the creation of all log files:
touch _LOG
Conversely, you can create a _NOLOG file if you want to avoid all log files.
Parallelism over the k-points¶
The most favorable case for a parallel run is to treat the k-points concurrently, since most calculations can be done independently for each one of them.
Actually, tbasepar_1.abi corresponds to the investigation of an FCC crystal of lead, which requires a large number of k-points if one wants to get an accurate description of the ground state. Examine this file. Note that the cut-off is realistic, as well as the grid of k-points (giving 182 k-points in the irreducible Brillouin zone).
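To get a feeling for how 182 k-points can be shared among processes, the sketch below uses a deliberately simple model: it assumes the k-points are spread as evenly as possible, so that the busiest process handles ceil(nkpt / nproc) of them, and it reports the resulting load-balance efficiency nkpt / (nproc x ceil(nkpt / nproc)). This even-distribution model and the function name `kpt_balance` are assumptions for illustration. The sketch does not describe ABINIT's internal distribution algorithm.

```sh
#!/bin/sh
# Assumed model: k-points are spread as evenly as possible, so the
# busiest process handles ceil(nkpt / nproc) k-points.
# Output: "<k-points on busiest process> <load-balance efficiency>".
kpt_balance() {
    awk -v nkpt="$1" -v nproc="$2" 'BEGIN {
        per_proc = int((nkpt + nproc - 1) / nproc)   # ceiling division
        printf "%d %.2f\n", per_proc, nkpt / (nproc * per_proc)
    }'
}

# 182 is the number of k-points of tbasepar_1.abi quoted above.
for nproc in 2 4 7 13 16 128; do
    echo "nproc=$nproc -> $(kpt_balance 182 "$nproc")"
done
```

In this model, processor counts that divide 182 exactly (such as 2, 7 or 13) leave no process idle, while other counts make some processes wait for the busiest one.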
By now, the sequential run launched earlier has likely finished and produced its output files.
Examine the timing in the output file (the last line gives the Overall time, cpu and wall), and keep note of it.
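The reason for keeping this timing is that it serves as the reference for judging any parallel run. As a minimal sketch, the function below computes the speedup (sequential wall time divided by parallel wall time) and the parallel efficiency (speedup divided by the number of processes). The function name `speedup` is hypothetical, and the 21-second parallel timing in the example is an invented placeholder. Only the 40-second sequential figure comes from the text above.

```sh
#!/bin/sh
# speedup <sequential wall time> <parallel wall time> <number of processes>
# Prints the speedup and the parallel efficiency (speedup / nproc).
speedup() {
    awk -v t_seq="$1" -v t_par="$2" -v nproc="$3" 'BEGIN {
        s = t_seq / t_par
        printf "speedup=%.2f efficiency=%.2f\n", s, s / nproc
    }'
}

# 40 s is the sequential timing quoted in this tutorial;
# 21 s on 2 processes is a made-up value, replace it with your own.
speedup 40 21 2
```

An efficiency close to 1 means the extra processes are well used, while a value that drops as processes are added signals sequential parts or communication overhead.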
We assume that you compiled ABINIT with with_mpi="yes" set at the configuration step.
On a multi-core PC, you can probably use two compute cores by issuing the run command of your MPI implementation, specifying the number of processors you want to use as well as the abinit command:
mpirun -n 2 abinit tbasepar_1.abi >& tbasepar_1.log &
Depending on your particular machine, mpirun might have to be replaced by mpiexec, and -n by some other option.
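If you are unsure which launcher your installation provides, a small check such as the following sketch can help. It only looks for the two launcher names mentioned above in your PATH. The function name `pick_mpi_launcher` is hypothetical, and the right option for the number of processes still has to be taken from your MPI documentation.

```sh
#!/bin/sh
# Print the first MPI launcher found in PATH, or "none".
# Only the two names mentioned in this tutorial are tried.
pick_mpi_launcher() {
    for launcher in mpirun mpiexec; do
        if command -v "$launcher" >/dev/null 2>&1; then
            printf '%s\n' "$launcher"
            return 0
        fi
    done
    printf 'none\n'
}

found=$(pick_mpi_launcher)
if [ "$found" = "none" ]; then
    echo "No MPI launcher found in PATH; check your MPI installation."
else
    echo "MPI launcher available: $found"
fi
```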
By contrast, on a cluster with the MPICH implementation of MPI, you have to set up a file with the addresses of the different CPUs. Let’s suppose you call it cluster. For a dual-processor PC, this file could have only one line, like the following:
sleepy.pcpm.ucl.ac.be:2
For a cluster of four machines, you might have something like:
tux0
tux1
tux2
tux3
Then, issue the run command of your MPI implementation, specifying the number of processors you want to use, the abinit command, and the file containing the CPU addresses.
On a dual-processor PC, this gives the following:
mpirun -np 2 -machinefile cluster abinit tbasepar_1.abi >& tbasepar_1.log &
Now, examine the corresponding output file. If you have kept the output from the sequential job, you can make a diff between the two files.
.Version 10.1.4.5 of ABINIT, released Sep 2024.
.(MPI version, prepared for a x86_64_linux_gnu13.2 computer)

.Copyright (C) 1998-2024 ABINIT group .
 ABINIT comes with ABSOLUTELY NO WARRANTY.
 It is free software, and you are welcome to redistribute it
 under certain conditions (GNU General Public License,
 see ~abinit/COPYING or http://www.gnu.org/copyleft/gpl.txt).

 ABINIT is a project of the Universite Catholique de Louvain,
 Corning Inc. and other collaborators, see ~abinit/doc/developers/contributors.txt .
 Please read https://docs.abinit.org/theory/acknowledgments for suggested
 acknowledgments of the ABINIT effort.
 For more information, see https://www.abinit.org .

.Starting date : Fri 13 Sep 2024.
- ( at 19h06 )

- input file -> /home/buildbot/ABINIT3/eos_gnu_13.2_mpich/trunk_merge-10.0/tests/TestBot_MPI1/tutorial_tbasepar_1/tbasepar_1.abi
- output file -> tbasepar_1.abo
- root for input files -> tbasepar_1i
- root for output files -> tbasepar_1o

 Symmetries : space group Fm -3 m (#225); Bravais cF (face-center cubic)
================================================================================
 Values of the parameters that define the memory need of the present run
 intxc = 0 ionmov = 0 iscf = 7 lmnmax = 6
 lnmax = 6 mgfft = 32 mpssoang = 3 mqgrid = 3001
 natom = 1 nloc_mem = 1 nspden = 1 nspinor = 1
 nsppol = 1 nsym = 48 n1xccc = 2501 ntypat = 1
 occopt = 7 xclevel = 1
- mband = 7 mffmem = 1 mkmem = 182
 mpw = 1418 nfft = 32768 nkpt = 182
================================================================================
P This job should need less than 42.468 Mbytes of memory.
 Rough estimation (10% accuracy) of disk space for files :
_ WF disk file : 27.567 Mbytes ; DEN or POT disk file : 0.252 Mbytes.
================================================================================ -------------------------------------------------------------------------------- ------------- Echo of variables that govern the present computation ------------ -------------------------------------------------------------------------------- - - outvars: echo of selected default values - iomode0 = 0 , fftalg0 =512 , wfoptalg0 = 0 - - outvars: echo of global parameters not present in the input file - max_nthreads = 0 - -outvars: echo values of preprocessed input variables -------- acell 1.0000000000E+01 1.0000000000E+01 1.0000000000E+01 Bohr amu 2.07200000E+02 ecut 2.40000000E+01 Hartree - fftalg 512 ixc -1012 kpt -4.16666667E-02 -8.33333333E-02 0.00000000E+00 -4.16666667E-02 -1.66666667E-01 0.00000000E+00 -8.33333333E-02 -1.25000000E-01 0.00000000E+00 -4.16666667E-02 -1.25000000E-01 4.16666667E-02 -4.16666667E-02 -2.50000000E-01 0.00000000E+00 -8.33333333E-02 -2.08333333E-01 0.00000000E+00 -4.16666667E-02 -2.08333333E-01 4.16666667E-02 -1.25000000E-01 -1.66666667E-01 0.00000000E+00 -8.33333333E-02 -1.66666667E-01 4.16666667E-02 -4.16666667E-02 -1.66666667E-01 8.33333333E-02 -4.16666667E-02 -3.33333333E-01 0.00000000E+00 -8.33333333E-02 -2.91666667E-01 0.00000000E+00 -4.16666667E-02 -2.91666667E-01 4.16666667E-02 -1.25000000E-01 -2.50000000E-01 0.00000000E+00 -8.33333333E-02 -2.50000000E-01 4.16666667E-02 -4.16666667E-02 -2.50000000E-01 8.33333333E-02 -1.66666667E-01 -2.08333333E-01 0.00000000E+00 -1.25000000E-01 -2.08333333E-01 4.16666667E-02 -8.33333333E-02 -2.08333333E-01 8.33333333E-02 -4.16666667E-02 -2.08333333E-01 1.25000000E-01 -4.16666667E-02 -4.16666667E-01 0.00000000E+00 -8.33333333E-02 -3.75000000E-01 0.00000000E+00 -4.16666667E-02 -3.75000000E-01 4.16666667E-02 -1.25000000E-01 -3.33333333E-01 0.00000000E+00 -8.33333333E-02 -3.33333333E-01 4.16666667E-02 -4.16666667E-02 -3.33333333E-01 8.33333333E-02 -1.66666667E-01 -2.91666667E-01 0.00000000E+00 -1.25000000E-01 
-2.91666667E-01 4.16666667E-02 -8.33333333E-02 -2.91666667E-01 8.33333333E-02 -4.16666667E-02 -2.91666667E-01 1.25000000E-01 -2.08333333E-01 -2.50000000E-01 0.00000000E+00 -1.66666667E-01 -2.50000000E-01 4.16666667E-02 -1.25000000E-01 -2.50000000E-01 8.33333333E-02 -8.33333333E-02 -2.50000000E-01 1.25000000E-01 -4.16666667E-02 -2.50000000E-01 1.66666667E-01 -4.16666667E-02 5.00000000E-01 0.00000000E+00 -8.33333333E-02 -4.58333333E-01 0.00000000E+00 -4.16666667E-02 -4.58333333E-01 4.16666667E-02 -1.25000000E-01 -4.16666667E-01 0.00000000E+00 -8.33333333E-02 -4.16666667E-01 4.16666667E-02 -4.16666667E-02 -4.16666667E-01 8.33333333E-02 -1.66666667E-01 -3.75000000E-01 0.00000000E+00 -1.25000000E-01 -3.75000000E-01 4.16666667E-02 -8.33333333E-02 -3.75000000E-01 8.33333333E-02 -4.16666667E-02 -3.75000000E-01 1.25000000E-01 -2.08333333E-01 -3.33333333E-01 0.00000000E+00 -1.66666667E-01 -3.33333333E-01 4.16666667E-02 -1.25000000E-01 -3.33333333E-01 8.33333333E-02 -8.33333333E-02 -3.33333333E-01 1.25000000E-01 -4.16666667E-02 -3.33333333E-01 1.66666667E-01 outvar_i_n : Printing only first 50 k-points. 
kptrlatt 12 -12 12 -12 12 12 -12 -12 12 kptrlen 1.20000000E+02 P mkmem 182 natom 1 nband 7 ngfft 32 32 32 nkpt 182 nstep 10 nsym 48 ntypat 1 occ 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 
2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 prtocc : prtvol=0, do not print more k-points. 
occopt 7 rprim 0.0000000000E+00 5.0000000000E-01 5.0000000000E-01 5.0000000000E-01 0.0000000000E+00 5.0000000000E-01 5.0000000000E-01 5.0000000000E-01 0.0000000000E+00 shiftk 5.00000000E-01 5.00000000E-01 5.00000000E-01 spgroup 225 symrel 1 0 0 0 1 0 0 0 1 -1 0 0 0 -1 0 0 0 -1 0 -1 1 0 -1 0 1 -1 0 0 1 -1 0 1 0 -1 1 0 -1 0 0 -1 0 1 -1 1 0 1 0 0 1 0 -1 1 -1 0 0 1 -1 1 0 -1 0 0 -1 0 -1 1 -1 0 1 0 0 1 -1 0 0 -1 1 0 -1 0 1 1 0 0 1 -1 0 1 0 -1 0 -1 1 1 -1 0 0 -1 0 0 1 -1 -1 1 0 0 1 0 1 0 0 0 0 1 0 1 0 -1 0 0 0 0 -1 0 -1 0 0 1 -1 0 0 -1 1 0 -1 0 -1 1 0 0 1 -1 0 1 -1 0 1 -1 1 0 -1 0 0 1 0 -1 1 -1 0 1 0 0 0 -1 0 1 -1 0 0 -1 1 0 1 0 -1 1 0 0 1 -1 1 0 -1 0 0 -1 0 1 -1 -1 0 1 0 0 1 0 -1 1 0 1 0 0 0 1 1 0 0 0 -1 0 0 0 -1 -1 0 0 1 0 -1 0 1 -1 0 0 -1 -1 0 1 0 -1 1 0 0 1 0 -1 0 0 -1 1 1 -1 0 0 1 0 0 1 -1 -1 1 0 -1 0 1 -1 0 0 -1 1 0 1 0 -1 1 0 0 1 -1 0 0 1 0 1 0 0 0 0 1 0 -1 0 -1 0 0 0 0 -1 0 0 -1 0 1 -1 1 0 -1 0 0 1 0 -1 1 -1 0 1 1 -1 0 0 -1 1 0 -1 0 -1 1 0 0 1 -1 0 1 0 0 0 1 1 0 0 0 1 0 0 0 -1 -1 0 0 0 -1 0 -1 1 0 -1 0 0 -1 0 1 1 -1 0 1 0 0 1 0 -1 0 0 1 0 1 0 1 0 0 0 0 -1 0 -1 0 -1 0 0 1 -1 0 0 -1 0 0 -1 1 -1 1 0 0 1 0 0 1 -1 0 0 -1 1 0 -1 0 1 -1 0 0 1 -1 0 1 0 -1 1 -1 1 0 -1 0 1 -1 0 0 1 -1 0 1 0 -1 1 0 0 tolvrs 1.00000000E-10 typat 1 wtk 0.00347 0.00347 0.00347 0.00694 0.00347 0.00347 0.00694 0.00347 0.00694 0.00694 0.00347 0.00347 0.00694 0.00347 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00347 0.00347 0.00694 0.00347 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00694 0.00347 0.00347 0.00694 0.00347 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00694 outvars : Printing only first 50 k-points. znucl 82.00000 ================================================================================ chkinp: Checking input parameters for consistency. 
================================================================================
== DATASET 1 ==================================================================
- mpi_nproc: 1, omp_nthreads: -1 (-1 if OMP is not activated)

--- !DatasetInfo
iteration_state: {dtset: 1, }
dimensions: {natom: 1, nkpt: 182, mband: 7, nsppol: 1, nspinor: 1, nspden: 1, mpw: 1418, }
cutoff_energies: {ecut: 24.0, pawecutdg: -1.0, }
electrons: {nelect: 1.40000000E+01, charge: 0.00000000E+00, occopt: 7.00000000E+00, tsmear: 1.00000000E-02, }
meta: {optdriver: 0, ionmov: 0, optcell: 0, iscf: 7, paral_kgb: 0, }
...

 Real(R)+Recip(G) space primitive vectors, cartesian coordinates (Bohr,Bohr^-1):
 R(1)= 0.0000000 5.0000000 5.0000000 G(1)= -0.1000000 0.1000000 0.1000000
 R(2)= 5.0000000 0.0000000 5.0000000 G(2)= 0.1000000 -0.1000000 0.1000000
 R(3)= 5.0000000 5.0000000 0.0000000 G(3)= 0.1000000 0.1000000 -0.1000000
 Unit cell volume ucvol= 2.5000000E+02 bohr^3
 Angles (23,13,12)= 6.00000000E+01 6.00000000E+01 6.00000000E+01 degrees

 getcut: wavevector= 0.0000 0.0000 0.0000 ngfft= 32 32 32
 ecut(hartree)= 24.000 => boxcut(ratio)= 2.05208

--- Pseudopotential description ------------------------------------------------
- pspini: atom type 1 psp file is /home/buildbot/ABINIT3/eos_gnu_13.2_mpich/trunk_merge-10.0/tests/Pspdir/Psdj_nc_sr_04_pw_std_psp8/Pb.psp8
- pspatm: opening atomic psp file /home/buildbot/ABINIT3/eos_gnu_13.2_mpich/trunk_merge-10.0/tests/Pspdir/Psdj_nc_sr_04_pw_std_psp8/Pb.psp8
- Pb ONCVPSP-3.3.0 r_core= 2.42823 2.42823 2.10438
- 82.00000 14.00000 171114 znucl, zion, pspdat
 8 -1012 2 4 600 0.00000 pspcod,pspxc,lmax,lloc,mmax,r2well
 5.99000000000000 6.00000000000000 0.00000000000000 rchrg,fchrg,qchrg
 nproj 2 2 2
 extension_switch 1
 pspatm : epsatm= 44.14136879
 --- l ekb(1:nproj) -->
 0 6.261251 1.210156
 1 4.271695 0.573698
 2 -3.250355 -0.847110
 pspatm: atomic psp has been read and splines computed

 6.17979163E+02 ecore*ucvol(ha*bohr**3)
--------------------------------------------------------------------------------

_setup2: Arith. and geom. avg. npw (full set) are 1403.977 1403.966

================================================================================

--- !BeginCycle
iteration_state: {dtset: 1, }
solver: {iscf: 7, nstep: 10, nline: 4, wfoptalg: 0, }
tolerances: {tolvrs: 1.00E-10, }
...

 iter Etot(hartree) deltaE(h) residm vres2
 ETOT 1 -69.373802516378 -6.937E+01 6.987E-01 1.363E+02
 ETOT 2 -70.180556181606 -8.068E-01 7.184E-02 1.619E+02
 ETOT 3 -70.220120451211 -3.956E-02 3.052E-03 3.085E+01
 ETOT 4 -70.225315529283 -5.195E-03 2.686E-04 1.069E+01
 ETOT 5 -70.226696641059 -1.381E-03 1.191E-05 5.264E+00
 ETOT 6 -70.228027230143 -1.331E-03 1.017E-05 3.945E-02
 ETOT 7 -70.228033183044 -5.953E-06 3.064E-06 1.825E-02
 ETOT 8 -70.228037608726 -4.426E-06 1.675E-07 1.016E-03
 ETOT 9 -70.228037874421 -2.657E-07 3.499E-07 1.318E-06
 ETOT 10 -70.228037875241 -8.198E-10 4.981E-09 1.118E-09

 Cartesian components of stress tensor (hartree/bohr^3)
 sigma(1 1)= 6.74725506E-05 sigma(3 2)= 0.00000000E+00
 sigma(2 2)= 6.74725506E-05 sigma(3 1)= 0.00000000E+00
 sigma(3 3)= 6.74725506E-05 sigma(2 1)= 0.00000000E+00

 scprqt: WARNING -
 nstep= 10 was not enough SCF cycles to converge;
 potential residual= 1.118E-09 exceeds tolvrs= 1.000E-10

--- !ResultsGS
iteration_state: {dtset: 1, }
comment : Summary of ground state results
lattice_vectors:
- [ 0.0000000, 5.0000000, 5.0000000, ]
- [ 5.0000000, 0.0000000, 5.0000000, ]
- [ 5.0000000, 5.0000000, 0.0000000, ]
lattice_lengths: [ 7.07107, 7.07107, 7.07107, ]
lattice_angles: [ 60.000, 60.000, 60.000, ] # degrees, (23, 13, 12)
lattice_volume: 2.5000000E+02
convergence: {deltae: -8.198E-10, res2: 1.118E-09, residm: 4.981E-09, diffor: null, }
etotal : -7.02280379E+01
entropy : 0.00000000E+00
fermie : 3.75100947E-01
cartesian_stress_tensor: # hartree/bohr^3
- [ 6.74725506E-05, 0.00000000E+00, 0.00000000E+00, ]
- [ 0.00000000E+00, 6.74725506E-05, 0.00000000E+00, ]
- [ 0.00000000E+00, 0.00000000E+00, 6.74725506E-05, ]
pressure_GPa: -1.9851E+00
xred :
- [ 0.0000E+00, 0.0000E+00, 0.0000E+00, Pb]
cartesian_forces: # hartree/bohr
- [ -0.00000000E+00, -0.00000000E+00, -0.00000000E+00, ]
force_length_stats: {min: 0.00000000E+00, max: 0.00000000E+00, mean: 0.00000000E+00, }
...

 Integrated electronic density in atomic spheres:
 ------------------------------------------------
 Atom Sphere_radius Integrated_density
 1 2.00000 10.01712687
================================================================================

 ----iterations are completed or convergence reached----

 Mean square residual over all n,k,spin= 54.944E-13; max= 49.809E-10
 reduced coordinates (array xred) for 1 atoms
 0.000000000000 0.000000000000 0.000000000000
 rms dE/dt= 0.0000E+00; max dE/dt= 0.0000E+00; dE/dt below (all hartree)
 1 0.000000000000 0.000000000000 0.000000000000

 cartesian coordinates (angstrom) at end:
 1 0.00000000000000 0.00000000000000 0.00000000000000

 cartesian forces (hartree/bohr) at end:
 1 -0.00000000000000 -0.00000000000000 -0.00000000000000
 frms,max,avg= 0.0000000E+00 0.0000000E+00 0.000E+00 0.000E+00 0.000E+00 h/b

 cartesian forces (eV/Angstrom) at end:
 1 -0.00000000000000 -0.00000000000000 -0.00000000000000
 frms,max,avg= 0.0000000E+00 0.0000000E+00 0.000E+00 0.000E+00 0.000E+00 e/A
 length scales= 10.000000000000 10.000000000000 10.000000000000 bohr
 = 5.291772085900 5.291772085900 5.291772085900 angstroms
 prteigrs : about to open file tbasepar_1o_EIG
 Fermi (or HOMO) energy (hartree) = 0.37510 Average Vxc (hartree)= -0.36041
 Eigenvalues (hartree) for nkpt= 182 k points:
 kpt# 1, nband= 7, wtk= 0.00347, kpt= -0.0417 -0.0833 0.0000 (reduced coord)
 -0.48897 -0.48864 -0.48863 -0.48602 -0.48582 -0.25760 0.30753
 occupation numbers for kpt# 1
 2.00000 2.00000 2.00000 2.00000 2.00000 2.00000 2.00000
 prteigrs : prtvol=0 or 1, do not print more k-points.
--- !EnergyTerms
iteration_state : {dtset: 1, }
comment : Components of total free energy in Hartree
kinetic : 3.43521518604370E+01
hartree : 1.43338856701900E+01
xc : -1.97016619996615E+01
Ewald energy : -4.49316483263154E+01
psp_core : 2.47191665196533E+00
local_psp : -4.25638492498807E+01
non_local_psp : -1.41888324819753E+01
internal : -7.02280378752405E+01
'-kT*entropy' : -8.03916585037366E-15
total_energy : -7.02280378752405E+01
total_energy_eV : -1.91100209635779E+03
band_energy : -5.03929368074888E+00
...

 Cartesian components of stress tensor (hartree/bohr^3)
 sigma(1 1)= 6.74725506E-05 sigma(3 2)= 0.00000000E+00
 sigma(2 2)= 6.74725506E-05 sigma(3 1)= 0.00000000E+00
 sigma(3 3)= 6.74725506E-05 sigma(2 1)= 0.00000000E+00

-Cartesian components of stress tensor (GPa) [Pressure= -1.9851E+00 GPa]
- sigma(1 1)= 1.98511064E+00 sigma(3 2)= 0.00000000E+00
- sigma(2 2)= 1.98511064E+00 sigma(3 1)= 0.00000000E+00
- sigma(3 3)= 1.98511064E+00 sigma(2 1)= 0.00000000E+00

== END DATASET(S) ==============================================================
================================================================================

 -outvars: echo values of variables after computation --------
 acell 1.0000000000E+01 1.0000000000E+01 1.0000000000E+01 Bohr
 amu 2.07200000E+02
 ecut 2.40000000E+01 Hartree
 etotal -7.0228037875E+01
 fcart -0.0000000000E+00 -0.0000000000E+00 -0.0000000000E+00
- fftalg 512
 ixc -1012
 kpt -4.16666667E-02 -8.33333333E-02 0.00000000E+00
 -4.16666667E-02 -1.66666667E-01 0.00000000E+00
 -8.33333333E-02 -1.25000000E-01 0.00000000E+00
 -4.16666667E-02 -1.25000000E-01 4.16666667E-02
 -4.16666667E-02 -2.50000000E-01 0.00000000E+00
 -8.33333333E-02 -2.08333333E-01 0.00000000E+00
 -4.16666667E-02 -2.08333333E-01 4.16666667E-02
 -1.25000000E-01 -1.66666667E-01 0.00000000E+00
 -8.33333333E-02 -1.66666667E-01 4.16666667E-02
 -4.16666667E-02 -1.66666667E-01 8.33333333E-02
 -4.16666667E-02 -3.33333333E-01 0.00000000E+00
 -8.33333333E-02 -2.91666667E-01 0.00000000E+00
-4.16666667E-02 -2.91666667E-01 4.16666667E-02 -1.25000000E-01 -2.50000000E-01 0.00000000E+00 -8.33333333E-02 -2.50000000E-01 4.16666667E-02 -4.16666667E-02 -2.50000000E-01 8.33333333E-02 -1.66666667E-01 -2.08333333E-01 0.00000000E+00 -1.25000000E-01 -2.08333333E-01 4.16666667E-02 -8.33333333E-02 -2.08333333E-01 8.33333333E-02 -4.16666667E-02 -2.08333333E-01 1.25000000E-01 -4.16666667E-02 -4.16666667E-01 0.00000000E+00 -8.33333333E-02 -3.75000000E-01 0.00000000E+00 -4.16666667E-02 -3.75000000E-01 4.16666667E-02 -1.25000000E-01 -3.33333333E-01 0.00000000E+00 -8.33333333E-02 -3.33333333E-01 4.16666667E-02 -4.16666667E-02 -3.33333333E-01 8.33333333E-02 -1.66666667E-01 -2.91666667E-01 0.00000000E+00 -1.25000000E-01 -2.91666667E-01 4.16666667E-02 -8.33333333E-02 -2.91666667E-01 8.33333333E-02 -4.16666667E-02 -2.91666667E-01 1.25000000E-01 -2.08333333E-01 -2.50000000E-01 0.00000000E+00 -1.66666667E-01 -2.50000000E-01 4.16666667E-02 -1.25000000E-01 -2.50000000E-01 8.33333333E-02 -8.33333333E-02 -2.50000000E-01 1.25000000E-01 -4.16666667E-02 -2.50000000E-01 1.66666667E-01 -4.16666667E-02 5.00000000E-01 0.00000000E+00 -8.33333333E-02 -4.58333333E-01 0.00000000E+00 -4.16666667E-02 -4.58333333E-01 4.16666667E-02 -1.25000000E-01 -4.16666667E-01 0.00000000E+00 -8.33333333E-02 -4.16666667E-01 4.16666667E-02 -4.16666667E-02 -4.16666667E-01 8.33333333E-02 -1.66666667E-01 -3.75000000E-01 0.00000000E+00 -1.25000000E-01 -3.75000000E-01 4.16666667E-02 -8.33333333E-02 -3.75000000E-01 8.33333333E-02 -4.16666667E-02 -3.75000000E-01 1.25000000E-01 -2.08333333E-01 -3.33333333E-01 0.00000000E+00 -1.66666667E-01 -3.33333333E-01 4.16666667E-02 -1.25000000E-01 -3.33333333E-01 8.33333333E-02 -8.33333333E-02 -3.33333333E-01 1.25000000E-01 -4.16666667E-02 -3.33333333E-01 1.66666667E-01 outvar_i_n : Printing only first 50 k-points. 
kptrlatt 12 -12 12 -12 12 12 -12 -12 12 kptrlen 1.20000000E+02 P mkmem 182 natom 1 nband 7 ngfft 32 32 32 nkpt 182 nstep 10 nsym 48 ntypat 1 occ 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 
2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 prtocc : prtvol=0, do not print more k-points. 
occopt 7 rprim 0.0000000000E+00 5.0000000000E-01 5.0000000000E-01 5.0000000000E-01 0.0000000000E+00 5.0000000000E-01 5.0000000000E-01 5.0000000000E-01 0.0000000000E+00 shiftk 5.00000000E-01 5.00000000E-01 5.00000000E-01 spgroup 225 strten 6.7472550601E-05 6.7472550601E-05 6.7472550601E-05 0.0000000000E+00 0.0000000000E+00 0.0000000000E+00 symrel 1 0 0 0 1 0 0 0 1 -1 0 0 0 -1 0 0 0 -1 0 -1 1 0 -1 0 1 -1 0 0 1 -1 0 1 0 -1 1 0 -1 0 0 -1 0 1 -1 1 0 1 0 0 1 0 -1 1 -1 0 0 1 -1 1 0 -1 0 0 -1 0 -1 1 -1 0 1 0 0 1 -1 0 0 -1 1 0 -1 0 1 1 0 0 1 -1 0 1 0 -1 0 -1 1 1 -1 0 0 -1 0 0 1 -1 -1 1 0 0 1 0 1 0 0 0 0 1 0 1 0 -1 0 0 0 0 -1 0 -1 0 0 1 -1 0 0 -1 1 0 -1 0 -1 1 0 0 1 -1 0 1 -1 0 1 -1 1 0 -1 0 0 1 0 -1 1 -1 0 1 0 0 0 -1 0 1 -1 0 0 -1 1 0 1 0 -1 1 0 0 1 -1 1 0 -1 0 0 -1 0 1 -1 -1 0 1 0 0 1 0 -1 1 0 1 0 0 0 1 1 0 0 0 -1 0 0 0 -1 -1 0 0 1 0 -1 0 1 -1 0 0 -1 -1 0 1 0 -1 1 0 0 1 0 -1 0 0 -1 1 1 -1 0 0 1 0 0 1 -1 -1 1 0 -1 0 1 -1 0 0 -1 1 0 1 0 -1 1 0 0 1 -1 0 0 1 0 1 0 0 0 0 1 0 -1 0 -1 0 0 0 0 -1 0 0 -1 0 1 -1 1 0 -1 0 0 1 0 -1 1 -1 0 1 1 -1 0 0 -1 1 0 -1 0 -1 1 0 0 1 -1 0 1 0 0 0 1 1 0 0 0 1 0 0 0 -1 -1 0 0 0 -1 0 -1 1 0 -1 0 0 -1 0 1 1 -1 0 1 0 0 1 0 -1 0 0 1 0 1 0 1 0 0 0 0 -1 0 -1 0 -1 0 0 1 -1 0 0 -1 0 0 -1 1 -1 1 0 0 1 0 0 1 -1 0 0 -1 1 0 -1 0 1 -1 0 0 1 -1 0 1 0 -1 1 -1 1 0 -1 0 1 -1 0 0 1 -1 0 1 0 -1 1 0 0 tolvrs 1.00000000E-10 typat 1 wtk 0.00347 0.00347 0.00347 0.00694 0.00347 0.00347 0.00694 0.00347 0.00694 0.00694 0.00347 0.00347 0.00694 0.00347 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00347 0.00347 0.00694 0.00347 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00694 0.00347 0.00347 0.00694 0.00347 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00347 0.00694 0.00694 0.00694 0.00694 outvars : Printing only first 50 k-points. 
znucl 82.00000 ================================================================================ - Timing analysis has been suppressed with timopt=0 ================================================================================ Suggested references for the acknowledgment of ABINIT usage. The users of ABINIT have little formal obligations with respect to the ABINIT group (those specified in the GNU General Public License, http://www.gnu.org/copyleft/gpl.txt). However, it is common practice in the scientific literature, to acknowledge the efforts of people that have made the research possible. In this spirit, please find below suggested citations of work written by ABINIT developers, corresponding to implementations inside of ABINIT that you have used in the present run. Note also that it will be of great value to readers of publications presenting these results, to read papers enabling them to understand the theoretical formalism and details of the ABINIT implementation. For information on why they are suggested, see also https://docs.abinit.org/theory/acknowledgments. - - [1] Libxc: A library of exchange and correlation functionals for density functional theory. - M.A.L. Marques, M.J.T. Oliveira, T. Burnus, Computer Physics Communications 183, 2227 (2012). - Comment: to be cited when LibXC is used (negative value of ixc) - Strong suggestion to cite this paper. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#marques2012 - - [2] The Abinit project: Impact, environment and recent developments. - Computer Phys. Comm. 248, 107042 (2020). - X.Gonze, B. Amadon, G. Antonius, F.Arnardi, L.Baguet, J.-M.Beuken, - J.Bieder, F.Bottin, J.Bouchet, E.Bousquet, N.Brouwer, F.Bruneval, - G.Brunin, T.Cavignac, J.-B. 
Charraud, Wei Chen, M.Cote, S.Cottenier, - J.Denier, G.Geneste, Ph.Ghosez, M.Giantomassi, Y.Gillet, O.Gingras, - D.R.Hamann, G.Hautier, Xu He, N.Helbig, N.Holzwarth, Y.Jia, F.Jollet, - W.Lafargue-Dit-Hauret, K.Lejaeghere, M.A.L.Marques, A.Martin, C.Martins, - H.P.C. Miranda, F.Naccarato, K. Persson, G.Petretto, V.Planes, Y.Pouillon, - S.Prokhorenko, F.Ricci, G.-M.Rignanese, A.H.Romero, M.M.Schmitt, M.Torrent, - M.J.van Setten, B.Van Troeye, M.J.Verstraete, G.Zerah and J.W.Zwanzig - Comment: the fifth generic paper describing the ABINIT project. - Note that a version of this paper, that is not formatted for Computer Phys. Comm. - is available at https://www.abinit.org/sites/default/files/ABINIT20.pdf . - The licence allows the authors to put it on the Web. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#gonze2020 - - [3] Optimized norm-conserving Vanderbilt pseudopotentials. - D.R. Hamann, Phys. Rev. B 88, 085117 (2013). - Comment: Some pseudopotential generated using the ONCVPSP code were used. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#hamann2013 - - [4] ABINIT: Overview, and focus on selected capabilities - J. Chem. Phys. 152, 124102 (2020). - A. Romero, D.C. Allan, B. Amadon, G. Antonius, T. Applencourt, L.Baguet, - J.Bieder, F.Bottin, J.Bouchet, E.Bousquet, F.Bruneval, - G.Brunin, D.Caliste, M.Cote, - J.Denier, C. Dreyer, Ph.Ghosez, M.Giantomassi, Y.Gillet, O.Gingras, - D.R.Hamann, G.Hautier, F.Jollet, G. Jomard, - A.Martin, - H.P.C. Miranda, F.Naccarato, G.Petretto, N.A. Pike, V.Planes, - S.Prokhorenko, T. Rangel, F.Ricci, G.-M.Rignanese, M.Royo, M.Stengel, M.Torrent, - M.J.van Setten, B.Van Troeye, M.J.Verstraete, J.Wiktor, J.W.Zwanziger, and X.Gonze. - Comment: a global overview of ABINIT, with focus on selected capabilities . - Note that a version of this paper, that is not formatted for J. Chem. Phys - is available at https://www.abinit.org/sites/default/files/ABINIT20_JPC.pdf . 
- The licence allows the authors to put it on the Web. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#romero2020 - - [5] Recent developments in the ABINIT software package. - Computer Phys. Comm. 205, 106 (2016). - X.Gonze, F.Jollet, F.Abreu Araujo, D.Adams, B.Amadon, T.Applencourt, - C.Audouze, J.-M.Beuken, J.Bieder, A.Bokhanchuk, E.Bousquet, F.Bruneval - D.Caliste, M.Cote, F.Dahm, F.Da Pieve, M.Delaveau, M.Di Gennaro, - B.Dorado, C.Espejo, G.Geneste, L.Genovese, A.Gerossier, M.Giantomassi, - Y.Gillet, D.R.Hamann, L.He, G.Jomard, J.Laflamme Janssen, S.Le Roux, - A.Levitt, A.Lherbier, F.Liu, I.Lukacevic, A.Martin, C.Martins, - M.J.T.Oliveira, S.Ponce, Y.Pouillon, T.Rangel, G.-M.Rignanese, - A.H.Romero, B.Rousseau, O.Rubel, A.A.Shukri, M.Stankovski, M.Torrent, - M.J.Van Setten, B.Van Troeye, M.J.Verstraete, D.Waroquier, J.Wiktor, - B.Xu, A.Zhou, J.W.Zwanziger. - Comment: the fourth generic paper describing the ABINIT project. - Note that a version of this paper, that is not formatted for Computer Phys. Comm. - is available at https://www.abinit.org/sites/default/files/ABINIT16.pdf . - The licence allows the authors to put it on the Web. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#gonze2016 - - Proc. 0 individual time (sec): cpu= 38.6 wall= 38.7 ================================================================================ Calculation completed. .Delivered 11 WARNINGs and 2 COMMENTs to log file. +Overall time at end (sec) : cpu= 38.6 wall= 38.7
You will notice
that the numerical results are essentially identical. You will also see that 182
k-points have been kept in memory in the sequential case (keyword mkmem),
while 91
k-points have been kept in memory (per processor!) in the parallel case.
The timing can be found at the end of the file. Here is an example:
- Proc. 0 individual time (sec): cpu= 20.0 wall= 20.1
================================================================================
Calculation completed.
Delivered 0 WARNINGs and 1 COMMENTs to log file.
+Overall time at end (sec) : cpu= 40.1 wall= 40.3
This effectively corresponds to a speedup of the job by a factor of two.
Let's examine it. The line beginning with Proc. 0
corresponds to the CPU and
wall clock times seen by processor number 0 (processor indexing always
starts at 0, so here the other processor is number 1): 20.0 sec of CPU time, and nearly the same
amount of wall clock time. The line that starts with +Overall time
corresponds to the sum of the CPU times and of the wall clock times over all processors.
The summation is meaningful for the CPU time, but not for the wall
clock time: the job finished after 20.1 sec, not 40.3 sec.
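The speedup can also be extracted from the timing lines with a few lines of script. The following is a minimal sketch, not an ABINIT tool: the helper name `parse_times` is hypothetical, and the two input strings are simply the `Proc. 0` lines quoted in this tutorial for the sequential and the two-processor runs, assumed here to be comparable (same machine, same build).

```python
import re

# Matches the "cpu= ... wall= ..." pair at the end of an ABINIT timing line.
TIME_RE = re.compile(r"cpu=\s*([0-9.]+)\s+wall=\s*([0-9.]+)")


def parse_times(line):
    """Return (cpu, wall) in seconds from a timing line (hypothetical helper)."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError(f"no timing information in: {line!r}")
    return float(match.group(1)), float(match.group(2))


# Timing lines as quoted in this tutorial (sequential run, then 2 processors).
seq_line = "- Proc. 0 individual time (sec): cpu= 38.6 wall= 38.7"
par_line = "- Proc. 0 individual time (sec): cpu= 20.0 wall= 20.1"

nproc = 2
_, seq_wall = parse_times(seq_line)
_, par_wall = parse_times(par_line)

# Speedup uses the wall clock time of one processor, not the "+Overall" sum.
speedup = seq_wall / par_wall
efficiency = speedup / nproc
print(f"speedup = {speedup:.2f}, parallel efficiency = {efficiency:.2f}")
```

Note that the wall clock time of processor 0 is used, and not the `+Overall time` value, for the reason given above.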
Now, you might try to increase the number of processors, and see whether the CPU time is shared equally amongst the different processors, so that the wall clock time seen by each processor decreases. At some point (depending on your machine, and on the sequential part of ABINIT), you will not be able to decrease further the wall clock time seen by one processor. Beyond that point, it is not worth trying to use more processors. Let us define the speedup as the time taken by the sequential calculation divided by the time taken by the parallel calculation (hopefully > 1). You should get a curve similar to this one:
Speedup with k point parallelization
The red curve shows the speedup actually achieved, while the green one is the ideal \(y = x\) line. The shape of the red curve will vary depending on your hardware configuration.
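The saturation of such a curve is what Amdahl's law, mentioned in the introduction, predicts. The sketch below evaluates the textbook formula \(S(N) = 1/(s + (1-s)/N)\) for a few processor counts. The serial fraction `s = 0.05` is an arbitrary illustrative value, not a measured property of ABINIT, and the function name is hypothetical.

```python
def amdahl_speedup(nproc, serial_fraction):
    """Ideal speedup on nproc processors when a fraction of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nproc)


# Assumed serial fraction, chosen only for illustration.
serial_fraction = 0.05

for nproc in (1, 2, 4, 8, 16, 32, 64):
    s = amdahl_speedup(nproc, serial_fraction)
    print(f"{nproc:3d} processors -> speedup {s:5.2f} (ideal {nproc})")

# The speedup can never exceed 1 / serial_fraction, however many processors are used.
print(f"upper bound: {1.0 / serial_fraction:.1f}")
```

In this simple model, the curve departs more and more from the \(y = x\) line as the number of processors grows, which is the qualitative behaviour described above.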
One last remark: the number of k-points need not be a multiple of the number of processors. As an example, you might try to run the above case with 16 processors: all will treat \(\lfloor 182/16 \rfloor=11\) k-points, but \(182-16\times11=6\) processors will have to treat one more k-point, so that \(6\times12+10\times11=182\). The maximal speedup will then only be \(182/12\approx15.2\), instead of 16.
Try to avoid leaving a processor empty, as this can make ABINIT fail with certain compilers. An empty processor occurs, for example, if you use more processors than the number of k-points. The extra processors do no useful work, but have to run anyway, just to confirm to ABINIT once in a while that all processors are alive.
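The arithmetic of the last two paragraphs can be checked for any number of processors before launching a job. This is a minimal sketch of the counting argument only: the function `kpoint_load` is a hypothetical helper, and it assumes that k-points are spread as evenly as possible, which is the situation described in the text.

```python
def kpoint_load(nkpt, nproc):
    """Return (base, extra, max_load, max_speedup, idle) for an even k-point split.

    base        : k-points treated by every processor
    extra       : number of processors that treat one more k-point
    max_load    : largest number of k-points on a single processor
    max_speedup : upper bound on the speedup, nkpt / max_load
    idle        : processors left without any k-point
    """
    base, extra = divmod(nkpt, nproc)
    max_load = base + (1 if extra else 0)
    idle = max(0, nproc - nkpt)
    return base, extra, max_load, nkpt / max_load, idle


for nproc in (2, 16, 200):
    base, extra, max_load, max_speedup, idle = kpoint_load(182, nproc)
    print(f"{nproc:3d} processors: {extra} treat {base + 1} k-points, "
          f"{nproc - extra} treat {base}; "
          f"max speedup {max_speedup:.1f}, idle processors {idle}")
```

With 182 k-points, a processor count that divides 182 exactly (2, 7, 13, 14, 26, 91, 182) leaves no processor with an extra k-point, while any count above 182 leaves some processors empty.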
Parallelism over the spins¶
The parallelization over the spins (up, down) is done along with the one over the k-points, so it works exactly the same way. The file tbasepar_2.abi in $ABI_TESTS/tutorial treats a spin-polarized system (distorted FCC iron) with only one k-point in the Irreducible Brillouin Zone. This is quite unphysical, and serves only to demonstrate the spin parallelism with as few as two processors: the k-point parallelism has precedence over the spin parallelism, so that with 2 processors, one needs to have only one k-point in order to see the spin parallelism.
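The precedence of k-points over spins can be pictured with a toy model. The sketch below is an illustration only and is not ABINIT's actual distribution routine: it assumes that the (k-point, spin) pairs are ordered with the k-point as the outer index and are handed out to the processors in contiguous blocks. The function name `distribute` is hypothetical.

```python
def distribute(nkpt, nsppol, nproc):
    """Toy model: split (k-point, spin) pairs into contiguous blocks per processor."""
    # k-point is the outer index, spin the inner one (assumption of this sketch).
    units = [(ikpt, isppol) for ikpt in range(nkpt) for isppol in range(nsppol)]
    base, extra = divmod(len(units), nproc)
    shares, start = [], 0
    for rank in range(nproc):
        size = base + (1 if rank < extra else 0)
        shares.append(units[start:start + size])
        start += size
    return shares


# One k-point, two spins, two processors: each processor gets one spin.
print(distribute(nkpt=1, nsppol=2, nproc=2))

# Two k-points, two spins, two processors: each processor gets one k-point
# with both spins, so the spin parallelism is not visible.
print(distribute(nkpt=2, nsppol=2, nproc=2))
```

In this model, the spins are only separated between processors when there are too few k-points to share, which is why the test case uses a single k-point.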
# FCC Fe (ferromagnetic for fun) with four atoms per cell
# Distorted with a A1 phonon, so as to keep the symmetry ...
# Only one k point in the IBZ
# Test the parallelism over the spins

#Definition of the unit cell
acell 3*7.00

#Definition of the atom types and pseudopotentials
ntypat 1
znucl 26.0
pp_dirpath "$ABI_PSPDIR"
pseudos "Psdj_nc_sr_04_pw_std_psp8/Fe.psp8"

#Definition of the atoms and atoms positions
natom 4
typat 4*1
xred 0.01 0.01 0.01
     0.49 0.49 0.01
     0.49 0.01 0.49
     0.01 0.49 0.49

#Numerical parameters of the calculation : planewave basis set and k point grid
ecut 39
ngkpt 2 2 2
shiftk 0.5 0.5 0.5
occopt 7
nband 40
nsppol 2
spinat 0.0 0.0 3.0
       0.0 0.0 3.0
       0.0 0.0 3.0
       0.0 0.0 3.0

#Parameters for the SCF procedure
nstep 5
tolvrs 1.0d-13
nline 5

##############################################################
# This section is used only for regression testing of ABINIT #
##############################################################
#%%<BEGIN TEST_INFO>
#%% [setup]
#%% executable = abinit
#%% [files]
#%% files_to_test = tbasepar_2.abo, tolnlines=0, tolabs=0.0, tolrel=0.0
#%% [paral_info]
#%% max_nprocs = 2
#%% [extra_info]
#%% keywords = NC
#%% authors = Unknown
#%% description =
#%%   FCC Fe (ferromagnetic for fun) with four atoms per cell
#%%   Distorted with a A1 phonon, so as to keep the symmetry ...
#%%   Only one k point in the IBZ
#%%   Test the parallelism over the spins
#%%<END TEST_INFO>
If needed, modify the input file to provide a local temporary disk space. Run this test case sequentially, then in parallel.
While the jobs are running, read the input file. Then look closely at the output and log files in the sequential and parallel cases. They are quite similar. Actually, apart from the mention of two processors and the speedup, there is no other manifestation of the parallelism.
.Version 10.1.4.5 of ABINIT, released Sep 2024. .(MPI version, prepared for a x86_64_linux_gnu13.2 computer) .Copyright (C) 1998-2024 ABINIT group . ABINIT comes with ABSOLUTELY NO WARRANTY. It is free software, and you are welcome to redistribute it under certain conditions (GNU General Public License, see ~abinit/COPYING or http://www.gnu.org/copyleft/gpl.txt). ABINIT is a project of the Universite Catholique de Louvain, Corning Inc. and other collaborators, see ~abinit/doc/developers/contributors.txt . Please read https://docs.abinit.org/theory/acknowledgments for suggested acknowledgments of the ABINIT effort. For more information, see https://www.abinit.org . .Starting date : Fri 13 Sep 2024. - ( at 19h06 ) - input file -> /home/buildbot/ABINIT3/eos_gnu_13.2_mpich/trunk_merge-10.0/tests/TestBot_MPI1/tutorial_tbasepar_2/tbasepar_2.abi - output file -> tbasepar_2.abo - root for input files -> tbasepar_2i - root for output files -> tbasepar_2o Symmetries : space group P-4 3 m (#215); Bravais cP (primitive cubic) ================================================================================ Values of the parameters that define the memory need of the present run intxc = 0 ionmov = 0 iscf = 7 lmnmax = 6 lnmax = 6 mgfft = 40 mpssoang = 3 mqgrid = 3001 natom = 4 nloc_mem = 1 nspden = 2 nspinor = 1 nsppol = 2 nsym = 24 n1xccc = 2501 ntypat = 1 occopt = 7 xclevel = 1 - mband = 40 mffmem = 1 mkmem = 1 mpw = 4013 nfft = 64000 nkpt = 1 ================================================================================ P This job should need less than 36.804 Mbytes of memory. Rough estimation (10% accuracy) of disk space for files : _ WF disk file : 4.901 Mbytes ; DEN or POT disk file : 0.979 Mbytes. 
================================================================================ -------------------------------------------------------------------------------- ------------- Echo of variables that govern the present computation ------------ -------------------------------------------------------------------------------- - - outvars: echo of selected default values - iomode0 = 0 , fftalg0 =512 , wfoptalg0 = 0 - - outvars: echo of global parameters not present in the input file - max_nthreads = 0 - -outvars: echo values of preprocessed input variables -------- acell 7.0000000000E+00 7.0000000000E+00 7.0000000000E+00 Bohr amu 5.58470000E+01 ecut 3.90000000E+01 Hartree - fftalg 512 ixc -1012 kpt 2.50000000E-01 2.50000000E-01 2.50000000E-01 kptrlatt 2 0 0 0 2 0 0 0 2 kptrlen 1.40000000E+01 P mkmem 1 natom 4 nband 40 ngfft 40 40 40 nkpt 1 nline 5 nspden 2 nsppol 2 nstep 5 nsym 24 ntypat 1 occ 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 occopt 7 shiftk 5.00000000E-01 5.00000000E-01 5.00000000E-01 spgroup 215 spinat 0.0000000000E+00 0.0000000000E+00 3.0000000000E+00 0.0000000000E+00 0.0000000000E+00 3.0000000000E+00 0.0000000000E+00 0.0000000000E+00 3.0000000000E+00 0.0000000000E+00 0.0000000000E+00 3.0000000000E+00 symrel 1 0 0 0 1 0 0 0 1 -1 0 0 0 1 0 0 0 -1 -1 0 0 0 -1 0 0 0 1 1 0 0 0 -1 0 0 0 -1 0 1 0 1 
0 0 0 0 1 0 -1 0 1 0 0 0 0 -1 0 -1 0 -1 0 0 0 0 1 0 1 0 -1 0 0 0 0 -1 0 0 1 1 0 0 0 1 0 0 0 -1 1 0 0 0 -1 0 0 0 -1 -1 0 0 0 1 0 0 0 1 -1 0 0 0 -1 0 1 0 0 0 0 1 0 1 0 -1 0 0 0 0 1 0 -1 0 -1 0 0 0 0 -1 0 1 0 1 0 0 0 0 -1 0 -1 0 0 1 0 0 0 1 1 0 0 0 -1 0 0 0 1 -1 0 0 0 -1 0 0 0 -1 1 0 0 0 1 0 0 0 -1 -1 0 0 0 0 1 0 1 0 1 0 0 0 0 -1 0 1 0 -1 0 0 0 0 -1 0 -1 0 1 0 0 0 0 1 0 -1 0 -1 0 0 tnons 0.0000000 0.0000000 0.0000000 0.5000000 0.0000000 0.5000000 0.5000000 0.5000000 0.0000000 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000 0.5000000 0.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000 0.5000000 0.0000000 0.5000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.5000000 0.5000000 0.5000000 0.0000000 0.5000000 0.0000000 0.0000000 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.5000000 0.5000000 0.5000000 0.0000000 tolvrs 1.00000000E-13 typat 1 1 1 1 xangst 3.7042404601E-02 3.7042404601E-02 3.7042404601E-02 1.8150778255E+00 1.8150778255E+00 3.7042404601E-02 1.8150778255E+00 3.7042404601E-02 1.8150778255E+00 3.7042404601E-02 1.8150778255E+00 1.8150778255E+00 xcart 7.0000000000E-02 7.0000000000E-02 7.0000000000E-02 3.4300000000E+00 3.4300000000E+00 7.0000000000E-02 3.4300000000E+00 7.0000000000E-02 3.4300000000E+00 7.0000000000E-02 3.4300000000E+00 3.4300000000E+00 xred 1.0000000000E-02 1.0000000000E-02 1.0000000000E-02 4.9000000000E-01 4.9000000000E-01 1.0000000000E-02 4.9000000000E-01 1.0000000000E-02 4.9000000000E-01 1.0000000000E-02 4.9000000000E-01 4.9000000000E-01 znucl 26.00000 ================================================================================ chkinp: Checking input parameters for consistency. 
================================================================================ == DATASET 1 ================================================================== - mpi_nproc: 1, omp_nthreads: -1 (-1 if OMP is not activated) --- !DatasetInfo iteration_state: {dtset: 1, } dimensions: {natom: 4, nkpt: 1, mband: 40, nsppol: 2, nspinor: 1, nspden: 2, mpw: 4013, } cutoff_energies: {ecut: 39.0, pawecutdg: -1.0, } electrons: {nelect: 6.40000000E+01, charge: 0.00000000E+00, occopt: 7.00000000E+00, tsmear: 1.00000000E-02, } meta: {optdriver: 0, ionmov: 0, optcell: 0, iscf: 7, paral_kgb: 0, } ... Real(R)+Recip(G) space primitive vectors, cartesian coordinates (Bohr,Bohr^-1): R(1)= 7.0000000 0.0000000 0.0000000 G(1)= 0.1428571 0.0000000 0.0000000 R(2)= 0.0000000 7.0000000 0.0000000 G(2)= 0.0000000 0.1428571 0.0000000 R(3)= 0.0000000 0.0000000 7.0000000 G(3)= 0.0000000 0.0000000 0.1428571 Unit cell volume ucvol= 3.4300000E+02 bohr^3 Angles (23,13,12)= 9.00000000E+01 9.00000000E+01 9.00000000E+01 degrees getcut: wavevector= 0.0000 0.0000 0.0000 ngfft= 40 40 40 ecut(hartree)= 39.000 => boxcut(ratio)= 2.03266 --- Pseudopotential description ------------------------------------------------ - pspini: atom type 1 psp file is /home/buildbot/ABINIT3/eos_gnu_13.2_mpich/trunk_merge-10.0/tests/Pspdir/Psdj_nc_sr_04_pw_std_psp8/Fe.psp8 - pspatm: opening atomic psp file /home/buildbot/ABINIT3/eos_gnu_13.2_mpich/trunk_merge-10.0/tests/Pspdir/Psdj_nc_sr_04_pw_std_psp8/Fe.psp8 - Fe ONCVPSP-3.3.0 r_core= 1.26437 1.20546 1.56719 - 26.00000 16.00000 171102 znucl, zion, pspdat 8 -1012 2 4 600 0.00000 pspcod,pspxc,lmax,lloc,mmax,r2well 5.99000000000000 3.00000000000000 0.00000000000000 rchrg,fchrg,qchrg nproj 2 2 2 extension_switch 1 pspatm : epsatm= 15.14527328 --- l ekb(1:nproj) --> 0 10.102794 1.450128 1 1.554943 -0.538064 2 -5.598275 -2.050812 pspatm: atomic psp has been read and splines computed 3.87718996E+03 ecore*ucvol(ha*bohr**3) 
-------------------------------------------------------------------------------- _setup2: Arith. and geom. avg. npw (full set) are 4013.000 4013.000 ================================================================================ --- !BeginCycle iteration_state: {dtset: 1, } solver: {iscf: 7, nstep: 5, nline: 5, wfoptalg: 0, } tolerances: {tolvrs: 1.00E-13, } ... iter Etot(hartree) deltaE(h) residm vres2 magn ETOT 1 -500.52579610578 -5.01E+02 1.49E-01 1.86E+02 6.894 ETOT 2 -500.72971263932 -2.04E-01 3.28E-03 7.48E+02 8.790 ETOT 3 -500.92730216230 -1.98E-01 7.43E-04 3.22E+00 8.013 ETOT 4 -500.92914664276 -1.84E-03 7.31E-06 1.32E+00 8.464 ETOT 5 -500.93128655057 -2.14E-03 4.85E-06 8.04E-01 9.039 Cartesian components of stress tensor (hartree/bohr^3) sigma(1 1)= 9.12184193E-04 sigma(3 2)= 0.00000000E+00 sigma(2 2)= 9.12184193E-04 sigma(3 1)= 0.00000000E+00 sigma(3 3)= 9.12184193E-04 sigma(2 1)= 0.00000000E+00 scprqt: WARNING - nstep= 5 was not enough SCF cycles to converge; potential residual= 8.041E-01 exceeds tolvrs= 1.000E-13 --- !ResultsGS iteration_state: {dtset: 1, } comment : Summary of ground state results lattice_vectors: - [ 7.0000000, 0.0000000, 0.0000000, ] - [ 0.0000000, 7.0000000, 0.0000000, ] - [ 0.0000000, 0.0000000, 7.0000000, ] lattice_lengths: [ 7.00000, 7.00000, 7.00000, ] lattice_angles: [ 90.000, 90.000, 90.000, ] # degrees, (23, 13, 12) lattice_volume: 3.4300000E+02 convergence: {deltae: -2.140E-03, res2: 8.041E-01, residm: 4.854E-06, diffor: null, } etotal : -5.00931287E+02 entropy : 0.00000000E+00 fermie : 4.05636277E-01 cartesian_stress_tensor: # hartree/bohr^3 - [ 9.12184193E-04, 0.00000000E+00, 0.00000000E+00, ] - [ 0.00000000E+00, 9.12184193E-04, 0.00000000E+00, ] - [ 0.00000000E+00, 0.00000000E+00, 9.12184193E-04, ] pressure_GPa: -2.6837E+01 xred : - [ 1.0000E-02, 1.0000E-02, 1.0000E-02, Fe] - [ 4.9000E-01, 4.9000E-01, 1.0000E-02, Fe] - [ 4.9000E-01, 1.0000E-02, 4.9000E-01, Fe] - [ 1.0000E-02, 4.9000E-01, 4.9000E-01, Fe] cartesian_forces: 
# hartree/bohr - [ -1.00197896E-02, -1.00197896E-02, -1.00197896E-02, ] - [ 1.00197896E-02, 1.00197896E-02, -1.00197896E-02, ] - [ 1.00197896E-02, -1.00197896E-02, 1.00197896E-02, ] - [ -1.00197896E-02, 1.00197896E-02, 1.00197896E-02, ] force_length_stats: {min: 1.73547847E-02, max: 1.73547847E-02, mean: 1.73547847E-02, } ... Integrated electronic and magnetization densities in atomic spheres: --------------------------------------------------------------------- Radius=ratsph(iatom), smearing ratsm= 0.0000. Diff(up-dn)=approximate z local magnetic moment. Atom Radius up_density dn_density Total(up+dn) Diff(up-dn) 1 2.00000 8.156050 5.875402 14.031452 2.280648 2 2.00000 8.156050 5.875402 14.031452 2.280648 3 2.00000 8.156050 5.875402 14.031452 2.280648 4 2.00000 8.156050 5.875402 14.031452 2.280648 --------------------------------------------------------------------- Sum: 32.624200 23.501610 56.125810 9.122590 Total magnetization (from the atomic spheres): 9.122590 Total magnetization (exact up - dn): 9.038543 ================================================================================ ----iterations are completed or convergence reached---- Mean square residual over all n,k,spin= 51.146E-08; max= 48.544E-07 reduced coordinates (array xred) for 4 atoms 0.010000000000 0.010000000000 0.010000000000 0.490000000000 0.490000000000 0.010000000000 0.490000000000 0.010000000000 0.490000000000 0.010000000000 0.490000000000 0.490000000000 rms dE/dt= 7.0139E-02; max dE/dt= 7.0139E-02; dE/dt below (all hartree) 1 0.070138527151 0.070138527151 0.070138527151 2 -0.070138527151 -0.070138527151 0.070138527151 3 -0.070138527151 0.070138527151 -0.070138527151 4 0.070138527151 -0.070138527151 -0.070138527151 cartesian coordinates (angstrom) at end: 1 0.03704240460130 0.03704240460130 0.03704240460130 2 1.81507782546370 1.81507782546370 0.03704240460130 3 1.81507782546370 0.03704240460130 1.81507782546370 4 0.03704240460130 1.81507782546370 1.81507782546370 cartesian forces 
(hartree/bohr) at end: 1 -0.01001978959295 -0.01001978959295 -0.01001978959295 2 0.01001978959295 0.01001978959295 -0.01001978959295 3 0.01001978959295 -0.01001978959295 0.01001978959295 4 -0.01001978959295 0.01001978959295 0.01001978959295 frms,max,avg= 1.0019790E-02 1.0019790E-02 0.000E+00 0.000E+00 0.000E+00 h/b cartesian forces (eV/Angstrom) at end: 1 -0.51523825362150 -0.51523825362150 -0.51523825362150 2 0.51523825362150 0.51523825362150 -0.51523825362150 3 0.51523825362150 -0.51523825362150 0.51523825362150 4 -0.51523825362150 0.51523825362150 0.51523825362150 frms,max,avg= 5.1523825E-01 5.1523825E-01 0.000E+00 0.000E+00 0.000E+00 e/A length scales= 7.000000000000 7.000000000000 7.000000000000 bohr = 3.704240460130 3.704240460130 3.704240460130 angstroms prteigrs : about to open file tbasepar_2o_EIG Fermi (or HOMO) energy (hartree) = 0.40564 Average Vxc (hartree)= -0.49725 Magnetization (Bohr magneton)= 9.03854254E+00 Total spin up = 3.65192713E+01 Total spin down = 2.74807287E+01 Eigenvalues (hartree) for nkpt= 1 k points, SPIN UP: kpt# 1, nband= 40, wtk= 1.00000, kpt= 0.2500 0.2500 0.2500 (reduced coord) -2.79539 -2.79409 -2.79408 -2.79408 -1.56741 -1.56718 -1.56718 -1.56433 -1.56265 -1.56265 -1.56136 -1.56093 -1.56081 -1.56081 -1.55877 -1.55877 0.18600 0.26485 0.26600 0.26600 0.28622 0.28826 0.28826 0.31924 0.32815 0.32815 0.33348 0.33900 0.33900 0.35435 0.35435 0.35607 0.37144 0.37144 0.39818 0.39818 0.39940 0.49571 0.49601 0.49601 occupation numbers for kpt# 1 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 0.85417 0.85417 0.81093 0.00000 0.00000 0.00000 Eigenvalues (hartree) for nkpt= 1 k points, SPIN DOWN: kpt# 1, nband= 40, wtk= 1.00000, kpt= 0.2500 0.2500 0.2500 (reduced coord) -2.71304 -2.71164 -2.71164 -2.71164 -1.48349 
-1.48268 -1.48268 -1.47972 -1.47775 -1.47775 -1.47676 -1.47589 -1.47589 -1.47576 -1.47368 -1.47368 0.19876 0.31329 0.31422 0.31422 0.34219 0.34446 0.34446 0.37875 0.38865 0.38865 0.39177 0.41016 0.41016 0.43372 0.43372 0.43628 0.45466 0.45466 0.48130 0.48130 0.48213 0.54662 0.54760 0.54760 occupation numbers for kpt# 1 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 0.99993 0.99186 0.99186 0.97509 0.26095 0.26095 0.00004 0.00004 0.00001 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 --- !EnergyTerms iteration_state : {dtset: 1, } comment : Components of total free energy in Hartree kinetic : 2.29488819590612E+02 hartree : 1.30548375056168E+02 xc : -7.01653796888672E+01 Ewald energy : -3.34599691126113E+02 psp_core : 1.13037608148300E+01 local_psp : -4.38416353644505E+02 non_local_psp : -2.90803418147169E+01 internal : -5.00920810812593E+02 '-kT*entropy' : -1.04757379752997E-02 total_energy : -5.00931286550568E+02 total_energy_eV : -1.36310335258112E+04 band_energy : -4.79524729535292E+01 ... 
Cartesian components of stress tensor (hartree/bohr^3) sigma(1 1)= 9.12184193E-04 sigma(3 2)= 0.00000000E+00 sigma(2 2)= 9.12184193E-04 sigma(3 1)= 0.00000000E+00 sigma(3 3)= 9.12184193E-04 sigma(2 1)= 0.00000000E+00 -Cartesian components of stress tensor (GPa) [Pressure= -2.6837E+01 GPa] - sigma(1 1)= 2.68373810E+01 sigma(3 2)= 0.00000000E+00 - sigma(2 2)= 2.68373810E+01 sigma(3 1)= 0.00000000E+00 - sigma(3 3)= 2.68373810E+01 sigma(2 1)= 0.00000000E+00 == END DATASET(S) ============================================================== ================================================================================ -outvars: echo values of variables after computation -------- acell 7.0000000000E+00 7.0000000000E+00 7.0000000000E+00 Bohr amu 5.58470000E+01 ecut 3.90000000E+01 Hartree etotal -5.0093128655E+02 fcart -1.0019789593E-02 -1.0019789593E-02 -1.0019789593E-02 1.0019789593E-02 1.0019789593E-02 -1.0019789593E-02 1.0019789593E-02 -1.0019789593E-02 1.0019789593E-02 -1.0019789593E-02 1.0019789593E-02 1.0019789593E-02 - fftalg 512 ixc -1012 kpt 2.50000000E-01 2.50000000E-01 2.50000000E-01 kptrlatt 2 0 0 0 2 0 0 0 2 kptrlen 1.40000000E+01 P mkmem 1 natom 4 nband 40 ngfft 40 40 40 nkpt 1 nline 5 nspden 2 nsppol 2 nstep 5 nsym 24 ntypat 1 occ 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.999999 0.999999 0.854169 0.854169 0.810934 0.000000 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.999928 0.991863 0.991863 0.975090 0.260953 0.260953 0.000036 0.000036 0.000007 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 occopt 7 shiftk 
5.00000000E-01 5.00000000E-01 5.00000000E-01 spgroup 215 spinat 0.0000000000E+00 0.0000000000E+00 3.0000000000E+00 0.0000000000E+00 0.0000000000E+00 3.0000000000E+00 0.0000000000E+00 0.0000000000E+00 3.0000000000E+00 0.0000000000E+00 0.0000000000E+00 3.0000000000E+00 strten 9.1218419250E-04 9.1218419250E-04 9.1218419250E-04 0.0000000000E+00 0.0000000000E+00 0.0000000000E+00 symrel 1 0 0 0 1 0 0 0 1 -1 0 0 0 1 0 0 0 -1 -1 0 0 0 -1 0 0 0 1 1 0 0 0 -1 0 0 0 -1 0 1 0 1 0 0 0 0 1 0 -1 0 1 0 0 0 0 -1 0 -1 0 -1 0 0 0 0 1 0 1 0 -1 0 0 0 0 -1 0 0 1 1 0 0 0 1 0 0 0 -1 1 0 0 0 -1 0 0 0 -1 -1 0 0 0 1 0 0 0 1 -1 0 0 0 -1 0 1 0 0 0 0 1 0 1 0 -1 0 0 0 0 1 0 -1 0 -1 0 0 0 0 -1 0 1 0 1 0 0 0 0 -1 0 -1 0 0 1 0 0 0 1 1 0 0 0 -1 0 0 0 1 -1 0 0 0 -1 0 0 0 -1 1 0 0 0 1 0 0 0 -1 -1 0 0 0 0 1 0 1 0 1 0 0 0 0 -1 0 1 0 -1 0 0 0 0 -1 0 -1 0 1 0 0 0 0 1 0 -1 0 -1 0 0 tnons 0.0000000 0.0000000 0.0000000 0.5000000 0.0000000 0.5000000 0.5000000 0.5000000 0.0000000 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000 0.5000000 0.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000 0.5000000 0.0000000 0.5000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.5000000 0.5000000 0.5000000 0.0000000 0.5000000 0.0000000 0.0000000 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.5000000 0.5000000 0.5000000 0.0000000 tolvrs 1.00000000E-13 typat 1 1 1 1 xangst 3.7042404601E-02 3.7042404601E-02 3.7042404601E-02 1.8150778255E+00 1.8150778255E+00 3.7042404601E-02 1.8150778255E+00 3.7042404601E-02 1.8150778255E+00 3.7042404601E-02 1.8150778255E+00 1.8150778255E+00 xcart 7.0000000000E-02 7.0000000000E-02 7.0000000000E-02 3.4300000000E+00 3.4300000000E+00 7.0000000000E-02 3.4300000000E+00 7.0000000000E-02 3.4300000000E+00 
7.0000000000E-02 3.4300000000E+00 3.4300000000E+00 xred 1.0000000000E-02 1.0000000000E-02 1.0000000000E-02 4.9000000000E-01 4.9000000000E-01 1.0000000000E-02 4.9000000000E-01 1.0000000000E-02 4.9000000000E-01 1.0000000000E-02 4.9000000000E-01 4.9000000000E-01 znucl 26.00000 ================================================================================ - Timing analysis has been suppressed with timopt=0 ================================================================================ Suggested references for the acknowledgment of ABINIT usage. The users of ABINIT have little formal obligations with respect to the ABINIT group (those specified in the GNU General Public License, http://www.gnu.org/copyleft/gpl.txt). However, it is common practice in the scientific literature, to acknowledge the efforts of people that have made the research possible. In this spirit, please find below suggested citations of work written by ABINIT developers, corresponding to implementations inside of ABINIT that you have used in the present run. Note also that it will be of great value to readers of publications presenting these results, to read papers enabling them to understand the theoretical formalism and details of the ABINIT implementation. For information on why they are suggested, see also https://docs.abinit.org/theory/acknowledgments. - - [1] Libxc: A library of exchange and correlation functionals for density functional theory. - M.A.L. Marques, M.J.T. Oliveira, T. Burnus, Computer Physics Communications 183, 2227 (2012). - Comment: to be cited when LibXC is used (negative value of ixc) - Strong suggestion to cite this paper. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#marques2012 - - [2] The Abinit project: Impact, environment and recent developments. - Computer Phys. Comm. 248, 107042 (2020). - X.Gonze, B. Amadon, G. 
Antonius, F.Arnardi, L.Baguet, J.-M.Beuken, - J.Bieder, F.Bottin, J.Bouchet, E.Bousquet, N.Brouwer, F.Bruneval, - G.Brunin, T.Cavignac, J.-B. Charraud, Wei Chen, M.Cote, S.Cottenier, - J.Denier, G.Geneste, Ph.Ghosez, M.Giantomassi, Y.Gillet, O.Gingras, - D.R.Hamann, G.Hautier, Xu He, N.Helbig, N.Holzwarth, Y.Jia, F.Jollet, - W.Lafargue-Dit-Hauret, K.Lejaeghere, M.A.L.Marques, A.Martin, C.Martins, - H.P.C. Miranda, F.Naccarato, K. Persson, G.Petretto, V.Planes, Y.Pouillon, - S.Prokhorenko, F.Ricci, G.-M.Rignanese, A.H.Romero, M.M.Schmitt, M.Torrent, - M.J.van Setten, B.Van Troeye, M.J.Verstraete, G.Zerah and J.W.Zwanzig - Comment: the fifth generic paper describing the ABINIT project. - Note that a version of this paper, that is not formatted for Computer Phys. Comm. - is available at https://www.abinit.org/sites/default/files/ABINIT20.pdf . - The licence allows the authors to put it on the Web. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#gonze2020 - - [3] Optimized norm-conserving Vanderbilt pseudopotentials. - D.R. Hamann, Phys. Rev. B 88, 085117 (2013). - Comment: Some pseudopotential generated using the ONCVPSP code were used. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#hamann2013 - - [4] ABINIT: Overview, and focus on selected capabilities - J. Chem. Phys. 152, 124102 (2020). - A. Romero, D.C. Allan, B. Amadon, G. Antonius, T. Applencourt, L.Baguet, - J.Bieder, F.Bottin, J.Bouchet, E.Bousquet, F.Bruneval, - G.Brunin, D.Caliste, M.Cote, - J.Denier, C. Dreyer, Ph.Ghosez, M.Giantomassi, Y.Gillet, O.Gingras, - D.R.Hamann, G.Hautier, F.Jollet, G. Jomard, - A.Martin, - H.P.C. Miranda, F.Naccarato, G.Petretto, N.A. Pike, V.Planes, - S.Prokhorenko, T. Rangel, F.Ricci, G.-M.Rignanese, M.Royo, M.Stengel, M.Torrent, - M.J.van Setten, B.Van Troeye, M.J.Verstraete, J.Wiktor, J.W.Zwanziger, and X.Gonze. - Comment: a global overview of ABINIT, with focus on selected capabilities . 
- Note that a version of this paper, that is not formatted for J. Chem. Phys - is available at https://www.abinit.org/sites/default/files/ABINIT20_JPC.pdf . - The licence allows the authors to put it on the Web. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#romero2020 - - [5] Recent developments in the ABINIT software package. - Computer Phys. Comm. 205, 106 (2016). - X.Gonze, F.Jollet, F.Abreu Araujo, D.Adams, B.Amadon, T.Applencourt, - C.Audouze, J.-M.Beuken, J.Bieder, A.Bokhanchuk, E.Bousquet, F.Bruneval - D.Caliste, M.Cote, F.Dahm, F.Da Pieve, M.Delaveau, M.Di Gennaro, - B.Dorado, C.Espejo, G.Geneste, L.Genovese, A.Gerossier, M.Giantomassi, - Y.Gillet, D.R.Hamann, L.He, G.Jomard, J.Laflamme Janssen, S.Le Roux, - A.Levitt, A.Lherbier, F.Liu, I.Lukacevic, A.Martin, C.Martins, - M.J.T.Oliveira, S.Ponce, Y.Pouillon, T.Rangel, G.-M.Rignanese, - A.H.Romero, B.Rousseau, O.Rubel, A.A.Shukri, M.Stankovski, M.Torrent, - M.J.Van Setten, B.Van Troeye, M.J.Verstraete, D.Waroquier, J.Wiktor, - B.Xu, A.Zhou, J.W.Zwanziger. - Comment: the fourth generic paper describing the ABINIT project. - Note that a version of this paper, that is not formatted for Computer Phys. Comm. - is available at https://www.abinit.org/sites/default/files/ABINIT16.pdf . - The licence allows the authors to put it on the Web. - DOI and bibtex: see https://docs.abinit.org/theory/bibliography/#gonze2016 - - Proc. 0 individual time (sec): cpu= 6.3 wall= 6.8 ================================================================================ Calculation completed. .Delivered 1 WARNINGs and 2 COMMENTs to log file. +Overall time at end (sec) : cpu= 6.3 wall= 6.8
If you have more than two processors at hand, you can increase the value of ngkpt, so that more than one k-point is available, and check that the k-point and spin parallelisms indeed work concurrently.
Number of computing cores to accomplish a task¶
Balancing the load efficiently across the processors is not always straightforward. When using k-point and spin parallelism, the ideal numbers of processors are those that divide the product of nsppol by nkpt (e.g. for nsppol * nkpt = 12, it is quite efficient to use 2, 3, 4, 6 or 12 processors). ABINIT will nevertheless handle other numbers of processors correctly, albeit slightly less efficiently, as the total run time is determined by the processor that has the largest share of the work.
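The divisibility argument above can be illustrated with a small sketch. It is not part of ABINIT: the function name `kpt_spin_load` is hypothetical, and the model assumes that every (spin, k-point) pair costs the same and ignores communication entirely.

```python
from math import ceil


def kpt_spin_load(nsppol: int, nkpt: int, nproc: int) -> tuple[int, float]:
    """Idealized load balance for k-point and spin parallelism.

    Assumption (not taken from ABINIT itself): all nsppol * nkpt tasks
    cost the same and are spread as evenly as possible over nproc.
    Returns (largest number of tasks on one processor, efficiency).
    """
    ntasks = nsppol * nkpt
    # The busiest processor determines the total run time.
    max_share = ceil(ntasks / nproc)
    # Efficiency = useful work / (processors * time of the busiest one).
    efficiency = ntasks / (nproc * max_share)
    return max_share, efficiency


if __name__ == "__main__":
    # Example from the text: nsppol * nkpt = 12.
    for nproc in range(1, 13):
        share, eff = kpt_spin_load(nsppol=2, nkpt=6, nproc=nproc)
        print(f"nproc={nproc:2d}  max share={share:2d}  efficiency={eff:.2f}")
```

In this simplified picture, the efficiency is 1 for every divisor of 12 and drops in between: with 5 processors, for instance, the busiest one still treats 3 tasks, so the run is no faster than with 4 processors.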
Evidencing overhead¶
Beyond a certain number of processors, the efficiency of parallelism saturates, and may even decrease. This is due to the inevitable overhead resulting from the increasing amount of communication between the processors. The loss of efficiency depends strongly on the implementation, and is also linked to the decreasing load on each processor.
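The saturation described above can be mimicked with a toy timing model. This is only a sketch under stated assumptions, not a description of how ABINIT behaves: the run time is taken as a sequential part, a perfectly parallel part, and a communication cost that grows linearly with the number of processors. The function name and the values of `serial_fraction` and `comm_cost` are hypothetical.

```python
def toy_speedup(nproc: int,
                serial_fraction: float = 0.05,
                comm_cost: float = 0.002) -> float:
    """Speedup in a toy model, with the one-processor run time set to 1.

    Assumptions (illustrative only):
      - serial_fraction of the work cannot be parallelized;
      - the rest is divided evenly among nproc processors;
      - communication adds comm_cost per additional processor.
    """
    time = (serial_fraction
            + (1.0 - serial_fraction) / nproc
            + comm_cost * (nproc - 1))
    return 1.0 / time


if __name__ == "__main__":
    for nproc in (1, 2, 4, 8, 16, 22, 32, 64, 128, 200):
        s = toy_speedup(nproc)
        # Efficiency is the speedup per processor.
        print(f"nproc={nproc:3d}  speedup={s:5.2f}  efficiency={s / nproc:.2f}")
```

With these illustrative parameters, the speedup grows up to roughly twenty processors and then decreases, while the efficiency decreases steadily. Real timings have to be measured on the target machine.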