1. An overview of PySCF

Python-based simulations of chemistry framework (PySCF) is a general-purpose electronic structure platform designed from the ground up to emphasize code simplicity, so as to facilitate new method development and enable flexible computational workflows. The package provides a wide range of tools to support simulations of finite-size systems, extended systems with periodic boundary conditions, low-dimensional periodic systems, and custom Hamiltonians, using mean-field and post-mean-field methods with standard Gaussian basis functions. To ensure ease of extensibility, PySCF uses the Python language to implement almost all of its features, while computationally critical paths are implemented with heavily optimized C routines. Using this combined Python/C implementation, the package is as efficient as the best existing C or Fortran- based quantum chemistry programs.

1.1. How to cite

Bibtex entry:

@Misc{PYSCF,
  title = {PySCF: the Python‐based simulations of chemistry framework},
  author = {Qiming Sun and Timothy C. Berkelbach and Nick S. Blunt and George H. Booth and Sheng Guo and Zhendong Li and Junzi Liu and James D. McClain and Elvira R. Sayfutyarova and Sandeep Sharma and Sebastian Wouters and Garnet Kin‐Lic Chan},
  year = {2017},
  journal = {Wiley Interdisciplinary Reviews: Computational Molecular Science},
  volume = {8},
  number = {1},
  pages = {e1340},
  doi = {10.1002/wcms.1340},
  url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/wcms.1340},
  eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/wcms.1340},
}

1.2. Features

  • Hartree-Fock
    • Non-relativistic restricted open-shell, unrestricted HF (~5000 basis for serial version, ~30000 basis in MPI mode)
    • Scalar relativistic HF
    • 2-component relativistic HF
    • 4-component relativistic Dirac-Hartree-Fock
    • Density fitting HF
    • Second order SCF
    • General J/K build function
    • DIIS, EDIIS, ADIIS and second order solver
    • SCF wavefunction (RHF, UHF, GHF) stability analysis
    • Generalized Hartree-Fock (GHF)
  • DFT
    • Non-relativistic restricted, restricted open-shell, unrestricted Kohn-Sham (~5000 basis for serial version, ~30000 basis in MPI mode)
    • Scalar relativistic DFT
    • Density fitting DFT
    • General XC functional evaluator (for Libxc or XcFun)
    • General AO evaluator
    • VV10 NLC functional for finite size systems
    • range-separated hybrid features for RKS and UKS, including > Analytical nuclear gradients > Second order SCF > Hessian and frequency > TDDFT > TDDFT nuclear gradients > NMR
  • TDSCF/TDDFT
    • TDA (and density-fitting TDA) for RHF, UHF, RKS and UKS methods
    • TDHF (and density-fitting TDHF) for RHF and UHF methods
    • TDDFT (and density-fitting TDDFT) for RKS and UKS methods
    • TDA nuclear gradients for RHF, UHF, RKS and UKS methods
    • TDHF nuclear gradients for RHF and UHF methods
    • TDDFT nuclear gradients for RKS and UKS methods
    • Natural transition orbital analysis
    • direct-RPA (no exchange, aka TDH)
    • direct-TDA (TDA without exchange)
  • GW methods - G0W0 approximation
  • General CASCI/CASSCF solver (up to ~3000 basis)
    • State-average CASCI/CASSCF
    • State-specific CASCI/CASSCF for excited states
    • Multiple roots CASCI
    • Support DMRG as plugin FCI solver to do DMRG-CASSCF
    • Support FCIQMC as plugin FCI solver to do FCIQMC-CASSCF
    • Support Selected CI algorithm as plugin FCI solver to do SHCI-CASSCF
    • UCASSCF
    • Density-fitting CASSCF
    • DMET-CAS and AVAS active space constructor
    • CASCI and CASSCF analytical nuclear gradients
  • MP2 (up to ~200 occupied, ~2000 virtual orbitals)
    • Canonical RMP2, UMP2, GMP2
    • Density-fitting RMP2
    • RMP2, UMP2 and GMP2 1-particle and 2-particle density matrices
    • RMP2 and UMP2 nuclear gradients
  • CCSD (up to ~100 occupied, ~1500 virtual orbitals)
    • canonical RCCSD, UCCSD
    • canonical RCCSD, UCCSD lambda solver
    • RCCSD, UCCSD and GCCSD 1-particle and 2-particle density matrices
    • RCCSD and UCCSD nuclear gradients
    • EOM-IP/EA/EE-RCCSD and EOM-IP/EA/EE-UCCSD
    • RCC2
    • Density-fitting RCCSD
  • CCSD(T)
    • RCCSD(T) and UCCSD(T)
    • RCCSD(T), UCCSD(T) and GCCSD(T) 1- and 2-particle density matrices
    • RCCSD(T) and UCCSD(T) analytical nuclear gradients
  • CI
    • RCISD, UCISD and GCISD
    • RCISD, UCISD and GCISD 1, 2-particle density matrices
    • Selected-CI
    • Selected-CI 1, 2-particle density matrices
    • RCISD, UCISD and GCISD 1-particle transition density matrices
  • Full CI
    • Direct-CI solver for spin degenerated Hamiltonian (RHF-FCI)
    • Direct-CI solver for spin non-degenerated Hamiltonian (UHF-FCI)
    • 1, and 2-particle transition density matrices
    • 1, 2, 3, and 4-particle density matrices
    • CI wavefunction overlap
  • Analytical Nuclear Gradients
    • Non-relativistic HF nuclear gradients
    • 4-component DHF nuclear gradients
    • Non-relativistic DFT nuclear gradients
    • Non-relativistic CISD nuclear gradients
    • Non-relativistic CCSD and CCSD(T) nuclear gradients
    • Non-relativistic CASCI and CASSCF nuclear gradients
    • Non-relativistic TDA, TDHF and TDDFT nuclear gradients
    • Non-relativistic nuclear gradients with SF-X2C-1e correction
    • ECP nuclear gradients
    • nuclear gradients for solvent model ddCOSMO
    • Frozen orbitals for MP2, CISD, CCSD, CCSD(T), CASCI, CASSCF nuclear gradients
  • Nuclear Hessian
    • Non-relativistic HF nuclear hessian
    • Non-relativistic DFT nuclear hessian
    • Non-relativistic nuclear hessian with SF-X2C-1e correction
    • ECP nuclear hessian
  • Properties
    • Non-relativistic RHF, UHF, RKS, UKS NMR shielding
    • 4-component DHF NMR shielding
    • Non-relativistic RHF, UHF spin-spin coupling
    • 4-component DHF spin-spin coupling
    • Non-relativistic UHF, UKS hyperfine coupling
    • 4-component DHF hyperfine coupling
    • Non-relativistic UHF, UKS g-tensor
    • 4-component DHF g-tensor
    • Non-relativistic UHF zero-field splitting
    • Molecular electrostatic potential (MEP)
    • EFG and Mossbauer spectroscopy
    • Non-relativistic RHF, UHF, RKS, UKS magnetizability
  • MRPT
    • Strongly contracted NEVPT2 (SC-NEVPT2)
    • DMRG-NEVPT2
    • IC-MPS-PT2
  • Extended systems with periodic boundary condition
    • gamma point RHF, ROHF, UHF, RKS, ROKS, UKS
    • gamma point TDDFT, MP2, CCSD
    • RHF, ROHF, UHF, GHF, RKS, ROKS, UKS with k-point sampling
    • Restricted MP2 with k-point sampling
    • KRCCSD (RCCSD with k-point sampling)
    • KUCCSD
    • KGCCSD (Generalized CCSD with k-point sampling)
    • k-point GCCSD(T) and RCCSD(T)
    • k-point EOM-IP/EA-CCSD
    • PBC AO integrals
    • PBC MO integral transformation
    • PBC density fitting and mixed-density fitting methods
    • Smearing for mean-field methods
    • Low-dimensional (0D, 1D, 2D) PBC systems
    • (restricted and unrestricted) TDA, TDHF and TDDFT with k-point sampling
    • Multigrid DFT
    • EFG and Mossbauer spectroscopy
  • Relativistic effects
    • 4-component HF with Dirac-Coulomb Hamiltonian (DHF)
    • 4-component DHF with Gaunt and Breit corrections
    • 2-component X2C HF
    • 4-component and 2-component Kohn-Sham DFT (LDA only)
  • AO integrals
    • Interface to access all AO integrals of Libcint library
    • 1-electron real-GTO and spinor-GTO integrals
    • 2-electron real-GTO and spinor-GTO integrals
    • 3-center 1-electron real-GTO and spinor-GTO integrals
    • 3-center 2-electron real-GTO and spinor-GTO integrals
    • General basis value evaluator (for numeric integration)
    • PBC 1-electron integrals
    • PBC 2-electron integrals
    • F12 integrals
  • MO integrals
    • 2-electron integral transformation for any integrals provided by Libcint library
    • Support for 4-index integral transformation with 4 different orbitals
    • PBC 2-electron MO integrals
    • Integral transformation for (4-component and 2-compoent relativistic) spinor integrals
  • Localizer
    • Boys
    • Edmiston
    • Meta-Lowdin for both finite size and PBC systems
    • Natural atomic orbital (NAO) for both finite size and PBC systems
    • Intrinsic atomic orbital (IAO) for both finite size and PBC systems
    • Pipek-Mezey for both finite size and PBC systems
    • Intrinsic bond orbital (IBO) for both finite size and PBC systems
  • Geometry optimization
    • HF, DFT, CCSD, CCSD(T), CISD, CASCI, CASSCF and TDSCF/TDDFT with pyberny geometry optimizer
  • D2h symmetry and linear molecule symmetry
    • Molecule symmetry detection
    • Symmetry adapted basis
    • Label orbital symmetry on the fly
    • Hot update symmetry information
    • Function to symmetrize given orbital space
  • Solvent model - ddCOSMO - ddPCM - ddCOSMO analytical nuclear gradients
  • Tools
    • fcidump writer
    • molden writer and reader
    • cubegen writer
    • Molpro XML reader
    • (GAMESS-format) wfn writer
    • Vasp CHGCAR-format writer
  • Interface to integral package Libcint
  • Interface to DMRG CheMPS2
  • Interface to DMRG Block
  • Interface to FCIQMC NECI
  • Interface to XC functional library XCFun
  • Interface to XC functional library Libxc
  • Interface to tensor contraction library TBLIS
  • Interface to Heat-bath Selected CI program Dice
  • Interface to geometry optimizer Pyberny

1.3. Designs

1.3.1. Kernel and Stream functions

Every class has the kernel method which serves as the entry or the driver of the method. Once an object of one method was created, you can always call .kernel() to start or restart a calculation.

The return value of kernel method is different for different class. To unify the return value, the package introduces the stream methods to pipe the computing stream. A stream method of an object only return the object itself. There are three general stream methods available for most method classes. They are:

1 .set method to update object attributes, for example:

mf = scf.RHF(mol).set(conv_tol=1e-5)

is identical to two lines of statements:

mf = scf.RHF(mol)
mf.conv_tol = 1e-5

2 .run method to pass the call to the .kernel method. If arguments are presented in .run method, the arguments will be passed to the kernel function. If keyword arguments are given, .run method will first call .set method to update the attributes then execute the .kernel method. For example:

mf = scf.RHF(mol).run(dm_init, conv_tol=1e-5)

is identical to three lines of statements:

mf = scf.RHF(mol)
mf.conv_tol = 1e-5
mf.kernel(dm_init)

3 .apply method to pass the current object (as the first argument) to the given function/class and return a new object. If arguments and keyword arguments are presented, they will all be passed to the function/class. For example:

mc = mol.apply(scf.RHF).run().apply(mcscf.CASSCF, 6, 4, frozen=4)

is identical to:

mf = scf.RHF(mol)
mf.kernel()
mc = mcscf.CASSCF(mf, 6, 4, frozen=4)

Aside from the three general stream methods, the regular class methods may return the objects as well when the methods do not have particular value to return. Using the stream methods, you can evaluate certain quantities with one line of code:

dm = gto.M(atom='H 0 0 0; H 0 0 1') \
.apply(scf.RHF) \
.dump_flags() \
.run() \
.make_rdm1()

1.3.2. Pure function and Class

Class are designed to hold only the final results and the control parameters such as maximum number of iterations, convergence threshold, etc. Intermediates are NOT saved in the class. After calling the .kernel() or .run() method, results will be generated and saved in the object. For example:

from pyscf import gto, scf, ccsd
mol = gto.M(atom='H 0 0 0; H 0 0 1.1', basis='ccpvtz')
mf = scf.RHF(mol).run()
mycc = ccsd.CCSD(mf).run()
print(mycc.e_tot)
print(mycc.e_corr)
print(mycc.t1.shape)
print(mycc.t2.shape)

Many useful functions are defined at both the module level and class level. They can be accessed from either the module functions or the class methods and the return values should be the same:

vj, vk = scf.hf.get_jk(mol, dm)
vj, vk = SCF(mol).get_jk(mol, dm)

Note some module functions may require the class as the first argument.

Most functions and classes are pure, i.e. no intermediate status are held within the classes, and the argument of the methods and functions are immutable during calculations. These functions can be called arbitrary times in arbitrary order and their returns should be always the same.

Exceptions are often suffixed with underscore in the function name, e.g. mcscf.state_average_(mc) where the attributes of mc object may be changed or overwritten by the state_average_ method. Cautious should be taken when you see the functions or methods with ugly suffices.

1.3.3. Global configurations

Global configuration file is a Python script that contains PySCF configurations. When importing pyscf module in a Python program (or Python interpreter), the package will preload the global configuration file and take the configurations as the default values of the parameters of functions or attributes of classes during initialization. For example, the configuration file below detects the available memory in the operate system at the runtime and set the maximum memory for PySCF:

$ cat ~/.pyscf_conf.py
import psutil
total, available, percent, used, free, active, inactive, buffers, cached, shared = psutil.virtual_memory()
MAX_MEMORY = available

By setting MAX_MEMORY in the global configuration file, you don’t need the statement to set the max_memory attribute in every script. The dynamically determined max_memory will be loaded during the program initialization step automatically.

There are two methods to let the PySCF package load the global configurations. One is to create a configuration file .pyscf_conf.py in home directory or in work directory. Another is to set the environment variable PYSCF_CONFIG_FILE which points to the configuration file (with the absolute path). The environment variable PYSCF_CONFIG_FILE has high priority than the configuration file in default locations (home directory or work directory). If environment variable PYSCF_CONFIG_FILE is available, the program will read the configurations from the $PYSCF_CONFIG_FILE. If PYSCF_CONFIG_FILE is not set or the file it points to does not exist, the program will turn to the default location for the file .pyscf_conf.py. If none of the configuration file exists, the program will use the built-in configurations which are generally conservative settings.

In the source code, global configurations are loaded by importing pyscf.__config__ module:

from pyscf import __config__
MAX_MEMORY = getattr(__config__, 'MAX_MEMORY')

Please refer to the source code for the available configurations.

1.3.4. Scanner

Scanner is a function that takes an Mole (or Cell) object as input and return the energy or nuclear gradients of the given Mole (or Cell) object. Scanner can be considered as a shortcut function for a sequence of statements which includes the initialization of a required calculation model with necessary precomputing, next updating the attributes based on the settings of the referred object, then calling kernel function and finally returning results. For example:

cc_scanner = gto.M().apply(scf.RHF).apply(cc.CCSD).as_scanner()
for r in (1.0, 1.1, 1.2):
  print(cc_scanner(gto.M(atom='H 0 0 0; H 0 0 %g'%r)))

An equivalent but slightly complicated code is:

for r in (1.0, 1.1, 1.2):
  mol = gto.M(atom='H 0 0 0; H 0 0 %g'%r)
  mf = scf.RHF(mol).run()
  mycc = cc.CCSD(mf).run()
  print(mycc.e_tot)

There are two types of scanner available in the package. They are energy scanner and nuclear gradients scanner. The example above is the energy scanner. Energy scanner only returns the energy of the given molecular structure while the nuclear gradients scanner returns the nuclear gradients in addition.

Scanner is a special derived object of the caller. Most methods which are defined in the caller class can be used with the scanner object. For example:

mf_scanner = gto.M().apply(scf.RHF).as_scanner()
mf_scanner(gto.M(atom='H 0 0 0; H 0 0 1.2'))
mf_scanner.analyze()
dm1 = mf_scanner.make_rdm1()

mf_grad_scanner = mf_scanner.nuc_grad_method().as_scanner()
mf_grad_scanner(gto.M(atom='H 0 0 0; H 0 0 1.2'))

As shown in the example above, the scanner works pretty close to the relevant class object except that the scanner does not need the kernel or run methods to run a calculation. Given molecule structure, the scanner automatically checks and updates the necessary object dependence and passes the work flow to the kernel method. The computational results are held in the scanner object as the regular class object does.

To make structure of scanner object uniform for all methods, two attributes (.e_tot and .converged) are defined for all energy scanner and three attributes (.e_tot, .de and .converged) are defined for all nuclear gradients scanner.