# 1. An overview of PySCF¶

Python-based simulations of chemistry framework (PySCF) is a general-purpose electronic structure platform designed from the ground up to emphasize code simplicity, so as to facilitate new method development and enable flexible computational workflows. The package provides a wide range of tools to support simulations of finite-size systems, extended systems with periodic boundary conditions, low-dimensional periodic systems, and custom Hamiltonians, using mean-field and post-mean-field methods with standard Gaussian basis functions. To ensure ease of extensibility, PySCF uses the Python language to implement almost all of its features, while computationally critical paths are implemented with heavily optimized C routines. Using this combined Python/C implementation, the package is as efficient as the best existing C or Fortran- based quantum chemistry programs.

## 1.1. How to cite¶

Bibtex entry:

```
@Misc{PYSCF,
title = {PySCF: the Python‐based simulations of chemistry framework},
author = {Qiming Sun and Timothy C. Berkelbach and Nick S. Blunt and George H. Booth and Sheng Guo and Zhendong Li and Junzi Liu and James D. McClain and Elvira R. Sayfutyarova and Sandeep Sharma and Sebastian Wouters and Garnet Kin‐Lic Chan},
year = {2017},
journal = {Wiley Interdisciplinary Reviews: Computational Molecular Science},
volume = {8},
number = {1},
pages = {e1340},
doi = {10.1002/wcms.1340},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/wcms.1340},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/wcms.1340},
}
```

## 1.2. Features¶

- Hartree-Fock
- Non-relativistic restricted open-shell, unrestricted HF (~5000 basis for serial version, ~30000 basis in MPI mode)
- Scalar relativistic HF
- 2-component relativistic HF
- 4-component relativistic Dirac-Hartree-Fock
- Density fitting HF
- Second order SCF
- General J/K build function
- DIIS, EDIIS, ADIIS and second order solver
- SCF wavefunction (RHF, UHF, GHF) stability analysis
- Generalized Hartree-Fock (GHF)

- DFT
- Non-relativistic restricted, restricted open-shell, unrestricted Kohn-Sham (~5000 basis for serial version, ~30000 basis in MPI mode)
- Scalar relativistic DFT
- Density fitting DFT
- General XC functional evaluator (for Libxc or XcFun)
- General AO evaluator
- VV10 NLC functional for finite size systems
- range-separated hybrid features for RKS and UKS, including > Analytical nuclear gradients > Second order SCF > Hessian and frequency > TDDFT > TDDFT nuclear gradients > NMR

- TDSCF/TDDFT
- TDA (and density-fitting TDA) for RHF, UHF, RKS and UKS methods
- TDHF (and density-fitting TDHF) for RHF and UHF methods
- TDDFT (and density-fitting TDDFT) for RKS and UKS methods
- TDA nuclear gradients for RHF, UHF, RKS and UKS methods
- TDHF nuclear gradients for RHF and UHF methods
- TDDFT nuclear gradients for RKS and UKS methods
- Natural transition orbital analysis
- direct-RPA (no exchange, aka TDH)
- direct-TDA (TDA without exchange)

- GW methods - G0W0 approximation
- General CASCI/CASSCF solver (up to ~3000 basis)
- State-average CASCI/CASSCF
- State-specific CASCI/CASSCF for excited states
- Multiple roots CASCI
- Support DMRG as plugin FCI solver to do DMRG-CASSCF
- Support FCIQMC as plugin FCI solver to do FCIQMC-CASSCF
- Support Selected CI algorithm as plugin FCI solver to do SHCI-CASSCF
- UCASSCF
- Density-fitting CASSCF
- DMET-CAS and AVAS active space constructor
- CASCI and CASSCF analytical nuclear gradients

- MP2 (up to ~200 occupied, ~2000 virtual orbitals)
- Canonical RMP2, UMP2, GMP2
- Density-fitting RMP2
- RMP2, UMP2 and GMP2 1-particle and 2-particle density matrices
- RMP2 and UMP2 nuclear gradients

- CCSD (up to ~100 occupied, ~1500 virtual orbitals)
- canonical RCCSD, UCCSD
- canonical RCCSD, UCCSD lambda solver
- RCCSD, UCCSD and GCCSD 1-particle and 2-particle density matrices
- RCCSD and UCCSD nuclear gradients
- EOM-IP/EA/EE-RCCSD and EOM-IP/EA/EE-UCCSD
- RCC2
- Density-fitting RCCSD

- CCSD(T)
- RCCSD(T) and UCCSD(T)
- RCCSD(T), UCCSD(T) and GCCSD(T) 1- and 2-particle density matrices
- RCCSD(T) and UCCSD(T) analytical nuclear gradients

- CI
- RCISD, UCISD and GCISD
- RCISD, UCISD and GCISD 1, 2-particle density matrices
- Selected-CI
- Selected-CI 1, 2-particle density matrices
- RCISD, UCISD and GCISD 1-particle transition density matrices

- Full CI
- Direct-CI solver for spin degenerated Hamiltonian (RHF-FCI)
- Direct-CI solver for spin non-degenerated Hamiltonian (UHF-FCI)
- 1, and 2-particle transition density matrices
- 1, 2, 3, and 4-particle density matrices
- CI wavefunction overlap

- Analytical Nuclear Gradients
- Non-relativistic HF nuclear gradients
- 4-component DHF nuclear gradients
- Non-relativistic DFT nuclear gradients
- Non-relativistic CISD nuclear gradients
- Non-relativistic CCSD and CCSD(T) nuclear gradients
- Non-relativistic CASCI and CASSCF nuclear gradients
- Non-relativistic TDA, TDHF and TDDFT nuclear gradients
- Non-relativistic nuclear gradients with SF-X2C-1e correction
- ECP nuclear gradients
- nuclear gradients for solvent model ddCOSMO
- Frozen orbitals for MP2, CISD, CCSD, CCSD(T), CASCI, CASSCF nuclear gradients

- Nuclear Hessian
- Non-relativistic HF nuclear hessian
- Non-relativistic DFT nuclear hessian
- Non-relativistic nuclear hessian with SF-X2C-1e correction
- ECP nuclear hessian

- Properties
- Non-relativistic RHF, UHF, RKS, UKS NMR shielding
- 4-component DHF NMR shielding
- Non-relativistic RHF, UHF spin-spin coupling
- 4-component DHF spin-spin coupling
- Non-relativistic UHF, UKS hyperfine coupling
- 4-component DHF hyperfine coupling
- Non-relativistic UHF, UKS g-tensor
- 4-component DHF g-tensor
- Non-relativistic UHF zero-field splitting
- Molecular electrostatic potential (MEP)
- EFG and Mossbauer spectroscopy
- Non-relativistic RHF, UHF, RKS, UKS magnetizability

- MRPT
- Strongly contracted NEVPT2 (SC-NEVPT2)
- DMRG-NEVPT2
- IC-MPS-PT2

- Extended systems with periodic boundary condition
- gamma point RHF, ROHF, UHF, RKS, ROKS, UKS
- gamma point TDDFT, MP2, CCSD
- RHF, ROHF, UHF, GHF, RKS, ROKS, UKS with k-point sampling
- Restricted MP2 with k-point sampling
- KRCCSD (RCCSD with k-point sampling)
- KUCCSD
- KGCCSD (Generalized CCSD with k-point sampling)
- k-point GCCSD(T) and RCCSD(T)
- k-point EOM-IP/EA-CCSD
- PBC AO integrals
- PBC MO integral transformation
- PBC density fitting and mixed-density fitting methods
- Smearing for mean-field methods
- Low-dimensional (0D, 1D, 2D) PBC systems
- (restricted and unrestricted) TDA, TDHF and TDDFT with k-point sampling
- Multigrid DFT
- EFG and Mossbauer spectroscopy

- Relativistic effects
- 4-component HF with Dirac-Coulomb Hamiltonian (DHF)
- 4-component DHF with Gaunt and Breit corrections
- 2-component X2C HF
- 4-component and 2-component Kohn-Sham DFT (LDA only)

- AO integrals
- Interface to access all AO integrals of Libcint library
- 1-electron real-GTO and spinor-GTO integrals
- 2-electron real-GTO and spinor-GTO integrals
- 3-center 1-electron real-GTO and spinor-GTO integrals
- 3-center 2-electron real-GTO and spinor-GTO integrals
- General basis value evaluator (for numeric integration)
- PBC 1-electron integrals
- PBC 2-electron integrals
- F12 integrals

- MO integrals
- 2-electron integral transformation for any integrals provided by Libcint library
- Support for 4-index integral transformation with 4 different orbitals
- PBC 2-electron MO integrals
- Integral transformation for (4-component and 2-compoent relativistic) spinor integrals

- Localizer
- Boys
- Edmiston
- Meta-Lowdin for both finite size and PBC systems
- Natural atomic orbital (NAO) for both finite size and PBC systems
- Intrinsic atomic orbital (IAO) for both finite size and PBC systems
- Pipek-Mezey for both finite size and PBC systems
- Intrinsic bond orbital (IBO) for both finite size and PBC systems

- Geometry optimization
- HF, DFT, CCSD, CCSD(T), CISD, CASCI, CASSCF and TDSCF/TDDFT with pyberny geometry optimizer

- D2h symmetry and linear molecule symmetry
- Molecule symmetry detection
- Symmetry adapted basis
- Label orbital symmetry on the fly
- Hot update symmetry information
- Function to symmetrize given orbital space

- Solvent model - ddCOSMO - ddPCM - ddCOSMO analytical nuclear gradients
- Tools
- fcidump writer
- molden writer and reader
- cubegen writer
- Molpro XML reader
- (GAMESS-format) wfn writer
- Vasp CHGCAR-format writer

- Interface to integral package Libcint
- Interface to DMRG CheMPS2
- Interface to DMRG Block
- Interface to FCIQMC NECI
- Interface to XC functional library XCFun
- Interface to XC functional library Libxc
- Interface to tensor contraction library TBLIS
- Interface to Heat-bath Selected CI program Dice
- Interface to geometry optimizer Pyberny

## 1.3. Designs¶

### 1.3.1. Kernel and Stream functions¶

Every class has the `kernel`

method which serves as the entry or the driver of
the method. Once an object of one method was created, you can always call
`.kernel()`

to start or restart a calculation.

The return value of kernel method is different for different class. To unify the return value, the package introduces the stream methods to pipe the computing stream. A stream method of an object only return the object itself. There are three general stream methods available for most method classes. They are:

1 `.set`

method to update object attributes, for example:

```
mf = scf.RHF(mol).set(conv_tol=1e-5)
```

is identical to two lines of statements:

```
mf = scf.RHF(mol)
mf.conv_tol = 1e-5
```

2 `.run`

method to pass the call to the `.kernel`

method. If arguments are
presented in `.run`

method, the arguments will be passed to the kernel
function. If keyword arguments are given, `.run`

method will first
call `.set`

method to update the attributes then execute the `.kernel`

method. For example:

```
mf = scf.RHF(mol).run(dm_init, conv_tol=1e-5)
```

is identical to three lines of statements:

```
mf = scf.RHF(mol)
mf.conv_tol = 1e-5
mf.kernel(dm_init)
```

3 `.apply`

method to pass the current object (as the first argument) to the
given function/class and return a new object. If arguments and keyword
arguments are presented, they will all be passed to the function/class. For
example:

```
mc = mol.apply(scf.RHF).run().apply(mcscf.CASSCF, 6, 4, frozen=4)
```

is identical to:

```
mf = scf.RHF(mol)
mf.kernel()
mc = mcscf.CASSCF(mf, 6, 4, frozen=4)
```

Aside from the three general stream methods, the regular class methods may return the objects as well when the methods do not have particular value to return. Using the stream methods, you can evaluate certain quantities with one line of code:

```
dm = gto.M(atom='H 0 0 0; H 0 0 1') \
.apply(scf.RHF) \
.dump_flags() \
.run() \
.make_rdm1()
```

### 1.3.2. Pure function and Class¶

Class are designed to hold only the final results and the control parameters
such as maximum number of iterations, convergence threshold, etc.
Intermediates are NOT saved in the class. After calling the `.kernel()`

or
`.run()`

method, results will be generated and saved in the object. For
example:

```
from pyscf import gto, scf, ccsd
mol = gto.M(atom='H 0 0 0; H 0 0 1.1', basis='ccpvtz')
mf = scf.RHF(mol).run()
mycc = ccsd.CCSD(mf).run()
print(mycc.e_tot)
print(mycc.e_corr)
print(mycc.t1.shape)
print(mycc.t2.shape)
```

Many useful functions are defined at both the module level and class level. They can be accessed from either the module functions or the class methods and the return values should be the same:

```
vj, vk = scf.hf.get_jk(mol, dm)
vj, vk = SCF(mol).get_jk(mol, dm)
```

Note some module functions may require the class as the first argument.

Most functions and classes are **pure**, i.e. no intermediate status are held
within the classes, and the argument of the methods and functions are immutable
during calculations. These functions can be called arbitrary times in
arbitrary order and their returns should be always the same.

Exceptions are often suffixed with underscore in the function name, e.g.
`mcscf.state_average_(mc)`

where the attributes of `mc`

object may be
changed or overwritten by the `state_average_`

method. Cautious should be
taken when you see the functions or methods with ugly suffices.

### 1.3.3. Global configurations¶

Global configuration file is a Python script that contains PySCF configurations.
When importing `pyscf`

module in a Python program (or Python interpreter), the
package will preload the global configuration file and take the configurations
as the default values of the parameters of functions or attributes of classes
during initialization. For example, the configuration file below detects the
available memory in the operate system at the runtime and set the maximum memory
for PySCF:

```
$ cat ~/.pyscf_conf.py
import psutil
total, available, percent, used, free, active, inactive, buffers, cached, shared = psutil.virtual_memory()
MAX_MEMORY = available
```

By setting `MAX_MEMORY`

in the global configuration file, you don’t need the
statement to set the `max_memory`

attribute in every script. The dynamically
determined `max_memory`

will be loaded during the program initialization step
automatically.

There are two methods to let the PySCF package load the global configurations.
One is to create a configuration file `.pyscf_conf.py`

in home directory or
in work directory. Another is to set the environment variable
`PYSCF_CONFIG_FILE`

which points to the configuration file (with the absolute
path). The environment variable `PYSCF_CONFIG_FILE`

has high priority than
the configuration file in default locations (home directory or work directory).
If environment variable `PYSCF_CONFIG_FILE`

is available, the program will
read the configurations from the `$PYSCF_CONFIG_FILE`

. If
`PYSCF_CONFIG_FILE`

is not set or the file it points to does not exist, the
program will turn to the default location for the file `.pyscf_conf.py`

. If
none of the configuration file exists, the program will use the built-in
configurations which are generally conservative settings.

In the source code, global configurations are loaded by importing
`pyscf.__config__`

module:

```
from pyscf import __config__
MAX_MEMORY = getattr(__config__, 'MAX_MEMORY')
```

Please refer to the source code for the available configurations.

### 1.3.4. Scanner¶

Scanner is a function that takes an `Mole`

(or `Cell`

) object as input and
return the energy or nuclear gradients of the given `Mole`

(or `Cell`

)
object. Scanner can be considered as a shortcut function for a sequence of
statements which includes the initialization of a required calculation model
with necessary precomputing, next updating the attributes based on the settings
of the referred object, then calling kernel function and finally returning
results. For example:

```
cc_scanner = gto.M().apply(scf.RHF).apply(cc.CCSD).as_scanner()
for r in (1.0, 1.1, 1.2):
print(cc_scanner(gto.M(atom='H 0 0 0; H 0 0 %g'%r)))
```

An equivalent but slightly complicated code is:

```
for r in (1.0, 1.1, 1.2):
mol = gto.M(atom='H 0 0 0; H 0 0 %g'%r)
mf = scf.RHF(mol).run()
mycc = cc.CCSD(mf).run()
print(mycc.e_tot)
```

There are two types of scanner available in the package. They are *energy
scanner* and *nuclear gradients scanner*. The example above is the energy
scanner. Energy scanner only returns the energy of the given molecular
structure while the nuclear gradients scanner returns the nuclear gradients in
addition.

Scanner is a special derived object of the caller. Most methods which are defined in the caller class can be used with the scanner object. For example:

```
mf_scanner = gto.M().apply(scf.RHF).as_scanner()
mf_scanner(gto.M(atom='H 0 0 0; H 0 0 1.2'))
mf_scanner.analyze()
dm1 = mf_scanner.make_rdm1()
mf_grad_scanner = mf_scanner.nuc_grad_method().as_scanner()
mf_grad_scanner(gto.M(atom='H 0 0 0; H 0 0 1.2'))
```

As shown in the example above, the scanner works pretty close to the relevant
class object except that the scanner does not need the `kernel`

or `run`

methods to run a calculation. Given molecule structure, the scanner
automatically checks and updates the necessary object dependence and passes the
work flow to the `kernel`

method. The computational results are held in the
scanner object as the regular class object does.

To make structure of scanner object uniform for all methods, two attributes
(`.e_tot`

and `.converged`

) are defined for all energy scanner
and three attributes (`.e_tot`

, `.de`

and `.converged`

) are defined for
all nuclear gradients scanner.