Getting Started¶
Jump to: Installation · Configuration kwargs · getCharges() · getEfield() · getESP() · getElectrostatic_stabilization()
Installation¶
git clone git@github.com:hjkgrp/pyEF.git
cd pyEF
conda env create -f environment.yml
conda activate pyef
pip install -e .
Requirements¶
Python 3.8+
Multiwfn (for charge calculations)
Installing Multiwfn¶
PyEF requires Multiwfn for charge partitioning and wavefunction analysis.
Note
Currently Multiwfn is NOT supported on macOS. We recommend using a Linux or Windows system.
Download and compile:
# Download from http://sobereva.com/multiwfn/
wget http://sobereva.com/multiwfn/misc/Multiwfn_3.8_bin_Linux_noGUI.zip
unzip Multiwfn_3.8_bin_Linux_noGUI.zip
cd Multiwfn_3.8_bin_Linux_noGUI
# Give executable permission to the Multiwfn executable file
chmod +x Multiwfn_noGUI
Add to your PATH (add to ~/.bashrc):
export PATH=/path/to/Multiwfn_3.8_bin_Linux_noGUI/Multiwfn_noGUI:$PATH
Or use the full path when running PyEF (e.g., multiwfn_path: /path/to/Multiwfn_3.8_bin_Linux_noGUI/Multiwfn_noGUI).
Core Concepts¶
PyEF requires two input files per structure:
Molden file (
.molden): Contains wavefunction data from QM calculationsXYZ file (
.xyz): Contains atomic coordinates
All atom indices in PyEF are 0-indexed.
Supported Charge Schemes¶
Recommended:
Hirshfeld_I: Iterative Hirshfeld - most accurate for most systems
ADCH: Atomic dipole corrected Hirshfeld - recommended by Multiwfn
CM5: Charge Model 5 - good balance of accuracy and speed
ESP-based methods:
RESP: Restrained ESP fitting - standard for force field development
CHELPG: ESP fitting (Breneman) - good for molecular mechanics
MK: Merz-Kollmann ESP fitting - alternative ESP method
Population analysis:
Hirshfeld: Standard Hirshfeld partitioning - fast, good for most systems
Mulliken: Mulliken population - fast but basis-set dependent
Lowdin: Löwdin population - uses orthogonalized basis
Voronoi: Voronoi deformation density - space partitioning method
SCPA: Ros & Schuit modified Mulliken scheme
Becke: Becke partitioning with atomic dipole correction
Methods with limitations:
EEM: Electronegativity equalization method
⚠️ Limitation: Requires all atoms to be bonded. Fails for ionic systems (e.g., systems with Na+, K+, or other isolated ions)
PEOE: Gasteiger (PEOE) charges
⚠️ Limitation: Missing parameters for some elements including Na, K, and most transition metals. Will fail if your system contains unsupported elements
Initialization Configuration (kwargs)¶
When creating an Electrostatics object, you can pass keyword arguments to configure the
calculation behavior. All kwargs are optional.
ECP (Effective Core Potential) support:
If your QM calculation used ECPs, you must tell pyEF so it can fix molden file artifacts:
# For a TeraChem calculation with LACVPS (default ECP)
es = Electrostatics(molden_paths, xyz_paths, hasECP=True, ECP="lacvps")
# For a calculation with Stuttgart RSC ECPs
es = Electrostatics(molden_paths, xyz_paths, hasECP=True, ECP="stuttgart_rsc")
Supported ECP values:
"lacvps"(default) – Hybrid: all-electron for Z ≤ 18, LANL2DZ for heavier elements"lacvp"– Hybrid: all-electron for Z ≤ 10, LANL2DZ for heavier elements"lanl2dz"– LANL2DZ ECP, widely used for transition metals"def2"– def2-type ECPs (e.g., def2-SVP, def2-TZVP families)"crenbl"– CRENBL shape-consistent relativistic ECPs"stuttgart_rsc"– Stuttgart Relativistic Small Core ECPs
Dielectric environment:
# Protein interior (dielectric ~ 4)
es = Electrostatics(molden_paths, xyz_paths, dielectric=4.0)
# Water solvent (dielectric ~ 78.5)
es = Electrostatics(molden_paths, xyz_paths, dielectric=78.5)
# Use dielectric=1 for bonded atoms (special boundary treatment)
es = Electrostatics(molden_paths, xyz_paths, dielectric=4.0, changeDielectBoundBool=True)
QM/MM point charges:
# New API: per-job point charge paths
es = Electrostatics(molden_paths, xyz_paths,
ptchg_paths=['/path/to/ptchg1.txt', '/path/to/ptchg2.txt'],
includePtChgs=True)
Computation options:
# Force recalculation (ignore cached charges)
es = Electrostatics(molden_paths, xyz_paths, rerun=True)
# Skip missing files in batch processing
es = Electrostatics(molden_paths, xyz_paths, skip_missing_files=True)
# Increase basis function limit for very large systems
es = Electrostatics(molden_paths, xyz_paths, maxIHirshBasis=20000)
Visualization:
# Enable PDB output for E-fields and charges
es = Electrostatics(molden_paths, xyz_paths,
visualize_ef=True, visualize_charges=True)
For the complete list of all kwargs and their defaults, see the API Reference.
Computing Partial Charges¶
Compute partial charges for all atoms using specified charge partitioning scheme(s). Charges can optionally be written to PDB files for visualization.
Python API¶
from pyef.analysis import Electrostatics
molden_paths = ['/path/to/structure.molden']
xyz_paths = ['/path/to/structure.xyz']
es = Electrostatics(molden_paths, xyz_paths)
# Get multiple charge types with PDB output for visualization
df = es.getCharges(
charge_types=['RESP', 'Hirshfeld_I'],
multiwfn_path='/path/to/multiwfn',
output_filename='my_charges',
write_pdb=True
)
Output: Returns a DataFrame with columns: Job, Charge_Type, Atom_Index,
Element, x, y, z, Charge, Molden_Path, XYZ_Path.
PDB Visualization: When write_pdb=True, creates PDB files with charges in the
B-factor column. Visualize in PyMOL with:
spectrum b, blue_white_red, minimum=-1, maximum=1
Computing Electric Fields¶
Calculate electric fields at specific bonds.
Python API¶
from pyef.analysis import Electrostatics
import numpy as np
molden_paths = ['/path/to/structure1.molden', '/path/to/structure2.molden']
xyz_paths = ['/path/to/structure1.xyz', '/path/to/structure2.xyz']
# Create an electrostatics object
es = Electrostatics(molden_paths, xyz_paths, dielectric=1.0)
# Record the indices (0-indexed) for the bonds to project E-field across
# Here: one bond (0, 10) in structure 1 and two bonds (20, 21), (25, 26) in structure 2
ef_bond = [[(0, 10)], [(20, 21), (25, 26)]]
# Record indices of the substrate containing bonds of interest
sub_indices = [np.arange(0, 10), np.arange(10, 25)]
# Exclude substrates from E-field calc (only probe environment impact, not intramolecular)
es.setExcludeAtomFromCalc(sub_indices)
# Calculate E-field (charge_types can be a list or single value)
df = es.getEfield(
charge_types='Hirshfeld_I',
Efielddata_filename='output',
multiwfn_path='/path/to/multiwfn',
input_bond_indices=ef_bond
)
Key parameters:
charge_types: Charge scheme (‘Hirshfeld_I’, ‘CHELPG’, etc.) - can be string or listinput_bond_indices: List of (atom1, atom2) tuples defining bonds per structuremultipole_bool: Use multipole expansion (default: False)dielectric: Dielectric constant (1=vacuum, 4=protein, 78.5=water)
CLI¶
Create a jobs file (jobs.csv):
ef, /path/to/optim.molden, /path/to/optim.xyz, (25, 26), (25, 27)
Create a config file (config.yaml):
input: jobs.csv
multiwfn_path: /path/to/multiwfn
charge_types:
- Hirshfeld_I
dielectric: 1
Run:
pyef -c config.yaml
Computing Electrostatic Potentials¶
Calculate ESP at metal centers or specific atomic sites.
Python API¶
from pyef.analysis import Electrostatics
molden_paths = ['/path/to/structure1.molden', '/path/to/structure2.molden']
xyz_paths = ['/path/to/structure1.xyz', '/path/to/structure2.xyz']
# One atom index per file (0-indexed)
metal_indices = [30, 0] # atom 30 for structure1, atom 0 for structure2
# Create an electrostatics object with esp_atom_idx
es = Electrostatics(molden_paths, xyz_paths, esp_atom_idx=metal_indices, dielectric=1.0)
# Run electrostatic potential calculations
df = es.getESP(
charge_types=['Hirshfeld_I'],
ESPdata_filename='esp_results',
multiwfn_path='/path/to/multiwfn'
)
Key parameters:
esp_atom_idx: Atom indices where ESP is calculated (set at initialization, 0-indexed)charge_types: Charge schemeuse_multipole: Use multipole expansion (default: False)dielectric: Dielectric constant
CLI¶
Create a jobs file:
esp, /path/to/optim.molden, /path/to/optim.xyz, 30
Run with config:
pyef -c config.yaml
Computing Electrostatic Stabilization¶
Calculate electrostatic stabilization energy between substrate and environment.
CLI¶
Create a jobs file:
estab, /path/to/structure.molden, /path/to/structure.xyz
Create a config file (config.yaml):
input: jobs.csv
dielectric: 1
multiwfn_path: /path/to/multiwfn
charge_types:
- Hirshfeld_I
multipole_order: 2
substrate_idxs: [1, 2, 3, 4, 5]
Run:
pyef -c config.yaml
Python API¶
from pyef.analysis import Electrostatics
import numpy as np
molden_paths = ['/path/to/structure1.molden', '/path/to/structure2.molden']
xyz_paths = ['/path/to/structure1.xyz', '/path/to/structure2.xyz']
es = Electrostatics(molden_paths, xyz_paths, dielectric=1.0)
# Record indices of the substrate in structure 1 and structure 2
sub_indices = [np.arange(0, 10), np.arange(10, 25)]
# Only Hirshfeld, Hirshfeld_I, and Becke are compatible with multipole_order=2
# Otherwise will default to multipole_order=1
df = es.getElectrostatic_stabilization(
charge_types='Hirshfeld',
multiwfn_path='/path/to/multiwfn',
substrate_idxs=sub_indices,
multipole_order=2
)
Key parameters:
substrate_idxs: List of atom indices for the substrate (one per structure)charge_types: Charge scheme (Hirshfeld, Hirshfeld_I, or Becke for multipole_order=2)multipole_order: 1 for monopole, 2 for dipole correctionsdielectric: Dielectric constant
Dielectric Constants¶
Common values:
1.0: Vacuum
2-4: Protein interior
20-40: Protein-solvent interface
78.5: Water
Output Files¶
Results are saved as CSV files with the specified filename prefix:
*_Efielddata.csv: Electric field results*_ESPdata.csv: Electrostatic potential results
Package Structure¶
pyef/
├── analysis.py # Main Electrostatics class
├── cli.py # Command-line interface
├── geometry.py # Geometry utilities
├── utility.py # Helper functions
└── multiwfn_interface.py # Multiwfn integration