pyef.analysis¶
Electrostatics Analysis Module for Quantum Chemistry Calculations.
This module provides the Electrostatics class for computing electrostatic properties (ESP, electric fields, partial charges) from quantum chemistry calculations. It interfaces with Multiwfn for charge analysis and supports various charge partitioning schemes (Hirshfeld, Hirshfeld-I, CHELPG, etc.).
- Key Features:
Multiple charge partitioning schemes
ESP and electric field calculations
Multipole moment analysis
QMMM point charge integration
Dielectric environment effects
ECP (Effective Core Potential) support
- Dependencies:
Multiwfn (external program)
NumPy, pandas, scipy
OpenBabel, BioPandas
Module Contents¶
Classes¶
Compute electrostatic properties from quantum chemistry calculations. |
Functions¶
|
Create a PDB file with charges visualized in the b-factor column. |
- class Electrostatics(molden_paths=None, xyz_paths=None, esp_atom_idx=None, ptchg_paths=None, pdb_charge_paths=None, **kwargs)¶
Compute electrostatic properties from quantum chemistry calculations.
This class processes series of quantum chemistry calculations to extract electrostatic properties including ESP, electric fields, and partial charges using various charge partitioning schemes via Multiwfn.
- molden_paths¶
Paths to .molden files, one per job (absolute or relative).
- Type:
list of str
- xyz_paths¶
Paths to .xyz files, one per job (absolute or relative).
- Type:
list of str
- esp_atom_idx¶
Atom indices for ESP evaluation (typically metal centers).
- Type:
list of int
- config¶
- Configuration dictionary containing:
hasECP (bool): Whether ECP was used in calculation
includePtChgs (bool): Include point charges in ESP calculation
ptChgfp (str): Path to point charge file
molden_filename (str): Name of molden file (compatibility)
xyzfilename (str): Name of xyz coordinate file (compatibility)
rerun (bool): Force recalculation of charges
maxIHirshBasis (int): Max basis functions for Hirshfeld-I
maxIHirshFuzzyBasis (int): Max basis for fuzzy Hirshfeld-I
ECP (str): ECP basis set family
dielectric (float): Dielectric constant
dielectric_scale (float): Scaling factor for dielectric
excludeAtomfromEcalc (list): Atoms to exclude from E-field calc
changeDielectBoundBool (bool): Use dielectric=1 for bonded atoms
visualize_ef (bool): Create PDB files for atom-wise E-fields
visualize_charges (bool): Create PDB files for partial charges
visualize_per_bond (bool): Create separate PDB files for each bond
skip_missing_files (bool): Skip calculations if required files missing (default: False)
- Type:
dict
- dict_of_calcs¶
Mapping of charge scheme names to Multiwfn command codes.
- Type:
dict
- dict_of_multipole¶
Mapping of multipole schemes to Multiwfn command sequences.
- Type:
dict
- ptchgdf¶
DataFrame containing point charge data for QMMM calculations.
- Type:
pd.DataFrame or None
- periodic_table¶
Mapping of element symbols to atomic numbers.
- Type:
dict
- amassdict¶
Atomic properties (mass, atomic number, covalent radius, valence).
- Type:
dict
Examples
>>> molden_paths = ['calc1/scr/optim.molden', 'calc2/scr/optim.molden'] >>> xyz_paths = ['calc1/scr/optim.xyz', 'calc2/scr/optim.xyz'] >>> es = Electrostatics(molden_paths, xyz_paths, dielectric=2.0) >>> >>> # For ESP calculations (metal indices required) >>> es_with_metals = Electrostatics(molden_paths, xyz_paths, esp_atom_idx=[0, 0], dielectric=2.0) >>> data = es_with_metals.getESP(['Hirshfeld', 'Hirshfeld_I'], 'esp_results', ...)
- run_multiwfn(command, input_commands, output_file=None, description='Multiwfn calculation', capture_output=True)¶
Centralized Multiwfn runner with proper error handling and output display.
- Parameters:
command (str) – The shell command to run Multiwfn (e.g., “multiwfn file.molden”)
input_commands (list of str) – List of commands to send to Multiwfn’s stdin
output_file (str, optional) – Path to redirect stdout to a file
description (str, optional) – Description of the calculation for error messages
capture_output (bool, optional) – If True, capture and display output. If False, let it stream to terminal
- Returns:
(stdout, stderr, returncode) from the Multiwfn process
- Return type:
tuple
- Raises:
RuntimeError – If Multiwfn fails with non-zero exit code
- updateCalcSettings(key, value)¶
Update calculation settings dictionary.
- Parameters:
key (str) – Setting key to update.
value (any) – New value for the setting.
- setExcludeAtomFromCalc(lstExcludeAtoms)¶
Set atoms to exclude from electric field calculations.
- Parameters:
lstExcludeAtoms (list of int) – Atom indices (0-indexed) to exclude from E-field calculations.
- includePtChgs(name_ptch_file)¶
Include point charges in ESP calculation (for QMMM).
- Parameters:
name_ptch_file (str or list) – If str, uses a single filename (e.g., ‘pointcharges.txt’) that is searched for in each job directory. If list, uses per-job file paths, one per job; use None for jobs without point charges.
Notes
Sets includePtChgs config to True and stores filename/paths. Point charges are typically from MM region in QM/MM calculations.
For filename mode (string), the file is automatically located in each job’s working directory. For list mode, paths may be absolute or relative to the current working directory.
Examples
# Single filename (applies per job directory) estat.includePtChgs(‘pointcharges.txt’)
# Per-job paths estat.includePtChgs([‘/path/job1/ptchg.txt’, None, ‘/path/job3/ptchg.txt’])
- set_dielec_scale(dielec)¶
Set dielectric scaling factor for point charges.
- Parameters:
dielec (float) – Scaling factor for dielectric in QMMM calculations.
- initialize_excludeAtomsFromEfieldCalc(atom_to_exclude)¶
Initialize atoms to exclude from electric field calculation.
- Parameters:
atom_to_exclude (list of int) – List of atom indices (0-indexed) to exclude.
- minDielecBonds(bool_bonds)¶
Set dielectric to 1 for bonded atoms in ESP calculation.
- Parameters:
bool_bonds (bool) – If True, use dielectric=1 for atoms bonded to ESP center. Helps avoid over-estimating screening from bound atoms.
- runlowmemory()¶
Enable low-memory mode for large systems (>300 atoms).
Notes
Switches Hirshfeld-I calculation to slower but lower-memory algorithm. Useful for large systems that would otherwise exceed memory limits.
- changeDielectric(dlc)¶
Change dielectric constant of the environment.
- Parameters:
dlc (float) – Dielectric constant of solvent/environment.
Notes
Used to model solvation effects in ESP calculations. Default is 1 (vacuum). Common values: water=80, protein=4.
- set_molden_filename(new_name)¶
Set the expected molden filename.
- Parameters:
new_name (str) – Name of the molden file to read.
- set_xyzfilename(new_name)¶
Set the expected XYZ coordinate filename.
- Parameters:
new_name (str) – Name of the XYZ file to read.
- rePrep()¶
Re-run data preparation (file extraction and validation).
Notes
Useful if files have been updated or regenerated.
- fix_allECPmolden()¶
Fix molden files that used Effective Core Potentials.
Notes
ECPs can cause artifacts in molden files that make them incompatible with Multiwfn. This method reformats the molden files to fix these issues. Processes all molden/xyz file pairs.
- prepData()¶
Validate that all specified .molden and .xyz files exist.
Checks that all file paths provided during initialization are valid and accessible. Removes invalid entries from processing lists.
- Raises:
FileNotFoundError – If any of the specified .molden or .xyz files cannot be found.
- prepData_pdb()¶
Validate PDB charge files for pdb_only mode.
Checks that all PDB files provided during initialization exist. Removes invalid entries from processing lists.
- Raises:
FileNotFoundError – If no valid PDB charge files are found.
- static getmultipoles(multipole_name)¶
Parse multipole moments from Multiwfn output file.
- Parameters:
multipole_name (str) – Path to multipole moment output file from Multiwfn.
- Returns:
List of dictionaries containing multipole data for each atom: - ‘Index’: atom index (str) - ‘Element’: element symbol (str) - ‘Atom_Charge’: atomic charge (float) - ‘Dipole_Moment’: dipole moment vector [x, y, z] (list of float) - ‘Quadrupole_Moment’: 3x3 quadrupole tensor (np.ndarray)
- Return type:
list of dict
Notes
Parses Multiwfn output format with sections separated by asterisks. Extracts monopole (charge), dipole, and quadrupole moments for each atom.
- getPtChgs(filename_pt)¶
Load point charges from file (for QMMM calculations).
- Parameters:
filename_pt (str) – Path to point charge file (typically from Amber/CHARMM).
- Returns:
DataFrame with columns: [‘Atom’, ‘charge’, ‘x’, ‘y’, ‘z’] All atoms labeled as ‘pnt’ (point charge).
- Return type:
pd.DataFrame
Notes
File format: whitespace-delimited with 2 header lines to skip.
- getPdbCharges(pdb_path)¶
Load point charges from a PDB file using the B-factor column.
- Parameters:
pdb_path (str) – Path to PDB file containing charges in the B-factor column.
- Returns:
DataFrame with columns: [‘Atom’, ‘charge’, ‘x’, ‘y’, ‘z’]
- Return type:
pd.DataFrame
Notes
Reads ATOM/HETATM records and parses fixed-width PDB columns. The B-factor column is interpreted as the partial charge (e).
- static mapcount(filename)¶
Rapidly count lines in a file using memory mapping.
- Parameters:
filename (str) – Path to file to count.
- Returns:
Number of lines in the file.
- Return type:
int
Notes
Uses mmap for efficient line counting in large files.
- monopole_esp(path_to_xyz, espatom_idx, charge_range, charge_file, ptchg_path=None)¶
Calculate electrostatic potential at specified atom.
- Parameters:
path_to_xyz (str) – Path to XYZ coordinate file.
espatom_idx (int) – Atom index (0-indexed) where ESP will be calculated.
charge_range (list of int) – Indices of atoms to include in ESP calculation.
charge_file (str) – Path to partial charge file from Multiwfn.
ptchg_path (str or None, optional) – Path to point charge file. If None, no point charges are included.
- Returns:
[ESP (float), atom_symbol (str)] ESP in Volts at the specified atom.
- Return type:
list
Notes
Uses Coulomb’s law: V = k * q / r with dielectric screening
Point charges from QMMM included if ptchg_path is provided
Bonded atoms treated with dielectric=1 if config[‘changeDielectBoundBool’] is True
Units: Volts (N·m/C = J/C)
- monopole_esp_df(df_charges, espatom_idx, charge_range)¶
Calculate ESP at a specified atom using charges from a DataFrame.
This is the pdb_only equivalent of monopole_esp(): it reads directly from a DataFrame (e.g. produced by getPdbCharges) instead of a file. The changeDielectBoundBool feature is not supported in this mode because bond detection requires an XYZ file.
- Parameters:
df_charges (pd.DataFrame) – DataFrame with columns [‘Atom’, ‘x’, ‘y’, ‘z’, ‘charge’]. Typically returned by getPdbCharges().
espatom_idx (int) – Atom index (0-indexed) at which the ESP is calculated.
charge_range (list of int) – Indices of atoms to include in the ESP sum.
- Returns:
[ESP (float in Volts), atom_symbol (str)]
- Return type:
list
- calc_firstTermE(charge_range, charge_file)¶
Input: espatom_idx: integer of atom index charge_range: list of integers of atom indices charge_file: string of charge filename Output: list of E-field vector and atomic symbol
- monopole_atomicEfield(espatom_idx, charge_range, charge_file, df_ptchg=None)¶
Input: espatom_idx: integer of atom index charge_range: list of integers of atom indices charge_file: string of charge filename df_ptchg: optional pre-loaded point charge dataframe Output: list of E-field vector and atomic symbol
- pointcharge_atomicEfield(espatom_idx, charge_range, df_charges)¶
Calculate E-field using point charges from a DataFrame only.
- Parameters:
espatom_idx (int) – Atom index (0-indexed) where the E-field is calculated.
charge_range (list or range) – Atom indices to include in the E-field calculation.
df_charges (pd.DataFrame) – DataFrame with columns: [‘Atom’, ‘x’, ‘y’, ‘z’, ‘charge’].
- Returns:
[E_vec, position_vec, atom_symbol, atom_wise_contributions] E_vec is in V/Angstrom.
- Return type:
list
- compute_esp(q_vals, probe_coords)¶
Compute ESP at probe_coords due to point charges at q_coords.
- compute_esp_from_qm(qm_charges, mm_coords)¶
Compute the ESP at MM atoms due to QM charges.
- multipole_atomicEfield(idx_atom, charge_range, xyz_file, atom_multipole_file, df_ptchg=None)¶
Input: idx_atom: integer of atom index charge_range: list of integers of atom indices xyz_file: string of xyz filename atom_multipole_file: string of multipole filename df_ptchg: optional pre-loaded point charge dataframe Output: list of E-field vector and atomic symbol, Efields are reported in units of Volts/Angstrom
- multipole_esp(xyzfilepath, atom_multipole_file, charge_range, idx_atom, ptchg_path=None)¶
Input: idx_atom: integer of atom index charge_range: list of integers of atom indices xyfilepath: string of xyz filename atom_multipole_file: string of multipole filename ptchg_path: optional path to point charge file Output: list of ESP and atomic symbol Units of ESP is Volts!
- calc_bond_dipoles(bond_indices, xyz_filepath, atom_multipole_file, bool_multipole, charge_df=None)¶
Calculate bond dipole moments for given bond indices.
For each bond between atoms A and B, calculates the bond dipole moment vector relative to the geometric center (midpoint) of the bond. This is origin-independent for both neutral and charged bonds: μ_bond = q_A * (r_A - r_center) + q_B * (r_B - r_center) which simplifies to: μ_bond = (q_A - q_B) * (r_A - r_B) / 2
- Parameters:
bond_indices (list of tuples) – List of (atomA_idx, atomB_idx) tuples specifying bonds
xyz_filepath (str) – Path to XYZ structure file
atom_multipole_file (str) – Path to multipole/charge file
bool_multipole (bool) – If True, use multipole file; if False, use monopole charges
charge_df (pd.DataFrame or None) – Optional DataFrame with monopole charges (columns: Atom, x, y, z, charge). If provided, overrides charge file input.
- Returns:
List of bond dipole information dictionaries containing: - ‘bond_dipole_vec’: [x, y, z] dipole vector in Debye - ‘bond_dipole_mag’: magnitude of dipole in Debye - ‘bond_indices’: (atomA_idx, atomB_idx) - ‘charges’: (charge_A, charge_B)
- Return type:
list
- bondEfield(bond_indices, xyz_filepath, atom_multipole_file, all_lines, bool_multipole, df_ptchg=None)¶
Input: bond_indices: list of tuples of atom indices xyz_filepath: string of xyz filename atom_multipole_file: string of multipole filename all_lines: list of strings of lines in xyz file df_ptchg: optional pre-loaded point charge dataframe Output: list of E-field vector and atomic symbol in units of V/Angstrom
- bondEfield_pointcharges(bond_indices, df_charges, all_lines)¶
Compute bond-projected E-fields using point charges only.
- Parameters:
bond_indices (list of tuples) – List of (atomA_idx, atomB_idx) tuples specifying bonds.
df_charges (pd.DataFrame) – DataFrame with columns: [‘Atom’, ‘x’, ‘y’, ‘z’, ‘charge’].
all_lines (list) – Atom indices to include in the E-field calculation.
- Returns:
[E_projected, bonded_atoms, bond_indices, bond_lens, E_proj_atomwise, E_proj_atomwise_list]
- Return type:
list
- esp_coord_shell(metal_idx, charge_file, path_to_xyz, n_shells=1)¶
Calculate ESP accounting for electrostatic contributions from n coordination shells.
This unified method replaces esp_first_coord() and esp_second_coord() by accepting a parameter to specify how many coordination shells to include.
- Parameters:
metal_idx (int) – Atom index of the central metal
charge_file (str) – Path to charge filename
path_to_xyz (str) – Path to xyz structure file
n_shells (int, optional) – Number of coordination shells to include (default=1) n_shells=1: first coordination shell only n_shells=2: first and second coordination shells
- Returns:
ESP value for the specified coordination shell(s)
- Return type:
float
- charge_atom(atom_idx)¶
Compute charges and return total charge and partial charge of atom Input: filename: string of charge filename atom_idx: integer of atom index Output: list of total charge and partial charge of atom
- charge_atoms(chg_filename, xyzfilename)¶
Input: filename: string of charge filename or multipole filename… if the xyzfilename is included must be multipole!
- getAtomInfo(atom_idx)¶
Input: filename: string of charge filename atom_idx: integer of atom index Output: list of total charge and partial charge of atom
- getAtomsInfo(atom_indices)¶
Input: filename: string of charge filename atom_indices: list of integers of atom indices Output: list of total charge and partial charge of atom
- esp_bydistance(path_to_xyz, espatom_idx, charge_file)¶
Input: path_to_xyz: string of xyz filename espatom_idx: integer of atom index charge_file: string of charge filename Output: list of ESP and atomic symbol
- getESP(charge_types='CM5', ESPdata_filename='ESP', multiwfn_path='None', use_multipole=False, include_decay=False, include_coord_shells=False, visualize=None, dielectric=1, dielectric_scale=1, includePtChgs=None, ptchg_paths=None, pdb_charge_paths=None)¶
Unified ESP calculation method supporting multiple modes.
This consolidated method replaces getESPData(), getESPMultipole(), and getESPDecay() to eliminate code duplication while preserving all functionality.
- Parameters:
charge_types (list of str or str) – Charge partitioning scheme(s). Options: ‘Hirshfeld’, ‘Voronoi’, ‘Mulliken’, ‘Lowdin’, ‘SCPA’, ‘Becke’, ‘ADCH’, ‘CHELPG’, ‘MK’, ‘AIM’, ‘Hirshfeld_I’, ‘CM5’, ‘EEM’, ‘RESP’, ‘PEOE’. Can be a single string (for multipole mode) or list of strings.
ESPdata_filename (str) – Output CSV filename (without extension).
multiwfn_path (str) – Path to Multiwfn executable. Ignored when pdb_charge_paths is provided.
use_multipole (bool, optional) – If True, use multipole expansion; if False, use monopole charges (default: False).
include_decay (bool, optional) – If True, include distance-sorted ESP decay analysis (default: False).
include_coord_shells (bool, optional) – If True, include first and second coordination shell ESP (default: False).
visualize (bool or None, optional) – If True, create PDB file with ESP values in B-factor column. None uses config defaults (default: None).
dielectric (float, optional) – Dielectric constant (default: 1).
dielectric_scale (float, optional) – Scaling factor for point charges. Point charge contributions are scaled by 1/sqrt(dielectric_scale). Typically used in QM/MM calculations (default: 1).
includePtChgs (bool or None, optional) – Override object-level includePtChgs setting for this calculation. If None, uses object-level setting (default: None).
ptchg_paths (list of str or None, optional) – Override point charge file paths for this calculation. List of paths (one per job), with None for jobs without point charges. If provided, takes precedence over object-level settings (default: None).
pdb_charge_paths (str or list of str or None, optional) – If provided, use partial charges from PDB B-factor columns only (no Multiwfn). Can be a single PDB path or a list of paths (one per job), with None to skip a job. Overrides the object-level pdb_charge_paths if the object was created in pdb_only mode. This mode supports monopole charges only (use_multipole must be False).
- Returns:
pd.DataFrame – DataFrame with ESP results saved to CSV.
Output Files
————
- Monopole (esp_{charge_type}.csv)
- Multipole (esp_multipole_{charge_type}.csv)
- Examples (esp_Hirshfeld_I.csv, esp_multipole_Hirshfeld_I.csv)
- Contains ESP values at metal centers for all jobs
- Columns (job name, charge type, ESP values, coordinates, etc.)
**PDB file (if visualize=True) (** esp_{charge_type}_{structure}_atom{idx}.pdb)
- One file per job and metal atom
- B-factor column contains ESP values at each atom
- Example (esp_Hirshfeld_I_complex_atom25.pdb)
Note (If you provide a custom filename (not ‘output’, ‘esp’, etc.),)
your filename will be used instead of automatic naming.
Examples
>>> # Monopole mode (replaces getESPData) >>> es.getESP(['Hirshfeld', 'Hirshfeld_I'], 'esp_mono', ...)
>>> # Multipole mode (replaces getESPMultipole) >>> es.getESP('Hirshfeld_I', 'esp_multi', ..., use_multipole=True)
>>> # Decay analysis mode (replaces getESPDecay) >>> es.getESP(['Hirshfeld'], 'esp_decay', ..., include_decay=True, include_coord_shells=True)
>>> # pdb_only mode (no Multiwfn required) >>> es.getESP(pdb_charge_paths=['job1.pdb', 'job2.pdb'])
- getEfield(charge_types='Hirshfeld_I', Efielddata_filename='ef', multiwfn_path=None, multipole_bool=False, input_bond_indices=[], auto_find_bonds=False, save_atomwise_decomposition=False, visualize=None, dielectric=1, dielectric_scale=1, includePtChgs=None, ptchg_paths=None, pdb_charge_paths=None)¶
Unified E-field calculation method supporting multiple modes.
This consolidated method replaces getEFieldMultipole(), getEfield_acrossBond(), and getEfield_decomposable() to eliminate code duplication while preserving all functionality.
- Parameters:
charge_types (str or list of str) – Charge partitioning scheme. If a list is provided, only the first entry is used. Options: ‘Hirshfeld’, ‘Becke’, ‘Hirshfeld_I’, etc. When pdb_charge_paths is provided, this value is used for labeling only.
Efielddata_filename (str) – Output CSV filename (without extension).
multiwfn_path (str) – Path to Multiwfn executable.
multipole_bool (bool, optional) – If True, use multipole expansion; if False, use monopole charges (default: False).
input_bond_indices (list, optional) – Bond indices as list of tuples [(atomA, atomB), …]. If empty, bonds are auto-found using metal indices (esp_atom_idx). If provided, bonds are used as-is unless auto_find_bonds=True.
auto_find_bonds (bool, optional) – If True, automatically find bonds to adjacent atoms using getBondedAtoms() and ignore input_bond_indices (default: False).
save_atomwise_decomposition (bool, optional) – If True, save atom-wise E-field decomposition to CSV (default: False).
visualize (bool or None, optional) – Override config visualization settings. None uses config defaults (default: None).
dielectric (float, optional) – Dielectric constant (default: 1).
dielectric_scale (float, optional) – Scaling factor for point charges. Point charge contributions are scaled by 1/sqrt(dielectric_scale). Typically used in QM/MM calculations (default: 1).
includePtChgs (bool or None, optional) – Override object-level includePtChgs setting for this calculation. If None, uses object-level setting (default: None).
ptchg_paths (list of str or None, optional) – Override point charge file paths for this calculation. List of paths (one per job), with None for jobs without point charges. If provided, takes precedence over object-level settings (default: None).
pdb_charge_paths (str or list of str or None, optional) – If provided, use partial charges from PDB B-factor columns only (no Multiwfn). Can be a single PDB path or a list of paths (one per job), with None to skip a job. This mode supports monopole charges only (multipole_bool must be False).
- Returns:
pd.DataFrame or tuple of pd.DataFrame – If save_atomwise_decomposition=False: returns DataFrame with E-field results If save_atomwise_decomposition=True: returns (total_df, atomwise_df) tuple
Output Files
————
- Monopole (ef_{charge_type}.csv)
- Multipole (ef_multipole_{charge_type}.csv)
- Examples (ef_Hirshfeld_I.csv, ef_multipole_Hirshfeld_I.csv)
- Contains E-field magnitudes and projections for all jobs and bonds
- Columns (job name, bond atoms, E-field magnitude (MV/cm), projections, etc.)
- Format (ef_{charge_type}_atomwise.csv or ef_multipole_{charge_type}_atomwise.csv)
- Example (ef_Hirshfeld_I_atomwise.csv)
- Per-atom E-field contributions for each bond
- Shows which atoms contribute most to the E-field
- Columns (job, bond, atom index, contribution magnitude, etc.)
- Per bond (ef_{charge_type}_{structure}_bond{i}-{j}.pdb)
- Combined (ef_{charge_type}_{structure}_combined.pdb)
- B-factor contains E-field contribution from each atom
- Examples (ef_Hirshfeld_I_complex_bond25-26.pdb)
Note (If you provide a custom filename (not ‘output’, ‘efield’, etc.),)
your filename will be used instead of automatic naming.
Examples
>>> # Basic multipole mode with specified bonds >>> es.getEfield('Hirshfeld_I', 'efield', ..., input_bond_indices=[(0,1), (0,2)])
>>> # Auto-find bonds to adjacent atoms >>> es.getEfield('Hirshfeld_I', 'efield', ..., auto_find_bonds=True)
>>> # Monopole mode with atom-wise decomposition >>> es.getEfield('Hirshfeld', 'efield', ..., multipole_bool=False, save_atomwise_decomposition=True)
- getCharges(charge_types='Hirshfeld_I', multiwfn_path=None, output_filename='charges', write_pdb=False, pdb_bfactor=True)¶
Compute partial charges for all atoms using specified charge partitioning scheme(s).
This method provides a standalone interface for computing partial charges without calculating E-fields or ESP. Charges can optionally be written to PDB files with the B-factor column containing the charge values.
- Parameters:
charge_types (str or list of str) – Charge partitioning scheme(s) to use. Options include: ‘RESP’, ‘Hirshfeld’, ‘Hirshfeld_I’, ‘CHELPG’, ‘Mulliken’, ‘Lowdin’, ‘SCPA’, ‘Becke’, ‘ADCH’, ‘Voronoi’, ‘MK’, ‘AIM’, ‘CM5’, ‘EEM’, ‘PEOE’
multiwfn_path (str) – Path to the Multiwfn executable.
output_filename (str, optional) – Base name for output files (default: ‘charges’). CSV will be saved as ‘{output_filename}.csv’.
write_pdb (bool, optional) – If True, create PDB files for each structure (default: False).
pdb_bfactor (bool, optional) – If True and write_pdb=True, store charges in B-factor column (default: True). This allows visualization of charges in molecular viewers like PyMOL.
- Returns:
pd.DataFrame – DataFrame containing partial charges for all atoms in each structure. Columns include: ‘Job’, ‘Charge_Type’, ‘Atom_Index’, ‘Element’, ‘x’, ‘y’, ‘z’, ‘Charge’.
Output Files
————
**CSV file (** {output_filename}.csv) – Contains all partial charges for all atoms and all jobs.
**PDB files (if write_pdb=True) (** {structure}_{charge_type}_charges.pdb) – One PDB file per job per charge type. B-factor column contains the partial charge values.
Examples
>>> es = Electrostatics(molden_paths, xyz_paths) >>> >>> # Get RESP charges for all atoms >>> df = es.getCharges('RESP', '/path/to/multiwfn') >>> >>> # Get multiple charge types with PDB output >>> df = es.getCharges(['RESP', 'Hirshfeld_I'], '/path/to/multiwfn', ... output_filename='my_charges', write_pdb=True) >>> >>> # Access charges for a specific atom >>> atom_5_charges = df[df['Atom_Index'] == 5]
Notes
Charges are computed via Multiwfn and cached in ‘Charges{type}.txt’ files.
Subsequent calls with the same charge type will reuse cached results unless config[‘rerun’] is True.
PDB files with B-factor charges can be visualized in PyMOL using: spectrum b, blue_white_red, minimum=-1, maximum=1
- get_residues(auto_gen, solute_solvent_dict)¶
- partitionCharge(multipole_bool, f, multiwfn_path, charge_type, owd)¶
Partition electron density using Multiwfn to generate partial charges or multipole moments.
This is the centralized interface for all charge partitioning schemes (Hirshfeld, CHELPG, etc.). Checks if calculation was previously completed and reuses results unless rerun=True.
Parameters:¶
- multipole_boolbool
True to compute multipole moments, False for monopoles only
- fstr
Complete path to folder containing the calculation
- multiwfn_pathstr
Path to Multiwfn executable
- charge_typestr
Partitioning scheme (e.g., ‘Hirshfeld’, ‘CHELPG’, ‘Hirshfeld_I’)
- owdstr
Original working directory
Returns:¶
- comp_costfloat
Computation time in seconds, 0 if calculation was previously completed (skipped), or -1 if calculation failed
Notes:¶
For Hirshfeld-I calculations, atmrad files are automatically loaded from pyef.resources.atmrad (bundled with the package).
- get_residueDipoles(charge_type, multiwfn_path, multipole_bool, solute_indices=[], num_atoms_solvent=0, save_distance_dipoles=False)¶
Function computes a partial charges of all atoms in one system… if the desired file already exists the just return it!! Inputs: ——- charge_type: string, possible choices include: ‘Hirshfeld’, ‘Voronoi’, ‘Mulliken’, ‘Lowdin’, ‘SCPA’, ‘Becke’, ‘ADCH’, ‘CHELPG’, ‘MK’, ‘AIM’, ‘Hirshfeld_I’, ‘CM5’, ‘EEM’, ‘RESP’, ‘PEOE’}
Also accepts multipole options including: ‘Hirshfeld’: [‘3’, ‘2’], ‘Hirshfeld_I’: [‘4’, ‘1’, ‘2’], ‘Becke’: [‘1’, ‘2’]
- getpartialchgs(charge_types, lst_atom_idxs, partial_chg_filename, multiwfn_path)¶
Function computes partial charges on a select set of atoms using the charge scheme specified in charge types. Note atom indices will be carried over between csvs
- Inputs:
– charge_types: list(str)
list of strings
- lst_of_atom_idxs: list(int)
list of integers denoting atom indices (0 indexed!)
- partial_chg_filename: string
Name of the output file name
- multiwfn_path: string
Path to the multiwfn executable
Path to the atmrad executable
- Outputs:
– df: Pandas Dataframe with partial charge info
Will Create a csv file entitled partial_chg_filename.csv with partial charge info
- compute_dipole(res_chgs, nuc_chgs)¶
res_coords in units of angstroms, res_chgs are in a.u. units returns dipole in units of Debye
- getdipole_residues(res_dict, df)¶
res_dict: dictionary with strings mapped to list of lists(int) strings denote name of residues which are mapped to the associated atom indices in the lists(0 indexed!)
- getcharge_residues(charge_types, res_dict, partial_chg_filename, multiwfn_path, multipole_mode=True, polarization_scheme='Hirshfeld_I')¶
Function computes partial charges on a select set of atoms using the charge scheme specified in charge types. Note atom indices will be carried over between csvs
Inputs:¶
- charge_types: list(str)
list of strings
- res_dict: dictionary with strings mapped to list of lists(int)
strings denote name of residues which are mapped to the associated atom indices in the lists(0 indexed!)
- partial_chg_filename: string
Name of the output file name
- multiwfn_path: string
Path to the multiwfn executable
Path to the atmrad executable
- multipole_mode: boolean
If True, will use multipole mode to compute partial charges, if False will use the standard method
Outputs:
Notes
Will Create a csv file entitled partial_chg_filename.csv with partial charge info
- compute_atomwise_dipole_contributions(atomwise_Efield, df_substrate, df_env, df_all_atoms, one_mol, filename)¶
Compute atom-wise energy contributions from environment E-fields interacting with substrate dipoles.
The interaction energy is: E = -μ·E (negative dot product of dipole moment with electric field)
Parameters:¶
- atomwise_Efielddict
Dictionary with (substrate_idx, env_idx) as keys and E-field vector [Ex, Ey, Ez] (in V/Å) as values
- df_substrateDataFrame
DataFrame containing substrate atom information
- df_envDataFrame
DataFrame containing environment atom information
- df_all_atomsDataFrame
DataFrame containing all atom information (for element names)
- one_molfloat
Avogadro’s number
- filenamestr
Name of the current file being processed
Returns:¶
- DataFrame with columns: FileName, Env_Atom_Index, Env_Element, Total_Energy_Contribution_kcal_mol,
Substrate_Atom_Index, Substrate_Element, Energy_Contribution_kcal_mol
- visualize_atomwise_contributions(df_atomwise, file_path_xyz, charge_file, filename, charge_type)¶
Create PDB visualization of atom-wise energy contributions.
Parameters:¶
- df_atomwiseDataFrame
DataFrame with atom-wise contribution data from _compute_atomwise_contributions
- file_path_xyzstr
Path to XYZ file
- charge_filestr
Path to charge/multipole file
- filenamestr
Folder/file name for output
- charge_typestr
Type of charge calculation used
- compute_atomwise_contributions(atomwise_ESP, df_substrate, df_env, df_all_atoms, one_mol, filename)¶
Compute atom-wise energy contributions from environment atoms to substrate stabilization.
Parameters:¶
- atomwise_ESPdict
Dictionary with (substrate_idx, env_idx) as keys and ESP contribution (in Volts) as values
- df_substrateDataFrame
DataFrame containing substrate atom information
- df_envDataFrame
DataFrame containing environment atom information
- df_all_atomsDataFrame
DataFrame containing all atom information (for element names)
- one_molfloat
Avogadro’s number
- filenamestr
Name of the current file being processed
Returns:¶
- DataFrame with columns: FileName, Env_Atom_Index, Env_Element, Total_Energy_Contribution_kcal_mol,
Substrate_Atom_Index, Substrate_Element, Energy_Contribution_kcal_mol
- build_interaction_tensor(r_i, r_j, multipole_order, dielectric)¶
Build the multipole interaction tensor T_ij using AMOEBA formalism.
This tensor contains derivatives of 1/r and enables computation of all multipole-multipole interactions via: E = M_i^T · T_ij · M_j
Parameters:¶
- r_i, r_jarray-like
Position vectors of atoms i and j in Angstroms
- multipole_orderint
Maximum multipole order (1=monopole, 2=+dipole, 3=+quadrupole)
- dielectricfloat
Dielectric constant
Returns:¶
- Tndarray
Interaction tensor with dimensions based on multipole_order - Order 1: 1×1 (monopole-monopole only) - Order 2: 4×4 (monopole + dipole) - Order 3: 9×9 (monopole + dipole + quadrupole, traceless)
Notes:¶
Tensor elements represent spatial derivatives of Coulomb potential: - T[0,0]: 1/r (monopole-monopole) - T[0,1:4]: ∂(1/r)/∂r_j (monopole-dipole) - T[1:4,1:4]: ∂²(1/r)/∂r_i∂r_j (dipole-dipole) - Higher orders follow similar pattern
- build_multipole_vector(atom_data, multipole_order)¶
Build multipole moment vector M for an atom using AMOEBA formalism.
Parameters:¶
- atom_datadict or pd.Series
Atom data containing ‘Atom_Charge’, ‘Dipole_Moment’, ‘Quadrupole_Moment’
- multipole_orderint
Maximum multipole order (1=monopole, 2=+dipole, 3=+quadrupole)
Returns:¶
- Mndarray
Multipole vector [q, μ_x, μ_y, μ_z, Q_xx, Q_xy, …] in SI-based units - Order 1: [q] (1 element) - Order 2: [q, μ_x, μ_y, μ_z] (4 elements) - Order 3: [q, μ_x, μ_y, μ_z, Q_xx, Q_xy, Q_xz, Q_yy, Q_yz] (9 elements, traceless)
Notes:¶
Charges are dimensionless (in units of elementary charge)
Dipoles are converted from Bohr to meters
Quadrupoles are converted from Bohr² to meters²
Uses traceless quadrupole representation (Q_zz = -Q_xx - Q_yy)
- getElectrostatic_stabilization(multiwfn_path=None, substrate_idxs=[], charge_type='Hirshfeld_I', name_dataStorage='estatic', env_idxs=None, save_atomwise_decomposition=False, visualize=None, multipole_order=2, substrate_multipole_order=None, env_multipole_order=None, dielectric=1, dielectric_scale=1, includePtChgs=None, ptchg_paths=None, pdb_charge_paths=None)¶
Compute electrostatic stabilization using AMOEBA-style multipole tensor formalism.
This function implements the complete multipole expansion for intermolecular electrostatic interactions using the tensor formalism:
E_total = Σᵢ∈substrate Σⱼ∈environment M_i^T · T_ij · M_j
where M is the multipole vector and T_ij is the interaction tensor containing all derivatives of the Coulomb potential. This naturally includes ALL multipole-multipole interaction terms:
Order 1 (monopole): charge-charge (q×q)
Order 2 (+ dipole): + charge-dipole (q×μ) + dipole-dipole (μ×μ)
Order 3 (+ quadrupole): + charge-quadrupole (q×Q) + dipole-quadrupole (μ×Q) + quadrupole-quadrupole (Q×Q)
This is the formalism used in the AMOEBA polarizable force field and provides a complete treatment of distributed multipole interactions.
Parameters:¶
- multiwfn_pathstr
Path to Multiwfn executable
- substrate_idxslist
List of substrate atom indices (the molecule being stabilized)
- charge_typestr, optional
Charge partitioning scheme (default: ‘Hirshfeld_I’)
- name_dataStoragestr, optional
Output filename prefix (default: ‘estaticFile_tensor’)
- env_idxslist or None, optional
List of environment atom indices. If None, uses all non-substrate atoms (default: None)
- save_atomwise_decompositionbool, optional
If True, save atom-wise decomposition to CSV showing each substrate atom’s contribution (default: False)
- visualizebool or None, optional
If True, create PDB files with energy contributions in B-factor column. None uses config defaults (default: None)
- multipole_orderint, optional
Order of multipole expansion: - 1: monopole only (charge-charge) - 2: monopole + dipole (includes charge-charge, charge-dipole, dipole-dipole) - 3: monopole + dipole + quadrupole (includes all terms up to Q×Q) (default: 2)
- substrate_multipole_orderint or None, optional
Multipole order for substrate atoms. If None, uses multipole_order. Set to 1 for QM/MM where substrate is MM (charges only). (default: None)
- env_multipole_orderint or None, optional
Multipole order for environment atoms. If None, uses multipole_order. Set to 1 for QM/MM where environment is MM (charges only). (default: None)
- dielectricfloat, optional
Dielectric constant (default: 1).
- dielectric_scalefloat, optional
Scaling factor for point charges. Point charge contributions are scaled by 1/sqrt(dielectric_scale). Typically used in QM/MM calculations (default: 1).
- includePtChgsbool or None, optional
Include point charges as environment atoms only. If None, uses object-level setting (default: None). Point charges are ONLY added to environment, never to substrate.
- ptchg_pathslist of str or None, optional
Override point charge file paths for this calculation. List of paths (one per job), with None for jobs without point charges. Point charges are treated as monopole-only environment atoms. If provided, takes precedence over object-level settings (default: None).
- pdb_charge_pathsstr or list of str or None, optional
If provided, use partial charges from PDB B-factor columns only (no Multiwfn). Monopole-only (multipole_order is forced to 1). Coordinates are also read from the PDB, so no separate xyz file is needed. Overrides the object-level pdb_charge_paths when the object was created in pdb_only mode.
Returns:¶
- pd.DataFrame or tuple of pd.DataFrame
If save_atomwise_decomposition=False: returns DataFrame with total stabilization energies If save_atomwise_decomposition=True: returns (total_df, atomwise_df) tuple
Output Files¶
Main CSV file: Auto-generated with format: - Monopole: estab_{charge_type}.csv - Multipole: estab_multipole{N}_{charge_type}.csv (N = multipole order) - Examples: estab_Hirshfeld_I.csv, estab_multipole2_Hirshfeld_I.csv - Contains total electrostatic stabilization energies (kcal/mol) for all jobs - Columns: job name, total energy, substrate/environment details, etc.
Atomwise CSV (if save_atomwise_decomposition=True): - Format: estab_{charge_type}_atomwise.csv or estab_multipole{N}_{charge_type}_atomwise.csv - Examples: estab_Hirshfeld_I_atomwise.csv, estab_multipole2_Hirshfeld_I_atomwise.csv - Per-atom energy contributions showing which atoms contribute most - Columns: job, atom index, energy contribution, coordinates, etc.
PDB file (if visualize=True): estab_{charge_type}_{structure}_tensor_sub{N}_env{M}.pdb - B-factor contains energy contribution from each environment atom to substrate - N = substrate multipole order, M = environment multipole order - Example: estab_Hirshfeld_I_complex_tensor_sub2_env1.pdb
Note: If you provide a custom filename (not ‘output’, ‘estatic’, etc.), your filename will be used instead of automatic naming.
Notes:¶
Comparison with other methods: - getElectrostatic_stabilization(): Computes q·V (substrate charge in environment potential) - get_Electrostatic_stabilization_dipole(): Computes -μ·E (substrate dipole in environment field) - This function: Computes DIRECT pairwise multipole-multipole interactions
Advantages of tensor formalism: - Complete: automatically includes all multipole-multipole terms - Symmetric: properly treats both substrate and environment multipoles - Efficient: single matrix multiplication per atom pair - Extensible: easy to add higher-order terms
Important: - Requires Multiwfn multipole analysis for orders ≥ 2 - Automatic fallback: If multipole files don’t exist, automatically uses charges-only files - QM/MM support: Substrate and environment can have different multipole orders (e.g., QM substrate with multipoles, MM environment with charges only) - Results are the electrostatic STABILIZATION energy (environment → substrate) - Energy is in kcal/mol - Follows AMOEBA force field formalism (Ren & Ponder, J. Phys. Chem. B 2003)
Examples:¶
>>> # Compute monopole + dipole + dipole-dipole interactions >>> df = estat.get_Electrostatic_stabilization_tensor( ... multiwfn_path, ... substrate_idxs=[[0, 1, 2]], # 3-atom substrate ... multipole_order=2 # Include dipole-dipole terms ... )
>>> # Include quadrupole terms with atom-wise decomposition >>> df, df_atomwise = estat.get_Electrostatic_stabilization_tensor( ... multiwfn_path, ... substrate_idxs=[[0, 1, 2]], ... multipole_order=3, ... save_atomwise_decomposition=True ... )
>>> # QM/MM: QM substrate (with dipoles) interacting with MM environment (charges only) >>> df = estat.get_Electrostatic_stabilization_tensor( ... multiwfn_path, ... substrate_idxs=[[0, 1, 2]], # QM region ... substrate_multipole_order=2, # QM: charges + dipoles ... env_multipole_order=1 # MM: charges only ... )
- visualize_charges_pdb(xyz_file, charges, pdb_name)¶
Create a PDB file with charges visualized in the b-factor column.
- Parameters:
xyz_file (str) – Path to the XYZ structure file
charges (array-like) – Array of atomic charges to visualize
pdb_name (str) – Output PDB filename
- Returns:
Writes PDB file to disk
- Return type:
None