pyef.analysis

Electrostatics Analysis Module for Quantum Chemistry Calculations.

This module provides the Electrostatics class for computing electrostatic properties (ESP, electric fields, partial charges) from quantum chemistry calculations. It interfaces with Multiwfn for charge analysis and supports various charge partitioning schemes (Hirshfeld, Hirshfeld-I, CHELPG, etc.).

Key Features:
  • Multiple charge partitioning schemes

  • ESP and electric field calculations

  • Multipole moment analysis

  • QMMM point charge integration

  • Dielectric environment effects

  • ECP (Effective Core Potential) support

Dependencies:
  • Multiwfn (external program)

  • NumPy, pandas, scipy

  • OpenBabel, BioPandas

Module Contents

Classes

Electrostatics

Compute electrostatic properties from quantum chemistry calculations.

Functions

visualize_charges_pdb(xyz_file, charges, pdb_name)

Create a PDB file with charges visualized in the b-factor column.

class Electrostatics(molden_paths=None, xyz_paths=None, esp_atom_idx=None, ptchg_paths=None, pdb_charge_paths=None, **kwargs)

Compute electrostatic properties from quantum chemistry calculations.

This class processes series of quantum chemistry calculations to extract electrostatic properties including ESP, electric fields, and partial charges using various charge partitioning schemes via Multiwfn.

molden_paths

Paths to .molden files, one per job (absolute or relative).

Type:

list of str

xyz_paths

Paths to .xyz files, one per job (absolute or relative).

Type:

list of str

esp_atom_idx

Atom indices for ESP evaluation (typically metal centers).

Type:

list of int

config
Configuration dictionary containing:
  • hasECP (bool): Whether ECP was used in calculation

  • includePtChgs (bool): Include point charges in ESP calculation

  • ptChgfp (str): Path to point charge file

  • molden_filename (str): Name of molden file (compatibility)

  • xyzfilename (str): Name of xyz coordinate file (compatibility)

  • rerun (bool): Force recalculation of charges

  • maxIHirshBasis (int): Max basis functions for Hirshfeld-I

  • maxIHirshFuzzyBasis (int): Max basis for fuzzy Hirshfeld-I

  • ECP (str): ECP basis set family

  • dielectric (float): Dielectric constant

  • dielectric_scale (float): Scaling factor for dielectric

  • excludeAtomfromEcalc (list): Atoms to exclude from E-field calc

  • changeDielectBoundBool (bool): Use dielectric=1 for bonded atoms

  • visualize_ef (bool): Create PDB files for atom-wise E-fields

  • visualize_charges (bool): Create PDB files for partial charges

  • visualize_per_bond (bool): Create separate PDB files for each bond

  • skip_missing_files (bool): Skip calculations if required files missing (default: False)

Type:

dict

dict_of_calcs

Mapping of charge scheme names to Multiwfn command codes.

Type:

dict

dict_of_multipole

Mapping of multipole schemes to Multiwfn command sequences.

Type:

dict

ptchgdf

DataFrame containing point charge data for QMMM calculations.

Type:

pd.DataFrame or None

periodic_table

Mapping of element symbols to atomic numbers.

Type:

dict

amassdict

Atomic properties (mass, atomic number, covalent radius, valence).

Type:

dict

Examples

>>> molden_paths = ['calc1/scr/optim.molden', 'calc2/scr/optim.molden']
>>> xyz_paths = ['calc1/scr/optim.xyz', 'calc2/scr/optim.xyz']
>>> es = Electrostatics(molden_paths, xyz_paths, dielectric=2.0)
>>>
>>> # For ESP calculations (metal indices required)
>>> es_with_metals = Electrostatics(molden_paths, xyz_paths, esp_atom_idx=[0, 0], dielectric=2.0)
>>> data = es_with_metals.getESP(['Hirshfeld', 'Hirshfeld_I'], 'esp_results', ...)
run_multiwfn(command, input_commands, output_file=None, description='Multiwfn calculation', capture_output=True)

Centralized Multiwfn runner with proper error handling and output display.

Parameters:
  • command (str) – The shell command to run Multiwfn (e.g., “multiwfn file.molden”)

  • input_commands (list of str) – List of commands to send to Multiwfn’s stdin

  • output_file (str, optional) – Path to redirect stdout to a file

  • description (str, optional) – Description of the calculation for error messages

  • capture_output (bool, optional) – If True, capture and display output. If False, let it stream to terminal

Returns:

(stdout, stderr, returncode) from the Multiwfn process

Return type:

tuple

Raises:

RuntimeError – If Multiwfn fails with non-zero exit code

updateCalcSettings(key, value)

Update calculation settings dictionary.

Parameters:
  • key (str) – Setting key to update.

  • value (any) – New value for the setting.

setExcludeAtomFromCalc(lstExcludeAtoms)

Set atoms to exclude from electric field calculations.

Parameters:

lstExcludeAtoms (list of int) – Atom indices (0-indexed) to exclude from E-field calculations.

includePtChgs(name_ptch_file)

Include point charges in ESP calculation (for QMMM).

Parameters:

name_ptch_file (str or list) – If str, uses a single filename (e.g., ‘pointcharges.txt’) that is searched for in each job directory. If list, uses per-job file paths, one per job; use None for jobs without point charges.

Notes

Sets includePtChgs config to True and stores filename/paths. Point charges are typically from MM region in QM/MM calculations.

For filename mode (string), the file is automatically located in each job’s working directory. For list mode, paths may be absolute or relative to the current working directory.

Examples

# Single filename (applies per job directory) estat.includePtChgs(‘pointcharges.txt’)

# Per-job paths estat.includePtChgs([‘/path/job1/ptchg.txt’, None, ‘/path/job3/ptchg.txt’])

set_dielec_scale(dielec)

Set dielectric scaling factor for point charges.

Parameters:

dielec (float) – Scaling factor for dielectric in QMMM calculations.

initialize_excludeAtomsFromEfieldCalc(atom_to_exclude)

Initialize atoms to exclude from electric field calculation.

Parameters:

atom_to_exclude (list of int) – List of atom indices (0-indexed) to exclude.

minDielecBonds(bool_bonds)

Set dielectric to 1 for bonded atoms in ESP calculation.

Parameters:

bool_bonds (bool) – If True, use dielectric=1 for atoms bonded to ESP center. Helps avoid over-estimating screening from bound atoms.

runlowmemory()

Enable low-memory mode for large systems (>300 atoms).

Notes

Switches Hirshfeld-I calculation to slower but lower-memory algorithm. Useful for large systems that would otherwise exceed memory limits.

changeDielectric(dlc)

Change dielectric constant of the environment.

Parameters:

dlc (float) – Dielectric constant of solvent/environment.

Notes

Used to model solvation effects in ESP calculations. Default is 1 (vacuum). Common values: water=80, protein=4.

set_molden_filename(new_name)

Set the expected molden filename.

Parameters:

new_name (str) – Name of the molden file to read.

set_xyzfilename(new_name)

Set the expected XYZ coordinate filename.

Parameters:

new_name (str) – Name of the XYZ file to read.

rePrep()

Re-run data preparation (file extraction and validation).

Notes

Useful if files have been updated or regenerated.

fix_allECPmolden()

Fix molden files that used Effective Core Potentials.

Notes

ECPs can cause artifacts in molden files that make them incompatible with Multiwfn. This method reformats the molden files to fix these issues. Processes all molden/xyz file pairs.

prepData()

Validate that all specified .molden and .xyz files exist.

Checks that all file paths provided during initialization are valid and accessible. Removes invalid entries from processing lists.

Raises:

FileNotFoundError – If any of the specified .molden or .xyz files cannot be found.

prepData_pdb()

Validate PDB charge files for pdb_only mode.

Checks that all PDB files provided during initialization exist. Removes invalid entries from processing lists.

Raises:

FileNotFoundError – If no valid PDB charge files are found.

static getmultipoles(multipole_name)

Parse multipole moments from Multiwfn output file.

Parameters:

multipole_name (str) – Path to multipole moment output file from Multiwfn.

Returns:

List of dictionaries containing multipole data for each atom: - ‘Index’: atom index (str) - ‘Element’: element symbol (str) - ‘Atom_Charge’: atomic charge (float) - ‘Dipole_Moment’: dipole moment vector [x, y, z] (list of float) - ‘Quadrupole_Moment’: 3x3 quadrupole tensor (np.ndarray)

Return type:

list of dict

Notes

Parses Multiwfn output format with sections separated by asterisks. Extracts monopole (charge), dipole, and quadrupole moments for each atom.

getPtChgs(filename_pt)

Load point charges from file (for QMMM calculations).

Parameters:

filename_pt (str) – Path to point charge file (typically from Amber/CHARMM).

Returns:

DataFrame with columns: [‘Atom’, ‘charge’, ‘x’, ‘y’, ‘z’] All atoms labeled as ‘pnt’ (point charge).

Return type:

pd.DataFrame

Notes

File format: whitespace-delimited with 2 header lines to skip.

getPdbCharges(pdb_path)

Load point charges from a PDB file using the B-factor column.

Parameters:

pdb_path (str) – Path to PDB file containing charges in the B-factor column.

Returns:

DataFrame with columns: [‘Atom’, ‘charge’, ‘x’, ‘y’, ‘z’]

Return type:

pd.DataFrame

Notes

Reads ATOM/HETATM records and parses fixed-width PDB columns. The B-factor column is interpreted as the partial charge (e).

static mapcount(filename)

Rapidly count lines in a file using memory mapping.

Parameters:

filename (str) – Path to file to count.

Returns:

Number of lines in the file.

Return type:

int

Notes

Uses mmap for efficient line counting in large files.

monopole_esp(path_to_xyz, espatom_idx, charge_range, charge_file, ptchg_path=None)

Calculate electrostatic potential at specified atom.

Parameters:
  • path_to_xyz (str) – Path to XYZ coordinate file.

  • espatom_idx (int) – Atom index (0-indexed) where ESP will be calculated.

  • charge_range (list of int) – Indices of atoms to include in ESP calculation.

  • charge_file (str) – Path to partial charge file from Multiwfn.

  • ptchg_path (str or None, optional) – Path to point charge file. If None, no point charges are included.

Returns:

[ESP (float), atom_symbol (str)] ESP in Volts at the specified atom.

Return type:

list

Notes

  • Uses Coulomb’s law: V = k * q / r with dielectric screening

  • Point charges from QMMM included if ptchg_path is provided

  • Bonded atoms treated with dielectric=1 if config[‘changeDielectBoundBool’] is True

  • Units: Volts (N·m/C = J/C)

monopole_esp_df(df_charges, espatom_idx, charge_range)

Calculate ESP at a specified atom using charges from a DataFrame.

This is the pdb_only equivalent of monopole_esp(): it reads directly from a DataFrame (e.g. produced by getPdbCharges) instead of a file. The changeDielectBoundBool feature is not supported in this mode because bond detection requires an XYZ file.

Parameters:
  • df_charges (pd.DataFrame) – DataFrame with columns [‘Atom’, ‘x’, ‘y’, ‘z’, ‘charge’]. Typically returned by getPdbCharges().

  • espatom_idx (int) – Atom index (0-indexed) at which the ESP is calculated.

  • charge_range (list of int) – Indices of atoms to include in the ESP sum.

Returns:

[ESP (float in Volts), atom_symbol (str)]

Return type:

list

calc_firstTermE(charge_range, charge_file)

Input: espatom_idx: integer of atom index charge_range: list of integers of atom indices charge_file: string of charge filename Output: list of E-field vector and atomic symbol

monopole_atomicEfield(espatom_idx, charge_range, charge_file, df_ptchg=None)

Input: espatom_idx: integer of atom index charge_range: list of integers of atom indices charge_file: string of charge filename df_ptchg: optional pre-loaded point charge dataframe Output: list of E-field vector and atomic symbol

pointcharge_atomicEfield(espatom_idx, charge_range, df_charges)

Calculate E-field using point charges from a DataFrame only.

Parameters:
  • espatom_idx (int) – Atom index (0-indexed) where the E-field is calculated.

  • charge_range (list or range) – Atom indices to include in the E-field calculation.

  • df_charges (pd.DataFrame) – DataFrame with columns: [‘Atom’, ‘x’, ‘y’, ‘z’, ‘charge’].

Returns:

[E_vec, position_vec, atom_symbol, atom_wise_contributions] E_vec is in V/Angstrom.

Return type:

list

compute_esp(q_vals, probe_coords)

Compute ESP at probe_coords due to point charges at q_coords.

compute_esp_from_qm(qm_charges, mm_coords)

Compute the ESP at MM atoms due to QM charges.

multipole_atomicEfield(idx_atom, charge_range, xyz_file, atom_multipole_file, df_ptchg=None)

Input: idx_atom: integer of atom index charge_range: list of integers of atom indices xyz_file: string of xyz filename atom_multipole_file: string of multipole filename df_ptchg: optional pre-loaded point charge dataframe Output: list of E-field vector and atomic symbol, Efields are reported in units of Volts/Angstrom

multipole_esp(xyzfilepath, atom_multipole_file, charge_range, idx_atom, ptchg_path=None)

Input: idx_atom: integer of atom index charge_range: list of integers of atom indices xyfilepath: string of xyz filename atom_multipole_file: string of multipole filename ptchg_path: optional path to point charge file Output: list of ESP and atomic symbol Units of ESP is Volts!

calc_bond_dipoles(bond_indices, xyz_filepath, atom_multipole_file, bool_multipole, charge_df=None)

Calculate bond dipole moments for given bond indices.

For each bond between atoms A and B, calculates the bond dipole moment vector relative to the geometric center (midpoint) of the bond. This is origin-independent for both neutral and charged bonds: μ_bond = q_A * (r_A - r_center) + q_B * (r_B - r_center) which simplifies to: μ_bond = (q_A - q_B) * (r_A - r_B) / 2

Parameters:
  • bond_indices (list of tuples) – List of (atomA_idx, atomB_idx) tuples specifying bonds

  • xyz_filepath (str) – Path to XYZ structure file

  • atom_multipole_file (str) – Path to multipole/charge file

  • bool_multipole (bool) – If True, use multipole file; if False, use monopole charges

  • charge_df (pd.DataFrame or None) – Optional DataFrame with monopole charges (columns: Atom, x, y, z, charge). If provided, overrides charge file input.

Returns:

List of bond dipole information dictionaries containing: - ‘bond_dipole_vec’: [x, y, z] dipole vector in Debye - ‘bond_dipole_mag’: magnitude of dipole in Debye - ‘bond_indices’: (atomA_idx, atomB_idx) - ‘charges’: (charge_A, charge_B)

Return type:

list

bondEfield(bond_indices, xyz_filepath, atom_multipole_file, all_lines, bool_multipole, df_ptchg=None)

Input: bond_indices: list of tuples of atom indices xyz_filepath: string of xyz filename atom_multipole_file: string of multipole filename all_lines: list of strings of lines in xyz file df_ptchg: optional pre-loaded point charge dataframe Output: list of E-field vector and atomic symbol in units of V/Angstrom

bondEfield_pointcharges(bond_indices, df_charges, all_lines)

Compute bond-projected E-fields using point charges only.

Parameters:
  • bond_indices (list of tuples) – List of (atomA_idx, atomB_idx) tuples specifying bonds.

  • df_charges (pd.DataFrame) – DataFrame with columns: [‘Atom’, ‘x’, ‘y’, ‘z’, ‘charge’].

  • all_lines (list) – Atom indices to include in the E-field calculation.

Returns:

[E_projected, bonded_atoms, bond_indices, bond_lens, E_proj_atomwise, E_proj_atomwise_list]

Return type:

list

esp_coord_shell(metal_idx, charge_file, path_to_xyz, n_shells=1)

Calculate ESP accounting for electrostatic contributions from n coordination shells.

This unified method replaces esp_first_coord() and esp_second_coord() by accepting a parameter to specify how many coordination shells to include.

Parameters:
  • metal_idx (int) – Atom index of the central metal

  • charge_file (str) – Path to charge filename

  • path_to_xyz (str) – Path to xyz structure file

  • n_shells (int, optional) – Number of coordination shells to include (default=1) n_shells=1: first coordination shell only n_shells=2: first and second coordination shells

Returns:

ESP value for the specified coordination shell(s)

Return type:

float

charge_atom(atom_idx)

Compute charges and return total charge and partial charge of atom Input: filename: string of charge filename atom_idx: integer of atom index Output: list of total charge and partial charge of atom

charge_atoms(chg_filename, xyzfilename)

Input: filename: string of charge filename or multipole filename… if the xyzfilename is included must be multipole!

getAtomInfo(atom_idx)

Input: filename: string of charge filename atom_idx: integer of atom index Output: list of total charge and partial charge of atom

getAtomsInfo(atom_indices)

Input: filename: string of charge filename atom_indices: list of integers of atom indices Output: list of total charge and partial charge of atom

esp_bydistance(path_to_xyz, espatom_idx, charge_file)

Input: path_to_xyz: string of xyz filename espatom_idx: integer of atom index charge_file: string of charge filename Output: list of ESP and atomic symbol

getESP(charge_types='CM5', ESPdata_filename='ESP', multiwfn_path='None', use_multipole=False, include_decay=False, include_coord_shells=False, visualize=None, dielectric=1, dielectric_scale=1, includePtChgs=None, ptchg_paths=None, pdb_charge_paths=None)

Unified ESP calculation method supporting multiple modes.

This consolidated method replaces getESPData(), getESPMultipole(), and getESPDecay() to eliminate code duplication while preserving all functionality.

Parameters:
  • charge_types (list of str or str) – Charge partitioning scheme(s). Options: ‘Hirshfeld’, ‘Voronoi’, ‘Mulliken’, ‘Lowdin’, ‘SCPA’, ‘Becke’, ‘ADCH’, ‘CHELPG’, ‘MK’, ‘AIM’, ‘Hirshfeld_I’, ‘CM5’, ‘EEM’, ‘RESP’, ‘PEOE’. Can be a single string (for multipole mode) or list of strings.

  • ESPdata_filename (str) – Output CSV filename (without extension).

  • multiwfn_path (str) – Path to Multiwfn executable. Ignored when pdb_charge_paths is provided.

  • use_multipole (bool, optional) – If True, use multipole expansion; if False, use monopole charges (default: False).

  • include_decay (bool, optional) – If True, include distance-sorted ESP decay analysis (default: False).

  • include_coord_shells (bool, optional) – If True, include first and second coordination shell ESP (default: False).

  • visualize (bool or None, optional) – If True, create PDB file with ESP values in B-factor column. None uses config defaults (default: None).

  • dielectric (float, optional) – Dielectric constant (default: 1).

  • dielectric_scale (float, optional) – Scaling factor for point charges. Point charge contributions are scaled by 1/sqrt(dielectric_scale). Typically used in QM/MM calculations (default: 1).

  • includePtChgs (bool or None, optional) – Override object-level includePtChgs setting for this calculation. If None, uses object-level setting (default: None).

  • ptchg_paths (list of str or None, optional) – Override point charge file paths for this calculation. List of paths (one per job), with None for jobs without point charges. If provided, takes precedence over object-level settings (default: None).

  • pdb_charge_paths (str or list of str or None, optional) – If provided, use partial charges from PDB B-factor columns only (no Multiwfn). Can be a single PDB path or a list of paths (one per job), with None to skip a job. Overrides the object-level pdb_charge_paths if the object was created in pdb_only mode. This mode supports monopole charges only (use_multipole must be False).

Returns:

  • pd.DataFrame – DataFrame with ESP results saved to CSV.

  • Output Files

  • ————

  • **CSV file (*** Auto-generated with format:*)

  • - Monopole (esp_{charge_type}.csv)

  • - Multipole (esp_multipole_{charge_type}.csv)

  • - Examples (esp_Hirshfeld_I.csv, esp_multipole_Hirshfeld_I.csv)

  • - Contains ESP values at metal centers for all jobs

  • - Columns (job name, charge type, ESP values, coordinates, etc.)

  • **PDB file (if visualize=True) (** esp_{charge_type}_{structure}_atom{idx}.pdb)

  • - One file per job and metal atom

  • - B-factor column contains ESP values at each atom

  • - Example (esp_Hirshfeld_I_complex_atom25.pdb)

  • Note (If you provide a custom filename (not ‘output’, ‘esp’, etc.),)

  • your filename will be used instead of automatic naming.

Examples

>>> # Monopole mode (replaces getESPData)
>>> es.getESP(['Hirshfeld', 'Hirshfeld_I'], 'esp_mono', ...)
>>> # Multipole mode (replaces getESPMultipole)
>>> es.getESP('Hirshfeld_I', 'esp_multi', ..., use_multipole=True)
>>> # Decay analysis mode (replaces getESPDecay)
>>> es.getESP(['Hirshfeld'], 'esp_decay', ..., include_decay=True, include_coord_shells=True)
>>> # pdb_only mode (no Multiwfn required)
>>> es.getESP(pdb_charge_paths=['job1.pdb', 'job2.pdb'])
getEfield(charge_types='Hirshfeld_I', Efielddata_filename='ef', multiwfn_path=None, multipole_bool=False, input_bond_indices=[], auto_find_bonds=False, save_atomwise_decomposition=False, visualize=None, dielectric=1, dielectric_scale=1, includePtChgs=None, ptchg_paths=None, pdb_charge_paths=None)

Unified E-field calculation method supporting multiple modes.

This consolidated method replaces getEFieldMultipole(), getEfield_acrossBond(), and getEfield_decomposable() to eliminate code duplication while preserving all functionality.

Parameters:
  • charge_types (str or list of str) – Charge partitioning scheme. If a list is provided, only the first entry is used. Options: ‘Hirshfeld’, ‘Becke’, ‘Hirshfeld_I’, etc. When pdb_charge_paths is provided, this value is used for labeling only.

  • Efielddata_filename (str) – Output CSV filename (without extension).

  • multiwfn_path (str) – Path to Multiwfn executable.

  • multipole_bool (bool, optional) – If True, use multipole expansion; if False, use monopole charges (default: False).

  • input_bond_indices (list, optional) – Bond indices as list of tuples [(atomA, atomB), …]. If empty, bonds are auto-found using metal indices (esp_atom_idx). If provided, bonds are used as-is unless auto_find_bonds=True.

  • auto_find_bonds (bool, optional) – If True, automatically find bonds to adjacent atoms using getBondedAtoms() and ignore input_bond_indices (default: False).

  • save_atomwise_decomposition (bool, optional) – If True, save atom-wise E-field decomposition to CSV (default: False).

  • visualize (bool or None, optional) – Override config visualization settings. None uses config defaults (default: None).

  • dielectric (float, optional) – Dielectric constant (default: 1).

  • dielectric_scale (float, optional) – Scaling factor for point charges. Point charge contributions are scaled by 1/sqrt(dielectric_scale). Typically used in QM/MM calculations (default: 1).

  • includePtChgs (bool or None, optional) – Override object-level includePtChgs setting for this calculation. If None, uses object-level setting (default: None).

  • ptchg_paths (list of str or None, optional) – Override point charge file paths for this calculation. List of paths (one per job), with None for jobs without point charges. If provided, takes precedence over object-level settings (default: None).

  • pdb_charge_paths (str or list of str or None, optional) – If provided, use partial charges from PDB B-factor columns only (no Multiwfn). Can be a single PDB path or a list of paths (one per job), with None to skip a job. This mode supports monopole charges only (multipole_bool must be False).

Returns:

  • pd.DataFrame or tuple of pd.DataFrame – If save_atomwise_decomposition=False: returns DataFrame with E-field results If save_atomwise_decomposition=True: returns (total_df, atomwise_df) tuple

  • Output Files

  • ————

  • **Main CSV file (*** Auto-generated with format:*)

  • - Monopole (ef_{charge_type}.csv)

  • - Multipole (ef_multipole_{charge_type}.csv)

  • - Examples (ef_Hirshfeld_I.csv, ef_multipole_Hirshfeld_I.csv)

  • - Contains E-field magnitudes and projections for all jobs and bonds

  • - Columns (job name, bond atoms, E-field magnitude (MV/cm), projections, etc.)

  • **Atomwise CSV (if save_atomwise_decomposition=True) (****)

  • - Format (ef_{charge_type}_atomwise.csv or ef_multipole_{charge_type}_atomwise.csv)

  • - Example (ef_Hirshfeld_I_atomwise.csv)

  • - Per-atom E-field contributions for each bond

  • - Shows which atoms contribute most to the E-field

  • - Columns (job, bond, atom index, contribution magnitude, etc.)

  • **PDB files (if visualize=True) (****)

  • - Per bond (ef_{charge_type}_{structure}_bond{i}-{j}.pdb)

  • - Combined (ef_{charge_type}_{structure}_combined.pdb)

  • - B-factor contains E-field contribution from each atom

  • - Examples (ef_Hirshfeld_I_complex_bond25-26.pdb)

  • Note (If you provide a custom filename (not ‘output’, ‘efield’, etc.),)

  • your filename will be used instead of automatic naming.

Examples

>>> # Basic multipole mode with specified bonds
>>> es.getEfield('Hirshfeld_I', 'efield', ..., input_bond_indices=[(0,1), (0,2)])
>>> # Auto-find bonds to adjacent atoms
>>> es.getEfield('Hirshfeld_I', 'efield', ..., auto_find_bonds=True)
>>> # Monopole mode with atom-wise decomposition
>>> es.getEfield('Hirshfeld', 'efield', ..., multipole_bool=False, save_atomwise_decomposition=True)
getCharges(charge_types='Hirshfeld_I', multiwfn_path=None, output_filename='charges', write_pdb=False, pdb_bfactor=True)

Compute partial charges for all atoms using specified charge partitioning scheme(s).

This method provides a standalone interface for computing partial charges without calculating E-fields or ESP. Charges can optionally be written to PDB files with the B-factor column containing the charge values.

Parameters:
  • charge_types (str or list of str) – Charge partitioning scheme(s) to use. Options include: ‘RESP’, ‘Hirshfeld’, ‘Hirshfeld_I’, ‘CHELPG’, ‘Mulliken’, ‘Lowdin’, ‘SCPA’, ‘Becke’, ‘ADCH’, ‘Voronoi’, ‘MK’, ‘AIM’, ‘CM5’, ‘EEM’, ‘PEOE’

  • multiwfn_path (str) – Path to the Multiwfn executable.

  • output_filename (str, optional) – Base name for output files (default: ‘charges’). CSV will be saved as ‘{output_filename}.csv’.

  • write_pdb (bool, optional) – If True, create PDB files for each structure (default: False).

  • pdb_bfactor (bool, optional) – If True and write_pdb=True, store charges in B-factor column (default: True). This allows visualization of charges in molecular viewers like PyMOL.

Returns:

  • pd.DataFrame – DataFrame containing partial charges for all atoms in each structure. Columns include: ‘Job’, ‘Charge_Type’, ‘Atom_Index’, ‘Element’, ‘x’, ‘y’, ‘z’, ‘Charge’.

  • Output Files

  • ————

  • **CSV file (** {output_filename}.csv) – Contains all partial charges for all atoms and all jobs.

  • **PDB files (if write_pdb=True) (** {structure}_{charge_type}_charges.pdb) – One PDB file per job per charge type. B-factor column contains the partial charge values.

Examples

>>> es = Electrostatics(molden_paths, xyz_paths)
>>>
>>> # Get RESP charges for all atoms
>>> df = es.getCharges('RESP', '/path/to/multiwfn')
>>>
>>> # Get multiple charge types with PDB output
>>> df = es.getCharges(['RESP', 'Hirshfeld_I'], '/path/to/multiwfn',
...                    output_filename='my_charges', write_pdb=True)
>>>
>>> # Access charges for a specific atom
>>> atom_5_charges = df[df['Atom_Index'] == 5]

Notes

  • Charges are computed via Multiwfn and cached in ‘Charges{type}.txt’ files.

  • Subsequent calls with the same charge type will reuse cached results unless config[‘rerun’] is True.

  • PDB files with B-factor charges can be visualized in PyMOL using: spectrum b, blue_white_red, minimum=-1, maximum=1

get_residues(auto_gen, solute_solvent_dict)
partitionCharge(multipole_bool, f, multiwfn_path, charge_type, owd)

Partition electron density using Multiwfn to generate partial charges or multipole moments.

This is the centralized interface for all charge partitioning schemes (Hirshfeld, CHELPG, etc.). Checks if calculation was previously completed and reuses results unless rerun=True.

Parameters:

multipole_boolbool

True to compute multipole moments, False for monopoles only

fstr

Complete path to folder containing the calculation

multiwfn_pathstr

Path to Multiwfn executable

charge_typestr

Partitioning scheme (e.g., ‘Hirshfeld’, ‘CHELPG’, ‘Hirshfeld_I’)

owdstr

Original working directory

Returns:

comp_costfloat

Computation time in seconds, 0 if calculation was previously completed (skipped), or -1 if calculation failed

Notes:

For Hirshfeld-I calculations, atmrad files are automatically loaded from pyef.resources.atmrad (bundled with the package).

get_residueDipoles(charge_type, multiwfn_path, multipole_bool, solute_indices=[], num_atoms_solvent=0, save_distance_dipoles=False)

Function computes a partial charges of all atoms in one system… if the desired file already exists the just return it!! Inputs: ——- charge_type: string, possible choices include: ‘Hirshfeld’, ‘Voronoi’, ‘Mulliken’, ‘Lowdin’, ‘SCPA’, ‘Becke’, ‘ADCH’, ‘CHELPG’, ‘MK’, ‘AIM’, ‘Hirshfeld_I’, ‘CM5’, ‘EEM’, ‘RESP’, ‘PEOE’}

Also accepts multipole options including: ‘Hirshfeld’: [‘3’, ‘2’], ‘Hirshfeld_I’: [‘4’, ‘1’, ‘2’], ‘Becke’: [‘1’, ‘2’]

getpartialchgs(charge_types, lst_atom_idxs, partial_chg_filename, multiwfn_path)

Function computes partial charges on a select set of atoms using the charge scheme specified in charge types. Note atom indices will be carried over between csvs

Inputs:

– charge_types: list(str)

list of strings

lst_of_atom_idxs: list(int)

list of integers denoting atom indices (0 indexed!)

partial_chg_filename: string

Name of the output file name

multiwfn_path: string

Path to the multiwfn executable

Path to the atmrad executable

Outputs:

– df: Pandas Dataframe with partial charge info

Will Create a csv file entitled partial_chg_filename.csv with partial charge info

compute_dipole(res_chgs, nuc_chgs)

res_coords in units of angstroms, res_chgs are in a.u. units returns dipole in units of Debye

getdipole_residues(res_dict, df)

res_dict: dictionary with strings mapped to list of lists(int) strings denote name of residues which are mapped to the associated atom indices in the lists(0 indexed!)

getcharge_residues(charge_types, res_dict, partial_chg_filename, multiwfn_path, multipole_mode=True, polarization_scheme='Hirshfeld_I')

Function computes partial charges on a select set of atoms using the charge scheme specified in charge types. Note atom indices will be carried over between csvs

Inputs:

charge_types: list(str)

list of strings

res_dict: dictionary with strings mapped to list of lists(int)

strings denote name of residues which are mapped to the associated atom indices in the lists(0 indexed!)

partial_chg_filename: string

Name of the output file name

multiwfn_path: string

Path to the multiwfn executable

Path to the atmrad executable

multipole_mode: boolean

If True, will use multipole mode to compute partial charges, if False will use the standard method

Outputs:

Notes

Will Create a csv file entitled partial_chg_filename.csv with partial charge info

compute_atomwise_dipole_contributions(atomwise_Efield, df_substrate, df_env, df_all_atoms, one_mol, filename)

Compute atom-wise energy contributions from environment E-fields interacting with substrate dipoles.

The interaction energy is: E = -μ·E (negative dot product of dipole moment with electric field)

Parameters:

atomwise_Efielddict

Dictionary with (substrate_idx, env_idx) as keys and E-field vector [Ex, Ey, Ez] (in V/Å) as values

df_substrateDataFrame

DataFrame containing substrate atom information

df_envDataFrame

DataFrame containing environment atom information

df_all_atomsDataFrame

DataFrame containing all atom information (for element names)

one_molfloat

Avogadro’s number

filenamestr

Name of the current file being processed

Returns:

DataFrame with columns: FileName, Env_Atom_Index, Env_Element, Total_Energy_Contribution_kcal_mol,

Substrate_Atom_Index, Substrate_Element, Energy_Contribution_kcal_mol

visualize_atomwise_contributions(df_atomwise, file_path_xyz, charge_file, filename, charge_type)

Create PDB visualization of atom-wise energy contributions.

Parameters:

df_atomwiseDataFrame

DataFrame with atom-wise contribution data from _compute_atomwise_contributions

file_path_xyzstr

Path to XYZ file

charge_filestr

Path to charge/multipole file

filenamestr

Folder/file name for output

charge_typestr

Type of charge calculation used

compute_atomwise_contributions(atomwise_ESP, df_substrate, df_env, df_all_atoms, one_mol, filename)

Compute atom-wise energy contributions from environment atoms to substrate stabilization.

Parameters:

atomwise_ESPdict

Dictionary with (substrate_idx, env_idx) as keys and ESP contribution (in Volts) as values

df_substrateDataFrame

DataFrame containing substrate atom information

df_envDataFrame

DataFrame containing environment atom information

df_all_atomsDataFrame

DataFrame containing all atom information (for element names)

one_molfloat

Avogadro’s number

filenamestr

Name of the current file being processed

Returns:

DataFrame with columns: FileName, Env_Atom_Index, Env_Element, Total_Energy_Contribution_kcal_mol,

Substrate_Atom_Index, Substrate_Element, Energy_Contribution_kcal_mol

build_interaction_tensor(r_i, r_j, multipole_order, dielectric)

Build the multipole interaction tensor T_ij using AMOEBA formalism.

This tensor contains derivatives of 1/r and enables computation of all multipole-multipole interactions via: E = M_i^T · T_ij · M_j

Parameters:

r_i, r_jarray-like

Position vectors of atoms i and j in Angstroms

multipole_orderint

Maximum multipole order (1=monopole, 2=+dipole, 3=+quadrupole)

dielectricfloat

Dielectric constant

Returns:

Tndarray

Interaction tensor with dimensions based on multipole_order - Order 1: 1×1 (monopole-monopole only) - Order 2: 4×4 (monopole + dipole) - Order 3: 9×9 (monopole + dipole + quadrupole, traceless)

Notes:

Tensor elements represent spatial derivatives of Coulomb potential: - T[0,0]: 1/r (monopole-monopole) - T[0,1:4]: ∂(1/r)/∂r_j (monopole-dipole) - T[1:4,1:4]: ∂²(1/r)/∂r_i∂r_j (dipole-dipole) - Higher orders follow similar pattern

build_multipole_vector(atom_data, multipole_order)

Build multipole moment vector M for an atom using AMOEBA formalism.

Parameters:

atom_datadict or pd.Series

Atom data containing ‘Atom_Charge’, ‘Dipole_Moment’, ‘Quadrupole_Moment’

multipole_orderint

Maximum multipole order (1=monopole, 2=+dipole, 3=+quadrupole)

Returns:

Mndarray

Multipole vector [q, μ_x, μ_y, μ_z, Q_xx, Q_xy, …] in SI-based units - Order 1: [q] (1 element) - Order 2: [q, μ_x, μ_y, μ_z] (4 elements) - Order 3: [q, μ_x, μ_y, μ_z, Q_xx, Q_xy, Q_xz, Q_yy, Q_yz] (9 elements, traceless)

Notes:

  • Charges are dimensionless (in units of elementary charge)

  • Dipoles are converted from Bohr to meters

  • Quadrupoles are converted from Bohr² to meters²

  • Uses traceless quadrupole representation (Q_zz = -Q_xx - Q_yy)

getElectrostatic_stabilization(multiwfn_path=None, substrate_idxs=[], charge_type='Hirshfeld_I', name_dataStorage='estatic', env_idxs=None, save_atomwise_decomposition=False, visualize=None, multipole_order=2, substrate_multipole_order=None, env_multipole_order=None, dielectric=1, dielectric_scale=1, includePtChgs=None, ptchg_paths=None, pdb_charge_paths=None)

Compute electrostatic stabilization using AMOEBA-style multipole tensor formalism.

This function implements the complete multipole expansion for intermolecular electrostatic interactions using the tensor formalism:

E_total = Σᵢ∈substrate Σⱼ∈environment M_i^T · T_ij · M_j

where M is the multipole vector and T_ij is the interaction tensor containing all derivatives of the Coulomb potential. This naturally includes ALL multipole-multipole interaction terms:

  • Order 1 (monopole): charge-charge (q×q)

  • Order 2 (+ dipole): + charge-dipole (q×μ) + dipole-dipole (μ×μ)

  • Order 3 (+ quadrupole): + charge-quadrupole (q×Q) + dipole-quadrupole (μ×Q) + quadrupole-quadrupole (Q×Q)

This is the formalism used in the AMOEBA polarizable force field and provides a complete treatment of distributed multipole interactions.

Parameters:

multiwfn_pathstr

Path to Multiwfn executable

substrate_idxslist

List of substrate atom indices (the molecule being stabilized)

charge_typestr, optional

Charge partitioning scheme (default: ‘Hirshfeld_I’)

name_dataStoragestr, optional

Output filename prefix (default: ‘estaticFile_tensor’)

env_idxslist or None, optional

List of environment atom indices. If None, uses all non-substrate atoms (default: None)

save_atomwise_decompositionbool, optional

If True, save atom-wise decomposition to CSV showing each substrate atom’s contribution (default: False)

visualizebool or None, optional

If True, create PDB files with energy contributions in B-factor column. None uses config defaults (default: None)

multipole_orderint, optional

Order of multipole expansion: - 1: monopole only (charge-charge) - 2: monopole + dipole (includes charge-charge, charge-dipole, dipole-dipole) - 3: monopole + dipole + quadrupole (includes all terms up to Q×Q) (default: 2)

substrate_multipole_orderint or None, optional

Multipole order for substrate atoms. If None, uses multipole_order. Set to 1 for QM/MM where substrate is MM (charges only). (default: None)

env_multipole_orderint or None, optional

Multipole order for environment atoms. If None, uses multipole_order. Set to 1 for QM/MM where environment is MM (charges only). (default: None)

dielectricfloat, optional

Dielectric constant (default: 1).

dielectric_scalefloat, optional

Scaling factor for point charges. Point charge contributions are scaled by 1/sqrt(dielectric_scale). Typically used in QM/MM calculations (default: 1).

includePtChgsbool or None, optional

Include point charges as environment atoms only. If None, uses object-level setting (default: None). Point charges are ONLY added to environment, never to substrate.

ptchg_pathslist of str or None, optional

Override point charge file paths for this calculation. List of paths (one per job), with None for jobs without point charges. Point charges are treated as monopole-only environment atoms. If provided, takes precedence over object-level settings (default: None).

pdb_charge_pathsstr or list of str or None, optional

If provided, use partial charges from PDB B-factor columns only (no Multiwfn). Monopole-only (multipole_order is forced to 1). Coordinates are also read from the PDB, so no separate xyz file is needed. Overrides the object-level pdb_charge_paths when the object was created in pdb_only mode.

Returns:

pd.DataFrame or tuple of pd.DataFrame

If save_atomwise_decomposition=False: returns DataFrame with total stabilization energies If save_atomwise_decomposition=True: returns (total_df, atomwise_df) tuple

Output Files

Main CSV file: Auto-generated with format: - Monopole: estab_{charge_type}.csv - Multipole: estab_multipole{N}_{charge_type}.csv (N = multipole order) - Examples: estab_Hirshfeld_I.csv, estab_multipole2_Hirshfeld_I.csv - Contains total electrostatic stabilization energies (kcal/mol) for all jobs - Columns: job name, total energy, substrate/environment details, etc.

Atomwise CSV (if save_atomwise_decomposition=True): - Format: estab_{charge_type}_atomwise.csv or estab_multipole{N}_{charge_type}_atomwise.csv - Examples: estab_Hirshfeld_I_atomwise.csv, estab_multipole2_Hirshfeld_I_atomwise.csv - Per-atom energy contributions showing which atoms contribute most - Columns: job, atom index, energy contribution, coordinates, etc.

PDB file (if visualize=True): estab_{charge_type}_{structure}_tensor_sub{N}_env{M}.pdb - B-factor contains energy contribution from each environment atom to substrate - N = substrate multipole order, M = environment multipole order - Example: estab_Hirshfeld_I_complex_tensor_sub2_env1.pdb

Note: If you provide a custom filename (not ‘output’, ‘estatic’, etc.), your filename will be used instead of automatic naming.

Notes:

Comparison with other methods: - getElectrostatic_stabilization(): Computes q·V (substrate charge in environment potential) - get_Electrostatic_stabilization_dipole(): Computes -μ·E (substrate dipole in environment field) - This function: Computes DIRECT pairwise multipole-multipole interactions

Advantages of tensor formalism: - Complete: automatically includes all multipole-multipole terms - Symmetric: properly treats both substrate and environment multipoles - Efficient: single matrix multiplication per atom pair - Extensible: easy to add higher-order terms

Important: - Requires Multiwfn multipole analysis for orders ≥ 2 - Automatic fallback: If multipole files don’t exist, automatically uses charges-only files - QM/MM support: Substrate and environment can have different multipole orders (e.g., QM substrate with multipoles, MM environment with charges only) - Results are the electrostatic STABILIZATION energy (environment → substrate) - Energy is in kcal/mol - Follows AMOEBA force field formalism (Ren & Ponder, J. Phys. Chem. B 2003)

Examples:

>>> # Compute monopole + dipole + dipole-dipole interactions
>>> df = estat.get_Electrostatic_stabilization_tensor(
...     multiwfn_path,
...     substrate_idxs=[[0, 1, 2]],  # 3-atom substrate
...     multipole_order=2  # Include dipole-dipole terms
... )
>>> # Include quadrupole terms with atom-wise decomposition
>>> df, df_atomwise = estat.get_Electrostatic_stabilization_tensor(
...     multiwfn_path,
...     substrate_idxs=[[0, 1, 2]],
...     multipole_order=3,
...     save_atomwise_decomposition=True
... )
>>> # QM/MM: QM substrate (with dipoles) interacting with MM environment (charges only)
>>> df = estat.get_Electrostatic_stabilization_tensor(
...     multiwfn_path,
...     substrate_idxs=[[0, 1, 2]],  # QM region
...     substrate_multipole_order=2,  # QM: charges + dipoles
...     env_multipole_order=1         # MM: charges only
... )
visualize_charges_pdb(xyz_file, charges, pdb_name)

Create a PDB file with charges visualized in the b-factor column.

Parameters:
  • xyz_file (str) – Path to the XYZ structure file

  • charges (array-like) – Array of atomic charges to visualize

  • pdb_name (str) – Output PDB filename

Returns:

Writes PDB file to disk

Return type:

None