Changelog¶
Version 0.3.4¶
Added¶
Added clear CLI command to remove all ORCA files in the working directory (useful for cleaning up files that weren’t removed due to errors)
Added purge_cache CLI command to remove ORCA cache
Changed¶
Improved cache system to preserve original file extensions (.out, .log, .smd.out) when storing files
Enhanced NBO stabilization energy parsing with better error messages indicating NBOEXE environment variable requirement
Updated test for AM1 Mayer bond indices to account for different values compared to DFT methods
Removed¶
Removed ESP extrema descriptor (get_esp_extrema) - not implemented due to limitations in ORCA 6.0.1 for generating ESP cube files directly (would require orca_plot utility integration)
Fixed¶
Fixed cache system issue where files were saved with .out extension regardless of original extension, causing file not found errors
Fixed NBO stabilization energy parsing to properly detect when NBOEXE environment variable is not set
Fixed NMR chemical shifts parsing to work correctly with cached files
Fixed test for AM1 Mayer bond indices to accept different value ranges compared to DFT
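The extension fix can be illustrated with a minimal sketch. This is not the project's actual cache code: the function names, the key-based file layout, and the longest-match rule for .smd.out are assumptions made for illustration.

```python
import shutil
from pathlib import Path

# Extensions named in this changelog. The longest one is checked first so
# that "job.smd.out" keeps ".smd.out" rather than just ".out".
KNOWN_EXTENSIONS = (".smd.out", ".out", ".log")


def output_extension(path: Path) -> str:
    """Return the recognised output extension, falling back to the plain suffix."""
    for ext in KNOWN_EXTENSIONS:
        if path.name.endswith(ext):
            return ext
    return path.suffix


def store_in_cache(src: Path, cache_dir: Path, key: str) -> Path:
    """Copy src into cache_dir as <key><original extension> (hypothetical layout)."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    dest = cache_dir / f"{key}{output_extension(src)}"
    shutil.copy2(src, dest)
    return dest
```

Because the stored name carries the original extension, a later lookup for the same key and extension finds the file instead of looking for a .out file that was never the original.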
Technical Details¶
Cache now preserves original file extensions to ensure correct file retrieval
NBO analysis requires NBOEXE environment variable to be set to point to NBO executable (nbo6.exe or nbo5.exe)
ESP extrema calculation would require integration with orca_plot utility, which is not currently implemented
CLI commands clear and purge_cache help keep working directories clean and manage the cache
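A check for the NBOEXE requirement could look roughly like the sketch below. The function name and the error text are hypothetical; only the NBOEXE variable and the executable names come from the notes above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_nbo_executable(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the path stored in NBOEXE, or raise with an actionable message.

    env defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    value = env.get("NBOEXE", "").strip()
    if not value:
        raise RuntimeError(
            "NBOEXE environment variable is not set; point it to the NBO "
            "executable (e.g. nbo6.exe or nbo5.exe) before requesting "
            "NBO stabilization energies."
        )
    return Path(value)
```

Failing early with a message that names the variable avoids a confusing parse error later, when the NBO section is simply missing from the ORCA output.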
Version 0.3.3¶
Changed¶
Major code refactoring: split large orca.py file (1620 lines) into modular structure:
* Created base.py with OrcaBase class containing common utility methods
* Created calculation.py with CalculationMixin for calculation execution methods
* Created decorators.py with handle_x_molecule decorator
* Split descriptors into separate modules by category:
  * descriptors/electronic.py - Electronic property descriptors
  * descriptors/energy.py - Energy-related descriptors
  * descriptors/structural.py - Structural property descriptors
  * descriptors/topological.py - Topological descriptors
  * descriptors/misc.py - Miscellaneous descriptors
Main Orca class now uses multiple inheritance from mixins
Improved code organization and maintainability
Removed redundant comments throughout the codebase
Technical Details¶
The new modular structure makes it easier to extend functionality and maintain code
Descriptors are organized by category for better code navigation
All functionality remains backward compatible
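The mixin layout can be pictured with a toy sketch. OrcaBase, CalculationMixin, and Orca are names from this changelog; ElectronicDescriptorsMixin, every method name, and every method body are placeholders invented for illustration and do not reflect the real implementation.

```python
class OrcaBase:
    """Stand-in for base.py: shared utility methods."""

    def _label(self, smiles: str) -> str:
        return f"{type(self).__name__}:{smiles}"


class CalculationMixin:
    """Stand-in for calculation.py: calculation execution methods."""

    def run_calculation(self, smiles: str) -> str:
        # Placeholder: the real method would launch ORCA.
        return f"calculated {self._label(smiles)}"


class ElectronicDescriptorsMixin:
    """Hypothetical mixin name for descriptors/electronic.py."""

    def example_electronic_descriptor(self, smiles: str) -> str:
        # Descriptor mixins rely on methods supplied by the other bases.
        return f"electronic({self.run_calculation(smiles)})"


class Orca(ElectronicDescriptorsMixin, CalculationMixin, OrcaBase):
    """Main class: behavior is assembled from the mixins."""
```

Callers still work with a single Orca object, which is how a split like this can stay backward compatible while the code lives in separate modules.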
Version 0.3.0¶
Added¶
Added new ORCABatchProcessing class for efficient batch processing of molecular descriptors with pandas compatibility
Added support for semi-empirical methods (AM1, PM3, PM6, PM7, RM1, MNDO, MNDOD, OM1, OM2, OM3)
Added pre_optimize parameter (default: True) for pre-optimizing molecular geometry using MMFF94 force field before ORCA calculations
Added multiprocessing support for parallel batch processing via parallel_mode="multiprocessing" parameter
Added automatic cleanup of all ORCA temporary files (input, output, and all temporary files) after calculations since results are cached
Added improved error parsing with brief summaries in logging.INFO and detailed information in logging.DEBUG
Added _pre_optimize_geometry() method for MMFF94 geometry optimization
Added _is_semi_empirical() method to detect semi-empirical methods
Changed¶
Refactored batch processing functionality from Orca.calculate_descriptors() into dedicated ORCABatchProcessing class
Orca.calculate_descriptors() now uses ORCABatchProcessing internally for backward compatibility
calculate_descriptors() now preserves original DataFrame columns (including ‘smiles’) instead of removing and re-adding them
Improved molecule hash calculation to include pre_optimize parameter for proper caching
Updated molecule hash calculation to exclude basis_set and dispersion_correction for semi-empirical methods
Enhanced input file generation to support semi-empirical methods (no basis set or dispersion correction needed)
Improved file cleanup to remove all ORCA files (including input and output files) since results are cached
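The two hash changes above can be sketched as follows. The function names, the SHA-256 choice, and the key layout are assumptions; only the method list and the include/exclude rules come from this changelog.

```python
import hashlib
from typing import Optional

# Semi-empirical methods listed in this release.
SEMI_EMPIRICAL_METHODS = frozenset(
    {"AM1", "PM3", "PM6", "PM7", "RM1", "MNDO", "MNDOD", "OM1", "OM2", "OM3"}
)


def is_semi_empirical(method: str) -> bool:
    return method.strip().upper() in SEMI_EMPIRICAL_METHODS


def molecule_hash(
    smiles: str,
    method: str,
    basis_set: Optional[str] = None,
    dispersion_correction: Optional[str] = None,
    pre_optimize: bool = True,
) -> str:
    """Build a cache key from the settings that affect the result."""
    parts = [smiles, method.strip().upper(), f"pre_optimize={pre_optimize}"]
    if not is_semi_empirical(method):
        # Basis set and dispersion only matter for non-semi-empirical methods.
        parts.append(f"basis_set={basis_set}")
        parts.append(f"dispersion={dispersion_correction}")
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

With this rule, an AM1 run hits the same cache entry whatever basis set the caller happened to pass, while toggling pre_optimize always produces a different key.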
Fixed¶
Fixed DataFrame handling in batch processing to preserve all original columns
Fixed error handling to provide concise error messages at INFO level and detailed information at DEBUG level
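The INFO/DEBUG split might be implemented along these lines. The helper name, the logger name, and the rule of taking the first line containing "error" as the summary are assumptions, not the project's actual parser.

```python
import logging

logger = logging.getLogger("orca_sketch")  # hypothetical logger name


def report_orca_error(output_text: str, log: logging.Logger = logger) -> str:
    """Log a one-line summary at INFO and the full text at DEBUG."""
    lines = [line.strip() for line in output_text.splitlines() if line.strip()]
    error_lines = [line for line in lines if "error" in line.lower()]
    summary = error_lines[0] if error_lines else "calculation failed (no error line found)"
    log.info("ORCA error: %s", summary)
    log.debug("Full ORCA error output:\n%s", output_text)
    return summary
```

Users running at the default INFO level see one line per failure; switching the logger to DEBUG exposes the complete output for troubleshooting.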
Technical Details¶
ORCABatchProcessing supports three parallelization modes: “sequential”, “multiprocessing”, and “mpirun”
Pre-optimization uses RDKit’s MMFF94 force field for fast geometry optimization before quantum chemical calculations
All ORCA files are automatically cleaned up after successful calculations, with results stored in cache
Semi-empirical methods are automatically detected and handled differently from DFT methods
Batch processing now includes time estimation based on benchmark machine performance
Version 0.2.2¶
Added¶
Added numpy>=1.20.0 to project dependencies (numpy was used but not declared)
Added dynamic time estimation updates in batch processing - time estimates are now refined based on actual execution times of previous molecules
Added _get_available_descriptors() method to dynamically discover available descriptor methods
Changed¶
Updated dipole moment parser to prioritize gas-phase values when available (for calculations without solvation)
Improved time estimation algorithm:
* Changed scaling exponent from O(N^3.5) to O(N^2.5) for more realistic estimates
* Uses total_time from benchmark instead of scf_time as base unit
* More realistic optimization step estimation (15-35 steps instead of 10-50)
* Removed artificial 24-hour time cap
Refactored calculate_descriptors() method:
* Removed redundant code duplication (replaced large if-elif chain with getattr-based method calls)
* Removed redundant all_descriptors list - descriptors are now discovered dynamically
* Removed unnecessary comments
* Improved code maintainability and readability
Fixed¶
Fixed dipole moment parser to correctly extract gas-phase values from ORCA output when available
Fixed time estimation showing unrealistic values (e.g., 47 hours for 2 molecules) - now provides accurate estimates based on actual benchmark data
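A rough sketch of how the revised estimate could combine O(N^2.5) scaling with refinement from observed run times is shown below. The class, its parameters, the smoothing factor alpha, and the meaning of "size" (the changelog does not say what N counts) are all assumptions.

```python
class TimeEstimator:
    """Scale a benchmark total_time by (N / N_benchmark) ** 2.5 and refine it.

    The refinement keeps an exponential moving average of the ratio
    actual / predicted and applies it as a correction factor.
    """

    def __init__(self, benchmark_total_time: float, benchmark_size: int,
                 exponent: float = 2.5, alpha: float = 0.3) -> None:
        self.benchmark_total_time = benchmark_total_time
        self.benchmark_size = benchmark_size
        self.exponent = exponent
        self.alpha = alpha          # weight given to the newest observation
        self.correction = 1.0       # EMA of actual / predicted

    def raw_estimate(self, size: int) -> float:
        scale = (size / self.benchmark_size) ** self.exponent
        return self.benchmark_total_time * scale

    def estimate(self, size: int) -> float:
        return self.correction * self.raw_estimate(size)

    def update(self, size: int, actual_seconds: float) -> None:
        ratio = actual_seconds / self.raw_estimate(size)
        self.correction = self.alpha * ratio + (1.0 - self.alpha) * self.correction
```

For example, with a 10-second benchmark at size 10, a molecule of size 40 is first estimated at 10 * 4^2.5 = 320 seconds; each completed molecule then nudges the correction factor toward the machine's real speed.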
Technical Details¶
Time estimator now uses exponential moving average for better prediction accuracy
Descriptor methods are called dynamically using getattr(self, desc_name)
Automatic descriptor discovery eliminates need to maintain manual descriptor lists
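The getattr-based dispatch and dynamic discovery can be sketched with a toy class. The desc_ prefix used to recognise descriptor methods, the class name, and the two toy descriptors are assumptions; the changelog does not state how the real _get_available_descriptors() decides what counts as a descriptor.

```python
import inspect
from typing import Dict, List, Optional


class ToyDescriptors:
    """Toy stand-in showing discovery plus getattr dispatch."""

    def desc_length(self, smiles: str) -> float:
        return float(len(smiles))

    def desc_carbon_count(self, smiles: str) -> float:
        return float(smiles.count("C"))

    def _get_available_descriptors(self) -> List[str]:
        # Discover descriptor methods by (hypothetical) naming convention.
        members = inspect.getmembers(self, predicate=inspect.ismethod)
        return sorted(name for name, _ in members if name.startswith("desc_"))

    def calculate(self, smiles: str,
                  descriptors: Optional[List[str]] = None) -> Dict[str, float]:
        names = descriptors or self._get_available_descriptors()
        # One getattr call replaces a long if-elif chain over descriptor names.
        return {name: getattr(self, name)(smiles) for name in names}
```

Adding a new descriptor then only requires defining one more method; no separate list or dispatch branch has to be updated.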