Benchmark and Time Estimation ============================== Overview -------- The library includes a time estimation system that helps predict how long ORCA calculations will take. This is essential for planning large-scale QSAR descriptor calculations. How It Works ------------ The time estimation system uses a benchmark calculation to calibrate performance on your machine: 1. **Benchmark Calculation**: A single-point calculation is run on benzene (C₁H₆) to measure: - Number of basis functions - Time per SCF cycle - Total calculation time 2. **Scaling Formula**: For new molecules, the system estimates time using: - Molecular size scaling (O(N².⁵) for DFT calculations) - Method type (SP, Opt, Freq) - Number of processors 3. **Parameter Scaling**: The system automatically adjusts for different: - Number of processors (accounts for parallel efficiency) - Basis sets (scales with size) - Functionals (accounts for computational cost) Why It's Needed --------------- * **Planning**: Estimate total time for large datasets * **Resource Management**: Allocate computational resources efficiently * **Progress Tracking**: Monitor calculation progress * **Optimization**: Choose optimal parameters for your hardware Running a Benchmark ------------------- Using CLI ~~~~~~~~~ .. code-block:: bash orca_descriptors run_benchmark --working_dir ./calculations The benchmark uses benzene as a standard test molecule and saves results to ``.orca_benchmark.json``. Using Python ~~~~~~~~~~~~ .. code-block:: python from orca_descriptors import Orca orca = Orca(working_dir="./calculations") benchmark_data = orca.run_benchmark() print(f"SCF cycle time: {benchmark_data['scf_time']:.2f} seconds") print(f"Number of basis functions: {benchmark_data['n_basis']}") Estimating Calculation Time ---------------------------- Using CLI ~~~~~~~~~ .. code-block:: bash orca_descriptors approximate_time --molecule CCO --method_type Opt This estimates time without running the actual calculation. Using Python ~~~~~~~~~~~~ .. code-block:: python from orca_descriptors import Orca from rdkit.Chem import MolFromSmiles, AddHs orca = Orca(working_dir="./calculations") mol = AddHs(MolFromSmiles("CCO")) estimated_time = orca.estimate_calculation_time(mol) print(f"Estimated time: {estimated_time:.2f} seconds") Automatic Parameter Scaling --------------------------- The system automatically scales benchmark data for different parameters. You don't need to re-run the benchmark if you change: * **Number of processors**: Automatically accounts for parallel efficiency * **Basis set**: Scales based on basis set size (O(N³.⁵)) * **Functional**: Adjusts for relative computational cost Example: If your benchmark was run with 1 processor and def2-SVP, you can estimate time for 4 processors and def2-TZVP without re-running the benchmark. Benchmark File Location ----------------------- The benchmark data is saved to:: /.orca_benchmark.json This file contains: * Functional and basis set used * Number of processors * Number of basis functions * SCF cycle time * Total calculation time You can share this file between different working directories if using the same hardware and ORCA version.