Uncertainty Quantification

potfit uses the potential ensemble method ¹⁾ to quantify the uncertainty in fitted potential parameters. It generates an ensemble of candidate potentials of varying suitability, sampled from the region of parameter space around the best-fit potential.

To enable this feature, compile potfit with the `uq` option (see compiling).

<span style="color:red">This option is only available for analytic potentials.</span>

The UQ run generates an ensemble of potentials whose spread can be used to quantify the uncertainties in the fitted parameters. An uncorrelated subsample of the MCMC output forms a potential ensemble, which represents the uncertainty in each parameter through the ensemble spread and covariance. Propagating the ensemble members through molecular dynamics then gives the resulting uncertainties in quantities of interest. For an example of this see ²⁾.
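As an illustration of the subsampling step, the sketch below thins a chain of parameter sets and computes the ensemble spread and covariance. It is not potfit code: the function names and the fixed `stride` used for thinning are assumptions, and the in-memory list of parameter vectors is a stand-in for whatever is read from the `.uq` output.

```python
import statistics

def thin_chain(chain, stride):
    """Keep every `stride`-th sample (hypothetical thinning rule)."""
    return chain[::stride]

def ensemble_spread(ensemble):
    """Return per-parameter means and sample standard deviations."""
    columns = list(zip(*ensemble))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]
    return means, stdevs

def covariance(ensemble):
    """Sample covariance matrix of the parameters as a nested list."""
    n = len(ensemble)
    columns = list(zip(*ensemble))
    means = [statistics.fmean(col) for col in columns]
    return [[sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / (n - 1)
             for cb, mb in zip(columns, means)]
            for ca, ma in zip(columns, means)]

# Toy chain of two-parameter samples; values are made up for illustration.
chain = [[1.0, 2.0], [9.0, 9.0], [3.0, 6.0], [9.0, 9.0], [5.0, 10.0]]
members = thin_chain(chain, stride=2)
means, stdevs = ensemble_spread(members)
cov = covariance(members)
```

A suitable stride depends on the autocorrelation of the chain, which has to be checked for each fit.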

The Ensemble Method

An ensemble of potentials is generated by taking a series of Markov chain Monte Carlo steps starting from the best-fit potential parameters. The step size in each parameter direction is scaled according to the curvature of the cost landscape in that direction. This is encoded using the eigenvalues and eigenvectors of the Hessian at the best-fit potential minimum, for potential parameters $\Theta=\{\theta_1,..., \theta_N\}$.

\begin{equation} \Delta\theta_{i}=\sum_{j=1}^{N}\sqrt{\frac{R}{\max(1,\lambda_{j})}}\,V_{ij}r_{j} \end{equation}

where $\lambda_j$ are the Hessian eigenvalues, $V_{ij}$ the eigenvector components and $r_j$ is Gaussian noise. The value $R$, set by `acc_rescaling`, is a tunable parameter that controls the MCMC step acceptance rate. The 1 in $\max(1,\lambda_j)$ can be replaced by another value using `eig_max`.
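The step formula can be sketched in a few lines. This is an illustrative reading of the equation, not the potfit implementation: the function name and argument layout are hypothetical, and the noise $r_j$ is assumed to have zero mean and unit variance, which the text does not specify.

```python
import math
import random

def mcmc_step(eigenvalues, eigenvectors, R, eig_max=1.0, rng=random):
    """Return the trial displacement of each parameter.

    eigenvectors[i][j] is component i of eigenvector j (V_ij).
    """
    n = len(eigenvalues)
    # One Gaussian random number r_j per eigen-direction
    # (unit variance is an assumption).
    r = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Stiff directions (large lambda_j) get small steps; eigenvalues
    # below eig_max are treated as eig_max.
    scale = [math.sqrt(R / max(eig_max, lam)) for lam in eigenvalues]
    # Rotate from the eigenbasis back to the parameter basis.
    return [sum(scale[j] * eigenvectors[i][j] * r[j] for j in range(n))
            for i in range(n)]

# Toy two-parameter example with axis-aligned eigenvectors.
step = mcmc_step([4.0, 0.25], [[1.0, 0.0], [0.0, 1.0]], R=1.0)
```

Directions with eigenvalues below `eig_max` all receive the same scale $\sqrt{R/\texttt{eig\_max}}$, which keeps steps along very flat directions bounded.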

The MCMC algorithm samples potentials from the distribution at a temperature, $T_0$, set by the number of potential parameters $N$ and the minimum cost value $C_0$. In the majority of cases this temperature should be sufficient to generate a suitable ensemble. In the event that a reduced sampling temperature is required, it can be scaled by a parameter $\alpha$ (`uq_temp`), such that $T=\alpha T_0$.
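A minimal sketch of the temperature scaling follows. It takes $T_0 = 2C_0/N$, which follows from the expression $C_T = C_0 + \frac{2\alpha C_0}{N}$ given in the Hessian Bracketing Algorithm section. The acceptance test shown is the standard Metropolis criterion and is an assumption here; this page does not state which rule potfit applies. Function names are hypothetical.

```python
import math
import random

def sampling_temperature(c0, n_params, alpha=1.0):
    """T = alpha * T_0 with T_0 = 2 * C_0 / N."""
    return alpha * 2.0 * c0 / n_params

def accept_move(cost_old, cost_new, temperature, rng=random):
    """Standard Metropolis test (assumed, not confirmed by this page)."""
    delta = cost_new - cost_old
    if delta <= 0.0:
        return True  # downhill moves are always accepted
    # Uphill moves are accepted with probability exp(-delta / T).
    return rng.random() < math.exp(-delta / temperature)

# Toy numbers: minimum cost 10 with 4 parameters, half the default temperature.
T = sampling_temperature(c0=10.0, n_params=4, alpha=0.5)
```

Lowering `alpha` narrows the sampled distribution, so fewer high-cost potentials enter the ensemble.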

Hessian Bracketing Algorithm

Only use this if you know what you are doing!

If `hess_pert = -1`, the parameter perturbations used in the finite-difference calculation of the Hessian are found individually for each parameter. This algorithm can be used as a diagnostic tool to understand the curvature on the length scale of the sampling temperature. However, care should be taken when analysing the results, as many assumptions about the cost minimum are inherently made (e.g. that the landscape at the height of the sampling temperature is harmonic). Each parameter is perturbed to bracket the perturbation value that yields the cost set by the sampling temperature, $C_T = C_0 + T = C_0 +\frac{2\alpha C_0}{N}$, where $C_0$ is the minimum cost. When the bracketing interval is within 5% of $C_T$, a line is drawn between the two bounds and its gradient is used to choose the perturbation value estimated to give a cost of $C_T$.
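The bracketing idea can be illustrated for a single parameter as below. This is a sketch under stated assumptions, not potfit's routine: the doubling and bisection strategy, the starting perturbation, the reading of "within 5%" as applying to the cost at both bounds, and all names are hypothetical. Only the target cost $C_T$ and the final straight-line estimate are taken from the description above.

```python
def bracket_perturbation(cost, c_target, start=1e-5, tol=0.05, max_iter=200):
    """Estimate delta > 0 such that cost(delta) is close to c_target.

    `cost(delta)` is the cost with one parameter shifted by delta;
    cost(0) is assumed to lie below c_target.
    """
    lo, hi = 0.0, start
    c_lo, c_hi = cost(lo), cost(hi)
    # Grow the upper bound until the target cost is bracketed.
    for _ in range(max_iter):
        if c_hi >= c_target:
            break
        lo, c_lo = hi, c_hi
        hi *= 2.0
        c_hi = cost(hi)
    else:
        raise RuntimeError("target cost was not bracketed")
    # Bisect until the cost at both bounds is within `tol` of the target.
    for _ in range(max_iter):
        if (abs(c_lo - c_target) <= tol * c_target
                and abs(c_hi - c_target) <= tol * c_target):
            break
        mid = 0.5 * (lo + hi)
        c_mid = cost(mid)
        if c_mid < c_target:
            lo, c_lo = mid, c_mid
        else:
            hi, c_hi = mid, c_mid
    # Straight line between the bounds; its gradient gives the estimate.
    gradient = (c_hi - c_lo) / (hi - lo)
    return lo + (c_target - c_lo) / gradient

# Toy harmonic cost slice with C_0 = 1, N = 4 parameters and alpha = 1.
c0, n_params, alpha = 1.0, 4, 1.0
c_t = c0 + 2.0 * alpha * c0 / n_params
toy_cost = lambda d: c0 + 100.0 * d * d
delta = bracket_perturbation(toy_cost, c_t)
```

For a harmonic slice the straight-line estimate lands close to the exact perturbation. For an anharmonic landscape the result depends more strongly on how tight the bracket is.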

If the landscape at this scale is not harmonic, the Hessian can have negative eigenvalues. In this case a reduced sampling temperature may be required, and the user should consider improving the reference data used in the fit, as well as the suitability and possible limitations of the potential model being used.

Parameters

parameter name	parameter type	default value
short explanation.

Required parameters

acc_rescaling*	float	(none)
R value to tune MCMC acceptance rate.

acc_moves*	integer	(none)
Number of accepted MCMC moves required.

Optional parameters

ensemblefile	string	`startpot`
Potential ensemble output filename, `ensemblefile.uq`. If this is not defined then `output_prefix.uq` is used. Should neither `ensemblefile` nor `output_prefix` be defined, the `startpot` filename is used, with a '.uq' extension.

uq_temp	float	1.0
Temperature scaling parameter $\alpha$.

use_svd	boolean	0
Use singular value decomposition to find Hessian eigenvalues (default is eigenvalue decomposition).

hess_pert	float	0.00001
Percentage parameter perturbation in Hessian finite difference calculation. (If `hess_pert = -1` a bracketing algorithm is used to find individual parameter perturbation values, see explanation above. Only use this if you know what you are doing!)

eig_max	float	1.0
Alternative value for the 1 in $\max(1, \lambda_j)$ of the MCMC step size, giving max(`eig_max`, $\lambda_j$).

write_ensemble	integer	0
Write a potential file for every `write_ensemble`-th ensemble member.

¹⁾

Frederiksen, S. L., Jacobsen, K. W., Brown, K. S., and Sethna, J. P.: Bayesian ensemble approach to error estimation of interatomic potentials. Phys. Rev. Lett. 93 (16), 165501, 2004.

²⁾

Longbottom, S., Brommer, P.: Uncertainty Quantification for Classical Effective Potentials. arxiv link
