Optimisation and Bayesian optimality
This doctoral thesis will present the results of work into optimisation algorithms. We first give a detailed exploration of the problems involved in comparing optimisation algorithms. In particular we provide extensions and refinements to no free lunch results, exploring algorithms with arbitrary stopping conditions, optimisation under restricted metrics, parallel computing and free lunches, and head-to-head minimax behaviour. We also characterise no free lunch results in terms of order statistics. We then ask what really constitutes understanding of an optimisation algorithm. We argue that one central part of understanding an optimiser is knowing its Bayesian prior and cost function. We then pursue a general Bayesian framing of optimisation, and prove that this Bayesian perspective is applicable to all optimisers, and that even seemingly non-Bayesian optimisers can be understood in this way. Specifically we prove that arbitrary optimisation algorithms can be represented as a prior and a cost function. We examine the relationship between the Kolmogorov complexity of the optimiser and the Kolmogorov complexity of it’s corresponding prior. We also extended our results from deterministic optimisers to stochastic optimisers and forgetful optimisers, and we show that uniform randomly selecting a prior is not equivalent to uniform randomly selecting an optimisation behaviour. Lastly we consider what the best way to go about gaining a Bayesian understanding of real optimisation algorithms is. We use the developed Bayesian framework to explore the affects of some common approaches to constructing meta-heuristic optimisation algorithms, such as on-line parameter adaptation. We conclude by exploring an approach to uncovering the probabilistic beliefs of optimisers with a “shattering” method.