A Novel Ranking Scheme for the Performance Analysis of Stochastic Optimization Algorithms using the Principles of Severity
Sowmya Chandrasekaran,
Thomas Bartz-Beielstein
Chapter/contribution from the book: Schulte, H et al. 2024. Proceedings - 34. Workshop Computational Intelligence: Berlin, 21.-22. November 2024.
Stochastic optimization algorithms have been successfully applied in several domains to find optimal solutions. As integrated systems grow more complex, new stochastic algorithms continue to be proposed, which makes analyzing their performance increasingly important. This paper provides a novel scheme for ranking algorithms over multiple single-objective optimization problems. The results of the algorithms are compared using a robust bootstrapping-based hypothesis testing procedure that is based on the principles of severity. Analogous to the football league scoring scheme, we propose pairwise comparison of algorithms, as in a league competition. Each algorithm accumulates points together with a performance metric that captures how well or how poorly it performed against the other algorithms, analogous to the goal-difference metric in the football league scoring system. The goal-difference metric can be used not only as a tie-breaker but also as a quantitative measure of each algorithm's performance. The key novelty of the proposed ranking scheme is that it accounts for the magnitude of the achieved performance improvement along with its practical relevance, and that it makes no distributional assumptions. To demonstrate the advantages of the proposed ranking scheme, we compare the expected run-time metrics of three hyperparameter optimization (HPO) procedures, namely Irace, mixed-integer parallel efficient global optimization (MIP-EGO), and the mixed-integer evolution strategy (MIES), along with (1+1)EA and grid search (GS), applied to a genetic algorithm framework on the Pseudo-Boolean Optimization (PBO) suite of 25 problems. The proposed ranking scheme is compared to classical hypothesis testing. The analysis shows that the two approaches yield comparable results, while the proposed ranking additionally provides a quantitative performance measure and accounts for practical relevance.
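The league-style scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the 3/1/0 points for win/draw/loss (borrowed from football), the practical-relevance threshold `delta`, the use of the mean difference as the "goal difference", and the function names `bootstrap_better` and `league_table`. In particular, the decision rule here is a plain one-sided percentile bootstrap on the difference of means, standing in for the severity-based procedure, which the abstract does not specify.

```python
import random
from itertools import combinations
from statistics import mean


def bootstrap_better(a, b, delta=0.0, n_boot=2000, alpha=0.05, seed=0):
    """Return True if runs `a` beat runs `b` by more than `delta` (minimization).

    Plain percentile bootstrap on mean(b) - mean(a); an assumed stand-in,
    NOT the severity-based test used in the paper.
    """
    rng = random.Random(seed)
    # Resample each algorithm's runs independently, with replacement.
    boot = sorted(
        mean(rng.choices(b, k=len(b))) - mean(rng.choices(a, k=len(a)))
        for _ in range(n_boot)
    )
    lower = boot[int(alpha * n_boot)]  # one-sided lower confidence bound
    return lower > delta


def league_table(results, delta=0.0):
    """Rank algorithms from results[algorithm][problem] -> list of run values.

    Smaller values are better. Points 3/1/0 are an assumption taken from football.
    """
    table = {name: {"points": 0, "goal_diff": 0.0} for name in results}
    for x, y in combinations(results, 2):
        for problem in results[x]:
            a, b = results[x][problem], results[y][problem]
            if bootstrap_better(a, b, delta):
                table[x]["points"] += 3
            elif bootstrap_better(b, a, delta):
                table[y]["points"] += 3
            else:  # no practically relevant difference detected: a draw
                table[x]["points"] += 1
                table[y]["points"] += 1
            gap = mean(b) - mean(a)  # positive when x performed better
            table[x]["goal_diff"] += gap
            table[y]["goal_diff"] -= gap
    # Sort by points, using goal difference as the tie-breaker.
    return sorted(
        table.items(),
        key=lambda kv: (kv[1]["points"], kv[1]["goal_diff"]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy numbers for illustration only, not results from the paper.
    toy = {
        "A": {"p1": [1.0, 1.1, 0.9, 1.0, 1.05]},
        "B": {"p1": [2.0, 2.1, 1.9, 2.0, 2.05]},
        "C": {"p1": [3.0, 3.1, 2.9, 3.0, 3.05]},
    }
    for name, row in league_table(toy):
        print(name, row["points"], round(row["goal_diff"], 3))
```

Raising `delta` in this sketch turns small but statistically detectable gaps into draws, which is one simple way to express the idea that an improvement should also be practically relevant before it earns points.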