TY - GEN

T1 - Empirical Risk Minimization with Relative Entropy Regularization

T2 - 2022 IEEE International Symposium on Information Theory, ISIT 2022

AU - Perlaza, Samir M.

AU - Bisson, Gaetan

AU - Esnaola, Iñaki

AU - Jean-Marie, Alain

AU - Rini, Stefano

N1 - Publisher Copyright:
© 2022 IEEE.

PY - 2022

Y1 - 2022

N2 - The optimality and sensitivity of the empirical risk minimization problem with relative entropy regularization (ERM-RER) are investigated for the case in which the reference is a σ-finite measure instead of a probability measure. This generalization allows for a larger degree of flexibility in the incorporation of prior knowledge over the set of models. In this setting, the interplay of the regularization parameter, the reference measure, the risk function, and the empirical risk induced by the solution of the ERM-RER problem is characterized. This characterization yields necessary and sufficient conditions for the existence of regularization parameters that achieve arbitrarily small empirical risk with arbitrarily high probability. Additionally, the sensitivity of the expected empirical risk to deviations from the solution of the ERM-RER problem is studied. Dataset-dependent and dataset-independent upper bounds on the absolute value of the sensitivity are presented. In a special case, it is shown that the expectation (with respect to the datasets) of the absolute value of the sensitivity is upper bounded, up to a constant factor, by the square root of the lautum information between the models and the datasets.

UR - http://www.scopus.com/inward/record.url?scp=85124642565&partnerID=8YFLogxK

U2 - 10.1109/ISIT50566.2022.9834273

DO - 10.1109/ISIT50566.2022.9834273

M3 - Conference contribution

AN - SCOPUS:85124642565

T3 - IEEE International Symposium on Information Theory - Proceedings

SP - 684

EP - 689

BT - 2022 IEEE International Symposium on Information Theory, ISIT 2022

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 26 June 2022 through 1 July 2022

ER -