TY - GEN
T1 - Empirical Risk Minimization with Relative Entropy Regularization
T2 - 2022 IEEE International Symposium on Information Theory, ISIT 2022
AU - Perlaza, Samir M.
AU - Bisson, Gaetan
AU - Esnaola, Iñaki
AU - Jean-Marie, Alain
AU - Rini, Stefano
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - The optimality and sensitivity of the empirical risk minimization problem with relative entropy regularization (ERM-RER) are investigated for the case in which the reference is a σ-finite measure instead of a probability measure. This generalization allows for a larger degree of flexibility in the incorporation of prior knowledge over the set of models. In this setting, the interplay of the regularization parameter, the reference measure, the risk function, and the empirical risk induced by the solution of the ERM-RER problem is characterized. This characterization yields necessary and sufficient conditions for the existence of regularization parameters that achieve arbitrarily small empirical risk with arbitrarily high probability. Additionally, the sensitivity of the expected empirical risk to deviations from the solution of the ERM-RER problem is studied. Dataset-dependent and dataset-independent upper bounds on the absolute value of the sensitivity are presented. In a special case, it is shown that the expectation (with respect to the datasets) of the absolute value of the sensitivity is upper bounded, up to a constant factor, by the square root of the lautum information between the models and the datasets.
UR - http://www.scopus.com/inward/record.url?scp=85124642565&partnerID=8YFLogxK
U2 - 10.1109/ISIT50566.2022.9834273
DO - 10.1109/ISIT50566.2022.9834273
M3 - Conference contribution
AN - SCOPUS:85124642565
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 684
EP - 689
BT - 2022 IEEE International Symposium on Information Theory, ISIT 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 June 2022 through 1 July 2022
ER -