# Publications

Publications in reversed chronological order.

## 2023

- NeurIPSOn the Minimax Regret for Online Learning with Feedback Graphs
*To appear in NeurIPS ’23*, 2023In this work, we improve on the upper and lower bounds for the regret of online learning with strongly observable undirected feedback graphs. The best known upper bound for this problem is \mathcalO\bigl(\sqrtαT\ln K\bigr), where K is the number of actions, αis the independence number of the graph, and T is the time horizon. The \sqrt\ln K factor is known to be necessary when α= 1 (the experts case). On the other hand, when α= K (the bandits case), the minimax rate is known to be Θ\bigl(\sqrtKT\bigr), and a lower bound Ω\bigl(\sqrtαT\bigr) is known to hold for any α. Our improved upper bound \mathcalO\bigl(\sqrtαT(1+\ln(K/α))\bigr) holds for any αand matches the lower bounds for bandits and experts, while interpolating intermediate cases. To prove this result, we use FTRL with q-Tsallis entropy for a carefully chosen value of q ∈[1/2, 1) that varies with α. The analysis of this algorithm requires a new bound on the variance term in the regret. We also show how to extend our techniques to time-varying graphs, without requiring prior knowledge of their independence numbers. Our upper bound is complemented by an improved Ω\bigl(\sqrtαT(\ln K)/(\lnα)\bigr) lower bound for all α> 1, whose analysis relies on a novel reduction to multitask learning. This shows that a logarithmic factor is necessary as soon as α< K.

- ICMLDelayed Bandits: When Do Intermediate Observations Help?
*Emmanuel Esposito*, Saeed Masoudian, Hao Qiu, Dirk van der Hoeven, Nicolò Cesa-Bianchi, and Yevgeny Seldin*ICML ’23*, 2023We study a K-armed bandit with delayed feedback and intermediate observations. We consider a model where intermediate observations have a form of a finite state, which is observed immediately after taking an action, whereas the loss is observed after an adversarially chosen delay. We show that the regime of the mapping of states to losses determines the complexity of the problem, irrespective of whether the mapping of actions to states is stochastic or adversarial. If the mapping of states to losses is adversarial, then the regret rate is of order \sqrt(K+d)T (within log factors), where T is the time horizon and d is a fixed delay. This matches the regret rate of a K-armed bandit with delayed feedback and without intermediate observations, implying that intermediate observations are not helpful. However, if the mapping of states to losses is stochastic, we show that the regret grows at a rate of \sqrt\bigl(K+\min{|\mathcalS|,d}\bigr)T (within log factors), implying that if the number |\mathcalS| of states is smaller than the delay, then intermediate observations help. We also provide refined high-probability regret upper bounds for non-uniform delays, together with experimental validation of our algorithms.

## 2022

- NeurIPSLearning on the Edge: Online Learning with Stochastic Feedback Graphs
*NeurIPS ’22*, 2022The framework of feedback graphs is a generalization of sequential decision-making with bandit or full information feedback. In this work, we study an extension where the directed feedback graph is stochastic, following a distribution similar to the classical Erdős-Rényi model. Specifically, in each round every edge in the graph is either realized or not with a distinct probability for each edge. We prove nearly optimal regret bounds of order \min\bigl{\min_\varepsilon \sqrt(\alpha_\varepsilon/\varepsilon) T, \min_\varepsilon (\delta_\varepsilon/\varepsilon)^1/3 T^2/3\bigr} (ignoring logarithmic factors), where \alpha_\varepsilon and \delta_\varepsilon are graph-theoretic quantities measured on the support of the stochastic feedback graph \mathcalG with edge probabilities thresholded at \varepsilon. Our result, which holds without any preliminary knowledge about \mathcalG, requires the learner to observe only the realized out-neighborhood of the chosen action. When the learner is allowed to observe the realization of the entire graph (but only the losses in the out-neighborhood of the chosen action), we derive a more efficient algorithm featuring a dependence on weighted versions of the independence and weak domination numbers that exhibits improved bounds for some special cases.

## 2020

- ALENEXRecSplit: Minimal Perfect Hashing via Recursive Splitting
*Emmanuel Esposito*, Thomas Mueller Graf, and Sebastiano Vigna*In 2020 Proceedings of the Symposium on Algorithm Engineering and Experiments (ALENEX)*, 2020A minimal perfect hash function bijectively maps a key set S out of a universe U into the first |S| natural numbers. Minimal perfect hash functions are used, for example, to map irregularly-shaped keys, such as strings, in a compact space so that metadata can then be simply stored in an array. While it is known that just 1.44 bits per key are necessary to store a minimal perfect hash function, no published technique can go below 2 bits per key in practice. We propose a new technique for storing minimal perfect hash functions with expected linear construction time and expected constant lookup time that makes it possible to build for the first time, for example, structures which need 1.56 bits per key, that is, within 8.3% of the lower bound, in less than 2 ms per key. We show that instances of our construction are able to simultaneously beat the construction time, space usage and lookup time of the state-of-the-art data structure reaching 2 bits per key. Moreover, we provide parameter choices giving structures which are competitive with alternative, larger-size data structures in terms of space and lookup time. The construction of our data structures can be easily parallelized or mapped on distributed computational units (e.g., within the MapReduce framework), and structures larger than the available RAM can be directly built in mass storage.