Stochastic Bandits for Sticky Recommendations
Theja Tulabandhula, University of Illinois Chicago
10-11am, 25th Jun 2019
We consider sequential decision problems that arise in making recommendations. A platform needs to show good and timely recommendations to its users to engage them and increase revenue, without knowing their behavioral patterns a priori. We consider two behavioral effects that modulate relevance: (a) users have a latent propensity to act on a recommendation based on its position in a sequence of recommended items, and (b) users have a latent propensity to act on a recommendation only if it has been shown to them repeatedly. In both settings, the platform has to learn the quality of its recommendations and the corresponding user behavior simultaneously, while exploiting what it has learned so far. We develop new bandit algorithms with regret guarantees for both of these effects, and validate their performance with experiments. For more details, see https://arxiv.org/abs/1901.07734 and https://arxiv.org/abs/1811.09026.