The web browser is arguably the most important piece of software on your computer. You spend much of your time online inside a browser: when you search, chat, email ...
We consider an offline reinforcement learning (RL) setting where the agent need to learn from a dataset collected by rolling out multiple behavior policies. There are two challenges for this setting: ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results