BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//132.216.98.100//NONSGML kigkonsult.se iCalcreator 2.20.4//
BEGIN:VEVENT
UID:20260413T014356EDT-7871fuFc6x@132.216.98.100
DTSTAMP:20260413T054356Z
DESCRIPTION:Informal Systems Seminar (ISS)\, Centre for Intelligent Machine
 s (CIM) and Groupe d'Etudes et de Recherche en Analyse des Decisions (GERA
 D)\n\nSpeaker: Dengwang Tang\n\n\n	** Note that this is a hybrid event.\n	**
  This seminar will be projected at McConnell 437 at McGill University\n	\n	Z
 oom Link\n	Meeting ID: 845 1388 1004\n	Passcode: VISS\n	\n	Abstract: In
  many traditional reinforcement learning (RL) settings\, an agent learns t
 o\n	control the system without incorporating any prior knowledge. However\,
  such a\n	paradigm can be impractical since learning can be slow. In many e
 ngineering\n	applications\, offline datasets are often available. To levera
 ge the information provided\n	by the offline datasets with the power of onl
 ine fine-tuning\, we proposed the informed\n	posterior sampling based reinf
 orcement learning (iPSRL) algorithm for both episodic and\n	continuing MDP
  learning pr
 oblems. In this algorithm\, the learning agent forms an\n	informed prior wi
 th the offline data along with the knowledge about the offline policy that
 \n	generated the data. This informed prior is then used to initiate the pos
 terior sampling\n	procedure. Through a novel prior-dependent regret analysi
 s of the posterior sampling\n	procedure\, we showed that when the offline d
 ata is informative enough\, the iPSRL\n	algorithm can significantly reduce 
 the learning regret compared to the baselines (that do\n	not use offline da
 ta in the same way). Based on iPSRL\, we then proposed the more\n	practical
  iRLSVI algorithm. Empirical results showed that iRLSVI can significantly
 \n	reduce regret compared to these baselines.\n	\n	Bio: Dengwang Tan
 g is currently a postdoctoral researcher at University of Southern\n	Califo
 rnia. He obtained his B.S.E. in Computer Engineering from University of Mi
 chigan\,\n	Ann Arbor in 2016. He earned his Ph.D. in Electrical and Comput
 er
  Engineering (2021)\,\n	M.S. in Mathematics (2021)\, and M.S. in Electrical
  and Computer Engineering (2018) all\n	from University of Michigan\, Ann Ar
 bor. Prior to joining USC\, he was a postdoctoral\n	researcher at Universi
 ty of California\, Berkeley. His research interests involve control\n	and l
 earning algorithms in stochastic dynamic systems\, multi-armed bandits\, m
 ulti-agent\n	systems\, queuing theory\, and game theory.\n
DTSTART:20240426T143000Z
DTEND:20240426T153000Z
LOCATION:Zames Seminar Room\, MC 437\, McConnell Engineering Building\,
 3480 rue University\, Montreal\, QC H3A 0E9\, CA
SUMMARY:Informed Posterior Sampling Based Reinforcement Learning Algorithms
URL:https://www.mcgill.ca/cim/channels/event/informed-posterior-sampling-ba
 sed-reinforcement-learning-algorithms-356914
END:VEVENT
END:VCALENDAR
