In this talk I will introduce Recurrent Predictive State Policy (RPSP) networks, which combine moment-based predictive models with deep reinforcement learning architectures. A predictive state serves as an equivalent representation of a belief state. Therefore, the policy component of the RPSP-network can be purely reactive, which simplifies training while still allowing optimal behaviour. We show the efficacy of RPSP-networks under partial observability on a set of robotic control tasks from OpenAI Gym. Empirically, RPSP-networks perform well compared with memory-preserving networks such as GRUs, as well as with finite-memory models. This work was done in collaboration with Ahmed Hefny at CMU.
Kernel and Moment Based Prediction and Planning
March 6, 2018
1:00 pm
Zita Marinho
Zita Marinho is a PhD finalist at the Robotics Institute, under the CMU/Portugal doctoral program. She is affiliated with the Institute for Robotics and Systems and with Instituto de Telecomunicações at IST. She is currently working at Sacoor Brothers as a Data Scientist. She received an M.S. degree in Robotics from CMU in 2015, and an M.S. degree in Physics Engineering from Instituto Superior Técnico, Universidade de Lisboa, Portugal, in 2010. As a PhD student, she was jointly advised by André Martins at Unbabel/Instituto de Telecomunicações, and by Geoffrey Gordon and Siddhartha Srinivasa at CMU. Her research interests focus on machine learning methods using semi-supervision. She is interested in studying algorithms for learning with large amounts of data and little supervised information. Her PhD thesis focuses on spectral methods for learning in Natural Language and Robotics.

