Published work:
WOK papers: 30; Chinese core journal papers: 6
Sequential action-induced invariant representation for reinforcement learning
Learning task-relevant representations via rewards and real actions for reinforcement learning
Learning Task-relevant Sequence Representations via Intrinsic Dynamics Characteristics in Reinforcement Learning
Episodic Reinforcement Learning with Expanded State-reward Space
Episodic Reinforcement Learning with Expanded State-reward Space
Balancing exploration and exploitation in episodic reinforcement learning
Sequential action-induced invariant representation for reinforcement learning
The treatment of sepsis: an episodic memory-assisted deep reinforcement learning approach
Learning and planning in partially observable environments without prior domain knowledge
Hard Negative Sample Mining for Contrastive Representation in Reinforcement Learning
Sequential Decision Making with "Sequential Information" in Deep Reinforcement Learning
Gated multi-attention representation in reinforcement learning
Deep Q-Network with Predictive State Models in Partially Observable Domains
An improved relief feature selection algorithm based on Monte-Carlo tree search
Attention-based deep Q-network in complex systems
Basis selection in spectral learning of predictive state representations
Bayesian hybrid state estimation for unequal-length batch processes with incomplete observations
MV-PID controller for linear systems subject to input constraint
Making and improving predictions of interest using an MDP model
Synchronized Bayesian state estimation in batch processes using a two-dimensional particle filter
Learning predictive state representations via Monte-Carlo tree search
Predictive state representations with state space partitioning
Solving partially observable problems with inaccurate PSR models
Recent Advances in Mathematical Modeling and Simulation of DNA Replication Process
Design and implementation of robot soccer communication protocol based on ant colony algorithm
Recent advances in mathematical modeling and simulation of DNA replication process
PD plus error-dependent integral nonlinear controllers for robot manipulators with an uncertain Jacobian matrix
CMAC-based Sarsa(λ) learning algorithm for RoboCup-soccer goalkeeper
An algorithm for resetting PSR models
Using learned PSR model for planning under uncertainty
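Sepsis sequence generation method based on sequentially coupled adversarial learning. Computer Engineering (计算机工程), 1000-3428, 2024-11-21.
Future-information-assisted model-free deep reinforcement learning in partially observable environments. Journal of Nanjing University (Natural Science) (南京大学学报(自然科学)), 0469-5097, 2022-09-30.
Model-based deep reinforcement learning algorithm with a memory-based exploration strategy. Microelectronics & Computer (微电子学与计算机), 1000-7180, 2021-04-05.
POMDP value iteration algorithm based on recurrent convolutional neural networks. Computer Engineering (计算机工程), 1000-3428, 2020-02-13.
RoboCup goalkeeper strategy based on Sarsa(λ) learning with a CMAC network. Journal of Beijing University of Technology (北京工业大学学报), 0254-0037, 2012.
A reset algorithm for predictive state representation models. Chinese Journal of Computers (计算机学报), 0254-4164, 2012.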