TetraRL: A Self-Adaptive Runtime for On-Device Deep Reinforcement Learning Systems

Zexin Li, Soheil Shirvani, Cong Liu

June 2026

Abstract

Autonomous robotic systems, such as autonomous vehicles, drones, and mobile robots, increasingly require on-device Deep Reinforcement Learning (DRL) to continuously adapt to dynamic environments. Unlike cloud-based learning, embedded DRL must perform training and inference directly on resource-constrained hardware while maintaining timely decision-making. This requirement exposes a fundamental challenge: on-device DRL must simultaneously balance four tightly coupled objectives, namely real-time performance, task reward, memory utilization, and energy consumption. Optimizing these objectives independently often leads to suboptimal system behavior, while naïve multi-objective optimization can violate resource constraints and degrade reliability. This paper presents TetraRL, a holistic runtime framework for self-adaptive tetra-objective on-device DRL. TetraRL formulates embedded DRL as a unified optimization problem over real-time, reward, RAM, and reserve (energy) objectives, and employs a preference-conditioned reinforcement learning controller to dynamically navigate the resulting trade-off space. The framework further integrates a unified resource-management abstraction, hardware-aware DVFS control, and a runtime Override Layer that enforces resource constraints. We implement and evaluate TetraRL across diverse DRL environments and embedded platforms, including NVIDIA Jetson AGX Orin and Orin Nano. Experimental results demonstrate that TetraRL consistently auto-balances the four objectives, achieving competitive trade-offs across them while maintaining negligible runtime overhead. Furthermore, TetraRL enables runtime-switchable optimization goals through a single trained policy, providing a practical foundation for self-adaptive and resource-aware on-device DRL.

Type

Publication

In Submission to TC

Deep Learning