Deep Reinforcement Learning (DRL) is critical for autonomous systems that must continuously learn and adapt in dynamic environments. However, the frequent retraining that DRL requires consumes substantial energy, which poses a significant challenge for mobile, battery-powered robotic systems. Efficient on-device DRL therefore requires co-optimizing energy, latency, and algorithm performance. Existing approaches either focus on traditional DNNs such as CNNs or target only two of these three dimensions. This paper introduces DuoJoule, a framework designed for DRL workloads that meets latency deadlines and adheres to energy budgets while maximizing algorithm performance through both application-level and system-level configuration. DuoJoule dynamically coordinates adjustments to DRL algorithm parameters with adjustments to system frequency settings, the latter made via Dynamic Voltage and Frequency Scaling (DVFS). A key innovation of DuoJoule is its runtime metric tracker, which assesses system status against target budgets and computes a universal efficiency score. This score enables rapid, adaptive tuning at runtime that balances energy efficiency, latency, and algorithm performance. Extensive evaluation on two widely used autonomous embedded platforms, using benchmarks and a realistic autonomous driving case study, demonstrates DuoJoule’s cross-platform efficiency, practicality in real-world scenarios, adaptivity to varying constraints, and low runtime overhead. Empirical results show that DuoJoule consistently meets latency and energy targets while maintaining near-optimal performance, demonstrating its effectiveness in managing the complex trade-off space of on-device DRL.
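
To make the idea of scoring a configuration against target budgets concrete, the following is a minimal illustrative sketch and not DuoJoule's actual scoring function, which this abstract does not specify. The names `Budget`, `Measurement`, `efficiency_score`, and `pick_config`, the penalty formula, the candidate configuration labels, and all numeric values are assumptions introduced purely for illustration. The sketch assumes algorithm performance is normalized to [0, 1] and penalizes a configuration in proportion to how far it exceeds each budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    latency_s: float  # hypothetical per-iteration latency deadline, in seconds
    energy_j: float   # hypothetical per-iteration energy budget, in joules


@dataclass(frozen=True)
class Measurement:
    latency_s: float  # measured latency of one iteration
    energy_j: float   # measured energy of one iteration
    reward: float     # algorithm performance, assumed normalized to [0, 1]


def efficiency_score(m: Measurement, b: Budget) -> float:
    """Assumed score: performance divided by the product of budget overshoots."""
    latency_ratio = m.latency_s / b.latency_s
    energy_ratio = m.energy_j / b.energy_j
    # A ratio at or below 1.0 meets its budget and contributes no penalty.
    penalty = max(1.0, latency_ratio) * max(1.0, energy_ratio)
    return m.reward / penalty


def pick_config(candidates: dict[str, Measurement], b: Budget) -> str:
    """Return the name of the candidate configuration with the highest score."""
    return max(candidates, key=lambda name: efficiency_score(candidates[name], b))


# Hypothetical measurements for three (frequency, DRL parameter) settings.
budget = Budget(latency_s=0.10, energy_j=2.0)
candidates = {
    "high_freq_large_batch": Measurement(latency_s=0.08, energy_j=3.0, reward=0.90),
    "mid_freq_mid_batch": Measurement(latency_s=0.09, energy_j=1.8, reward=0.85),
    "low_freq_small_batch": Measurement(latency_s=0.15, energy_j=1.2, reward=0.70),
}
best = pick_config(candidates, budget)
print(best)
```

Under these assumed numbers, the highest-performing configuration loses to the middle one because it exceeds the energy budget, which illustrates why a single score that folds in all three dimensions can simplify runtime selection. A multiplicative penalty is only one possible design; an additive or thresholded penalty would rank candidates differently.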