Although Deep Reinforcement Learning (DRL) and Large Language Models (LLMs) each show promise in addressing decision-making challenges in autonomous driving, DRL often suffers from high sample complexity, while LLMs have difficulty ensuring real-time decision making. To address these limitations, we propose TeLL-Drive, a hybrid framework that integrates a Teacher LLM to guide an attention-based Student DRL policy. By incorporating risk metrics, historical scenario retrieval, and domain heuristics into context-rich prompts, the LLM produces high-level driving strategies through chain-of-thought reasoning. A self-attention mechanism then fuses these strategies with the DRL agent’s exploration, accelerating policy convergence and boosting robustness across diverse driving conditions. Our experimental results, evaluated across multiple traffic scenarios, show that TeLL-Drive outperforms existing baseline methods, including other LLM-based approaches, in terms of success rates, average returns, and real-time feasibility. Ablation studies underscore the importance of each model component, especially the synergy between the attention mechanism and LLM-driven guidance. Finally, we build a virtual-real fusion experimental platform and conduct vehicle-in-loop experiments to verify the real-time performance, robustness, and reliability of the model when deployed on real vehicles. These findings suggest that TeLL-Drive significantly enhances both the adaptability and safety of autonomous driving systems, while offering a more efficient and scalable approach for policy learning.
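The attention-based fusion of teacher strategies with the student policy can be illustrated with a minimal sketch. Here the student's state embedding serves as the query and the LLM's candidate strategy embeddings serve as keys and values; all dimensions, function names, and the random toy inputs are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_teacher_guidance(state_emb, strategy_embs):
    """Attend over LLM-teacher strategy embeddings using the student's
    state embedding as the query; return a fused context vector and the
    attention weights. (Hypothetical helper for illustration.)"""
    d_k = state_emb.shape[-1]
    scores = strategy_embs @ state_emb / np.sqrt(d_k)  # (num_strategies,)
    weights = softmax(scores)                          # attention over strategies
    context = weights @ strategy_embs                  # weighted fusion
    return context, weights

# Toy example: 3 candidate high-level strategies, 4-dim embeddings.
rng = np.random.default_rng(0)
state = rng.normal(size=4)
strategies = rng.normal(size=(3, 4))
context, w = fuse_teacher_guidance(state, strategies)
```

In practice the fused context would condition the student DRL policy's action distribution, so exploration is biased toward, but not constrained to, the teacher's suggestions.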
Images
The overall conceptual framework of TeLL-Drive, where a DRL student agent is guided by the LLM teacher for better decision making in autonomous driving.
Comparison of this model's training performance with that of traditional DRL methods.
Comparison of training performance across ablation settings.
Comparison of testing success rate results between the teacher agent and the student agent.
Videos
Simulation Experiment
Unsignalized intersection: The agent must execute an unprotected left turn at an unsignalized intersection, requiring conflict resolution and time-slot preemption to navigate crossing traffic safely.
High-Speed Ramp Merging: The agent operates on an acceleration lane, performing speed matching and gap selection to merge seamlessly into highway traffic at elevated velocities.
Four-Lane Adaptive Cruise: The agent focuses on fine-grained control of inter-vehicle distances and speeds across four lanes, highlighting precision in longitudinal control and continuous lane tracking.
Vehicle-in-loop Experiment
In Case 1, the autonomous vehicle equipped with TeLL-Drive begins from a standstill and accelerates toward the intersection. As it approaches the stop line, the vehicle slows down to create sufficient observation and decision space, enhancing its ability to assess the surrounding traffic. By the 7th second, the vehicle encounters an oncoming vehicle. Upon assessing the situation, it slows further at the 12th second to yield and avoid a collision. After the oncoming vehicle passes, the autonomous vehicle resumes acceleration and reaches the exit road of the intersection by the 15th second. To maintain a safe distance from the vehicle in front, the system performs adaptive acceleration and deceleration, ensuring both safety and traffic efficiency. In this case, the autonomous vehicle is the last to leave the intersection, yet it executes the maneuver safely and efficiently.
In Case 2, the autonomous vehicle follows similar actions up to the point of entering the intersection. However, at the 6th second, it observes fewer vehicles in the intersection and determines that it can pass first, so it accelerates. By the 8th second, a vehicle approaching from the left prompts an interaction. After a brief period of intense negotiation between the two vehicles, the simulation vehicle (SV) decides to slow down and stop, while our autonomous vehicle continues through and successfully navigates the intersection.