Deep reinforcement learning-based object detection approaches are built on a key idea: hierarchically zooming into image regions that contain richer detail. Compared with traditional object detection approaches, this strategy greatly reduces the number of region proposals, which is crucial for curbing computational overhead. However, common deep reinforcement learning-based approaches suffer from low precision, caused by inadequate representation of image states and the agent's unstable learning. To address these issues, we present LHAR-RLD. First, we design a Low-dimensional RepVGG (LDR) feature extractor to reduce memory consumption and ease the fitting of downstream networks. Second, we propose a Hybrid DQN (HDQN) to enhance the agent's ability to estimate state-action values in complex image environments. Then, an Adaptive Dynamic Reward Function (ADR) is designed to adjust the reward dynamically as the agent's exploration environment changes. Finally, an ROI Align-based bounding box regression network (RABRNet) is proposed to further regress the localization results of the reinforcement learning stage and improve detection precision. Our method achieves 74.4% mAP on VOC2007, 76.2% mAP on COCO2017, and 75.2% precision on the SF dataset, with 1.43 GFLOPs. Its precision surpasses that of advanced deep reinforcement learning approaches, while its computational cost is far lower than theirs and than that of mainstream object detection methods. The method thus delivers highly accurate object localization at minimal computational cost, making it well suited to resource-constrained devices.
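To make the overall pipeline concrete, the following PyTorch-style sketch shows how such a detector could be wired together: a small backbone standing in for the LDR feature extractor, an MLP value head standing in for the HDQN agent, a damped IoU-improvement reward standing in for ADR, and ROI Align pooling plus a regression head standing in for RABRNet. This is a minimal sketch under stated assumptions; all module names, the action set, and hyperparameters are illustrative and not the paper's actual implementation.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class Backbone(nn.Module):
    """Small conv backbone standing in for the low-dimensional feature extractor (LDR)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )  # overall stride 4

    def forward(self, x):
        return self.body(x)

class QNet(nn.Module):
    """MLP value head standing in for the hybrid DQN (HDQN): state -> action values."""
    def __init__(self, state_dim, n_actions=6):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(state_dim, 256), nn.ReLU(),
                                 nn.Linear(256, n_actions))

    def forward(self, s):
        return self.mlp(s)

class RegHead(nn.Module):
    """ROI Align pooled features -> box offsets, standing in for RABRNet."""
    def __init__(self, in_ch=64, pool=7):
        super().__init__()
        self.fc = nn.Sequential(nn.Flatten(),
                                nn.Linear(in_ch * pool * pool, 256), nn.ReLU(),
                                nn.Linear(256, 4))  # (dx, dy, dw, dh)

    def forward(self, pooled):
        return self.fc(pooled)

def adaptive_reward(iou_new, iou_old, step, max_steps=10):
    """Illustrative training-time reward: IoU improvement, damped as exploration proceeds
    (a rough stand-in for an adaptive dynamic reward; not used in the inference loop below)."""
    return (iou_new - iou_old) * (1.0 - step / (max_steps + 1))

def apply_action(box, action, shrink=0.8):
    """Hierarchical zoom: shrink the box toward a corner or its center (action 5 = stop)."""
    x1, y1, x2, y2 = box
    w, h = (x2 - x1) * shrink, (y2 - y1) * shrink
    moves = {0: (x1, y1), 1: (x2 - w, y1), 2: (x1, y2 - h), 3: (x2 - w, y2 - h),
             4: ((x1 + x2 - w) / 2, (y1 + y2 - h) / 2)}
    nx1, ny1 = moves[action]
    return [nx1, ny1, nx1 + w, ny1 + h]

def box_state(box, feat, pool=7):
    """Pool features inside the current box with ROI Align and append the box coordinates."""
    rois = torch.tensor([[0.0, *box]])  # (batch_index, x1, y1, x2, y2)
    pooled = roi_align(feat, rois, output_size=pool, spatial_scale=0.25)
    state = torch.cat([pooled.flatten(1), torch.tensor([box])], dim=1)
    return pooled, state

@torch.no_grad()
def detect(image, backbone, qnet, reg_head, max_steps=10):
    """Greedy inference: hierarchically refine a box with the agent, then regress it."""
    h, w = image.shape[-2:]
    box = [0.0, 0.0, float(w), float(h)]  # start from the whole image
    feat = backbone(image)
    for _ in range(max_steps):
        _, state = box_state(box, feat)
        action = qnet(state).argmax(dim=1).item()
        if action == 5:  # stop action: the agent is satisfied with the current box
            break
        box = apply_action(box, action)
    pooled, _ = box_state(box, feat)          # final ROI Align based refinement
    dx, dy, dw, dh = reg_head(pooled)[0]
    bw, bh = box[2] - box[0], box[3] - box[1]
    cx, cy = box[0] + bw / 2 + dx * bw, box[1] + bh / 2 + dy * bh
    bw, bh = bw * torch.exp(dw), bh * torch.exp(dh)
    return [cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2]

if __name__ == "__main__":
    img = torch.rand(1, 3, 224, 224)
    qnet = QNet(state_dim=64 * 7 * 7 + 4)
    print(detect(img, Backbone(), qnet, RegHead()))
```

The single-object, greedy loop above is only meant to show how the pieces fit: in practice the agent would be trained with Q-learning using a reward such as the one sketched, and the regression head with a box-offset loss.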