Because gait environments vary and wearers differ, conventional force control methods for soft exoskeletons often fail to achieve optimal performance. To improve adaptability to both the environment and the individual, this paper proposes a reinforcement learning-based control method for a soft exoskeleton that assists hip flexion and extension. First, the alternating flexion-extension assistance mechanism is analyzed, and the motor control system of the exoskeleton is modeled through system identification. An adaptive controller is then trained on the identified model in a simulated environment, using the twin delayed deep deterministic policy gradient (TD3) algorithm. The resulting controller updates the pulse-width modulation (PWM) value so that the actual force tracks the desired force. Experimental results show that the reinforcement learning-based controller for the soft exoskeleton effectively tracks the desired assistive force curve. In addition, a metabolic experiment covering uniform walking and slope walking is conducted to verify the effectiveness of the soft exoskeleton. Compared with the power-off mode, the net metabolic cost of wearers using the soft exoskeleton with the RLPID method decreases by 12.9% ± 3.3% (uniform walking) and 10.7% ± 3.7% (slope walking).
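
To make the training setup more concrete, the following is a minimal sketch of a simulated force-tracking loop in which a controller adjusts the PWM value at each step and is rewarded for keeping the actual force close to the desired force. Everything in it is an assumption made for illustration and is not taken from the paper: the first-order plant and its coefficients `a` and `b`, the half-sine desired-force profile, the PWM range, the reward (negative absolute tracking error), and the names `ForceTrackingEnv`, `IncrementalPI`, and `rollout`. A fixed incremental PI rule stands in for the learned policy only so that the loop can run; it is not the TD3 controller.

```python
import math


class ForceTrackingEnv:
    """Toy stand-in for an identified motor model (first-order, discrete time)."""

    def __init__(self, a=0.9, b=0.5, pwm_limit=100.0,
                 period_steps=100, peak_force=30.0):
        self.a = a                      # assumed model pole (hypothetical value)
        self.b = b                      # assumed input gain (hypothetical value)
        self.pwm_limit = pwm_limit      # assumed PWM range: [0, pwm_limit]
        self.period_steps = period_steps
        self.peak_force = peak_force
        self.reset()

    def desired_force(self, k):
        # Hypothetical profile: half-sine during the first half of each cycle.
        phase = (k % self.period_steps) / self.period_steps
        if phase < 0.5:
            return self.peak_force * math.sin(2.0 * math.pi * phase)
        return 0.0

    def reset(self):
        self.k = 0
        self.force = 0.0
        self.pwm = 0.0
        return self._obs()

    def _obs(self):
        # Observation: (tracking error, actual force, current PWM).
        return (self.desired_force(self.k) - self.force, self.force, self.pwm)

    def step(self, delta_pwm):
        # The action is a PWM increment; the PWM value itself is clamped.
        self.pwm = min(self.pwm_limit, max(0.0, self.pwm + delta_pwm))
        self.force = self.a * self.force + self.b * self.pwm
        self.k += 1
        error = self.desired_force(self.k) - self.force
        reward = -abs(error)            # assumed reward: penalize tracking error
        return self._obs(), reward


class IncrementalPI:
    """Placeholder policy (velocity-form PI); a trained actor would replace it."""

    def __init__(self, kp=0.9, ki=0.1):
        self.kp = kp
        self.ki = ki
        self.prev_error = 0.0

    def __call__(self, obs):
        error = obs[0]
        delta = self.kp * (error - self.prev_error) + self.ki * error
        self.prev_error = error
        return delta


def rollout(env, policy, steps):
    """Run one episode; return the total reward and the PWM trace."""
    obs = env.reset()
    total_reward = 0.0
    pwm_trace = []
    for _ in range(steps):
        obs, reward = env.step(policy(obs))
        total_reward += reward
        pwm_trace.append(obs[2])
    return total_reward, pwm_trace


if __name__ == "__main__":
    total, trace = rollout(ForceTrackingEnv(), IncrementalPI(), steps=200)
    print("total reward:", total)
    print("max PWM used:", max(trace))
```

In an actual training run, `policy` would be the actor network being learned, and the transitions collected in `rollout` would feed the learning algorithm's replay buffer. The increment-style action is one possible design choice that keeps consecutive PWM values from jumping abruptly.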