Patients with locally advanced rectal cancer (LARC) vary substantially in their response to neoadjuvant chemoradiotherapy (nCRT), and the distribution of response classes is markedly imbalanced, which makes treatment response difficult to predict. This study aims to identify effective predictive biomarkers and to develop an ensemble learning-based model that predicts the response of LARC patients to nCRT.
A two-step feature selection method was developed to identify predictive biomarkers. Stable reversal gene pairs were first derived from within-sample relative expression orderings (REOs) in LARC patients undergoing nCRT. Preliminary screening then applied four methods (MDFS, Boruta, MCFS, and VSOLassoBag) to form a candidate feature set. Secondary screening ranked these candidates by permutation importance and applied Incremental Feature Selection (IFS) with an Extreme Gradient Boosting (XGBoost) classifier to determine the final predictive gene pairs. The ensemble model BoostForest, which combines boosting and bagging, served as the predictive framework, and SHapley Additive exPlanations (SHAP) was used for interpretability.
The two-step feature selection yielded a 32-gene-pair signature (32-GPS) as the final predictive biomarker. In the test set, the model achieved an area under the precision-recall curve (AUPRC) of 0.983 and an accuracy of 0.988. In the validation cohort, the AUPRC was 0.785 and the accuracy was 0.898, indicating strong model performance. BoostForest also achieved better overall performance than Random Forest, Support Vector Machine (SVM), and XGBoost. To evaluate the effectiveness of the 32-GPS, its performance was compared with two alternative feature sets: the lasso-gene pair signature (lasso-GPS), derived through lasso regression, and the 15-shared gene pair signature (15-SGPS), consisting of the gene pairs selected by all four feature selection methods. The 32-GPS outperformed both alternatives.
The two-step feature selection method identified robust predictive biomarkers of nCRT response in LARC, and BoostForest outperformed Random Forest, SVM, and XGBoost in overall classification performance.
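To make the notion of reversal gene pairs concrete, the following is a minimal Python sketch of how within-sample REOs can be turned into binary features. It is an illustration under stated assumptions, not the procedure used in the study: the function names, the toy expression values, and the stability threshold of 0.9 are all hypothetical, and the abstract does not specify how stability was defined or tested.

```python
from itertools import combinations


def reo_feature(sample, gene_a, gene_b):
    """Return 1 if gene_a is expressed above gene_b within one sample, else 0."""
    return 1 if sample[gene_a] > sample[gene_b] else 0


def ordering_fraction(samples, gene_a, gene_b):
    """Fraction of samples in which gene_a is expressed above gene_b."""
    return sum(reo_feature(s, gene_a, gene_b) for s in samples) / len(samples)


def reversal_pairs(responders, non_responders, genes, threshold=0.9):
    """Find gene pairs whose ordering is stable in one group and reversed in the other.

    The threshold is an assumed cut-off for "stable"; pairs are oriented so that
    the first gene is the one expressed higher in responders.
    """
    pairs = []
    for a, b in combinations(genes, 2):
        f_resp = ordering_fraction(responders, a, b)
        f_non = ordering_fraction(non_responders, a, b)
        if f_resp >= threshold and f_non <= 1 - threshold:
            pairs.append((a, b))
        elif f_resp <= 1 - threshold and f_non >= threshold:
            pairs.append((b, a))
    return pairs


def encode(sample, pairs):
    """Encode one sample as a binary vector, one entry per gene pair."""
    return [reo_feature(sample, a, b) for a, b in pairs]


# Toy expression values (hypothetical, for illustration only).
responders = [
    {"G1": 5.0, "G2": 1.0, "G3": 3.0},
    {"G1": 8.0, "G2": 2.0, "G3": 2.5},
]
non_responders = [
    {"G1": 1.0, "G2": 4.0, "G3": 3.0},
    {"G1": 0.5, "G2": 6.0, "G3": 2.0},
]
genes = ["G1", "G2", "G3"]

pairs = reversal_pairs(responders, non_responders, genes)
print(pairs)
print(encode(responders[0], pairs), encode(non_responders[0], pairs))
```

Because each feature depends only on the ordering of two genes within a single sample, the encoding is unaffected by any monotonic, sample-wise transformation of the expression values. Binary vectors of this kind are what the downstream feature selection and classification steps would take as input.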