Medical image segmentation models often generalize poorly to new datasets because of substantial variability in imaging conditions, anatomy, and patient demographics. Conventional domain generalization (DG) methods focus on learning domain-agnostic features but often overlook the balance of performance across domains, which leads to suboptimal results. To address these issues, we propose a novel game-theoretic approach that models the training process as a zero-sum game and seeks a Nash equilibrium to enhance adaptability and robustness against domain shifts. Specifically, our adaptive domain selection method, guided by the Beta distribution and optimized via reinforcement learning, dynamically adjusts to the variability across domains, thereby improving model generalization. We conducted extensive experiments on benchmark datasets for polyp segmentation, optic cup/optic disc (OC/OD) segmentation, and prostate segmentation. Our method achieved an average Dice score improvement of 1.75% over the compared methods, demonstrating its effectiveness in enhancing the generalization performance of medical image segmentation models.
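
To make the idea of Beta-guided domain selection concrete, the following is a minimal sketch and not the method described above. It assumes a Thompson-sampling-style bandit as a stand-in for the reinforcement learning component, and it assumes the selector plays the adversary in the zero-sum game, so its reward is one minus the Dice score on the chosen domain. The class name `BetaDomainSelector`, its methods, and the domain labels are all hypothetical.

```python
import random


class BetaDomainSelector:
    """Illustrative selector: one Beta(alpha, beta) belief per source domain.

    Assumption: a domain is "worth selecting" when the segmentation model
    does poorly on it, so the selector's reward is 1 - Dice.
    """

    def __init__(self, domains, seed=0):
        self.rng = random.Random(seed)
        # Beta(1, 1) is the uniform prior: no domain is preferred at the start.
        self.params = {d: [1.0, 1.0] for d in domains}

    def select(self):
        """Draw one sample per domain and return the domain with the largest draw."""
        draws = {
            d: self.rng.betavariate(a, b) for d, (a, b) in self.params.items()
        }
        return max(draws, key=draws.get)

    def update(self, domain, dice):
        """Update the belief for `domain` from a Dice score in [0, 1]."""
        if not 0.0 <= dice <= 1.0:
            raise ValueError("dice must lie in [0, 1]")
        reward = 1.0 - dice  # adversarial reward (assumption, see lead-in)
        self.params[domain][0] += reward
        self.params[domain][1] += 1.0 - reward

    def mean(self, domain):
        """Posterior mean of the belief, i.e. the estimated difficulty."""
        a, b = self.params[domain]
        return a / (a + b)
```

In a training loop one would call `select()` to choose the domain for the next batch, train on it, and pass the resulting Dice score to `update()`. Domains on which the model scores low accumulate a higher posterior mean and are therefore sampled more often, which is one simple way to push toward balanced performance across domains.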