Water pollution from hazardous materials, particularly arsenic, downstream of gold mines poses severe environmental and health risks. This study employs a systematic approach to predict water arsenic (WA) levels downstream of gold mines affected by acid mine drainage. WA data from the affected region were collected and preprocessed to standardize the dataset and mitigate overfitting risks. Advanced ensemble machine learning methods, particularly the Light Gradient Boosting Machine (LightGBM), were applied, and two models were developed: a manually adjusted version and an optimization-based version tuned with the Jellyfish Search Optimizer (JSO). The performance of the LightGBM-JSO model was compared against a range of ensemble learning models, metaheuristic algorithms, and artificial intelligence techniques. Models were evaluated using mean absolute error (MAE), mean absolute percentage error (MAPE), coefficient of determination (R