INTRODUCTION: Roadway crashes are rare, random events, so developing a crash prediction model for real-time traffic safety management poses several challenges. To address the small sample size of crash data, researchers have recently studied crash data augmentation using machine learning techniques for developing safety evaluation models. However, the specific characteristics of crash data must be incorporated into both augmentation and crash risk assessment, because these characteristics vary with spatial and temporal conditions. METHOD: This study therefore developed a real-time crash risk model in three stages. First, crash data were clustered to define heterogeneous crash risk situations, and key variables were then derived using Boruta-SHAP, a technique that combines ensemble learning and explainable artificial intelligence. Second, the crash data in each cluster were augmented using oversampling techniques, including the Conditional Generative Adversarial Network (CGAN), which can account for the characteristics of each crash risk cluster. Finally, crash risk models were developed and compared with other crash risk models built using a binary logistic regression model (BLM), Random Forest (RF), extreme gradient boosting (XGBoost), and Support Vector Machine (SVM). RESULTS: The CGAN-based XGBoost model performed best, and the temporal speed difference at 10-minute intervals and precipitation had a large impact on crash risk prediction. This paper emphasizes that crash risk characteristics must be distinguished in crash risk prediction and provides new insights into addressing the data imbalance between crash and non-crash datasets.
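The idea behind the second stage, augmenting crash records separately within each crash risk cluster, can be illustrated with a minimal sketch. This is not the study's CGAN: it swaps in a much simpler SMOTE-like interpolation between crash records of the same cluster, purely to show what cluster-conditioned oversampling means. The function name `oversample_by_cluster`, the two features (speed difference and precipitation), and all numeric values are hypothetical and are not taken from the study's data.

```python
import random
from collections import defaultdict


def oversample_by_cluster(samples, cluster_ids, target_per_cluster, seed=0):
    """Generate synthetic crash samples inside each cluster by interpolation.

    NOTE: a simplified stand-in for CGAN-based augmentation (assumption).
    samples: list of feature vectors (lists of floats) for crash records.
    cluster_ids: cluster label of each sample, same length as samples.
    target_per_cluster: desired number of records in every cluster.
    Returns a list of (features, cluster_id) pairs, originals first.
    """
    rng = random.Random(seed)

    # Group crash records by their cluster label.
    by_cluster = defaultdict(list)
    for features, cid in zip(samples, cluster_ids):
        by_cluster[cid].append(features)

    # Keep every original record, then add synthetic ones per cluster.
    augmented = [(list(f), c) for f, c in zip(samples, cluster_ids)]
    for cid, members in by_cluster.items():
        for _ in range(max(0, target_per_cluster - len(members))):
            # Interpolate between two records drawn from the SAME cluster,
            # so synthetic points stay within that cluster's feature range.
            a = rng.choice(members)
            b = rng.choice(members)
            t = rng.random()
            synthetic = [x + t * (y - x) for x, y in zip(a, b)]
            augmented.append((synthetic, cid))
    return augmented


# Illustrative toy records: [speed difference (km/h), precipitation (mm)].
crashes = [[12.0, 0.0], [15.0, 0.5], [3.0, 8.0], [4.0, 10.0], [5.0, 9.0]]
clusters = [0, 0, 1, 1, 1]
augmented = oversample_by_cluster(crashes, clusters, target_per_cluster=6)
```

Because synthetic records are drawn only from pairs within one cluster, a dry-weather, high-speed-difference cluster is never blended with a wet-weather one. A conditional generative model pursues the same goal by conditioning generation on the cluster label rather than by interpolating.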