OBJECTIVE: Synthetic data (SD) is artificially generated information that mimics the statistical characteristics and correlations of real-world data, enabling researchers to simulate variables that are challenging to obtain in routine practice while overcoming confidentiality barriers. This study aims to evaluate the utility, validity, and potential limitations of SD in glioblastoma (GBM) and brain metastases (BM) research. METHODS: Three published neuro-oncology studies focusing on prognostic factors were selected: 2 involving GBM patients and 1 involving BM patients. These studies were replicated using the MDClone platform, a healthcare data exploration tool that enables the creation of SD. Real-world data and SD were compared across patient demographic and outcome variables using summary statistics, normality testing, and t-tests as appropriate. RESULTS: Synthetic cohorts of 452 GBM patients and 1320 BM patients were generated. Among GBM patients, longer median overall survival was associated with younger age (age <50: 16.3 months [95% CI: 12.8-19.8]; age 50-59: 15.6 [95% CI: 13.1-18.1]; age 60-69: 13.9 [95% CI: 12.1-15.7]; age >70: 8.8 [95% CI: 7.4-10.2], P < 0.001), greater extent of resection (debulking: 16.8 months [95% CI: 14.9-18.7] vs. biopsy: 10.9 months [95% CI: 9.6-12.3], P < 0.001), and higher serum albumin (sAlb) (sAlb <30 g/L: 7.0 months [95% CI: 4.8-9.3]; sAlb 30-40 g/L: 12.9 [95% CI: 11.6-14.1]; sAlb >40 g/L: 16.2 [95% CI: 13.4-19.1], P < 0.05). Among BM patients, lower systemic inflammation scores (neutrophil-lymphocyte ratio, leukocyte-lymphocyte ratio, platelet-lymphocyte ratio, monocyte-lymphocyte ratio, and C-reactive protein/albumin ratio) were associated with longer overall survival (P < 0.05). These results aligned with the findings reported in the literature. CONCLUSIONS: Integrating SD into clinical research offers potential for providing accurate predictive insights without compromising patient privacy.