BACKGROUND: Breast cancer (BC) is a malignant tumor with a high incidence rate and is the leading cause of cancer-related death among women worldwide. This study aims to identify key genes and potential prognostic biomarkers for BC using a bioinformatics approach. METHODS: Three microarray datasets, GSE86374, GSE120129, and GSE29044, were downloaded from the GEO database. GEO2R and Venn diagram software were used to identify differentially expressed genes (DEGs), and DAVID was used for functional enrichment analysis. STRING and Cytoscape were then used to construct the protein-protein interaction (PPI) network of the DEGs. UALCAN, GEPIA, and the Kaplan-Meier plotter were used for prognostic analysis, and the correlations and alterations of the key genes were examined using cBioPortal. Finally, immunohistochemistry (IHC) was performed to validate the expression levels of the key genes. RESULTS: A total of 323 DEGs were identified, and 37 hub genes were selected from the PPI network. Validation using UALCAN, GEPIA, and the Kaplan-Meier plotter showed that three key genes (RACGAP1, SPAG5, and KIF20A) were significantly overexpressed in BC and were associated with poor prognosis and advanced tumor stage. Analysis in cBioPortal indicated that alterations in these key genes co-occurred. IHC confirmed that the proteins encoded by these key genes were highly expressed in tumor tissues. CONCLUSIONS: The key genes identified in this study can enhance our understanding of the molecular mechanisms underlying BC and may serve as sensitive biomarkers for patients with BC.
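
The Venn-diagram step in METHODS can be illustrated with a minimal Python sketch. It assumes that the step keeps the genes reported as differentially expressed in all three datasets, which the abstract does not state explicitly. The function name `shared_degs` and the gene symbols are hypothetical placeholders, not GEO2R output or results from this study.

```python
def shared_degs(deg_lists):
    """Return the genes present in every per-dataset DEG list.

    deg_lists maps a dataset ID to an iterable of gene symbols.
    Symbols are upper-cased so that case differences between
    exported tables do not hide a match (an assumption of this sketch).
    """
    gene_sets = [{gene.upper() for gene in genes} for genes in deg_lists.values()]
    if not gene_sets:
        return set()
    # The centre region of a Venn diagram is the intersection of all sets.
    return set.intersection(*gene_sets)


# Placeholder lists for illustration only; real input would be the
# per-dataset DEG tables exported from GEO2R.
deg_lists = {
    "GSE86374": ["GENE_A", "GENE_B", "GENE_C"],
    "GSE120129": ["GENE_B", "GENE_C", "GENE_D"],
    "GSE29044": ["gene_c", "GENE_B", "GENE_E"],
}

common = sorted(shared_degs(deg_lists))
print(common)  # ['GENE_B', 'GENE_C']
```

A set intersection is used because it mirrors the overlap region of a Venn diagram directly. Any significance thresholds would be applied beforehand, when each per-dataset list is built.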