BACKGROUND: Staphylococcus aureus is an important pathogen that can colonize humans and a variety of animals. However, the host-associated genetic determinants of S. aureus remain unclear, which makes it difficult to infer the host species of an isolate and to trace cross-species transmission. We performed a 3-stage genome-wide association study (discovery, confirmation, and validation) comparing genetic variation between pig and human S. aureus isolates to identify host-specific genetic elements (k-mers). RESULTS: Across the 3 stages of association analysis, we identified a subset of 20 consensus-significant host-associated k-mers, each significantly overrepresented in a specific host. Notably, both the final model using the top 5 k-mers and the simplest model using only the most important k-mer achieved a classification accuracy of 98% for host prediction, providing a simple target for predicting the host species and cross-species transmission of S. aureus. The final classifier with the top 5 k-mers predicted 97.5% of S. aureus isolates from livestock-exposed workers to be of pig origin, suggesting a high risk of cross-species transmission. Time-based phylogenetic analysis inferred the directions of cross-species transmission, indicating that ST9 can spread from animals to humans, whereas ST59 can spread in the opposite direction. CONCLUSION: Our findings provide novel insights into host-associated determinants and an accurate model for inferring the host species and cross-species transmission of S. aureus.
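
As an illustration only, the idea of predicting host species from the presence or absence of a few k-mers could be sketched as follows. This is a minimal toy sketch and not the classifier used in the study: the k-mer sequences, the toy genomes, the function names, and the simple majority-vote rule are all hypothetical assumptions, and reverse-complement matching is ignored for brevity.

```python
from collections import Counter

# Hypothetical host-associated k-mers (placeholders, not from the study).
KMERS = ["ACGTAC", "TTGACA"]

def kmer_presence(genome, kmers):
    """Return a 1/0 vector: does each k-mer occur in the genome string?"""
    return [1 if k in genome else 0 for k in kmers]

def fit_majority_rule(features, labels):
    """Map each presence/absence pattern to its most common host label."""
    votes = {}
    for f, y in zip(features, labels):
        votes.setdefault(tuple(f), Counter())[y] += 1
    return {pattern: c.most_common(1)[0][0] for pattern, c in votes.items()}

def predict(rule, feature, default="unknown"):
    """Look up the host label for a pattern; unseen patterns get `default`."""
    return rule.get(tuple(feature), default)

# Toy training set of (genome fragment, host) pairs; entirely made up.
TRAIN = [
    ("GGACGTACGG", "pig"),
    ("CCACGTACTT", "pig"),
    ("AATTGACAGG", "human"),
    ("GTTGACACCA", "human"),
]

X = [kmer_presence(g, KMERS) for g, _ in TRAIN]
y = [host for _, host in TRAIN]
RULE = fit_majority_rule(X, y)

# Classify a new toy isolate by its k-mer pattern.
query = kmer_presence("TTACGTACAA", KMERS)
print(predict(RULE, query))
```

In practice a statistical or machine-learning classifier would replace the lookup rule, but the input representation (a short presence/absence vector over selected k-mers) would stay the same in spirit.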