The detection and identification of polycyclic aromatic hydrocarbons (PAHs) and their modified derivatives in contaminated soil is challenging because of the chemical and microbial complexity of soil organic matter. To address this challenge, we developed an analytical approach that combines surface-enhanced Raman spectroscopy (SERS) with a Raman spectral library constructed in silico from density functional theory (DFT)-calculated spectra. This method overcomes several limitations of traditional experimental libraries, including spectral background interference, solvent effects, and the need for reference compounds that are commercially unavailable or difficult to synthesize. Our methodology employs a physics-informed machine learning pipeline that operates in two stages: the characteristic peak extraction (CaPE) algorithm, which isolates distinctive spectral features, and the characteristic peak similarity (CaPSim) algorithm, which identifies analytes with high robustness to spectral shifts and amplitude variations. Validation of this approach showed strong similarity values (> 0.6) between DFT-calculated and experimental SERS spectra for multiple PAHs, confirming its accuracy and discriminative capability. This study establishes the viability of DFT-calculated spectra as reliable references for identifying analytes that lack experimental reference spectra, including those formed through environmental modification of PAHs. This advancement addresses a critical gap in environmental monitoring and provides a valuable tool for assessing the public health risks associated with these contaminants.
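
The two-stage idea described above (extract characteristic peaks, then compare peak sets rather than raw intensities) can be illustrated with a minimal sketch. This is not the CaPE or CaPSim algorithm itself, whose details are not given here; the function names `extract_peaks` and `peak_similarity`, the relative height threshold, the matching tolerance in cm^-1, and the toy spectra are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def extract_peaks(shifts: Sequence[float],
                  intensities: Sequence[float],
                  min_rel_height: float = 0.1) -> List[float]:
    """Return the Raman shifts of local maxima whose intensity is at least
    min_rel_height times the spectrum maximum (hypothetical stand-in for a
    peak-extraction stage)."""
    if len(shifts) != len(intensities):
        raise ValueError("shifts and intensities must have equal length")
    if not intensities or max(intensities) <= 0:
        return []
    threshold = min_rel_height * max(intensities)
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        # Local maximum: strictly above the left neighbour, not below the right.
        if y >= threshold and y > intensities[i - 1] and y >= intensities[i + 1]:
            peaks.append(shifts[i])
    return peaks


def peak_similarity(ref_peaks: Sequence[float],
                    query_peaks: Sequence[float],
                    tolerance: float = 10.0) -> float:
    """Dice-style score 2*m / (n_ref + n_query), where m counts reference peaks
    greedily matched to a distinct query peak within `tolerance` (cm^-1).
    Only peak positions are used, so the score ignores intensity scaling."""
    if not ref_peaks or not query_peaks:
        return 0.0
    unused = sorted(query_peaks)
    matched = 0
    for r in sorted(ref_peaks):
        best = min(unused, key=lambda q: abs(q - r), default=None)
        if best is not None and abs(best - r) <= tolerance:
            matched += 1
            unused.remove(best)
    return 2 * matched / (len(ref_peaks) + len(query_peaks))


# Toy data (invented): a "calculated" reference and a "measured" spectrum that
# is shifted by 10 cm^-1 and three times more intense.
shifts = [1000 + 10 * i for i in range(11)]
reference = [0, 1, 5, 1, 0, 0, 2, 8, 2, 0, 0]
measured = [0, 0, 3, 15, 3, 0, 0, 6, 24, 6, 0]

ref_peaks = extract_peaks(shifts, reference)
query_peaks = extract_peaks(shifts, measured)
score = peak_similarity(ref_peaks, query_peaks, tolerance=10.0)
```

Because the comparison uses matched peak positions within a tolerance, a uniform shift of the measured spectrum and a change in overall signal amplitude leave the score unchanged in this toy case, which is the kind of robustness the abstract attributes to the second stage. A tighter tolerance would reject the same pair.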