Earthquake detection is the foundation of seismological research. Recent work has shown that deep learning techniques outperform conventional methods. However, deploying these techniques in highly heterogeneous environments remains challenging, primarily because datasets vary and evaluation methods differ. Moreover, existing models often focus on detecting the more pronounced S-waves and neglect the crucial early detection of P-waves. To address this, we introduce TFEQ, a transformer-based model designed for real-time earthquake detection in diverse IoT environments. Unlike these models, TFEQ analyzes P-waves and S-waves concurrently across different domains. We substantiate TFEQ's effectiveness and broad applicability through case studies on MEMS sensor data collected by the CrowdQuake initiative, which demonstrate its reliability and generalization capabilities.