Coal is a vital global energy resource, and its quality directly affects the efficiency of power generation and environmental protection. Rapid and accurate coal quality analysis is therefore essential for its clean and efficient utilization. However, combined near-infrared spectroscopy and X-ray fluorescence (NIRS-XRF) analysis often suffers from the particle size effect of coal samples, which leads to unstable and inaccurate results. This study introduces a novel correction method that combines the Segment Anything Model (SAM) for particle segmentation with Data-Efficient Image Transformers (DeiTs) to model the relationship between particle size and ash measurement error. Microscopic images of coal samples are processed with SAM to generate binary mask images that reflect particle size characteristics. These masks are then analyzed with a DeiT model using transfer learning to build a correction model. Experiments show a 22% reduction in standard deviation (SD) and root mean square error (RMSE), improving the accuracy and consistency of ash prediction. By integrating image segmentation with deep learning, the approach reduces submillimeter particle size effects and improves model adaptability and measurement reliability. It also has potential for the analysis of other complex samples and for advancing automation and efficiency in online analytical systems.
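The reported improvement is expressed as a relative reduction in SD and RMSE. The following minimal sketch shows how such a reduction could be computed for ash predictions before and after a correction is applied. All numbers are hypothetical toy values, not data from this study, and the function names are illustrative. SD is taken here as the sample standard deviation of the prediction errors, which is an assumption; the study may define it over repeated measurements of the same sample.

```python
import math
import statistics


def rmse(predicted, reference):
    """Root mean square error between predictions and reference values."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def error_sd(predicted, reference):
    """Sample standard deviation of the prediction errors (assumed definition)."""
    errors = [p - r for p, r in zip(predicted, reference)]
    return statistics.stdev(errors)


def relative_reduction(before, after):
    """Fractional reduction of a metric, e.g. 0.22 for a 22% reduction."""
    return (before - after) / before


# Hypothetical ash contents (%) for four samples: toy values only.
reference = [10.0, 12.0, 15.0, 20.0]   # laboratory reference values
raw = [11.0, 11.0, 17.0, 18.0]         # uncorrected predictions

# Hypothetical per-sample error estimates, standing in for the output
# of an image-based correction model.
estimated_error = [0.5, -0.5, 1.0, -1.0]
corrected = [p - e for p, e in zip(raw, estimated_error)]

rmse_gain = relative_reduction(rmse(raw, reference), rmse(corrected, reference))
sd_gain = relative_reduction(error_sd(raw, reference), error_sd(corrected, reference))
print(f"RMSE reduction: {rmse_gain:.0%}, SD reduction: {sd_gain:.0%}")
```

In this toy case the correction removes half of each error, so both metrics fall by about half. The figure has no bearing on the 22% reported above, which comes from the study's experiments.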