BACKGROUND: Stereotactic body radiotherapy (SBRT) has become a promising alternative for patients with inoperable liver cancer. However, accurate delivery of high doses to moving liver tumors remains challenging. Treatment accuracy can be quantified by comparing post-radiotherapeutic magnetic resonance imaging (MRI) morphologic alterations (MMA) with the corresponding isodose structure cropped to the liver (ISL) on the planning computed tomography (CT). This study aimed to evaluate the robustness of accuracy metrics and to investigate the factors influencing the treatment accuracy of liver SBRT using an internal target volume (ITV) strategy based on four-dimensional (4D) CT. METHODS: A retrospective observational study was conducted on a cohort of 31 liver cancer patients who underwent liver SBRT using an ITV strategy based on 4D CT from October 2018 to March 2024. All patients exhibited localized morphological changes on MRI. In vivo analysis (IVA) of liver SBRT was performed by comparing MMA and ISL following deformable image registration of the post-radiotherapeutic MRI and the planning CT. Accuracy metrics included the Dice similarity coefficient (DSC), conformity index of MMA and ISL (CIMI), Hausdorff distance (HD), mean distance to agreement (MDA), and three-dimensional center-of-mass difference (3D-CoMD). Correlation analysis between accuracy metrics and potential influencing factors was conducted to evaluate the robustness of the accuracy metrics. For each accuracy metric, patients were stratified into two groups in ascending order. The Kaplan-Meier method was used to evaluate the influence of IVA on progression-free survival (PFS) of the clinical target volume (CTV) in the two groups. Two-sample RESULTS: Distance metrics (HD, MDA, and 3D-CoMD) were significantly (P<0.050) influenced by gross tumor volume (GTV), planning target volume (PTV), and time to post-therapeutic MRI. PFS of CTV differed significantly between the groups defined by DSC >0.7, CIMI >0.5, HD <25 mm, MDA <5 mm, and 3D-CoMD <8 mm (log-rank P=0.013, P=0.013, P=0.002, P=0.009, and P=0.022, respectively). Motion amplitude did not differ significantly between the two groups defined by the thresholds of DSC, CIMI, HD, MDA, and 3D-CoMD. CONCLUSIONS: In this