Objective
The aim of this study was to compare the accuracy of ChatGPT artificial intelligence (AI) with that of clinicians in real-life case scenarios related to retinopathy of prematurity (ROP).

Methods
This was a prospectively conducted study using a questionnaire of real-life case scenarios with multiple-response answers. Thirteen clinicians, comprising eight vitreoretinal fellowship trainees (with less than two years of experience in the management of ROP) and five ROP experts (with more than three years of experience in the management of ROP), were given 10 real-life ROP case scenarios. The majority responses of the trainees and ROP experts were compared with the ChatGPT-generated responses. The ChatGPT exercise was repeated with both versions 3.5 and 4.0 more than a month apart, on May 29, 2024, and July 18, 2024, to check the consistency of the AI responses. For each case scenario, the majority clinician response was compared with the majority AI response for agreement.

Results
ChatGPT answered nine of the 10 cases correctly (90%), outperforming the fellowship trainees (77.5%; 62 correct responses out of 80). The accuracy of the ROP experts was highest at 96% (48 correct responses out of 50). There was substantial agreement between the majority clinician responses and the ChatGPT responses, with a Cohen's kappa of 0.80.

Conclusion
The ChatGPT AI model showed substantial agreement with the majority clinician responses and performed better than the vitreoretinal fellowship trainees. ChatGPT is a promising tool that can be explored further for use in real-life ROP case scenarios. A more specific prompt that states the screening guideline to be followed can elicit more accurate, guideline-concordant answers from ChatGPT.
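As a point of reference for the agreement statistic reported in the Results, Cohen's kappa corrects observed agreement for agreement expected by chance. The sketch below shows the standard formula; the observed and expected agreement values used in the worked example are illustrative assumptions only, since the underlying contingency table is not reported in this abstract.

```latex
% Cohen's kappa: observed agreement corrected for chance agreement.
% p_o = observed proportion of agreement between raters;
% p_e = proportion of agreement expected by chance.
\kappa = \frac{p_o - p_e}{1 - p_e}
% Illustrative (assumed) values only -- the study's contingency table is
% not given here. For example, p_o = 0.90 and p_e = 0.50 would yield
% \kappa = \frac{0.90 - 0.50}{1 - 0.50} = 0.80,
% a value at the upper bound of the "substantial agreement" band
% (0.61--0.80) on the Landis and Koch scale, consistent with the
% interpretation given in the Results.
```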