BACKGROUND: Artificial intelligence (AI) systems in endoscopy are predominantly developed and tested using high-quality imagery from expert centers. Their performance may differ in clinical practice, partly because endoscopy units use a variety of post-processing enhancement settings. We evaluated the impact of post-processing enhancement settings on AI performance and tested specific data augmentation strategies to mitigate performance loss.

METHODS: We used a computer-aided detection (CADe) system for Barrett's neoplasia (6223 images, 906 patients) and a computer-aided diagnosis (CADx) system for colorectal polyps (3288 images, 969 patients), both trained on datasets acquired with Olympus equipment and with limited variability in enhancement settings. The CAD systems were tested on a wide range of test sets comprising the same images displayed with different enhancement settings. Both CAD systems were then retrained using image enhancement-based data augmentation, and the retrained systems were evaluated on the same test sets.

RESULTS: Both systems displayed substantial performance variability over a range of enhancement settings (CADe: 83 %–92 % sensitivity, 84 %–91 % specificity; CADx: 78 %–85 % sensitivity, 45 %–63 % specificity). After retraining, variability in sensitivity and specificity was reduced to 2 % (…).

CONCLUSION: The performance of endoscopic AI systems can vary substantially depending on the post-processing enhancement settings of the endoscopy unit. Specific data augmentation can mitigate this performance loss.
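As an illustration of what image enhancement-based data augmentation could look like, the following minimal Python sketch applies a randomly sampled sharpening strength and contrast gain to a grayscale image during training. The abstract does not specify which enhancement operations or parameter ranges were used, so the choice of unsharp masking plus contrast gain, the parameter ranges, and the function names (`box_blur`, `enhance`, `random_enhancement`) are all assumptions made for this example, not the authors' method.

```python
import random


def box_blur(img):
    """3x3 mean filter with edge replication on a 2D list of floats in [0, 1]."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # replicate border rows
                    xx = min(max(x + dx, 0), w - 1)  # replicate border columns
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def enhance(img, sharpen, contrast):
    """Unsharp masking, then contrast gain around mid-grey, clipped to [0, 1].

    Assumed stand-in for an endoscope's post-processing enhancement setting.
    """
    blurred = box_blur(img)
    out = []
    for row, blurred_row in zip(img, blurred):
        new_row = []
        for p, b in zip(row, blurred_row):
            v = p + sharpen * (p - b)         # amplify local detail
            v = 0.5 + contrast * (v - 0.5)    # stretch or compress contrast
            new_row.append(min(max(v, 0.0), 1.0))
        out.append(new_row)
    return out


def random_enhancement(img, rng, max_sharpen=2.0, contrast_range=(0.8, 1.2)):
    """Sample one hypothetical enhancement setting and apply it to the image."""
    sharpen = rng.uniform(0.0, max_sharpen)
    contrast = rng.uniform(*contrast_range)
    return enhance(img, sharpen, contrast)


# Usage: augment a toy 4x4 image containing a vertical edge.
rng = random.Random(0)
image = [[0.2, 0.2, 0.8, 0.8] for _ in range(4)]
augmented = random_enhancement(image, rng)
```

Sampling a new setting for every training image exposes the model to a spread of enhancement levels rather than the single setting of the acquisition site, which is the general idea behind this kind of augmentation.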