Ageing is associated with elevated pure-tone thresholds, accompanied by increased difficulty in understanding speech in noise. While amplification provides important but insufficient support, auditory-cognitive training (ACT) might offer a solution. However, generalized training effects have been scarce, highlighting the need for training designs that target naturalistic listening situations. We addressed this issue by designing a short-term ACT in a purely auditory environment and in a virtual multisensory environment, targeting both sensory and cognitive processing of natural speech. Forty healthy older participants with varying hearing and cognitive capacities were exposed to both trainings (cross-over design), and speech-in-noise perception was measured before and after each session. Immersive ACT exposure resulted in improved speech-in-noise perception, particularly for individuals with more pronounced hearing loss or reduced auditory working memory capacity. These results demonstrate that combining sensory and cognitive training elements, particularly in a multisensory environment, has the potential to improve speech-in-noise perception.