Thermal and RGB images differ significantly in how they represent scene information, especially in low-light or nighttime environments. Thermal images capture temperature information, complementing RGB images by recovering details and contextual cues that visible-light sensing loses. However, the spatial discrepancy between modalities in RGB-Thermal (RGB-T) semantic segmentation complicates multimodal feature fusion, causing a loss of spatial contextual information and limiting model performance. This paper proposes a channel-space fusion nonlinear spiking neural P system model network (CSPM-SNPNet) to address these challenges. A novel color-thermal image fusion module effectively integrates features from both modalities. During decoding, a nonlinear spiking neural P system enhances multi-channel information extraction through convolutional spiking neural P system (ConvSNP) operations, fully restoring the features learned in the encoder. Experimental results on the public MFNet and PST900 datasets demonstrate that CSPM-SNPNet significantly improves segmentation performance: compared with existing methods, it achieves a 0.5% mIoU improvement on MFNet and 1.8% on PST900, showing its effectiveness in complex scenes.
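For intuition, the sketch below shows a generic channel-spatial attention block for fusing RGB and thermal feature maps, assuming PyTorch. The module name `RGBTFusion` and its specific gating design are illustrative assumptions for how such a color-thermal fusion module could be structured, not the paper's actual CSPM-SNPNet implementation.

```python
# Minimal sketch of a channel-spatial RGB-T fusion block (assumed design,
# not the paper's exact module). Channel attention decides, per channel,
# how to blend the two modalities; spatial attention then re-weights the
# fused map per pixel.
import torch
import torch.nn as nn


class RGBTFusion(nn.Module):
    """Hypothetical fusion of RGB and thermal feature maps of equal shape."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Channel gate: squeeze the concatenated features to a per-channel
        # blending weight in [0, 1].
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial gate: a single-channel attention map from pooled statistics.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        cat = torch.cat([rgb, thermal], dim=1)        # (B, 2C, H, W)
        w = self.channel_gate(cat)                    # (B, C, 1, 1)
        fused = w * rgb + (1 - w) * thermal           # channel-wise blend
        avg = fused.mean(dim=1, keepdim=True)         # (B, 1, H, W)
        mx, _ = fused.max(dim=1, keepdim=True)        # (B, 1, H, W)
        s = self.spatial_gate(torch.cat([avg, mx], dim=1))
        return fused * s                              # spatial re-weighting


if __name__ == "__main__":
    # Fuse 64-channel encoder features from each modality.
    fusion = RGBTFusion(channels=64)
    rgb = torch.randn(2, 64, 120, 160)
    thermal = torch.randn(2, 64, 120, 160)
    print(fusion(rgb, thermal).shape)  # torch.Size([2, 64, 120, 160])
```

The design choice illustrated here is that fusion is learned rather than fixed: the channel gate lets the network lean on thermal features where RGB is uninformative (e.g., at night), which mirrors the complementary-modality motivation stated in the abstract.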