Imaging speed is critical for photoacoustic microscopy because it determines the ability to capture dynamic biological processes and to support real-time clinical applications. Conventional approaches to increasing imaging speed typically rely on high-repetition-rate lasers, which risk thermal damage to samples. Here, we propose a deep-learning-driven optical-scanning undersampling method for photoacoustic remote sensing (PARS) microscopy that accelerates image acquisition while keeping the laser repetition rate constant and reducing laser dosage. We develop HTC-GAN, a hybrid Transformer-convolutional neural network, to address the challenges of nonuniform sampling and motion misalignment inherent in optical-scanning undersampling. A mouse ear vasculature image dataset acquired with our customized galvanometer-scanned PARS system is used to train and validate HTC-GAN. The network restores high-quality images from 1/2-undersampled and 1/4-undersampled data, closely approximating the ground-truth images. A series of performance experiments demonstrates that HTC-GAN surpasses a basic misalignment-compensation algorithm as well as standalone CNN and Transformer networks in both perceptual quality and quantitative metrics. Moreover, three-dimensional imaging results validate the robustness and versatility of the proposed optical-scanning undersampling method across multiscale scanning modes. Our method achieves a fourfold improvement in PARS imaging speed without hardware upgrades, offering a practical route to faster imaging in other optical-scanning microscopy systems.
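For illustration only, the sketch below shows the general idea of restoring an undersampled frame with a hybrid CNN-Transformer generator in PyTorch: a convolutional encoder for local features, a Transformer bottleneck for long-range dependencies across scan lines, and a convolutional decoder back to full resolution. The class name, layer sizes, and encoder/bottleneck/decoder layout are assumptions made for this sketch and do not represent the published HTC-GAN configuration, its discriminator, or its training losses.

```python
import torch
import torch.nn as nn


class HybridGeneratorSketch(nn.Module):
    """Illustrative hybrid CNN + Transformer generator (hypothetical sizes,
    not the published HTC-GAN architecture)."""

    def __init__(self, channels=64, heads=4, layers=2):
        super().__init__()
        # CNN encoder: extract local features from the undersampled image
        self.encoder = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Transformer bottleneck: model long-range dependencies between scan positions
        enc_layer = nn.TransformerEncoderLayer(d_model=channels, nhead=heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=layers)
        # CNN decoder: upsample back to full resolution
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, x):
        f = self.encoder(x)                    # (B, C, H/2, W/2)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (B, H*W/4, C) token sequence
        tokens = self.transformer(tokens)
        f = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.decoder(f)


if __name__ == "__main__":
    # Example: map an undersampled frame to a full-resolution estimate (sizes illustrative)
    net = HybridGeneratorSketch()
    undersampled = torch.randn(1, 1, 256, 256)
    restored = net(undersampled)
    print(restored.shape)  # torch.Size([1, 1, 256, 256])
```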