OBJECTIVE: Targeted blood-brain barrier (BBB) opening using focused ultrasound (FUS) and micro/nanobubbles is a promising method for brain drug delivery. This study explores the feasibility of multiple instance learning (MIL) for accurate and fast prediction of FUS BBB opening outcomes. METHODS: FUS BBB opening experiments are conducted on 52 mice infused with either SonoVue microbubbles or custom-made nanobubbles. Acoustic signals collected during the experiments are transformed into the frequency domain and used as the dataset. We propose a Simple Transformer-based model for BBB Opening Prediction (SimTBOP). By leveraging the self-attention mechanism, the model considers the contextual relationships between signals from different pulses within a treatment and aggregates this information to predict the BBB opening outcome. Multiple preprocessing methods are applied to evaluate the performance of the proposed model under various conditions. Additionally, a visualization technique is employed to explain and interpret the model. RESULTS: The proposed model achieves a prediction accuracy of 96.7%. Excluding absolute intensity information and retaining baseline noise do not affect the model's performance or interpretability. The proposed model trained on SonoVue data generalizes well to nanobubble data and vice versa. Visualization results indicate that the proposed model focuses on pulses with significant signals near the ultra-harmonic frequency. CONCLUSION: We demonstrate the feasibility of MIL in FUS BBB opening prediction. The proposed Transformer-based model exhibits strong performance, interpretability, and cross-agent generalization capability, providing a novel approach for FUS BBB opening prediction with clinical translation potential.
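
The processing chain summarized in the abstract (per-pulse signals converted to the frequency domain, self-attention across the pulses of one treatment, then aggregation into a single treatment-level representation) can be illustrated with a minimal sketch. This is not the SimTBOP implementation: the naive DFT, the identity query/key/value projections, the mean pooling, and all function names are assumptions chosen only to keep the example self-contained, and the input signals are synthetic.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive DFT magnitudes (non-negative frequency bins) of a real signal."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        spectrum.append(abs(acc) / n)
    return spectrum


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(instances):
    """Single-head scaled dot-product self-attention.

    Assumption: identity query/key/value projections, so no learned weights.
    Returns the attended instances and the attention weight matrix.
    """
    d = len(instances[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in instances:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in instances]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * instances[j][i] for j in range(len(instances)))
                        for i in range(d)])
    return outputs, weights


def bag_embedding(pulses):
    """Treat one treatment as a MIL bag: one instance (spectrum) per pulse."""
    spectra = [magnitude_spectrum(p) for p in pulses]
    attended, weights = self_attention(spectra)
    # Assumption: mean pooling over pulses yields the treatment-level vector.
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    return pooled, weights


def tone(amplitude, freq_bin, n=16):
    """Synthetic sinusoid that falls exactly on one DFT bin."""
    return [amplitude * math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


# Toy treatment with three pulses. Bin 2 stands in for the fundamental and
# bin 3 (1.5x the fundamental) for an ultra-harmonic component; only the
# last pulse carries the strong ultra-harmonic signal.
weak = tone(0.2, 2)
strong = [a + b for a, b in zip(tone(0.2, 2), tone(1.0, 3))]
embedding, attention = bag_embedding([weak, weak, strong])
```

In this toy setting the pulse carrying the strong component at bin 3 attends mostly to itself, which loosely mirrors the reported observation that attention concentrates on pulses with significant ultra-harmonic signals. A trained model would replace the identity projections with learned weights and add a classification head on top of the pooled vector.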