Our research examines how spatial proximity, learned through visual statistical learning (VSL), initiates the categorization of individuals into in-groups and out-groups. We hypothesized that individuals consistently positioned closer in a visual array would be chosen more frequently as in-group members, while those positioned farther away would be categorized as out-group members. In Experiment 1, participants more often selected spatially associated individuals as in-group members and assigned individuals who were associated differently and positioned farther away to the out-group. Experiment 2 replicated these findings and refined the methodology by incorporating two types of visual representation: facial images and initials. These findings enhance our understanding of how VSL not only shapes perceptions of spatial proximity but also initiates group judgments. Specifically, participants indirectly learned and recognized spatial regularities, and these regularities influenced their in-group and out-group decisions, underscoring the critical role of VSL in driving early-stage social categorization within virtual environments.