INTRODUCTION: Peer assessment enables students to evaluate their peers, deepening their understanding of course objectives and boosting engagement. However, its dependability, especially in summative contexts, is often questioned. This study examined the dependability of peer assessment and the optimal numbers of items and raters needed for a reliable assessment in project-based learning (PtBL).

METHODS: In a PtBL class, 95 third-year pre-clinical students, grouped into ten teams, created 5-minute videos on cardiovascular lifestyle modifications for the community, each followed by a 5-minute presentation on the video's content and its relevance to citizens. Over three weeks, each group, guided by three advisors, refined its video. During the presentations, peers (from the nine non-presenting groups) and ten teachers (five Doctors of Medicine (MDs) and five health professors) evaluated each video presentation using a 6-item rubric covering three domains: interdisciplinary data integration, interpersonal skills, and video quality/effectiveness. Messick's validity framework guided the collection of validity evidence.

RESULTS: Five sources of validity evidence were collected. 1) Content: Three professors confirmed the rubric's content validity. 2) Response process: Mean scores from students, MDs, and health professors were similar at 54.00 ± 4.03, 53.24 ± 4.18, and 54.16 ± 4.16, respectively (F = 0.75, p = 0.472). 3) Internal structure: A generalizability theory analysis with a fully crossed design (p × i × r) showed that achieving a Phi coefficient ≥ 0.70 on the six-item rubric requires 27 students (Phi = 0.70), 7 MDs (Phi = 0.70), or 5 health professors (Phi = 0.73); a worked D-study sketch follows below. A nested design (r:(p × i)) demonstrated superior reliability, requiring only 9 students, 5 MDs, or 4 health professors for acceptable reliability. Confirmatory factor analysis indicated a good model fit. 4) Relations to other variables: On average, peer and teacher ratings were 54.00 ± 2.22 and 53.70 ± 2.78, respectively, with an inter-rater reliability of r = 0.73 (p = 0.016); see the correlation sketch below. 5) Consequences: Most groups found peer assessment beneficial for gaining insightful feedback (8/10), enhancing engagement (7/10), refining their work (5/10), and learning to give structured feedback (3/10), though half of the groups raised concerns about potential bias (5/10).

CONCLUSION: Dependability evidence for peer assessment in the PtBL context was successfully gathered. In PtBL, students can contribute to grading because of their diverse expertise. While peer assessment cannot replace teacher evaluation, it enhances engagement, enriches the learning environment, and improves assessment quality through valuable feedback.
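
To make the generalizability (G-theory) result concrete, the following is a minimal Python sketch of a decision (D) study for the fully crossed p × i × r design. The Phi (dependability) coefficient for absolute decisions follows the standard G-theory formula; the variance components in var are hypothetical placeholders, not estimates from this study.

    # Minimal D-study sketch for a fully crossed p x i x r design.
    # CAUTION: the variance components below are hypothetical placeholders,
    # not the estimates reported in this study.

    def phi_coefficient(var, n_i, n_r):
        """Phi (dependability) coefficient for absolute decisions."""
        # Absolute error variance: every facet except persons (p) contributes,
        # each divided by the number of conditions sampled in the D-study.
        abs_error = (var["i"] / n_i                 # items
                     + var["r"] / n_r               # raters
                     + var["pi"] / n_i              # person x item
                     + var["pr"] / n_r              # person x rater
                     + var["ir"] / (n_i * n_r)      # item x rater
                     + var["pir"] / (n_i * n_r))    # residual
        return var["p"] / (var["p"] + abs_error)

    # Hypothetical G-study variance components:
    var = {"p": 4.0, "i": 0.5, "r": 2.0, "pi": 0.8, "pr": 3.0, "ir": 0.3, "pir": 6.0}

    # Scan rater counts for the fixed six-item rubric, as in the abstract.
    for n_r in range(1, 31):
        if phi_coefficient(var, n_i=6, n_r=n_r) >= 0.70:
            print(f"Phi >= 0.70 first reached with {n_r} raters")
            break

With the actual variance components estimated from the G-study, the same scan would reproduce the rater counts reported above; the nested r:(p × i) design uses a different decomposition of the error variance and therefore yields different (here, smaller) required rater counts.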
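
Similarly, the inter-rater reliability between peer and teacher ratings can be illustrated as a Pearson correlation over the ten group means. This is a minimal sketch, assuming the reliability was computed that way and using made-up group means rather than the study's data:

    from scipy.stats import pearsonr

    # Hypothetical mean ratings for the ten groups (not the study's data).
    peer_means    = [52.1, 54.3, 55.0, 53.2, 56.1, 54.8, 51.9, 55.4, 53.7, 53.5]
    teacher_means = [51.5, 54.0, 55.8, 52.6, 56.9, 53.9, 50.8, 56.2, 52.9, 52.4]

    r, p = pearsonr(peer_means, teacher_means)
    print(f"inter-rater reliability: r = {r:.2f} (p = {p:.3f})")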