Analysis and benchmarking of small and large genomic variants across tandem repeats.


Authors: Mark J P Chaisson, Wouter De Coster, Egor Dolzhenko, Michael A Eberle, Adam C English, Bida Gu, Melissa Gymrek, Sean K McKenzie, Nathan D Olson, Jonghun Park, Fritz J Sedlazeck, Justin Wagner, Helyaneh Ziaei Jam, Justin M Zook

Language: eng

Classification number:

Publication information: United States: Nature Biotechnology, 2025

Physical description:

Collection: NCBI

ID: 719323

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits and are linked to over 60 disease phenotypes. However, they are often excluded from at-scale studies because of challenges with variant calling and representation, as well as a lack of a genome-wide standard. Here, to promote the development of TR methods, we created a catalog of TR regions and explored TR properties across 86 haplotype-resolved long-read human assemblies. We curated variants from the Genome in a Bottle (GIAB) HG002 individual to create a TR dataset to benchmark existing and future TR analysis methods. We also present an improved variant comparison method that handles variants greater than 4 bp in length and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ~24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 'truth-set' TR benchmark. We demonstrate the utility of this pipeline across short-read and long-read technologies.
