MUSET: set of utilities for constructing abundance unitig matrices from sequencing data.


Authors: Francesco Andreace, Rayan Chikhi, Yoann Dufresne, Camila Duitama González, Riccardo Vicedomini

Language: eng

Classification number: 551.8 Structural geology

Publication information: England : Bioinformatics (Oxford, England), 2025

Physical description:

Collection: NCBI

ID: 690544

SUMMARY: MUSET is a novel set of utilities designed to efficiently construct abundance unitig matrices from sequencing data. Unitig matrices extend the concept of k-mer matrices by merging overlapping k-mers that unambiguously belong to the same sequence. MUSET addresses the limitations of current software by integrating k-mer counting and unitig extraction to generate unitig matrices that contain abundance values, whereas previous tools record only presence or absence. These matrices preserve variations between samples while requiring less disk space and fewer rows than k-mer matrices. We evaluated MUSET's performance using datasets derived from a 618-GB collection of ancient oral sequencing samples, producing a filtered unitig matrix that records abundances in <10 h and 20 GB of memory.

AVAILABILITY AND IMPLEMENTATION: MUSET is open source and publicly available under the AGPL-3.0 licence on GitHub at https://github.com/CamilaDuitama/muset. The source code is implemented in C++ and is provided with kmat_tools, a collection of tools for processing k-mer matrices. Version v0.5.1 is available on Zenodo with DOI 10.5281/zenodo.14164801.
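
The idea of collapsing a k-mer abundance matrix into a unitig abundance matrix can be illustrated with a toy Python sketch. This is not MUSET's implementation (which is written in C++) and does not use its interface: the function names are hypothetical, only the forward strand is considered (no canonical k-mers), no abundance filtering is applied, and taking the mean k-mer count as the unitig abundance is an assumption made here for illustration.

```python
from collections import Counter

def kmer_matrix(samples, k):
    """Map each k-mer to its list of counts, one entry per sample.
    `samples` is a list of samples, each a list of read strings."""
    counts = [Counter(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
              for reads in samples]
    kmers = sorted(set().union(*counts))
    return {km: [c[km] for c in counts] for km in kmers}

def unitig_paths(kmers):
    """Group k-mers into maximal non-branching paths (forward strand only)."""
    kmers = set(kmers)

    def succs(km):
        return [km[1:] + b for b in "ACGT" if km[1:] + b in kmers]

    def preds(km):
        return [b + km[:-1] for b in "ACGT" if b + km[:-1] in kmers]

    def is_start(km):
        # A k-mer opens a unitig unless it has exactly one predecessor
        # and that predecessor has exactly one successor.
        p = preds(km)
        return len(p) != 1 or len(succs(p[0])) != 1

    def extend(km, seen):
        path = [km]
        seen.add(km)
        while True:
            s = succs(path[-1])
            if len(s) != 1 or len(preds(s[0])) != 1 or s[0] in seen:
                return path
            path.append(s[0])
            seen.add(s[0])

    seen, paths = set(), []
    for km in sorted(kmers):
        if is_start(km):
            paths.append(extend(km, seen))
    for km in sorted(kmers):  # leftovers belong to isolated cycles
        if km not in seen:
            paths.append(extend(km, seen))
    return paths

def unitig_matrix(samples, k):
    """Map each unitig sequence to per-sample abundances.
    Assumption: abundance = mean count of the unitig's k-mers."""
    km_rows = kmer_matrix(samples, k)
    n = len(samples)
    out = {}
    for path in unitig_paths(km_rows):
        seq = path[0] + "".join(km[-1] for km in path[1:])
        out[seq] = [sum(km_rows[km][j] for km in path) / len(path)
                    for j in range(n)]
    return out

# Two toy samples that share a prefix and differ by one base.
samples = [["AACGTC", "AACGTC"], ["AACGGC"]]
print(len(kmer_matrix(samples, 3)))   # 6 k-mer rows
print(unitig_matrix(samples, 3))      # 3 unitig rows
```

In this toy input the six k-mer rows collapse into three unitig rows: the shared unitig AACG keeps a nonzero abundance in both samples, while CGTC and CGGC are each specific to one sample, so the variation between samples is preserved with fewer rows.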
