Representing DNA for machine learning algorithms: A primer on one-hot, binary, and integer encodings.

 0 Người đánh giá. Xếp hạng trung bình 0

Tác giả: Yash Munnalal Gupta, Somjit Homchan, Satwika Nindya Kirana

Ngôn ngữ: eng

Ký hiệu phân loại: 286.133 *National Baptist Convention of the United States of America

Thông tin xuất bản: United States : Biochemistry and molecular biology education : a bimonthly publication of the International Union of Biochemistry and Molecular Biology , 2025

Mô tả vật lý:

Bộ sưu tập: NCBI

ID: 700397

This short paper presents an educational approach to teaching three popular methods for encoding DNA sequences: one-hot encoding, binary encoding, and integer encoding. Aimed at bioinformatics and computational biology students, our learning intervention focuses on developing practical skills in implementing these essential techniques for efficient representation and analysis of genetic data. The primary goal of this study is to enhance students' understanding and practical application of DNA encoding methods, which are crucial for various computational analyses in bioinformatics. Our intervention consists of three key components: (1) a conceptual framework that contextualizes these encoding methods within broader bioinformatics applications, (2) an interactive Jupyter Notebook with Python code examples (https://github.com/yashmgupta/Representing-DNA/tree/main), and (3) a user-friendly Streamlit application for visualizing encoded sequences (https://dnaencoding.streamlit.app/) that also enables students to input their own DNA sequences and visualize the different encoding methods, further enhancing their understanding and practical experience. By combining conceptual overview with practical coding and visualization tools, our approach provides a comprehensive foundation for students to leverage these key DNA sequence encoding methods in their future work. This study contributes to bioinformatics education by offering effective, hands-on learning resources that bridge the gap between theoretical knowledge and practical application in DNA sequence analysis, preparing students for advanced research and data analysis projects in the field.
Tạo bộ sưu tập với mã QR

THƯ VIỆN - TRƯỜNG ĐẠI HỌC CÔNG NGHỆ TP.HCM

ĐT: (028) 36225755 | Email: tt.thuvien@hutech.edu.vn

Copyright @2024 THƯ VIỆN HUTECH