COGRAM: A Computational Pipeline for Genome Assembly and Reconstruction via Optimized K-mer Sampling and De Bruijn Graph Networks

Research output: Contribution to book or proceeding, Conference article, peer-review

Abstract

Genome assembly and annotation accuracy depend fundamentally on optimal parameter selection and robust computational approaches. Here we introduce COGRAM (Coggins-Ramasamy Genomic Assembly Method), a novel bioinformatics pipeline that enhances genome assembly and reconstruction by optimizing k-mer parameters, leveraging graph theory, and incorporating machine learning techniques. COGRAM first identifies the optimal k-mer length using methods inspired by KMERGENIE and grid search, then randomly samples the genome at that optimal resolution. It then analyzes the distributions of k-mer frequency and GC content across the sampled genome windows. Next, the pipeline constructs a detailed de Bruijn framework graph from the parsed genomic data. Using this graph, COGRAM trains a network to model genomic structures effectively, enhancing accuracy and scalability. Genome reconstruction is accomplished through rigorous cross-validation with a greedy algorithm designed to iteratively refine the quality of the genome assembly. We demonstrate the effectiveness of COGRAM through benchmark tests on the E. coli genome. This pipeline represents a powerful tool for genomic projects, with potential for expansion to other projects.
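
The early stages named in the abstract (random window sampling, k-mer frequency and GC-content analysis, and de Bruijn graph construction) can be illustrated with a minimal Python sketch. This is not COGRAM's implementation: the function names, the fixed-size window scheme, and the seeded random generator are all assumptions made for illustration, and the sketch leaves out k-mer length selection, the trained network, and the greedy reconstruction step.

```python
import random
from collections import Counter, defaultdict


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def sample_windows(genome, window, n, seed=0):
    """Draw n random fixed-size windows from genome.

    Hypothetical sampling scheme: uniform start positions, with
    replacement, seeded for reproducibility.
    """
    rng = random.Random(seed)
    last_start = len(genome) - window
    starts = [rng.randrange(last_start + 1) for _ in range(n)]
    return [genome[s:s + window] for s in starts]


def de_bruijn_graph(seq, k):
    """Build a de Bruijn graph as an adjacency list.

    Nodes are (k-1)-mers; each k-mer in seq contributes one edge
    from its (k-1)-base prefix to its (k-1)-base suffix.
    """
    graph = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)
```

For example, `de_bruijn_graph("ATGCATG", 3)` links the node `AT` to `TG` twice, because the 3-mer `ATG` occurs twice in that string; repeated edges of this kind are how k-mer multiplicity shows up in the graph.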

Original language: English
Title of host publication: Social Networks Analysis and Mining - 17th International Conference, ASONAM 2025, Proceedings
Editors: Aijun An, Alfredo Cuzzocrea, Hongxin Hu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 391-412
Number of pages: 22
ISBN (Print): 9783032135124
State: Published - 2026
Event: 17th International Conference on Social Networks Analysis and Mining, ASONAM 2025 - Niagara Falls, Canada
Duration: Aug 25 2025 to Aug 28 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16322 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Social Networks Analysis and Mining, ASONAM 2025
Country/Territory: Canada
City: Niagara Falls
Period: 08/25/25 to 08/28/25

Scopus Subject Areas

  • Theoretical Computer Science
  • General Computer Science

Keywords

  • COGRAM
  • de Bruijn
  • genome assembly
  • graph networks
  • grid search
  • KMERGENIE
