Visual analysis of genome organisation and variation across bacterial species with Graphia

A new article describes an approach for the visual analysis of bacterial species pangenomes

Our recent publication in BMC Bioinformatics, ‘A graph-based approach for the visualisation and analysis of bacterial pangenomes’, written in collaboration with Roslin Institute alumnus Professor Tom Freeman, outlines a novel approach for visualising bacterial pangenome data, using Staphylococcus aureus and Legionella pneumophila as examples. The work was led by Josh Harling-Lee as part of his PhD and uses Graphia, a software application developed by the Freeman group for rapid, interactive visualisation of large and complex datasets.

Abstract

The advent of low cost, high throughput DNA sequencing has led to the availability of thousands of complete genome sequences for a wide variety of bacterial species. Examining and interpreting genetic variation on this scale represents a significant challenge to existing methods of data analysis and visualisation. Starting with the output of standard pangenome analysis tools, we describe the generation and analysis of interactive, 3D network graphs to explore the structure of bacterial populations, the distribution of genes across a population, and the syntenic order in which those genes occur, in the new open-source network analysis platform, Graphia. Both the analysis and the visualisation are scalable to datasets of thousands of genome sequences. We anticipate that the approaches presented here will be of great utility to the microbial research community, allowing faster, more intuitive, and flexible interaction with pangenome datasets, thereby enhancing interpretation of these complex data.
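To give a feel for how pangenome output can become a network graph, here is a minimal sketch in Python. It is not the method from the paper: the toy gene presence/absence data, the function names, the use of Jaccard similarity and the 0.5 threshold are all assumptions made for illustration. Each genome becomes a node, and two genomes are joined by an edge when they share a large enough fraction of their genes.

```python
from itertools import combinations

# Hypothetical toy data: genome name -> set of gene clusters present.
# Real pangenome tools typically report this as a presence/absence table.
presence = {
    "genome_A": {"dnaA", "gyrB", "mecA", "spa"},
    "genome_B": {"dnaA", "gyrB", "spa"},
    "genome_C": {"dnaA", "gyrB", "lukS"},
}


def jaccard(a, b):
    """Shared genes divided by all genes seen in either genome."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def genome_edges(presence, threshold=0.5):
    """Return (genome, genome, weight) edges at or above the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(presence), 2):
        weight = jaccard(presence[g1], presence[g2])
        if weight >= threshold:
            edges.append((g1, g2, weight))
    return edges


edges = genome_edges(presence, threshold=0.5)

# Print a tab-separated edge list: source, target, weight.
for source, target, weight in edges:
    print(f"{source}\t{target}\t{weight:.2f}")
```

With this toy data, genome_A and genome_B are linked (they share three of four genes), as are genome_B and genome_C, while genome_A and genome_C fall below the threshold. Raising the threshold leaves fewer, stronger edges, which is one simple way to make population structure stand out in a graph.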

Related links

A graph-based approach for the visualisation and analysis of bacterial pangenomes

More information on Graphia

More information on GraPPLE

Full citation

Harling-Lee, J.D., Gorzynski, J., Yebra, G. et al. A graph-based approach for the visualisation and analysis of bacterial pangenomes. BMC Bioinformatics 23, 416 (2022). https://doi.org/10.1186/s12859-022-04898-2