motifStack for the analysis of transcription factor binding site evolution

J Ou, SA Wolfe, MH Brodsky, LJ Zhu - Nature methods, 2018 - nature.com
To the Editor: A sequence motif is a short recurring pattern with biological significance, such as a DNA-recognition sequence for a transcription factor (TF), an mRNA splicing signal, or a functional region of a protein domain. Many high-throughput experimental approaches and computational tools have been developed to discover motifs from a population of functional sequences such as TF binding sites [1]. TF binding motifs are often represented as position weight matrices (PWMs) and visualized as sequence logos (Supplementary Note). To facilitate classification and comparison of motifs, researchers have developed motif alignment and clustering tools such as STAMP [2], Tomtom [3], and MatAlign [4]. However, existing tools for visualizing similarities or differences within groups of motifs are limited in their flexibility in displaying trees (STAMP), in the number of motifs they support (DiffLogo [5]), or in their ability to display motif logo alignments (Cytoscape [6]).
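As background for the matrix and sequence-logo representation mentioned above, the following is a minimal Python sketch, not code from motifStack. It builds a position frequency matrix from a few aligned sites and computes the per-position information content that conventionally sets letter heights in a sequence logo. The function names and the example sites are hypothetical, and the sketch assumes a uniform background, no pseudocounts, and no small-sample correction.

```python
import math

BASES = "ACGT"

def position_frequencies(sites):
    """Column-wise base frequencies from equal-length aligned DNA sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(length):
        column = [s[i] for s in sites]
        matrix.append({b: column.count(b) / len(sites) for b in BASES})
    return matrix

def information_content(freqs):
    """Bits of information at one position: log2(4) minus Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return math.log2(len(BASES)) - entropy

def logo_heights(matrix):
    """Letter heights for a logo: frequency times the column's information."""
    return [
        {b: p * information_content(col) for b, p in col.items()}
        for col in matrix
    ]

# Hypothetical aligned binding sites, used for illustration only.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pfm = position_frequencies(sites)
ic = [information_content(col) for col in pfm]
heights = logo_heights(pfm)

for pos, bits in enumerate(ic, start=1):
    print(f"position {pos}: {bits:.2f} bits")
```

A fully conserved position reaches the 2-bit maximum for DNA, while variable positions score lower, which is why conserved bases appear tallest in a logo.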
We describe motifStack, a Bioconductor package to visualize the alignment of motifs as a phylogenetic tree. This tool facilitates the analysis of binding site diversity and conservation within families of TFs and the evolution of TFs among different species. motifStack can align DNA motifs; generate motif signatures for closely related motifs; and plot aligned motifs as a stack, a linear or a radial tree, or a word cloud of sequence logos (Supplementary Fig. 1). Different parameter settings can be used to generate diverse types of plots with color schema highlighting important data features