Interactive plot instructions:

Figure description

Visualization comparing the quality of predicted motifs for different gene sets across different organisms. Each point is a predicted motif from M. buryatense, E. coli, or B. subtilis. The numerical label on each point is the top n% gene set used to derive that motif. The x-axis shows the information content of each motif, averaged across the 12 positions in the motif. The y-axis shows the log2 ratio of the frequency with which the motif was found <100 bp from a gene start to the frequency with which it was found in intergenic regions. B. subtilis predictions had the highest overall information content and enrichment in promoter regions, whereas E. coli motif predictions had lower information content and lower enrichment in promoter regions, consistent with findings from Latif et al. suggesting that E. coli promoters have higher variability relative to the consensus. The motif chosen for experimental follow-up in M. buryatense, the top 3% motif, falls between the results for B. subtilis and E. coli in each dimension.
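
As a rough illustration of how the two axes could be computed, the sketch below derives a mean per-position information content and a log2 enrichment ratio. It is not the authors' pipeline. It assumes a uniform background over the four bases (so each position carries at most 2 bits), a motif given as per-position base probabilities, and frequencies defined as hit counts divided by the number of windows searched. The function names and parameters are hypothetical.

```python
import math

def mean_information_content(pwm):
    """Average information content (bits) across motif positions.

    pwm is a list of positions; each position is a dict mapping a base
    to its probability. Assumes a uniform A/C/G/T background, so the
    information at a position is 2 minus its Shannon entropy.
    """
    total_bits = 0.0
    for column in pwm:
        entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
        total_bits += 2.0 - entropy
    return total_bits / len(pwm)

def log2_enrichment(promoter_hits, promoter_windows,
                    intergenic_hits, intergenic_windows):
    """log2 ratio of motif frequency near gene starts to intergenic frequency.

    Frequencies are hits per window searched (an assumed normalization).
    Zero counts are not handled here; a pseudocount would be needed.
    """
    promoter_freq = promoter_hits / promoter_windows
    intergenic_freq = intergenic_hits / intergenic_windows
    return math.log2(promoter_freq / intergenic_freq)

# Toy 12-position motif: six fully conserved positions, six uninformative ones.
conserved = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_pwm = [conserved] * 6 + [uniform] * 6

x_value = mean_information_content(toy_pwm)
y_value = log2_enrichment(40, 100, 10, 100)
```

Under these assumptions, a motif with higher values on both axes is both more sharply defined and more concentrated near gene starts, which is the sense in which the B. subtilis predictions score highest in the figure.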