Research

Recombination, introgression and the evolution of bacterial genomes

The dynamic nature of bacterial gene pools, especially the mobility of genes among unrelated groups by lateral gene transfer, makes it difficult to develop a coherent species concept. Unconstrained gene flow between populations would prevent the emergence of distinct species but, despite this melding of genomes, functionally and genetically related populations can be described. For example, Campylobacter coli and C. jejuni dominate in swine and wild bird hosts respectively and therefore perhaps qualify as biological species on the basis of genetic isolation and phenotypic differences. This genetic structuring requires barriers to gene flow that can be: (i) mechanistic – imposed by the homology dependence of recombination or other factors promoting recombinational specificity; (ii) ecological – a consequence of physical separation in distinct niches; (iii) adaptive – implying selection against hybrid genotypes. We have shown that large numbers of intermediate (hybrid) C. jejuni/C. coli genotypes exist and that there is considerable subspecies structuring. The significance of the clades is not known, as they do not follow strict ecological divisions and are often found together, for example in chicken hosts. Using population genomics techniques, we are testing hypotheses about the genomic basis of genus, species and clade definitions, the existence of incipient species (clades), and the ecological basis of genetic introgression.

An illustration of the genetic divergence and a scenario for despeciation of C. jejuni (gray) and C. coli clades 1 (blue), 2 (yellow), and 3 (red). Between time t2 and t1, Campylobacter split into two separate species, C. jejuni and C. coli. Between t1 and the present (t0), C. coli further separated into three distinct lineages representing incipient species. At t0 C. coli clade 1 starts to accumulate genetic material imported from C. jejuni, owing to expansion into a novel agricultural niche. By t1, recombination has been sufficient to make strains with a clade 1 clonal origin indistinguishable from C. jejuni. A more speculative projection would be that the change in environmental conditions could also have a substantial effect on clade 2 and clade 3 strains, with an elevated rate of exchange with the cosmopolitan and numerous C. jejuni clade 1 bacteria, which could lead to complete despeciation at the nucleotide level by time t3. Sheppard et al (2008) Science 320 (5873): 237-239.
A scenario for the evolution of Campylobacter jejuni and C. coli. These species diverged, followed by the split of C. coli clades 1, 2 and 3. Recombination from C. jejuni to C. coli clade 1 began at some point before R1, and subsequent clonal expansion of introgressed lineages (828 and 1150 clonal complexes) at R1 and R2 led to the dominance of hybrid lineages in agriculture, human disease and currently available isolate collections. Clade 2 (C2) and 3 (C3) and clade 1 (C1*) populations from wild bird and environmental reservoirs (e.g. represented by isolates 16 and 23) remained unintrogressed. The cross‐sectional area and diameter of the lineage ‘trunks’ are based on the abundance of isolates in the PubMLST database and the length of trunks is arbitrarily defined. Sheppard et al (2012) Molecular Ecology 22, 1051–1064.


Population genomics of pathogenic Staphylococci

Staphylococcus aureus and S. epidermidis are common constituents of the microbial flora of the skin and mucous membranes of humans and other animals. These organisms are, however, best known as some of the most prevalent pathogens in surgery- or device-associated hospital-acquired infection. Because of their ubiquity, Staphylococcus infections of implanted medical devices, such as central venous catheters, prosthetic joints and heart valves, are often thought to result from contamination with commensal strains from the skin or the hospital environment. However, there is mounting evidence that disease-causing lineages are a subset of those found in these places. For example, there is evidence that phenotypes associated with attachment to host tissue and implanted device surfaces, and with the ability to form biofilms, are over-represented among pathogenic strains from indwelling devices. This implies that, rather than simple passive infection, there may be specific virulence factors associated with the emergence of pathogens from a background of harmless ancestors. By identifying the genetic elements and phenotypes associated with staphylococci that proliferate on indwelling devices, in bacteraemia, and in infection reservoirs (skin, nasal mucosa, hospital environment etc.), we aim to answer the question: why do certain (‘pathogenic’) strains survive and proliferate in a clinical setting?

Fig. 2
Population structure and genome-wide association study of S. epidermidis pathogenicity. (a) Isolation source of infection and asymptomatic carriage isolates. Shades of red correspond to the broad infection phenotype, shades of blue to the broad asymptomatic carriage phenotype. (b) Phylogenetic tree of S. epidermidis isolates, reconstructed using an approximation of the maximum-likelihood algorithm. (c) Pangenomic position of GWAS results. Ticks in the outer ring represent the pangenomic position of genes in the S. epidermidis ATCC12228 reference genome, seven plasmid genomes, and the rest of the pangenome inferred in this study. Méric* & Mageiros* et al (2018) Nature Communications 9: 5034.
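
The idea of testing whether a genetic element is over-represented among infection isolates can be illustrated with a very small sketch. This is not the GWAS method used in the study cited above; it is a plain Fisher's exact test on a single gene, and the isolate data, the gene name "geneX" and the function names are all hypothetical. A real analysis would also need to account for population structure and for testing many genes at once.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows: infection / carriage isolates. Columns: gene present / gene absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x gene-positive isolates in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def gene_association(isolates, gene):
    """Tabulate presence/absence of one gene by phenotype and test the association."""
    a = b = c = d = 0
    for phenotype, genes in isolates:
        present = gene in genes
        if phenotype == "infection":
            a, b = a + present, b + (not present)
        else:
            c, d = c + present, d + (not present)
    return fisher_exact_two_sided(a, b, c, d)


# Hypothetical toy data: (phenotype, set of genes detected in the isolate).
toy_isolates = [
    ("infection", {"geneX", "geneY"}),
    ("infection", {"geneX"}),
    ("infection", {"geneX", "geneZ"}),
    ("carriage", {"geneY"}),
    ("carriage", {"geneZ"}),
    ("carriage", set()),
]

p_value = gene_association(toy_isolates, "geneX")
```

With only six toy isolates the test has very little power, which is one reason large isolate collections are needed for this kind of work.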


Host adaptation and the evolutionary ecology of Campylobacter and Staphylococcus

The structure of bacterial populations is shaped by many factors, such as variations in the ecological strategies of different lineages. In organisms like Staphylococcus and Campylobacter, different lineages can be found in host-associated (animal and human guts) and nonhost-associated environments. The relative frequency of the different lineages varies among these niches and among hosts. By comparing large sets of strains, we identify adaptive traits associated with different environments and hosts and examine their phylogenetic distribution, in order to explore the link between ecology and phylogeny. Furthermore, comparative genome analyses allow the identification of functionally related sets of genes for experimentally testing adaptive hypotheses.

Genetic relatedness of S. aureus isolates from different hosts. (a) Host origin of all S. aureus isolates from chicken (blue), human (red), and other species (yellow). Clonal complex (CC) designations are based on shared MLST housekeeping loci. Chicken isolates were found in five sequence clusters, corresponding to CCs 1, 5, 30, 358, and 398 which are highlighted. Murray et al (2017) Genome Biology and Evolution 9 (4): 830-842. 

Genome positions of colocalized poultry-associated genes and recombination regions. (a) Poultry-associated genes (blue) and recombination regions (red) mapped to the ED98 reference chromosome. Hot spots of colocalization of poultry-associated elements are numbered. (b) Schematic diagrams of each hot spot showing gene content, including poultry-associated genes and genes containing recombination regions. Genes with no poultry-association in the same regions are also shown (grey). Poultry-associated genes and genes containing recombination regions are labeled, including S. aureus pathogenicity island genes (1), transposon-related genes (2), hypothetical proteins (3), and phage-related genes (4). More details of putative gene function can be found in supplementary table S2, Supplementary Material online. Murray et al (2017) Genome Biology and Evolution 9 (4): 830-842.



Evolutionary modeling of bacterial adaptation

The forces that generate high levels of genetic structuring in populations of bacterial pathogens remain controversial. In particular, it is not fully understood how the evolutionary processes of mutation and homologous recombination (analogous to eukaryotic sex) interact with selection to produce complex genealogies, or how this lineage structure relates to phenotypic properties such as virulence. By combining a modelling approach with multilocus sequence data from natural populations, we are demonstrating that the population genetic structure in the bacterial pathogens Campylobacter jejuni, Bacillus cereus and Neisseria meningitidis can be explained by a selection-driven evolutionary model. The predictions of our models correlate well with data from natural populations and explain the genesis and distribution of lineage clusters. Using these models, where genetic structure reflects the action of selection on the population, we are demonstrating an evolutionary advantage of homologous recombination, which leads to increased fitness variance and improves the population response to changes in the fitness landscape. Homologous recombination may, therefore, aid niche colonization, host invasion and the emergence of pathogenicity.
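
To make the ingredients of such a model concrete, the sketch below simulates a toy population in which selection, homologous recombination and mutation act in each generation. It is not the model used in our work: the binary genomes, the multiplicative fitness function, the parameter values and all function names are assumptions chosen only for illustration.

```python
import random


def fitness(genome, s=0.05):
    # Assumed fitness function: each '1' allele multiplies fitness by (1 + s).
    return (1 + s) ** sum(genome)


def fitness_variance(pop, s=0.05):
    # Population variance of fitness across all genomes.
    w = [fitness(g, s) for g in pop]
    mean = sum(w) / len(w)
    return sum((x - mean) ** 2 for x in w) / len(w)


def next_generation(pop, mu, r, rng, s=0.05):
    # Selection: parents are sampled in proportion to their fitness.
    weights = [fitness(g, s) for g in pop]
    parents = rng.choices(pop, weights=weights, k=len(pop))
    offspring = []
    for parent in parents:
        child = list(parent)
        # Homologous recombination: with probability r, a contiguous tract
        # is replaced by the matching tract from a randomly chosen donor.
        if rng.random() < r:
            donor = rng.choice(pop)
            start = rng.randrange(len(child))
            end = rng.randrange(start, len(child)) + 1
            child[start:end] = donor[start:end]
        # Mutation: each locus flips independently with probability mu.
        for i in range(len(child)):
            if rng.random() < mu:
                child[i] = 1 - child[i]
        offspring.append(tuple(child))
    return offspring


def simulate(n=100, loci=20, generations=50, mu=0.01, r=0.1, seed=1):
    # Start from a clonal population and iterate the three processes.
    rng = random.Random(seed)
    pop = [tuple([0] * loci) for _ in range(n)]
    for _ in range(generations):
        pop = next_generation(pop, mu, r, rng)
    return pop


population = simulate()
variance_with_recombination = fitness_variance(population)
```

Running simulate with r=0 and comparing fitness_variance across many seeds is one way to explore how recombination changes fitness variance in this toy setting; a single run should not be read as evidence either way.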


Attributing the source of human campylobacteriosis

Campylobacter species cause a high proportion of bacterial gastroenteritis cases and are a significant burden on health care systems and economies worldwide; however, the relative contributions of the various possible sources of infection in humans are unclear. Using national-scale genotyping of Campylobacter species, we are quantifying the relative importance of these sources. We compare multilocus and whole genome sequence data from isolates obtained from cases of human campylobacteriosis and from samples from potential human infection sources. The clinical isolates are attributed to possible sources on the basis of their allelic variation or nucleotide polymorphisms using model-based computer software, such as STRUCTURE. Using these methods, contaminated chicken meat has been shown to be among the most important sources of human disease.
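
The principle of attributing a clinical isolate to a source from its allelic profile can be sketched as follows. This is a deliberately simplified illustration and not the algorithm implemented in STRUCTURE: it assumes loci are independent, gives every source an equal prior, and smooths allele frequencies with a pseudocount. The reference profiles, the source labels and the function name are hypothetical.

```python
def attribute(profile, reference, alpha=1.0):
    """Return the probability that an allelic profile came from each source.

    profile:   tuple of allele numbers, one per locus, for the clinical isolate.
    reference: dict mapping source name -> list of profiles sampled from it.
    alpha:     pseudocount so alleles unseen in a source keep non-zero probability.
    """
    n_loci = len(profile)
    # Distinct alleles observed at each locus (all sources plus the query).
    alleles = [{profile[i]} for i in range(n_loci)]
    for profiles in reference.values():
        for p in profiles:
            for i, allele in enumerate(p):
                alleles[i].add(allele)

    likelihood = {}
    for source, profiles in reference.items():
        like = 1.0
        for i, allele in enumerate(profile):
            count = sum(1 for p in profiles if p[i] == allele)
            # Smoothed frequency of this allele at this locus in this source.
            like *= (count + alpha) / (len(profiles) + alpha * len(alleles[i]))
        likelihood[source] = like

    # Equal priors: the posterior is the normalised likelihood.
    total = sum(likelihood.values())
    return {source: like / total for source, like in likelihood.items()}


# Hypothetical three-locus reference profiles for two sources.
toy_reference = {
    "chicken": [(1, 2, 3), (1, 2, 4), (1, 5, 3)],
    "cattle": [(7, 8, 9), (7, 8, 3), (6, 8, 9)],
}

posterior = attribute((1, 2, 3), toy_reference)
```

Summing such per-isolate probabilities over all clinical isolates gives an estimate of the proportion of cases attributable to each source, under the assumptions stated above.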