18th International Symposium on Staphylococci and Staphylococcal Infections (ISSSI) in Copenhagen, Denmark – August 2018

I had the opportunity to attend the ISSSI conference, devoted to the study of Staphylococci, for the first time, in beautiful Copenhagen. It was a fantastic conference in which I met lots of researchers with common interests and reunited with collaborators.

This occasion was special because I got to present for the first time results from my work on bacteria, from one of the projects I’m involved in. It was entitled “Massive gene decay and insertion sequence acquisition has shaped the evolutionary history of the host restricted Staphylococcus aureus subsp. anaerobius”.


Presenting in the fancy auditorium of the Maersk Tower

Staphylococcus aureus subsp. anaerobius is the aetiological agent of Morel’s disease in sheep and goats, which causes very specific abscesses in lymph nodes. This bacterium only grows in microaerophilic conditions and is unable to infect other hosts. We applied whole-genome sequencing to a collection of isolates from several countries, aiming to examine its evolutionary history and to understand the molecular basis of its host adaptation and restricted metabolism.

Using phylogenetics and population genomics we inferred that anaerobius emerged at least a thousand years ago but has had a very limited expansion. It seems to have evolved from an S. aureus subsp. aureus ancestor that jumped to a new host and underwent an extreme host adaptation that changed its genome dramatically. Some of the genomic signatures of this process are massive gene decay, seen as the accumulation of many pseudogenes and insertion elements, and the existence of large chromosomal rearrangements. Similar evidence of restricted niche adaptation has been reported in other distantly related bacteria such as Mycobacterium or Yersinia, but it is unheard of among the Staphylococci.

The presence of these abundant pseudogenes (that take up 10% of the genome) could explain why this bacterium has such a restricted metabolism (including being unable to grow in aerobic conditions) and reduced pathogenicity.

We also performed expression analyses that revealed that the insertion elements are being transcribed and that this represses the transcription of the genes located next to them. We suggest that this control of gene expression mediated by insertion elements underpins an orchestrated mechanism of host adaptation.

To sum up, Staphylococcus aureus subsp. anaerobius is a remarkable example of the drastic modifications that affect a bacterial genome during severe adaptation to a new host.

I got great feedback and comments which will hopefully help me finish up this study and aim for a publication soon!


New position and new study field

It’s already been a month since I moved from King’s Buildings to the Roslin Institute in March 2017. It wasn’t a big move in terms of distance (Roslin is just outside Edinburgh city), but it meant quite a bit in terms of research field, since I swapped HIV molecular epidemiology for bacterial genomics, particularly Staphylococcus. Although for some people this might not seem too different, it implies dealing with genomes about 300 times larger and a much more complex genetic organisation.

I will be working in Prof. Ross Fitzgerald’s group (the “Laboratory for Bacterial Evolution and Pathogenesis”) in a Wellcome Trust funded project that aims to investigate the molecular basis of S. aureus host-adaptation.

S. aureus is an important pathogen that affects humans, livestock and wildlife, and has undergone numerous host-switching events during its evolutionary history leading to the emergence of new pandemic clones.

In collaboration with researchers from around the world, we will collect and sequence the whole genomes of hundreds of S. aureus isolates, and apply genome-wide association and evolutionary genomic analysis to understand the genetic basis for this pathogen’s host-tropism and epidemic clone emergence. This fascinating project will involve more colleagues at the University of Edinburgh but also at the University of Glasgow, who will apply different approaches to the same topic.

Despite the difficulties of starting a new job and having to learn new concepts and techniques, everyone at the lab and in the Institute in general has been tremendously welcoming and I feel very well taken care of. Hopefully the future will bring many successes!

24th Conference on Retroviruses and Opportunistic Infections (CROI) in Seattle, WA – February 2017

I was lucky enough to attend, once again, the CROI conference, the biggest convention on HIV research. I presented the work “Analysis of Nearly Full-Genome HIV-1 Sequences from Uganda: Results from PANGEA_HIV” as a poster.

As you can see from the title, in this communication I shared the preliminary results from the analysis of nearly full-genome HIV sequences generated from samples taken in Uganda. We found a remarkable proportion of A1/D recombinant sequences, but low rates of drug resistance mutations across different genes and transmission between different populations.


Just arrived in Seattle: with my boss Andy Leigh Brown and my colleagues Manon Ragonnet and Emma Hodcroft

The samples studied corresponded to individuals from several cohorts studied by my colleagues at the MRC-UVRI in Entebbe, and are part of the PANGEA_HIV project, dedicated to increasing the understanding of HIV transmission dynamics in Africa by producing and using HIV sequence data. The 685 samples used in this particular dataset corresponded to contemporary sequences (sampled between 2009 and 2014) from 3 cohorts: i) a rural population in Masaka district in the south-west of Uganda, ii) fishing communities (“fisherfolk”) who work in different sites around the shores of Lake Victoria, and iii) female sex workers from Kampala, the capital city. Additionally, we analysed historical samples which were taken from patients with AIDS in 1986 as part of a serological surveillance study in Kampala hospitals. The samples were sequenced with Illumina MiSeq next-generation sequencing at the Wellcome Trust Sanger Institute and processed using a pipeline created for this purpose by colleagues at UCL, which produced consensus sequences longer than 1 kb for 609 (89%) of the samples (565 being contemporary and 44 historical).

Given the long-term co-circulation of subtypes A1 and D in Uganda, frequent recombination between them was to be expected. However, the levels found here (52% in contemporary sequences and 80% in historical ones) were much higher than those previously reported. This is most likely related to the fact that we analysed full genomes, as opposed to the traditional analysis of partial pol sequences, which limits our ability to detect recombination breakpoints. These results were obtained with the SCUEAL subtyping tool adapted to HIV full-genome sequence analysis.

To test the level of HIV transmission between different populations, we looked for transmission clusters, i.e. groups of closely related sequences in phylogenetic trees, among contemporary sequences. We found 54 of them (44 sequence pairs, 10 triplets), which involved 21% of the sample. Most clusters involved individuals from the same population only (mainly corresponding to fisherfolk), although not always sampled in the same population. This could reveal a compartmentalised epidemic in which different populations don’t frequently interact despite being mobile populations. Alternatively, this might be due to the fact that we need a deeper sampling strategy to reveal undetected clusters.
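To make the idea of a transmission cluster more concrete, here is a minimal Python sketch. It is not the method used for the poster (which is not described here): it assumes that clusters can be approximated by linking any two sequences whose pairwise genetic distance falls below a cut-off, and the sequence names, distances and threshold are all made up for illustration. The last lines simply check the arithmetic behind the 21% figure, assuming the denominator is the 565 contemporary sequences.

```python
def find_clusters(distances, threshold):
    """Group sequences linked by a pairwise distance <= threshold (single linkage)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        root_a, root_b = find(a), find(b)
        if d <= threshold:
            parent[root_a] = root_b

    groups = {}
    for seq in list(parent):
        groups.setdefault(find(seq), set()).add(seq)
    # A sequence on its own is not a cluster
    return [g for g in groups.values() if len(g) >= 2]


# Hypothetical toy distances (substitutions per site); not real data
toy_distances = {
    ("s1", "s2"): 0.010,
    ("s1", "s3"): 0.090,
    ("s2", "s3"): 0.085,
    ("s4", "s5"): 0.012,
    ("s5", "s6"): 0.014,
    ("s4", "s6"): 0.030,
    ("s3", "s4"): 0.110,
}
clusters = find_clusters(toy_distances, threshold=0.015)  # threshold is an assumption

# Arithmetic check on the reported numbers: 44 pairs and 10 triplets
clustered_sequences = 44 * 2 + 10 * 3             # 118 sequences in clusters
clustered_fraction = clustered_sequences / 565    # about 0.21
```

In this toy example s1 and s2 form a pair, s4, s5 and s6 form a triplet (s4 and s6 are joined through s5), and s3 stays unclustered. Real analyses usually also require strong branch support in the phylogenetic tree, which this sketch ignores.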

We tested for the presence of drug resistance mutations in different genes: protease, reverse transcriptase, integrase and gp120. In contemporary sequences the level of resistance was very low, as expected in low-income settings. However, the analysis of gp120 sequences revealed a high level of usage of the X4 co-receptor in historical sequences. Usage of X4 (as opposed to R5) confers resistance to entry inhibitors, but this is to be expected considering that these samples come from 1980s patients with chronic infection who were suffering from AIDS.

We believe these results help us understand how HIV is transmitted in different populations of Uganda. However, the use of full-genome sequences poses new methodological challenges that will have to be sorted out. Fortunately, PANGEA_HIV is generating more samples that will help us in this matter, and will shed new light on this topic. So more updates to come!

You can take a look at the poster here.

Article published in Scientific Reports

Finally! Over the past Christmas break, our new paper “Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic” was published in the journal Scientific Reports (open access!).

In this study we compared how employing different HIV genes (of different lengths) and different sampling coverage levels affects the reconstruction of the correct HIV phylogeny using simulated sequence data.

This is an important question since more and more full-genome sequence data is becoming available but we don’t have enough experience in applying it to the reconstruction of HIV phylogenies: the vast majority of studies so far have used partial pol sequences. Are trees reconstructed using full genomes the most accurate ones? Do other gene(s) provide good approximations?

However, to answer this we need to know what the real phylogeny is, and the best way to do so on a large scale is using simulated data: we used a simulated HIV epidemic (developed by Emma Hodcroft and Samantha Lycett) resembling an “African Village” scenario, in which all sexual contacts were recorded. Selecting the contacts that gave rise to transmissions produced the true transmission tree. Along this tree, associated HIV sequence data was simulated, applying realistic, different evolutionary rates to different genes.

We created different combinations of gene datasets (full genome, gag-pol, gag, full pol, partial pol, and env) and sampling coverage (full coverage [100%], 60%, 20% and 5%). For each combination, 100 replicates were created, and for each of them we built a maximum likelihood tree which was compared to the true tree.
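As a rough illustration of this design, the sketch below enumerates the gene and coverage combinations and shows one common way of comparing an estimated tree with the true one. Two things are assumptions on my part for the sake of the example: that every gene dataset was crossed with every coverage level, and that trees are compared with a Robinson-Foulds-style count of clades found in one tree but not in the other (this post does not state which comparison metric the paper used). The toy trees and leaf names are made up.

```python
from itertools import product

GENES = ["full genome", "gag-pol", "gag", "full pol", "partial pol", "env"]
COVERAGE = [100, 60, 20, 5]   # sampling coverage, in percent
REPLICATES = 100

# Assumes a full crossing of gene datasets and coverage levels
design = list(product(GENES, COVERAGE))
n_trees = len(design) * REPLICATES


def clades(tree):
    """Return the non-trivial clades of a rooted tree given as nested tuples of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    root = leaves(tree)
    found.discard(root)  # the clade holding every leaf is shared by all trees
    return found


def rf_distance(tree_a, tree_b):
    """Count clades present in exactly one of the two rooted trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical five-tip trees: the estimate misplaces B and C
true_tree = ((("A", "B"), "C"), ("D", "E"))
estimated_tree = ((("A", "C"), "B"), ("D", "E"))
distance = rf_distance(true_tree, estimated_tree)
```

Under the full-crossing assumption this gives 24 combinations and 2,400 trees to build and compare. In the toy example the two trees differ in one clade each (A with B versus A with C), so the distance is 2, while a perfectly reconstructed tree would score 0.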

We found that the accuracy of the trees increased significantly with the length of the sequences used, with the full genome datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets.

Thanks to the increasingly affordable generation of full HIV genomes, we will be able to analyse longer genetic regions that, according to our results, will improve the reliability of phylogenetic reconstruction. The short pol sequences generated for resistance testing that are traditionally used in most molecular epidemiology studies are substantially less reliable, especially with low sampling depths.

Contagion, an ‘infectious’ public engagement event at the Science Museum, London (26/10/16)

I had the incredible opportunity of joining my colleagues at the Farr Institute (with whom I collaborate as part of the ICONIC project) in “Contagion”, an evening of science outreach in the quirky and fun Science Museum, London.


One of our banners #datasaveslives

Contagion was a public engagement event held last October 26 that focused on revealing to the public different aspects of important infectious diseases (HIV, Ebola, Zika, Polio, Malaria…) and the approaches that researchers take to study them. Contagion, generously funded by the Bill & Melinda Gates Foundation, was part of the Lates programme, which according to the Science Museum consists of “adults-only, after-hours theme nights that take place in the Museum on the last Wednesday of every month. Each entry in this hugely popular ongoing series of events centres on a different theme: from sex to climate change, from big data to childhood.”


Very curious people (yes, there was booze!)

The slogan for the Farr Institute stand was “Tracking viruses in space and time”, and we did our best to describe what we can learn from the analysis of genetic material from viruses, especially using phylogenetics, and how that information can help us to improve global health. Viral epidemics are becoming more and more global, and information about how viruses transmit and spread around the world is key if we want to implement measures to tackle this expansion.

Lots of people with quite different backgrounds were very interested in our work, which provided an evening of fascinating debate and learning. It was great fun!

Visit to the MRC/UVRI in Entebbe, Uganda

I had the pleasure of staying at the MRC-funded Uganda Virus Research Institute (UVRI) for the first two weeks of October. The MRC/UVRI is an 80-year-old institution located in Entebbe, Uganda, that conducts public health related research. The motivation of the visit stems from the long-term collaboration between the Leigh Brown group at the University of Edinburgh and the Research Unit on AIDS at the UVRI.


Clinical Diagnostic Labs, MRC/UVRI (from http://www.mrcuganda.org)

My visit was made possible by a MUIIplus travel grant for visiting scientists. The MUII (Makerere University/UVRI Infection and Immunity) programme works with regional research centres and leading international Universities to ensure collaborative training activities including short courses, research attachments and research fellowships.

During my visit, I assisted with the installation and implementation of bioinformatics resources at the UVRI, as well as with the instruction of UVRI students and staff in those methods. We were all most interested in making available sophisticated methods for phylogenetic and phylodynamic analysis of HIV sequences, particularly RAxML and BEAST. These analyses are currently being applied to the UVRI database of HIV pol sequences associated with epidemiological data, which includes samples from different Ugandan populations.

This technology will allow UVRI staff to gain independence and experience in analysing HIV phylodynamics, and will provide resources for future analyses. Further capacity will be built in the next few months through the collaboration with UMIC, which will increase the computing capability of the UVRI. I also had the opportunity of giving a seminar going through my research career and explaining basic concepts in HIV molecular epidemiology.

Virus Genomics & Evolution 2016

1st Virus Genomics & Evolution at the Wellcome Genome Campus, Hinxton, Cambridge, UK – June 2016


One of the buildings at the Wellcome Genome Campus conference centre (phone pic)

I attended the first edition of this interesting conference covering multidisciplinary approaches to the application of virus genome sequencing to the study of the epidemiology, pathogenesis and public health implications of viruses. It was a very successful mix of well-renowned experts in different fields and young students (ironically I’m neither).

I had the opportunity of presenting again the preliminary results of the ICONIC project as a poster, highlighting the tremendous HIV strain variability that we found in our London samples (somewhat more typical of sub-Saharan African settings than of a European capital), which included many complex recombinants.

You can take a look at the poster here.