Friday, October 19, 2007

Tamed twice: convergent domestication of pogo transposases in yeast and mammals

Self-promotion yes, but still a cool story (we think):

In a new paper, which appeared on October 16 in Advance Access for Molecular Biology and Evolution, we show that two different sources of normally selfish pogo-like transposases were recruited independently in the lineages of fission yeast (S. pombe) and mammals to give rise to centromere-binding proteins with important cellular functions. We call this a case of 'convergent transposase domestication'.

It is a case of convergent evolution in the sense that very similar transposases have given rise at least twice, independently and in separate evolutionary lineages, to proteins that bind specifically to the centromeric regions of their respective host chromosomes. Now, it is important to emphasize that the two sets of proteins (there are three in fission yeast, but only one in mammals, called CENP-B) may still perform distinct functions at the centromeres. We can't answer that question yet. The chromosomal function of the three S. pombe proteins is relatively well understood, but the role of mammalian CENP-B in chromosome segregation remains unclear, and even controversial. Peter Warburton has several nice review papers on the topic. We are now collaborating with Peter's lab to see if any of the many other pogo transposases domesticated in mammals have centromere-binding activity.

Congrats to Claudio and Don for their equally outstanding contributions to the study! And double congrats to Don for his first paper! The preprint can be found here.

Cool chromosome art by Alisa Poh [buy it].

Monday, October 8, 2007

Junkyard and/or 'hardware superstore'?

GMO Pundit aka David Tribe has a post about yet another story of an ancient mammalian TE co-opted for gene regulation. The study, just published in PLoS Genetics and led by Marcelo Rubinstein, an HHMI International Scholar from the Institute for Research on Genetic Engineering and Molecular Biology and the University of Buenos Aires in Argentina, demonstrates that an ancient SINE (short interspersed nuclear element) has donated an enhancer DNA element, conserved in all mammals examined, that directly contributes to the expression of the proopiomelanocortin (POMC) gene in hypothalamic neurons. The POMC gene is involved in the production of neuropeptides controlling functions as diverse as the stress response, skin and hair pigmentation, analgesia, and the regulation of food intake and energy balance. It is tempting to speculate that the acquisition of the neuronal enhancer from the SINE was a key step in the functional evolution of POMC in mammals and the advent of mammalian-specific evolutionary innovations.

Sunday, October 7, 2007

Genome-wide recruitment of a transposon family for post-transcriptional gene regulation

A new paper just published in PLoS Pathogens is raising the bar on the potential for transposable elements to be co-opted for useful cellular functions. This remarkable story, which combines bioinformatics and biochemical experiments, provides several lines of evidence supporting a hypothesis put forward nearly 40 years ago by Roy Britten and Eric Davidson: that interspersed repeats [derived from the propagation of TEs] can participate in the coordination of host gene expression and be harnessed to build regulatory networks.

Our story takes place in the trypanosomatid Leishmania major (pic on the right, courtesy of Dr Laurence Tetley). Trypanosomatids are single-celled, parasitic eukaryotes causing a variety of devastating diseases in humans, including leishmaniasis, sleeping sickness and Chagas disease. Recently, complete draft genome sequences of L. major and of two species of Trypanosoma were published. One of the most intriguing features of trypanosomatid genomes is that the bulk of their protein-coding genes are organized in large directional gene clusters (DGCs). Each DGC consists of multiple genes co-transcribed as a single pre-RNA molecule (also called a polycistron) that is processed post-transcriptionally into messenger RNAs prior to translation into different proteins. This organization is reminiscent of operons in bacteria.

However, a distinct characteristic of trypanosomatid DGCs is that they do not appear to be regulated at the level of transcription. Indeed, there are no promoter elements identifiable upstream of trypanosomatid protein-coding genes, and there seems to be no need for them: the RNA polymerase II (the transcription enzyme) of trypanosomes is known to be unusual in that it can efficiently initiate transcription without any upstream promoter elements. Together, these observations (and others) indicate that trypanosomatid gene expression is almost exclusively regulated post-transcriptionally, which is at odds with what we know about gene regulation in most other organisms (see here for a good review). Gene regulation in trypanosomatids seems to occur predominantly at the levels of processing of the polycistronic transcripts and differential stability of the resulting mRNAs. Most of the regulatory sequences known to control stability of trypanosome mRNAs have been mapped to 3’ UTRs (untranslated regions), but the precise mechanism(s) by which the coordinated regulation of gene expression is achieved has remained elusive.

Using bioinformatics tools, Bringaud et al. discovered a family of retroposons (i.e. short transposable elements that move via an RNA intermediate) in the Leishmania genome, called LmSIDER2. They found about 1,000 such elements dispersed in the L. major genome, the product of a relatively ancient genomic invasion. None of the elements appears to be able to transpose anymore, and thus the entire TE family seems to be extinct. LmSIDER2 elements were found to have a strikingly biased distribution in the genome: they are almost exclusively located in the 3’ UTRs of predicted L. major mRNAs, suggesting a global role for these elements in post-transcriptional regulation. The authors go on to show experimentally that the presence of a SIDER copy in the 3’ UTR of a gene is not benign; it decreases the stability of the corresponding mRNA in vivo (the SIDER-containing mRNA has a shorter lifespan than one without SIDER). Consistent with these experiments, microarray analyses revealed that Leishmania mRNAs containing SIDER in their 3’ UTR are generally expressed at lower levels than mRNAs without SIDER.
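To give a feel for what a 'strikingly biased distribution' means statistically, here is a minimal Python sketch of one way such a bias could be quantified. This is not the analysis from the paper: it simply asks how likely it would be to see so many elements in 3’ UTRs if insertions had landed uniformly at random. The function names are made up for illustration, and the two inputs `in_utr` and `utr_fraction` are placeholder values, not figures reported by Bringaud et al.; only the total of about 1,000 elements comes from the text above.

```python
import math


def log_binom_pmf(k, n, p):
    """Natural log of the binomial probability of exactly k successes in n trials."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def log10_upper_tail(k, n, p):
    """log10 of P(X >= k) for X ~ Binomial(n, p), computed in log space.

    Working with logs avoids floating-point underflow when the
    probability is astronomically small.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    terms = [log_binom_pmf(i, n, p) for i in range(k, n + 1)]
    largest = max(terms)
    log_total = largest + math.log(sum(math.exp(t - largest) for t in terms))
    return log_total / math.log(10)


# About 1,000 elements in total (from the post).
n_elements = 1000
# ASSUMPTION: placeholder count of elements lying in 3' UTRs.
in_utr = 900
# ASSUMPTION: placeholder fraction of the genome made up of 3' UTRs.
utr_fraction = 0.2

print("log10 P(at least this many in 3' UTRs by chance):",
      log10_upper_tail(in_utr, n_elements, utr_fraction))
```

A uniform-insertion null model is of course a simplification (retroposons rarely insert at random), so a small probability here would only say that the distribution is non-random, not why.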

Together, these data suggest an intriguing scenario whereby SIDER elements have been recruited at a genome-wide scale to modulate the expression of hundreds of genes. The model offers a potential mechanism for coordinating gene expression at the post-transcriptional level and could be one strategy by which Leishmania has effectively ‘compensated’ for its loss of ability to control gene expression at the transcriptional level.

Tuesday, September 25, 2007

Adaptive Complexity: More Confusion about Junk DNA and Regulatory Sequences

How selfish DNA is put to work

Last year, in a paper published in PNAS in collaboration with Richard Cordaux (now at the University of Poitiers, France) and Mark Batzer (LSU), we reconstructed the evolutionary history of a primate fusion gene called SETMAR. I realize that's already an old story and it might sound like I am just self-promoting my own research on my own blog. But I thought it might serve as a good introduction to the kind of questions that I am interested in (and will post more about on this site in the future). It will also provide an example of how transposons and other forms of so-called 'junk DNA' can, on occasion, make themselves useful in the genome. Finally, this is a story that generated quite a bit of web/blog discussion, some of which I have linked below.

What got me interested in this gene was the fact that half of it somehow derived from a transposase gene that was once encoded by, and selfishly served, a transposon called mariner. Thus it looks as if the transposase gene had been 'captured' and recycled to give birth to a new gene, and thus contribute to the advent of a new function.

There are two major questions that we wanted to address: first, when and how did the fusion happen? Second, what is the function of the new protein, and what is the contribution of the transposon-derived part (i.e. the transposase) to this function? The paper provides a pretty clear answer to the first question and some bits of answers to the second question.

I'll give you a quick summary. First, you have to know that the transposase region of SETMAR is derived from a particular copy of a mariner family called Hsmar1. There are about 200 copies of these Hsmar1 elements in the human genome, and about 7,000 copies of a related but smaller and non-coding element called MADE1. We knew from another study that we published this year in Genome Research that Hsmar1 and MADE1 were inserted around 45 Myr ago in the genome of an anthropoid primate ancestor. All the Hsmar1 elements are now inactivated by mutations which make them unable to encode a functional transposase enzyme, except for one copy: the one incorporated in SETMAR. The transposase region of SETMAR has retained an intact coding sequence, and we found that it is highly conserved and evolving under strong selective constraint in all anthropoid primates examined. It is, however, precisely missing from the orthologous genomic region in prosimian primates (tarsier, lemurs) and in all other vertebrates that we looked at (see figure above), even though these species do have the SET portion of SETMAR. These data strongly suggest that SETMAR arose by insertion of an Hsmar1 copy downstream of a pre-existing SET gene sometime in the lineage of anthropoid primates, followed by transcriptional fusion to give rise to the present-day SETMAR. Amazingly enough, by comparing the SETMAR genomic sequence in diverse primates, we realized that the birth of the new gene was made possible by not just one, but a series of seemingly unlikely mutational events, including a transposon insertion and the creation of a new intron.
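For readers who like to see the logic spelled out, the presence/absence argument can be sketched in a few lines of Python. This is a toy illustration, not the method used in the paper: the species list, the simplified tree and the function names are all assumptions chosen for the example. The idea is to find the smallest clade that contains every species carrying the transposase region; if every member of that clade carries it, a single insertion on the branch leading to that clade explains the pattern.

```python
def leaves(tree):
    """Return the set of species names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def smallest_clade(tree, carriers):
    """Return the smallest subtree whose leaves include every carrier.

    `carriers` must be a non-empty set of leaf names present in `tree`.
    """
    if isinstance(tree, str):
        return tree
    for child in tree:
        if carriers <= leaves(child):
            return smallest_clade(child, carriers)
    return tree


# ASSUMPTION: a simplified, illustrative tree; species chosen for the example.
anthropoids = ((("human", "chimpanzee"), "macaque"), "marmoset")
tree = (((anthropoids, "tarsier"), "lemur"), "mouse")

# Species assumed here to carry the transposase region fused to SET.
carriers = {"human", "chimpanzee", "macaque", "marmoset"}

clade = smallest_clade(tree, carriers)
single_insertion = leaves(clade) == carriers

print("Smallest clade containing all carriers:", sorted(leaves(clade)))
print("Consistent with a single insertion in that lineage:", single_insertion)
```

If a species inside the clade lacked the region, the same pattern would instead require either a secondary loss or more than one insertion, which is why the precise absence in tarsier and lemurs is so informative.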

This intricate process of gene origination from a piece of so-called 'junk DNA' or, more accurately, selfish DNA attracted substantial media coverage and gave rise to a number of interesting posts on several evo/discussion web sites, some of which are listed below (including an accurate and comprehensive rendition of the story in the lively Theology Web Campus):

Mechanisms in Evolution: the evolution of a new gene (Theology Web Campus)
Piecing Together a Gene: Nobel Intent (Ars Technica - Nobel Intent, followed by an animated discussion)
Brig Klyce's Cosmic Ancestry
Happy Birthday Primate Gene SETMAR (GMO Pundit aka David Tribe)

As to the present function of SETMAR in humans and other primates, you will find more in our paper and in two articles recently published (1, 2). And of course, I will keep you posted here or there.