A defining feature of eukaryotic cells is the presence of numerous membrane-bound organelles that subdivide the intracellular space into distinct compartments. How the eukaryotic cell acquired its internal complexity is still poorly understood. Material exchange among most organelles occurs via vesicles that bud off from a source and specifically fuse with a target compartment. Central players in the vesicle fusion process are the Soluble N-ethylmaleimide-sensitive factor Attachment protein REceptor (SNARE) proteins. These small tail-anchored (TA) membrane proteins zipper into elongated four-helix bundles that pull membranes together1–3. SNARE proteins are highly conserved among eukaryotes but are thought to be absent in prokaryotes. Here, we identified SNARE-like factors in the genomes of uncultured organisms of Asgard archaea of the Heimdallarchaeota clade4,5, which are thought to be the closest living relatives of eukaryotes. Biochemical experiments show that the archaeal SNARE-like proteins can interact with eukaryotic SNARE proteins. We did not detect SNAREs in α-proteobacteria, the closest relatives of mitochondria, but identified several genes encoding SNARE proteins in γ-proteobacteria of the order Legionellales, pathogens that live inside eukaryotic cells. Their SNAREs most likely stem from lateral gene transfer from eukaryotes. Together, this suggests that the diverse set of eukaryotic SNAREs evolved from an archaeal precursor. However, whether Heimdallarchaeota actually have a simplified endomembrane system will only become clear once we succeed in studying these organisms under the microscope.
All SNARE proteins share an evolutionarily conserved α-helical stretch of ~60 amino acids, the so-called SNARE motif, whereas their N-terminal regulatory domains can have different folds. In most SNAREs, the SNARE motif is C-terminally anchored via a single-pass transmembrane domain (TMD) and faces the cytosol. N- to C-terminal assembly into a tight parallel four-helix bundle SNARE complex6 between apposing membranes thus pulls the membranes together1–3. In the core of the SNARE bundle, 16 layers of mostly hydrophobic residues interact tightly6. The central ionic “0-layer”, consisting of three glutamine (Q) residues and one arginine (R) residue, is almost unchanged throughout the SNARE protein family7. Four basic types, namely Qa-, Qb-, Qc-, and R-SNAREs, can be distinguished by their sequence profiles, reflecting their position in the heterologous four-helix bundle8. The structures of different SNARE complexes reveal that the four helices have distinctive features, particularly Qa- and R-SNAREs, which interlock6,9. This feature renders QabcR-SNARE complexes stable and may prevent other combinations. Among the four basic SNARE types, about 20 subtypes can be distinguished8. They assemble into distinct QabcR units that work in different trafficking steps and probably represent the original repertoire of the Last Eukaryotic Common Ancestor (LECA)8, which was a fairly sophisticated cell with a nucleus, peroxisomes, mitochondria, and probably all compartments of the endomembrane system10,11. The diverse SNARE set of the LECA and, by extension, of all present-day eukaryotes can be traced back to a single QabcR complex that was multiplied about 2 billion years ago. The different vesicle fusion machines then adapted to different intracellular trafficking steps1,8,10.
Moreover, as all four basic SNARE types are related, it is conceivable that the first QabcR unit arose by gene duplication from one common SNARE ancestor, a prototypic SNARE protein that assembled into homomeric bundles. But do such prototypic SNARE proteins still exist?
To search for prototypic SNARE proteins, we scanned the NCBI protein databases for archaea and bacteria (collectively known as prokaryotes). For this, we took advantage of Hidden Markov Model (HMM) profiles trained previously8 to classify the SNARE motifs of eukaryotic SNARE proteins. To minimize false positive results, we applied an expectation value cutoff of 1E−4 and kept only the sequences for which the target motif was at least 40 amino acids long. Around 5,000 prokaryotic sequences met these criteria (see Methods and Supplementary Information, Section 1 for details). For an overview of the relationships among the collected prokaryotic sequences, we clustered them with the Basic Local Alignment Search Tool (BLAST) to construct groups of similar factors for further inspection (Methods). The sequences split into 96 clusters of different sizes, ranging from a large cluster with ~4,200 sequences to clusters with only two sequences; 178 sequences remained isolated (Supplementary Information, Section 1). We then restricted the search to tail-anchored (TA) proteins, a subset of membrane proteins (~5%) that play important membrane-active roles in eukaryotes and include the SNARE proteins12. TA proteins have been reported in all domains of life, although their number is usually restricted to about a dozen in most prokaryotes13,14, while eukaryotes can have several hundred15,16.
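The two filtering criteria and the grouping step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (see Methods for that): the `Hit` record, its field names, the example identifiers, and the single-linkage rule used to group sequences from pairwise similarity are all assumptions made for illustration. Only the 1E−4 cutoff and the 40-residue minimum motif length are taken from the text.

```python
from collections import defaultdict
from dataclasses import dataclass

E_VALUE_CUTOFF = 1e-4   # expectation value cutoff stated in the text
MIN_MOTIF_LENGTH = 40   # minimum matched motif length stated in the text


@dataclass(frozen=True)
class Hit:
    """One HMM profile match (hypothetical record; field names are assumed)."""
    seq_id: str
    e_value: float
    motif_start: int  # 1-based, inclusive
    motif_end: int    # 1-based, inclusive

    @property
    def motif_length(self) -> int:
        return self.motif_end - self.motif_start + 1


def passes_filter(hit: Hit) -> bool:
    """Keep a hit only if it satisfies both criteria."""
    return hit.e_value <= E_VALUE_CUTOFF and hit.motif_length >= MIN_MOTIF_LENGTH


def cluster(seq_ids, similar_pairs):
    """Group sequences by single linkage over pairwise similarity (assumed rule).

    `similar_pairs` stands in for pairs reported as similar by a BLAST
    comparison; pairs that mention a sequence not in `seq_ids` are ignored.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for s in seq_ids:
        groups[find(s)].append(s)
    # Largest clusters first; groups of size one are the isolated sequences.
    return sorted(groups.values(), key=len, reverse=True)


# Toy input, invented for illustration only.
hits = [
    Hit("seqA", 1e-6, 10, 65),  # passes both criteria
    Hit("seqB", 5e-5, 1, 45),   # passes both criteria
    Hit("seqC", 1e-3, 5, 70),   # e-value too high
    Hit("seqD", 1e-8, 20, 50),  # motif only 31 residues long
    Hit("seqE", 1e-5, 3, 60),   # passes both criteria
]
kept = [h.seq_id for h in hits if passes_filter(h)]
clusters = cluster(kept, [("seqA", "seqB"), ("seqB", "seqC")])
print(kept)
print(clusters)
```

In this toy input, three of the five hits survive the filter; two of them are linked into one cluster and the third remains isolated, mirroring on a small scale the split into clusters and singletons reported above.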
Altogether, the collection of prokaryotic SNARE-like sequences found by our bioinformatic screen contained only 20 candidate TA proteins (Methods). These sequences were found in smaller clusters or as singletons and had only moderate e-values (Supplementary Information, Section 1). Cluster 8 contained sequences from several closely related species of the genus Variovorax (γ-proteobacteria). These sequences were about 10 residues too short for a SNARE motif. Cluster 40, containing TA proteins from the genus Methylobacterium (α-proteobacteria), was even less convincing, as its sequences were too short and lacked a central glutamine. By contrast, Cluster 28, containing a pair of related sequences (OLS22354.1 and PWI47941.1) from two metagenomes of the Asgard group, Heimdallarchaeota archaeon LC_2 5 and B3-JM-08 17, looked promising. Both sequences possess a complete SNARE motif with a central glutamine residue, and the motif is connected to a TMD via a short linker. Most strikingly, the N-terminal regions of both sequences are predicted to contain α-helices that may fold into a three-helix bundle, a structural feature found in the vast majority of eukaryotic Q-SNARE subtypes (Extended Data Fig. 1; Supplementary Information, Section 2). Overall, these two Asgard sequences were the best prototypic prokaryotic SNARE protein candidates uncovered. This observation was intriguing, as this archaeal lineage is considered to be the closest extant relative of eukaryotes5,17,18, and the existence of functional SNARE proteins within this lineage would provide fascinating insights into the origin of intracellular trafficking and of eukaryotes in general. However, do these Asgard genes really encode SNARE proteins?