Autophagy is a degradative pathway for cytoplasmic constituents, and is conserved across eukaryotes. Autophagy-related (ATG) genes have undergone extensive multiplications and losses in different eukaryotic lineages, resulting in functional diversification and specialization. Notably, even though bacteria and archaea do not possess an autophagy pathway, they do harbor some remote homologs of Atg proteins, suggesting that preexisting proteins were recruited when the autophagy pathway developed during eukaryogenesis. In this Review, we summarize our current knowledge on the distribution of Atg proteins within eukaryotes and outline the major multiplication and loss events within the eukaryotic tree. We also discuss the potential prokaryotic homologs of Atg proteins identified to date, emphasizing the evolutionary relationships and functional differences between prokaryotic and eukaryotic proteins.
Macroautophagy, hereafter referred to as autophagy, is the process whereby cytoplasmic constituents, including organelles, are delivered to the lysosomes (or vacuoles) for degradation via double-membraned structures known as autophagosomes. Autophagosomes are assembled de novo through the coordinated actions of ∼20 core autophagy-related (Atg) proteins and other membrane trafficking proteins (Mizushima and Levine, 2020; Nakatogawa, 2020; Søreng et al., 2018). These Atg proteins can be divided into five functional groups (Fig. 1; Table 1): (1) those of the Atg1/ULK complex, (2) Atg9, a transmembrane protein that resides on vesicles, (3) members of the class III phosphoinositide 3-kinase (PI3K) complex I, (4) Atg2–β-propellers that bind to polyphosphoinositides (PROPPINs), such as Atg18 homologs and WD repeat domain phosphoinositide-interacting proteins (WIPIs), as well as (5) members of the Atg12–Atg5 and Atg8–phosphatidylethanolamine (PE) conjugation systems. Other proteins, such as VMP1 and TMEM41B, are not classified into these groups, but are also important for autophagy.
With some notable exceptions (e.g. microsporidia in fungi, the metamonad Giardia lamblia, and red algae) (Rigden et al., 2009; Shemi et al., 2015; Wang et al., 2019), ATG genes are conserved in most eukaryotic species, at least partially. Thus, autophagy is considered a fundamental pathway that already existed in the last eukaryotic common ancestor (LECA). However, there remains much to uncover about the origin and evolution of the autophagy pathway. The emergence of membrane-bound compartments and the membrane trafficking system during eukaryogenesis (Klinger et al., 2016; Koumandou et al., 2013; Schlacht et al., 2014) is thought to have been a key event, given that the very definition of autophagy relies on both processes (Farhan et al., 2017). Remote homologs (genes with a common ancestor but divergent sequences resulting from evolutionary separation a long time ago) of ATG genes were recently identified in bacteria and archaea (Burroughs et al., 2011, 2015; Imachi et al., 2020; Iyer et al., 2006; Levine, 2019; Maeda et al., 2020; Mesdaghi et al., 2020; Okawa et al., 2021; Spang et al., 2015; Ye et al., 2020), shedding light on the evolution of autophagy-specific components. In this Review, we summarize our current knowledge on the expansion and functional diversification of Atg proteins in the eukaryotic lineages (Fig. 2), and discuss the prokaryotic homologs of Atg proteins, while speculating on the evolutionary history of individual genes as well as the autophagy pathway. Unless otherwise specified, our primary focus is on bulk, non-selective, starvation-induced autophagy.
From one to many – the functional divergence of Atg paralogs in eukaryotes
Multiplication and loss of Atg proteins were both common events in the evolution of the eukaryotic lineages. This section covers Atg proteins with multiple homologs in mammals in the order of their actions in the autophagy pathway. Note that the less-studied lineages may harbor additional events that are as yet unknown.
The Atg1/ULK complex, which includes Atg1, Atg13, Atg17, Atg29, Atg31 and Atg11 in the budding yeast and ULK1 (or ULK2), ATG13, FIP200 (also known as RB1CC1) and ATG101 in humans, is responsible for the initiation and regulation of autophagy (Mizushima, 2010; Noda and Fujioka, 2015). ULK1 and ULK2 have been thought to be largely redundant in the autophagy pathway, but recent evidence suggests that this may not be entirely correct; for example, they have different binding partners and transcriptional regulators (Demeter et al., 2020). The human genome also contains other Atg1 homologs that are not part of the Atg1/ULK complex, including ULK3, ULK4 and STK36/Fused. Atg1, ULK1, ULK2 and ULK3 have an N-terminal serine/threonine kinase domain and a conserved C-terminal region consisting of two microtubule-interacting and transport (MIT) domains. The MIT domains in Atg1, ULK1 and ULK2 interact with Atg13/ATG13 and play an important role in autophagy (Caballe et al., 2015; Nishimura and Tooze, 2020). However, (putative) Atg1 homologs that do not possess tandem MIT domains have also been identified, including ULK4 and STK36/Fused in metazoa, plants and protists (Dictyostelium, an amoeba species, also has a STK36/Fused homolog, TsuA) (Maloverjan and Piirsoo, 2012; Oh et al., 2005; Préat et al., 1990; Preuss et al., 2020; Tang et al., 2008), Atg1t in seed plants (Huang et al., 2019; Suttangkakul et al., 2011) and almost all Alveolata homologs (Aslan et al., 2017; Földvári-Nagy et al., 2015) (Fig. 3A). In addition to differences in the C-terminal region, diverged sequences in the kinase domain also separate ULK4 and STK36/Fused from ULK1 and ULK2 (Demeter et al., 2020; Preuss et al., 2020). Because Atg1/ULK1, ULK4 and STK36/Fused are widely distributed among eukaryotes, it is likely that the LECA already contained two groups of Atg1 homologs, one Atg1/ULK1-like and the other ULK4- and STK36/Fused-like (Atg1-1 and Atg1-2 in Fig. 3A).
However, ULK3 was a later addition in vertebrates and some arthropods (Braden and Neufeld, 2016), and, as is the case for ULK4 and STK36/Fused, is unlikely to possess autophagic functions. Instead, ULK3 and STK36/Fused regulate the sonic hedgehog pathway in mammals (Maloverjan and Piirsoo, 2012; Maloverjan et al., 2010), whereas ULK4 is a risk gene for schizophrenia and autism (Lang et al., 2016).
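The reasoning used above, in which a widely distributed gene is assigned to the LECA whereas a lineage-restricted gene such as ULK3 is treated as a later addition, is in essence a parsimony argument. The following Python sketch illustrates the idea with Fitch parsimony on a toy tree. The tree shape, the lineage names and all presence/absence values are hypothetical placeholders and are not data from the studies cited here; published analyses typically rely on full phylogenetic trees and more elaborate gain and loss models.

```python
# Toy illustration only: the tree shape and the presence/absence values
# below are hypothetical placeholders, not data from the cited studies.


def fitch(tree, states):
    """Fitch parsimony on a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping leaf name -> True (gene present) / False (absent)

    Returns (set of most-parsimonious root states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical rooted tree of four lineages: ((A, B), (C, D)).
toy_tree = (("lineage_A", "lineage_B"), ("lineage_C", "lineage_D"))

# A gene detected in three of the four lineages (made-up values).
widespread = {"lineage_A": True, "lineage_B": True,
              "lineage_C": True, "lineage_D": False}

# A gene detected in a single lineage only (made-up values).
restricted = {"lineage_A": True, "lineage_B": False,
              "lineage_C": False, "lineage_D": False}

root_widespread, changes_widespread = fitch(toy_tree, widespread)
root_restricted, changes_restricted = fitch(toy_tree, restricted)

print("widespread gene at root:", root_widespread, "changes:", changes_widespread)
print("restricted gene at root:", root_restricted, "changes:", changes_restricted)
```

In this toy case, the widespread gene is reconstructed as present at the root with a single loss, whereas the restricted gene is reconstructed as absent at the root with a single gain. Fitch parsimony weights gains and losses equally, which is a simplifying assumption of the sketch.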
Atg17/ATG17 and Atg11/FIP200
Atg17 and Atg11/FIP200 are both scaffolding proteins in the Atg1/ULK complex with overlapping functions (Farré and Subramani, 2016; Hara and Mizushima, 2009). Atg11/FIP200 is conserved in metazoa, fungi, amoebae, plants and some protist species (Pan et al., 2020), whereas Atg17 is found in fungi and amoebae, as well as some protists (Fischer and Eichinger, 2019; Mizushima, 2010) (Fig. 3B,C). Although Atg11 and Atg17 are required for selective autophagy and starvation-induced bulk autophagy, respectively, in Saccharomyces cerevisiae, Atg11/FIP200 is required for bulk autophagy in most organisms (even in the yeast Schizosaccharomyces pombe; Sun et al., 2013). Remarkably, Atg11/FIP200 harbors an Atg17-like domain towards its N-terminus, a pattern that is widespread in eukaryotes (Li et al., 2014; Pan et al., 2020). It is therefore possible that Atg11/FIP200 and Atg17 are actually remote homologs that separated early on, one of which was subsequently lost in multiple lineages. Consistent with a potential evolutionary link, yeast Atg17 and the Atg17-like region of human FIP200 both bind to Atg13/ATG13 and form a dimer with the N-termini protruding outwards, although the overall shape of the resulting dimers differs (Ragusa et al., 2012; Shi et al., 2019).
Atg9 is one of the first Atg proteins to arrive at the site of autophagosome formation. In yeast, Atg9 proteins contained in single-membrane vesicles (hereafter denoted Atg9 vesicles) serve as a seed, accepting lipids from the lipid transfer protein Atg2 and thereby allowing for phagophore expansion (Sawa-Makarska et al., 2020; Yamamoto et al., 2012). Mechanistically, Atg9/ATG9A possesses lipid scramblase activity (Maeda et al., 2020; Matoba et al., 2020). Atg9 deficiency in Arabidopsis leads to somewhat different phenotypes depending on the specific conditions, from elongated Atg8-decorated tubular structures protruding from the endoplasmic reticulum (ER), which result from abnormal membrane expansion without closure (Zhuang et al., 2017), to no obvious phenotype (Kang et al., 2018). In Dictyostelium, Atg9 deficiency leads to growth defects and impaired phagocytosis (Tung et al., 2010).
Humans possess two Atg9 homologs, ATG9A, the main form, and ATG9B, which has a highly restricted expression pattern that includes adult placenta and testis, and fetal brain, lung and thymus tissue (Kusama et al., 2009), as well as some cancer cell lines (Fig. 3D) (Ma et al., 2017; Yun et al., 2020). Most organisms have one copy of Atg9; however, Atg9 has not been found in some Alveolata species (Aslan et al., 2017; Rigden et al., 2009).
Vps30/Beclin 1 (BECN1) is part of the PI3K complex I (PI3KC3), which interacts with Atg14 to target the lipid kinase Vps34 to the site of autophagosome formation and regulates the catalytic activity of Vps34 (Levine et al., 2015). The evolutionary history of Vps30 is relatively simple, with most plant, Alveolata and non-mammalian metazoan species having only one Vps30 homolog (Aslan et al., 2017; Avin-Wittenberg et al., 2012; Földvári-Nagy et al., 2015; He et al., 2013; Rigden et al., 2009) (Fig. 3E). In contrast, Dictyostelium has two Vps30 homologs; however, their functional differences have yet to be studied in detail (Calvo-Garrido et al., 2010). Among plants, Oryza sativa has three Vps30 homologs that are clustered together in the phylogenetic tree, each showing differential expression under abiotic stress conditions (Rana et al., 2012). In addition to Beclin 1, mammals have a second Vps30 homolog, Beclin 2 (He et al., 2013). Whereas Beclin 1 functions in both autophagy and the endocytic pathway (Levine et al., 2015), Beclin 2 is primarily involved in the endolysosomal degradation of G protein-coupled receptors (He et al., 2013). The fact that Beclin 2 contains few or no introns may reflect a unique evolutionary history, such as RNA-based gene duplication, resulting in altered transcriptional regulation compared with Beclin 1 (Grzybowska, 2012; He et al., 2013).
PROPPINs, which include Atg18 homologs and WD repeat domain phosphoinositide-interacting proteins (WIPIs), organize isolation-membrane expansion by binding to phosphatidylinositol 3-phosphate (PI3P), which marks the membrane expansion site, and recruiting the Atg12–Atg5–Atg16 complex through binding to Atg16 (Nishimura and Tooze, 2020; Sawa-Makarska et al., 2020). The PROPPIN family represents one of the less common cases of Atg protein evolution whereby duplication occurred in the fungal lineage, namely, in Saccharomycetaceae, giving rise to Atg18, Atg21 and Hsv2, and in Schizosaccharomycetes, giving rise to Atg18a–Atg18c (homologs of Atg18, Atg21 and Hsv2, respectively) (Sun et al., 2013; Wang et al., 2019).
The phylogenetic tree divides metazoan and fungal PROPPIN homologs into two groups, one consisting of WIPI1, WIPI2, Atg18 and Atg21 (Group 1), and the other consisting of WIPI3, WIPI4 and Hsv2 (Group 2) (Polson et al., 2010) (Fig. 3F). Some publications show that Arabidopsis Atg18 proteins are also divided into these two groups (Norizuki et al., 2019; Xiong et al., 2005), suggesting that these groups may have emerged before Opisthokonta did. Function-wise, Atg21 and WIPI2 bind directly to Atg16/ATG16L1 through similar mechanisms that involve blades 2 and 3 in Atg21/WIPI2 and acidic residues near the dimerization domain in Atg16/ATG16L1 (Dooley et al., 2014; Juris et al., 2015; Munzel et al., 2020), suggesting that interaction with Atg16 was an ancestral feature of Group 1. However, in yeast and humans, Atg18 and WIPI4, respectively, bind to Atg2 at different sites (Lei et al., 2020; Ren et al., 2020). Atg18 interacts with Atg2 via the 7AB loop, which is absent from other Atg18 family proteins, suggesting that this is a fungal-specific functional innovation (Lei et al., 2020). Meanwhile, Hsv2 shows perivacuolar and endosomal localization similar to Atg18 and Atg21, and contributes to the efficiency of micronucleophagy (Krick et al., 2008). However, more studies are required to clarify its molecular function and role in autophagy.
Plants have a distinct type of Atg18 protein, with a BCAS3 domain at the C-terminus that first occurred in Streptophyta (Norizuki et al., 2019). For example, Arabidopsis has five conventional Atg18 proteins and three with a BCAS3 domain (Atg18f, Atg18g and Atg18h). The number of Atg18 family proteins varies among Alveolata species. In Toxoplasma gondii, there are two Atg18 homologs, and although TgPROP1 likely functions in autophagy under stress, TgPROP2 is required for growth in normal conditions, which may be an autophagy-independent role (Nguyen et al., 2018).
Atg10/ATG10 and Atg12/ATG12
In autophagy, two ubiquitin-like (Ubl) conjugation systems ensure that Atg8 conjugates to the autophagosomal membrane even though Atg8 itself is not a membrane-associated protein (Fig. 1). In the Atg12–Atg5 conjugation system, the E1-like enzyme Atg7 activates and forms a thioester bond with Atg12, before handing Atg12 over to the E2-like enzyme Atg10, which transfers Atg12 to the substrate, Atg5. In the Atg8 conjugation system, Atg8 is also activated by Atg7, but is then handed over to a different, albeit evolutionarily related, E2-like enzyme, Atg3. Finally, Atg3 transfers Atg8 to PE, a step facilitated by the Atg12–Atg5 conjugate, which acts as an E3-like enzyme (Mizushima, 2020; Shpilka et al., 2012).
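The order of events in the two conjugation systems, and the dependence of the second on the product of the first, can be summarized in a short Python sketch. It encodes only the steps described above; the `Cascade` record, the `steps` and `product` helpers and the plain-hyphen naming of the conjugate are illustrative assumptions made for this sketch and do not correspond to any established software or nomenclature.

```python
# Minimal sketch of the two Ubl conjugation cascades described in the text.
# The data structure and helper names are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cascade:
    ubl: str            # ubiquitin-like protein being conjugated
    e1: str             # E1-like activating enzyme
    e2: str             # E2-like conjugating enzyme
    e3: Optional[str]   # E3-like factor, or None if the text names none
    target: str         # substrate that receives the Ubl


ATG12_SYSTEM = Cascade(ubl="Atg12", e1="Atg7", e2="Atg10", e3=None, target="Atg5")
ATG8_SYSTEM = Cascade(ubl="Atg8", e1="Atg7", e2="Atg3",
                      e3="Atg12-Atg5 conjugate", target="PE")


def product(cascade: Cascade) -> str:
    """Name of the conjugate produced by a cascade (plain-hyphen label)."""
    return f"{cascade.ubl}-{cascade.target} conjugate"


def steps(cascade: Cascade) -> List[str]:
    """List the reaction steps of a cascade in order."""
    final = f"{cascade.e2} transfers {cascade.ubl} to {cascade.target}"
    if cascade.e3 is not None:
        final += f", facilitated by the {cascade.e3} (E3-like)"
    return [
        f"{cascade.e1} (E1-like) activates {cascade.ubl}",
        f"{cascade.ubl} is handed over to {cascade.e2} (E2-like)",
        final,
    ]


for system in (ATG12_SYSTEM, ATG8_SYSTEM):
    for line in steps(system):
        print(line)

# The Atg8 system depends on the product of the Atg12 system.
print(ATG8_SYSTEM.e3 == product(ATG12_SYSTEM))
```

Both cascades share the same E1-like enzyme and differ in their E2-like enzymes and targets, and the final comparison makes explicit that the E3-like factor of the Atg8 system is the product of the Atg12 system.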
It was recently discovered that in some Apicomplexa and Pichiaceae species, such as Plasmodium, Toxoplasma and Komagataella, Atg10 has been lost, and moreover, Atg12 does not have the C-terminal glycine needed for conjugation (Pang et al., 2019). Accordingly, in these species, Atg12 has lost the ability to form a covalent conjugate with Atg5, interacting instead with Atg5 non-covalently, which still supports Atg8 lipidation (Pang et al., 2019). Similar but weaker non-covalent interactions exist between human and yeast Atg12 and Atg5 (Noda et al., 2013; Otomo et al., 2013), suggesting that the non-covalent interface evolved from preexisting interactions. Compared with its covalent counterpart, the non-covalent interaction does not involve ATP hydrolysis, although it does require a specific binding interface. Because Atg12 does not need to interact with a plethora of other substrates, as is the case for ubiquitin (Ub), it might be able to switch to a simpler and more specific form of interaction (Pang et al., 2019). Overall, both this and other studies on the distribution of Atg proteins across eukaryotes suggest that Atg10, Atg12 and Atg16, which are involved in the Atg12–Atg5 conjugation system, are less conserved among fungal and Alveolata lineages (Fig. 3G) (Aslan et al., 2017; Meijer et al., 2007; Rigden et al., 2009; Wang et al., 2019).
Atg16/ATG16L1 is part of the Atg12–Atg5–Atg16 complex, interacting with Atg21/WIPI2 and FIP200 to recruit this complex to the site of autophagosome formation (Gammoh, 2021). In addition to the Atg16 domain at the N-terminus, a wide range of metazoan, amoeba, plant and Chromalveolata species also have a WD-repeat domain at the C-terminus (Aslan et al., 2017; Shemi et al., 2015; Xiong et al., 2018), suggesting that this may be the ancestral form. In contrast, yeasts have no WD domain, although it is not clear at which point in the fungal lineage the domain was lost (Fig. 3H). The WD domain is not required for canonical autophagy, but is important for the conjugation of ATG8 family proteins to endocytic single membranes (e.g. LC3-associated phagocytosis) (Fletcher et al., 2018; Fujita et al., 2013), a process that yeasts are unlikely to have. Vertebrates have two ATG16 proteins (Fig. 3H), namely ATG16L1 and ATG16L2 (Ishibashi et al., 2011). ATG16L2 is dispensable for autophagy but, like ATG16L1, is linked to autoinflammatory diseases and exerts a tissue-specific effect (Khor et al., 2019).
In most species, the Atg8 C-terminus is extended beyond the glycine, the exposure of which is required for conjugation. Atg4 is a cysteine protease required both for cleaving Atg8 to expose the C-terminal glycine and for releasing Atg8 from the autophagosomal membrane so that it can be recycled (Maruyama and Noda, 2017). Although lineages in which the Atg8 family has greatly expanded often have multiple Atg4 proteins, their evolutionary patterns differ (Fig. 3I). Metazoa likely first acquired two Atg4 proteins, each of which was duplicated in vertebrates (López-Otín and Mariño, 2017; Wang et al., 2020). Meanwhile, in humans, ATG4B is the main enzyme that processes all LC3 and GABARAP proteins and stabilizes unlipidated GABARAPs (Agrotis et al., 2019; Skytte Rasmussen et al., 2017), whereas ATG4A, ATG4C and ATG4D process LC3A and GABARAPs at a lower efficiency (Agrotis et al., 2019). A previous in vitro study revealed that ATG4C and ATG4D possess only minimal enzymatic activity (Li et al., 2011). Caenorhabditis elegans Atg-4.1, which groups with human ATG4A and ATG4B, is also the primary enzyme in this organism (Wu et al., 2012), suggesting that the ATG4A and ATG4B branch is likely the more enzymatically active one in all metazoans. Moreover, plants have one to three Atg4 proteins, and the Atg4 phylogenetic tree is similar to the known taxonomy, indicating that the Atg4 protein duplications occurred after speciation (Seo et al., 2016). In Arabidopsis, Atg4a processes Atg8a, Atg8c, Atg8d and Atg8i more efficiently than Atg4b does, and is approximately as efficient as Atg4b for the rest of the Atg8 homologs (Park et al., 2014).
Atg8 conjugated to the autophagosomal membrane has long been considered the hallmark of autophagy. As mentioned above, Atg8–PE conjugation is achieved via two Ubl-conjugation systems (Hanada et al., 2007; Ichimura et al., 2000; Mizushima et al., 1998). Atg8 serves different functions in autophagy, including the regulation of autophagic initiation, isolation membrane expansion, autophagosome closure and autophagosome fusion with the lysosomes (Mizushima, 2020).
Although fungi such as S. cerevisiae have only a single Atg8 protein, expansion of the Atg8 family proteins occurred in many species, most notably in the metazoan and plant lineages (Kellner et al., 2017; Shpilka et al., 2011) (Fig. 3J), which also underwent major expansion of other Ubl systems (Grau-Bové et al., 2015). Metazoan Atg8 proteins are divided into two families, GABARAP and LC3 (also known as MAP1LC3), both of which have distinct sequence and structural features (Jatana et al., 2020). The exact degree of functional redundancy between these two families remains unclear; however, they are known to differ in terms of their binding affinities towards the core Atg proteins and receptors, with GABARAPs appearing to play a more central role in the later steps of starvation-induced and selective autophagy (Mizushima, 2020; Nguyen et al., 2016; Olsvik et al., 2015; Vaites et al., 2018; von Muhlinen et al., 2012; Wirth et al., 2019). In the slime mold Dictyostelium discoideum, Atg8b and Atg8a are sequentially associated with the autophagosomes (Matthias et al., 2016), somewhat reminiscent of the distinction between LC3 and GABARAP (Weidberg et al., 2010). Although a previous phylogenetic study grouped Atg8b with the LC3 family (Mesquita et al., 2017), further study is required to determine whether the LC3 and GABARAP families appeared before metazoa. In plants, Atg8 can be classified into two clades (Kellner et al., 2017; Seo et al., 2016). Clade I is further divided into three subfamilies, subfamily 2 including Arabidopsis Atg8a–Atg8d, subfamily 1 comprising Arabidopsis Atg8e–Atg8g, and subfamily 3 containing Atg8 in green algae, moss and some vascular plants. Clade II consists of Arabidopsis Atg8h and Atg8i and is close to metazoan GABARAPs. These phylogenetic trees are consistent with the hypothesis that multiple Atg8 proteins emerged early on in the plant lineage (Kellner et al., 2017), with development into different clades occurring in vascular plants. 
In Alveolata, the number of Atg8 homologs varies from one in the apicomplexans Plasmodium falciparum and Toxoplasma gondii to 22 in the ciliate Paramecium tetraurelia (Aslan et al., 2017; Rigden et al., 2009). In the ciliate Tetrahymena thermophila, three Atg8 proteins that are differentially used for survival under starvation and programmed nuclear degradation have been reported (Liu and Yao, 2012).
It is worth noting that the Atg8 family proteins also have autophagy-independent functions; for example, ATG8 family proteins in LC3-associated phagocytosis, GABARAPs in GABAA receptor trafficking and apicomplexan Atg8 in apicoplast-related functions (Galluzzi and Green, 2019; Jacob et al., 2008; Lévêque et al., 2015; Martinez et al., 2015; Mizushima and Sahani, 2014; Walczak et al., 2017). These findings therefore emphasize that the functional diversification of Atg8 family proteins extends beyond autophagy.
The unique evolution of ATG genes in budding yeast
ATG genes were first discovered in the budding yeast S. cerevisiae (Ohsumi, 2014; Takeshige et al., 1992; Tsukada and Ohsumi, 1993; Umekawa and Klionsky, 2012), and to date, this remains the most extensively studied and well-understood organism in autophagy research. However, advances in our understanding of the autophagic system in other species, including in mammals and plants, have revealed that the budding yeast (and other closely related fungal species) harbors many unique aspects (Table 2), possibly reflecting its natural history and ecological niches (Liti, 2015). Supporting this idea, although amoebae branched out before the separation of metazoa and fungi, autophagy in Dictyostelium is more similar to that in mammals than to that in yeast (Mesquita et al., 2017), suggesting that ancestral autophagy may be closer to the mammalian form.
Unlike in mammals, amoebae and plants (King et al., 2011; Le Bars et al., 2014), autophagy in yeast is initiated at a single site, the preautophagosomal structure (PAS), which is located near the ER-exit site and the vacuole (Graef et al., 2013; Suzuki et al., 2013). The Atg1/ULK complex consists of Atg1, Atg13, Atg17, Atg29, Atg31 and Atg11 in the budding yeast, and Atg1, Atg13, Atg11/FIP200, Atg101 and possibly Atg17 in other eukaryotic species (Li et al., 2014; Mesquita et al., 2017; Noda and Fujioka, 2015). Atg17 exists in amoebae, fungi and some Alveolata species (see previous sections), and a scaffolding function is found outside of budding yeast in S. pombe and Kluyveromyces marxianus (although the function of Dictyostelium Atg17 remains unclear) (Mesquita et al., 2015; Nanji et al., 2017; Sun et al., 2013; Yamamoto et al., 2015). Atg29 is present only in Ascomycota and Atg31 in Saccharomycetes (Meijer et al., 2007; Wang et al., 2019), suggesting that the Atg17–Atg29–Atg31 complex is a functional innovation specific to the budding yeast and its close relatives. Yeast Atg13 has two unique structural features, the cap and the hinge loop (Jao et al., 2013; Michel et al., 2015; Qi et al., 2015; Suzuki et al., 2015a). The cap stabilizes Atg13, whereas in most other eukaryotes, the same function is carried out through interactions with Atg101 (Jao et al., 2013; Noda and Mizushima, 2016).
Another difference is the requirement of VMP1, an ER-localized multi-spanning protein, which is conserved in most eukaryotes except fungi (King, 2012) and required for autophagy in metazoans and Dictyostelium (Calvo-Garrido and Escalante, 2010; Itakura and Mizushima, 2010). Atg15, the only lipase among the Atg proteins, is responsible for digesting the autophagic body after the autophagosome fuses with the vacuole (Epple et al., 2001; Ramya and Rajasekharan, 2016). No clear Atg15 ortholog has yet been identified in humans or plants (Meijer et al., 2007; Rigden et al., 2009), and different lipases might be required to carry out this role in these species.
In addition to canonical autophagy, budding yeast also has a unique pathway, namely the cytoplasm-to-vacuole targeting (Cvt) pathway, which delivers the vacuolar hydrolases Ape1, Ape4 and Ams1 to their place of action (and the retrotransposon Ty1 and cytosolic enzyme Lap3 for degradation) (Hirayama et al., 2010; Yamasaki and Noda, 2017; Yuga et al., 2011). The selective advantage of this pathway is not clear, but it is thought to promote efficient delivery of Ape1 via condensate formation (Yamasaki et al., 2020) and allows Ape4 and Ams1 to be used as both cytosolic and vacuolar enzymes.
From initiation to degradation, the autophagic system in budding yeast has many unique aspects. It is important to keep this in mind, as the budding yeast is often used as the reference model against which other organisms are compared.
From prokaryotes to eukaryotes – prokaryotic homologs of ATG genes or domains
An interesting question is how ATG genes emerged during evolution. Although autophagy is a cellular function that is only possible in eukaryotes, some Atg-related proteins were recently identified in prokaryotes. This section covers these potentially ancient origins and compares the functions of these prokaryotic proteins with those of Atg proteins in eukaryotes.
HORMA domain-containing proteins – ATG13 and ATG101
HORMA (Hop1, Rev7 and Mad2) domain-containing proteins are known for their roles in DNA repair, mitosis and meiosis (Rosenberg and Corbett, 2015). The best-studied member, Mad2 (also known as MAD2L1), can switch between two conformations (Luo and Yu, 2008). Binding to CDC20 switches it to the active ‘closed’ form (C-MAD2), which maintains the spindle assembly checkpoint. C-MAD2 interacts with another HORMA-domain protein p31 (also known as MAD2L1BP). p31 recruits TRIP13, which in turn converts C-MAD2 into the inactive ‘open’ form (O-MAD2), allowing for chromosome segregation (Alfieri et al., 2018; Ye et al., 2017) (Fig. 4A). The HORMA domain also exists in ATG13 and ATG101, where it mediates the interaction between these two proteins (Suzuki et al., 2015a) (and also between Atg13 and Atg9 in yeast; Suzuki et al., 2015b). Despite the overall structural similarity between the ATG101–ATG13 and O-MAD2–C-MAD2 heterodimers that are formed through HORMA domain dimerization, ATG101 and ATG13 are locked to the open and closed forms, respectively, and are unable to switch conformations (Mapelli et al., 2007; Michel et al., 2015; Qi et al., 2015; Suzuki et al., 2015a).
A possible interaction between a HORMA domain and Trip13 exists in a wide range of prokaryotes (Burroughs et al., 2015). More recently, it was found that, like Mad2, bacterial HORMA proteins also have the ability to switch conformations, with their closed form able to bind to and activate DncV-like proteins (bacterial homologs of cyclic GMP-AMP synthase, cGAS), which produce cyclic triadenylate as an immune response to bacteriophages (Ye et al., 2020). Similar to what is seen in the case of Mad2, Trip13 converts bacterial HORMA proteins into their open form and negatively regulates DncV-like proteins (Fig. 4A). In bacteria, the HORMA proteins therefore play direct roles in cellular immunity. Bacteria have two types of HORMA proteins, HORMA1 and HORMA2 (Burroughs et al., 2015; Tromer et al., 2019; Ye et al., 2020). The phylogenetic tree that includes HORMA-domain-containing proteins in both eukaryotes and prokaryotes is largely separated by the domains of life, and the eukaryotic proteins form a monophyletic group (Almutairi, 2018; Tromer et al., 2019), suggesting that the ability to switch conformations was likely an ancestral feature that was lost in certain eukaryotic proteins, such as ATG13 and ATG101.
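The contrast between switchable HORMA proteins, such as Mad2 and the bacterial HORMA proteins, and the conformationally locked ATG13 and ATG101 can be expressed as a toy two-state machine. The Python sketch below is a deliberately simplified illustration; the class, its method names and the `locked` flag are assumptions introduced for this example, and the model ignores kinetics, binding partners other than those named in the text, and any intermediate states.

```python
# Toy two-state model of the HORMA conformational switch described in the text.
# Class and method names are illustrative assumptions; kinetics are ignored.


class HormaProtein:
    def __init__(self, name, state, locked=False):
        if state not in ("open", "closed"):
            raise ValueError("state must be 'open' or 'closed'")
        self.name = name
        self.state = state
        self.locked = locked  # True for proteins that cannot switch conformations

    def bind_partner(self):
        """Partner binding (e.g. CDC20 for Mad2) converts open -> closed."""
        if not self.locked and self.state == "open":
            self.state = "closed"
        return self.state

    def trip13_remodel(self):
        """TRIP13-type remodelling converts closed -> open."""
        if not self.locked and self.state == "closed":
            self.state = "open"
        return self.state


# Mad2 can cycle between the two forms.
mad2 = HormaProtein("Mad2", "open")
# ATG13 and ATG101 are locked in the closed and open forms, respectively.
atg13 = HormaProtein("ATG13", "closed", locked=True)
atg101 = HormaProtein("ATG101", "open", locked=True)

mad2_trace = [mad2.state, mad2.bind_partner(), mad2.trip13_remodel()]
atg13_trace = [atg13.state, atg13.bind_partner(), atg13.trip13_remodel()]
atg101_trace = [atg101.state, atg101.bind_partner(), atg101.trip13_remodel()]

print("Mad2:  ", mad2_trace)    # ['open', 'closed', 'open']
print("ATG13: ", atg13_trace)   # ['closed', 'closed', 'closed']
print("ATG101:", atg101_trace)  # ['open', 'open', 'open']
```

In this picture, the proposed loss of the ancestral switching ability in ATG13 and ATG101 corresponds to setting the `locked` flag, which leaves the dimerization-competent open and closed forms in place but removes the cycle between them.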
VMP1 and TMEM41B – DedA superfamily proteins
VMP1 and TMEM41B are ER-localizing transmembrane proteins required for autophagy, lipoprotein secretion, lipid mobilization, as well as infection by flaviviruses and coronaviruses, including SARS-CoV-2 (Baggen et al., 2021; Hoffmann et al., 2021; Moretti et al., 2018; Morishita et al., 2019; Morita et al., 2018; Schneider et al., 2021; Shoemaker et al., 2019; Trimarco et al., 2021). They belong to the same family as eukaryotic TMEM41A and TMEM64 (a homolog of yeast Tvp38) and are also evolutionarily related to bacterial DedA proteins (Doerrler et al., 2013; Keller and Schneider, 2013; Morita et al., 2018; Tábara et al., 2019). Escherichia coli has eight DedA proteins, which are collectively essential; of these, the most extensively studied are YqjA and YghB, which regulate temperature and pH sensitivity, phospholipid levels, cell division and drug resistance (Boughner and Doerrler, 2012; Kumar and Doerrler, 2014, 2015; Sikdar and Doerrler, 2010; Thompkins et al., 2008).
Two recent studies found that PF06695, a mostly bacterial Pfam family with unknown function, is also a remote homolog of VMP1 and TMEM41B (Mesdaghi et al., 2020; Okawa et al., 2021) (Fig. 4B). Analysis of the phylogenetic tree further revealed that the TMEM41 (including TMEM41A, TMEM41B and TMEM64/Tvp38), VMP1, DedA and PF06695 families constitute a large superfamily, which we have named the DedA superfamily (Okawa et al., 2021). Although the TMEM41 and VMP1 families are primarily eukaryotic, they also harbor some bacterial and archaeal proteins (e.g. YdjX and YdjZ), suggesting that ancestors of TMEM41B and VMP1 were already present in prokaryotes. The Lokiarchaeon Candidatus Prometheoarchaeum syntrophicum only has homologs that are close to eukaryotic proteins, consistent with the hypothesis that eukaryotes evolved from an archaeal ancestor closely related to Lokiarchaeota (Spang et al., 2015).
Of all the DedA superfamily proteins, only TMEM41B and VMP1 are required for autophagy. Even though mammalian TMEM41A and TMEM64 and yeast Tvp38 are members of the TMEM41 family, they are dispensable for autophagy (Okawa et al., 2021). We speculate that an autophagy-related function was either acquired independently in TMEM41B and VMP1 (and may be related to the ancestral function of this superfamily), or was acquired first in VMP1 and then shared with TMEM41B through interactions between these two proteins (Okawa et al., 2021). The latter is more consistent with experimental results showing that VMP1 overexpression rescues TMEM41B defects, but not vice versa (Hoffmann et al., 2021; Morita et al., 2018; Shoemaker et al., 2019).
Co-evolution-based structural predictions and subsequent experimental validation suggest that the DedA domain forms two re-entrant loops that face each other in the membrane, a structure often seen in ion-coupled transporters, such as SLC1 (Mesdaghi et al., 2020; Okawa et al., 2021). VMP1 and TMEM41B are suggested to have lipid scramblase activity and may coordinate with Atg2 and Atg9 to regulate the membrane dynamics (Ghanbarpour et al., 2021; Huang et al., 2021; Li et al., 2021). More studies are required to determine the structure and clarify how these proteins are integrated into the functional network of Atg proteins.
ATG2 – a chorein-N domain-containing lipid-transfer protein
The chorein-N domain is found in Vps13, Atg2 and SHIP164 in eukaryotes (Lees and Reinisch, 2020). Vps13 and Atg2 are lipid-transfer proteins with a rod-like shape (Kumar et al., 2018; Maeda et al., 2019; Osawa et al., 2019; Valverde et al., 2019). The solved structure of their N-terminal region, including the chorein-N domain, reveals a hydrophobic cavity in the middle, which is thought to extend over their entire length (Kumar et al., 2018; Osawa et al., 2019; Valverde et al., 2019). Recently, remote homologs of the chorein-N domain have been found in bacterial AsmA, TamB and DUF2993 proteins (Levine, 2019) (Fig. 4C). In DUF2993, the chorein-N domain is connected to a tubular lipid-binding protein (TULIP) domain, which is also a eukaryotic lipid-transfer domain, suggesting that it is another lipid-transfer protein (Levine, 2019). TamB forms part of the bacterial translocation and assembly module, and interacts with BamA, a subunit of the β-barrel assembly machinery, which facilitates the assembly and export of outer-membrane proteins (Iqbal et al., 2016; Selkrig et al., 2012). The C-terminus of TamB forms a β-taco fold, or a series of folds, with a concave hydrophobic side large enough to shield an amphipathic β-strand (Josts et al., 2017), similar to the β-jellyroll in lipopolysaccharide (LPS) transport proteins, which shields their substrates from the aqueous environment (Okuda et al., 2016). However, whether TamB transports lipids or lipopolysaccharides remains unknown. Overall, although the chorein-N domain itself is rather short (Fig. 4C), it forms the N-terminal tip of lipid-transfer proteins, and the role of lipid transfer is likely conserved between eukaryotes and bacteria.
ATG8 and ATG12 – ubiquitin-like proteins
In eukaryotes, Ub and Ubl proteins, such as SUMO proteins, NEDD8, Atg8 and Atg12, act as versatile protein modifiers that play an important role in proteasome-mediated protein degradation, cell signaling and autophagy (Ichimura et al., 2000; Mizushima et al., 1998; Welchman et al., 2005). Ubl proteins are closely related to ThiS, MoaD and TGS, which were already present in the last universal common ancestor (LUCA) (Burroughs et al., 2012). ThiS and MoaD are sulfur carriers that function in the thiamin and molybdenum cofactor (MoCo) synthesis pathways and are conserved in most bacterial and archaeal lineages (Iyer et al., 2006). Both ThiS and MoaD require adenylation at the C-terminus in order to accept sulfur (Kessler, 2006), similar to Ub activation via the E1 enzyme, suggesting an evolutionary link (Fig. 4D). Consistent with this, the bacterial, archaeal and eukaryotic Ubl proteins TtuB, SAMP and Urm1 have been found to have dual functions, being capable of both sulfur transfer and protein modification (Humbard et al., 2010; Miranda et al., 2011; Shigi, 2012; Van Der Veen et al., 2011). Similar to ThiS and MoaD, these proteins require only E1 enzymes, and thus represent a simpler and likely ancestral version of Ub conjugation.
Operons encoding E2 and E3 enzymes have also been found in certain bacterial and archaeal species, with a dispersed distribution across many lineages (Burroughs et al., 2009, 2011; Grau-Bové et al., 2015; Imachi et al., 2020; Iyer et al., 2006; Nunoura et al., 2011; Spang et al., 2015), possibly as a result of lateral gene transfer. Thus, all components of the eukaryotic Ub-conjugation system are present in prokaryotes. Of these, the Aigarchaeon Candidatus Caldiarchaeum subterraneum and Lokiarchaeon species, such as Candidatus Prometheoarchaeum syntrophicum, the closest cultured archaeal relative of eukaryotes, are likely to have a Ub-conjugation system that is most similar to that of eukaryotes (Imachi et al., 2020).
ATG9 – a lipid scramblase
The lipid scramblase Atg9/ATG9, the only transmembrane protein among the core Atg proteins, forms a homotrimer with both central and lateral cavities (Guardia et al., 2020; Maeda et al., 2020; Matoba et al., 2020). A recent study found an unexpected sequence and structural similarity between the transmembrane domains of ATG9 and those of type I ATP-binding cassette (ABC) transporters in bacteria (Maeda et al., 2020). The N- and C-terminal halves of the transmembrane portion of ATG9 represent structural repeats, each consisting of two transmembrane helices (TMHs) and one re-entrant membrane helix (RMH). This repeated unit is similar to the TMH1–TMH3 domains in the ABC transporter, although TMH3 of the ABC transporter is not re-entrant and aligns only with the RMH in the C-terminal half of ATG9, not with that in the N-terminal half (Maeda et al., 2020) (Fig. 4E). Notably, superimposing MsbA, a representative ABC transporter, bound to its substrate LPS onto the C-terminal half of ATG9 places LPS in a position similar to that of the lipids bound to ATG9 (Maeda et al., 2020). ABC transporters are ubiquitous in all three domains of life and usually contain two transmembrane modules consisting of several TMHs and two nucleotide-binding domains (NBDs) (Higgins, 1992). With some exceptions, they can be phylogenetically divided into three groups: (1) bacterial class I proteins that possess fused transmembrane modules and NBDs, and eukaryotic ABCB, ABCC and ABCD; (2) bacterial class III proteins with separate transmembrane modules and NBDs, and eukaryotic ABCA (and possibly ABCG); and (3) bacterial class II proteins and eukaryotic ABCE and ABCF (Davidson et al., 2008; Xiong et al., 2015) (Fig. 4E). The last group does not contain transmembrane modules and functions differently. The newly identified similarity between ATG9 and type I ABC transporters places ATG9 in the first group.
However, as most phylogenies of ABC transporters are based on NBD sequences, which are the most conserved portion of these proteins, more studies are required to fully understand the evolutionary relationships between these proteins.
Speculation on the evolution of the autophagy pathway
Even though autophagy exists only in eukaryotes, remote homologs of Atg proteins were already present in prokaryotes, as discussed above. The level of functional conservation between prokaryotic and eukaryotic proteins varies; for example, the lipid-transfer function of the chorein-N domain remains essentially unchanged, and the relative position of the lipid component and the TMH remains similar in MsbA and ATG9, whereas the HORMA domain switched from an immune-centric role to a more versatile scaffolding role in different eukaryotic complexes. In the case of Ubl conjugation, similar systems were already fully established in prokaryotes, although the autophagy-specific version was not. In the DedA superfamily, the autophagy-related function of VMP1 and TMEM41B may be related to their ancestral function in the plasma membrane.
The exact temporal order of events during eukaryogenesis remains unclear; however, it is possible to imagine that a developing intracellular membrane trafficking system enlisted preexisting proteins to form a basic autophagy-like pathway, which then became specialized and more efficient. Detailed analyses of the archaeal species that are most closely related to eukaryotes, as well as of eukaryotic species that lack an autophagy pathway (e.g. Giardia lamblia and red algae) or retain only a minimal one (e.g. trypanosomatids), will help shed more light on this process.
Evolution of the autophagy pathway can be separated into two phases: before and after the development of a complete, yet simple, autophagic system in the LECA. Our understanding of both phases has greatly expanded in recent years; however, a number of important questions remain. For example, the order of events leading up to the development of the autophagy pathway in the LECA, the recruitment of different Atg proteins and how this process is related to the evolution of the membrane trafficking system remain unknown. It is also important to understand the reasons for the multiplication and loss events that occurred after the LECA, and to analyze in detail the species and lineages that are currently less well understood. Future studies addressing these issues will help deepen our overall understanding of the autophagy pathway.
We would like to thank Drs Hayashi Yamamoto, Ikuko Koyama-Honda, Junichi Sakamaki, and Kamil Soltysik for their valuable comments.
This work was supported by a grant for Exploratory Research for Advanced Technology (JPMJER1702 to N.M.) from the Japan Science and Technology Agency (JST).
The authors declare no competing or financial interests.