ABSTRACT
TDP-43 proteinopathy is the major pathology in amyotrophic lateral sclerosis (ALS) and tau-negative frontotemporal dementia (FTD). Mounting evidence implicates loss of normal TDP-43 RNA-processing function as a key pathomechanism. However, the RNA targets of TDP-43 differ by report, and have never been formally collated or compared between models and disease, hampering understanding of TDP-43 function. Here, we conducted re-analysis and meta-analysis of publicly available RNA-sequencing datasets from six TDP-43-knockdown models, and TDP-43-immunonegative neuronal nuclei from ALS/FTD brain, to identify differentially expressed genes (DEGs) and differential exon usage (DEU) events. There was little overlap in DEGs between knockdown models, but PFKP, STMN2, CFP, KIAA1324 and TRHDE were common targets and were also differentially expressed in TDP-43-immunonegative neurons. DEG enrichment analysis revealed diverse biological pathways including immune and synaptic functions. Common DEU events in human datasets included well-known targets POLDIP3 and STMN2, and novel targets EXD3, MMAB, DLG5 and GOSR2. Our interactive database (https://www.scotterlab.auckland.ac.nz/research-themes/tdp43-lof-db/) allows further exploration of TDP-43 DEG and DEU targets. Together, these data identify TDP-43 targets that can be exploited therapeutically or used to validate loss-of-function processes.
This article has an associated First Person interview with the first author of the paper.
INTRODUCTION
TDP-43 (encoded by TARDBP) is a predominantly nuclear DNA- and RNA-binding protein first discovered to bind to the trans-active response element in the human immunodeficiency virus (HIV)-1 sequence (Ou et al., 1995). TDP-43 was subsequently found to be the major constituent of pathogenic aggregates in amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) neuropathology (Arai et al., 2006; Neumann et al., 2006). Indeed, hyper-phosphorylated and aggregated cytoplasmic TDP-43 is the pathological signature in almost all cases of ALS and ∼50% of FTD patients (Neumann et al., 2006; Ling et al., 2013; Mackenzie et al., 2007). Other neurodegenerative diseases also manifest with TDP-43 neuropathology, including Alzheimer's disease, Parkinson's disease and Huntington's disease (Amador-Ortiz et al., 2007; Arai et al., 2009; Higashi et al., 2007; James et al., 2016; McAleese et al., 2017; Nakashima-Yasuda et al., 2007; Schwab et al., 2008; Dewan et al., 2021). There is also a clear relationship between the development of neurodegenerative disease and mutations in other RNA-binding proteins that favor their aggregation (Kim et al., 2013; Vance et al., 2009; Johnson et al., 2014; Elden et al., 2010). Strikingly, the regional patterning of neuronal loss both in the brain and spinal cord closely reflects the patterning of TDP-43 aggregate deposition (Brettschneider et al., 2014, 2013; Mackenzie et al., 2013). However, TDP-43 protein inclusions represent only one species across a spectrum of conformations in which TDP-43 can exist (Scotter et al., 2014), and their ease of detection has likely influenced our perception of the pathomechanisms of disease.
The gain-of-toxic-function hypothesis for TDP-43 in ALS emerged from three key findings: (1) that almost all inherited ALS including TARDBP-mutant ALS is inherited dominantly (Sreedharan et al., 2008; Rutherford et al., 2008); (2) that the characteristic pathology is the appearance of cytoplasmic TDP-43 aggregates absent from non-neurodegenerative disease tissue (Arai et al., 2006; Neumann et al., 2006); and (3) that TDP-43 overexpression paradigms in animal models recapitulated these TDP-43 aggregates and the symptoms of human ALS (Wils et al., 2010; Li et al., 2010; Wegorzewska et al., 2009; Johnson et al., 2008). Recognition that loss of normal TDP-43 function may also, or instead, be pathogenic in ALS was based upon the observation that TDP-43, both in human ALS tissue and transgenic animal models harboring TDP-43 inclusions, is cleared from its normal location in the nucleus (Mitra et al., 2019; Braak and Del Tredici, 2018; Walker et al., 2015; Braak et al., 2017). Nuclear-to-cytoplasmic mislocalization of TDP-43 likely feeds into, and is further induced by, TDP-43 aggregation in the cytoplasm via a sequestration mechanism (Walker et al., 2015; Amlie-Wolf et al., 2015; Mutihac et al., 2015; Winton et al., 2008) (Fig. 1). TDP-43 fulfils specific cytoplasmic functions, including the regulation of stress granules, which are proposed by some to ‘seed’ TDP-43 aggregation (Birsa et al., 2020; Molliex et al., 2015; Fernandes et al., 2018; Becker et al., 2017). However, TDP-43 is a predominantly nuclear protein, and the majority of its RNA-processing roles are executed in the nucleus (Ou et al., 1995; Bowden and Dormann, 2016; Ling et al., 2010; Sephton et al., 2011; Buratti et al., 2001; Fiesel et al., 2012; Shiga et al., 2012; De Conti et al., 2015; Ling et al., 2015; Tollervey et al., 2011). 
These include regulation of alternative splicing, enhancement or repression of exon and cryptic exon inclusion, mRNA transport and polyadenylation (Ling et al., 2015; Tollervey et al., 2011; Buratti and Baralle, 2008; Buratti et al., 2004; Fallini et al., 2012; Klim et al., 2019; Melamed et al., 2019; Polymenidou et al., 2011). Nuclear TDP-43 is also able to autoregulate its own mRNA levels through a negative-feedback loop by binding its own 3′ untranslated region (UTR) (Ayala et al., 2011), so loss of TDP-43 from the nucleus likely further contributes to TDP-43 overproduction, phase separation and aggregation, and sequestration (Tziortzouda et al., 2021) (Fig. 1). Loss of appropriate TDP-43 RNA-processing function is evidenced in human ALS by extensive transcriptional change and mis-splicing (Tollervey et al., 2011; Buratti and Baralle, 2008, 2010; Polymenidou et al., 2011). Notably, similar transcriptional changes, motor neuron pathology and motor symptoms are seen both in animal models with TDP-43 inclusions and in models that are based upon TDP-43 knockdown (Mihevc et al., 2016; Broeck et al., 2014). Thus, loss of nuclear TDP-43 function is clearly critical to the pathogenesis of ALS.
Schematic of the mechanisms of TDP-43 loss of function in ALS. (Top) Normally, TDP-43 is actively imported into the nucleus and passively diffuses (Pinarbasi et al., 2018), giving TDP-43 its predominantly nuclear localization. However, TDP-43 in amyotrophic lateral sclerosis (ALS) is frequently mislocalized to the cytoplasm, leading to nuclear TDP-43 depletion. (Bottom right) Cytoplasmic TDP-43 is prone to phase separation and aggregation, with hyperphosphorylated aggregates further sequestering TDP-43. (Bottom left) Readily detectable TDP-43 nuclear depletion or sequestration into aggregates, or less easily detected misfolding, can deplete the functional pool of TDP-43. TDP-43 loss of function leads to failed autoregulation of TARDBP (enhancing TDP-43 translation), in addition to failed regulation of myriad other TDP-43 targets, many of which are unknown or unvalidated.
In recognition of the impact of TDP-43 loss of function on gene expression in ALS and FTD, an increasing number of studies report the transcriptome-wide effect of TDP-43 depletion. Although certain TDP-43 targets, such as RANBP1 and POLDIP3, are clearly reproducible in multiple studies (Fiesel et al., 2012; Shiga et al., 2012; Ling et al., 2015; Polymenidou et al., 2011; Mihevc et al., 2016; Roczniak-Ferguson and Ferguson, 2019; Stalekar et al., 2015), there has yet to be a formal analysis published of common TDP-43 loss-of-function targets. Different cell types are obviously transcriptomically unique, as are the same cell types derived from different species, meaning that the influence of TDP-43 on gene expression and splicing is context dependent (Ling et al., 2015; Jeong et al., 2017). Here, we re-analyzed publicly available RNA-sequencing (RNA-seq) datasets from TDP-43-depleted model systems, as well as a human ALS/FTD neuronal nuclei dataset demonstrating loss of nuclear TDP-43, to examine common transcriptional patterns of TDP-43 loss of function. Elucidating markers of TDP-43 loss of function will enable better understanding of disease mechanisms and the extent to which TDP-43 loss of function is associated with neurodegeneration. Further, such markers could serve as biomarkers and/or targets for treatment.
RESULTS
Validation of TARDBP depletion in TDP-43 knockdown models and in TDP-negative ALS/FTD neuronal nuclei
Forty-seven RNA-sequencing datasets were identified using ‘TDP-43’ as a keyword. Nine studies met the inclusion criteria for re-analysis [raw data available, appropriate sample size, TDP-43 depleted experimentally (‘knockdown’)]; however, only six of these met quality control thresholds and were fully processed (Fig. 2). Of these six, three were performed on cells derived from humans {GSE136366 (human HeLa cells; Roczniak-Ferguson and Ferguson, 2019), GSE122069 (human SH-SY5Y cells; Melamed et al., 2019) and GSE121569 [human induced motor neurons (ihMNs); Klim et al., 2019]}, two from mouse [GSE116456 (mouse mammary gland; Zhao et al., 2020) and GSE27218 (mouse striatum; Polymenidou et al., 2011)], and one from rat [GSE135611 (rat primary astrocytes; LaRocca et al., 2019)] (Table S1). An additional RNA-seq dataset was identified in which neuronal nuclei had been sorted from ALS/FTD tissue according to nuclear TDP-43 immunoreactivity [GSE126543 (cortical neuronal nuclei with or without detectable TDP-43 immunolabeling from seven ALS/FTD human brains; Liu et al., 2019)]. This dataset also met quality control thresholds and was processed using the same pipeline (Fig. 2).
Study selection and RNA-seq data-processing pipeline. RNA-seq datasets were selected from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database using search term and study type, filtered for studies that performed RNA-seq on models with experimental TDP-43 depletion, and processed through a common bioinformatic pipeline. DEG, differentially expressed gene; DEU, differential exon usage. TDP-43 knockdown datasets were then compared with a TDP-43-immunonegative amyotrophic lateral sclerosis (ALS)/frontotemporal dementia (FTD) neuronal nuclei dataset.
Differentially expressed genes shared by TDP-43 knockdown and TDP-negative ALS/FTD neuronal nuclei datasets
We first examined differential gene expression between control and TDP-43-knockdown samples [human HeLa cells (Roczniak-Ferguson and Ferguson, 2019), human SH-SY5Y cells (Melamed et al., 2019), ihMNs (Klim et al., 2019), mouse mammary gland (Zhao et al., 2020), mouse striatum (Polymenidou et al., 2011) and rat primary astrocytes (LaRocca et al., 2019)], and ALS/FTD TDP-43-immunopositive (TDP pos) and TDP-43-immunonegative (TDP neg) neuronal nuclei (Liu et al., 2019), using the DESeq2 package. Depletion of TARDBP/Tardbp was examined in HeLa, SH-SY5Y, ihMN, mouse striatum, mouse mammary gland and rat astrocytes (Fig. 3A). The extent of TARDBP/Tardbp depletion was greater in TDP-43 knockdown samples than in sorted ALS/FTD TDP neg neuronal nuclei, as expected given that the latter were selected for nuclear clearance of TDP-43 protein, rather than being subject to TARDBP knockdown. All datasets represent models of TDP-43 loss of function.
DEGs shared by TDP-43 knockdown and TDP-negative ALS/FTD neuronal nuclei datasets were cell-type specific and species specific. (A) Normalized counts of TARDBP/Tardbp transcripts from TDP-43 knockdown studies and in ALS/FTD neuronal nuclei. (B) Venn diagrams comparing upregulated and downregulated DEGs (Padj<0.05) among human datasets and among rodent datasets. (C) DEGs shared with the ALS/FTD TDP-43-immunonegative (TDP neg) neuronal nuclei dataset. (D) Top three upregulated and downregulated DEGs shared between the TDP-43 knockdown studies and the TDP neg neuronal nuclei dataset. ihMN, human induced motor neurons; TDP pos, TDP-43 immunopositive.
There were three upregulated differentially expressed genes (DEGs) and five downregulated DEGs shared by all human-derived TDP-43 knockdown models, and nine upregulated and three downregulated DEGs shared by all rodent-derived TDP-43 knockdown models (Fig. 3B). There was no overlap of DEGs shared among all human-derived models and DEGs shared among all rodent-derived models (Fig. 3B). Despite this, both human- and rodent-derived datasets shared DEGs with ALS/FTD TDP neg neuronal nuclei (Fig. 3C). These included known TDP-43-regulated transcripts such as PFKP and STMN2 (both downregulated).
The upregulated DEGs shared by the largest number of TDP-43 knockdown models and with ALS/FTD TDP neg neuronal nuclei were KIAA1324, CFP and ITGA4. The shared downregulated DEGs were TRHDE, PFKP and MASP2 (Fig. 3D). DEGs [adjusted P-value (Padj)<0.05] in each model are listed in Tables S2 and S3 (only genes with log2FC <−0.5 or >0.5), or can be explored interactively at https://www.scotterlab.auckland.ac.nz/research-themes/tdp43-lof-db/ (all significant DEGs), filtering by model system or gene of interest. Overall, DEGs were highly variable between datasets, but the sharing of DEGs between model systems validates them as conserved TDP-43 targets, and sharing of DEGs with ALS/FTD TDP neg neuronal nuclei validates those DEGs as being disease relevant.
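The filtering and overlap logic described here can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (differential expression was computed with DESeq2); it assumes each dataset has already been reduced to rows of gene, log2 fold change and adjusted P-value, and it reads the table threshold as log2FC below −0.5 or above 0.5. The function names, dataset labels, gene names and values are all hypothetical.

```python
from collections import Counter


def significant_degs(rows, padj_cutoff=0.05, lfc_cutoff=0.5):
    """Split (gene, log2FC, padj) rows into upregulated and downregulated sets.

    Thresholds are assumptions: padj < 0.05 and |log2FC| > 0.5.
    """
    up, down = set(), set()
    for gene, log2fc, padj in rows:
        if padj is None or padj >= padj_cutoff:
            continue  # not significant, or padj was not estimated
        if log2fc > lfc_cutoff:
            up.add(gene)
        elif log2fc < -lfc_cutoff:
            down.add(gene)
    return up, down


def sharing_counts(gene_sets):
    """Count the number of datasets in which each gene appears."""
    counts = Counter()
    for genes in gene_sets.values():
        counts.update(genes)
    return counts


# Invented toy results for two knockdown models and one disease dataset.
toy_results = {
    "model_A": [("GENE_1", -1.2, 0.001), ("GENE_2", 0.9, 0.01), ("GENE_3", 0.2, 0.001)],
    "model_B": [("GENE_1", -0.8, 0.02), ("GENE_2", 0.7, 0.2)],
    "disease": [("GENE_1", -0.6, 0.04), ("GENE_2", 1.1, 0.03)],
}

up_sets = {name: significant_degs(rows)[0] for name, rows in toy_results.items()}
down_sets = {name: significant_degs(rows)[1] for name, rows in toy_results.items()}

# Genes downregulated in every dataset, and per-gene sharing among upregulated genes.
shared_down = set.intersection(*down_sets.values())
up_counts = sharing_counts(up_sets)
```

Ranking genes by `sharing_counts` rather than requiring membership in every set mirrors the idea of reporting DEGs "shared by the largest number" of models, which is more tolerant of cell-type-specific expression than a strict intersection.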
Altered biological processes, molecular functions and cellular components with TDP-43 knockdown
Having identified certain genes with conserved patterns of regulation by TDP-43 between cell types and species, we examined the effect of TDP-43 knockdown on biological processes (BPs), molecular functions (MFs) and cellular components (CCs) using gene ontology (GO) enrichment analysis. The top three significant terms for BPs, MFs and CCs for both upregulated genes (Fig. 4, left) and downregulated genes (Fig. 4, right) from each dataset are collated in Fig. 4 and Table S4.
Altered biological processes, molecular functions and cellular components with TDP-43 knockdown. Enriched terms from gene ontology (GO) analysis (Padj<0.05) of upregulated genes (left) and downregulated genes (right). Terms are categorized into biological process (BP), cellular component (CC) and molecular function (MF). The model system that each term belongs to is denoted by a colored box. GO terms with shared ancestor terms are indicated by highlighting. Keys to model systems and ancestor terms shown at the bottom. ihMN, human induced motor neurons; M. Mam, mouse mammary gland; M. Str, mouse striatum; R. Ast, rat astrocytes.
Two patterns emerged after redundancy was eliminated from the GO terms: many terms were related to immune response (Fig. 4, green asterisks), and many were neuron specific (Fig. 4, yellow asterisks). Mouse striatum had three BP/CC terms for downregulated genes that were neuron specific. The mouse striatum is composed of various cell types, but these data imply neuron-selective effects of TDP-43 loss of function. Supporting this, other datasets with significant neuron-specific BP or CC terms were SH-SY5Y and ALS/FTD TDP neg neuronal nuclei.
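For readers unfamiliar with GO enrichment, the underlying over-representation test can be sketched as follows. The enrichment tool used for this analysis is not named in this section, so the code below is only an assumed, generic formulation: a one-sided hypergeometric test per term followed by Benjamini-Hochberg adjustment. All function names and numbers are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_term, n_degs, n_overlap):
    """One-sided P(X >= n_overlap) for a GO term.

    n_universe: genes in the background set
    n_term:     background genes annotated to the term
    n_degs:     genes in the DEG list
    n_overlap:  DEGs annotated to the term
    """
    max_k = min(n_term, n_degs)
    total = comb(n_universe, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_degs - k)
        for k in range(n_overlap, max_k + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented example: 10 background genes, 5 annotated to a term,
# 5 DEGs, all 5 of which carry the annotation.
p_example = hypergeom_enrichment_p(10, 5, 5, 5)
padj_example = benjamini_hochberg([0.01, 0.04, 0.03])
```

Terms would then be retained when the adjusted value falls below 0.05, matching the Padj<0.05 criterion stated in the Fig. 4 legend.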
Differential exon usage events shared by TDP-43 knockdown and TDP-negative ALS/FTD neuronal nuclei datasets
As TDP-43 is known to regulate splicing and cryptic exon expression (Ling et al., 2015; Tollervey et al., 2011), we next examined differential exon usage (DEU) between control and TDP-43-knockdown samples, and ALS/FTD TDP pos and TDP neg neuronal nuclei, using the DEXSeq package. This package quantifies changes in the relative usage of exons or parts of exons (annotated by a feature ID, e.g. E001) between conditions and generates graphical displays. DEU events in each model are listed in Table S5, or can be explored interactively at https://www.scotterlab.auckland.ac.nz/research-themes/tdp43-lof-db/, filtering for DEU as the data type.
There were 31 DEU events that were shared between ALS/FTD TDP neg neuronal nuclei and at least two other human-derived datasets (Fig. 5A). These DEU events included 12 within the (non-coding) 3′ UTR region of the gene. Fourteen of the 31 DEU events increased usage of an exonic element [i.e. exon inclusion, cryptic exon, intron retention, long non-coding RNA (lncRNA) upregulation], while 17 events decreased usage of the exon (i.e. exon exclusion, new intron) (Fig. 5B). For these 31 DEU events, we examined whether the orthologous mouse and rat genes in the rodent datasets demonstrated DEU events, identifying nine genes in which rodent models also showed DEU events (Fig. 5C). The differentially used exonic regions from humans were then subjected to National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) analysis (https://blast.ncbi.nlm.nih.gov/Blast.cgi) using the blastn algorithm, to examine sequence homology of DEU loci between human and rodent datasets. Only two genes showed a DEU event that was shared between ALS/FTD TDP neg neuronal nuclei and a rodent dataset, and that also showed sequence homology of the region between human and mouse: STMN2 and POLDIP3 (Fig. 5C,D). In POLDIP3, exon usage in the orthologous location in both human and mouse models was decreased. However, in STMN2, cryptic exon usage in human models was increased, as reported previously (Klim et al., 2019; Melamed et al., 2019), but exon usage at the orthologous location in mouse was decreased (Fig. 5E).
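The selection rule used here (a DEU event present in ALS/FTD TDP neg neuronal nuclei and in at least two other human-derived datasets) can be expressed as a short sketch. It assumes each dataset has been reduced to a set of significant (gene, feature ID) pairs from a DEXSeq-style analysis; the function name, dataset labels, gene names and feature IDs below are hypothetical.

```python
def shared_with_disease(disease_events, model_events, min_models=2):
    """Return disease DEU events supported by at least `min_models` model datasets.

    disease_events: set of (gene, feature_id) pairs from the disease dataset
    model_events:   dict mapping model name -> set of (gene, feature_id) pairs
    The result maps each retained event to the sorted list of supporting models.
    """
    shared = {}
    for event in disease_events:
        supporting = sorted(
            name for name, events in model_events.items() if event in events
        )
        if len(supporting) >= min_models:
            shared[event] = supporting
    return shared


# Invented toy events; feature IDs follow the E001-style labels mentioned in the text.
toy_disease = {("GENE_A", "E005"), ("GENE_B", "E012"), ("GENE_C", "E002")}
toy_models = {
    "model_1": {("GENE_A", "E005"), ("GENE_B", "E012")},
    "model_2": {("GENE_A", "E005"), ("GENE_C", "E002")},
    "model_3": {("GENE_A", "E005"), ("GENE_B", "E012")},
}

shared_two = shared_with_disease(toy_disease, toy_models, min_models=2)
shared_all = shared_with_disease(toy_disease, toy_models, min_models=len(toy_models))
```

Setting `min_models` to the number of model datasets gives the stricter criterion of an event shared with every human-derived dataset.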
DEU events shared by TDP-43 knockdown and TDP-negative ALS/FTD neuronal nuclei datasets. (A) Venn diagram comparing human dataset DEU events. In the textbox are the locations of DEU events shared between ALS/FTD TDP neg neuronal nuclei and at least two other human-derived datasets. A DEU is described with a feature ID (e.g. E001) that corresponds to an exon or part of an exon. (B) DEU events according to direction of change and nature of splicing event. (C) Genes from human datasets in textbox in A showing whether a DEU was present in the orthologous mouse or rat gene (colored box), and whether the DEU region showed sequence homology between human and rodent. (D) DEU regions in POLDIP3/Poldip3 and STMN2/Stmn2 are homologous between human and mouse. (E) Human STMN2 transcript (left), but not mouse Stmn2 (right), includes a cryptic exon with TDP-43 knockdown.
There were ten DEU events that were shared between ALS/FTD TDP neg neuronal nuclei and all other human-derived datasets (Fig. 6), and all of these were consistent in the direction of change. Exonic element usage in POLDIP3, MMAB, IL18BP and GOSR2 was decreased, and in MARK3, USP31, NDUFA5, DLG5, EXD3 and TBL1XR1-AS1 was increased. Of these, only POLDIP3 showed differential usage of a coding exon, with the remainder involving changes to non-coding regions, such as introns/cryptic exons (USP31, DLG5 and EXD3), the 3′ UTR (MMAB, IL18BP, GOSR2, MARK3 and NDUFA5) and a lncRNA (TBL1XR1-AS1) (Cunningham et al., 2022). We consider these ten highly conserved DEU events to represent an ideal ‘panel’ for examining loss of TDP-43 function in humans and human models.
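Checking that a shared DEU event changes in the same direction in every dataset can be sketched as below. This is an illustrative helper only, assuming a per-dataset log2 fold change in exon usage is available for each event; the names and values are invented and do not come from the analyzed datasets.

```python
def direction(log2fc):
    """Label the direction of an exon-usage change."""
    if log2fc > 0:
        return "increased"
    if log2fc < 0:
        return "decreased"
    return "unchanged"


def consistent_direction(event_changes):
    """Return the shared direction if all datasets agree, otherwise None.

    event_changes: dict mapping dataset name -> log2 fold change in exon usage.
    """
    directions = {direction(value) for value in event_changes.values()}
    return directions.pop() if len(directions) == 1 else None


# Invented toy panel: one consistent event and one discordant event.
toy_panel = {
    ("GENE_A", "E005"): {"model_1": -1.4, "model_2": -0.7, "disease": -0.9},
    ("GENE_B", "E012"): {"model_1": 0.8, "model_2": -0.3, "disease": 1.1},
}
panel_directions = {
    event: consistent_direction(changes) for event, changes in toy_panel.items()
}
```

Events returning `None` would be excluded from a marker panel, since a marker that changes in opposite directions across systems cannot be interpreted without knowing the model.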
Top shared differentially used exons – a panel of markers of TDP-43 loss of function. The ten DEU events shared by all three human TDP-43 knockdown datasets and the ALS/FTD TDP neg dataset are showcased as a potential TDP-43 loss-of-function panel, accompanied by coverage tracks from the ALS/FTD TDP neg dataset. Of the ten shared DEU events, four were decreased with TDP-43 loss of function and six were increased.
DISCUSSION
As the common neuropathological feature in ALS with or without FTD, aggregated TDP-43 and its acquired toxic functions represent tractable molecular targets for therapy. Yet, increasing evidence suggests that successful therapies will also require rescue of normal physiological functions of TDP-43 that are compromised in disease. Loss of gene expression and splicing regulatory function are now widely accepted features of ALS with TDP-43 proteinopathy, but how the specific complement of TARDBP mRNA targets drives pathogenesis and phenotype remains unclear. Central to this is the need to identify the complement of TARDBP mRNA targets in distinct cell types and model systems, and to verify which of these targets is regulated by TDP-43 in human ALS/FTD neurons. We approached this problem by assessing TARDBP mRNA targets in human and rodent models, in neuronal and non-neuronal cell types, and identifying shared and unique targets and biological pathways under TDP-43 control. TDP-43 regulates certain common targets across myriad cell types, implying that TDP-43 proteinopathy in non-neuronal cells initiates processes that at least partially overlap with those occurring in neurons. Notably, TDP-43-dependent changes identified in multiple model systems were validated to change in human ALS/FTD neuronal nuclei with TDP-43 nuclear depletion, supporting TDP-43 loss of function as being pathomechanistic.
DEGs as markers of TDP-43 loss of function in disease
Certain transcripts were robustly regulated by TDP-43 in multiple datasets and in human ALS/FTD neurons with nuclear TDP-43 loss. Decreased PFKP mRNA has been identified as a TDP-43 loss-of-function marker by several studies in addition to those in our meta-analysis (Buratti and Baralle, 2010; Coyne et al., 2021; Park et al., 2013). PFKP encodes the platelet isoform of phosphofructokinase, a key glycolytic enzyme that is expressed almost ubiquitously. PFKP is proposed to decrease under conditions of TDP-43 depletion via suppression of miR-520 (Park et al., 2013). Yet, despite the apparent promise of decreased PFKP as a TDP-43 loss-of-function marker, glycolysis is hypothesized to be a compensatory mechanism in ALS, and in human ALS spinal cord tissue with TDP-43 proteinopathy, PFKP levels were found to increase rather than decrease compared to those in controls (Manzo et al., 2019). However, TDP-43 can also bind PFKP directly (Tollervey et al., 2011), and TARDBP knockdown can result in PFKP cryptic exon inclusion, suggesting a direct interaction (Klim et al., 2019). Indeed, PFKP cryptic exon usage within intron 16 of ENSEMBL PFKP-203 was seen in all human datasets we examined except ALS/FTD TDP neg neurons (Table S5). This cryptic exon may therefore be preferable to overall PFKP transcript levels as a TDP-43 loss-of-function marker.
Loss of STMN2 expression following TDP-43 depletion is associated with the emergence of a cryptic exon after exon 1, which introduces a premature polyadenylation site (Prudencio et al., 2020). This truncated STMN2 mRNA was upregulated in the frontal cortex in FTD with TDP-43 pathology (Prudencio et al., 2020). STMN2 protein is essential for microtubule stability and thus cytoskeletal transport, synapse maintenance and homeostasis of motor neurons. STMN2 has been identified as a key TDP-43 target in neurodegeneration, with deficits in axonal outgrowth and repair following TDP-43 depletion being rescued by restoration of STMN2 alone (Klim et al., 2019; Melamed et al., 2019). Cytoskeletal dynamics are further implicated in the pathogenesis of ALS/FTD by disease-causing mutations in DCTN1 (Münch et al., 2005), PFN1 (Wu et al., 2012; Smith et al., 2015) and TUBA4A (Smith et al., 2014). Interestingly, ALS/FTD linked to any of these genes is associated with TDP-43 cytoplasmic aggregation (and thus presumably TDP-43 nuclear clearing), raising the possibility of a feed-forward interaction between TDP-43 loss of function and axonal cytoskeleton dysfunction. Indeed, a therapeutic that induces STMN2 for ALS is soon to enter clinical trial. We predict that both the STMN2 cryptic exon and overall STMN2 transcript levels will be of high utility in detecting neuronal subtypes with TDP-43 loss of function.
Two other shared downregulated genes among TDP-43 knockdown models and the ALS/FTD TDP neg dataset were TRHDE and MASP2. TRHDE (encoding thyrotropin-releasing hormone-degrading enzyme) is an M1 family metallopeptidase enriched in brain regions (Torres et al., 1986). Its main substrate is thyrotropin-releasing hormone (TRH), secreted by neurons, and, accordingly, TRHDE was downregulated in brain-derived or neuron-like model datasets. However, a dual TRH-mimic/TRHDE inhibitor compound reduced motor decline and spinal cord neurodegeneration in a SOD1 mouse model (Kelly et al., 2015), so loss of TRH degradation downstream of TDP-43 dysfunction is unlikely to be pathomechanistic. Also downregulated is MASP2, which is involved in activation of the complement system (Kidmose et al., 2012). The MASP2 gene is immediately downstream of TARDBP, and thus a degree of co-regulation is expected due to similar chromatin states (Arnone et al., 2012), but these data suggest that MASP2 levels are responsive to TDP-43 protein levels independent of chromatin packing. As is true of other genes, whether MASP2 gene expression is useful as a TDP-43 loss-of-function marker depends on whether TDP-43 is a major determinant of its transcript levels.
Upregulated genes shared between various TDP-43 depletion datasets and ALS/FTD TDP neg neurons were KIAA1324, CFP and ITGA4. KIAA1324 encodes a multifunctional protein and is also known as estrogen-induced gene 121 (EIG121) and endosome/lysosome-associated apoptosis and autophagy regulator (ELAPOR1) in humans (Deng et al., 2010), or Elapor1 or insulin inhibitory receptor (Iir) in mice (Ansarullah et al., 2021). CFP encodes the complement factor properdin, which regulates the alternative complement pathway (Pedersen et al., 2017) and may be upregulated by chronic cell stress. ITGA4 encodes integrin subunit alpha 4, which is overexpressed in nerve injury (Bas et al., 2015; Xing et al., 2017), promotes immune cell infiltration and is inhibited therapeutically in multiple sclerosis (Butzkueven et al., 2014). As ITGA4 is positively regulated by the lncRNA NEAT1 (Asadi et al., 2021), and NEAT1 binding scaffolds the phase separation of TDP-43 (Tollervey et al., 2011; Wang et al., 2020), TDP-43 loss of function might liberate NEAT1 and thus increase ITGA4 levels. However, ITGA4 is also upregulated in SOD1- and FUS-mutant induced pluripotent stem cell models, which do not exhibit TDP-43 proteinopathy or dysfunction (Ziff et al., 2022). As all of these upregulated genes are sensitive to apoptosis and neuroinflammation, any upregulation specifically due to loss of TDP-43 function may be difficult to disentangle.
In general, the value of using overall expression levels of TDP-43-regulated transcripts as loss-of-function markers depends upon the following: (1) the cell types in which the transcript is expressed, and whether those cell types are subject to TDP-43 dysfunction in the tissue sampled; (2) whether the transcript is subject to secondary regulation, aside from regulation by TDP-43; and (3) whether regulation by TDP-43 is direct or indirect via other mediators also subject to change. Overall, the ability of DEGs to report upon TDP-43 function is likely to differ across different transcriptomic profiles (cell states, cell types, species).
DEU events as markers of TDP-43 loss of function in disease
TDP-43 mainly binds non-coding stretches of DNA/RNA, such as introns, untranslated regions, intergenic regions and lncRNAs (Tollervey et al., 2011). Of the ten DEU events identified as common between the human-derived datasets (Fig. 6), only one involved a coding exon (POLDIP3). The skipping of exon 3 in POLDIP3 is a frequently reproduced marker in TDP-43 knockdown studies (Fiesel et al., 2012; Shiga et al., 2012; Tollervey et al., 2011; Mihevc et al., 2016; Roczniak-Ferguson and Ferguson, 2019; Colombrita et al., 2015; Suzuki et al., 2015), in which it undergoes an isoform switch. In the absence of TDP-43, the canonical variant 1 is decreased whereas variant 2 is increased due to exon 3 exclusion (Fiesel et al., 2012; Shiga et al., 2012). Interestingly, overexpression of TDP-43 mutants in HEK293T cells also caused POLDIP3 exon 3 exclusion, suggesting that mutants can induce loss of function of endogenous wild-type TDP-43 (Chen et al., 2019). POLDIP3 protein [also known as S6K1 Aly/REF-like target (SKAR)] interacts with exon junctional complexes to increase the translation efficiency of spliced mRNAs (Ma et al., 2008). Our validation of POLDIP3 exon 3 exclusion in human ALS/FTD neuronal nuclei with loss of nuclear TDP-43 strongly supports POLDIP3 as a TDP-43 loss-of-function marker in ALS tissue, and indeed increased POLDIP3 variant 2 mRNA is seen in various motor regions of the CNS in ALS (Shiga et al., 2012).
Half of the DEU events common to all human datasets were modifiers of 3′ UTR usage. The 3′ UTR is an important regulator of mRNA stability (Meijlink et al., 1985), localization (Macdonald and Struhl, 1988) and translation (Miller and Madras, 2002), and even of protein–protein interactions (Berkovits and Mayr, 2015). Trans-acting factors such as RNA-binding proteins that interact with the 3′ UTR also determine its function (Mayr, 2017), meaning that altered 3′ UTR sequences may have variable effects depending on the transcriptome. Three intron retention or cryptic exon emergence events were common to all human datasets, and these can be associated with disease (Dhir and Buratti, 2010). Because changes in intron/cryptic exon usage with TDP-43 loss of function were sometimes subtle, these genes should be examined together and ideally as part of the panel of ten DEU events identified in Fig. 6. This panel may be used in the design of probes for in situ hybridization (RNAScope, BaseScope) of ALS/FTD tissue, in conjunction with cell-type-specific immunohistochemical markers, to identify additional cell types with TDP-43 loss of function. The panel could also guide the design of primers/probes for quantitative RT-PCR to assess the fidelity of ALS models or the restoration of TDP-43 function by therapeutics.
Together, the transcripts identified in the panel cover a range of biological functions, including translation (POLDIP3) (Ma et al., 2008), mitochondrial function (MMAB, NDUFA5) (Dobson et al., 2002; Loeffen et al., 1998), endosomal trafficking (GOSR2) (Lowe et al., 1997), immune response regulation (IL18BP) (Novick et al., 1999), microtubule regulation (MARK3, DLG5) (Gu et al., 2013; Wang et al., 2014), exonuclease activity (EXD3) (Gaudet et al., 2011) and deubiquitylation (USP31) (Lockhart et al., 2004). How the processing, translation and protein interactions of these transcripts are changed with DEU warrants future investigation. So too does the upregulation of the lncRNA TBL1XR1-AS1, which occurred in the absence of gene expression changes to its target transcript TBL1XR1 in any human dataset. These DEU events represent a conserved set of markers of TDP-43 loss of function, demonstrate the promiscuity of TDP-43 effects and mirror the diversity in biological functions implicated in ALS pathogenesis.
Models for identifying TDP-43 targets – strengths and limitations
Several lines of evidence in our data suggest that our DESeq2 and DEXSeq analyses are conservative methods for identifying DEGs and DEU events, and comparative studies of DEG packages have also shown that DESeq2 errs on the conservative side (Li et al., 2020; Soneson and Delorenzi, 2013). It is essential to identify consistent and reliable markers of TDP-43 loss of function to nominate targets with diagnostic or therapeutic potential. DEXSeq was chosen for this study for its conservative approach to controlling Type I error, leading to fewer false positives (Anders et al., 2012). TDP-43 may therefore regulate splicing events beyond those described here, and their identification could be aided by the combined use of multiple exon usage and splicing analysis tools. Conversely, exon usage changes identified in this study are unlikely to be due to sample variance and are therefore likely to be TDP-43 dependent.
In addition to the methodology used, the transcriptional targets of TDP-43 that we identified were dependent upon the species and fidelity of the cellular and animal models of ALS employed. Cryptic exons that emerge in human models of TDP-43 loss of function are not recapitulated in mouse models (Ling et al., 2015; Humphrey et al., 2017), including a cryptic exon identified in STMN2 that has attracted significant attention in ALS/FTD research (Klim et al., 2019). Our results build upon the emerging consensus that TDP-43 has a distinct set of molecular targets in different cell types and species (Jeong et al., 2017). Human-derived transcriptomes are likely to be most suitable for identifying molecular pathways and drug targets relevant to human ALS.
The mechanism of modelling ALS is equally critical to identifying disease-relevant TDP-43 targets and pathways. Depleting TDP-43 is gaining acceptance in a field initially predominated by overexpression and TDP-43-mutant models (Wegorzewska and Baloh, 2011; Liu et al., 2013), and here we demonstrate that TDP-43 depletion recapitulates at least some of the transcriptional effects of loss of nuclear TDP-43 in ALS/FTD neuronal nuclei. TDP-43 knockdown can thus be considered an appropriate, even if partial, experimental paradigm of disease for identifying mechanisms, biomarkers and therapeutic targets.
This study highlights mRNA transcripts and parts of transcripts for which expression can robustly report upon TDP-43 loss of function. TDP-43 knockdown largely alters different biological pathways in human and rodent model systems, and human-derived models better recapitulate specific transcriptional and splicing changes that occur in ALS/FTD neuronal nuclei. Our findings enable future identification of non-neuronal cell types with TDP-43 loss of function, while revealing key players in the selective neuronal cell death that occurs in ALS and FTD.
MATERIALS AND METHODS
Identification of TDP-43 knockdown studies for meta-analysis
A repository search was conducted in September 2020 to identify TDP-43 knockdown studies with available RNA-seq data using the Gene Expression Omnibus (GEO) from the NCBI (http://ncbi.nlm.nih.gov/geo). The search was performed using the keyword ‘TDP-43’, and the results were filtered by setting Entry Type as ‘Series’ to capture all potential samples that belonged to a common study, and setting Study Type as ‘Expression profiling by high throughput sequencing’. Inclusion criteria for re-analyzing these datasets were as follows: (1) raw RNA-seq data available; (2) at least three TDP-43 knockdown samples and two appropriate control samples; (3) experimental depletion of TDP-43. In addition, an RNA-seq dataset was identified in which neuronal nuclei had been sorted from ALS/FTD tissue according to nuclear TDP-43 immunoreactivity; either TDP pos (normal nuclear TDP-43) or TDP neg (no nuclear TDP-43) (Liu et al., 2019) (Fig. 2). This was considered an appropriate ‘disease validation’ dataset for targets identified through meta-analysis of TDP-43 knockdown studies, because the within-case comparison of TDP pos and TDP neg neuronal nuclei paralleled the paradigm of TDP-43 loss of function by knockdown.
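As an illustration only, the three inclusion criteria listed above can be expressed as a simple filter. The sketch below is written in Python rather than the R used for the analysis; the `Study` record, its field names and the example accessions are hypothetical and do not correspond to any real GEO series.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical summary of one candidate GEO series."""
    accession: str        # placeholder label, not a real GEO accession
    has_raw_reads: bool   # criterion 1: raw RNA-seq data available
    n_knockdown: int      # number of TDP-43 knockdown samples
    n_control: int        # number of appropriate control samples
    tdp43_depleted: bool  # criterion 3: TDP-43 was experimentally depleted


def meets_inclusion_criteria(study, min_knockdown=3, min_control=2):
    """Return True only if all three inclusion criteria are satisfied."""
    return (
        study.has_raw_reads
        and study.n_knockdown >= min_knockdown
        and study.n_control >= min_control
        and study.tdp43_depleted
    )


# Invented candidates used purely to exercise the filter.
candidates = [
    Study("EXAMPLE_A", True, 3, 3, True),   # passes all criteria
    Study("EXAMPLE_B", True, 2, 3, True),   # too few knockdown samples
    Study("EXAMPLE_C", False, 4, 4, True),  # no raw data available
    Study("EXAMPLE_D", True, 3, 2, False),  # overexpression, not depletion
]

included = [s.accession for s in candidates if meets_inclusion_criteria(s)]
print(included)
```

The sample-number thresholds are exposed as keyword arguments so that the same check could be re-applied after any later sample-level exclusions.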
Data processing and statistical analysis
The data processing pipeline is shown in Fig. 2. Raw data were downloaded from NCBI using Sequence Read Archive (SRA) Toolkit v2.9.6 (http://www.ncbi.nlm.nih.gov/sra), and quality control was applied using FastQC v0.11.9 software (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Details of each dataset including study design, GEO accession number and library information are supplied in Table S1. Code is deposited at https://github.com/mcao051/TDP_LOF. Raw data that still contained adapter sequence content were trimmed with Trimmomatic v0.39 (Bolger et al., 2014). If fewer than 5 million reads survived trimming, the samples were not included for further analysis, and if this reduced the total number of control or knockdown samples below the inclusion criteria, the study was excluded. Reads were then aligned to the appropriate reference genome with HISAT2 v2.2.1 and quantified with StringTie v1.3.5 (Pertea et al., 2016). Reference genome builds used were GRCh38, GRCm39 and Rnor6.0 for human, mouse and rat, respectively. Count data were imported using R package tximport (Soneson et al., 2015) to perform differential expression analysis using DESeq2 (Love et al., 2014) and DEU analysis using DEXSeq (Anders et al., 2012). DEU is a more general measure than alternative splicing, as differing exon boundaries between reads are accounted for, therefore revealing differential usage of parts of exons or introns (exonic elements) (Anders et al., 2012). DESeq2 P-values were calculated using Wald tests and corrected for multiple testing using the Benjamini–Hochberg method. DEU events with Padj<0.05 were considered significantly changed. Genes with Padj<0.05 were used for interactive graphical reports but only genes with log2FC <−0.5 or >0.5 were compared between models. ShinyGO v0.75 (http://bioinformatics.sdstate.edu/go/) was used for GO enrichment analysis using genes with Padj<0.05 and log2FC of either <−0.5 or >0.5.
The genes detected after low-count filtering for each dataset were used as the background gene list. Redundant GO terms were eliminated using the ReViGO tool available at http://revigo.irb.hr/.
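The multiple-testing correction and thresholding steps described above can be sketched as follows. This is a minimal plain-Python illustration of the Benjamini–Hochberg adjustment, followed by a Padj<0.05 and log2FC <−0.5 or >0.5 filter and an intersection between models. It is not the DESeq2 implementation (which, among other steps, also applies independent filtering), and all gene labels, fold changes and P-values are invented toy values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(results, padj_cutoff=0.05, lfc_cutoff=0.5):
    """Select genes passing the adjusted P-value and log2FC thresholds.

    `results` maps a gene label to a (log2FC, raw P-value) tuple.
    """
    genes = list(results)
    padj = benjamini_hochberg([results[g][1] for g in genes])
    return {
        g
        for g, q in zip(genes, padj)
        if q < padj_cutoff and abs(results[g][0]) > lfc_cutoff
    }


# Toy results for two hypothetical knockdown models.
model_1 = {
    "GENE_A": (-1.2, 0.0001),
    "GENE_B": (0.8, 0.002),
    "GENE_C": (0.1, 0.0005),  # significant but small fold change
    "GENE_D": (-0.9, 0.6),    # large fold change but not significant
}
model_2 = {
    "GENE_A": (-0.7, 0.001),
    "GENE_B": (0.2, 0.001),
    "GENE_E": (1.5, 0.01),
}

degs_1 = select_degs(model_1)
degs_2 = select_degs(model_2)
shared = degs_1 & degs_2  # DEGs common to both models
print(sorted(shared))
```

Because the adjustment is applied within each model before the intersection is taken, a gene is reported as shared only if it independently passes both thresholds in every model compared.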
Interactive graphical reports were generated from the packages Glimma (Su et al., 2017) and DEXSeq (Anders et al., 2012). These can be accessed and explored further at https://www.scotterlab.auckland.ac.nz/research-themes/tdp43-lof-db/. They enable the reader to visualize and interact with all differential gene expression and exon usage analyses described herein, including genes of interest not highlighted in our study.
Data visualization
Data were visualized using R software with ggplot2 (https://ggplot2.tidyverse.org/), ggvenn (https://CRAN.R-project.org/package=ggvenn), the plotCounts function from within DESeq2 (Love et al., 2014), DEXSeq (https://www.r-project.org/), Prism 9.0 software (GraphPad Software, La Jolla, CA, USA) and Integrative Genomics Viewer (Robinson et al., 2011). Adobe Photoshop 2021 v22.5.1 (Adobe Inc.) was used as a graphic editor.
Acknowledgements
This publication is dedicated to the patients and families who contribute to our research. We also acknowledge the use of New Zealand eScience Infrastructure (NeSI; https://www.nesi.org.nz) high-performance computing facilities, funded jointly by NeSI's collaborator institutions and the Ministry of Business, Innovation and Employment Research Infrastructure program. We thank Tailgunner Digital for developing the online database and Prof. Mike Dragunow for helpful suggestions regarding the manuscript.
Footnotes
Author contributions
Conceptualization: E.L.S.; Formal analysis: M.C.C.; Investigation: M.C.C.; Writing - original draft: M.C.C., E.L.S.; Writing - review & editing: M.C.C., E.L.S.; Visualization: M.C.C.; Supervision: E.L.S.; Project administration: E.L.S.; Funding acquisition: E.L.S.
Funding
M.C.C. is supported by a University of Auckland Doctoral Scholarship. E.L.S. is supported by Marsden FastStart and Rutherford Discovery Fellowship funding from Royal Society Te Apārangi [15-UOA-157, 15-UOA-003]. This work was also supported by grants from Motor Neuron Disease NZ, Freemasons New Zealand, Matteo de Nora, Coker Family Trust and PaR NZ Golfing. No funding body played any role in the design of the study, collection, analysis or interpretation of data, or writing the manuscript. Open Access funding provided by University of Auckland. Deposited in PMC for immediate release.
Data availability
Data are available at https://www.scotterlab.auckland.ac.nz/research-themes/tdp43-lof-db/. Code for data processing is deposited at https://github.com/mcao051/TDP_LOF.
Competing interests
The authors declare no competing or financial interests.