The extracellular matrix (ECM) is a complex meshwork of proteins that forms the scaffold of all tissues in multicellular organisms. It plays crucial roles in all aspects of life – from orchestrating cell migration during development, to supporting tissue repair. It also plays critical roles in the etiology or progression of diseases. To study this compartment, we have previously defined the compendium of all genes encoding ECM and ECM-associated proteins for multiple organisms. We termed this compendium the ‘matrisome’ and further classified matrisome components into different structural or functional categories. This nomenclature is now largely adopted by the research community to annotate ‘-omics’ datasets and has contributed to advancing both fundamental and translational ECM research. Here, we report the development of Matrisome AnalyzeR, a suite of tools including a web-based application and an R package. The web application can be used by anyone interested in annotating, classifying and tabulating matrisome molecules in large datasets without requiring programming knowledge. The companion R package is available to more experienced users interested in processing larger datasets or in additional data visualization options.

The extracellular matrix (ECM) is a complex meshwork of proteins that forms the scaffold of all multicellular organisms (Hynes and Naba, 2012). It plays critical roles in all aspects of life – from orchestrating cell migration and differentiation during development (Dzamba and DeSimone, 2018; Walma and Yamada, 2020), to supporting tissue growth and repair. It also plays critical roles in the etiology or progression of diseases (Theocharis et al., 2019).

‘Omic’ technologies (e.g. transcriptomics, proteomics and glycomics) have emerged as powerful approaches to profile at large scale, and often in an unbiased manner, the biomolecular landscape of cell and tissue states. However, to extract meaningful information and generate novel hypotheses, we need to develop comprehensive annotations and analytical methods to mine these complex inputs. Hence, to study the ECM using ‘-omic’ technologies, we first needed a compendium of all potential ECM components. Using de novo sequence analysis and unique features of ECM proteins, such as the presence of a signal peptide and of characteristic protein domains and motifs (Gebauer and Naba, 2020; Naba et al., 2012b, 2016), we have predicted the ‘matrisome’ of multiple organisms, including human (Naba et al., 2012a), mouse (Naba et al., 2012a), zebrafish (Nauroy et al., 2018), fruit fly (Davis et al., 2019) and nematode (Teuscher et al., 2019). We further classified matrisome genes into: (1) ‘core matrisome’ genes, which are the genes encoding structural components of the ECM including ECM glycoproteins, collagens and proteoglycans, and (2) ‘matrisome-associated’ genes, which are the genes encoding non-structural components of the ECM that either share structural similarities with core matrisome components (we termed these ‘ECM-affiliated proteins’) or are capable of modulating the structure (‘ECM regulators’) or signaling (‘secreted factors’) functions of the ECM proper (Table 1).

Table 1.

Composition of the matrisome of the five organisms integrated in Matrisome AnalyzeR


The matrisome lists have been deployed via different platforms to support data analysis, including the Molecular Signature Database (Subramanian et al., 2005; https://www.gsea-msigdb.org/gsea/msigdb), the Zebrafish Information Network (Bradford et al., 2022; https://zfin.org/) and FlyBase, the database of Drosophila Genes and Genomes (Gramates et al., 2022; https://flybase.org/). When used to annotate transcriptomic datasets, these matrisome lists have helped, for example, to identify the diverse cell populations expressing ECM genes in health and disease (Bergmeier et al., 2018; Etich et al., 2019; Nauroy et al., 2017; Pietilä et al., 2021; Wietecha et al., 2020) and to identify networks of ECM genes characteristic of disease stages or of prognostic value (Izzi et al., 2018). When used to annotate proteomic datasets, these lists have enabled the definition of the ECM composition of tissues and organs across the pathophysiological spectrum (Naba, 2023; Randles et al., 2017; Shao et al., 2023).

To facilitate the use of the matrisome classification, we previously developed a web application capable of handling human and murine proteomic datasets (Naba et al., 2017). That previous iteration required users to extensively format their input datasets before upload, which hindered its adoption for rapidly growing methodologies such as single-cell RNA-seq (scRNA-seq). Here, we report the development of Matrisome AnalyzeR, an augmented suite of versatile tools that includes a web-based Shiny application (https://sites.google.com/uic.edu/matrisome/tools/matrisome-analyzer) and a companion R package (https://github.com/Matrisome/MatrisomeAnalyzeR). The new intuitive web-based application can be used by anyone to obtain the annotation, classification and tabulation of matrisome molecules from any -omic dataset (e.g. genomic, transcriptomic, proteomic). In the Matrisome AnalyzeR application, results appear on screen in seconds and change dynamically in response to user actions, through a user-friendly, point-and-click interface requiring no programming knowledge. The companion Matrisome AnalyzeR package is available to more advanced users interested in processing larger files (>100 MB) and provides additional data visualization options and possibilities for integration with more complex pipelines. In their current versions, the Matrisome AnalyzeR web application and R package are capable of processing data from the following organisms: Homo sapiens (human), Mus musculus (mouse), Danio rerio (zebrafish), Drosophila melanogaster (fruit fly) and Caenorhabditis elegans (roundworm).

The web-based Matrisome AnalyzeR Shiny application

Data input

Fig. 1A illustrates the data input process. Matrisome AnalyzeR can handle a variety of data files including tab- or comma-separated (.tsv, .txt, .csv, .tabular) files as well as raw Skyline (.sky) proteomic files and R Data Serialization (.rds) files. Files must include column headers and must not exceed 100 MB. If a data file exceeds this limit, we recommend using the Matrisome AnalyzeR package (see below). Importantly, thresholding (such as excluding peptides or proteins not meeting a given false-discovery rate, or proteins detected with fewer than two peptides in proteomic datasets) should be performed prior to inputting datasets to Matrisome AnalyzeR.
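The thresholding step mentioned above could be sketched as follows. This is a minimal, illustrative Python example and not part of Matrisome AnalyzeR: the column names (`Gene`, `UniquePeptides`, `FDR`, `Sample1`), the threshold values and the data rows are all assumptions made up for the illustration.

```python
import csv
import io

# Made-up example export; the column names are hypothetical.
RAW = """Gene,UniquePeptides,FDR,Sample1
COL1A1,12,0.001,340
FN1,1,0.002,15
MMP2,4,0.080,22
ACTB,9,0.004,510
"""


def prefilter(rows, min_peptides=2, max_fdr=0.01):
    """Keep rows that meet both thresholds (illustrative default values)."""
    kept = []
    for row in rows:
        enough_peptides = int(row["UniquePeptides"]) >= min_peptides
        confident = float(row["FDR"]) <= max_fdr
        if enough_peptides and confident:
            kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(RAW)))
filtered = prefilter(rows)
kept_genes = [row["Gene"] for row in filtered]
print(kept_genes)
```

Here FN1 is dropped because it is supported by a single peptide and MMP2 because its false-discovery rate exceeds the cut-off; only the remaining rows would then be written out and uploaded.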

Fig. 1.

The web-based Matrisome AnalyzeR Shiny application interface. (A) Home page of the Matrisome AnalyzeR web application (https://sites.google.com/uic.edu/matrisome/tools/matrisome-analyzer) displaying input parameter options and output files. (B) Running the ‘annotate+analyze workflow’ using Table S1 as input returns bar graphs (or ‘matribars’) representing the total numbers of matrisome molecules (here, proteins) classified according to matrisome divisions (left panel) and matrisome categories (right panel) across the entire dataset, as well as a searchable and customizable table (arrows).


To help users familiarize themselves with the functionalities of the Matrisome AnalyzeR app, we are providing a gallery of test files accessible via the Matrisome AnalyzeR page of the Matrisome Project website (https://sites.google.com/uic.edu/matrisome/tools/matrisome-analyzer). These test files are also available for download via the web-based Shiny application (Fig. 1A). These and additional examples are also included with the R package available on GitHub (https://github.com/Matrisome/MatrisomeAnalyzeR; see below).

For illustration purposes, we will use a label-free quantitative proteomic dataset adapted from one of our previous studies (Renner et al., 2022) that includes, for each protein entry and sample, total spectral counts, unique spectral counts and unique peptide numbers (Table S1).

Upon file upload, Matrisome AnalyzeR will automatically recognize the number format, but we encourage using dots, and not commas, for decimals and avoiding thousands separators. Matrisome AnalyzeR will also automatically populate the first box with column headers. Users will be asked to select, from the next two drop-down menus, the column containing the molecule identifiers to be used for the annotation and the species of interest. The tool is currently designed to accept gene symbols, NCBI gene (formerly Entrez Gene) and UniProt IDs (The UniProt Consortium, 2023) for all species. Additionally, Matrisome AnalyzeR accepts Ensembl Gene IDs for human and murine datasets, ZFIN IDs for zebrafish datasets, FlyBase IDs for Drosophila datasets, and WormBase IDs and Common Gene Names for C. elegans datasets. If no identifiers map to the application's database, an error message will prompt users to review their input choices. After input selection, users will then select the workflow to process their data. Help buttons have been implemented to further facilitate data input (Fig. 1A).

Data annotation

The ‘Annotate’ workflow annotates the input file with matrisome divisions (i.e. core matrisome, matrisome-associated or non-matrisome) and categories (i.e. ECM glycoproteins, collagens, proteoglycans, ECM-affiliated, ECM regulators, secreted factors or non-matrisome). The output provides a .csv file that corresponds to the original input file, where column A lists the identifiers used for the annotation in alphabetical order, column B, the ‘Annotated Matrisome Division’, and column C, the ‘Annotated Matrisome Category’ (Table S2). To help users identify the nature of non-matrisome components present in their samples, we are also providing Gene Ontology annotations on Cellular Components (GO:CC) as part of the annotation workflow (Table S2, column D).
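The logic of the annotation step can be illustrated with a short sketch. The Python code below is not the tool's implementation: it assumes a tiny, hand-written lookup table (`TOY_MATRISOME`) in place of the curated matrisome database, and the function name `annotate` is hypothetical. It reproduces the layout described above, with the identifier first, followed by the annotated division and category, rows sorted alphabetically, and unmatched identifiers labeled as non-matrisome.

```python
# Toy excerpt of a lookup: gene symbol -> (division, category).
# Illustrative only; the real tool relies on its own curated database.
TOY_MATRISOME = {
    "COL1A1": ("Core matrisome", "Collagens"),
    "FN1": ("Core matrisome", "ECM Glycoproteins"),
    "DCN": ("Core matrisome", "Proteoglycans"),
    "MMP2": ("Matrisome-associated", "ECM Regulators"),
    "TGFB1": ("Matrisome-associated", "Secreted Factors"),
}
NON_MATRISOME = ("Non-matrisome", "Non-matrisome")


def annotate(rows, id_column):
    """Return rows with identifier, division and category first, sorted by identifier."""
    annotated = []
    for row in rows:
        division, category = TOY_MATRISOME.get(row[id_column], NON_MATRISOME)
        out = {
            id_column: row[id_column],
            "Annotated Matrisome Division": division,
            "Annotated Matrisome Category": category,
        }
        # Append the remaining input columns in their original order.
        out.update({k: v for k, v in row.items() if k != id_column})
        annotated.append(out)
    return sorted(annotated, key=lambda r: r[id_column])


rows = [
    {"Gene": "FN1", "Sample1": 15},
    {"Gene": "ACTB", "Sample1": 510},
    {"Gene": "COL1A1", "Sample1": 340},
]
annotated = annotate(rows, "Gene")
for row in annotated:
    print(row)
```

In this toy input, ACTB has no entry in the lookup and is therefore returned as non-matrisome, whereas COL1A1 and FN1 receive their division and category.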

The output table is also visible and browsable on the main page upon completion of the Annotate workflow (Fig. 1B). Users can customize the number of entries displayed in the main window and can search the table using the search box (Fig. 1B). In addition, the output includes a .pdf file with bar graphs (or ‘matribars’) representing the total numbers of matrisome molecules (e.g. genes and proteins) classified according to matrisome divisions and categories across the entire dataset (Fig. S1). These bar graphs are also displayed on the main page upon completion of the Annotate workflow (Fig. 1B). Note that the output can change dynamically in response to user actions, through the user-friendly point-and-click interface.

Data analysis

The ‘Annotate+Analyze’ workflow performs the annotation described above and then tabulates and sums the content of each numerical column in the input by matrisome divisions and categories. Here, the output is a .csv file in which each row corresponds to a matrisome classification, column A lists the ‘Matrisome Annotations’, and each subsequent column reports the tabulation of the numerical data according to these annotations (Table S3). This workflow allows users to evaluate, at a glance, the relative ECM content of each of their samples (e.g. number of reads if inputting RNA-seq data or number of peptides or spectra if inputting proteomic data, as shown in the example provided in Table S1). Users can then input these data in other statistical analysis software or data visualization software to pursue their analysis.

We caution users that not all tabulations might be relevant. For example, the test file provided contains representative proteomic data that list, in addition to quantitative metrics (e.g. total spectrum count or exclusive spectrum count), the molecular mass of each protein and protein identification probabilities. These are treated as numerical values if no unit (e.g. kDa or %) is appended and are thus tabulated by Matrisome AnalyzeR, although such sums are not meaningful.

Importantly, the Matrisome AnalyzeR application implements a strict session-specific data policy: data uploaded by users are not stored on our server and cannot leak across sessions. User data are purged upon user disconnection or at session timeout.

The Matrisome AnalyzeR package

For users familiar with R programming and wishing to analyze larger datasets (>100 MB, the limit imposed on data upload to the web application) or interested in additional data visualization options, as well as in the possibility of integrating matrisome annotation and analysis with other existing analysis pipelines, we have developed the Matrisome AnalyzeR R package, available at https://github.com/Matrisome/MatrisomeAnalyzeR.

The Matrisome AnalyzeR GitHub repository includes all the functions required to run the data processing workflow from the annotations to the tabulations as described above for the web application. Additional data visualization options are available such as donut charts (‘matrirings’), polar bar charts (‘matristars’) and alluvial charts (‘matriflows’). Fig. 2A shows examples of these visualization options for the data provided in the test file (Table S1).

Fig. 2.

Additional data visualization options using the Matrisome AnalyzeR package. (A) Upon running the matriannotate function on Table S1 as the input, users can obtain additional output files representing the data as a donut chart (matriring; left panel) or polar bar chart (matristar; right panel). (B) Donut charts (or matrirings), obtained using the Matrisome AnalyzeR package to analyze whole-exome sequencing data on four classes of breast cancer retrieved from the cBio portal, representing the 250 genes presenting the highest mutation frequencies in each breast cancer subtype and their classification into matrisome categories. (C) Polar bar charts (matristars), obtained using the Matrisome AnalyzeR package to analyze single-cell RNA-seq data from 2700 single peripheral blood mononuclear cells, represent the average expression level of matrisome and non-matrisome gene categories for each single-cell cluster, such as B cells (left panel) and platelets (right panel). Differences in gene expression levels between categories and across cell clusters are visualized through the length of each segment and height of the bar for each segment.


The GitHub repository also features additional case studies to demonstrate the breadth of Matrisome AnalyzeR. In a second example, we show how the Matrisome AnalyzeR package can be applied to the analysis of whole-exome sequencing data obtained from the cBioPortal (Cerami et al., 2012; Gao et al., 2013) to identify matrisome genes presenting a high mutation frequency across four different breast cancer subtypes. Processing of the dataset using Matrisome AnalyzeR results in donut charts representing, in our example, the 250 genes presenting the highest mutation frequencies for each breast cancer subtype and their classification into matrisome categories (Fig. 2B). By selecting the donut chart representation (or matriring), users can easily visualize the contribution of matrisome genes to the query and for example identify a switch from the predominantly ECM-affiliated-proteins-rich profile for invasive ductal carcinoma to a more ECM glycoproteins- and ECM regulators-rich profile for invasive lobular carcinoma (Fig. 2B).
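The quantity displayed by such a donut chart, the share of each matrisome category among the most frequently mutated genes, can be sketched as follows. This Python example is illustrative only: the function name `category_shares` is hypothetical, the mutation frequencies are invented toy values rather than cBioPortal data, and the lookup covers only a handful of genes.

```python
from collections import Counter

# Toy category lookup (illustrative excerpt, not the curated database).
TOY_CATEGORIES = {
    "FN1": "ECM Glycoproteins",
    "COL1A1": "Collagens",
    "MMP2": "ECM Regulators",
}

# Invented mutation frequencies, for illustration only.
toy_frequency = {"TP53": 0.30, "FN1": 0.20, "COL1A1": 0.15, "MMP2": 0.10, "ACTB": 0.05}


def category_shares(mutation_frequency, categories, top_n):
    """Fraction of the top_n most frequently mutated genes in each category."""
    top = sorted(mutation_frequency, key=mutation_frequency.get, reverse=True)[:top_n]
    counts = Counter(categories.get(gene, "Non-matrisome") for gene in top)
    return {category: n / len(top) for category, n in counts.items()}


shares = category_shares(toy_frequency, TOY_CATEGORIES, top_n=4)
print(shares)
```

With real data, `top_n` would be set to 250 as in the example above, and one such dictionary would be computed for each breast cancer subtype before plotting.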

In a third example, we used single-cell RNA-seq data obtained from 2700 single peripheral blood mononuclear cells publicly available from 10× Genomics and used in the Seurat tutorial (Stuart et al., 2019). Pre-processing of the datasets identified nine clusters corresponding to the following cell types: B cells, memory CD4 T cells, naïve CD4 T cells, CD8 T cells, CD14+ monocytes, FCGR3A+ monocytes, NK cells, dendritic cells and platelets. Tying the Seurat pipeline into Matrisome AnalyzeR enables the computation of the average expression of each gene for each single-cell cluster, and the display as polar bar charts (or matristars) allows users to easily visualize the different matrisome categories arranged in a polar coordinate system, with the differences between categories being visualized through the length of their segments and the height of their bars (Fig. 2C). Users can, at a glance, appreciate the differential matrisome gene expression pattern across the different cell clusters, with B cells having the lowest number of expressed ECM genes (Fig. 2C, left panel) and platelets expressing a larger number of ECM genes encoding proteins involved in clotting (Fig. 2C, right panel). The complete analyzed dataset is available on the home page of the GitHub repository.
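The per-cluster averaging described above can be sketched in a few lines. The Python example below is an assumption-laden illustration, not the Seurat or Matrisome AnalyzeR code: it assumes a two-step average (each gene is first averaged within a cluster, then the per-gene means are averaged within each category), the function name `cluster_category_means` is hypothetical, and the expression values are invented.

```python
from collections import defaultdict
from statistics import mean

# Toy lookup: VWF is an ECM glycoprotein; genes absent from it count as non-matrisome.
TOY_CATEGORIES = {"VWF": "ECM Glycoproteins"}

# Invented per-cell expression values with cluster labels.
cells = [
    {"cluster": "B cells", "expression": {"VWF": 0.0, "ACTB": 4.0}},
    {"cluster": "B cells", "expression": {"VWF": 0.0, "ACTB": 6.0}},
    {"cluster": "Platelets", "expression": {"VWF": 8.0, "ACTB": 2.0}},
    {"cluster": "Platelets", "expression": {"VWF": 6.0, "ACTB": 4.0}},
]


def cluster_category_means(cells, categories):
    """Average expression per (cluster, category), via per-gene cluster means."""
    # Step 1: collect the values of each gene within each cluster.
    per_gene = defaultdict(list)
    for cell in cells:
        for gene, value in cell["expression"].items():
            per_gene[(cell["cluster"], gene)].append(value)
    # Step 2: average the per-gene means within each category.
    per_category = defaultdict(list)
    for (cluster, gene), values in per_gene.items():
        category = categories.get(gene, "Non-matrisome")
        per_category[(cluster, category)].append(mean(values))
    return {key: mean(values) for key, values in per_category.items()}


averages = cluster_category_means(cells, TOY_CATEGORIES)
for key, value in sorted(averages.items()):
    print(key, value)
```

Each resulting value would correspond to one bar of a polar bar chart for the given cluster.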

Upon completion of the workflow, users can extend their data analysis by using the output of the matriannotate and matrianalyze workflows to conduct comparative statistical analysis using the programs of their choice.

The identification of genes or proteins belonging to the same functional compartment provides important information about the processes happening in cells and tissues and is a critical step in the analysis of large -omic datasets. Here, we report the deployment of a suite of versatile tools to annotate, classify and tabulate ECM molecules in a variety of -omic datasets. Our goal was to develop tools accessible to non-ECM and ECM specialists alike, as well as to novices and experts in big data analysis.

The current Matrisome AnalyzeR is designed to process data generated on the matrisomes of the five organisms the Naba laboratory and collaborators have predicted. In recent years, others have predicted the avian (Huss et al., 2019), planarian (Cote et al., 2019; Sonpho et al., 2021) and bovine (Listrat et al., 2023) matrisomes. It is our goal to test the robustness of these predictions and evaluate their adoption by the scientific community. Should the number of -omic datasets on samples from these organisms increase, we will release augmented versions of Matrisome AnalyzeR to include these organisms as well.

Importantly, the field of ‘matrisomics’ has significantly expanded in recent years, and we and others have developed additional tools to mine matrisomic datasets (Naba, 2023), such as MatrixDB, the database reporting ECM component interactions (http://matrixdb.univ-lyon1.fr/; Berthollier et al., 2021; Clerc et al., 2019), MatriNet, the database designed to explore network-scale changes in the ECM in pathophysiological conditions (https://www.matrinet.org/; Kontio et al., 2022) and the ECM proteomics database MatrisomeDB (https://matrisomedb.org; Shao et al., 2023). It is our goal to deploy, in the future, releases of Matrisome AnalyzeR that will create output that can directly be input to such databases to further advance ECM research and accelerate ECM biomarker discovery efforts.

Matrisome lists of model organisms

The lists of matrisome genes for the following model organisms were retrieved from their original publications: Homo sapiens (Naba et al., 2012a), Mus musculus (Naba et al., 2012a), Danio rerio (Nauroy et al., 2018), Drosophila melanogaster (Davis et al., 2019) and Caenorhabditis elegans (Teuscher et al., 2019). The lists are also available via the Matrisome Project website at https://sites.google.com/uic.edu/matrisome. The original gene identifiers were programmatically used to derive other general (NCBI gene, formerly Entrez Gene, and UniProt IDs) and species-specific identifiers [Ensembl Gene IDs for human and murine datasets, ZFIN IDs for zebrafish (Bradford et al., 2022), FlyBase IDs for Drosophila (Gramates et al., 2022), and WormBase IDs and Common Gene Names for C. elegans datasets (Davis et al., 2022)], using the annotation packages ‘org.Hs.eg.db’, ‘org.Mm.eg.db’, ‘org.Dr.eg.db’, ‘org.Ce.eg.db’ and ‘org.Dm.eg.db’. Finally, the retrieved IDs were manually reviewed and curated.

Input file format stipulation

The only formatting requirement for files uploaded to the Matrisome AnalyzeR application is that they should contain column headers in their top row. Matrisome AnalyzeR accepts tab- and comma-separated (.tsv, .txt, .csv, .tabular) as well as R Data Serialization (.rds) and proteomics Skyline (.sky) files, and can automatically recognize the number format, although we encourage using dots for decimals and avoiding thousands separators. The file size limit is 100 MB. If a file exceeds 100 MB, we recommend using the Matrisome AnalyzeR package. When using the Matrisome AnalyzeR package, the input must be a data.frame; otherwise, the function will stop and issue a warning.

Algorithms

The Matrisome AnalyzeR Shiny application and package are built with R (the R Project for Statistical Computing) and the Shiny framework (https://shiny.rstudio.com/), and share a common set of functions and ‘logic’. Users are expected to input a tabular dataset (typically, a high-throughput or -omic dataset) and identify a column with gene or protein identifiers and species information. Upon inputting the information, the first function of the pipeline (matriannotate) compares the input against a large database of matrisome annotations including gene symbols, NCBI gene (formerly Entrez Gene) and UniProt IDs for all species, as well as species-specific annotations such as Ensembl Gene IDs for human and murine datasets, ZFIN IDs for zebrafish datasets, FlyBase IDs for Drosophila datasets, and WormBase IDs and Common Gene Names for C. elegans datasets. Matching gene, protein or other IDs are then annotated with matrisome divisions and categories (Naba et al., 2012a,b), and non-matching values are returned as ‘non-matrisome’.

The output is organized to have the gene, protein or ID in the first column, followed by the annotated matrisome divisions, annotated matrisome categories and the rest of the columns from the input file in their original order. This output is the base for the second function of the pipeline, matrianalyze, which takes in any numerical value in the dataset and sums them column-wise and by matrisome annotation. Note that formats not directly coercible (e.g. percentages) will be excluded. The result is a per-column (typically, per-sample) table of the quantity (e.g. number of reads, protein abundance and spectral counts) of any matrisome division and category across the entire dataset, which can be further used, for example, for statistical testing. The results from the matriannotate function are also the base for the graphical functions of both the application and package.
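The tabulation step can be illustrated with a minimal sketch. The Python code below is not the matrianalyze function: the helper names (`to_number`, `tabulate`), the column names and the values are assumptions made for illustration. It shows the behavior described above, in which every directly coercible numerical column is summed per annotation, whereas values such as percentages are skipped.

```python
from collections import defaultdict


def to_number(value):
    """Return a float when the value is directly coercible, else None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def tabulate(rows, annotation_column):
    """Sum every coercible column, grouped by the chosen annotation column."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        group = row[annotation_column]
        for column, value in row.items():
            number = to_number(value)
            if number is not None:
                totals[group][column] += number
    return {group: dict(columns) for group, columns in totals.items()}


# Illustrative values; "Mass" is unitless and therefore summed,
# whereas "Probability" carries a % sign and is skipped.
rows = [
    {"Gene": "COL1A1", "Division": "Core matrisome", "Sample1": "340", "Mass": "139", "Probability": "99%"},
    {"Gene": "FN1", "Division": "Core matrisome", "Sample1": "15", "Mass": "263", "Probability": "95%"},
    {"Gene": "MMP2", "Division": "Matrisome-associated", "Sample1": "22", "Mass": "74", "Probability": "97%"},
]
summary = tabulate(rows, "Division")
print(summary)
```

The summed `Mass` column shows why not every tabulation is meaningful: the sketch, like the tool, cannot tell a quantitative metric from any other unitless number, so users should disregard such columns in the output.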

Output file format

In the Matrisome AnalyzeR web application, the output on screen comprises a graphical and a tabular part. The graphical part is a bar chart, internally produced with the library ggplot2 (https://github.com/tidyverse/ggplot2) and customized to apply the color codes assigned to matrisome divisions and categories independently of the molecule IDs and species.

The tabular part is a browsable, scrollable and searchable data table, internally produced with the library DT (https://rstudio.github.io/DT/). Upon completion of the matriannotate and/or matrianalyze functions, four download buttons appear in the navigation bar pointing to a single, zipped bundle including the tabular output in .csv format and the plot as a .pdf, or each of the outputs individually.

In the Matrisome AnalyzeR package, additional graphical functions are provided. These include donut charts (matrirings), polar bar charts (matristars) and Sankey/alluvial charts (matriflows). All graphs are internally produced with the library ggplot2, with the addition of ggalluvial for matriflows. All graphical functions plot to the screen by default, but this behavior can be changed by setting the ‘print.plot’ parameter to FALSE. In this case, the underlying ggplot2 objects are returned instead, allowing further customization and integration with other pipelines, for example, printing to non-standard graphical devices. All tabular results are returned as data.frame objects.

The authors would like to thank all the members of the Izzi and Naba laboratories for their feedback on Matrisome AnalyzeR and Monica Bassignana (www.monicabassignana.com) for her help with data visualization and the preparation of the graphs presented in the manuscript.

Author contributions

Conceptualization: V.I., A.N.; Methodology: P.B.P., V.I., A.N.; Software: P.B.P., V.I., A.N.; Validation: J.M.C.; Formal analysis: V.I.; Resources: P.B.P., J.M.C., V.I., A.N.; Data curation: J.M.C., V.I., A.N.; Writing - original draft: P.B.P., J.M.C., V.I., A.N.; Writing - review & editing: A.N.; Visualization: J.M.C., V.I., A.N.; Supervision: V.I., A.N.; Project administration: A.N.; Funding acquisition: V.I., A.N.

Funding

This work was supported in part by the National Institutes of Health (U01HG012680, R21CA261642 and R01CA232517 to A.N.) and by a start-up fund from the Department of Physiology and Biophysics of the University of Illinois Chicago (A.N.). This research is connected to the DigiHealth-project, a strategic profiling project at the University of Oulu (V.I.) and the Infotech Institute (V.I., P.B.P.). The project is supported by the Academy of Finland (DECISION 326291 to V.I.), the Cancer Foundation Finland (V.I.), the Finnish Cancer Institute, and K. Albin Johansson Cancer Research Fellowship fund (V.I.). Open Access funding provided by National Institutes of Health. Deposited in PMC for immediate release.

Bergmeier
,
V.
,
Etich
,
J.
,
Pitzler
,
L.
,
Frie
,
C.
,
Koch
,
M.
,
Fischer
,
M.
,
Rappl
,
G.
,
Abken
,
H.
,
Tomasek
,
J. J.
and
Brachvogel
,
B.
(
2018
).
Identification of a myofibroblast-specific expression signature in skin wounds
.
Matrix Biol.
65
,
59
-
74
.
Berthollier
,
C.
,
Vallet
,
S. D.
,
Deniaud
,
M.
,
Clerc
,
O.
and
Ricard-Blum
,
S.
(
2021
).
Building protein-protein and protein-glycosaminoglycan interaction networks using MatrixDB, the extracellular matrix interaction database
.
Curr. Protoc.
1
,
e47
.
Bradford
,
Y. M.
,
Van Slyke
,
C. E.
,
Ruzicka
,
L.
,
Singer
,
A.
,
Eagle
,
A.
,
Fashena
,
D.
,
Howe
,
D. G.
,
Frazer
,
K.
,
Martin
,
R.
,
Paddock
,
H.
et al. 
(
2022
).
Zebrafish information network, the knowledgebase for Danio rerio research
.
Genetics
220
,
iyac016
.
Cerami
,
E.
,
Gao
,
J.
,
Dogrusoz
,
U.
,
Gross
,
B. E.
,
Sumer
,
S. O.
,
Aksoy
,
B. A.
,
Jacobsen
,
A.
,
Byrne
,
C. J.
,
Heuer
,
M. L.
,
Larsson
,
E.
et al. 
(
2012
).
The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data
.
Cancer Discov.
2
,
401
-
404
.
Clerc
,
O.
,
Deniaud
,
M.
,
Vallet
,
S. D.
,
Naba
,
A.
,
Rivet
,
A.
,
Perez
,
S.
,
Thierry-Mieg
,
N.
and
Ricard-Blum
,
S.
(
2019
).
MatrixDB: integration of new data with a focus on glycosaminoglycan interactions
.
Nucleic Acids Res.
47
,
D376
-
D381
.
Cote
,
L. E.
,
Simental
,
E.
and
Reddien
,
P. W.
(
2019
).
Muscle functions as a connective tissue and source of extracellular matrix in planarians
.
Nat. Commun.
10
,
1592
.
Davis
,
M. N.
,
Horne-Badovinac
,
S.
and
Naba
,
A.
(
2019
).
In-silico definition of the Drosophila melanogaster matrisome
.
Matrix Biol. Plus
4
,
100015
.
Davis
,
P.
,
Zarowiecki
,
M.
,
Arnaboldi
,
V.
,
Becerra
,
A.
,
Cain
,
S.
,
Chan
,
J.
,
Chen
,
W. J.
,
Cho
,
J.
,
da Veiga Beltrame
,
E.
,
Diamantakis
,
S.
et al. 
(
2022
).
WormBase in 2022—data, processes, and tools for analyzing Caenorhabditis elegans
.
Genetics
220
,
iyac003
.
Dzamba
,
B. J.
and
DeSimone
,
D. W.
(
2018
).
Extracellular matrix (ECM) and the sculpting of embryonic tissues
.
Curr. Top. Dev. Biol.
130
,
245
-
274
.
Etich
,
J.
,
Koch
,
M.
,
Wagener
,
R.
,
Zaucke
,
F.
,
Fabri
,
M.
and
Brachvogel
,
B.
(
2019
).
Gene expression profiling of the extracellular matrix signature in macrophages of different activation status: relevance for skin wound healing
.
Int. J. Mol. Sci.
20
,
5086
.
Gao
,
J.
,
Aksoy
,
B. A.
,
Dogrusoz
,
U.
,
Dresdner
,
G.
,
Gross
,
B.
,
Sumer
,
S. O.
,
Sun
,
Y.
,
Jacobsen
,
A.
,
Sinha
,
R.
,
Larsson
,
E.
et al. 
(
2013
).
Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal
.
Sci. Signal.
6
,
pl1
.
Gebauer
,
J. M.
and
Naba
,
A
. (
2020
).
The matrisome of model organisms: from in-silico prediction to big-data annotation
. In
Extracellular Matrix Omics
(
ed.
S.
Ricard-Blum
), pp.
17
-
42
.
Cham
:
Springer International Publishing
.
Gramates
,
L. S.
,
Agapite
,
J.
,
Attrill
,
H.
,
Calvi
,
B. R.
,
Crosby
,
M. A.
,
dos Santos
,
G.
,
Goodman
,
J. L.
,
Goutte-Gattat
,
D.
,
Jenkins
,
V. K.
,
Kaufman
,
T.
et al. 
(
2022
).
FlyBase: a guided tour of highlighted features
.
Genetics
220
,
iyac035
.
Huss
,
D. J.
,
Saias
,
S.
,
Hamamah
,
S.
,
Singh
,
J. M.
,
Wang
,
J.
,
Dave
,
M.
,
Kim
,
J.
,
Eberwine
,
J.
and
Lansford
,
R.
(
2019
).
Avian primordial germ cells contribute to and interact with the extracellular matrix during early migration
.
Front. Cell Dev. Biol.
7
,
35
.
Hynes
,
R. O.
and
Naba
,
A.
(
2012
).
Overview of the matrisome—an inventory of extracellular matrix constituents and functions
.
Cold Spring Harb. Perspect. Biol.
4
,
a004903
.
Izzi
,
V.
,
Lakkala
,
J.
,
Devarajan
,
R.
,
Savolainen
,
E.-R.
,
Koistinen
,
P.
,
Heljasvaara
,
R.
and
Pihlajaniemi
,
T.
(
2018
).
Expression of a specific extracellular matrix signature is a favorable prognostic factor in acute myeloid leukemia
.
Leuk. Res. Rep.
9
,
9
-
13
.
Kontio
,
J.
,
Soñora
,
V. R.
,
Pesola
,
V.
,
Lamba
,
R.
,
Dittmann
,
A.
,
Navarro
,
A. D.
,
Koivunen
,
J.
,
Pihlajaniemi
,
T.
and
Izzi
,
V.
(
2022
).
Analysis of extracellular matrix network dynamics in cancer using the MatriNet database
.
Matrix Biol.
110
,
141
-
150
.
Listrat
,
A.
,
Boby
,
C.
,
Tournayre
,
J.
and
Jousse
,
C.
(
2023
).
Bovine extracellular matrix proteins and potential role in meat quality: First in silico Bos taurus compendium
.
J. Proteomics
279
,
104891
.
Naba
,
A.
(
2023
).
Ten years of extracellular matrix proteomics: accomplishments, challenges, and future perspectives
.
Mol. Cell. Proteomics
22
,
100528
.
Naba
,
A.
,
Clauser
,
K. R.
,
Hoersch
,
S.
,
Liu
,
H.
,
Carr
,
S. A.
and
Hynes
,
R. O.
(
2012a
).
The matrisome: in silico definition and in vivo characterization by proteomics of normal and tumor extracellular matrices
.
Mol. Cell. Proteomics
11
,
M111.014647
.
Naba, A., Hoersch, S. and Hynes, R. O. (2012b). Towards definition of an ECM parts list: an advance on GO categories. Matrix Biol. 31, 371-372.
Naba, A., Clauser, K. R., Ding, H., Whittaker, C. A., Carr, S. A. and Hynes, R. O. (2016). The extracellular matrix: Tools and insights for the “omics” era. Matrix Biol. 49, 10-24.
Naba, A., Pearce, O. M. T., Del Rosario, A., Ma, D., Ding, H., Rajeeve, V., Cutillas, P. R., Balkwill, F. R. and Hynes, R. O. (2017). Characterization of the extracellular matrix of normal and diseased tissues using proteomics. J. Proteome Res. 16, 3083-3091.
Nauroy, P., Barruche, V., Marchand, L., Nindorera-Badara, S., Bordes, S., Closs, B. and Ruggiero, F. (2017). Human dermal fibroblast subpopulations display distinct gene signatures related to cell behaviors and matrisome. J. Invest. Dermatol. 137, 1787-1789.
Nauroy, P., Hughes, S., Naba, A. and Ruggiero, F. (2018). The in-silico zebrafish matrisome: a new tool to study extracellular matrix gene and protein functions. Matrix Biol. 65, 5-13.
Pietilä, E. A., Gonzalez-Molina, J., Moyano-Galceran, L., Jamalzadeh, S., Zhang, K., Lehtinen, L., Turunen, S. P., Martins, T. A., Gultekin, O., Lamminen, T. et al. (2021). Co-evolution of matrisome and adaptive adhesion dynamics drives ovarian cancer chemoresistance. Nat. Commun. 12, 3904.
Randles, M. J., Humphries, M. J. and Lennon, R. (2017). Proteomic definitions of basement membrane composition in health and disease. Matrix Biol. 57-58, 12-28.
Renner, C., Gomez, C., Visetsouk, M. R., Taha, I., Khan, A., McGregor, S. M., Weisman, P., Naba, A., Masters, K. S. and Kreeger, P. K. (2022). Multi-modal profiling of the extracellular matrix of human fallopian tubes and serous tubal intraepithelial carcinomas. J. Histochem. Cytochem. 70, 151-168.
Shao, X., Gomez, C. D., Kapoor, N., Considine, J. M., Grams, C., Gao, Y. T. and Naba, A. (2023). MatrisomeDB 2.0: 2023 updates to the ECM-protein knowledge database. Nucleic Acids Res. 51, D1519-D1530.
Sonpho, E., Mann, F. G., Levy, M., Ross, E. J., Guerrero-Hernández, C., Florens, L., Saraf, A., Doddihal, V., Ounjai, P. and Sánchez Alvarado, A. (2021). Decellularization enables characterization and functional analysis of extracellular matrix in planarian regeneration. Mol. Cell. Proteomics 20, 100137.
Stuart, T., Butler, A., Hoffman, P., Hafemeister, C., Papalexi, E., Mauck, W. M., Hao, Y., Stoeckius, M., Smibert, P. and Satija, R. (2019). Comprehensive integration of single-cell data. Cell 177, 1888-1902.e21.
Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., Paulovich, A., Pomeroy, S. L., Golub, T. R., Lander, E. S. et al. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. PNAS 102, 15545-15550.
Teuscher, A. C., Jongsma, E., Davis, M. N., Statzer, C., Gebauer, J. M., Naba, A. and Ewald, C. Y. (2019). The in-silico characterization of the Caenorhabditis elegans matrisome and proposal of a novel collagen classification. Matrix Biol. Plus 1, 100001.
The UniProt Consortium (2023). UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 51, D523-D531.
Theocharis, A. D., Manou, D. and Karamanos, N. K. (2019). The extracellular matrix as a multitasking player in disease. FEBS J. 286, 2830-2869.
Walma, D. A. C. and Yamada, K. M. (2020). The extracellular matrix in development. Development 147, dev175596.
Wietecha, M. S., Pensalfini, M., Cangkrama, M., Müller, B., Jin, J., Brinckmann, J., Mazza, E. and Werner, S. (2020). Activin-mediated alterations of the fibroblast transcriptome and matrisome control the biomechanical properties of skin wounds. Nat. Commun. 11, 2604.

Competing interests

The authors declare no competing or financial interests.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.

Supplementary information