ABSTRACT
The study of morphological modularity using anatomical networks is growing in recent years. A common strategy to find the best network partition uses community detection algorithms that optimize the modularity Q function. Because anatomical networks and their modules tend to be small, this strategy often produces two problems. One is that some algorithms find inexplicable different modules when one inputs slightly different networks. The other is that algorithms find asymmetric modules in otherwise symmetric networks. These problems have discouraged researchers to use anatomical network analysis and boost criticisms to this methodology. Here, I propose a node-based informed modularity strategy (NIMS) to identify modules in anatomical networks that bypass resolution and sensitivity limitations by using a bottom-up approach. Starting with the local modularity around every individual node, NIMS returns the modular organization of the network by merging non-redundant modules and assessing their intersection statistically using combinatorial theory. Instead of acting as a black box, NIMS allows researchers to make informed decisions about whether to merge non-redundant modules. NIMS returns network modules that are robust to minor variation and does not require optimization of a global modularity function. NIMS may prove useful to identify modules also in small ecological and social networks.
INTRODUCTION
Anatomical network analysis has recently emerged as a new framework to study anatomy quantitatively using tools from network theory (Rasskin-Gutman and Esteve-Altava, 2014). This approach first formalizes anatomical systems as network models, in which nodes represent individual anatomical elements (e.g. bones) and links represent pair-wise relations among them (e.g. articulations), and then quantifies their topological organization as a proxy to understand the biological features of the organism. Anatomical studies using network analysis have focused on comparing the development, function, and evolution of morphological systems, from invertebrates to vertebrates, including extant and extinct organisms (Dos Santos et al., 2017; Esteve-Altava et al., 2013a,b, 2018, 2019; Fernández et al., 2020; Fontanarrosa et al., 2020; Kerkman et al., 2018; Murphy et al., 2018; Ostachuk, 2019; Plateau and Foth, 2020; Saucède et al., 2015; Sookias et al., 2020).
Modularity is one of the most explored features of anatomical networks. In this context, a module is a group of nodes with more connections among them than to other nodes outside the group. We can classify network-based modules as organizational modules: “Organizational morphological modules refer explicitly to the interactions postulated to be important in organismal construction or activity. They invite observation or description in terms of mechanistic relations, whether variation among organisms is present or not. As such, organizational modules are units of stability” (Eble, 2005). The vertebrate skull has attracted most of the research on morphological modularity (Esteve-Altava, 2017b). Here, anatomical network studies have tried to unveil the topological units of organization to understand how they vary in evolution (Arnold et al., 2017; Esteve-Altava et al., 2015; Plateau and Foth, 2020; Werneburg et al., 2019), but also in normal and pathological development (Diogo et al., 2019; Esteve-Altava and Rasskin-Gutman, 2015). Moreover, some other studies have applied network methods to identify modules of landmark-based morphometric correlations rather than topological relations (Ivan Perez et al., 2009; Suzuki, 2013). A connection between network-based organizational modules and shape covariation modules would be likely (Esteve-Altava et al., 2013a,b), but it has not been elucidated yet. A more thoughtful discussion about the differences between anatomical network modules and shape co-variation modules in terms of concepts, methods, and limitation has been offered elsewhere (Esteve-Altava, 2017a).
A common strategy to delimit modules in anatomical networks is to use a community detection algorithm that optimize a quality function, which measures how well divided is the network into modules compared to a random model. The most popular optimization function is the modularity Q (Newman and Girvan, 2004). Q measures the number of links within modules compared with a random distribution of links between all nodes regardless of modules. However, Q has a resolution limit for smaller modules that depends on the size of the network and the possibility to distant nodes to connect (Fortunato and Barthelemy, 2007). As a result, optimization algorithms may fail to identify small modules in relatively larger networks. This is because in large networks connections cannot be distributed purely at random. Although anatomical networks are smaller compared to other natural networks (e.g. genetic, ecological), the resolution limit applies here because geometry and development constrain connectivity (Esteve-Altava and Rasskin-Gutman, 2014). Despite their limitations, optimization algorithms are persistently used to study anatomical networks because they are easy to apply and because they are readily available through build-in packages and programs like igraph (Csardi and Nepusz, 2006). Other resolution problems originate from the fact that many nodes have spread their links evenly between modules. Because most nodes tend to have a small number of connections, this makes difficult to delimit modules solely by connections inside and outside of modules.
Global optimization and resolution limits produce two problems in many modularity studies of anatomical networks. The first problem is that algorithms ‘inexplicably’ find different modules when we input slightly different networks. This happens, for example, when one fixes a minor mistake in a network model only to find out that in the revised network the modularity output has changed. It also happens when comparing similar networks with intraspecific variation (Esteve-Altava and Rasskin-Gutman, 2015) for which we would expect no major changes of modularity. The second problem is that algorithms sometimes find asymmetric modules in otherwise symmetric networks, which challenges basic biological assumptions about symmetry of bilateral body parts (e.g. see discussion in the open peer-review report in Plateau and Foth, 2020). These problems can hinder anatomical network analyses and they fuel criticisms to this methodology from people unaware of these limitations.
Here I propose a strategy to bypass resolution and sensitivity limitations, using a bottom-up approach that frames the problem of finding modules in anatomical networks at the level of individual nodes: a node-based informed modularity strategy (NIMS). Starting with the local modularity around every single node, NIMS returns the modular organization of the entire network by merging non-redundant modules and statistically assessing their degree of intersection. Instead of acting as a black box, NIMS allows to make informed decisions about whether to merge non-redundant modules or keep them as separated modules, based on the significance and the size of the intersection. NIMS returns network modules that are robust to minor variations and avoids common pitfalls of other approaches by not requiring the optimization a global modularity function. The use of NIMS is demonstrated in four skull networks (Fig. 1): human, human with intraspecific variation, tinamou, and crocodile.
RESULTS
Modularity of the ‘type’ human skull network
NIMS returns four non-redundant node-level modules, labeled accordingly as occipital, sphenoidal, frontal, and ethmoidal. Note that the names of the node-level modules are taken from the first node, which local module is non-redundant and it is merely a label. The statistical analysis of the intersections of the four modules returned two significant overlaps (Fig. 2), which supports merging node-level modules Ethmoidal+Frontal (labeled facial) and Sphenoidal+Occipital (labeled cranial). The two modules overlap in the frontal, palatines, and vomer (A/N: modules that overlap are called covers). The facial cover groups the ethmoid, frontal, lacrimals, maxillas, inferior nasal conchae, nasals, palatines and vomer. The cranial cover groups the frontal, occipital, palatines, parietals, sphenoid, temporals, vomer, and zygomatics. Statistical evaluation of the two covers confirms they meet the standard definition of module, as a group of nodes with more links in than out the module (facial cover: W=161.5, P-value=3.12e-05; cranial cover: W=125.5, P-value=8.38e-04).
The composition of the two covers resembles that of the anterior and posterior modules reported by other studies using Q modularity optimization (Esteve-Altava et al., 2013a,b), but the overlapping bones differ from studies using another local optimization approach (Esteve-Altava, 2017a). Here the palatal region connecting the base of the face to the base of the neurocranium is identified as the overlap of the two covers, while in previous reports it was the zygomatic bones who played this role. However, the role of the frontal bone as a bridge between the face and vault is consistently identified using NIMS.
Modularity of the human skull network with natural variation
NIMS returns also four non-redundant node-level modules, here labeled as Wormian, Sphenoidal, Zygomatic Right, and Ethmoidal. The statistical analysis of their intersection returns two statistically significant merges: Ethmoidal+Zygomatic Right and Sphenoidal+Wormian (Fig. 3). The modules resulting from merging the two pairs of node-level modules are the same than for the ‘type’ human skull network. Likewise, both covers meet the standard definition of module (facial cover: W=160, P-value=4.42e-05; cranial cover: W=146.5, P-value=6.29e-04). Therefore, NIMS returns robust modules even in the presence of natural variation on anatomical networks.
Modularity of the tinamou skull network
As expected, NIMS returned one single node-level module grouping the six bones after removing redundancies (all single node-level grouped the same bones). Therefore, no further analysis was required.
Modularity of the crocodile skull network
NIMS returns seven non-redundant node-level modules, labeled following nodes' name as R.Postorbital, L.Postorbital, R.Squamosal, Frontal, R.Vomer, L.Vomer, and Pterygoid. The statistical analysis of the intersections for the seven modules returned two significant overlaps (Fig. 4), which indicate that we can merge these node-level modules. One merging joined R.Vomer+L.Vomer, which share 20 of their 21 nodes. Another merging joined R.Postorbital+L.Postorbital+R.Squamosal, which share 15 of their 19, 19, and 17 nodes, respectively. Finally, node-level modules Frontal and Pterygoid do not significantly overlap with other modules in a way that improves other merges. Note that a potential merging Pterygoid+R.Postorbital+L.Postorbital+R.Squamosal has not only a lower statistical support (as compared to the merge without Pterygoid) but also reduces the overall size of the overlap from 15 to 8 nodes. Consequently, the merging excluding the Pterygoid is preferred.
After the two merges, we ended up with four network covers, labeled as frontal, postorbital, pterygoid, and vomer. Frontal module covers the frontal, lacrimals, laterosphenoids, nasals, parietal, postorbitals, prefrontals, squamosals, and supraoccipital. Postorbital module covers the basioccipital, basisphenoid, ectopterygoids, jugals, laterosphenoids, otoccipitals, parietal, postorbitals, prootics, pterygoid, quadrates, quadratojugals, squamosals, and supraoccipital. Pterygoid module covers the basisphenoid, laterosphenoids, palatine, prootics, pterygoid, quadrates, quadratojugals, and vomers. Vomer module covers the ectopterygoids, frontal, jugals, lacrimals, maxillas, nasals, palatine, postorbitals, prefrontals, premaxillas, quadratojugals, and vomers. The overlap between covers range from two to ten bones (see Supplementary Material).
Only two covers, vomer and postorbital, show some bones that are exclusive of these modules and do not overlap. Despite having four modules, the skull of the crocodile is highly integrated due to a substantial overlap. It is not surprising that whole-network optimization approaches often return asymmetries since overlapping bones can be placed equally well in different modules. However, the four covers reported here show no asymmetries; additionally, all of them have a bilateral symmetry and group the same bones from the left and right sides. Finally, statistical evaluation of the four covers confirms they meet the standard definition of module (frontal cover: W=191, P-value=3.41e-04; postorbital cover: W=522.5, P-value=4.6e-09; pterygoid cover: W=142.5, P-value=1.37e-03; vomer cover: W=467, P-value=3.84e-08).
DISCUSSION
NIMS successfully resolved the modular organization of the four skull networks used as example of different types of anatomical networks (Fig. 5). It found the same modules in ‘type’ networks and in networks including intraspecific variation. Note, however, that robustness against intraspecific variation may not always be the desirable result; for example, if the objective is to assess how variation affects modularity. In these cases, NIMS could be useful to establish the baseline to compare the modules identified by other, more sensitive methods. Finally, NIMS also identified absence of modular organization due to small size and filters out artificial asymmetries. By working under supervision (Step 5), NIMS gives researchers back the control of the modularity analysis to decide the most biologically meaningful output.
Community detection algorithms that optimize global modularity will still have an important place in future research on anatomical networks modularity. These algorithms are faster and can be run unsupervised, which is important when analyzing many anatomical networks as part of a broader evolutionary study (e.g. Plateau and Foth, 2020) or when the actual modules are not the final end of the study (e.g. Esteve-Altava et al., 2019). NIMS can be used alongside other methods to clarify the boundaries of modules and assess biases (Fig. 6). For example, by providing a second assessment of network partitions that yield a low modularity Q value. Thus, helping researchers to determine whether unsatisfactory partitions come from an underlying strong overlap between two or more modules.
MATERIALS AND METHODS
Anatomical networks
Four anatomical networks were analyzed representing the ‘type’ adult human skull, a human skull with intraspecific variation, the adult skull of a tinamou, and the adult skull of a crocodile (Fig. 1).
The first network is the ‘type’ adult human skull and it has been analyzed in previous works (Esteve-Altava et al., 2013a,b, 2015), using different methods that have consistently identified two modules: one posterior, grouping the bones of the cranial vault and base; and one anterior, grouping the bones of the facial region. Using a local optimization method (Lancichinetti et al., 2011), which can find overlapping modules (covers), another study found that these two modules overlap in the frontal and zygomatic bones (Esteve-Altava, 2017a). However, results of this algorithm were sensitive to changes in internal arguments. The analysis of the ‘type’ human skull will serve to compare NIMS to the modules delimited by other algorithms. The expected output is that NIMS can identify the same two modules.
The second network is a modification of the first one that includes anatomical variation found in natural populations (Berry and Berry, 1967). It has an extra Wormian bone between the left parietal and the occipital. It also has different configurations of the pterion region on each side: on the left, the parietal contacts with the sphenoid, which prevents the contact between temporal and frontal; on the right, the temporal contacts with the frontal that prevents the contact between parietal and sphenoid. As a result, the network is asymmetric along the left-right axis for the number of links and bones. This network is also an example of normal variation that we find when studying actual skulls rather than type forms. The analysis of this network will show how NIMS deals with intraspecific variation naturally present in anatomical systems, and whether the resulting modules are equivalent to those of the ‘type’ skull network. The expected output is that NIMS has a robust behavior for small variations and finds a modular organization that is consistent with that of the type network.
The third network is the skull of an adult tinamou Nothura maculosa, which consist of only six bones due to a process of fusion during postnatal development. Because of the small size of this network, preliminary reports (Lee et al., 2020) found either that this network is not modular (i.e. they find one module) or a trivial partition of the network. The analysis of this network will show how NIMS deals with small, non-modular networks. The expected output is that NIMS find one module.
The fourth network is the adult skull of the crocodile Crocodylus moreletii, which consist of a skull with many unpaired bones typical of non-avian archosaurs. For anatomical networks with many bilaterally symmetric nodes, community detection algorithms by optimization sometimes return oddly asymmetric modules (e.g. a module grouping only right-side nodes plus one left bone) or trivially asymmetric by placing unpaired nodes with a one-side module when it is equally connected to the left and right sides (Lee et al., 2020; Plateau and Foth, 2020). The analysis of this network will show how NIMS deals with larger networks with many unpaired bones. The expected output is that NIMS will return a modular organization without asymmetries.
A NIMS
NIMS builds upon existing functions implemented in R packages (R Core Team, 2019). This has been done for convenience because R is a popular programming language for anatomical network analysis and among biologists. However, its implementation is independent of the programming language used and other implementations are possible.
NIMS has six sequential steps:
Get all node-level modules
1. Remove extra node-level modules with same elements
2. Remove node-level modules with all elements included in a larger module
3. Test multiple-set overlaps
4. Merge node-level modules (top P-value/max overlap size)
5. Test cohesion for network-level modules (Wilcoxon test)
Step 1 delimits node-level modules for every node in the network. This can be done using the function cluster_spinglass in the package igraph (Csardi and Nepusz, 2006), setting the argument vertex sequentially to every node of the network. This function finds node-level modules based on their difference between realized and expected internal links (cohesion) and between realized and expected external links (adhesion) (Reichardt and Bornholdt, 2006). Alternatively, we could use any algorithm that returns node-level modules, even those based on criteria.
Steps 2 and 3 cross-check node-level modules to filter out modules with the same nodes (duplicated) and modules entirely included as part of a larger module (nested). This produces a list of non-redundant node-level modules. Finding only one module after Steps 2 and 3 means that the nodes of the network are fully integrated and that the network has no modules.
Step 4 evaluates the intersection of non-redundant node-level modules to reveal potential merges of highly overlapping modules. This procedure is co-opted from research on gene sets functional enrichment analyses (Subramanian et al., 2005), extended to perform multiple-set comparisons (for details see Wang et al., 2015). Step 4 returns the size of the overlap (how many nodes) for all combinations of node-level modules and the statistical significance of their overlap compared to random expectations. In R, we can use the function supertest in the package SuperExactTest (Wang et al., 2015), which performs multi-set enrichment analysis and provides tools for efficiently plotting the results so we can inspect them visually. To determine when a combination of modules is significant, we must first account for multiple testing using an appropriate P-value correction, such as the Bonferroni's correction (Miller, 1966). Any combination of modules above the Bonferroni corrected P-value is well supported statistically. P-value thresholds can be incorporated in the visualization of Step 4 to facilitate the visualization of supported merges.
Step 5 requires the researcher to make an informed decision on whether to merge two or more overlapping modules (also called covers), based on the summary statistics from Step 4. As a rule, we can merge modules into a single module when the overlap is large compared to the number of nodes of each node-level module and when the merging is statistically well supported. Similar well supported merges are possible. For these cases, researchers will have to decide and justify the rationale of the decision (e.g. by considering prior knowledge on the development and/or functioning of the resulting modules) or consider all well supported merges equally valid.
Finally, Step 6 will test whether the final modules fulfil the definition of network module as a group of nodes with more links inside that outside the module. For this we can use a paired Wilcoxon signed rank test of the null hypothesis that in-module links are equal than the out-module links (not a module), against the alternative hypothesis that in-module links are greater than out-module links (a module). This test is implemented in the function wilcox.test of the stats package (R Core Team, 2019).
The next sections demonstrate the application of NIMS to resolve the modular organization of four different anatomical networks. The Supplementary Material includes all the data and R code to reproduce these examples, as well as an additional example using the benchmark social network Zachary's karate club (Newman and Girvan, 2004).
Acknowledgements
I thank Roland Sookias and Dimitri Neaux for useful comments during the revision of this work.
Footnotes
Author contributions
Conceptualization: B.E.-A.; Methodology: B.E.-A.; Validation: B.E.-A.; Formal analysis: B.E.-A.; Writing - original draft: B.E.-A.; Writing - review & editing: B.E.-A.; Funding acquisition: B.E.-A.
Funding
B.E.-A. has received financial support through the Postdoctoral Junior Leader Fellowship Programme from ‘la Caixa’ Banking Foundation [LCF/BQ/LI18/11630002].
Data availability
Data and code are included in the Supplementary Material.
References
Competing interests
The author declares no competing or financial interests.