A recent report by Vogel et al. describes a bioinformatic analysis of immunoglobulin superfamily (IgSF) members in Caenorhabditis elegans and Drosophila melanogaster (Vogel et al., 2003). We have previously published reports presenting genome-sequence-driven analyses of worm and fly IgSF members (Hutter et al., 2000; Hynes and Zhao, 2000; Aurelio et al., 2002). In Vogel et al. (Vogel et al., 2003), these papers are either not cited (Hynes and Zhao, 2000; Aurelio et al., 2002) or their content is essentially ignored (Hutter et al., 2000), although they cover much of the same ground as the Vogel et al. paper. Furthermore, the Vogel et al. paper contains errors and misclassifications of IgSF family members. Given the high degree of interest in this superfamily, we wish to correct the errors and clarify any misconceptions caused by the conflation of structural features with functional characteristics by Vogel et al.
Errors and misinterpretations in the data set
Although initial genome-wide analyses of protein families are rarely free of errors, we think that such errors should not be taken lightly in the context of a refinement of previously published analyses. We list below the errors that we noted.
UNC-73 is shown in figure 3 of the paper as a secreted protein. This is incorrect. It is well established in both worms and flies (where the protein is called Trio) that this protein is an intracellular signal transduction molecule with nucleotide exchange factor activity (e.g. Bateman and Van Vactor, 2001; Kubiseski et al., 2003; Newsome et al., 2000; Steven et al., 1998).
UNC-89 is shown in figure 3 as being an extracellular matrix protein. However, it is a well-documented, intracellular muscle protein (e.g. Flaherty et al., 2002; Lin et al., 2003; Mackinnon et al., 2002).
The F59F3.1, F59F3.5, T17A3.1 and T17A3.8 proteins have been published and are called VER proteins (Popovici et al., 1999; Popovici et al., 2002), but these publications are not cited by the authors. Moreover, in figure 2, three of the four proteins are omitted.
The T17A3.10 protein is incorrectly listed in the `Cell surface - kinases and phosphatases' section of table 3. Although its extracellular domain is similar to that of the VER receptor tyrosine kinases, T17A3.10 has neither a kinase nor a phosphatase domain.
Oig proteins, secreted 1-Ig domain proteins (Aurelio et al., 2002), are not shown in figure 3. They are listed in table 2, but without appropriate citation of the prior annotation.
Beat Ia should be set apart from other Beat proteins in figure 3 as it has an additional domain (a cysteine knot domain) (Pipes et al., 2001), not shown by the authors.
The authors use unpublished information to state that K07E12.1 corresponds to DIG-1 (table 2), but they fail to acknowledge their source of information. Moreover, K07E12.1/DIG-1 protein has a plethora of domains characteristic of extracellular proteins, such as Sushi, EGF and vWF domains, and thus it should be classified as an extracellular protein, not as a protein of unknown cellular location.
The authors are not consistent in their placement of molecules into distinct classes. For example, they define `Cell surface proteins I' as transmembrane or membrane-attached proteins (see p. 6320), yet, in table 2, list the secreted ZIG proteins ZIG-2 to ZIG-8 in the `Cell surface proteins I' category. By contrast, in figure 3, the same ZIG proteins are shown as secreted.
In figure 3 the structure for perlecan (UNC-52) is incorrect. As previously published, there are 17 Ig domains, two spaced near the amino end and then a cluster of 15 (reviewed by Rogalski et al., 2001). There should also be laminin G repeats, which are not shown in figure 3.
The C. elegans Semaphorin 2a gene in table 3 (mab-20/Y71G12B.20) should be annotated as an experimentally characterized sequence (Roy et al., 2000).
The authors claim to have identified 19 new Ig-proteins in C. elegans, as compared with their own previous analysis. Five of those (Y54G2A.25, C09E7.3, T19D12.7, F28E10.2 and T17A3.10) had been identified earlier by Hutter et al. or by Aurelio et al., and thus cannot be termed `new' proteins (Hutter et al., 2000; Aurelio et al., 2002).
Classification of IgSF proteins
IgSF proteins are classified in this paper according to their domain organization. Although this is a useful classification from a structural point of view, it has only limited implications for the functions of the proteins. Treating these structural classes as being equivalent to functional classes is incorrect as members from each class have been shown to have overlapping functions. For example, most members of the `Cell Surface I protein' class, classified as `cell adhesion proteins' by Vogel et al., can clearly serve as signaling molecules (e.g. L1, NCAM, Robo and DSCAM) (reviewed by Rougon and Hobert, 2003). To illustrate one example, the Robo IgSF protein is classified as a cell adhesion protein by Vogel et al., yet it has clearly been demonstrated to be a signaling molecule acting through the recruitment of intracellular signal transducing molecules, such as kinases and nucleotide exchange factors (reviewed by Araujo and Tear, 2003; Dickson, 2002; Korey and Van Vactor, 2000; Patel and Van Vactor, 2002; Rougon and Hobert, 2003). Moreover, many proteins in Class I are not sufficiently well characterized functionally to support their classification as `cell adhesion proteins'. Also, the vast majority of Class III molecules are not characterized functionally and may well have structural/adhesive roles, rather than signaling roles, as the authors imply. Consequently, conclusions made by the authors about the meaning of the expansion of `functional' classes in Drosophila (see p. 6326, `Proteins common and specific to Drosophila and C. elegans') are not justified. The lack of correct assignment of individual IgSF proteins also calls into question the claim of the authors that the particular nature of proteins of the Drosophila IgSF repertoire (see p. 6327) “must be one of the contributing factors responsible for, for example, the formation of a more complex cellular structure in Drosophila”.
Perhaps the most impressive case of expansion of the IgSF repertoire in Drosophila, the thousands of alternatively spliced isoforms of the IgSF protein DSCAM (Schmucker et al., 2000), is unfortunately not mentioned by Vogel et al.