Hobert et al. (Hobert et al., 2004) have made a number of criticisms of our paper (Vogel et al., 2003). In the following paragraphs we reply to these criticisms. In a number of cases, the comments do provide useful corrections to the paper. Nevertheless, the major conclusions of our paper are not affected by these corrections, and they remain both novel and valid.
Previous work on the immunoglobulin superfamily repertoire
Prior to our work, Hynes and Zhao (Hynes and Zhao, 2000) stated that the number of IgSF proteins in Drosophila was about 150, and for about 130 of these they listed some or all of the domains of which they are formed. This information should have been cited in our paper and we regret not having done so. Their results for C. elegans are similar to those we published (Teichmann and Chothia, 2000) prior to their paper. The work by Hutter et al. (Hutter et al., 2000) is acknowledged as a whole in our paper, but we are not able to make more detailed comparisons with our current work because their web database is inaccessible at present. The paper by Aurelio et al. (Aurelio et al., 2002) is discussed below.
Experimental characterisation of IgSF proteins
The most useful part of the correspondence by Hobert et al. (Hobert et al., 2004) is that which draws our attention to experimental work of which we were unaware. This work corrects the classification of two IgSF proteins: UNC-73 is an intracellular signalling molecule (Kubiseski et al., 2003) and not a secreted protein; and UNC-89 is a muscle protein and not an extracellular matrix protein (Flaherty et al., 2002). In addition, there are experimental papers on C. elegans IgSF proteins that we should have cited:
Popovici et al. (Popovici et al., 2002) noted the homology of F59F3.1, F59F3.5, T17A3.1 and T17A3.8, and gave them the name VER proteins. This homology was also described, independently, by Teichmann and Chothia (Teichmann and Chothia, 2000).
Aurelio et al. (Aurelio et al., 2002) characterised the expression of C09E7.3, Y38F1A.9 and Y50E8A.3, and gave the three proteins the names Oig-1, Oig-2 and Oig-3. [They also described the number of Ig and FnIII domains in 24 C. elegans proteins. Of these, four were claimed to be absent from the IgSF proteins listed by Hutter et al. (Hutter et al., 2000) and Teichmann and Chothia (Teichmann and Chothia, 2000). In fact, only two, C09E7.3 and Y42H9B.2, were new: the domain structures of the other two, Y38F1A.9 and Y50E8A.3, are described in figure 6 of Teichmann and Chothia (Teichmann and Chothia, 2000). The other sequence mentioned by Hobert et al. (Hobert et al., 2004), Y54G2A.25, is a revised version of Y94H6A_148.d, which was used by both Hutter et al. (Hutter et al., 2000) and Teichmann and Chothia (Teichmann and Chothia, 2000).]
The C. elegans Semaphorin-2a is an experimentally characterised sequence (Roy et al., 2000).
Hobert et al. (Hobert et al., 2004) correctly note that two Ig domains are missing from the Perlecan structure in figure 1 of our paper (Vogel et al., 2003). They also point out that whilst Zig-2 to Zig-8 are correctly described as secreted proteins in figure 3 of our paper, they are carelessly placed with Zig-1 in the cell surface category in table 2.
Classification of IgSF proteins
The classification of the IgSF proteins in our paper is based on their structural features, their subcellular location and their sequence similarities. We give some rough descriptions of the more common functions of the proteins in the different classes. Hobert et al. strongly object to this (Hobert et al., 2004). They claim that we imply that all proteins in Class I are cell adhesion molecules. We actually say that the experimentally characterised proteins in this class are “mainly cell adhesion molecules”. We and most readers are well aware of the multiple roles of, for example, Roundabout. Similarly, Hobert et al. (Hobert et al., 2004) claim that we imply that all Class III proteins are signalling molecules, whereas we state that “those characterised so far are signalling molecules”.
Only a wilful literalist would take rough descriptions of the more common known functions to be precise descriptions of the functions of all the proteins in a class. Proteins with similar domain structures and related sequences do tend to have related functions (e.g. Hegyi and Gerstein, 2001). But, as we say in the paper (p. 6327), any type of function suggested for new proteins by their structural and sequence similarities to characterised proteins will need to be refined or corrected by experiments.
Many of the criticisms above are concerned with work by others that should have been cited. The criticisms of the results are in some cases correct, but their overall effect is small. The more serious criticisms require that two proteins, UNC-73 and UNC-89, be placed in different classes, and that the secreted proteins Zig-2 to Zig-8 be placed in the correct part of table 3.
Because of the improvements in predictions of protein sequences made by the curators of the genome sequences, and because of improvements in sequence comparison procedures (Karplus et al., 1998; Gough et al., 2001; Madera and Gough, 2002), our descriptions of the IgSF proteins in Drosophila and C. elegans go beyond those published previously. The matches made by the sequence comparison programs are accompanied by a score that estimates how likely the match is to be in error. We have used conservative scores and would expect a very large proportion of our assignments to be correct. However, given that we deal with over two hundred sequences, which together have about a thousand domains, we might also expect that a few assignments will be incorrect, and that some assignments will be missed because of the limitations of some of the hidden Markov models.
The criticisms made by Hobert et al. (Hobert et al., 2004) do not affect the novel and significant parts of our paper. We show that about half of the IgSF proteins in C. elegans and three-quarters of those in Drosophila have evolved since the divergence of the two organisms. The larger size of the Drosophila IgSF repertoire involves mainly cell surface and secreted proteins, and many of these have arisen through gene duplications. We believe that this overall expansion of the IgSF must be one of the factors that contributed to the formation of the more complex physiology of Drosophila. It is difficult to understand the assertion made by Hobert et al. (Hobert et al., 2004) that this view is invalidated by the increases in the repertoire produced by the alternative splicing of genes. Both factors are clearly important. In fact, the protein they use to illustrate the importance of splicing, DSCAM, is also a good example of repertoire expansion: there are probably four DSCAM sequences in Drosophila and none in C. elegans.