SUMMARY
Serine proteinase inhibitors (serpins) are a family of structurally similar but functionally diverse proteins that regulate several important proteolytic cascades in most branches of life. We have characterized 17 Amblyomma americanum serpin cDNAs here named as `Lospins' (L; an acronym for Lone Star tick serpin) that possess three β-sheets, eight α-helices and a reactive center loop consistent with the consensus serpin superfamily secondary structures. Visual inspection of deduced amino acid sequences revealed two patterns of basic residues: (i) 86DKSRVLKAYKRL97 in L5 and L13–16 and (ii) 158VRDKTRGKI166 in all Lospins, which are similar to consensus glycosaminoglycan (GAG) binding sites (XBnXmBX, where X and B are non-basic and basic residues, n=1 or 2 and m=1, 2 or 3). On three-dimensional models, the two putative GAG binding sites mapped onto α-helices D and F, respectively, with calculation of electrostatic surface potentials revealing basic patches on L5 and L13–16 models that are comparable to the heparin-binding site on antithrombin. RT-PCR expression analysis of 15 selected genes showed that the majority (11/15) of the Lospins were ubiquitously expressed in the midgut, ovary and salivary glands. On a neighbor-joining phylogeny guide tree, 15 serpins from other ticks and 17 Lospins from this study, a total of 32 tick serpin sequences, segregated into five groups with Lospins in groups A and D being conserved across tick species. The discovery of Lospins in this study sets the framework for future studies to understand the role of serpins in tick physiology.
Introduction
Ticks are among the most important vectors of disease agents affecting both humans and animals, and are currently considered second only to mosquitoes as vectors of human infectious diseases worldwide (Sonenshine, 1993). Control of ticks has traditionally been accomplished by use of acaricides, which has resulted in selection of resistant ticks and environmental pollution. Of the various proposed alternatives to acaricide use (Sonenshine, 1993), vaccination has emerged as a sustainable, cost-effective and environmentally friendly alternative. A deeper understanding of tick molecular physiology is needed before development of alternative tick control methods can be achieved. The current focus of tick research is to uncover the molecular basis of tick physiology (Hill and Wikel, 2005; Nene et al., 2002a; Nene et al., 2002b; Ribeiro et al., 2006).
One group of proteins that may play important roles in tick physiology is the serine proteinase inhibitors (serpins). Serpins represent one of the largest superfamilies of proteins found in most branches of life, ranging from viruses to vertebrates (Huntington, 2006). In humans, the majority of serpins function as negative regulators of several tightly regulated pathways such as blood coagulation, inflammation, complement activation, cancer metastasis and food digestion (Silverman et al., 2001; Huntington, 2006). More than 90 human diseases result from natural mutations of serpins, which attests to the importance of this family of proteins in the physiology of multicellular organisms (Potempa et al., 1994; Silverman et al., 2001). Although it cannot be assumed that observations in mammalian serpins will also be true for ticks, we are encouraged by evidence from other invertebrate systems where serpins have been linked to regulation of important pathways such as innate immunity (Abraham et al., 2005; Michel et al., 2005; Pelte et al., 2006; Nappi et al., 2005; Zou and Jiang, 2006) and embryo development (Carrell and Corra, 2004; Rushlow, 2004).
Given the importance of serpins in the regulation of the mammalian host's physiological processes, it was hypothesized that ticks might encode serpins to evade host defense, and that blocking their function via immunization would compromise the tick's ability to feed (Mulenga et al., 2001; Mulenga et al., 2003). Indeed, a limited number of studies have reported mortality and reduced feeding efficiency in Ixodes ricinus Say (Prevot et al., 2007), Haemaphysalis longicornis Neuman (Sugino et al., 2003; Imamura et al., 2005) and Rhipicephalus appendiculatus Neuman (Imamura et al., 2005) ticks that fed on recombinant-serpin-immunized hosts. As part of our long-term study to explore the role of serpins in tick physiology, the objective of the current study was to identify and characterize Amblyomma americanum L. encoded serpins that are expressed early on, during the preparatory and slow feeding phase. We have used serpin generic primers and a previously published PCR approach (Mulenga et al., 2003) to identify and characterize 17 complete and two partial sequences of A. americanum serpin variants, here named Lospins, an acronym representing the `Lone star tick serpin'. We have identified a cluster of ten highly related Lospins that are conserved in several tick species.
The lone star tick A. americanum is among the most important and commonly encountered pests of humans and livestock in the Southern USA (Kollars et al., 2000). With its expanding range (Keirans and Lacombe, 1998; Merten and Durden, 2000) across the USA and its role as a vector of important human pathogens, including Ehrlichia chaffeensis, E. ewingii and Borrelia lonestari, A. americanum, long considered a nuisance tick in terms of public health, has now been recognized as a major vector of human disease agents in the USA (Childs and Paddock, 2003). The discovery of Lospins in this study provides a framework for future studies to uncover the role of serpins in the physiology of the lone star tick and other ticks.
Materials and methods
Tick dissections and total RNA isolation
Ticks used in this study were obtained from a colony of Amblyomma americanum L. ticks maintained in the laboratory of Dr Pete Teel, in our department. To feed ticks, females were placed in cells on the back of a calf in the presence of male ticks that had pre-attached for 3 days. This arrangement allowed female ticks to commence feeding within 24 h of being put on the animal. Tick dissections were routinely done as previously published (Mulenga et al., 2003). Briefly, ticks fed on cattle for 5 days were washed in 70% ethanol, held on glass slides with a pair of soft tissue forceps and their edges trimmed off using a sharp, sterile razor blade. Under a dissection microscope, the dorsal cuticle flap was lifted and salivary glands (SG), midgut (MG) and ovary (OV) were teased out from the ticks using an 18-gauge needle and soft tissue forceps. All dissected tissues, including the carcass (CA) representing the tick remnant after removal of SG, MG and OV, were stored in RNAlater (Ambion, Austin, TX, USA) at –80°C until used for RNA extraction.
Extraction of total RNA from whole ticks and dissected tick organs was done using the Trizol reagent (Invitrogen, Carlsbad, CA, USA) as previously described (Mulenga et al., 2003). Briefly, within the first hour of being detached from the host, whole ticks that were partially fed for 24 h (25 ticks), 96 h (10 ticks) and 120 h (10 ticks) were rinsed in 70% ethanol, pulverized in liquid nitrogen, and transferred to the Trizol reagent for RNA extraction. Similarly, tick organs dissected from 20 ticks that were partially fed for 5 days were rinsed in DEPC water to remove the storage solution and then transferred to the Trizol reagent for RNA extraction. Tissue lysis was accomplished either by repeated pipetting (SG, MG, OV) or homogenization (CA) using a Sonic Dismembrator Model 100 (Fisher Scientific, Pittsburgh, PA, USA). Extracted total RNA was reconstituted in RNase-free water and stored at –80°C until used.
Discovery of lone star tick serpins `Lospins'
Cloning was done using generic serpin primers (GSPs) (5′-CATCCTGAACGCTGTCTACTTCAAGGG-3′, 5′-CGCGTCGGCCCTGGAGATACCGTAC-3′, 5′-CGTCGACGTTCTCGACCTGCCTAC-3′) in combination with the SMART rapid amplification of cDNA ends kit (RACE; Clontech, San Jose, CA, USA) as published (Mulenga et al., 2003). GSPs were designed based on conserved nucleic acid sequences that were revealed by a multiple sequence alignment (not shown) of annotated tick serpin cDNA sequences from R. appendiculatus (Mulenga et al., 2003), H. longicornis (Sugino et al., 2003), Ixodes ricinus Leach (Prevot et al., 2006), I. scapularis (Ribeiro et al., 2006) and Boophilus microplus Canestrini (AAP75707). In order to amplify tick serpin genes that are expressed early during the tick feeding cycle, the cDNA template primed by an adapter-linked oligo(dT) primer (Clontech) was synthesized from 5 μg of total RNA extracted from a mixture of ticks that were partially fed for 24 h, 96 h and 120 h.
In the first round of sequencing, 56 PCR fragments cloned in pCR4-TOPO plasmid (Invitrogen) were sequenced and it was established that ∼80% of cDNAs encoded a serpin-like polypeptide, as revealed by BLASTX homology search. In the second round of sequencing, 288 insert-positive clones were submitted to SequenceWright (Fisher Scientific, Houston, TX, USA) for high throughput sequencing and contig assembly. Following contig assembly and singleton identification, gene-specific PCR primers were designed and used with 5′ and 3′ RACE to clone full-length cDNAs.
DNA sequence analyses
DNA sequences were routinely analyzed using the Vector NTI software packages (Invitrogen, free academic license). For comparison with known serpins and provisional identification, cDNA sequences were scanned against known protein entries in GenBank using the BLASTX and BLASTP homology search programs. Additionally, deduced amino acid sequences were submitted to the ExPASy Proteomics Server (http://ca.expasy.org/) for prediction of signal peptides, amino acid motifs and patterns.
Structure-based alignment, comparative modeling and calculation of electrostatic surface potential
In order to predict secondary structures, structure-based alignment was performed between deduced Lospin amino acid residues and the native monomer of antithrombin (1AZX, chain I) using Expresso (Armougom et al., 2006). The 1AZXi template was retrieved from the protein data bank (PDB) as a molecular template based on its 31% and 51% amino acid sequence identity and similarity to Lospins, respectively, and the fact that its reactive centre loop (RCL) was resolved. The RCL, the region of the serpin molecule that is responsible for interaction with target proteinases, forms an extended, exposed conformation above the body of the serpin scaffold (Huntington, 2006; Gettins, 2002). Sequence alignments were subsequently used as input to MODELLER version 9v1 (Sali and Blundell, 1993) to predict comparative models. The models obtained were evaluated using Verify3D (Luthy et al., 1992) and PROCHECK (Morris et al., 1992). The electrostatic potentials of antithrombin (1AZXi, positive control, template), PAI-2 (1BY7, negative control) and Lospin models were calculated by the Adaptive Poisson-Boltzmann Solver (APBS) (Baker et al., 2001). Protonation states were assigned using the parameters for solvation energy (PARSE) force field (Sitkoff et al., 1994) for each structure by PDB2PQR (Dolinsky et al., 2004). Execution of APBS and visualization of resulting electrostatic potentials were performed in PyMol 0.99rev10 (DeLano, 2002) at ±5 kT/e of positive and negative contour fields.
Phylogeny tree construction and similarity comparisons
The phylogeny tree, rooted on the serpin superfamily archetype human α-1 antitrypsin (AAB59495) as an outgroup, was constructed from the dataset of 15 tick serpin polypeptide sequences downloaded from GenBank (accession numbers shown in Fig. 5) and 17 Lospin variants from this study using the neighbor-joining method. Specifications were set for bootstrap values at 1000 replications, gaps proportionately distributed and correction for distance set to a Poisson distribution. Amino acid sequence identities among Lospins and other tick serpin polypeptides were determined by pairwise alignment using the Vector NTI software package.
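As an illustration of the distance correction named above, the following minimal Python sketch computes the uncorrected proportion of differing residues (p) between two aligned sequences and the Poisson-corrected distance d = -ln(1 - p). The function names and toy sequences are hypothetical, and skipping gap-containing columns is an assumption made here for simplicity; it is not necessarily how gaps were treated in the analysis.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues over aligned columns without gaps.

    Assumption: columns with a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected amino acid distance, d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every residue differs")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not Lospin sequences).
    print(poisson_distance("MKTL", "MKSL"))
```

A matrix of such pairwise distances is the kind of input that a neighbor-joining implementation takes.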
Expression analysis by semi-quantitative RT-PCR
In order to determine spatial patterns of expression for cloned Lospin genes, DNase-treated SG, OV, MG and CA total RNA was subjected to two-step RT-PCR using gene-specific primers (GSPs, Table 1). Oligo(dT)-primed first strand cDNA templates were synthesized from ∼5 μg DNase-treated total RNA using the first strand synthesis kit (Invitrogen). DNase treatment of total RNA was accomplished by a 45 min incubation at 37°C with 1 U RQ1 DNase (Promega, Madison, WI, USA) per 10 μg of RNA, followed by a standard Trizol reagent extraction. A 1 μl aliquot of the first strand cDNA template was used in a PCR reaction with GSPs that were designed based on variable domains of candidate Lospins. A 15 μl aliquot of the PCR product was electrophoresed on a 2% agarose gel containing 1 μg ethidium bromide. To determine transcript abundance, densitograms of amplified PCR bands were determined using the web-based ImageJ image analyzer software (http://rsb.info.nih.gov/ij/). To correct for differences due to variations between template concentrations, densities of detected PCR bands were normalized according to the following formula: Y = V + V(H – X)/X, where Y = normalized mRNA density, V = observed Lospin PCR band density in individual tissues (MG, SG and OV), H = highest tick 16S rRNA PCR band density among tested tissues (carcass in this case, CA) and X = tissue (MG, SG and OV) tick 16S rRNA PCR band density.
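The normalization formula above can be sketched in a few lines of Python. The function name and the band densities used below are hypothetical values chosen only for illustration. Note that Y = V + V(H – X)/X simplifies algebraically to Y = V·H/X, that is, each Lospin band is scaled by the ratio of the highest 16S rRNA band density to that tissue's own 16S rRNA band density.

```python
def normalize_density(v: float, x: float, h: float) -> float:
    """Return Y = V + V(H - X)/X.

    v: observed Lospin PCR band density in a tissue
    x: 16S rRNA PCR band density in the same tissue
    h: highest 16S rRNA PCR band density among the tested tissues
    """
    if x <= 0:
        raise ValueError("16S rRNA band density must be positive")
    return v + v * (h - x) / x


if __name__ == "__main__":
    # Hypothetical densities in arbitrary units, not measured data.
    rrna_16s = {"MG": 80.0, "SG": 50.0, "OV": 40.0, "CA": 100.0}
    lospin = {"MG": 60.0, "SG": 30.0, "OV": 10.0, "CA": 20.0}
    h = max(rrna_16s.values())
    for tissue, v in lospin.items():
        print(tissue, normalize_density(v, rrna_16s[tissue], h))
```

For the tissue with the highest 16S rRNA density (X = H) the correction term vanishes and Y equals V.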
Gene ID | Forward primer | Reverse primer |
---|---|---|
Lospin 1 | TTGTGCTCTTCACCGCAGCCGTGATG (S1/2/3) | CCACGGTTCCTTCTTCGTTTACTTC |
Lospin 2 | TTGTGCTCTTCACCGCAGCCGTGATG (S1/2/3) | CAAGGCTGATCCCCGATAAGTCTGC |
Lospin 3 | TTGTGCTCTTCACCGCAGCCGTGATG (S1/2/3) | CTGGAGGGGTGGCAAACGCGCCTTC |
Lospin 4 | ATGTTCTCCAAGTTGGTATTTCTGGCG | GCTCGCATGGCGGGCACTAGGCCG |
Lospin 5 | CAGGGACGGTCTGTCGCTAGCG (S15/16) | CAGCCGATTGTCTACCGTGGC |
Lospin 6 | CATGGTCGTCTTGCTTCCAGAC (S17/6) | GTCATTCTGGAAATAGAAGAGGAG |
Lospin 7 | GGATCCATGTCGGAAGCCATGGCGG | CATTCCGTTACTGACCATCCCACTC |
Lospin 8 | GGATCCCAAGAGGAGCAAAAGGTGG (S9/8) | GAGTGTAGATGATGACACCTGTGACG |
Lospin 9 | GGATCCCAAGAGGAGCAAAAGGTGG (S9/8) | GCTCTGAGTGTACATGATGACACCG |
Lospin 11 | GAAGACGTCGACAGCAAGCGAG | CTTTGCTTACCTTACAATTTAACTTTATGC |
Lospin 13 | CCCAAGTTCGACATGAGCCTTC (S13/14) | CGGAGGACGATGGCTCCCATCTC |
Lospin 14 | CCCAAGTTCGACATGAGCCTTC (S13/14) | GAAATAGAAGAGGAGGTTTCGTG |
Lospin 15 | CAGGGACGGTCTGTCGCTAGCG (S15/16) | CATATTAGCCGATTGTCTGGCTTC |
Lospin 16 | CAGGGACGGTCTGTCGCTAGCG (S15/16) | GGGCAAGCAGTGGTATCAATTG |
Lospin 17 | CATGGTCGTCTTGCTTCCAGAC (S17/6) | CGATTGTCTACCGTGGCAGAGC |
S1/2/3: Common forward primer used for Lospins 1–3
S15/16: Common forward primer used for Lospins 5, 15 and 16
S9/8: Common forward primer used for Lospins 8 and 9
S17/6: Common forward primer used for Lospins 17 and 6
Results
Discovery and provisional identification
A previously published PCR-based approach (Mulenga et al., 2003) and high throughput sequencing were successfully used to clone 234 serpin cDNA fragments that segregated into 17 contigs and two singletons for a total of 19 partial serpin-like cDNA sequences, here named Lospins (L), an acronym representing the Lone Star tick serpin. The 17 full-length serpin cDNAs have been deposited in GenBank (accession numbers in ascending order, from L1 to L17, are EU072726, EU072727, EU072728, EU072729, EU072730, EU072731, EU072732, EU072733, EU072734, EU072735, EU072736, EU072737, EU072738, EU072739, EU072740, EU072741, EU072742). Consistent with the size of a typical serpin (Gettins, 2002), all deduced Lospin proteins range between 370 and 400 amino acid residues in length (not shown). BLASTX homology scanning against known protein entries in GenBank was used for routine provisional identification and revealed that all deduced polypeptides in this study showed high similarity exclusively to annotated serpins from other ticks (results not shown). However, when serpin sequences from other ticks were excluded from consideration, best matches included the leukocyte/monocyte elastase inhibitor, neuroserpin, squamous carcinoma antigen 2, plasminogen activator inhibitor 2 (PAI-2) and antithrombin (results not shown).
Sequence analysis and structural based alignment
Scanning for signal peptides using SignalP (Emanuelsson et al., 2007) revealed that except for L2, L3, L7 and L11, all other deduced proteins possess leader sequences, indicating they are potentially secreted proteins. Signal peptidase cleavage sites are predicted after position (p) 21 for L1 and p16 for L4–6, L8–10 and L12–17 (not shown). When scanned for amino acid sequence patterns on ScanProsite (de Castro et al., 2006), all deduced Lospin proteins were predicted to have multiple potential N-glycosylation sites (NX[T/S]) (results not shown). Except for L2, L3 and L11, all other Lospin sequences contain the serpin signature motif pattern PS00284 ([LIVMFY]–[G]–[LIVMFYAC]–[DNQ]–[RKHQS]–[PST]–F–[LIVMFY]–[LIVMFYC]–x–[LIVMFAH]) (results not shown). Four of the 17 Lospins (L1 and L8–10) are predicted to contain the `TKL' and `NHL' microbody C-terminal targeting signal pattern PS00342 ([STAGCN]–[RKH]–[LIVMAFY]) at their C-terminal end. When scanned on the 2ZIP-Server (Bornberg-Bauer et al., 1998) (http://2zip.molgen.mpg.de/index.html), all Lospins except for L8–10 are predicted to contain leucine residue repeats, [L-(x4)-L-(x4)-L-(x4)-L], that are similar to, but are predicted not to fold into, leucine zipper DNA binding patterns (Bornberg-Bauer et al., 1998).
Similarity and identity comparisons between Lospin deduced proteins and human α-1 antitrypsin (accession no. AAB59495) revealed that 51 core residues occupying strategic buried positions to maintain the overall structure and facilitate the inhibitory mechanism of a serpin molecule (Irving et al., 2000) are 72–98% (38–48/51) conserved in Lospins (Fig. 1). One of the structural features that facilitate the swift function of the serpin molecule is the shutter region (Irving et al., 2000; Hopkins et al., 1993). This region is characterized by conserved amino acid motifs: S53-P54-X55-P56 (numbering is based on the serpin superfamily archetype, human α1-antitrypsin) and I157-N158-X159-X160-V161 (Irving et al., 2000), which are 100% conserved in all Lospin polypeptides (Fig. 1).
On the basis of conservation of the consensus amino acid motif [p17 (E), p16 (E/K/R), p15 (G), p14 (T/S), p12–9 (AGS)] in the hinge region of the RCL that is conventionally used to distinguish between inhibitory and non-inhibitory serpins at the sequence level (Hopkins et al., 1993), all Lospin deduced proteins are putatively inhibitory (Fig. 2). Except for L11, where p12 (A/G/S) is replaced by `P', L18, where p17 (E) has been replaced by `Q', and L19, where p9 (A/G/S) is replaced by `V', all other residues are 100% conserved in the hinge region of putative Lospin RCLs (Fig. 2). The p8 position, which has a high preference for the small threonine side chain (Gettins, 2002), is 100% conserved in all Lospins, except for L3 and L11, where there is a `P' replacement. Assuming that there are 17 residues between the scissile bond (p1–p1′) and the hinge region of the RCL (Hopkins et al., 1993), the predicted p1 residues are `K' for L1–3, `I' for L4 and L12, `L' for L4–6, L11 and L13–18, `M' for L7, `Q' for L8–10 and `S' for L19 (Fig. 2). Overall, when compared to each other at the RCL level, amino acid residue identities ranging from 19–95% were observed (not shown).
Given the conservation of key amino acids that underpin the structure and functionality of serpins, we performed structure-based alignment to gain insight into the putative secondary structures of Lospins. Consistent with the common fold of a typical serpin (Huntington, 2006), structural alignment with the monomer of native antithrombin revealed that each Lospin tertiary structure possesses three β-sheets (A–C), eight α-helices and a reactive center loop (RCL) (Fig. 3).
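To make the pattern notation above concrete, the following minimal Python sketch converts a PROSITE-style pattern (written with plain hyphens) into a regular expression and tests whether a sequence ends with the PS00342 tripeptide as printed in the text. The helper names are hypothetical, the converter handles only the simple elements shown (bracketed classes, braces for exclusions, x and repeat counts), and ScanProsite itself applies further rules that are not reproduced here.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        # {ABC} means "any residue except A, B or C".
        element = element.replace("{", "[^").replace("}", "]")
        # x(3) or x(2,4) become regex repeat counts.
        element = re.sub(r"\((\d+(?:,\d+)?)\)", r"{\1}", element)
        # x means "any residue".
        if element.startswith("x"):
            element = "." + element[1:]
        parts.append(element)
    return "".join(parts)


# Microbody C-terminal targeting signal, as printed in the text (PS00342).
MICROBODY = prosite_to_regex("[STAGCN]-[RKH]-[LIVMAFY]")


def has_c_terminal_signal(sequence: str) -> bool:
    """True if the last three residues match the targeting tripeptide."""
    return re.search(MICROBODY + "$", sequence) is not None


if __name__ == "__main__":
    # Toy C-terminal fragments (hypothetical, not Lospin sequences).
    for fragment in ("AVVDTKL", "AVVDNHL", "AVVDTKP"):
        print(fragment, has_c_terminal_signal(fragment))
```

Anchoring the expression with `$` restricts the match to the C-terminal end, which is where the text reports the `TKL' and `NHL' tripeptides.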
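The hinge-region check and the 17-residue rule described above can be sketched as follows. This is a minimal illustration that assumes the input string starts at p17 and extends at least to p1, so that residue p(n) sits at offset 17 - n. The function names and the toy sequences are hypothetical and are not Lospin RCLs.

```python
# Consensus hinge residues as listed in the text: position -> allowed residues.
HINGE_CONSENSUS = {
    17: "E",
    16: "EKR",
    15: "G",
    14: "TS",
    12: "AGS",
    11: "AGS",
    10: "AGS",
    9: "AGS",
}


def residue_at(rcl: str, position: int) -> str:
    """Residue at p<position>, assuming rcl[0] is p17."""
    if len(rcl) < 17:
        raise ValueError("sequence must span p17 to p1")
    return rcl[17 - position]


def hinge_mismatches(rcl: str) -> dict:
    """Map each non-consensus hinge position to the residue found there."""
    return {
        pos: residue_at(rcl, pos)
        for pos, allowed in HINGE_CONSENSUS.items()
        if residue_at(rcl, pos) not in allowed
    }


def predicted_p1(rcl: str) -> str:
    """Predicted p1 residue under the 17-residue rule."""
    return residue_at(rcl, 1)


if __name__ == "__main__":
    toy_rcl = "EKGTEAAGATFLEAIPKS"  # hypothetical p17..p1' fragment
    print(hinge_mismatches(toy_rcl), predicted_p1(toy_rcl))
```

An empty mismatch dictionary corresponds to a putatively inhibitory hinge under this consensus. A proline at p12, as reported for L11, would be returned as `{12: 'P'}`.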
Lospins 5 and 13–16 possess basic patches similar to the antithrombin (AT) heparin-binding site
Visual inspection of deduced Lospin amino acid sequences revealed two clusters of basic residues, 86DKSRVLKAYKRL97, present in L5 and L13–16, and 158VRDKTRGKI166, present in all Lospins (not shown). These patterns showed similarity to glycosaminoglycan (GAG) binding sites, XBnXmB (where X and B = non-basic and basic residues, respectively, n = 1, 2 or 3 basic residues and m = 1 or 2 non-basic residues) (Munoz and Linhardt, 2004; Olenina et al., 2005). On three-dimensional models, the 86DKSRVLKAYKRL97 and 158VRDKTRGKI166 motifs mapped onto α-helices D and F, respectively (not shown). It was interesting to note that the spatial arrangement of K87, K92 and R96 in α-helix D of L5 and L13–16 was comparable to that of the three residues on α-helix D of antithrombin, K114, K125 and R129, which are important in heparin binding (Olson et al., 2002; Schedin-Weiss et al., 2004; dela Cruz, 2006). To further examine the possibility of the basic residues on α-helix D of L5 and L13–16 being involved in heparin (GAG) binding activity, we calculated surface electrostatic potentials for L5 and L13–16 models. This analysis revealed that comparative models of L5 and L13–16 possess basic patches (Fig. 4D–F) that are comparable to that of antithrombin (Fig. 4A). Lospin 7 (Fig. 4C), which possesses basic residues on its α-helix F but lacks the 86DKSRVLKAYKRL97 motif on its α-helix D, has a much smaller basic patch. Plasminogen activator inhibitor-2, which does not possess basic residues on α-helix D, was used as a negative control and does not possess a basic patch (Fig. 4B).
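The visual inspection described above can be approximated with a short pattern search. The sketch below is one possible reading of the XBnXmB consensus as worded in this section (one to three basic residues, then one or two non-basic residues, then a basic residue). It assumes that only lysine and arginine count as basic. The function names are hypothetical, and a hit indicates resemblance to the consensus only, not demonstrated GAG binding.

```python
import re

# Assumption: K and R are treated as basic; histidine is not counted.
GAG_LIKE = re.compile(r"[^KR][KR]{1,3}[^KR]{1,2}[KR]")


def gag_like_hits(sequence: str, first_position: int = 1) -> list:
    """Non-overlapping consensus hits as (start position, matched text)."""
    return [
        (m.start() + first_position, m.group())
        for m in GAG_LIKE.finditer(sequence)
    ]


def basic_positions(sequence: str, first_position: int = 1) -> list:
    """Sequence positions of all K and R residues."""
    return [
        i + first_position
        for i, residue in enumerate(sequence)
        if residue in "KR"
    ]


if __name__ == "__main__":
    # The two motifs quoted in the text, numbered from their first residue.
    print(gag_like_hits("DKSRVLKAYKRL", 86))
    print(basic_positions("DKSRVLKAYKRL", 86))
    print(gag_like_hits("VRDKTRGKI", 158))
```

With the numbering used in the text, the basic residues reported for α-helix D (K87, K92 and R96) are among the positions returned for the first motif.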
Phylogeny tree and sequence similarity comparison
To determine the relationship among tick serpins, 17 Lospin polypeptides and 15 serpins from other ticks were subjected to phylogeny analysis using the neighbor-joining method. From the α1-antitrypsin outgroup, the aligned sequences segregated into five major groups (A–E) that are supported by bootstrap values of 76% for group A, 100% for groups B, D and E as well as 99% for group C (Fig. 5). In group A, L7, which is distantly related to the other Lospin proteins, segregated together with the I. ricinus immunosuppressor protein (Iris) (Prevot et al., 2006), R. appendiculatus serpin (Ras) 1 and 2 (Mulenga et al., 2003) and B. microplus serpins (Bmserpin) 1 (TC8000), 3 (CV44398) and 5 (TC10590). In group B, L8–10 segregated together with Ras-4 (Mulenga et al., 2003) while serpin sequences from I. ricinus [serpins 1 (ABI94055), 2 (ABI94056) and 4 (ABI94057)] and I. scapularis (AAV80788) in group C are not closely related to any of the Lospin sequences. The majority of the Lospin polypeptides, L4–6 and L11–17, segregated together with Ras-3 (Mulenga et al., 2003), Bmserpin 2 (TC7417), 4 (CV450507) and 6 (AAP75707), as well as H. longicornis (Hl) serpin (BAD11156) in group D, while L1–3, in group E, did not cluster with serpin sequences from other ticks (Fig. 5). Numbering of Bmserpins used in this study is arbitrary. Except for Bmserpin6, which is annotated in GenBank, the rest of the B. microplus serpins used in this study were obtained from the EST database available at `www.tigr.org'. Percent identity analyses at the amino acid level revealed that among group `A' members, L7, which shows ∼33% identity to other Lospins, is 66% and 65% identical to R. appendiculatus serpin (Ras) 1 and 2 (AYO35779 and AYO3535780), respectively, 63% to I. ricinus blood meal induced immunosuppressor (Iris, CAB55818) and 44–68% to B. microplus serpins 1, 3, 5 (TC8000, CV44398, TC10590, respectively) (not shown).
While amino acid identity levels of between 93–96% were observed among group B Lospins, L8–10 are 23–43% identical to Ras-4 (Mulenga et al., 2003). Alignment of group D members revealed that these serpins were highly conserved across several tick species with similarity levels of between 74–96% being observed among Lospins, and 56–70%, 54–62%, 56–72% and 54–69% identity being observed when Lospins were compared to Ras-3, Hlserpin, Bmserpin 2, 4 and 6, respectively. Among group E members, L1–3, identity levels of between 83–86% were observed (not shown).
Examination of pairwise alignments among some group D members revealed interesting amino acid residue similarity and identity patterns where differences between polypeptides are confined to one segment of the sequence. The L6 and L17 alignment revealed differences confined to the first 64 amino-terminal and the last 134 carboxy-terminal (CT) amino acids, with the central domains being identical (not shown). Similarly for L15 and L16, the first 269 amino-terminal residues are identical, with differences confined to the last 125 CT amino acids (not shown). Patterns comparable to L15 and L16 were observed when any two of the following sequences, L5, L13, L14, L15 and L16, were aligned (not shown).
Lospins are ubiquitously expressed
To gain insight into tissue distribution profiles of candidate Lospins, gene-specific primers based on variable regions of each Lospin cDNA sequence (Table 1) were used to investigate Lospin mRNA expression patterns in SG, MG, OV and CA dissected from 5-day fed A. americanum female ticks. Except for L8, L9 and L17, whose PCR products were not detectable in the OV, and L16, which was not detectable in the CA, the other tested genes are ubiquitously expressed (Fig. 6A). It is interesting to note that for the most part, our RT-PCR expression analysis results were consistent with sequence clustering in the phylogeny tree in Fig. 5. Based on normalized PCR band densities, L1–3, which cluster together in the phylogeny tree (Fig. 5) and show up to 86% amino acid identity, are predominantly expressed in the MG (∼55–70%), followed by the SG (∼15–20%) and CA (∼5–15%), and are least expressed in the OV. Similarly, L8 and L9, which also segregated together on the phylogeny tree (Fig. 5) and are 93% identical at the amino acid sequence level (not shown), are predominantly expressed in the MG (∼60–70%), followed by ∼28% expression in the CA, ∼3–15% in the SG and no expression in the OV. Comparable to L1–3, L7 is most highly expressed in the MG (∼55%), followed by ∼25% in the CA and ∼15% in the SG, and is least expressed in the OV. Among tested group D (Fig. 5) members, L5, L11 and L16 are most highly expressed in the OV (∼38%, 50% and 90%, respectively), followed by the MG (∼25%, 20% and 10%, respectively). Additionally, while L16 expression in the CA and SG was below 1%, L5 and L11, respectively, show ∼10% and 15% expression in the CA and 2% and 8% in the SG. Among the other tested group D genes, L17, which is not expressed in the OV, is expressed at equivalent levels in the CA and SG (∼40% each) and at ∼20% in the MG, while L4, L6, L13 and L14, respectively, are ∼30%, 38%, 28% and 50% expressed in the CA, ∼28%, 32%, 20% and 25% in the SG, and ∼30%, 20%, 22% and 20% in the MG. Additionally, expression in the OV is ∼10% for L4 and L6, ∼18% for L13 and <2% for L14.
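The tissue percentages reported above can be read as each tissue's share of a gene's total normalized band density, although the text does not state the calculation explicitly. The following minimal Python sketch works under that assumption. The function names and densities are hypothetical, and treating any non-zero density as detectable is a simplification.

```python
def tissue_shares(normalized: dict) -> dict:
    """Percent of total normalized density contributed by each tissue."""
    total = sum(normalized.values())
    if total <= 0:
        raise ValueError("at least one tissue must have a positive density")
    return {tissue: 100.0 * y / total for tissue, y in normalized.items()}


def is_ubiquitous(normalized: dict) -> bool:
    """True if a band was detected (density > 0) in every tested tissue."""
    return all(y > 0 for y in normalized.values())


if __name__ == "__main__":
    # Hypothetical normalized densities in arbitrary units, not measured data.
    example = {"MG": 60.0, "SG": 20.0, "CA": 15.0, "OV": 5.0}
    print(tissue_shares(example), is_ubiquitous(example))
```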
Discussion
Serpin encoding cDNAs have recently been cloned from several ticks including Boophilus microplus(www.tigr.org), Ixodes scapularis (Ribeiro et al., 2006), I. ricinus(Prevot et al., 2006), Amblyomma variegatum (Nene et al., 2002b), Rhipicephalus appendiculatus(Mulenga et al., 2003) and Haemaphysalis longicornis (Sugino et al., 2003; Imamura et al.,2005). In this study we report on identification, comparative bioinformatics and mRNA expression analyses of 17 full-length and two partial A. americanum serpin variants here named `Lospins'. The Lospin sequences reported were cloned from ticks that had fed for 24 h, 96 h and 120 h. These stages of tick feeding coincide with the preparatory and the slow feeding phases of the tick feeding process, during which the tick establishes its feeding lesion and begins to transmit disease pathogens(Sonenshine, 1993). We are interested in this stage of the tick feeding process because it precedes most of the damage caused by tick feeding activity(Sonenshine, 1993). Our thinking is that, if we block serpin function early enough in the tick feeding cycle, we will not only interfere with the tick's ability to start feeding but also possibly prevent pathogen transmission. The potential limitation of using generic primers to clone multi-member gene families such as serpin is the possibility of a bias against identification of poorly expressed genes. While we are unable to determine whether or not there was a bias against cloning of poorly expressed Lospins, this possibility was minimized by our approach to sequence the entire 344-insert positive clones that were isolated from ligations of our PCR products.
Adoption of the consensus secondary structures of a typical serpin molecule(Gettins, 2002; Huntington, 2006) and the high conservation of the core amino acid residues(Irving et al., 2000) that underpin structure and functionality of serpins strongly suggest that Lospins are functional members of the serpin superfamily. Though originally identified as inhibitors of serine proteinases, cross-class members that can inhibit cysteine proteinases (Pak et al.,2004) and others with no inhibitor functions(Askew et al., 2007) have also been identified. While we are unable to specify the classes of their target proteinases as cysteine or serine, almost all deduced Lospin proteins are predicted to have putative inhibitory functions, as determined by consensus amino acid residues in the hinge regions of their putative RCLs(Hopkins et al., 1993). While confirmatory experimentation is awaited, it is important to point out here that possession of proline residues at the critical p8 and p12 positions of L3 and L11 deduced RCLs may suggest that these Lospins have no inhibitor functions. In a previous study, point mutations of p12, alanine, p10, serine and p8 threonine to proline resulted in loss of inhibitory activity by plasminogen activator inhibitor-1(Audenaert et al., 1994). In another study, mutation of the glycine residue at the p10 position to proline converted α1-antitrypsin from an inhibitor to a substrate(Hopkins et al., 1993). Deduced RCLs in this study were predicted based on the 17-residue rule(Hopkins et al., 1993; Irving et al., 2000). Given that some characterized serpins such as α2-antiplasmin(Gettins, 2002) or serpin1k from Manduca sexta (Li et al.,1999) utilize RCLs that are shorter or longer than the conventional 17 residues, we are interpreting our predicted scissile bonds with caution. 
Consistent with the fact that almost all known serpins are glycosylated (Whisstock et al.,2005; Law et al.,2006; Silverman et al.,2001; Robertson et al.,2006), our bioinformatics analyses data demonstrated that all deduced Lospin sequences possess potential N-glycosylation sites. From the perspective of finding antigens for anti-tick vaccine development, it was encouraging to note that, except for L2, L3, L7 and L11, which do not possess leader sequences, the majority of Lospins are predicted to be extracellular. The significance of this finding is that the majority of Lospins represent potential target antigens for anti-tick vaccine development,in that they will be accessible to host immune response factors. In humans,intracellular serpins, which are classified as clade B serpins (ov-serpins),have a higher frequency of presence of oxidation-sensitive residues such as methionine and cysteine in RCL, which are not normally exposed to highly oxidative conditions extracellularly(Silverman et al., 2004). It is interesting to note that, L7, one of four putative intracellular Lospins,possesses three methionines at p1, p3 and p4 and a cysteine at the p1′position.
Whether the putative GAG binding sites in L5 and L13–16, as well as the microbody C-terminal targeting signal in L1, L8 and L9, are functional awaits experimentation. We are, however, encouraged by the fact that most known proteins that possess GAG binding motifs are involved in the regulation of several important physiological pathways, which, if disrupted, could severely compromise the tick's ability to feed. For instance, antithrombin, heparin cofactor II, chemokines and selectin, whose functions are regulated by GAG binding, are involved in the mediation of essential biological processes such as blood coagulation, inflammatory response, immune cell migration, tumor cell metastasis and smooth muscle cell proliferation (Munoz and Linhardt, 2004). Similarly, in invertebrates, GAG binding proteins were associated with immunity (Kamimura et al., 2006; Robertson et al., 2006) and development (Tollefsen, 2007) in Drosophila. Microbodies, also known as peroxisomes in vertebrates, glyoxysomes, glycosomes and hydrogenosomes, depending on their chemical composition, are small electron-dense membrane-bound organelles that perform a variety of metabolic functions, including the β-oxidation of fatty acids and the biosynthesis of cholesterol and bile acids in eukaryotes (Gould et al., 1990; Gatto et al., 2000). The occurrence of diseases collectively referred to as `peroxisome biogenesis disorders' when peroxisomes fail to assemble properly (Gould et al., 1990; Gatto et al., 2000) attests to the importance of these microbodies. Clearly, our future goal should be to uncover the physiological processes regulated by Lospins.
Similarity and identity patterns among group D Lospins, where differences were confined to one region of the sequence, appear to be consistent with features that characterize genes subject to alternative splicing (AS) (Talavera et al., 2007). The possibility of AS as a source of diversity among Lospins is not unique to ticks in that this phenomenon, first observed in M. sexta (Jiang et al., 1996), has been observed in serpins from C. felis (Brandt et al., 2004), D. melanogaster and C. elegans (Kruger et al., 2002). Experiments are currently underway to investigate whether Lospin diversity could be attributed to AS of mutually exclusive exons. Our long-term interest is to identify tick proteins that can be targeted for rational design of new tick control approaches. Thus, our sequence analysis data showing that Lospins in groups A and D were conserved in other ticks are encouraging. Given the huge diversity of ticks that can infest animals (de la Fuente and Kocan, 2003), it will be highly desirable to develop new tick control strategies targeting conserved tick proteins such as groups A and D Lospins, in that a single treatment could protect against several tick species.
Although some Lospins were apparently over-expressed in particular tick organs, the general trend revealed by RT-PCR expression analysis is that the majority of tested Lospins are ubiquitously expressed. Expression of Lospins in multiple tick organs underscores their importance in the regulation of key tick physiological processes. Curiously, all Lospin genes that were highly expressed in the midgut were poorly expressed in the ovary and vice versa. It will be interesting to explore the significance of these differences in transcription profiles. It is also important to point out that the transcription profile data presented here are based on ticks that had fed for 5 days. Whether the transcription profiles change during the tick feeding cycle was not determined in the current study. While A. americanum ESTs encoding serpin fragments were present in GenBank (Hill and Gutierrez, 2000) (www.genome.ou.edu) at the inception of this project, the data presented here represent the first reported attempt to characterize annotated A. americanum serpin genes. The discovery of Lospins in this study sets the framework for future studies to understand the role of serpins in tick physiology.
Acknowledgements
We would like to thank Dr Pete Teel for kindly providing the ticks used in this study. Funding for this project was provided by start-up funds to Albert Mulenga from the Texas A&M College of Agriculture and Life Sciences, Department of Entomology and the Texas Agriculture Experiment Station.