The retroviral reverse transcriptase is a multifunctional protein. It contains not only both RNA- and DNA-directed DNA synthesis activities but also an endonuclease activity necessary for the integration of viral DNA, and an RNase H. This latter activity can reduce to oligoribonucleotides viral RNA that has been reverse transcribed into minus-strand DNA. However, during avian retrovirus genome replication it does this in a highly specific manner so as to generate a specific 12-base primer for plus-strand DNA synthesis. Even though many other oligoribonucleotides are also made, there is an efficient selection of the specific primer, followed by its efficient utilization in plus-strand DNA synthesis and its subsequent removal. We have used a reconstructed system to gain an understanding of the factors that contribute towards these observed specificities.
As part of the replication cycle of the retrovirus genome, the viral RNA, or plus strand, is first reverse transcribed into minus-strand DNA, and the minus strand then acts as a template for the synthesis of plus-strand DNA. This much was essentially deduced in the first papers that described the discovery of reverse transcriptase activity in detergent-disrupted retrovirus particles (Temin & Mizutani, 1970; Baltimore, 1970). Since that time many of the details of retrovirus genome replication have been made clear, as reviewed for example by Varmus & Swanstrom (1985) and Mason et al. (1987). However, one of the long-standing problems has been to understand the mechanism of priming of plus-strand DNA synthesis. It was shown by Varmus et al. (1978) that plus-strand synthesis was initiated after the minus-strand DNA had reached a length of only several hundred nucleotides. The initiation site was not only efficient but apparently unique; in fact, more detailed studies by Mitra et al. (1982) found that the location of the site was unique to a single nucleotide. Subsequent comparative analysis of retroviral sequences revealed only a single striking feature of plus-strand initiation sites: all were immediately downstream of a 10- to 20-base purine-rich sequence (Varmus & Swanstrom, 1985) that has come to be known as the polypurine tract (PPT). Sorge & Hughes (1982) have shown that at least 9 bases and maybe as many as 29 bases spanning the PPT are essential for retrovirus replication.
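The kind of comparative scan described above can be illustrated with a minimal Python sketch. It is not the procedure used in the cited analyses. It assumes, as a simplification, that a PPT can be approximated by an uninterrupted run of at least 10 purines (A or G), whereas the tracts described are only "purine-rich". The function name `find_purine_tracts` and the example sequence are hypothetical and are not taken from any viral genome.

```python
import re


def find_purine_tracts(seq, min_len=10):
    """Return (start, end, tract) for every run of >= min_len consecutive purines.

    Positions are 0-based; `end` is the index of the first base downstream of
    the tract, i.e. where initiation would be predicted under this simplified rule.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]


# Hypothetical plus-strand RNA fragment, for illustration only.
example = "CUCUAGGGAGGGGGAAAUGUAGUC"
print(find_purine_tracts(example))
# -> [(4, 17, 'AGGGAGGGGGAAA')]
```

A real comparison would have to tolerate occasional pyrimidines within the tract, which this strict run-length rule does not.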
Several studies have reported attempts at characterizing the priming of retroviral plus strands. Ribonucleotides were detected bound to the plus-strand DNA of an avian retrovirus (Olsen & Watson, 1980). In studies with murine leukaemia virus (MLV), ribonucleotides were initially not detected by Mitra et al. (1979, 1982). Subsequently, oligoribonucleotides of heterogeneous length were reported (Finston & Champoux, 1984). In our previous studies with the endogenous reaction of an avian retrovirus, we showed that the primer was of a discrete length (Smith et al. 1984a). With a reconstructed reaction of a closely related retrovirus, Resnick et al. (1984) reported an 11 b (base) primer. In our own reconstructed reactions with avian myeloblastosis virus reverse transcriptase and purified Rous sarcoma virus (RSV) plus-strand RNA hybridized to purified minus-strand DNA, we confirmed our earlier result of a 12 b primer (Smith et al. 1984b). These aspects of priming for the Prague C strain of RSV are shown in Fig. 1. In summary, it is clear that: (1) the only enzyme needed for specific plus-strand initiation is retroviral reverse transcriptase, and (2) the DNA-RNA hybrid substrate has to include the PPT. This manuscript addresses the question of how these features are utilized to achieve the specific events of initiation. A four-step model is proposed and subjected to experimental testing by means of reconstructed reactions.
RESULTS AND DISCUSSION
The model of plus-strand priming events shown in Fig. 2 is based upon our previous studies of avian retrovirus genome replication and, as will be explained, we have been able to test aspects of the model with a more rigorously defined system. In this system, oligodeoxyribonucleotides of minus-strand DNA that span the origin of plus-strand DNA synthesis are made by chemical synthesis. These DNAs are then transcribed with the RNA polymerase of Subtilis phage 6 to yield plus-strand RNA of the same length. The DNA and RNA oligonucleotides are then purified, combined by hybridization, and used as substrates for plus-strand priming. The experimental details of more extensive studies will be published elsewhere. Consider now the justification for the model.
The model shows the substrate for the reverse transcriptase as a DNA-RNA hybrid. Three arguments for this are: (1) the generation of the necessary RNA primer apparently can only be via RNase H action, implying a DNA-RNA hybrid as substrate; (2) our previous reconstructions with viral RNA and minus-strand DNA were only specific when prehybridization was used, and without this a limited amount of precise initiation did occur but additional sites were also present (Smith et al. 1984b); (3) recent reconstructions with only 50 b of DNA and 50 b of complementary RNA (as in Fig. 1) also demonstrated a dependence upon a prehybridization step. It is important to point out that, while we know that plus-strand and minus-strand DNA synthesis can occur concurrently in virions, the reconstructed reactions can successfully initiate plus-strand DNA synthesis as a step following minus-strand DNA synthesis.
The first step in the model is the conversion of the RNA to oligoribonucleotides by RNase H action. The generation of oligoribonucleotides is largely as expected for retroviral RNase H: an exonuclease that releases oligonucleotides ranging in length, according to different reports, from 2 to 10 bases or even up to 30 bases (as reviewed by Crouch & Dirksen, 1982). However, the oligoribonucleotides have to include the specific 12 b primer. The two specific cuts needed to provide the 5' and 3' ends of this primer will be considered later.
In the model it is assumed that the generation of the specific primer can be separated from its selection, which is the second step. The evidence for this is that when the DNA-RNA hybrid is treated with RNase H in the absence of nucleoside triphosphates it is possible to create a substrate that, in a second complete reaction, will initiate plus-strand DNA correctly (Smith et al. 1984a,b). Our interpretation is that the primer oligonucleotide is unique relative to the other oligonucleotides in that, once generated, it binds specifically to the minus-strand DNA. That is, this binding contributes towards the selection of the primer. Furthermore, it is reasoned that this binding accounts for the apparent specificity of the subsequent utilization by the polymerase activity: it is the only bound primer.
At the simplest level, the selection of the primer oligonucleotide could depend on size alone. If the RNase H primarily creates oligonucleotides of 2-10 b, as some have claimed, then the observed 12 b primer may simply be a rare molecule of sufficient size to make a stable interaction with the minus-strand DNA. Alternatively, if the RNase H digestion does yield some larger oligonucleotides (for example of 12 b and even larger), then the 12 b primer RNA may be unique in its ability to make a stable interaction with the minus-strand DNA. In either case it is still necessary to explain the nature of the observed stability of the interaction between the 12 b primer and the viral DNA. As an experimental approach to this we reconstructed what was considered to be the equivalent of the interaction. We chemically synthesized the 12 b DNA sequence complementary to the known RNA primer and then analysed the interaction between this DNA oligonucleotide and the viral RNA. The DNA oligonucleotide bound efficiently to the RNA, even under physiological conditions of temperature and monovalent ion concentration. The interaction was stable, with a Tm of 55 °C in low salt (10 mM-Tris.HCl, 1 mM-EDTA). Subsequent studies of the interaction between viral RNA and a range of other oligodeoxyribonucleotides showed that the interaction with the 12 b species was the most stable. Other oligonucleotides of the same length failed to bind under physiological conditions. Even an oligonucleotide of the same length and base composition (8/12, G+C) failed to interact. An obvious clue, which may explain the observed stability, is that the primer sequence includes a major tract of adjacent purines. Physicochemical studies have shown that, in a single-stranded nucleic acid, adjacent purines allow a base-stacking that transforms the random coil to an ordered helical structure (as reviewed by Saenger, 1983). In the presence of a complementary nucleic acid strand, such a helical structure is more likely to make an efficient and stable interaction. Both our studies and a comparative analysis of retroviral polypurine tracts (Varmus & Swanstrom, 1985) are consistent with the hypothesis that the major feature involved in the selection of the correct primer is the polypurine tract. The content of G + C in the base-pairing may be a secondary feature.
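The distinction drawn above, between overall G + C content and the presence of a run of adjacent purines, can be made concrete with a small Python sketch. The two 12-base sequences are hypothetical, chosen only so that both have 8/12 G + C; they are not the oligonucleotides used in this study. The helper names `gc_count` and `longest_purine_run` are likewise illustrative.

```python
import re


def gc_count(seq):
    """Number of G and C residues in the sequence."""
    s = seq.upper()
    return s.count("G") + s.count("C")


def longest_purine_run(seq):
    """Length of the longest uninterrupted run of purines (A or G)."""
    runs = re.findall(r"[AG]+", seq.upper())
    return max((len(r) for r in runs), default=0)


# Hypothetical 12-mers with identical G + C content (8/12).
purine_tract = "AGGGAGGGGGAA"   # every base is a purine
mixed_oligo = "GCACGUCGAUGC"    # purines and pyrimidines interspersed

for name, seq in [("purine_tract", purine_tract), ("mixed_oligo", mixed_oligo)]:
    print(name, gc_count(seq), longest_purine_run(seq))
# -> purine_tract 8 12
# -> mixed_oligo 8 2
```

The two sequences cannot be told apart by G + C content, yet they differ sharply in purine run length, which is the feature proposed here as the primary determinant of stable binding.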
The third step in the proposed model is the utilization of the bound primer oligonucleotide. If, as we propose, the ability of the primer to bind to the template DNA is a determining factor, then the utilization specificity is achieved simply by availability.
The fourth step in the model is the removal of the primer. We know that, in an examination of the products of an endogenous reaction for the avian virus, about 40% of the plus-strand DNA had intact RNA primer attached, while for 60% the primer had been precisely removed (Smith et al. 1984a,b). The latter molecules had a 5'-phosphate, consistent with primer removal by RNase H. In the reconstructed reaction primer removal also occurred, and there the only possible source of RNase H was the reverse transcriptase molecule itself. The reconstructed reaction also showed us that the amount of reverse transcriptase used determined whether, at the end of the reaction, some or any of the plus-strand DNA product still had RNA primer, or even fragments of primer, attached. Because of this we offer an explanation of why, in previous studies of MLV priming, a discrete primer was not detected; at most, shorter oligoribonucleotides of heterogeneous length were found (Champoux et al. 1984; Finston & Champoux, 1984). Our interpretation is that the activity of MLV RNase H, whether in virions or in the reconstructed reactions, was such as to begin the removal of the RNA primer immediately after the initiation of the plus-strand DNA. As described below, we have data with the MLV enzyme that support this hypothesis.
For the model described in Fig. 2, a major problem was to determine the basis for the two specific cuts that yield the 12 b primer. As was mentioned before, the precise primer sequence has only been defined for an avian retrovirus. However, for many retrovirus genomes we know the total sequence through the region and can predict where the initiation site is located. From an examination of 31 such sequences, no readily apparent conservation of sequence was detected other than that of the tract of purines (Varmus & Swanstrom, 1985). With our reconstructed reaction system we were able to address directly certain aspects of the problem in the avian retrovirus situation. In order to determine whether the specificity of sequence recognition was provided by the avian enzyme, we asked whether the reverse transcriptase of MLV could replace that of avian myeloblastosis virus (AMV). We found that it could; the initiation observed was predominantly at the single nucleotide site expected. Thus, the specificity of the cutting lies in the nature of the nucleic acid substrate and not in the source of the retroviral enzyme. (Experiments with a non-viral RNase H activity, namely that of E. coli, did not give specific initiation.)
In the above study, the MLV reverse transcriptase used (Bethesda Research Labs) had been produced in E. coli from a subgenomic-sized recombinant DNA of the MLV polymerase region. Thus the successful priming also showed rigorously that specific plus-strand priming can be achieved without the aid of other retrovirus-coded or cell-coded proteins (which may have contaminated the previously used AMV reverse transcriptase, purified from the blood of virus-infected chickens). The sequence of MLV that was cloned did not include the endonuclease function that has been shown to recognize sequences needed for the integration of viral DNA (Skalka & Leis, 1984). Thus, the fact that we still observed specific initiation with the MLV enzyme excludes any possible role of the endonuclease in this event.
There was, however, one difference between the priming events with the MLV enzyme and our previous results with the AMV polymerase: the DNA products lacked RNA primer. Our interpretation is that, under these conditions of synthesis, the MLV enzyme, unlike the AMV enzyme, not only carries out the correct primer generation, selection and utilization, but goes on to remove the primer with high efficiency.
If the reverse transcriptase recognizes a polypurine region, it should be possible to exchange one purine for another within the primer region without loss of specificity. Such studies are underway. We will compare the effects of making changes in the RNA only, in the DNA only, or complementary changes in both. In this way we will be able to map the features of the DNA-RNA hybrid that are recognized by reverse transcriptase so as to generate the specific oligonucleotide primer.
In summary, we have presented in this manuscript (1) a review of retroviral plus-strand priming, (2) our model for the specific events involved, and (3) a description of the preliminary results of our approach towards verifying the model. It is remarkable that this specific set of reactions is just a subset of the biologically relevant reactions that can be carried out by a combination of nucleic acid sequence features and purified retroviral reverse transcriptase.
This work was supported by grant MV-7I from the American Cancer Society, by Public Health Service Grants CA-22651, CA-06927 and RR-05539, and by an appropriation from the Commonwealth of Pennsylvania.