Nucleotide sequence of cloned unintegrated avian sarcoma virus DNA: Viral DNA contains direct and inverted repeats similar to those in transposable elements
Avian Sarcoma Viruses
Repetitive Sequences, Nucleic Acid
We have determined the nucleotide sequence of portions of two circular avian sarcoma virus (ASV) DNA molecules cloned in a prokaryotic host--vector system. The region whose sequence was determined represents the circle junction site--i.e., the site at which the ends of the unintegrated linear DNA are fused to form circular DNA. The sequence from one cloned molecule, SRA-2, shows that the circle junction site is the center of a 330-base-pair (bp) tandem direct repeat, presumably representing the fusion of the long terminal repeat (LTR) units known to be present at the ends of the linear DNA. The circle junction site is also the center of a 15-bp imperfect inverted repeat, which thus appears at the boundaries of the LTR. The structure of ASV DNA--a unique coding region flanked by a direct repeat that is, in turn, terminated with a short inverted repeat--is very similar to the structure of certain transposable elements. Several features of the sequence imply that circularization to form the SRA-2 molecule occurred without loss of information from the linear DNA precursor. Circularization of another cloned viral DNA molecule, SRA-1, probably occurred by a different mechanism. The circle junction site of the SRA-1 molecule has a 63-bp deletion, which may have arisen by a mechanism that is analogous to the integration of viral DNA into the host genome. Flanking one side of the tandem direct repeat is the binding site for tRNA(Trp), the previously described primer for synthesis of the first strand of viral DNA. The other side of the direct repeat is flanked by a polypurine tract, A-G-G-G-A-G-G-G-G-G-A, which may represent the position of the primer for synthesis of the second strand of viral DNA. An A+T-rich region, upstream from the RNA capping site, and the sequence A-A-T-A-A-A are present within the direct repeat sequence. These sequences may serve as a promoter site and poly(A) addition signal, respectively, as proposed for other eukaryotic transcription units.
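The structural features named above (a tandem direct repeat centered on a junction, an imperfect inverted repeat, and short motifs such as the polypurine tract and A-A-T-A-A-A) can be illustrated with a minimal Python sketch. Only the two motif strings are taken from the abstract. All function names and all toy sequences are hypothetical, invented for illustration, and are not ASV sequence data or the method used in the study.

```python
import re

# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeat_mismatches(left: str, right: str) -> int:
    """Count positions where `left` differs from the reverse complement of `right`.

    Zero means a perfect inverted repeat; a small nonzero count means an
    imperfect one.
    """
    if len(left) != len(right):
        raise ValueError("both arms must have the same length")
    return sum(a != b for a, b in zip(left, reverse_complement(right)))


def is_tandem_direct_repeat(seq: str, junction: int, unit_len: int) -> bool:
    """Check that the `unit_len` bases before `junction` equal those after it."""
    if junction - unit_len < 0 or junction + unit_len > len(seq):
        return False
    return seq[junction - unit_len:junction] == seq[junction:junction + unit_len]


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


# Motifs quoted in the abstract, written without the separating hyphens.
POLYPURINE_TRACT = "AGGGAGGGGGA"
POLY_A_SIGNAL = "AATAAA"

# Toy sequences, invented for illustration only (NOT viral sequence).
toy_flank = "TTGC" + POLYPURINE_TRACT + "CCAT" + POLY_A_SIGNAL + "GG"
toy_unit = "ACGTT"
toy_circle = "GG" + toy_unit + toy_unit + "CC"  # junction at index 7
toy_left_arm = "AATGTAGTC"
toy_right_arm = "GACTTCATT"  # reverse complement of the left arm, one base changed

ppt_positions = find_motif(toy_flank, POLYPURINE_TRACT)
polya_positions = find_motif(toy_flank, POLY_A_SIGNAL)
junction_is_repeat_center = is_tandem_direct_repeat(toy_circle, 7, len(toy_unit))
arm_mismatches = inverted_repeat_mismatches(toy_left_arm, toy_right_arm)
```

On a real sequence, the same checks would be run with the junction coordinate and repeat lengths reported above (330 bp for the direct repeat, 15 bp for the inverted repeat); the sketch does not attempt to locate the junction itself.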