Functional Characterization Of The Evolutionarily


Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1062 Functional Characterization of the Evolutionarily Conserved Adenoviral Proteins L4-22K and L4-33K SARA ÖSTBERG ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2014 ISSN 1651-6206 ISBN 978-91-554-9132-1 urn:nbn:se:uu:diva-238487 Dissertation presented at Uppsala University to be publicly examined in C8:301, BMC, Uppsala, Friday, 13 February 2015 at 09:15 for the degree of Doctor of Philosophy (Faculty of Medicine). The examination will be conducted in English. Faculty examiner: Professor Stefan Schwartz (Institutionen för laboratoriemedicin, Lunds Universitet). Abstract Östberg, S. 2014. Functional Characterization of the Evolutionarily Conserved Adenoviral Proteins L4-22K and L4-33K. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1062. 74 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9132-1. Regulation of adenoviral gene expression is a complex process directed by viral proteins controlling a multitude of different activities at distinct phases of the virus life cycle. This thesis discusses adenoviral regulation of transcription and splicing by two proteins expressed at the late phase: L4-22K and L4-33K. These are closely related with a common N-terminus but unique C-terminal domains. The L4-33K protein is an alternative RNA splicing factor inducing L1-IIIa mRNA splicing, while L4-22K is stimulating transcription from the major late promoter (MLP). The L4-33K protein contains a tiny RS-repeat in its unique C-terminal end that is essential for the splicing enhancer function of the protein. Here we demonstrate that the tiny RS-repeat is required for localization of the protein to the nucleus and viral replication centers. Further, we describe an auto-regulatory loop where L4-33K enhances splicing of its own intron. 
The preliminary characterization of the responsive RNA-element suggests that it differs from the previously defined L4-33K-responsive element activating L1-IIIa mRNA splicing. L4-22K lacks the ability to enhance L1-IIIa splicing in vivo, and here we show that the protein is defective in L1-IIIa or other late pre-mRNA splicing reactions in vitro. Interestingly, we found a novel function for the L4-22K and L4-33K proteins as regulators of E1A alternative splicing. Both proteins selectively upregulated E1A-10S mRNA accumulation in transfection experiments, by a mechanism independent of the tiny RS-repeat. Although L4-22K is reported to be an MLP transcriptional enhancer protein, here we show that L4-22K also functions as a repressor of MLP transcription. This novel activity depends on the integrity of the major late first leader 5’ splice site. The model suggests that at low concentrations L4-22K activates MLP transcription while at high concentrations L4-22K represses transcription. So far, characterizations of the L4-22K and L4-33K proteins have been limited to human adenoviruses 2 or 5 (HAdV-2/5). We expanded our experiments to include HAdV-3, HAdV-4, HAdV-9, HAdV-11 and HAdV-41. The results demonstrated that the transcription- or splicing-enhancing properties of L4-22K and L4-33K, respectively, are evolutionarily conserved and non-overlapping. Thus, the sequence-based conservation is mirrored by the functions, as expected for functionally important proteins.

Keywords: L4-22K, L4-33K, RNA, splicing, adenovirus, nuclear localization, replication, transcription, evolution, SR protein, MLP, promoter, E1A, serotypes

Sara Östberg, Science for Life Laboratory, SciLifeLab, Box 256, Uppsala University, SE-75105 Uppsala, Sweden. Department of Medical Biochemistry and Microbiology, Box 582, Uppsala University, SE-75123 Uppsala, Sweden.
© Sara Östberg 2014
ISSN 1651-6206
ISBN 978-91-554-9132-1
urn:nbn:se:uu:diva-238487 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-238487)

To my grandfather Nils Stenman, 1921-2002

Members of the committee

Opponent
Stefan Schwartz, Professor, Department of Laboratory Medicine, Lund University

Members of the committee
Hongxing Zhao, Dr., Department of Immunology, Genetics and Pathology, Uppsala University
Mikael Berg, Professor, Department of Biomedical Sciences and Veterinary Public Health, Swedish University of Agricultural Sciences
Gun Frisk, Dr., Department of Immunology, Genetics and Pathology, Uppsala University

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Östberg, S., Törmänen Persson, H., and Akusjärvi, G. (2012) Serine 192 in the tiny RS-repeat of the adenoviral L4-33K splicing enhancer protein is essential for function and reorganization of the protein to the periphery of viral replication centers. Virology 433:273-281.
II Lan, S., Östberg, S., Punga, T., and Akusjärvi, G. (2014) A suppressive effect of the first leader 5’ splice site on L4-22K-mediated activation of major late transcription. Manuscript.
III Östberg, S., Backström Winquist, E., and Akusjärvi, G. (2014) RNA elements involved in adenovirus L4-33K regulation of alternative splicing. Manuscript.
IV Östberg, S., Biasiotto, R., and Akusjärvi, G. (2014) Conservation of the transcriptional and post-transcriptional activities of serotype-specific adenovirus L4 proteins. Manuscript.

Reprint of paper I was made with permission from Elsevier.

Contents

Introduction
Transcription
    The polymerase
    The promoter
    Initiation
    Elongation
    Polyadenylation
Splicing
    Significance of alternative splicing
    Splicing signals and elements
    Splicing catalysis
    The spliceosome
    Splicing factors
    Regulatory elements
    Disease
Adenovirus
    Genome
    Genome organization
    Evolution and phylogeny
    Pathology and disease
    Therapy and prevention
    Latency and persistence
    Adenoviruses as vectors
    The viral life cycle
        Virus entry
        The early phase
        Viral DNA replication
        The late phase
        Virus assembly and release
    The MLP
    The L1 model
    The L4 unit
Present investigation
    Paper I
    Paper II
    Paper III
    Paper IV
Concluding remarks
Acknowledgments
References

Abbreviations

3RE          IIIa repressor element
3VDE         IIIa virus-infection dependent splicing enhancer
A            Adenine
Adpol        Adenoviral DNA polymerase
ALS          Amyotrophic lateral sclerosis
ARD          Acute respiratory disease
C            Cytidine
CAR          Coxsackie-adenovirus receptor
CMV          Cytomegalovirus
CTD          Carboxy-terminal domain of RNA pol II
C-terminal   Carboxy-terminal
DBP          DNA binding protein
DNA          Deoxyribonucleic acid
E1 etc.      Early region 1 of adenovirus etc.
eGFP         Enhanced green fluorescent protein
eIF          Eukaryotic initiation factor
EKC          Epidemic keratoconjunctivitis
ESE          Exonic splicing enhancer
ESS          Exonic splicing silencer
GTF          General transcription factor
G            Guanine
HAdV-x/n     Human adenovirus species x/type n
HEK          Human embryonic kidney (cells)
HGPS         Hutchinson-Gilford progeria syndrome
hnRNP        Heterogeneous nuclear ribonucleoprotein
IGC          Interchromatin granule clusters
Inr          Initiator
ISE          Intronic splicing enhancer
ISS          Intronic splicing silencer
kbp          Kilo basepairs
kDa          Kilodalton
L1 etc.      Late region 1 of adenovirus etc.
L4P          L4 promoter
MLP          Major late promoter
MLTU         Major late transcription unit
miRNA        Micro RNA
mRNA         Messenger RNA
MDa          Megadalton
NLS          Nuclear localization signal
N-terminal   Amino-terminal
ORF          Open reading frame
PABP         Poly(A) binding protein
PAP          Poly(A) polymerase
PTB          Polypyrimidine-tract binding protein
PIC          Preinitiation complex
PKR          Protein kinase R
PPY          Polypyrimidine tract
Pre-mRNA     Precursor mRNA
Pri-miRNA    Primary miRNA
pTP          Precursor terminal protein
rRNA         Ribosomal RNA
RNA pol n    RNA polymerase n
RNA          Ribonucleic acid
Rpb          RNA pol II B subunit
RRM          RNA recognition motif
RS-domain    Arginine- and serine-rich domain
Sern         Serine at position n
SF1          Splicing factor 1
SINEs        Short interspersed elements
siRNA        Small interfering RNA
SMN          Survival of motor neuron
snRNA        Small nuclear RNA
snRNP        Small nuclear ribonucleoprotein particle
SR protein   Serine-arginine rich protein
SRSFn        Serine-arginine rich splicing factor n
ss           Splice site
TBP          TATA-binding protein
TFIIX        Transcription factor pol II X
tRNA         Transfer RNA
T            Thymine
VA           Virus associated
U2AF         U2 snRNP auxiliary factor
UPE          Upstream promoter element
USF          Upstream stimulatory factor
VA RNA       Virus-associated RNA

Introduction

Animals and plants are eukaryotic organisms, made up of a complex network of different cell types. These cell types all share the same set of genes and are derived from the same precursor, the fertilized cell, yet they show different and specific characteristics. This diversity in cell function is due to the intricate regulation of complex processes such as transcription, post-transcriptional processing, and translation. These processes are differentially regulated between cell types, but also during the lifetime of a single cell. The precise regulation of gene expression is therefore an absolute requirement for the eukaryotic organism.

Our chromosomes consist of 46 double-stranded DNA molecules that together contain around 24,000 protein-coding genes.
These genes are transcribed by a DNA-dependent RNA polymerase to produce a precursor messenger RNA. This pre-mRNA is subject to various post-transcriptional modifications including 5’-capping, 3’-polyadenylation and, most importantly for this thesis, RNA splicing. The process of splicing includes cutting and joining of the pre-mRNA at precise sites, thus forming a mature mRNA. The mRNA is then exported from the nucleus to the cytoplasm where it is translated by the ribosome into an amino-acid sequence, which is folded into the final gene product: the protein.

Eukaryotic cells can be infected by different pathogens, such as bacteria, parasites or viruses. A virus can be thought of as something in between a life form and a molecular machine. It is an obligate parasite, relying on a host cell in order to replicate. In its simplest form it is composed of an RNA or DNA genome encapsidated by a protein shell. The virus enters the cell and in most cases takes control of the host cell’s transcription, pre-mRNA processing and translation machineries, thereby turning the host cell into a veritable virus-replicating factory. The viral genome is limited to a certain size in order to fit into a capsid; therefore it relies heavily on the differential processing steps to maximize its own coding potential.

Adenovirus is a common virus infecting various vertebrates, including humans. Its gene expression is strictly temporally regulated. The present thesis focuses on the characterization of the adenoviral proteins L4-22K and L4-33K, which both take an important part in the temporal regulation of the adenoviral life cycle.

Transcription

Before the genetic information stored in our DNA can be translated into functional proteins, an intermediate message consisting of RNA must be produced. The DNA is transcribed into a pre-mRNA by a large enzyme complex called the RNA polymerase II (RNA pol II).
The polymerase

Bacteria and archaea use a single type of RNA pol, whereas all eukaryotes use (at least) three nuclear RNA polymerases. These are evolutionarily conserved with a common structural framework and common mechanisms, which suggests that they are derived from a last universal common ancestor before life was split into the three branches of bacteria, archaea and eukarya (1). The polymerases are specialized to transcribe distinct sets of genes (Table 1) (2). RNA pol I transcribes genes coding only for ribosomal RNA (rRNA), while RNA pol III can produce a variety of transcripts: 5S ribosomal RNA (5S rRNA), transfer RNA (tRNA), primary miRNAs (pri-miRNA) and some other small RNAs (7SL RNA, U6 snRNA). In contrast, RNA pol II produces protein-coding pre-mRNAs, pri-miRNAs and small nuclear RNAs (snRNAs), and is of main interest to this thesis (3).

Structural analyses of RNA pol II subunits suggest that the ancestral RNA pol is bacterial, from which the archaeal RNA pol evolved (Figure 1). In eukaryotes, RNA pol II is the closest relative to archaeal and bacterial polymerases, and RNA pol I and RNA pol III are thought to have evolved from RNA pol II (4). Remarkably, in plants there are two additional polymerases, IV and V, which coordinate gene-silencing processes and appear to have evolved from RNA pol II (3, 5).

Table 1. Nature and function of RNA pols and their transcribed products.
RNA pol | Transcribed product | Function
I | rRNA | Catalytic RNA of ribosomes
II | mRNA, miRNA, small RNAs, long non-coding RNAs | Protein-coding pre-mRNA, modifying and regulating RNAs, gene silencing
III | rRNA, tRNA, miRNA, SINEs | Gene silencing, protein translation, regulation
IV (plants) | siRNA precursors | Gene silencing
V (plants) | small RNAs | Gene targeting of siRNAs

Figure 1. Proposed evolutionary tree of RNA polymerases.

Despite the structural homology between bacterial and eukaryotic RNA pols, the regulation of transcription by these enzymes differs substantially.
While the eukaryotic RNA pol II relies on temporal changes in its phosphorylation status, the bacterial RNA pol needs no such modification. There are very few bacterial and eukaryotic transcription factors that share any features, even if they have overlapping functions (6).

The RNA pol II in eukaryotes consists of 12 subunits, Rpb1-Rpb12, with a total mass of more than 500 kDa (7). The largest subunit, Rpb1, has a C-terminal domain (CTD) which is essential for transcription in vivo, as removing major parts of the CTD is detrimental to the viability of the fruit fly (D. melanogaster) (8, 9). Interestingly, some gene promoters can be transcribed in vitro without the presence of a CTD, for example the adenovirus major late promoter (MLP) (10). The human RNA pol II CTD contains 52 conserved amino acid heptad repeats, YSPTSPS. Intriguingly, the YSPTSPS sequence contains two serine residues, Ser2 and Ser5, which undergo extensive phosphorylation/dephosphorylation adjustments during gene transcription. In the initiating steps of transcription, the CTD is hypophosphorylated (11). After transcription preinitiation complex (PIC) formation, the general transcription factor IIH (TFIIH) phosphorylates Ser5 in the heptad repeat, which triggers RNA pol II transition from initiation to elongation (10, 12, 13). Shortly after initiation of elongation, Ser5 is dephosphorylated and instead Ser2 becomes hyperphosphorylated (11, 14). In addition to Ser2 and Ser5, Ser7 can also be phosphorylated during gene transcription; however, the exact role of this modification is less well understood (15, 16).

The promoter

RNA pol II transcription is usually initiated by the binding of a sequence-specific activator protein to a DNA enhancer element, which leads to a sequential assembly of general transcription factors (GTFs) and the RNA pol II on the promoter (17–19).
The core promoter extends around 40 bp upstream and 40 bp downstream of the transcription start site and can be further divided into specific regions (20). The initiator (Inr) contains the transcription start site, and is recognized and bound by a subunit of the TFIID complex (21, 22). Another subunit of this protein, the TATA-binding protein (TBP), binds to the TATA-box, which is located approximately 30 nt upstream of the start site. The TATA-box has a consensus sequence of TATAWAAR¹ and is conserved among eukaryotes. Promoters can be divided into TATA-containing and TATA-less promoters. TATA-containing promoters are by far the most studied but only account for 20-30% of all eukaryotic promoters (20, 23, 24). Interestingly, TATA-containing promoters are often associated with stress response and are extensively regulated (25, 26), while TATA-less promoters often control transcription from housekeeping genes (27).

Initiation

Thousands of transcription factors have been identified in the human genome, revealing the importance of gene transcriptional regulation (28). Eukaryotic chromosomes are covered by nucleosomes, which are the basic repeating structural units of chromatin. In addition to providing structural constraints, the nucleosomes are also involved in regulation of gene expression. Therefore the most basic transcriptional regulators are nucleosome remodeling factors, which permit other transcription factors and the RNA pol II access to the DNA (17).

The initiation of transcription is an ordered process starting with the recognition of the core promoter by TFIID (Figure 2). The TFIID subunit protein, TBP, binds to the TATA-box in concert with several TBP-associated factors (TAFs) (18). This interaction is in turn recognized by TFIIB, and the hypophosphorylated RNA pol II/TFIIF complex is recruited. TFIIA stabilizes the interaction between TBP and DNA by counteracting repressive elements at this time point (18, 29).
TFIIE and TFIIH arrive at the promoter after TFIID recruitment. TFIIH contains a helicase activity, which unwinds the DNA strands, granting the RNA pol II access to the template strand (“promoter melting”) (30). Thus, the PIC is formed and transcription begins with the production of short transcripts (abortive transcription) due to [...]

¹ W stands for nucleotides A or T, R for nucleotides A or G (250).

Polyadenylation

The final step of transcription is the 3’ end processing. An AAUAAA sequence in the pre-mRNA and a G/U-rich sequence downstream of the cleavage site comprise the core poly(A) element. Human genes may have multiple potential cleavage sites leading to the production of alternative isoforms of the mRNA. A multimeric protein complex assembles on the poly(A) site during the initiation of the polyadenylation reaction. The cleavage/polyadenylation specificity factor (CPSF) identifies the poly(A) signal AAUAAA, whereas the cleavage-stimulating factor (CstF) recognizes the G/U-rich sequence elements on the pre-mRNA. These two main polyadenylation factors direct the nucleolytic cleavage of pre-mRNA at the poly(A) site (34, 35). In total, 14 essential factors in mammals have been identified that regulate mRNA 3’-end processing (35). Interestingly, the SR proteins involved in the regulation of splicing (see below) have also been reported to regulate the polyadenylation process. Conversely, both CPSF and a subunit of CstF have been found in purified spliceosomes (36).

After pre-mRNA cleavage, the poly(A) polymerase (PAP) adds the poly(A) tail to the newly cleaved 3’ hydroxyl end (33). The poly(A) tail consists of around 200-300 adenosine residues in mammals and is covered by the poly(A) binding protein (PABP). The poly(A) tail is crucial for the stability, transport and translation of the mature mRNA (35).

Splicing

Before the entire genomes of mammals had been sequenced, it was estimated that such complex organisms should require at least 100,000 genes (37).
Thus, it came somewhat as a shock in 2001 when the human genome project initially reported fewer than 30,000 genes for what is commonly granted to be the most complex organism on earth (28). In the following years, the finishing of the project arrived at around 24,000 protein-coding genes and 6,000 putative RNA genes (38, 39). The result is staggering. The human genome contains only approximately four times more primary genes than the bacterium Escherichia coli. The question is therefore: how is complexity generated? The process of RNA splicing and, more specifically, alternative splicing appears to be a major mechanism explaining this apparent paradox.

The primary mRNA transcript produced from transcription is composed of exons and introns. Introns are non-coding regions that are excised during splicing, and the coding exons are joined together in an ordered fashion. In constitutive splicing, the pre-mRNA is processed in the same fashion every time, whereas in alternative splicing different exons are combined to produce multiple mature mRNAs. There are five major modes of alternative splicing (Figure 3), of which exon skipping is by far the most common in humans (40, 41).

Figure 3. Model of different types of alternative splicing. Colored boxes represent exons and horizontal lines represent introns.

Significance of alternative splicing

The average human gene is approximately 28,000 base pairs long, and produces a pre-mRNA that contains nine short exons of approximately 120 nucleotides each, separated by eight introns that can vary tremendously in size, ranging from less than 100 to more than 100,000 nucleotides in length. It is estimated that over 90% of all human genes are alternatively spliced, bringing our approximately 24,000 protein-coding genes to over 90,000 proteins (42, 43).
Another advantage of organizing the genome into split genes is the possibility of recombination events within non-coding intronic regions; exons from other genes may be inserted without disrupting the gene, thus creating an evolutionary drive towards diversification (44, 45). Intriguingly, the use of alternative exons may lead to structurally disordered regions within the protein, which seem to be targeted for post-translational modifications and to create binding motifs for protein-protein interactions. This further increases the functional versatility of these proteins (46, 47).

Alternative splicing is a hallmark of higher multicellular eukaryotes. Unicellular eukaryotes like Saccharomyces cerevisiae have short introns in only a fraction of their genes (around 230), and only three out of some 6,000 genes are thought to be alternatively spliced (48, 49). Invertebrates make use of alternative splicing to a higher degree. For example, D. melanogaster has the gene with the highest number of mRNA variants known to date, the DSCAM gene, which has the potential to encode more than 38,000 protein isoforms (50). However, alternative splicing is most extensively used in vertebrate species, as can be deduced from the observation that invertebrates and vertebrates have approximately the same number of genes (compare 19,000 in Caenorhabditis elegans to 24,000 in human) despite the great difference in complexity (48).

It has been shown that splice sites in lower eukaryotes show more conservation than those in higher eukaryotes. This implies that alternative splicing might have evolved from constitutive splicing, thus giving the organism an advantage by diversifying its repertoire of gene products (42). In two-thirds of all alternatively spliced genes, the longer form is the ancestral one and the shorter isoforms have evolved due to exon skipping (49).
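
The figures quoted in this section can be checked with a little arithmetic. The sketch below is only illustrative: it computes how small a share of the average gene is exonic, using the numbers given above (28,000 bp, nine exons of about 120 nt), and shows how independently chosen alternative exons multiply into a large isoform count. The per-cluster exon numbers used for DSCAM (12, 48, 33 and 2) are an assumption taken from the general literature, not from this text, and the function names are hypothetical.

```python
from math import prod

# Figures quoted in the text for the average human gene.
AVERAGE_GENE_LENGTH_BP = 28_000
EXONS_PER_GENE = 9
AVERAGE_EXON_LENGTH_NT = 120


def exonic_fraction(gene_length, exon_count, exon_length):
    """Fraction of the gene that ends up in the mature mRNA."""
    return exon_count * exon_length / gene_length


def isoform_count(alternatives_per_cluster):
    """Number of mRNA variants when exactly one exon is chosen from each
    cluster of mutually exclusive exons, independently of the others."""
    return prod(alternatives_per_cluster)


fraction = exonic_fraction(
    AVERAGE_GENE_LENGTH_BP, EXONS_PER_GENE, AVERAGE_EXON_LENGTH_NT
)
print(f"Exonic share of the average gene: {fraction:.1%}")

# ASSUMPTION: cluster sizes for DSCAM are not given in the text.
dscam_clusters = [12, 48, 33, 2]
print(f"Potential DSCAM isoforms: {isoform_count(dscam_clusters):,}")
```

Under these assumptions, less than 4% of the average gene is exonic, and four clusters of mutually exclusive exons are enough to exceed the 38,000 isoforms mentioned for DSCAM.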
Splicing signals and elements

There are several important sequence elements in the pre-mRNA that signal and direct factors to take the proper actions. The upstream border between the intron and exon is defined as the 5’ ss, and the downstream intron-exon boundary is defined as the 3’ ss.

There are four major classes of introns. Group I, II and III are ribozymes, i.e. they fold into a three-dimensional structure that catalyzes their own splicing reaction (51–54). A macromolecular machine called the spliceosome splices the fourth class, the nuclear introns. There is debate as to whether the spliceosome can be regarded as a ribozyme as well, since the active site has been proposed to reside in the U6 snRNA. This also points to the conclusion that autocatalytic (specifically group II) introns and spliceosomal introns, along with the spliceosome, have developed from a common ancestor (53).

The nuclear introns can further be divided into U2-dependent and U12-dependent introns, spliced by the major and the minor spliceosome, respectively. Most U12-dependent introns have an AT-AC splice site consensus sequence. The more common group of introns is the U2-dependent introns, where most have a GT-AG consensus sequence at the splice sites (54–56). All further mentions of splicing and the spliceosome will refer to the U2-dependent introns and the major spliceosome.

The 3’ end of the intron contains additional sequence elements. The branch point is located 18-40 nucleotides upstream of the 3’ ss and contains one or two conserved adenine residues. One of these adenines is the acceptor of the 5’ end of the intron, forming the intermediate lariat structure. The polypyrimidine tract (PPY) is positioned between the 3’ ss and the branch point. The nature of this element influences the binding of U2AF: the more pyrimidines (C, T/U), the stronger the interaction and the higher the efficiency of splicing (57).
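
The sequence signals described above can be illustrated with a minimal sketch that inspects a single intron sequence. It is a toy check, not a splice-site predictor: the function name `describe_intron`, the convention that the last intron nucleotide lies 1 nt upstream of the 3’ ss, and the choice of the adenine closest to the 3’ ss as the start of the polypyrimidine tract are all assumptions made for illustration.

```python
def describe_intron(intron):
    """Report simple splicing-signal features of a U2-type intron.

    The sequence is expected to run from the first to the last intron
    nucleotide; RNA input (U) is converted to the DNA alphabet (T).
    """
    seq = intron.upper().replace("U", "T")
    n = len(seq)

    # GT-AG rule: intron starts with GT and ends with AG.
    gt_ag = n >= 4 and seq.startswith("GT") and seq.endswith("AG")

    # Candidate branch points: adenines 18-40 nt upstream of the 3' ss.
    # ASSUMPTION: position i is counted as (n - i) nt upstream.
    candidates = [
        i for i, base in enumerate(seq)
        if base == "A" and 18 <= n - i <= 40
    ]

    # Pyrimidine content between the candidate adenine closest to the
    # 3' ss and the terminal AG (a rough stand-in for the PPY tract).
    ppy_fraction = None
    if candidates:
        tract = seq[candidates[-1] + 1 : n - 2]
        if tract:
            ppy_fraction = sum(base in "CT" for base in tract) / len(tract)

    return {
        "gt_ag": gt_ag,
        "branch_candidates": candidates,
        "ppy_fraction": ppy_fraction,
    }


# Synthetic intron: GT, a G-rich spacer, one adenine, a CT-rich tract, AG.
example = "GT" + "G" * 30 + "A" + "CT" * 10 + "AG"
print(describe_intron(example))
```

A higher `ppy_fraction` corresponds to the pyrimidine-rich tracts that, according to the text, bind U2AF more strongly; the sketch does not attempt to model that binding.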
Splicing catalysis

The two-step transesterification reaction of splicing involves cutting at the 5’ ss and ligating the two exons while removing the intron. The oxygen of the 2’ hydroxyl group of a conserved adenine residue within the intron targets the 3’-5’ phosphodiester bond at the 5’ ss through a nucleophilic attack. This forms a 5’-2’ phosphodiester bond and two splicing intermediates: the free 5’ exon and the 3’ exon still containing the intron in a lariat structure. As a second step, the oxygen of the free 3’ hydroxyl group of the 5’ exon targets the 3’ splice site by a second nucleophilic attack, forming another 3’-5’ phosphodiester bond. Finally, the two exons are ligated to produce a spliced mRNA and a free intron in a lariat structure (54, 58).

The spliceosome

The large macromolecular machine called the spliceosome is made up of five small nuclear ribonucleoprotein particles (snRNPs) and over 100 additional proteins and splicing factors, with a total size of around 4.8 MDa (59, 60). The U snRNPs consist of small uridine-rich RNA molecules, called U1, U2, U4, U5 or U6 snRNA, as well as Sm-proteins and Sm-like proteins. The U snRNPs assemble de novo on the pre-mRNA for each splicing event (54, 56, 61, 62). Splicing and transcription can be both physically and functionally coupled through the CTD of the RNA pol II; however, it seems that constitutive splicing is more co-transcriptional than alternative splicing (63).

As the first step in spliceosomal assembly, the E complex is set up by the U1 snRNP binding to the 5’ ss, whereas SF1 binds the branch point and the U2 auxiliary factor (U2AF) associates with the PPY and the 3’ ss (Figure 4). The U2AF binding is needed for the recruitment of U2 snRNP to the branch point, thereby forming the spliceosomal A complex. A tri-snRNP consisting of U4/U5/U6 arrives, forming the B complex, and U6 replaces the U1 snRNP at the 5’ ss. U5 bridges the splice sites as U1 and U4 dissociate, thus creating the activated B* complex.
Further rearrangements of the spliceosome during the first catalytic step of splicing lead to the transition from B* to C complex. After the second catalytic step the spliceosome disassembles and releases the mRNA in the form of an mRNA-protein complex (mRNP) (54).

Figure 4. Spliceosome assembly. U snRNPs assemble sequentially on the pre-mRNA, building up the different spliceosome complexes leading to splicing.

Splicing factors

A multitude of proteins perform different tasks associated with the splicing process. Serine-arginine (SR) rich proteins constitute a large phylogenetically conserved and structurally related class of splicing factors. The N-terminal domain includes one or two RNA recognition motifs (RRMs) and the C-terminus has an RS-domain consisting of a variable length of SR dipeptide repeats. The RRM and RS-domains are to a large extent modular and can be interchanged between different SR proteins without affecting their function (64).

The SR protein SRSF1 (previously known as ASF/SF2, see (65) for revised nomenclature) binds to exonic enhancer elements and recruits and stabilizes the binding of U1 snRNP to the downstream 5’ ss. Simultaneously, SRSF1 can stimulate the recruitment of U2AF to the upstream 3’ ss. This “cross-talk” over the exon is important in pre-mRNAs with relatively short exons interspersed within long introns (i.e. vertebrate pre-mRNAs), as a mechanism defining which splice sites should be recognized by the spliceosome: the so-called exon definition model (42, 66). At low concentrations of SRSF1, only functionally strong 5’ ss are selected. At higher concentrations, U1 snRNP also binds to weak 5’ ss, thus promoting the choice of the nearest 5’ ss to be fused to the 3’ ss (67, 68).

The SR proteins are found throughout the nucleoplasm but are enriched in structures called interchromatin granule clusters (IGCs), or speckles.
The RS-domains are both necessary and sufficient for the localization of several SR proteins to the nucleus and further into the IGC compartments. The notable exception is SRSF1, which depends on further speckle localization signals to be correctly localized to the IGCs (69–71). The RS-domain can also permit the SR protein to shuttle between the nucleus and the cytoplasm (72). The IGCs are found in close proximity to genes with a high basal transcriptional activity and are proposed to be functional centers of enhanced mRNA processing (73). SR proteins, snRNPs and other splicing factors are in rapid flux through these compartments while the IGCs themselves stay immobile (74). Nascent pre-mRNA is predominantly located at the periphery of IGCs, in structures called perichromatin fibrils (75).

Another family of splicing regulators is the class of heterogeneous nuclear ribonucleoproteins (hnRNPs). They contain one or two RRMs and an auxiliary domain. In general, hnRNPs are negative regulators of splicing; for example, the well-characterized hnRNP A1 binds silencer regions and multimerizes to inhibit usage of nearby splice sites, thus favoring the use of distal 5’ ss (as opposed to SRSF1). PPY binding protein (PTB) inhibits splicing by competing with U2AF for binding to the PPY (76).

It is important to keep in mind that the regulation of splicing through these proteins is in itself a complex process and takes place in the complex microenvironment of the cell nucleus. At one specific time-point, there are numerous stimulatory and inhibitory processes affecting the splicing machinery simultaneously. Therefore the relative changes in concentration of different splicing factors can govern alternative splice site selection, leading to temporal, tissue-specific or cell-specific patterns (reviewed in (67)).

Regulatory elements

There are several different sequence elements in the pre-mRNA, onto which splicing factors can bind and thereby regulate splicing.
They are named after their location and effect; elements in exons are thus named exonic splicing silencers (ESS) or enhancers (ESE), and elements in introns are named intronic splicing silencers (ISS) or enhancers (ISE). Silencer elements are more common in introns than exons, while the opposite is true for enhancer elements. Splicing enhancer elements are typically bound by SR proteins while splicing silencer elements are bound by hnRNP proteins, but this is not universally true (76). SR proteins binding to an ESE enhance spliceosome assembly and splice site recognition. Different ESEs are recognized by specific regulatory SR proteins (77). A common feature of splicing signals and regulatory elements is that they are often short and degenerate, which can lead to elegant solutions that further diversify the proteome or, in the worst case, to disease (reviewed in (78)).

Disease

A complex physiological process involving many essential factors and mechanistic features is also vulnerable to error. As a result, a number of different pathologies have been reported as a consequence of disturbed splicing patterns. Mutations in cis-elements as well as in trans-acting factors, such as splicing factors and other regulatory proteins, can lead to disease or affect disease susceptibility or severity (79). One of the best-characterized examples of exonic mutations affecting splicing is spinal muscular atrophy (SMA), the most common genetic cause of infant mortality. The Survival of motor neuron (SMN) protein is encoded by two genes, SMN1 and SMN2, which are almost identical. SMN is an essential protein in splicing required for tri-snRNP assembly, and a deletion of SMN1 leads to the degeneration of motor neurons resulting in paralysis. SMN2 cannot substitute for SMN1 due to a single point mutation that causes exon 7 skipping and leads to accumulation of a truncated and inactive SMN protein (80, 81).
In frontotemporal dementia and Parkinsonism linked to chromosome 17 (FTDP-17), the ratio between two alternatively spliced isoforms of the tau protein is disrupted. Several mutations can interfere with the splicing regulation of exon 10 of the gene encoding tau, and the change in balance of the isoforms leads to a pathological aggregation of tau protein (82). The activation of a cryptic 5’ ss within the LMNA gene leads to a truncation of exon 11 in the nuclear membrane protein lamin A. This is the most common cause of Hutchinson-Gilford progeria syndrome (HGPS). The resulting truncated protein lacks a proteolytic cleavage site required for functional lamin A production, leading to a perturbed accumulation of the protein affecting overall gene expression. This results in a premature and rapid aging of several tissues, ending with death at the mean age of 13 years (83). A member of the hnRNP family, TDP-43, is implicated in several different diseases such as cystic fibrosis (CF), amyotrophic lateral sclerosis (ALS), and frontotemporal dementia. In CF, malfunctioning TDP-43 binds repetitive elements within the CF transmembrane conductance regulator gene (CFTR), promoting exon skipping and decreasing expression of the functional protein. In ALS, TDP-43 is found in ubiquitinated protein aggregates in neurons and glial cells (84). Cancer tissue is often enriched in aberrant splice variants. This can be the result of changes in the nuclear microenvironment as well as actual mutations in oncogenes and tumor suppressors (79). The cause-effect relationship can be difficult to establish, but some mutations have been well characterized. For example, BRCA1 is a tumor suppressor gene involved in DNA repair. An inherited mutation within an ESE in exon 18 results in the exclusion of this exon from the RNA transcript and increases the risk of developing breast cancer (85).
Adenovirus

Adenoviruses have been very useful as model systems for the study of mechanisms controlling gene expression in eukaryotic cells. For example, the concept of split genes and pre-mRNA splicing was discovered by the Roberts and Sharp labs in 1977 working in the adenovirus system (86–88). Several adenoviruses are oncogenic and can transform cells in vitro into tumorigenic cell lines. While they are not known to cause cancer in humans, members of species A adenoviruses can cause cancer in rodents and rabbits (89).

Genome

The family Adenoviridae consists of five genera. Mastadenoviruses originate from mammals and aviadenoviruses from birds, whereas atadenoviruses and siadenoviruses have a wider host spectrum. The Atadenovirus genus was named for its members' high AT nucleotide content and includes viruses that target ruminant, avian, reptilian and marsupial hosts (90). The Siadenovirus genus gathers frog and bird viruses with the common denominator that they encode a putative sialidase or sialidase-like gene. The first fish adenovirus has been assigned to a fifth genus: Ichtadenovirus (91–93). Interestingly, several turtle adenoviruses have recently been isolated and sequenced, and due to the large evolutionary distance a sixth adenovirus genus has been proposed: Testadenovirus (94). Adenovirus infections are generally not zoonotic, but asymptomatic infections across species barriers have been described. For example, antibodies against HAdV-12 have been found in simians and, conversely, antibodies against simian, bovine and canine adenoviruses have been detected in humans (95). About 100 adenovirus types have been characterized so far, infecting humans or other mammals, birds, reptiles, amphibians and fish. Of these, 67 infect humans and have been further divided into seven species: A to G (table 1) (91, 92, 96, 97).
This classification is based on virus oncogenic potential, immunological characteristics, genome homology and their ability to agglutinate red blood cells (98). It can be noted that types 1-51 were characterized by serological methods (hence called serotypes from the beginning), while types 52-67 have been identified by genomic sequencing and phylogenetic tools (97). The sequence homology within the species ranges from 48% (A) to over 99% (C), with less than 20% homology between species (99).

Table 2. Classification of human adenoviruses.

Species | HG | Types | Tumorigenic in animals | Transformation of cells | Disease
--------|----|-------|------------------------|-------------------------|--------
HAdV-A | IV | 12, 18, 31, 61 | High | Positive | Enteric infection
HAdV-B | I | 3, 7, 11, 14, 16, 21, 34-35, 50, 55, 66 | Moderate | Positive | Conjunctivitis, acute respiratory disease, hemorrhagic cystitis, CNS
HAdV-C | II | 1-2, 5-6, 57 | Low or none | Positive | Endemic infection, respiratory symptoms
HAdV-D | III | 8-10, 13, 15, 17, 19-20, 22-30, 32-33, 36-39, 42-49, 51, 53-54, 56, 58-60, 62-65, 67 | Low or none | Positive | Keratoconjunctivitis in immunocompromised patients
HAdV-E | III | 4 | Low or none | Positive | Conjunctivitis, acute respiratory disease
HAdV-F | III | 40-41 | Negative | Unknown | Infantile diarrhea
HAdV-G | | 52 | Unknown | Unknown | Gastroenteritis

HG = Hemagglutination groups: (I) complete for monkey erythrocytes (II) partial for rat erythrocytes (III) complete for rat erythrocytes (IV) little or none.

Genome organization

The viruses in the Adenoviridae family have linear double-stranded DNA genomes, ranging from 26 to 45 kbp. The viral genome is organized into eight transcription units, which produce around 40 proteins due to an extensive use of alternative splicing and polyadenylation (Figure 5) (89, 100, 101). During the early phase of infection viral regulatory proteins are produced. The late phase commences after the onset of DNA replication and leads to accumulation of adenovirus structural proteins.
The division into an early and a late phase contributes to the efficiency of the viral life cycle by ensuring a sizeable pool of newly replicated genomes before packaging into capsids (102). This also allows time for the early, regulatory proteins to hijack the cellular pathways before replication, thus streamlining the production of progeny virus (103). In addition to early (E1A, E1B, E2, E3, E4) and late genes (L1-L5), there are two intermediately expressed genes: pIX and IVa2 (104). IVa2 binds the MLP to stimulate transcription (105) and has been shown to be crucial for the packaging of viral DNA into the capsid (106). The pIX protein acts as a transcriptional activator and as a cement protein stabilizing the viral capsid (107). Both viral DNA strands contain open reading frames (ORFs) and are templates for transcription by the cellular RNA pol II. The terminal protein (TP) is covalently attached to the 5’ end of the genome and functions as a primer protein during the initiation of viral DNA replication (108). At each end of the genome there are inverted terminal repeats (ITRs), functioning as origins for viral genome replication (98). Two adenovirus genes are transcribed by RNA pol III to produce two virus-associated RNAs, known as VA RNAI and VA RNAII. These small non-coding RNAs, ca. 160 nt in length, form imperfect stem-loop structures and accumulate to massive amounts during the late phase of infection (10⁸ copies per cell) (109, 110). During the infection viral dsRNA is produced by symmetrical transcription of the genome. These molecules are recognized by the cellular protein kinase R (PKR), which can phosphorylate the eukaryotic initiation factor 2 alpha (eIF2α), which in turn blocks the cellular translation machinery. The high levels of VA RNAI produced in the infected cell compete with the dsRNAs for binding to PKR and thereby block PKR activation and, concomitantly, eIF2α phosphorylation.
Thus, VA RNAI ensures an efficient viral protein synthesis in infected cells (111).

Figure 5. Schematic map of the human adenovirus 5 genome. Grey arrows represent early genes, light grey arrows represent late genes, white arrows represent intermediate genes.

Evolution and phylogeny

There are at least four major genome organization patterns, which match the classification into the different genera (112). The internal part of the genome is the most conserved, containing the genes for structural proteins and enzymes. The ends of the genome, however, are more variable and also account for the difference in genome size between genera (113). Aviadenovirus genomes reach up to 45 kbp, and mastadenovirus genomes come second with a range of 31-36 kbp. The smallest genomes are found in atadenoviruses (29-33 kbp) and siadenoviruses (26 kbp), due to the high AT content that leads to overlapping genes and thus a more compacted genome (113). Among the mastadenoviruses, E1 and E4 are located at either end of the genome, and E2 and E3 internally (114). In atadenoviruses, the position of E1A is taken up by a structural protein, p32K, and in siadenoviruses it is the place for the sialidase gene (113). IVa2 is found among all types investigated from the different genera, as is the E2 region containing the DBP, pTP and the Adpol (113). The E3 region is only found among mastadenoviruses, and its immune response-modulating genes can be deleted or replaced without affecting replication, a fact often used when constructing adenovirus vectors (113). The E4 region is only conserved among mastadenoviruses. The late region is well conserved across the genera. All late genes are coupled to the MLP, which is also conserved (114–116). Twelve late genes are fully conserved within the family, namely 52,55K, pIIIa, III, pVII, pX, pVI, hexon, protease, 100K, 33K, pVIII and fiber (113).
The fiber gene is a special case; most adenoviruses have only one fiber protein, but aviadenoviruses have two fibers per vertex. They can be encoded from one or two genes created by duplication. The mastadenovirus species F, HAdV-40 and 41, have two fiber genes but only one protrusion per vertex, thereby alternating between a long and a short fiber protein (113). The VA RNA genes are only found in human, chimpanzee and some monkey adenoviruses. The promoter elements of the VA RNA genes are homologous to tRNA genes, suggesting that tRNA pseudogenes may have been captured by retrotransposable elements, thus giving birth to the VA RNAs. Both VA RNAI and II from human adenovirus species B, D, E and chimpanzee are evolutionarily close, indicating a gene duplication arising before the segregation of these species, probably in a hominoid ancestor (109, 117, 118). Viruses within species A contain a single VA RNA gene, which is most closely related to that of monkey adenoviruses. Strains within human species C are unique in having two VA RNA genes that do not seem to be closely related: VA RNAI is monkey-like and VA RNAII is chimp-like, indicating a recombination between a human strain and a macaque virus, SAV13 (117). Benkö and Harrach postulate that the five genera correspond to the five major vertebrate classes (113). The fact that lower vertebrate classes, such as amphibians and fish, have their own adenovirus genera suggests that this virus family existed before divergence of bony fish from other vertebrate species, approximately 450 million years ago (119). The ancestral adenovirus is speculated to have been very small, with only the most critical information for replication and survival. In all likelihood, the evolution of adenoviruses used the same mechanisms suggested for other large dsDNA viruses including recombination, gene capture and duplication (113). Gene capture may have happened both from host chromosomes and from simultaneous infections with different viruses.
For example, the capture of a VA RNA gene is estimated to have taken place between 5.5 and 23 million years ago: the point of divergence of chimps, old world monkeys and humans (the only species with VA RNA genes) (113, 117). Recombination is quite common in adenoviruses and seems to be especially utilized when it comes to the fiber gene. Gene duplication is exemplified by the previously discussed VA RNA genes in HAdV species B, D or E, but also by the fiber in some strains (113). The bacteriophage PRD1, within the family Tectiviridae, shares some obvious features with adenoviruses. They both have an icosahedral capsid with fiber protrusions and a unique symmetry, the genome is dsDNA with ITRs of the same length, and they contain genes for a DNA pol and a terminal protein (pTP). The latter is used as a protein primer in genome replication, a strategy shared only by these two viruses and an additional phage. Structural studies on the hexon protein of HAdV-2 and corresponding P3 from PRD1 reveal a common feature of two β-barrels (or “jelly rolls”) without any amino acid sequence homology. The hexon proteins are also both trimers in need of chaperones to assemble correctly (120–122). Adenoviruses isolated from great apes are phylogenetically related to human adenoviruses in species B, C, and E, consistent with the possibility of interspecies spread at some point in time; however, frequent interspecies infections are unlikely (95). Seroprevalence to HAdV-5 in neonates is very high in some parts of the world (close to 90 % in Africa, Thailand, India, China, Brazil) while dropping to around 50% in Europe and the United States. On the other hand, seroprevalence for more unusual types such as Ad35, ranges from less than 10% in Europe and the US, to 15% in Japan and Thailand to up to 20% in Sub-Saharan Africa (95). 
Pathology and disease

Adenoviruses are spread from person to person through contaminated water or fomites, most commonly via the fecal/oral route in young children but also via the respiratory and ocular systems (95). The virus is contagious for up to three weeks at room temperature. The virions are relatively stable and withstand low pH and gastric secretions, allowing the virus to replicate within the gut to a high viral load (123). Severe adenovirus infections occur only under predisposing conditions such as a compromised immune system (AIDS, cancer treatment, organ transplantation treatment), simultaneous infections, old age or individual susceptibility (14). The adenovirus-infected cell degenerates in specific ways and the nucleus swells with inclusion bodies. Viral toxins are uncommon but the fiber protein is directly toxic to cells and has been found in the blood of fatal cases of Ad pneumonia (124).

Acute respiratory disease (ARD) is a frequent manifestation of adenovirus infection, especially in children, and is most commonly caused by HAdV-1, -2, -5 and -6. The symptoms include rhinitis, cough, fever, myalgia and headache, and can sometimes not be distinguished from other viral respiratory infections such as influenza (95, 125). Severe lesions within bronchioli and alveoli can be found in adenovirus pulmonary syndromes. Hypertrophy in lymphatic tissue can also be associated with adenovirus infections (95). Mild infections of the conjunctiva are normally caused by HAdV-3 and -7, but the more serious epidemic keratoconjunctivitis (EKC) is caused by HAdV-8, -19 or -37. EKC is epidemic in certain parts of Asia, such as Japan, Vietnam and Taiwan, and can lead to severe lesions of conjunctival tissue (95, 126). Most HAdV types replicate in the gastrointestinal tract, but HAdV-40 and -41 can cause disease within the intestines.
Gastrointestinal adenoviral infections are mainly seen in children under the age of four, and are not as common as rotavirus infection (95). HAdV-B can cause urinary tract infections, which suggests that the adenoviral spread has been viremic at some stage in order to reach the urinary bladder (95). HAdV-11 and -21 are associated with severe hematuria and various other types have been found in the urine during systemic adenoviral infections (127, 128). Infections in other organs can sometimes occur, such as in the pancreas or the central nervous system (129, 130). In immunocompromised and/or hepatic transplant patients, hepatitis caused by HAdV-1, -2, -4, -5 and -6 has been reported (131, 132). The infection is either acquired de novo or by activation of a latent virus population. The outcome of these cases is poor, due to the intake of immunosuppressive drugs.

Therapy and prevention

Adenoviruses were first discovered in 1953 when Rowe and associates isolated adenovirus from human adenoid tissue while on the hunt for a single agent responsible for the “common cold”, or ARD (133). Epidemics of adenovirus-caused ARD were recorded as early as World War II: ARD occurred in up to 80% of new military recruits, with 20-40% hospitalized, but did not affect more experienced personnel or civilians (134). The high rates of infection and hospitalization led to the production of a vaccine against HAdV-4 and -7, the types responsible for infection in the US army (125). The vaccine consisted of live, microencapsulated virus given orally, which bypasses the respiratory tract and probably leads to a mild infection that immunizes the host. However, the manufacturer ceased production in the late 20th century, and the incidence of adenovirus-caused ARD rose to levels seen before the vaccination program was initiated.
In 2004 an initial study of a new adenovirus vaccine showed efficacy and safety and the vaccination program has been implemented again in the US military (135). The vaccine is not administered to civilians (95). No specific antiviral therapy against adenovirus infections exists today; only palliative care can be administered and preventive measures taken, such as basic hygiene. Intravenous ribavirin has been tested in isolated cases and could possibly be effective in localized infections such as cystitis, but has no effect on disseminated adenovirus infections (99).

Latency and persistence

The lytic replicative cycle of adenoviral infections has been extensively studied since the discovery in the 1950s, but the details of a long-term latent/persistent infection are still unknown. If the virus is not cleared after the initial replicative phase, it can persist in a dormant state in specific cell populations within the host. Virus amplification can then be reactivated under certain conditions. Recently, studies have shown that HAdV-1, -2 and -5 are common in human adenoids and can be reactivated in immunocompromised patients with the potential to cause serious illnesses (123, 136). The rate of adenovirus infections depends on various factors such as patient age (pediatric patients are around three times more likely to be infected than adults), type of immunosuppression or background disease. For bone marrow or hematopoietic stem cell transplants, 5-47% (mean value 18%) of patients were infected, with a mortality rate of 2-70% (mean value 27%) (123). For solid organ transplants, the incidence of adenovirus infection in pediatric patients spans from 4 to 10%, with mortality rates as high as 53%. In renal transplantations as many as 76% of patients were diagnosed with adenovirus infections (137) and the mortality rate reaches 17% (123). The risk of adenovirus infection in patients with AIDS is 28% after the first year.
The large group of species D adenoviruses is associated with AIDS patients; most cases of adenovirus infections in AIDS patients are due to HAdV-D and many species D types have been isolated from this patient group as well. The potential for long-term co-infections with other adenovirus types in AIDS patients may provide a susceptible environment for recombination and thus creation of new types (123).

Adenoviruses as vectors

Recombinant adenoviruses have been the most prevalent vectors used in clinical trials. Approximately 23% (n = 438) of all gene therapy clinical trials use adenovirus as the delivery vector (138). The great majority of these are aimed at treating cancer, while a smaller part of the clinical trials use adenovirus vectors for the treatment of cardiovascular disease or for the development of vaccines against other infectious agents (139). The rationale for using adenoviruses as vectors is based on their large packaging size, easy production and manipulation, broad tropism, and efficient transduction of both dividing and non-dividing cells. However, the transduction has been shown to be limited due to strong immune responses, pre-existing neutralizing antibodies to many strains, and inefficient targeting of the vector to appropriate tissues and cells because of promiscuous binding to many receptors (139, 140). In order to reduce the immunogenic response, vectors have been “gutted” of the immune response genes (E3), covered with polyethylene glycol or simply replaced with a more uncommon type of adenovirus. The inefficient targeting of vectors is mainly caused by binding to other molecules, which directs the vector to the liver where it is degraded. Well-designed mutations of the E1A or E1B genes lead to a conditionally replicating virus targeting cancer cells lacking the Rb or p53-controlled tumor suppressor pathways, thus leaving normal cells intact (140, 141).
For oncolytic therapies, the results are mixed: although adenoviruses are well tolerated and safe, only for a fraction of patients do they increase survival slightly, shrink solid tumors or stabilize the progression of the cancer (142). The promise and the challenges of the use of adenoviruses as vectors clearly demonstrate the need for further research into both molecular mechanisms and clinical matters. Of particular relevance to this thesis, the challenge of pronounced residual activity of the late genes, which leads to a low persistence of transgene expression, needs to be addressed (143–146). Therefore, a better understanding of how late gene expression is controlled during a lytic infection is of great importance for the development of new and improved adenoviral gene therapy delivery vehicles.

The viral life cycle

Virus entry

The adenovirus capsid is icosahedral, around 70-100 nm in diameter, with protruding fibers at the twelve corners of the capsid. The traditional view is that the fiber contacts the cellular coxsackievirus and adenovirus receptor (CAR), while the penton, at the base of the fiber, interacts with a cellular integrin. These concerted interactions lead to the endosomal uptake of the virus particle into the cells (reviewed in (98)). However, there are some problems with this picture; CAR is poorly expressed in lung tissue and on the apical side of the epithelial cells where the adenovirus is thought to initiate infection (139, 147). CAR forms homodimers on the basolateral sides of the tight junctions. The fiber is overproduced in the infected epithelium, and when secreted it binds to the CAR molecules, thus breaking up the homodimers and the cell-cell adhesion and giving the newly formed adenovirus capsids an easy way out (147). This may well be the main function of the CAR receptor in adenoviral infection.
On the other hand, in a wild type infection the sheer number of produced virions may be enough to find any ruptures of the tight junctions between the epithelial cells in order to be able to bind CAR and enter the cells. Besides CAR, there is a plethora of cellular receptors, like sialic acid, CD46, MHC1-α2 and a heparan sulfate proteoglycan, that might be used in order for the virus to be taken up by the cell (139). While inside the endosome, the acidic environment induces conformational changes which lead to the stepwise release of virus particle components (148). The release of viral polypeptide VI from the interior of the virus particle triggers the degradation of the endosomal membrane and the partly dismantled capsid is released into the cytoplasm (149) and further delivered to the nuclear membrane via an active transport on microtubuli (150). At the nuclear membrane, the viral DNA is imported into the nucleus via nuclear pores with the help of viral protein pVII (151).

The early phase

The early phase of the viral gene expression is devoted to producing proteins needed for reprogramming of the cell to favor virus replication. The E1A transcription unit is the first to be expressed and is referred to as an immediate-early transcription unit. It does not require expression of other viral proteins but is in itself required for other viral genes to be transcribed (152, 153). The E1A mRNA accumulation is temporally regulated at the level of alternative splicing during the infectious cycle. Early in infection two major mRNAs, 13S and 12S, are produced, while the minor 9S mRNA is preferentially expressed at the late phase of infection. In addition to these mRNAs, two minor isoforms, 11S and 10S, are produced predominantly during the late phase of infection (101, 154, 155).
The proteins translated from the two major E1A mRNAs function as transcriptional regulators that activate expression of other viral transcription units as well as regulating transcription of some cellular genes. The E1B transcription unit codes for a protein of 55 kDa in size, which serves a number of functions during infection. For example, together with E4-ORF6 the E1B-55K protein has a function in degradation of cellular proteins, including p53, and the transport of late viral mRNAs (156). The concerted expression of both E1A and E1B genes is needed for replication to take place. E1A binds pRB, thus freeing the E2F transcription factor which activates transcription of genes required for driving cells into the S phase (157). E1A also enhances accumulation of the tumor suppressor protein p53, which triggers cellular defense systems leading to apoptosis. This pro-apoptotic activity of E1A is counter-productive for the virus amplification. Therefore, the viral E1B-55K/E4-ORF6 complex causes proteasomal degradation of the p53 protein to ensure an efficient virus replication in infected cells (158). The E4 unit codes for several regulatory proteins referred to as E4-ORFs (E4 open reading frames). The E4-ORF4 protein interacts with the cellular protein phosphatase 2A. This interaction can cause dephosphorylation of transcription factors (e.g. E1A, AP-1) and SR proteins (e.g. SRSF1, SRSF9), thus inducing the shift from early to late phase (159–162). E4-ORF3 and E4-ORF6 both stimulate the accumulation of spliced late mRNAs, block the DNA repair system from degrading the viral genome and interact with E1B-55K to degrade the p53 protein (162, 163). The adenovirus E3 genes regulate the immune response of the host. E3-gp19K blocks transport of MHC class I antigens to the cell surface, thereby inhibiting cytotoxic T-lymphocyte recognition (164).
The adenovirus death protein (E3-11.6K) is pro-apoptotic and at the late stage of virus infection is transcribed from the major late promoter. A high level of E3-11.6K production is crucial for the release of mature progeny viruses (165). Finally, the E2 unit encodes proteins needed for viral DNA replication (see below) and is the last of the early transcription units to be activated (89, 101, 166).

Viral DNA replication

The distinction between the early and the late phase of gene expression is defined by the onset of viral DNA replication. The major site of initial replication is the oropharynx, as expected from the initial findings of adenovirus within the tonsils. However, experiments have shown that replication is optimal within the respiratory epithelium (98). Three viral proteins encoded from the E2 transcription unit are involved in adenovirus DNA replication: DNA polymerase (Adpol), the precursor terminal protein (pTP) and the single-stranded DNA binding protein (E2A-72K) (167). In addition, three cellular proteins (NFI-III) are also needed (168). The replication starts when the E2 proteins have accumulated and the cell has entered the S phase. The cellular factors NFI and NFIII bind to Adpol and pTP, respectively, and recruit them to the core origin of replication through specific interactions with cis-elements within the ITRs of the genome (169). This pre-initiation complex can assemble at either end of the genome, and then replication is initiated by the covalent attachment of the terminal nucleotide of viral DNA to the pTP protein. The free 3’ hydroxyl group of the pTP-nucleotide then primes the synthesis of nascent DNA by Adpol in a polarized fashion from 5’ to 3’. The E2A-72K protein coats the second strand, thus enabling Adpol to elongate the growing strand to full-length viral DNA (168).
The displaced strand can reanneal to itself through its ITRs, forming a short duplex identical to that of the dsDNA genome and thereby starting a second round of replication (170). The viral DNA replication takes place at defined sites in the nucleus, the replication centers, which can be monitored by immunostaining of infected cells with an E2-72K antibody. At the start of replication, viral replication centers are visible as small dots in the nucleoplasm; as infection proceeds, however, they expand in number and size until they fill the entire nucleus (171–174). Viral transcription occurs simultaneously with DNA replication, and is thought to take place in ring-like structures surrounding the replication centers (171, 173, 175–179). Splicing factors such as SRSF1 and snRNPs that are normally found in IGCs are relocalized to the ring-like structures of viral transcription, indicating that these are also sites of viral RNA splicing (171–174, 180, 181). Interestingly, some splicing factors are displaced from these sites later in infection. The nuclear localization of the virus-encoded alternative splicing factor L4-33K is discussed in paper I.

The late phase

With the exception of pIX, all the structural proteins needed for virus assembly are encoded from the MLTU. The pre-mRNA produced from the MLTU is around 28 kb. From this single transcript 20 different mRNAs are produced by alternative 3’ ss selection and polyadenylation (100). The late mRNAs are divided into five families, L1-L5, depending on their poly(A) site choice (182). Early in infection the major fraction of RNA pol II initiates transcription at the MLP and terminates gradually after the L1 polyadenylation site. Only a minor proportion of the RNA pol II continues past the L3 poly(A) site. After the onset of DNA replication, but before late viral protein synthesis has commenced, the L4 poly(A) site is activated (183–185), and later the transcription proceeds to the L5 poly(A) site (166, 186).
All mRNAs expressed from the MLTU share a tripartite leader sequence at their 5’-end. This leader sequence is important for efficient translation of the late structural proteins. The leader consists of three small exons, where the 5’ ss of the third exon is joined to the alternative 3’ ss forming the individual late viral mRNAs. The tripartite leader may also contain a small element called the i-leader, which is present between exon two and three in some mRNAs during the early and intermediate phase of infection. The function of this exon is as yet unknown (101).

Virus assembly and release

Virus particles are assembled in the nucleoplasm. Thus the structural proteins must be imported into the nucleus after their translation in the cytoplasm. The individual capsid proteins are oligomerized by several different mechanisms (74), and the empty procapsids are assembled. The viral genome is packaged in a polar fashion through one of the open vertexes in the capsid, probably through interactions between the capsid and viral proteins binding the AT-rich packaging sequences of the genome. Viral proteins implicated in this process are IVa2, which is positioned at the vertex where the viral DNA enters the procapsid, L1-52,55K, L4-22K and L4-33K (187, 188). Finally, the proteolytic cleavage of several precursor proteins by the L3 protease matures the capsid into an infectious particle. As the virions accumulate in the late phase, several viral proteins (most prominently the adenoviral death protein, E3-11.6K) are involved in triggering the lysis of the infected cell in order to release the infectious progeny (187).

The MLP

The core MLP contains an initiator sequence (Inr) and a consensus TATA-box along with two additional cis-activating upstream elements: the CAAT box and the upstream promoter element (UPE). The latter two are functionally redundant, while the TATA-box seems to be the most crucial element for initiating transcription from the MLP (22, 189, 190).
The MLP TATA-box conforms neatly to the canonical TATA sequence found in many vertebrate promoter sequences (22). The general transcription factor TBP binds efficiently to the MLP TATA-box (191, 192), which is needed for transcription initiation from the MLP. There is a complex interplay between the players of the transcription machinery and the cis-elements of the promoter, which could help explain the general robustness of the MLP. Adenoviruses with single or even multiple point mutations in the UPE, TATA or CAAT box do not show any significant transcriptional deficiencies in vivo, unless several MLP elements are mutated simultaneously (190, 193). Also, the Inr element is dispensable if the TATA-box or other DNA elements are functional (22). As described above (chapter Transcription, section Initiation), the TATA-box and Inr are bound by TFIID, thus stimulating pre-initiation complex formation. TFII-I binds both the Inr and the UPE (194), and USF can also bind the Inr, thus stimulating transcription (195). Furthermore, there are protein-protein interactions between TFII-I and USF (196). There is also communication between the TBP-associated factors (TAFs) in the TFIID complex and the Inr, which is important for correct function (21, 197, 198). CP1 binds to the CAAT box, and the upstream stimulatory factor (USF) binds the UPE. The TATA-box is surrounded by highly GC-rich sequences, which are required for high levels of transcription (199, 200). The exact function of this peculiar sequence arrangement is not known, but it has been speculated that the GC-rich sequences could create a specific DNA structure scaffold for the PIC (22) or alternatively function as recognition sites for cellular transcription factors such as TBP, TFIIB (201–203), Sp1 and Maz (199). It can also be noted that the GC-rich sequences are lacking in several non-primate adenoviral MLPs (114).
The late-specific activation of the MLP also requires the so-called downstream element located within the first intron (204–206). There are two elements, DE1 and DE2b/a, which specifically bind factors present in adenovirus late infected cells. The so-called DEF-B factor binds DE2b and consists of a homodimer of the IVa2 protein. DEF-A on the other hand binds both to DE1 and DE2a, and has been proposed to consist of a heterodimer of IVa2 and another viral protein (207). One research group has suggested that the unknown protein of DEF-A is the adenoviral L4-33K (see section “The L4 unit” below) (208). However, more convincing evidence suggests that the L4-22K protein is the second component of DEF-A (209–213). The interaction of IVa2 with the downstream elements has been suggested to be needed for the efficient activation of the MLP specifically at late times during a virus infection (105).

The L1 model

The regulation of L1 alternative splicing is an important model system for the study of the temporal shift in 3’ ss selection. The L1 unit produces two mRNAs, the 52,55K and the IIIa (Figure 6A). The 52,55K mRNA is the exclusive L1 mRNA produced during the early phase of infection. During the late phase the IIIa 3’ ss is also activated, which leads to the production of both L1 mRNAs (100). The selection of the IIIa 3’ ss depends on two cis-elements in the pre-mRNA: the IIIa repressor element (3RE) and the IIIa virus infection-dependent splicing enhancer (3VDE). These elements are located in the intron upstream of the IIIa 3’ ss (Figure 6B). Highly phosphorylated SR proteins bind to the 3RE during the early phase of infection. This blocks U2 snRNP recruitment to the branch point sequence, thereby inhibiting the use of the IIIa 3’ ss for spliceosome assembly (214). To relieve this inhibitory effect on IIIa 3’ ss selection, the E4-ORF4 protein associates with the cellular protein phosphatase IIA (PP2A) and induces a dephosphorylation of SR proteins (160).
The 3VDE has been shown to be the major cis-acting element controlling IIIa 3’ ss usage in adenovirus infected nuclear extracts (Ad-NE). It contains a weak PPY, which has a low affinity for binding the cellular splicing factor U2AF. The cellular IgM transcript shares the characteristic of a weak PPY, but can actually proceed through the first step of splicing in U2AF-depleted Ad-NE, which is a strong indication that a viral factor somehow replaces U2AF late during infection (215). This factor has been named 3VDF and is suggested to contain the adenoviral L4-33K protein (185) together with a yet to be discovered cellular factor. L4-33K has also been shown to play a role in the early to late switch of adenoviral gene expression during infection, which further supports this hypothesis (216). There are additional viral proteins that have been shown to play a role in the temporal shift in MLTU alternative splicing. The early proteins E4-ORF3 and E4-ORF6 stimulate tripartite leader splicing by inducing i-leader exon inclusion and i-leader exon skipping, respectively (163). Whether these activities are essential for the regulated expression of the MLTU mRNAs is not known.

Figure 6. The L1 transcription unit. A) The temporal splicing pattern of the L1 pre-mRNA. B) The working model for L1 alternative splicing. In the late phase, hypophosphorylated SR proteins are released from the 3RE and the 3VDF recruits U2 snRNP to the branch point, thus activating IIIa 3’ ss usage.

The L4 unit

The L4 unit contributes to the temporal control of adenoviral gene expression. It encodes five proteins: pVIII, two isoforms of L4-100K, L4-33K and L4-22K. pVIII is a structural protein, while the L4-100K protein aids in the translation of viral mRNAs (217) and the assembly of viral hexon trimers (218). The L4-22K and L4-33K proteins share the same 3’ ss, which is joined to the 5’ ss of the third tripartite leader exon.
L4-33K is spliced at an additional intron, leading to a frame-shift in the amino acid sequence (Figure 7A). Thus, the two proteins share the first 105 amino acids, but their C-termini are unique. Both L4-33K and L4-22K have been implicated in binding to the core sequences of the viral DNA packaging domain (187, 188, 208–210), which suggests that this function might be connected with the N-terminus.

L4-33K is a nuclear phosphoprotein of 227 amino acids (HAdV-5), with a highly conserved C-terminal end. This part includes an intron designated “ds” which is rarely used. The ds region has been shown to be critical for both the function (185) and localization of the protein (see paper I). The 27 amino acids of the ds region comprise three RS- and one SR-repeat (Figure 7B). As mentioned before, these RS/SR-repeats are important for the function of SR proteins, suggesting that L4-33K might indeed be similar in structure to this class of splicing factors. However, L4-33K is not a classical SR protein, since a classification into this family of proteins requires a domain of at least 50 amino acids with more than 40% RS content (65). Also, L4-33K differs from the classical SR proteins in that it cannot complement cytoplasmic S100 extracts in an in vitro splicing assay. Further, L4-33K is not phosphorylated by the SR protein-specific kinases Clk/Sty or SRPK1 (219).

A distinctive feature of L4-22K is its ability to stimulate transcription. It has been shown to activate expression from the MLP, and maximal transcriptional activation of the MLP by L4-22K requires the presence of the DE element (220). In addition, the L4-22K protein can activate other viral promoters such as the pIX and L4 promoters, which makes it an important part of the early to late switch of adenoviral gene expression (220, 221).

Figure 7. A) Organization of the L4 region of the MLTU. Boxes represent ORFs, and diagonal lines show which parts are excised from the RNA.
B) Schematic picture of the L4-33K protein. The C-terminus is evolutionarily conserved and contains the tiny RS-repeat important for both function and localization (ds).

Present investigation

Paper I

Serine 192 in the tiny RS-repeat of the adenoviral L4-33K protein is essential for nuclear localization and reorganization of the protein to the periphery of viral replication centers

The L1 pre-mRNA transcribed from the MLTU has one 5’ ss and two 3’ ss, which are used differentially during the virus life cycle. Early in infection, the proximal 3’ ss is selected, producing only the 52,55K mRNA. In the late phase there is an effective use of the distal 3’ ss too, leading to the accumulation of IIIa mRNA. The L4-33K protein, which starts to be expressed at intermediate times, increases the use of the distal 3’ ss, thus taking part in the switch from early to late phase. L4-33K has a tiny RS-repeat, which is closely connected to its function. This feature can also be found among the members of the SR family of splicing factors, which are essential splicing regulators. The function of SR proteins is regulated by reversible phosphorylation of the RS-repeats (69–71). However, L4-33K is not classified as an SR protein since it does not fulfill the formal requirements (65). Our group has previously shown that the tiny RS-repeat of L4-33K is crucial for the function of the protein as an alternative RNA splicing regulator (185). The RS-domain in the cellular SR family of splicing factors has been shown to govern the subcellular localization of the SR proteins within the nucleus, where the level of phosphorylation is closely connected to their presence in the IGCs (splicing factor compartments) (71, 222–224). Here we further investigated the role of the tiny RS-repeat in the subcellular localization of the L4-33K protein.
L4-33K does not localize to nuclear speckles

When we transfected a plasmid expressing a FLAG-tagged L4-33K protein into HEK293 cells, and subsequently immunostained for FLAG and SRSF2, we found that L4-33K does not co-localize with the nuclear IGCs. In fact, L4-33K was evenly distributed in the nucleus with considerable enrichment in the nuclear membrane. The protein was also present in small cytoplasmic spots in proximity to the nucleus, but the nature of these structures was not further investigated. The distribution of L4-33K within the nuclear membrane closely resembled that of lamins (nuclear matrix proteins), and when transfected cells were co-stained for L4-33K and lamin B some co-localization was evident. Interestingly, the nuclear lamina tethers transcription factors and DNA to the nuclear membrane. The related L4-22K protein also localized to the nucleus but did not co-localize with the nuclear membrane, which implies that the domain responsible for membrane attachment is confined to the C-terminus of L4-33K.

Serines in the tiny RS-repeat are important for nuclear localization

The L4-22K and L4-33K proteins are closely related and share the 105 N-terminal amino acids. Since the two proteins both localize exclusively to the nucleus, we decided to sequentially delete parts of the N-terminus of L4-33K to investigate whether there was a common nuclear localization signal within this region. For this experiment, we transfected L4-33K deletion mutants into HEK293 cells and investigated the sub-cellular distribution using immunofluorescence and fractionation methods. None of the truncated proteins displayed an altered nuclear localization, indicating that both proteins must have their own localization signal within their unique C-terminal ends. The tiny RS-repeat has previously been shown to be essential for the L4-33K splicing enhancer function (185).
Since there is a correlation between RS-domains and nuclear localization, we decided to investigate whether this was also the case for L4-33K. Using deletion and point mutants of L4-33K, we observed that a deletion taking out the tiny RS-repeat resulted in a loss of the exclusive nuclear distribution of the L4-33K protein. Mutating the serine residues individually demonstrated that serine 192 was crucial for nuclear localization, whereas mutations affecting the other serines did not change the nuclear localization pattern. Interestingly, these results correlate well with our previous results from in vitro splicing assays with the same mutant proteins (185). Our data suggest a tight connection between the splicing enhancer function of L4-33K and its sub-cellular localization, which is a hallmark of several SR proteins (71, 222–224). This further strengthens the notion that L4-33K shows similarity to the SR protein family of splicing factors.

Splicing defective L4-33K mutant proteins do not relocalize to viral replication centers

During a wild type HAdV-5 infection L4-33K is organized into ring-like structures in the nucleoplasm, the so-called viral replication centers (225). We asked whether this distribution pattern might depend on the tiny RS-repeat. Cells were transiently transfected with wild type and mutant L4-33K constructs and subsequently infected with wild type HAdV-5. The results showed that the wild type L4-33K protein efficiently reorganizes to the periphery of replication centers. On the other hand, the splicing defective mutant proteins maintained their diffuse nuclear distribution. Interestingly, our results demonstrated an excellent correlation between the function of L4-33K as a splicing enhancer protein and the capacity of the protein to localize to the nucleus and reorganize into viral replication centers. This suggests a link between the splicing enhancer function and the sub-nuclear localization mediated through the tiny RS-repeat.
It should also be noted that the functional splicing assays were performed in vitro with uninfected HeLa cell nuclear extracts (HeLa-NE) supplemented with recombinant L4-33K protein. Thus, the defective splicing function of the mutants cannot be ascribed to loss of nuclear localization or reorganization. During a wild type HAdV-5 infection, SRSF1 has been reported to inhibit L1-IIIa splicing (160). The viral E4-ORF4 protein binds the cellular protein phosphatase PP2A, relieving an SR-mediated inhibition on L1 splicing, thereby governing the shift from early to late pattern of L1 alternative splicing. The E4-ORF4/PP2A complex binds to highly phosphorylated SR proteins and induces SR protein dephosphorylation (226). Early in infection SRSF1 is present in ring-like structures surrounding the viral replication centers, but as the infection proceeds it is redistributed into the periphery of the nucleus (180, 181, 227). The same has been shown for hnRNP A1 (227). Collectively, these observations might suggest that adenovirus reorganizes the cellular splicing machinery to the viral replication centers early in infection. At late times of infection, a subset of the cellular splicing factors that are not needed for late virus-specific splicing are redistributed to the periphery of the nucleus, perhaps in order for the virus to use its own factors, like L4-33K, to induce the late phase of gene expression. In fact, when monitoring SRSF1 localization and L1 alternative splicing in a time course experiment, the IIIa 3’ ss choice is favored at the time point when SRSF1 is displaced from the viral replication centers (181).

Function and localization of L4-33K is not restored by replacing serine 192 with the phosphomimetic aspartic acid

Phosphorylation can be used as a means of regulating both function and localization of SR proteins (228).
In the mutants we had tested so far serine was replaced with glycine, which is the smallest amino acid with only a hydrogen atom in place of the side chain. To mimic the properties of a phosphorylated serine, we exchanged Ser192 with aspartic acid, which has a permanent negative charge. However, this mutant behaved essentially as the S192G mutant, with a partial nuclear localization in transiently transfected cells and a failure to relocalize to viral replication centers after wild type HAdV-5 infection. Further testing of the S192D mutant protein in our L1 alternative splicing reporter assay demonstrated that it lacked the IIIa splicing enhancer activity associated with the wild type L4-33K protein.

The ds region is necessary but not sufficient for the nuclear localization of L4-33K

Since we found that the ds region and its serine residues are important for the subnuclear localization of L4-33K, we decided to investigate whether this region was also sufficient for directing the protein to the nucleus and into viral replication centers. For this experiment we transfected 911 cells with plasmids expressing different eGFP-L4-33K fusion proteins, infected a proportion of them with wild type HAdV-5 and investigated protein localization by immunofluorescence. The eGFP protein itself is dispersed throughout the whole cell. However, fusion of the entire wild type L4-33K reading frame efficiently relocalized the eGFP-L4-33K protein to the nucleus, as expected. The fusion protein also efficiently relocalized to the periphery of the viral replication centers in HAdV-5 infected cells. However, when only the ds region with its tiny RS-repeat was attached to eGFP, the construct localized to the whole cell in both infected and uninfected cells. Thus, we conclude that the ds region is necessary but not sufficient for the exclusive nuclear localization and redistribution of L4-33K.
Paper II

A suppressive effect of the first leader 5’ splice site on L4-22K-mediated activation of major late transcription

The adenoviral MLP is active both at the early and the late phase of infection, but maximal activation requires both DNA replication and expression of viral late proteins (182, 229). Activation of the MLP depends on several viral factors, for example E1A-289R (22), IVa2 (230), pIX (231) and L4-22K (220). The DE element is located downstream of the MLP transcription start site. It has been shown to bind the IVa2 and L4-22K proteins (209, 210, 212, 213, 220), which enhances MLP transcription. Here we report a novel effect of L4-22K on MLP transcription through another promoter element overlapping the first leader 5’ ss, the so-called R1 region.

The R1 region suppresses L4-22K-mediated activation of MLP transcription in vivo and in vitro

Efficient transcription from the MLP has been suggested to need binding of viral factors to sites downstream of the MLP start site, R1-R3 (205, 232). R2 and R3 overlap the DE element, and R1 interacts with a viral factor specific for the late phase. To test whether the L4-22K protein was the unknown late phase-specific factor interacting with the R1 region, we cotransfected cells with L4-22K-expressing plasmids and an MLP reporter with or without R1. Reporter gene expression was measured using the primer extension assay or the S1 protection assay. The results showed that whereas L4-22K activated expression from the MLP wild type construct, the activation was approximately 2-fold higher for the MLPΔR1 reporter construct. To further characterize the significance of the R1 region for L4-22K-mediated activation of MLP transcription we resorted to an in vitro experimental system. In this experiment linearized MLP wild type and MLPΔR1 templates were incubated in HeLa-NE with recombinant L4-22K protein.
The results showed that deletion of the R1 region lowered the basal activity of the MLP, while addition of L4-22K activated the MLPΔR1 template approximately 1.5-fold better compared to the MLP wild type template. Interestingly, addition of higher amounts of L4-22K protein suppressed accumulation of the full-length transcript on the wild type MLP template and resulted in an increased accumulation of prematurely terminated transcripts around 40 nucleotides in length. Deletion of the R1 region resulted in a reduction of premature transcript accumulation. Previously, it has been reported that MLP transcripts, at late times of infection, can terminate at discrete sites within 300 nucleotides after the transcription initiation site (233, 234).

Inactivation of the major late first leader 5’ splice site increases L4-22K-mediated activation of the MLP

The major late first leader 5’ ss is situated within the R1 region. To investigate whether this was the regulatory element of R1, we introduced four point mutations in the conserved 5’ ss consensus sequence. This construct was cotransfected with L4-22K-expressing plasmid, and gene expression was assayed with primer extension and S1 analysis. The disruption of the 5’ ss led to an increase in transcription analogous to that of the R1 deletion mutant. The MLP-5’ss mutant, similarly to the MLPΔR1, accumulated fewer prematurely terminated transcripts than the wild type MLP.

L4-22K binds to the distal part of the R1 region

A factor specific for the late phase has been shown to bind to the R1 region (205), and we speculated that this factor might be L4-22K. The distal end of the R1 region contains a sequence motif, 5’-TTTG-3’, which is the consensus sequence for L4-22K-specific binding (210). In a gel shift assay we demonstrated that not only did recombinant L4-22K protein bind to the wild type R1 region, but also that it interacted specifically with the 5’-TTTG-3’ motif in R1.
R1 oligonucleotides with mutations within the 5’-TTTG-3’ motif could not out-compete L4-22K binding to the radiolabeled wild type R1 probe, whereas mutants with point mutations within other parts of the R1 region did so efficiently. As pointed out earlier, L4-22K has been shown to bind to the DE element and also enhance MLP transcription by a DE element-dependent mechanism (220). In an effort to analyze the relative efficiency of L4-22K binding to the R1 region versus the DE element, gel shift assays with different combinations of cold and radiolabeled probes were performed. The binding of L4-22K to the DE element was much stronger, but seemed to be saturated at a point where L4-22K started to bind to the R1 region. This result agrees with the observation that protection of the R1 region was much weaker than L4-22K binding to the R2 and R3 regions (which overlap the DE element) (205). These observations could indicate that the virus controls transcription from the MLP both in a positive and a negative way: once L4-22K is present in profuse amounts, the protein starts to bind the R1 region, thus exerting an inhibitory effect on MLP transcription.

L4-22K binding to the R1 region promotes recruitment of Sp1 to the 5’ splice site

The incubation of radiolabeled R1 oligonucleotides in HeLa-NE with L4-22K enhanced the formation of four specific complexes. Complex 1 corresponds to the complex formed without HeLa-NE and thus only consists of the L4-22K protein. Using a cold DE element probe, we could show that complex 3 appears to form at lower L4-22K concentrations as compared to complexes 2 and 4. The R1 region has a high GC-content, which could suggest the presence of binding sites for Sp1 and Sp3-related transcription factors (235). Sp1 can both activate and repress transcription, and has been reported to activate MLP transcription through GC-rich sequences located immediately downstream of the TATA box (199, 236).
To determine whether complex 2 and/or 4 was composed of these proteins, the R1 DNA probe was incubated in HeLa-NE with or without antibodies against the Sp1 or Sp3 transcription factors. A super-shift of complex 4 occurred with the Sp1 but not with the Sp3 antibody. The nature of complexes 2 and 3 is currently unknown.

Cellular factors are recruited to the major late first leader 5’ splice site

Next, we tested whether L4-22K could recruit cellular factors through binding to the 5’ ss by comparing the R1 and MLP-5’ssM probes. The addition of a surplus of cold MLP-5’ssM probe competed as well as the R1 oligonucleotide for binding of L4-22K protein to the R1 region. However, L4-22K was not able to stimulate the recruitment of cellular factors to the R1 region in the MLP-5’ssM probe. This result suggests that formation of complexes 2-4 takes place at the major late first leader 5’ ss, or at sites overlapping this element. Collectively, our data suggest that complex 1 consists of L4-22K and complex 4 of L4-22K and Sp1 in unknown constellations.

Paper III

RNA elements involved in adenovirus L4-33K regulation of alternative splicing

Adenovirus gene expression is subjected to a temporal regulation with distinct phases designated as the early, the intermediate and the late phase of infection (100). Several viral proteins have been shown to participate in the switch from the early to the late phase, two of them being L4-22K and L4-33K. In this paper we further investigated the differential roles of these proteins in RNA splicing and additionally characterized a novel cis-acting RNA element required for L4-33K activation of splicing.

L4-33K but not L4-22K functions as a splicing enhancer protein

Since the L4-22K and L4-33K proteins are closely related, sharing the first 105 amino acids, they might have overlapping functions.
L4-22K has been suggested to function as a transcriptional activator (220) whereas L4-33K has been demonstrated to possess an alternative splicing enhancer activity (185). However, L4-22K has not been investigated in detail when it comes to a possible function as a splicing enhancer protein. In fact, Morris and Leppard proposed that both L4-22K and L4-33K are processing factors activating MLTU RNA expression, since both proteins could rescue late protein synthesis in transient transfection assays (184). We tested the activity of L4-22K and L4-33K in in vitro splicing experiments using the IIIa, pV, pVII, penton and hexon pre-mRNAs as templates. The results clearly demonstrated that L4-22K did not activate splicing of any of the transcripts, but actually seemed to inhibit hexon splicing. In contrast, and in agreement with our previous results, L4-33K activated splicing of all tested transcripts (185). This result further strengthens the hypothesis that the two L4 proteins carry out different functions in the cell.

L4-33K activates splicing of its own pre-mRNA

In order to create a timely shift from the early to the late phase of gene expression, a self-inducible feature such as a feed-forward mechanism of activation is an attractive hypothesis for a key regulator such as L4-33K. We decided to test this both in vivo and in vitro. For the in vivo experiments we cotransfected cells with L4-33K-expressing plasmids and reporter constructs and analyzed reporter gene expression by the S1 protection assay. The in vitro splicing assay was performed on an L4-33K transcript incubated with recombinant L4-33K protein in HeLa-NE. Both assay systems confirmed the hypothesis that L4-33K induces splicing of its own pre-mRNA. It is interesting to note that L4-33K has previously been shown to preferentially activate splicing of transcripts with a weak 3’ ss context, i.e. a highly interrupted PPY such as that of our L1 model transcript (185).
However, the L4-33K second exon 3’ ss has a moderately weak PPY. We therefore decided to map the L4-33K responsive cis-element(s) in the L4-33K pre-mRNA, by creating several β-globin/L4-33K chimeric transcripts. β-globin was chosen since it is constitutively spliced and non-responsive to L4-33K (185). The transcripts were tested in in vitro splicing assays, with or without the addition of L4-33K. Surprisingly, we found that the L4-33K responsive element in the L4-33K pre-mRNA was positioned upstream of the branch point, at variance with the position of the 3VDE in the L1 model. The results from these studies could indicate that the L4-33K responsive element in the L4-33K pre-mRNA might not correlate with the PPY, as has previously been demonstrated for the L1 model pre-mRNA.

The L4-33K induced shift in L1 alternative splicing is not promoter-dependent

In vivo, transcription and splicing are for the most part closely connected (237). Several transcription and splicing factors are promiscuous in that they can regulate both processes. This connection is lacking in our in vitro experiments, and as a way to study whether the IIIa splicing activation capacity of L4-33K might be promoter-dependent, we compared the effect of L4-33K on mRNA expression from an L1 construct driven by a CMV or an MLP promoter. Both reporter constructs were transfected into HEK293 cells with or without an L4-33K expressing plasmid and mRNA expression was visualized by the S1 protection assay. The basal level of expression was higher from the CMV construct, in agreement with the observation that the CMV promoter is intrinsically strong. However, the observed L4-33K enhancement of IIIa mRNA expression was similar in both constructs, implying that the promoter driving transcription is not crucial for the correct control of L1 alternative 3’ ss usage.
The requirement for cis-competition in L1 alternative splicing

Activation of IIIa splicing in single 3’ ss constructs appears to be regulated correctly in vitro (238). Therefore cis-competition between the 52,55K and IIIa 3’ ss has not been believed to be required for a correct regulation of L1 alternative splicing. However, a deletion of the 52,55K 3’ ss weakly activated the non-regulated use of the IIIa 3’ ss in transient transfection assays (239). Clearly, these opposing results warranted a further analysis. For these experiments we removed a large part of the L1 sequence (600 bp) including the 52,55K 3’ ss, which resulted in a high basal activity of IIIa splicing that was not further enhanced by L4-33K cotransfection. This result indicated that an upstream 3’ ss can indeed suppress the usage of the downstream IIIa 3’ ss. Further, when the wild type 52,55K 3’ ss was exchanged for an unrelated 3’ ss from the rabbit β-globin gene, the L4-33K activation of the IIIa splice site usage was restored. The loss of normal splicing regulation of L1 in the absence of an upstream splice site suggests that there may be another dimension to the regulation of L1 alternative splicing.

Interaction of L4-33K with cellular proteins

We have previously failed to detect a direct RNA binding capacity of L4-33K (unpublished results). Therefore it seems that this protein must rely on a cellular component in order to exert its effect on splicing. To investigate whether selected cellular splicing regulators control L4-33K-activated splicing, we cotransfected plasmids expressing L4-33K and SRSF3, PTB, PUF60, U2AF, hnRNP F or hnRNP H, along with a IIIa reporter construct, and analyzed the results using the S1 protection assay. Strikingly, none of the cellular candidate proteins had an activating effect on IIIa splicing, with or without the presence of L4-33K protein. However, most interestingly we found that PUF60 and U2AF repressed IIIa splicing.
SRSF3 and PTB actually neutralized the L4-33K enhancer effect on IIIa splicing. Both these proteins are known to bind PPYs and regulate alternative splicing (34, 240, 241). This may be an indication that, if not L4-33K itself, another protein must bind to the PPY of IIIa to activate its splicing, and that this binding is outcompeted by overexpressing SRSF3 and PTB.

Paper IV

Conservation of transcriptional and post-transcriptional activities of serotype-specific adenoviral L4 proteins

A decade ago, the functions of the L4-22K and L4-33K proteins were basically unknown, but interest was sparked after a publication stating that L4 gene products were involved in the early to late switch in viral gene expression (216). L4-22K as a transcriptional activator and L4-33K as a splicing enhancer have gained interest in the past few years, but little is known about what function(s) they have in other adenovirus serotypes. Since they are highly conserved in the C-terminal parts and share the N-terminus, there may well be an overlap in functions. Here we investigated whether the functions of the L4 proteins encoded by HAdV-3, HAdV-4, HAdV-5, HAdV-9, HAdV-11 and HAdV-41 are indeed conserved. While investigating the different serotypes, we came across an interesting finding, namely that the L4-22K and L4-33K proteins from all tested serotypes also functioned as enhancer proteins of E1A alternative RNA splicing.

Activation of adenovirus L1 pre-mRNA splicing by serotype-specific L4-33K proteins

L4-33K has a pivotal role in the selection of the distal 3’ ss of the L1 pre-mRNA, thus leading to an efficient accumulation of the IIIa mRNA. This has been shown for HAdV-5 L4-33K both in vivo and in vitro, while HAdV-5 L4-22K has been shown to lack this effect (5, 21, paper III).
Here, we investigated the regulatory activity of L4-22K and L4-33K proteins from different serotypes by cotransfecting HEK293 cells with an L1 reporter plasmid and serotype-specific L4-22K- or L4-33K-expressing plasmids. The harvested RNA was probed using the S1 nuclease protection assay, which showed that L4-33K from all tested serotypes stimulated accumulation of the IIIa mRNA to the same degree. On the other hand, none of the L4-22K proteins could stimulate distal 3’ ss usage.

Differential MLP transcriptional activation properties of the serotype-specific L4-22K proteins

As previously described, HAdV-5 L4-22K, but not HAdV-5 L4-33K, can activate transcription from the MLP (220). We investigated whether this feature was conserved among the different serotypes tested here. For this experiment, HEK293 cells were cotransfected with an MLP reporter plasmid and plasmids expressing the serotype-specific L4-22K or L4-33K proteins, and RNA was subsequently analyzed by the S1 assay. As previously shown for the HAdV-5 protein, the L4-33K proteins from the other serotypes could not activate MLP transcription. However, among the L4-22K proteins there were striking differences in activity, with the HAdV-4 and HAdV-9 L4-22K proteins stimulating MLP activity much better than the previously characterized HAdV-5 L4-22K protein (6-fold and 10-fold, respectively).

HAdV-9 L4-22K protein localizes to the nuclear rim

Previously, L4-22K and L4-33K have been shown to be exclusively localized within the nucleus of infected and/or transfected cells (242). The nuclear pattern of L4-33K localization depends on the serine residues within the tiny RS-repeat. Serines can be phosphorylated, which is a common means of controlling both function and localization of proteins. An analysis of the potential phosphorylation sites within the serotype-specific L4-22K proteins demonstrated that the HAdV-9 L4-22K protein had significantly fewer potential phosphorylation sites compared to the other L4-22K proteins.
We found this interesting, since HAdV-9 L4-22K was considerably better at enhancing transcription from the MLP compared to the HAdV-5 L4-22K protein. We also tested whether this lack of potential phosphorylation sites affected the intracellular localization, using in vivo transfection experiments followed by immunofluorescence staining. While the L4-22K proteins from HAdV-5, HAdV-3 and HAdV-4 localized exclusively to the nucleus, HAdV-9 L4-22K was also found in the cytoplasm and, most interestingly, at the nuclear margin. The nuclear rim, or margin, has been reported as a place for both transcriptional silencing and activation (243, 244); thus it is an interesting location for the L4-22K protein as a transcriptional regulator.

HAdV-5 L4-22K acts at the level of E1A alternatively spliced mRNA accumulation

While L4-33K functions as a splicing enhancer protein preferentially activating L1-IIIa and other late adenovirus pre-mRNA splicing reactions (185, 220, 242), L4-22K has been shown to have no stimulatory effect on splicing in our experimental systems (220, paper III). Others have implicated L4-22K in splicing and/or post-transcriptional processing (184, 245). Here we demonstrated that the L4-22K protein, much to our surprise, activated E1A alternatively spliced mRNA accumulation in transiently transfected HeLa cells. Isoform-specific E1A mRNAs were analyzed by RT-PCR, with carefully designed primer pairs differentiating between isoforms. The results demonstrated that the E1A-10S mRNA was greatly induced, while only small differences could be detected for the other isoforms. The wild type E1A plasmid includes the viral DNA packaging signal, which contains seven so-called A-repeats that have been shown to bind L4-22K (209, 210, 212, 213). In order to test whether L4-22K binding to the A-repeats was necessary for the E1A alternative splicing enhancer function, we used a CMV-driven E1A construct that lacks the packaging motifs.
Although this construct was more efficient in basal transcription (and thus in accumulation of all isoforms), E1A-10S was still upregulated by L4-22K cotransfection.

HAdV-5 L4-33K enhances E1A-10S mRNA accumulation

Since L4-33K stimulates splicing of several late transcripts (185), we also tested whether this protein regulated E1A alternative splicing. The experimental setup was as before, but this time L4-22K only had a modest effect on E1A splicing. L4-33K, on the other hand, gave a dramatic increase in E1A-10S mRNA accumulation. The tiny RS-repeat within the C-terminal end unique to the L4-33K protein has previously been shown to be the crucial motif required for the L1-IIIa splicing enhancer effect of the protein. The serine residues within this motif are essential for the stimulatory effect on IIIa splicing, with serine 192 being crucial for the function of the protein (185, 242). However, the S192G point mutation had no negative effect on the E1A-10S stimulatory effect of the protein. Since our experiments show that only L4-33K stimulates L1 splicing (185) but both L4-33K and L4-22K activate E1A-10S splicing, we propose that the splicing activation of L1-IIIa and E1A-10S is carried out by different mechanisms. The E1A-10S effect is probably dependent on the common N-terminus of the L4-22K and L4-33K proteins.

Impact of other adenoviral L4-22K serotype proteins on E1A mRNA expression

As commented on before, little if any data exist on the function of L4 proteins from human serotypes other than HAdV-5. On this note, we tested whether the serotype-specific L4-22K proteins also regulated E1A alternatively spliced mRNA accumulation. The results demonstrated that all serotype-specific L4-22K proteins increased E1A-10S mRNA accumulation, although the magnitude of this increase varied somewhat between serotypes. Interestingly, HAdV-11 L4-22K markedly inhibited the overall E1A mRNA expression, although the pattern of E1A-10S mRNA increase was still obvious.
We have also constructed L4-33K-expressing plasmids from the corresponding serotypes, and it will be of great interest to investigate whether the effects on the E1A splicing pattern are similar or different with these constructs.

Concluding remarks

In this thesis I have outlined the current knowledge about the general concepts in transcription and RNA splicing, and described in more depth what is known about adenoviruses and, more specifically, the late proteins L4-22K and L4-33K. Before this project was undertaken, conflicting reports existed on the properties and functions of these two proteins. Both have been implicated in viral genome packaging and virus assembly, as well as being regulators of gene expression. Multiple reports had shown that L4-22K binds to the DE element (209, 210, 212, 213), and our group has shown that this binding leads to an induction of MLP transcription (220). We have also shown that L4-33K stimulates IIIa pre-mRNA splicing (185), while L4-22K lacks this activity in vivo (220). However, other groups have reported on a possible role for L4-22K in post-transcriptional gene regulation (184) and for L4-33K as a factor binding to the DE element and activating MLP transcription (208). The aim of my project was to further characterize the functional properties of these two L4 proteins.

In paper III, I characterize some basic properties of L4-22K and L4-33K. I further establish the difference between them in in vitro splicing assays, where L4-22K, in contrast to L4-33K, was found to be non-functional in activating splicing of several of the pre-mRNAs expressing late viral genes. Further, I showed that L4-33K activates its own internal splicing event both in vitro and in vivo. Interestingly, recent in vivo experiments point to L4-22K also being involved in the post-transcriptional regulation of L4-33K mRNA splicing (211). Clearly, my experiments should be complemented by the addition of L4-22K to the assay.
While dissecting the L4-33K pre-mRNA and constructing hybrid transcripts with β-globin (which is inert to L4-33K-induced splicing (185)), I identified a region immediately upstream of the branch site of the 3’ ss that was required for L4-33K activation of self-splicing. This RNA element does not correspond to the weak PPY previously described as responsive to L4-33K activation of splicing. Thus, it may represent a new motif, perhaps responsive to stimulation by both L4-22K and L4-33K. However, this remains to be experimentally tested.

Previously, our group identified and characterized a region within the C-terminus of the L4-33K protein closely connected to its function as an alternative splicing enhancer (185). I show here that this so-called tiny RS-repeat contains a signal governing the exclusive nuclear localization of the protein in transient transfection assays, as well as the relocalization of the protein to replication centers during infection (paper I). Removing the entire RS-repeat redistributed the protein equally between the cytoplasm and the nucleus. An analysis of the localization of eGFP-L4-33K fusion proteins demonstrated that a hybrid of eGFP and the full-length L4-33K protein effectively relocalized the cytoplasmic eGFP protein to the nucleoplasm. The eGFP fusion with the tiny RS-repeat did not relocalize to the nucleus or to the viral replication centers in HAdV-5 infected cells. Thus, the tiny RS-repeat is necessary but not sufficient for the correct localization of L4-33K. Our immunofluorescence experiments with L4-22K demonstrated that this protein is also exclusively nuclear. L4-33K mutant proteins that fail to accumulate in the nucleus retain the N-terminal 105 amino acids; this indicates that the L4-22K protein must have its own unique nuclear localization signal within its unique C-terminus. Future experiments could be aimed at identifying this signal through sequential deletions of the N-terminus.
A specific nuclear receptor, Transportin-SR, is responsible for import of SR proteins to the nucleus (246, 247). Using a yeast two-hybrid system, it should be possible to check for an interaction between our proteins and known nuclear import and export factors. Also, several SR proteins shuttle between the nucleus and cytoplasm through their RS-domains (248). Thus it would be interesting to investigate, using a heterokaryon assay (249), whether the tiny RS-repeat (or any other domain) in L4-33K could convey this property. In order to further establish the connection between nascent viral mRNA and the L4 proteins, an in situ hybridization strategy could be employed. Using the same methodology as Gama-Carvalho and colleagues (181), we could survey the interactions between L4-22K and L4-33K and our model L1 transcript.

In paper II, the interactions of L4-22K with the MLP were further investigated. While L4-22K binding to the DE element stimulates MLP transcription (220), we identified a novel self-regulatory mechanism causing an inhibition of MLP transcription. A region spanning the first leader 5’ ss was shown to bind L4-22K, and its deletion resulted in a higher L4-22K-mediated activation of MLP transcription. In fact, inactivation of the first leader 5’ ss achieved the same effect. The inhibitory effect of the first leader 5’ ss seems to rely on L4-22K recruiting cellular factors to the MLP. One of these was demonstrated to be the cellular transcription factor Sp1. However, we have indications of at least two more possible cooperative partners, which we will try to identify by using chromatin immunoprecipitation assays. Surprisingly, we found that high amounts of L4-22K increased the accumulation of prematurely terminated MLP transcripts. This may indicate that L4-22K limits MLP transcription by suppressing transcription elongation. These findings could suggest a way of self-control of MLP transcriptional activation.
L4-22K bound with greater affinity to the DE element than to the R1 region. Therefore, at low concentrations L4-22K binding to the DE element will be preferred, leading to a stimulation of MLP transcription. Later in infection, the amounts of L4-22K will build up and L4-22K will start to bind to the R1 region, thus leading to an inhibition of MLTU transcript synthesis. This is an attractive hypothesis of a negative feedback loop, effectively regulating the temporal expression from the MLTU. However, it needs much further experimental proof.

Despite the use of several different methods, we have not been able to detect a direct binding of L4-33K to RNA. Thus, this protein might act through a hit-and-run mechanism or use a cellular protein in order to bind RNA. In paper III, we tested different attractive candidate proteins known to have functions in transcription and splicing, but none of them showed a positive effect on IIIa splicing in vivo. Interestingly, some of them actually inhibited IIIa splicing, for example the polypyrimidine tract binding protein (PTB) and the splicing factor SRSF3. The experiment was based on transfection of PTB- and SRSF3-expressing plasmids, and thus an overexpression of these essential proteins might affect the entire process of splicing in unforeseen ways. The search for a cellular protein associating with L4-33K could be extended further. A possible approach would be to use a yeast two-hybrid assay with a library of cellular proteins, in an attempt to identify a protein that might bridge L4-33K binding to the RNA.

Using mass-spectrometry-based methods, our group has previously identified a cellular protein kinase, DNA-PK, which interacts with L4-33K. This protein has a negative effect on L4-33K-activated L1 alternative splicing during a virus infection, while a second cellular protein kinase, PKA, has the opposite effect (219).
Since the tiny RS-repeat contains several serine residues, a regulation of L4-33K activity by phosphorylation/de-phosphorylation reactions is an attractive model. Serine 192 in L4-33K has been shown to be crucial both for the splicing enhancer function (185) and for the subcellular localization of the protein (see paper I). Serines are commonly phosphorylated as a means of regulating the activity or specificity of proteins, which is interesting in the context of the specific interaction of L4-33K with DNA-PK and PKA, respectively. However, when serine 192 was replaced with the phospho-mimetic aspartic acid, the specific function and localization of L4-33K were lost, just as with the glycine mutant. Thus, phosphorylation of serine 192 might not be the regulatory step needed for function, but perhaps a cycle of phosphorylation/de-phosphorylation reactions is. Also, we do not know whether serine 192 is actually phosphorylated in vivo, so function might not depend on the net charge of the amino acid at this position but rather on other characteristics of the serine residue itself.

While L4-22K and L4-33K from HAdV-5 have received a great deal of attention during the last decade, no information on the function of these proteins in other human serotypes exists. Since the C-terminal parts of L4-22K and L4-33K are highly conserved, we decided to test whether the functions of the two proteins have also been evolutionarily conserved. For this we constructed several clones from adenovirus serotypes representing species A–F and tested them in vivo for their regulatory properties in MLP transcription and L1 splicing (paper IV). The results demonstrated that all serotype-specific proteins mimicked their HAdV-5 counterparts: L4-22K enhances transcription but not splicing, and L4-33K enhances splicing but not transcription.
Interestingly, the HAdV-9 L4-22K protein was considerably more efficient at stimulating transcription than the other serotype-specific proteins, and also showed an aberrant localization. While HAdV-5 L4-22K was diffusely localized exclusively to the nucleus, a proportion of the HAdV-9 L4-22K protein was enriched at the nuclear rim while a fraction was localized to the cytoplasm. The nuclear rim has been associated with both transcriptional silencing and activation (243, 244). Further, HAdV-9 L4-22K has significantly fewer potential phosphorylation sites compared to the other serotype-specific L4-22K proteins, indicating a possible difference in the means of regulating function and localization (as discussed above). It will be interesting to compare the phosphorylation status and localization of the L4-22K and L4-33K proteins from additional serotypes, to see whether these can be linked to each other. We have access to HAdV-5 mutant viruses lacking L4-22K or L4-33K, both of which are hampered in virus growth. To investigate whether these L4 proteins from other serotypes are essential for virus growth, we will infect cell cultures with the HAdV-5 ΔL4-22K or ΔL4-33K viruses and determine whether we can improve virus growth with serotype-specific proteins.

While L4-22K and L4-33K have been characterized by their function on late genes/gene products, new data indicate that L4-22K inhibits transcription of the E1A unit and other early genes (211). We investigated this finding more closely, and found that L4-22K in fact regulates the alternative splicing pattern of the E1A unit (paper IV). The E1A-10S isoform was strongly upregulated by the addition of an L4-22K-expressing plasmid in transient transfection experiments. L4-33K also increased the expression of E1A-10S, even when mutated at serine 192. These findings suggest a new mechanism of splicing activation common to L4-22K and L4-33K, which does not rely on the characterized RS-repeat unique to the L4-33K reading frame.
This will be investigated further by using mutant proteins with sequential deletions of the C-terminus of L4-33K and N-terminal deletions of the common L4-22K and L4-33K reading frame. We will also try to find the responsive RNA element in the E1A pre-mRNA required for activation of E1A-10S splicing.

Even after 10 years of research, much remains to be elucidated when it comes to the precise mechanisms of action of the L4-22K and L4-33K proteins. In this thesis, I take an exploratory step forward towards an understanding of these adenoviral proteins.

Acknowledgments

In 2004 I attended the course in Molecular Virology given by Professor Göran Akusjärvi; I was instantly hooked and started badgering him for a PhD position. Finally in 2007 I succeeded, and since then I’ve had some of the best and worst times of my life. Since you’re reading this, I apparently made it, and this I owe to my supervisor Göran. Thank you for badgering me right back on my experiments, pushing me and supporting all my wild scholarship applications (which all succeeded!). You’re a great scientist and I keep wondering how you can fit all that knowledge into your brain. Thanks also to Catharina Svensson for questioning both Göran and me, Göran Magnusson and my examiner Dan Andersson. A huge thank you to Tanel Punga, my secondary supervisor, who’s been a real support from the first day to the last. Thank you for boosting my self-esteem and taking such an interest in my project. You’re incredibly talented and I hope you will have a long and productive scientific career.

Many thanks to present and past members of the virology labs! Xin – my office-pal and soon to be co-author, you’re a great friend and I hope to see you in China soon! Troy – always complimenting me on my English skills! Wael – you are so smart and talented, and I see an amazing future for you if you’re just willing to take it. Ravi – always talking and always smiling, you can brighten every situation.
Sibel – good luck in the future! Roberta – I hope we can have a super L4 collaboration together! Daniel – thanks for bringing your northern Swedish calm to the lab. Monika – we are so alike and yet so different, I feel we share a special connection. Anna – you lit up the lab with your down-to-earth humor and wisdom. Ellenor – I admire you in many ways, you are a tough girl in luxury packaging and a really good scientist. Cecilia – your combination of positive energy and sheer fighting spirit is a winning recipe. Anne-Katrine – my great role model in all things important, I’m full of admiration for your courage and your warm and loving nature. Sofia – you brilliant parrot-clown-brainiac! How I miss you in the lab with your excellent lab skills and witty remarks. Susanne – for being a calm, positive presence in the lab. Alexis – the avocado man with the Cheshire cat smile, always ready to help. Gunnar – there was never a boring moment with you in the lab… Fia – for keeping our gang together. Helen, Farzaneh, Xiaoze, Ning, Edyta, Carlos, Roger, Angelica, Kerstin, Tove, Hanna, and many more.

Anette – a special thanks to you, for your help and support in so many ways. I owe you much more than you think! A heartfelt thank you to the administrative staff and Olav for fixing anything that can be fixed.

There are so many people at IMBIM that I’m glad I’ve gotten to know! Chris (and Johan), Maria, Lisa, Erik, Linus, Jocke, Tijs, Nona, Else, Kathrin, Ananya, Julia, Marlen, Jessika, Patric, Anna and Vahid, Alejandro, Daniel, Benjamin, Andrea, Jessika. Ivana – for sharing my pre-graduation anxiety and prioritizations in life.

My wonderful friends! Thanks for enriching my life and making it worth living! To the Thunmans for always being there, and being fun, and for keeping a guest room just for us! To Ola and Fredrik for being such great and fun guys, to the Noskos for being so energizing (apart from Markus’s occasional bouts of field coma…).
Kristoffer and Johanna – for spoiling Emy with both gifts and attention. To Kia for being my long-lasting friend, we’ve gone together from primary school in Sandviken to university in Gothenburg, Canberra and then Uppsala, so hey – you got a job for me in Brazil?! Katta – once a Högbo kid, always a Högbo kid; I wish I could be your neighbor once again! To Anna, who is no longer with us – I miss you like crazy. The parents group from Hjärtat – for all your support and friendship on our entire journey of parenthood. And to Heidi – what would I do without you? You’re the Chip to my Dale! Seriously, your never-ending support, optimism, and kindness, providing clarity on all my problems; I don’t think I could have done this without you.

To my family. You have strengthened me all the way to this day. Das Family (a.k.a. the Westins): Ina – the best mother-in-law one could have, always ready to welcome us into your home, and the most fun person in the world to tease. You are a true role model! Katharina, Magnus, Hugo and Emil – thank you for all the happy and wonderful moments together. Oma Annemarie – our matriarch, you have been through so much and are still so strong, and I admire you immensely. Anna and Peter – thank you for so many happy evenings, wonderful trips and Bellman jokes. Gunilla – I am so grateful for everything you have given me over the years, and above all for how you have become the mother and grandmother who was torn away from me and Emy. Dad – for always expecting the best from me without making demands. For pretending to be strict while really being a big softie. For knowing how to do everything and knowing everything (or how was it again) and for always being there for me. Mom – my God, how I miss you and how I wish you were here; there are no words to express how much I miss you. Grandpa – my biggest supporter, you always believed in me and I wish you were here to share this moment with me. Deniz – the love of my life. The greatest thing for me is knowing that you are always there, that you are my best friend, my soulmate and my opposite in the best possible way (”buck up, sissy pants”). And Emy – you are the best kid there is! I did not know there was so much love on this earth before you came to us. I love you!

References

1. Werner F, Grohmann D. 2011. Evolution of multisubunit RNA polymerases in the three domains of life. Nat. Rev. Microbiol. 9:85–98.
2. Vannini A, Cramer P. 2012. Conservation between the RNA polymerase I, II, and III transcription initiation machineries. Mol. Cell 45:439–46.
3. Haag JR, Pikaard CS. 2011. Multisubunit RNA polymerases IV and V: purveyors of non-coding RNA for plant gene silencing. Nat. Rev. Mol. Cell Biol. 12:483–92.
4. Werner F. 2008. Structural evolution of multisubunit RNA polymerases. Trends Microbiol. 16:247–50.
5. Landick R. 2009. Functional divergence in the growing family of RNA polymerases. Structure 17:323–5.
6. Sekine S, Tagami S, Yokoyama S. 2012. Structural basis of transcription by bacterial and eukaryotic RNA polymerases. Curr. Opin. Struct. Biol. 22:110–8.
7. Liu X, Bushnell DA, Kornberg RD. 2013. RNA polymerase II transcription: structure and mechanism. Biochim. Biophys. Acta 1829:2–8.
8. Zehring WA, Lee JM, Weeks JR, Jokerst RS, Greenleaf AL. 1988. The C-terminal repeat domain of RNA polymerase II largest subunit is essential in vivo but is not required for accurate transcription initiation in vitro. Proc. Natl. Acad. Sci. U. S. A. 85:3698–702.
9. Phatnani HP, Greenleaf AL. 2006. Phosphorylation and functions of the RNA polymerase II CTD. Genes Dev. 20:2922–36.
10. Akoulitchev S, Makela TP, Weinberg RA, Reinberg D. 1995. Requirement for TFIIH kinase activity in transcription by RNA polymerase II. Nature 377:557–560.
11. Komarnitsky P, Cho EJ, Buratowski S. 2000. Different phosphorylated forms of RNA polymerase II and associated mRNA processing factors during transcription. Genes Dev. 14:2452–60.
12. Allison LA, Moyle M, Shales M, Ingles CJ. 1985. Extensive homology among the largest subunits of eukaryotic and prokaryotic RNA polymerases. Cell 42:599–610.
13. Corden JL, Cadena DL, Ahearn JM, Dahmus ME. 1985. A unique structure at the carboxyl terminus of the largest subunit of eukaryotic RNA polymerase II. Proc. Natl. Acad. Sci. U. S. A. 82:7934–8.
14. Medlin J, Scurry A, Taylor A, Zhang F, Peterlin BM, Murphy S. 2005. P-TEFb is not an essential elongation factor for the intronless human U2 snRNA and histone H2b genes. EMBO J. 24:4154–65.
15. Chapman RD, Heidemann M, Albert TK, Mailhammer R, Flatley A, Meisterernst M, Kremmer E, Eick D. 2007. Transcribing RNA polymerase II is phosphorylated at CTD residue serine-7. Science 318:1780–2.
16. Egloff S, O’Reilly D, Chapman RD, Taylor A, Tanzhaus K, Pitts L, Eick D, Murphy S. 2007. Serine-7 of the RNA polymerase II CTD is specifically required for snRNA gene expression. Science 318:1777–9.
17. Fuda NJ, Ardehali MB, Lis JT. 2009. Defining mechanisms that regulate RNA polymerase II transcription in vivo. Nature 461:186–192.
18. Shandilya J, Roberts SGE. 2012. The transcription cycle in eukaryotes: from productive initiation to RNA polymerase II recycling. Biochim. Biophys. Acta 1819:391–400.
19. Thomas MC, Chiang C-M. The general transcription machinery and general cofactors. Crit. Rev. Biochem. Mol. Biol. 41:105–78.
20. Yang C, Bolotin E, Jiang T, Sladek FM, Martinez E. 2007. Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters. Gene 389:52–65.
21. Kaufmann J, Smale ST. 1994. Direct recognition of initiator elements by a component of the transcription factor IID complex. Genes Dev. 8:821–829.
22. Young CS. 2003. The structure and function of the adenovirus major late promoter. Curr. Top. Microbiol. Immunol. 272:213–249.
23. Ohler U, Liao G, Niemann H, Rubin GM. 2002. Computational analysis of core promoters in the Drosophila genome. Genome Biol. 3:RESEARCH0087.
24. Rhee HS, Pugh BF. 2012. Genome-wide structure and organization of eukaryotic pre-initiation complexes. Nature 483:295–301.
25. Zanton SJ, Pugh BF. 2004. Changes in genomewide occupancy of core transcriptional regulators during heat stress. Proc. Natl. Acad. Sci. U. S. A. 101:16843–8.
26. Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116:699–709.
27. Grünberg S, Hahn S. 2013. Structural insights into transcription initiation by RNA polymerase II. Trends Biochem. Sci. 38:603–11.
28. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, Gocayne JD, Amanatides P, Ballew RM, Huson DH, Wortman JR, Zhang Q, Kodira CD, Zheng XH, Chen L, Skupski M, Subramanian G, Thomas PD, Zhang J, Gabor Miklos GL, Nelson C, Broder S, Clark AG, Nadeau J, McKusick VA, Zinder N, Levine AJ, Roberts RJ, Simon M, Slayman C, Hunkapiller M, Bolanos R, Delcher A, Dew I, Fasulo D, Flanigan M, Florea L, Halpern A, Hannenhalli S, Kravitz S, Levy S, Mobarry C, Reinert K, Remington K, Abu-Threideh J, Beasley E, Biddick K, Bonazzi V, Brandon R, Cargill M, Chandramouliswaran I, Charlab R, Chaturvedi K, Deng Z, Di Francesco V, Dunn P, Eilbeck K, Evangelista C, Gabrielian AE, Gan W, Ge W, Gong F, Gu Z, Guan P, Heiman TJ, Higgins ME, Ji RR, Ke Z, Ketchum KA, Lai Z, Lei Y, Li Z, Li J, Liang Y, Lin X, Lu F, Merkulov GV, Milshina N, Moore HM, Naik AK, Narayan VA, Neelam B, Nusskern D, Rusch DB, Salzberg S, Shao W, Shue B, Sun J, Wang Z, Wang A, Wang X, Wang J, Wei M, Wides R, Xiao C, Yan C, Yao A, Ye J, Zhan M, Zhang W, Zhang H, Zhao Q, Zheng L, Zhong F, Zhong W, Zhu S, Zhao S, Gilbert D, Baumhueter S, Spier G, Carter C, Cravchik A, Woodage T, Ali F, An H, Awe A, Baldwin D, Baden H, Barnstead M, Barrow I, Beeson K, Busam D, Carver A, Center A, Cheng ML, Curry L, Danaher S, Davenport L, Desilets R, Dietz S, Dodson K, Doup L, Ferriera S, Garg N, Gluecksmann A, Hart B, Haynes J, Haynes C, Heiner C, Hladun S, Hostin D, Houck J, Howland T, Ibegwam C, Johnson J, Kalush F, Kline L, Koduru S, Love A, Mann F, May D, McCawley S, McIntosh T, McMullen I, Moy M, Moy L, Murphy B, Nelson K, Pfannkoch C, Pratts E, Puri V, Qureshi H, Reardon M, Rodriguez R, Rogers YH, Romblad D, Ruhfel B, Scott R, Sitter C, Smallwood M, Stewart E, Strong R, Suh E, Thomas R, Tint NN, Tse S, Vech C, Wang G, Wetter J, Williams S, Williams M, Windsor S, Winn-Deen E, Wolfe K, Zaveri J, Zaveri K, Abril JF, Guigo R, Campbell MJ, Sjolander KV, Karlak B, Kejariwal A, Mi H, Lazareva B, Hatton T, Narechania A, Diemer K, Muruganujan A, Guo N, Sato S, Bafna V, Istrail S, Lippert R, Schwartz R, Walenz B, Yooseph S, Allen D, Basu A, Baxendale J, Blick L, Caminha M, Carnes-Stine J, Caulk P, Chiang YH, Coyne M, Dahlke C, Mays A, Dombroski M, Donnelly M, Ely D, Esparham S, Fosler C, Gire H, Glanowski S, Glasser K, Glodek A, Gorokhov M, Graham K, Gropman B, Harris M, Heil J, Henderson S, Hoover J, Jennings D, Jordan C, Jordan J, Kasha J, Kagan L, Kraft C, Levitsky A, Lewis M, Liu X, Lopez J, Ma D, Majoros W, McDaniel J, Murphy S, Newman M, Nguyen T, Nguyen N, Nodell M, Pan S, Peck J, Peterson M, Rowe W, Sanders R, Scott J, Simpson M, Smith T, Sprague A, Stockwell T, Turner R, Venter E, Wang M, Wen M, Wu D, Wu M, Xia A, Zandieh A, Zhu X. 2001. The sequence of the human genome. Science 291:1304–1351.
29. Kraemer SM, Ranallo RT, Ogg RC, Stargell LA. 2001. TFIIA interacts with TFIID via association with TATA-binding protein and TAF40. Mol. Cell. Biol. 21:1737–46.
30. Orphanides G, Lagrange T, Reinberg D. 1996. The general transcription factors of RNA polymerase II. Genes Dev. 10:2657–2683.
31. Chopra VS, Cande J, Hong J-W, Levine M. 2009. Stalled Hox promoters as chromosomal boundaries. Genes Dev. 23:1505–9.
32. Gilchrist DA, Nechaev S, Lee C, Ghosh SKB, Collins JB, Li L, Gilmour DS, Adelman K. 2008. NELF-mediated stalling of Pol II can enhance gene expression by blocking promoter-proximal nucleosome assembly. Genes Dev. 22:1921–33.
33. Hocine S, Singer RH, Grunwald D. RNA processing and export. Cold Spring Harb. Perspect. Biol. 2:a000752.
34. Lou H, Neugebauer KM, Gagel RF, Berget SM. 1998. Regulation of alternative polyadenylation by U1 snRNPs and SRp20. Mol. Cell. Biol. 18:4977–4985.
35. Mandel CR, Bai Y, Tong L. 2008. Protein factors in pre-mRNA 3’-end processing. Cell. Mol. Life Sci. 65:1099–1122.
36. Rappsilber J, Ryder U, Lamond AI, Mann M. 2002. Large-scale proteomic analysis of the human spliceosome. Genome Res. 12:1231–1245.
37. Nilsen TW, Graveley BR. 2010. Expansion of the eukaryotic proteome by alternative splicing. Nature 463:457–63.
38. 2004. Finishing the euchromatic sequence of the human genome. Nature 431:931–45.
39. Naidoo N, Pawitan Y, Soong R, Cooper DN, Ku C-S. 2011. Human genetics and genomics a decade after the release of the draft sequence of the human genome. Hum. Genomics 5:577.
40. Sammeth M, Foissac S, Guigó R. 2008. A general definition and nomenclature for alternative splicing events. PLoS Comput. Biol. 4:e1000147.
41. Kim E, Magen A, Ast G. 2007. Different levels of alternative splicing among eukaryotes. Nucleic Acids Res. 35:125–131.
42. Ast G. 2004. How did alternative splicing evolve? Nat. Rev. Genet. 5:773–782.
43. Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. 2008. Alternative isoform regulation in human tissue transcriptomes. Nature, 2008/11/04 ed. 456:470–476.
44. Hynes RO. 2012. The evolution of metazoan extracellular matrix. J. Cell Biol. 196:671–9.
45. França GS, Cancherini D V, de Souza SJ. 2012. Evolutionary history of exon shuffling. Genetica 140:249–57.
46. Buljan M, Chalancon G, Eustermann S, Wagner GP, Fuxreiter M, Bateman A, Babu MM. 2012. Tissue-specific splicing of disordered segments that embed binding motifs rewires protein interaction networks. Mol. Cell 46:871–83.
47. Ellis JD, Barrios-Rodiles M, Colak R, Irimia M, Kim T, Calarco JA, Wang X, Pan Q, O’Hanlon D, Kim PM, Wrana JL, Blencowe BJ. 2012. Tissue-specific alternative splicing remodels protein-protein interaction networks. Mol. Cell 46:884–92.
48. Kornblihtt AR, Schor IE, Alló M, Dujardin G, Petrillo E, Muñoz MJ. 2013. Alternative splicing: a pivotal step between eukaryotic transcription and translation. Nat. Rev. Mol. Cell Biol. 14:153–65.
49. Kondrashov FA, Koonin E V. 2003. Evolution of alternative splicing: deletions, insertions and origin of functional parts of proteins from intron sequences. Trends Genet. 19:115–9.
50. Hemani Y, Soller M. 2012. Mechanisms of Drosophila Dscam mutually exclusive splicing regulation. Biochem. Soc. Trans. 40:804–9.
51. Fedorova O, Waldsich C, Pyle AM. 2007. Group II intron folding under near-physiological conditions: collapsing to the near-native state. J Mol Biol, 2007/01/02 ed. 366:1099–1114.
52. Nielsen H, Johansen SD. 2009. Group I introns: Moving in new directions. RNA Biol, 2009/08/12 ed. 6:375–383.
53. Toor N, Keating KS, Pyle AM. 2009. Structural insights into RNA splicing. Curr Opin Struct Biol, 2009/05/16 ed. 19:260–266.
54. Wahl MC, Will CL, Luhrmann R. 2009. The spliceosome: design principles of a dynamic RNP machine. Cell, 2009/02/26 ed. 136:701–718.
55. Sharp PA, Burge CB. 1997. Classification of introns: U2-type or U12-type. Cell, 1998/01/15 ed. 91:875–879.
56. Staley JP, Woolford Jr. JL. 2009. Assembly of ribosomes and spliceosomes: complex ribonucleoprotein machines. Curr Opin Cell Biol, 2009/01/27 ed. 21:109–118.
57. Pacheco TR, Coelho MB, Desterro JM, Mollet I, Carmo-Fonseca M. 2006. In vivo requirement of the small subunit of U2AF for recognition of a weak 3’ splice site. Mol Cell Biol, 2006/08/31 ed. 26:8183–8190.
58. Ritchie DB, Schellenberg MJ, MacMillan AM. 2009. Spliceosome structure: piece by piece. Biochim Biophys Acta, 2009/09/08 ed. 1789:624–633.
59. Brody E, Abelson J. 1985. The “spliceosome”: yeast pre-messenger RNA associates with a 40S complex in a splicing-dependent reaction. Science 228:963–7.
60. Müller S, Wolpensinger B, Angenitzki M, Engel A, Sperling J, Sperling R. 1998. A supraspliceosome model for large nuclear ribonucleoprotein particles based on mass determinations by scanning transmission electron microscopy. J. Mol. Biol. 283:383–94.
61. Munoz MJ, de la Mata M, Kornblihtt AR. The carboxy terminal domain of RNA polymerase II and alternative splicing. Trends Biochem Sci, 2010/04/27 ed. 35:497–504.
62. Kornblihtt AR, de la Mata M, Fededa JP, Munoz MJ, Nogues G. 2004. Multiple links between transcription and splicing. RNA, 2004/09/24 ed. 10:1489–1498.
63. Vargas DY, Shah K, Batish M, Levandoski M, Sinha S, Marras SAE, Schedl P, Tyagi S. 2011. Single-molecule imaging of transcriptionally coupled and uncoupled splicing. Cell 147:1054–65.
64. Graveley BR. 2000. Sorting out the complexity of SR protein functions. RNA, 2000/09/22 ed. 6:1197–1211.
65. Manley JL, Krainer AR. 2010. A rational nomenclature for serine/arginine-rich protein splicing factors (SR proteins). Genes Dev, 2010/06/03 ed. 24:1073–1074.
66. Zheng ZM. 2004. Regulation of alternative RNA splicing by exon definition and exon sequences in viral and mammalian gene expression. J Biomed Sci, 2004/04/07 ed. 11:278–294.
67. Smith CW, Valcarcel J. 2000. Alternative pre-mRNA splicing: the logic of combinatorial control. Trends Biochem Sci, 2000/08/01 ed. 25:381–388.
68. Berget SM. 1995. Exon Recognition in Vertebrate Splicing. J. Biol. Chem. 270:2411–2414.
69. Li H, Bingham PM. 1991. Arginine/serine-rich domains of the su(wa) and tra RNA processing regulators target proteins to a subnuclear compartment implicated in splicing. Cell, 1991/10/18 ed. 67:335–342.
70. Hedley ML, Amrein H, Maniatis T. 1995. An amino acid sequence motif sufficient for subnuclear localization of an arginine/serine-rich splicing factor. Proc Natl Acad Sci U S A, 1995/12/05 ed. 92:11524–11528.
71. Caceres JF, Misteli T, Screaton GR, Spector DL, Krainer AR. 1997. Role of the modular domains of SR proteins in subnuclear localization and alternative splicing specificity. J Cell Biol, 1997/07/28 ed. 138:225–238.
72. Cazalla D, Zhu J, Manche L, Huber E, Krainer AR, Caceres JF. 2002. Nuclear export and retention signals in the RS domain of SR proteins. Mol Cell Biol, 2002/09/07 ed. 22:6871–6882.
73. Shopland LS, Johnson C V, Byron M, McNeil J, Lawrence JB. 2003. Clustering of multiple specific genes and gene-rich R-bands around SC-35 domains: evidence for local euchromatic neighborhoods. J Cell Biol, 2003/09/17 ed. 162:981–990.
74. Hall K, Blair Zajdel ME, Blair GE. 2010. Unity and diversity in the human adenoviruses: exploiting alternative entry pathways for gene therapy. Biochem J, 2010/10/13 ed. 431:321–336.
75. Spector DL, Lamond AI. 2011. Nuclear speckles. Cold Spring Harb Perspect Biol, 2010/10/12 ed. 3.
76. Solis AS, Shariat N, Patton JG. 2008. Splicing fidelity, enhancers, and disease. Front Biosci, 2007/11/06 ed. 13:1926–1942.
77. Tacke R, Manley JL. 1999. Determinants of SR protein specificity. Curr Opin Cell Biol, 1999/07/08 ed. 11:358–362.
78. Cooper TA, Mattox W. 1997. The regulation of splice-site selection, and its role in human disease. Am J Hum Genet, 1997/08/01 ed. 61:259–266.
79. Ward AJ, Cooper TA. The pathobiology of splicing. J Pathol, 2009/11/18 ed. 220:152–163.
80. Lorson CL, Hahnen E, Androphy EJ, Wirth B. 1999. A single nucleotide in the SMN gene regulates splicing and is responsible for spinal muscular atrophy. Proc. Natl. Acad. Sci. U. S. A. 96:6307–11.
81. Monani UR, Lorson CL, Parsons DW, Prior TW, Androphy EJ, Burghes AH, McPherson JD. 1999. A single nucleotide difference that alters splicing patterns distinguishes the SMA gene SMN1 from the copy gene SMN2. Hum. Mol. Genet. 8:1177–83.
82. Qian W, Liu F. 2014. Regulation of alternative splicing of tau exon 10. Neurosci. Bull. 30:367–77.
83. Luo Y-B, Mastaglia FL, Wilton SD. 2014. Normal and aberrant splicing of LMNA. J. Med. Genet. 51:215–23.
84. Baralle M, Buratti E, Baralle FE. 2013. The role of TDP-43 in the pathogenesis of ALS and FTLD. Biochem. Soc. Trans. 41:1536–40.
85. Srebrow A, Kornblihtt AR. 2006. The connection between splicing and cancer. J Cell Sci 119:2635–2641.
86. Berget SM, Sharp PA. 1977. A spliced sequence at the 5’-terminus of adenovirus late mRNA. Brookhaven Symp Biol, 1977/05/12 ed. 332–344.
87. Berget SM, Moore C, Sharp PA. 1977. Spliced segments at the 5’ terminus of adenovirus 2 late mRNA. Proc Natl Acad Sci U S A, 1977/08/01 ed. 74:3171–3175.
88. Chow LT, Gelinas RE, Broker TR, Roberts RJ. 1977. An amazing sequence arrangement at the 5’ ends of adenovirus 2 messenger RNA. Cell, 1977/09/01 ed. 12:1–8.
89. Norkin LC. 2010. Adenoviruses, p. 444–470. In Virology: molecular biology and pathogenesis. ASM Press, Washington.
90. Benko M, Harrach B. 1998. A proposal for a new (third) genus within the family Adenoviridae. Arch Virol, 1998/06/25 ed. 143:829–837.
91. Davison AJ, Benkö M, Harrach B. 2003. Genetic content and evolution of adenoviruses. J Gen Virol, 2003/10/24 ed. 84:2895–2908.
92. King AMQ, Adams MJ, Carstens EB, Lefkowitz EJ (eds.). 2012. Virus taxonomy: classification and nomenclature of viruses. Ninth Report of the International Committee on Taxonomy of Viruses. Elsevier Academic Press, San Diego.
93. Kovacs GM, LaPatra SE, D’Halluin JC, Benko M. 2003. Phylogenetic analysis of the hexon and protease genes of a fish adenovirus isolated from white sturgeon (Acipenser transmontanus) supports the proposal for a new adenovirus genus. Virus Res, 2003/11/12 ed. 98:27–34.
94. Doszpoly A, Wellehan JFX, Childress AL, Tarján ZL, Kovács ER, Harrach B, Benkő M. 2013. Partial characterization of a new adenovirus lineage discovered in testudinoid turtles. Infect. Genet. Evol. 17:106–12.
95. Wold WS, Ison MG. 2013. Adenoviruses, p. 1732–1767. In Fields, BN, Knipe, DM, Howley, PM (eds.), Fields Virology, 6th ed. Lippincott Williams & Wilkins, Philadelphia.
96. Walsh MP, Seto J, Liu EB, Dehghan S, Hudson NR, Lukashev AN, Ivanova O, Chodosh J, Dyer DW, Jones MS, Seto D. 2011. Computational analysis of two species C human adenoviruses provides evidence of a novel virus. J Clin Microbiol, 2011/08/19 ed. 49:3482–3490.
97. Ghebremedhin B. 2014. Human adenovirus: Viral pathogen with increasing importance. Eur. J. Microbiol. Immunol. (Bp). 4:26–33.
98. Berk AJ. 2013. Adenoviridae, p. 1704–1731. In Fields, BN, Knipe, DM, Howley, PM (eds.), Fields Virology, 6th ed. Lippincott Williams & Wilkins, Philadelphia.
99. Walls T, Shankar A, Shingadia D. 2003. Adenovirus: an increasingly important pathogen in paediatric bone marrow transplant patients. Lancet Infect. Dis. 3:79–86.
100. Akusjärvi G. 2008. Temporal regulation of adenovirus major late alternative RNA splicing. Front Biosci, 2008/05/30 ed. 13:5006–5015.
101. Akusjärvi G, Stevenin J. 2003. Remodelling of the host cell RNA splicing machinery during an adenovirus infection. Curr Top Microbiol Immunol 272:253–286.
102. 2012. Schaechter’s Mechanisms of Microbial Disease. Lippincott Williams & Wilkins.
103. Jia R, Liu X, Tao M, Kruhlak M, Guo M, Meyers C, Baker CC, Zheng Z-M. 2009. Control of the papillomavirus early-to-late switch by differentially expressed SRp20. J. Virol. 83:167–80.
104. Chow LT, Broker TR. 1978. The spliced structures of adenovirus 2 fiber message and the other late mRNAs. Cell, 1978/10/01 ed. 15:497–510.
105. Tribouley C, Lutz P, Staub A, Kedinger C. 1994. The product of the adenovirus intermediate gene IVa2 is a transcriptional activator of the major late promoter. J Virol, 1994/07/01 ed. 68:4450–4457.
106. Ostapchuk P, Almond M, Hearing P. 2011. Characterization of Empty adenovirus particles assembled in the absence of a functional adenovirus IVa2 protein. J Virol, 2011/04/01 ed. 85:5524–5531.
107. Parks RJ. 2005. Adenovirus protein IX: a new look at an old protein. Mol Ther, 2004/12/09 ed. 11:19–25.
108. Russell WC. 2009. Adenoviruses: update on structure and function. J Gen Virol, 2008/12/18 ed. 90:1–20.
109. Ma Y, Mathews MB. 1996. Structure, function, and evolution of adenovirus-associated RNA: a phylogenetic approach. J Virol, 1996/08/01 ed. 70:5083–5099.
110. Kamel W, Segerman B, Öberg D, Punga T, Akusjärvi G. 2013. The adenovirus VA RNA-derived miRNAs are not essential for lytic virus growth in tissue culture cells. Nucleic Acids Res. 41:4802–12.
111. Galabru J, Katze MG, Robert N, Hovanessian AG. 1989. The binding of double-stranded RNA and adenovirus VAI RNA to the interferon-induced protein kinase. Eur J Biochem, 1989/01/02 ed. 178:581–589.
112. Davison AJ, Wright KM, Harrach B. 2000. DNA sequence of frog adenovirus. J Gen Virol, 2000/09/20 ed. 81:2431–2439.
113. Benko M, Harrach B. 2003. Molecular evolution of adenoviruses. Curr Top Microbiol Immunol, 2003/05/16 ed. 272:3–35.
114. Song B, Hu SL, Darai G, Spindler KR, Young CS. 1996. Conservation of DNA sequence in the predicted major late promoter regions of selected mastadenoviruses. Virology, 1996/06/15 ed. 220:390–401.
115. Sheppard M, Werner W, McCoy RJ, Johnson MA. 1998. The major late promoter and bipartite leader sequence of fowl adenovirus. Arch Virol, 1998/05/08 ed. 143:537–548.
116. Vrati S, Brookes DE, Boyle DB, Both GW. 1996. Nucleotide sequence of ovine adenovirus tripartite leader sequence and homologues of the IVa2, DNA polymerase and terminal proteins. Gene, 1996/10/24 ed. 177:35–41.
117. Kidd AH, Garwicz D, Oberg M. 1995. Human and simian adenoviruses: phylogenetic inferences from analysis of VA RNA genes. Virology, 1995/02/20 ed. 207:32–45.
118. Van Blerkom LM. 2003. Role of viruses in human evolution. Am J Phys Anthr., 2003/12/11 ed. Suppl 37:14–46.
119. Kumar S, Hedges SB. 1998. A molecular timescale for vertebrate evolution. Nature, 1998/05/15 ed. 392:917–920.
120. Belnap DM, Steven AC. 2000. “Deja vu all over again”: the similar structures of bacteriophage PRD1 and adenovirus. Trends Microbiol, 2000/03/09 ed. 8:91–93.
121. Benson SD, Bamford JK, Bamford DH, Burnett RM. 1999. Viral evolution revealed by bacteriophage PRD1 and human adenovirus coat protein structures. Cell, 1999/09/28 ed. 98:825–833.
122. Saren AM, Ravantti JJ, Benson SD, Burnett RM, Paulin L, Bamford DH, Bamford JK. 2005. A snapshot of viral evolution from genome analysis of the tectiviridae family. J Mol Biol, 2005/06/11 ed. 350:427–440.
123. Echavarria M. 2008. Adenoviruses in immunocompromised hosts. Clin Microbiol Rev, 2008/10/16 ed. 21:704–715.
124. Ladisch S, Lovejoy FH, Hierholzer JC, Oxman MN, Strieder D, Vawter GF, Finer N, Moore M. 1979. Extrapulmonary manifestations of adenovirus type 7 pneumonia simulating Reye syndrome and the possible role of an adenovirus toxin. J Pediatr, 1979/09/01 ed. 95:348–355.
125. Top Jr. FH. 1975. Control of adenovirus acute respiratory disease in U.S. Army trainees. Yale J Biol Med, 1975/07/11 ed. 48:185–195.
126. Adhikary AK, Banik U. 2014. Human adenovirus type 8: The major agent of epidemic keratoconjunctivitis (EKC). J. Clin. Virol. 61:477–486.
127. Mufson MA, Belshe RB, Horrigan TJ, Zollar LM. 1973. Cause of acute hemorrhagic cystitis in children. Am J Dis Child, 1973/11/01 ed. 126:605–609.
128. Numazaki Y, Kumasaka T, Yano N, Yamanaka M, Miyazawa T, Takai S, Ishida N. 1973. Further study on acute hemorrhagic cystitis due to adenovirus type 11. N Engl J Med, 1973/08/16 ed. 289:344–347.
129. Niemann TH, Trigg ME, Winick N, Penick GD. 1993. Disseminated adenoviral infection presenting as acute pancreatitis. Hum Pathol, 1993/10/01 ed. 24:1145–1148.
130. Kelsey DS. 1978. Adenovirus meningoencephalitis. Pediatrics, 1978/02/01 ed. 61:291–293.
131. Michaels MG, Green M, Wald ER, Starzl TE. 1992. Adenovirus infection in pediatric liver transplant recipients. J Infect Dis, 1992/01/01 ed. 165:170–174.
132. Cames B, Rahier J, Burtomboy G, de Ville de Goyet J, Reding R, Lamy M, Otte JB, Sokal EM. 1992. Acute adenovirus hepatitis in liver transplant recipients. J Pediatr, 1992/01/01 ed. 120:33–37.
133. Rowe WP, Huebner RJ, Gilmore LK, Parrott RH, Ward TG. 1953. Isolation of a cytopathogenic agent from human adenoids undergoing spontaneous degeneration in tissue culture. Proc Soc Exp Biol Med, 1953/12/01 ed. 84:570–573.
134. Rowe WP, Hartley JW, Huebner RJ. 1956. Additional serotypes of the APC virus group. Proc Soc Exp Biol Med, 1956/02/01 ed. 91:260–262.
135. Kuschner RA, Russell KL, Abuja M, Bauer KM, Faix DJ, Hait H, Henrick J, Jacobs M, Liss A, Lynch JA, Liu Q, Lyons AG, Malik M, Moon JE, Stubbs J, Sun W, Tang D, Towle AC, Walsh DS, Wilkerson D. 2013. A phase 3, randomized, double-blind, placebo-controlled study of the safety and efficacy of the live, oral adenovirus type 4 and type 7 vaccine, in U.S. military recruits. Vaccine 31:2963–71.
136. Garnett CT, Talekar G, Mahr JA, Huang W, Zhang Y, Ornelles DA, Gooding LR. 2009. Latent species C adenoviruses in human tonsil tissues. J Virol, 2008/12/26 ed. 83:2417–2428.
137. Watcharananan SP, Avery R, Ingsathit A, Malathum K, Chantratita W, Mavichak V, Chalermsanyakorn P, Jirasiritham S, Sumethkul V. 2011. Adenovirus disease after kidney transplantation: course of infection and outcome in relation to blood viral load and immune recovery. Am J Transpl., 2011/04/01 ed. 11:1308–1314.
138. Ginn SL, Alexander IE, Edelstein ML, Abedi MR, Wixon J. 2013. Gene therapy clinical trials worldwide to 2012 - an update. J Gene Med, 2013/01/29 ed. 15:65–77.
139. Arnberg N. 2012. Adenovirus receptors: implications for targeting of viral vectors. Trends Pharmacol Sci, 2012/05/25 ed. 33:442–448.
140. Larocca C, Schlom J. 2011. Viral vector-based therapeutic cancer vaccines. Cancer J, 2011/09/29 ed. 17:359–371.
141. Kirn D. 2001. Clinical research results with dl1520 (Onyx-015), a replication-selective adenovirus for the treatment of cancer: what have we learned? Gene Ther, 2001/04/21 ed. 8:89–98.
142. Bourke MG, Salwa S, Harrington KJ, Kucharczyk MJ, Forde PF, de Kruijf M, Soden D, Tangney M, Collins JK, O’Sullivan GC. 2011. The emerging role of viruses in the treatment of solid tumours. Cancer Treat Rev, 2011/01/15 ed. 37:618–632.
143. Michou AI, Santoro L, Christ M, Julliard V, Pavirani A, Mehtali M. 1997. Adenovirus-mediated gene transfer: influence of transgene, mouse strain and type of immune response on persistence of transgene expression. Gene Ther, 1997/05/01 ed. 4:473–482.
144. Yang L, Sanchez A, Ward JM, Murphy BR, Collins PL, Bukreyev A. 2008. A paramyxovirus-vectored intranasal vaccine against Ebola virus is immunogenic in vector-immune animals. Virology, 2008/06/24 ed. 377:255–264.
145. Yang Y, Ertl HC, Wilson JM. 1994. MHC class I-restricted cytotoxic T lymphocytes to viral antigens destroy hepatocytes in mice infected with E1-deleted recombinant adenoviruses. Immunity, 1994/08/01 ed. 1:433–442.
146. Yang Y, Nunes FA, Berencsi K, Furth EE, Gonczol E, Wilson JM. 1994. Cellular immunity to viral antigens limits E1-deleted adenoviruses for gene therapy. Proc Natl Acad Sci U S A, 1994/05/10 ed. 91:4407–4411.
147. Walters RW, Freimuth P, Moninger TO, Ganske I, Zabner J, Welsh MJ. 2002. Adenovirus fiber disrupts CAR-mediated intercellular adhesion allowing virus escape. Cell, 2002/09/26 ed. 110:789–799.
148. Greber UF, Willetts M, Webster P, Helenius A. 1993. Stepwise dismantling of adenovirus 2 during entry into cells. Cell, 1993/11/05 ed. 75:477–486.
149. Wiethoff CM, Wodrich H, Gerace L, Nemerow GR. 2005. Adenovirus protein VI mediates membrane disruption following capsid disassembly. J Virol, 2005/02/01 ed. 79:1992–2000.
150. Mabit H, Nakano MY, Prank U, Saam B, Dohner K, Sodeik B, Greber UF. 2002. Intact microtubules support adenovirus and herpes simplex virus infections. J Virol, 2002/09/05 ed. 76:9962–9971.
151. Wodrich H, Cassany A, D’Angelo MA, Guan T, Nemerow G, Gerace L. 2006. Adenovirus core protein pVII is translocated into the nucleus by multiple import receptor pathways. J Virol, 2006/09/16 ed. 80:9608–9618.
152. Spangler R, Bruner M, Dalie B, Harter ML. 1987. Activation of adenovirus promoters by the adenovirus E1A protein in cell-free extracts. Science, 1987/08/28 ed. 237:1044–1046.
153. Berk AJ. 1986. Adenovirus promoters and E1A transactivation. Annu Rev Genet, 1986/01/01 ed. 20:45–79.
154. Stephens C, Harlow E. 1987. Differential splicing yields novel adenovirus 5 E1A mRNAs that encode 30 kd and 35 kd proteins. EMBO J. 6:2027–35.
155. Swaminathan S, Thimmapaya B. 1995. Regulation of adenovirus E2 transcription unit. Curr. Top. Microbiol. Immunol. 199 (Pt 3):177–94.
156. Blackford AN, Grand RJ. 2009. Adenovirus E1B 55-kilodalton protein: multiple roles in viral infection and cell transformation. J Virol, 2009/02/13 ed. 83:4000–4012.
157. Frisch SM, Mymryk JS. 2002. Adenovirus-5 E1A: paradox and paradigm. Nat Rev Mol Cell Biol, 2002/06/04 ed. 3:441–452.
158. Debbas M, White E. 1993. Wild-type p53 mediates apoptosis by E1A, which is inhibited by E1B. Genes Dev, 1993/04/01 ed. 7:546–554.
159. Bondesson M, Öhman K, Manervik M, Fan S, Akusjärvi G. 1996. Adenovirus E4 open reading frame 4 protein autoregulates E4 transcription by inhibiting E1A transactivation of the E4 promoter. J Virol, 1996/06/01 ed. 70:3844–3851.
160. Kanopka A, Mühlemann O, Petersen-Mahrt S, Estmer C, Öhrmalm C, Akusjärvi G. 1998. Regulation of adenovirus alternative RNA splicing by dephosphorylation of SR proteins. Nature, 1998/05/29 ed. 393:185–187.
161. Marcellus RC, Chan H, Paquette D, Thirlwell S, Boivin D, Branton PE. 2000. Induction of p53-independent apoptosis by the adenovirus E4orf4 protein requires binding to the Balpha subunit of protein phosphatase 2A. J Virol, 2000/08/10 ed. 74:7869–7877.
162. Tauber B, Dobner T. 2001. Molecular regulation and biological function of adenovirus early genes: the E4 ORFs. Gene, 2001/11/15 ed. 278:1–23.
163. Nordqvist K, Ohman K, Akusjarvi G. 1994. Human adenovirus encodes two proteins which have opposite effects on accumulation of alternatively spliced mRNAs. Mol Cell Biol, 1994/01/01 ed. 14:437–445.
164. Rawle FC, Tollefson AE, Wold WS, Gooding LR. 1989. Mouse anti-adenovirus cytotoxic T lymphocytes. Inhibition of lysis by E3 gp19K but not E3 14.7K. J Immunol, 1989/09/15 ed. 143:2031–2037.
165. Horwitz MS. 2004. Function of adenovirus E3 proteins and their interactions with immunoregulatory cell proteins. J Gene Med, 2004/02/24 ed. 6 Suppl 1:S172–83.
166. Akusjärvi G, Persson H. 1981. Controls of RNA splicing and termination in the major late adenovirus transcription unit. Nature, 1981/07/30 ed. 292:420–426.
167. Liu H, Naismith JH, Hay RT. 2003. Adenovirus DNA replication. Curr Top Microbiol Immunol, 2003/05/16 ed. 272:131–164.
168. De Jong RN, van der Vliet PC. 1999. Mechanism of DNA replication in eukaryotic cells: cellular host factors stimulating adenovirus DNA replication. Gene, 1999/08/06 ed. 236:1–12.
169. De Jong RN, van der Vliet PC, Brenkman AB. 2003. Adenovirus DNA replication: protein priming, jumping back and the role of the DNA binding protein DBP. Curr Top Microbiol Immunol, 2003/05/16 ed. 272:187–211.
170. Flint SJ. 2009. Principles of virology, 3rd ed. ASM Press, Washington, D.C.
171. Aspegren A, Bridge E. 2002. Release of snRNP and RNA from transcription sites in adenovirus-infected cells. Exp Cell Res, 2002/05/25 ed. 276:273–283.
172. Bridge E, Carmo-Fonseca M, Lamond A, Pettersson U. 1993. Nuclear organization of splicing small nuclear ribonucleoproteins in adenovirus-infected cells. J Virol, 1993/10/01 ed. 67:5792–5802.
173. Bridge E, Mattsson K, Aspegren A, Sengupta A. 2003. Adenovirus early region 4 promotes the localization of splicing factors and viral RNA in late-phase interchromatin granule clusters. Virology, 2003/07/02 ed. 311:40–50.
174. Bridge E, Xia DX, Carmo-Fonseca M, Cardinali B, Lamond AI, Pettersson U. 1995. Dynamic organization of splicing factors in adenovirus-infected cells. J Virol, 1995/01/01 ed. 69:281–290.
175. Bridge E, Pettersson U. 1996. Nuclear organization of adenovirus RNA biogenesis. Exp Cell Res, 1996/12/15 ed. 229:233–239.
176. Pombo A, Ferreira J, Bridge E, Carmo-Fonseca M. 1994. Adenovirus replication and transcription sites are spatially separated in the nucleus of infected cells. Embo J, 1994/11/01 ed. 13:5075–5085.
177. Puvion E, Puvion-Dutilleul F. 1996. Ultrastructure of the nucleus in relation to transcription and splicing: roles of perichromatin fibrils and interchromatin granules. Exp Cell Res, 1996/12/15 ed. 229:217–225.
178. Puvion-Dutilleul F, Bachellerie JP, Visa N, Puvion E. 1994. Rearrangements of intranuclear structures involved in RNA processing in response to adenovirus infection. J Cell Sci, 1994/06/01 ed. 107 (Pt 6):1457–1468.
179. Puvion-Dutilleul F, Roussev R, Puvion E. 1992. Distribution of viral RNA molecules during the adenovirus type 5 infectious cycle in HeLa cells. J Struct Biol, 1992/05/01 ed. 108:209–220.
180. Aspegren A, Rabino C, Bridge E. 1998. Organization of splicing factors in adenovirus-infected cells reflects changes in gene expression during the early to late phase transition. Exp Cell Res, 1998/11/26 ed. 245:203–213.
181. Gama-Carvalho M, Condado I, Carmo-Fonseca M. 2003. Regulation of adenovirus alternative RNA splicing correlates with a reorganization of splicing factors in the nucleus. Exp Cell Res, 2003/08/28 ed. 289:77–85.
182. Shaw AR, Ziff EB. 1980. Transcripts from the adenovirus-2 major late promoter yield a single early family of 3’ coterminal mRNAs and five late families. Cell, 1980/12/01 ed. 22:905–916.
183. Larsson S, Svensson C, Akusjärvi G. 1992. Control of adenovirus major late gene expression at multiple levels. J Mol Biol 225:287–298.
184. Morris SJ, Leppard KN. 2009. Adenovirus serotype 5 L4-22K and L4-33K proteins have distinct functions in regulating late gene expression. J Virol, 2009/01/30 ed. 83:3049–3058.
185. Törmänen H, Backström E, Carlsson A, Akusjärvi G. 2006. L4-33K, an adenovirus-encoded alternative RNA splicing factor. J Biol Chem 281:36510–36517.
186. Nevins JR, Wilson MC. 1981. Regulation of adenovirus-2 gene expression at the level of transcriptional termination and RNA processing. Nature, 1981/03/12 ed. 290:113–118.
187. Ostapchuk P, Hearing P. 2005. Control of adenovirus packaging. J Cell Biochem, 2005/07/01 ed. 96:25–35.
188. Ostapchuk P, Hearing P. 2003. Regulation of adenovirus packaging. Curr Top Microbiol Immunol, 2003/05/16 ed. 272:165–185.
189. Reach M, Babiss LE, Young CS. 1990. The upstream factor-binding site is not essential for activation of transcription from the adenovirus major late promoter. J Virol, 1990/12/01 ed. 64:5851–5860.
190. Reach M, Xu LX, Young CS. 1991. Transcription from the adenovirus major late promoter uses redundant activating elements. Embo J, 1991/11/01 ed. 10:3439–3446.
191. Griffith JD, Makhov A, Zawel L, Reinberg D. 1995. Visualization of TBP oligomers binding and bending the HIV-1 and adeno promoters. J Mol Biol, 1995/03/10 ed. 246:576–584.
192. Kim JL, Nikolov DB, Burley SK. 1993. Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature, 1993/10/07 ed. 365:520–527.
193. Lu H, Reach MD, Minaya E, Young CS. 1997. The initiator element of the adenovirus major late promoter has an important role in transcription initiation in vivo. J Virol, 1997/01/01 ed. 71:102–109.
194. Roy AL, Du H, Gregor PD, Novina CD, Martinez E, Roeder RG. 1997. Cloning of an inr- and E-box-binding protein, TFII-I, that interacts physically and functionally with USF1. Embo J, 1998/01/31 ed. 16:7091–7104.
195. Du H, Roy AL, Roeder RG. 1993. Human transcription factor USF stimulates transcription through the initiator elements of the HIV-1 and the Ad-ML promoters. Embo J, 1993/02/01 ed. 12:501–511.
196. Roy AL, Meisterernst M, Pognonec P, Roeder RG. 1991. Cooperative interaction of an initiator-binding transcription initiation factor and the helix-loop-helix activator USF. Nature, 1991/11/21 ed. 354:245–248.
197. Kaufmann J, Verrijzer CP, Shao J, Smale ST. 1996. CIF, an essential cofactor for TFIID-dependent initiator function. Genes Dev, 1996/04/01 ed. 10:873–886.
198. Kaufmann J, Ahrens K, Koop R, Smale ST, Muller R. 1998. CIF150, a human cofactor for transcription factor IID-dependent initiator function. Mol Cell Biol, 1998/01/07 ed. 18:233–239.
199. Parks CL, Shenk T. 1997. Activation of the adenovirus major late promoter by transcription factors MAZ and Sp1. J Virol, 1997/11/26 ed. 71:9600–9607.
200. Brunet LJ, Babiss LE, Young CS, Mills DR. 1987. Mutations in the adenovirus major late promoter: effects on viability and transcription during infection. Mol Cell Biol, 1987/03/01 ed. 7:1091–1100.
201. Lagrange T, Kim TK, Orphanides G, Ebright YW, Ebright RH, Reinberg D. 1996. High-resolution mapping of nucleoprotein complexes by site-specific protein-DNA photocrosslinking: organization of the human TBP-TFIIA-TFIIB-DNA quaternary complex. Proc Natl Acad Sci U S A, 1996/10/01 ed. 93:10620–10625.
202. Lee S, Hahn S. 1995. Model for binding of transcription factor TFIIB to the TBP-DNA complex. Nature, 1995/08/17 ed. 376:609–612.
203. Nikolov DB, Chen H, Halay ED, Usheva AA, Hisatake K, Lee DK, Roeder RG, Burley SK. 1995. Crystal structure of a TFIIB-TBP-TATA-element ternary complex. Nature, 1995/09/14 ed. 377:119–128.
204. Jansen-Durr P, Mondesert G, Kedinger C. 1989. Replication-dependent activation of the adenovirus major late promoter is mediated by the increased binding of a transcription factor to sequences in the first intron. J Virol, 1989/12/01 ed. 63:5124–5132.
205. Leong K, Lee W, Berk AJ. 1990. High-level transcription from the adenovirus major late promoter requires downstream binding sites for late-phase-specific factors. J Virol, 1990/01/01 ed. 64:51–60.
206. Mansour SL, Grodzicker T, Tjian R. 1986. Downstream sequences affect transcription initiation from the adenovirus major late promoter. Mol Cell Biol, 1986/07/01 ed. 6:2684–2694.
207. Lutz P, Kedinger C. 1996. Properties of the adenovirus IVa2 gene product, an effector of late-phase-dependent activation of the major late promoter. J Virol, 1996/03/01 ed. 70:1396–1405.
208. Ali H, LeRoy G, Bridge G, Flint SJ. 2007. The adenovirus L4 33-kilodalton protein binds to intragenic sequences of the major late promoter required for late phase-specific stimulation of transcription. J Virol 81:1327–1338.
209. Ewing SG, Byrd SA, Christensen JB, Tyler RE, Imperiale MJ. 2007. Ternary complex formation on the adenovirus packaging sequence by the IVa2 and L4 22-kilodalton proteins. J Virol, 2007/09/07 ed. 81:12450–12457.
210. Ostapchuk P, Anderson ME, Chandrasekhar S, Hearing P. 2006. The L4 22-kilodalton protein plays a role in packaging of the adenovirus genome. J Virol, 2006/07/01 ed. 80:6973–6981.
211. Wu K, Orozco D, Hearing P. 2012. The adenovirus L4-22K protein is multifunctional and is an integral component of crucial aspects of infection. J Virol, 2012/07/20 ed. 86:10474–10483.
212. Yang TC, Maluf NK. 2012. Cooperative heteroassembly of the adenoviral L4-22K and IVa2 proteins onto the viral packaging sequence DNA. Biochemistry, 2012/02/07 ed. 51:1357–1368.
213. Yang TC, Maluf NK. 2010. Self-association of the adenoviral L4-22K protein. Biochemistry, 2010/10/12 ed. 49:9830–9838.
214. Kanopka A, Mühlemann O, Akusjärvi G. 1996. Inhibition by SR proteins of splicing of a regulated adenovirus pre-mRNA. Nature, 1996/06/06 ed. 381:535–538.
215. Lützelberger M, Backström E, Akusjärvi G. 2005. Substrate-dependent differences in U2AF requirement for splicing in adenovirus-infected cell extracts. J Biol Chem 280:25478–25484.
216. Farley DC, Brown JL, Leppard KN. 2004. Activation of the early-late switch in adenovirus type 5 major late transcription unit expression by L4 gene products. J Virol 78:1782–1791.
217. Hayes BW, Telling GC, Myat MM, Williams JF, Flint SJ. 1990. The adenovirus L4 100-kilodalton protein is necessary for efficient translation of viral late mRNA species. J Virol, 1990/06/01 ed. 64:2732–2742.
218. Hong SS, Szolajska E, Schoehn G, Franqueville L, Myhre S, Lindholm L, Ruigrok RWH, Boulanger P, Chroboczek J. 2005. The 100K-chaperone protein from adenovirus serotype 2 (Subgroup C) assists in trimerization and nuclear localization of hexons from subgroups C and B adenoviruses. J. Mol. Biol. 352:125–38.
219. Törmänen Persson H, Aksaas AK, Kvissel AK, Punga T, Engström Å, Skålhegg BS, Akusjärvi G. 2012. Two cellular protein kinases, DNA-PK and PKA, phosphorylate the adenoviral L4-33K protein and have opposite effects on L1 alternative RNA splicing. PLoS One, 2012/03/01 ed. 7:e31871.
220. Backström E, Kaufmann KB, Lan X, Akusjärvi G. 2010. Adenovirus L4-22K stimulates major late transcription by a mechanism requiring the intragenic late-specific transcription factor-binding site. Virus Res, 2010/07/14 ed. 151:220–228.
221. Morris SJ, Scott GE, Leppard KN. 2010. Adenovirus late-phase infection is controlled by a novel L4 promoter. J Virol, 2010/05/07 ed. 84:7096–7104.
Bourgeois CF, Lejeune F, Stevenin J. 2004. Broad specificity of SR (serine/arginine) proteins in the regulation of alternative splicing of pre- 223. 224. 225. 226. 227. 228. 229. 230. 231. 232. 233. 234. 235. 236. 237. messenger RNA. Prog Nucleic Acid Res Mol Biol, 2004/06/24 ed. 78:37– 88. Fu XD, Maniatis T. 1990. Factor required for mammalian spliceosome assembly is localized to discrete regions in the nucleus. Nature, 1990/02/01 ed. 343:437–441. Hall LL, Smith KP, Byron M, Lawrence JB. 2006. Molecular anatomy of a speckle. Anat Rec A Discov Mol Cell Evol Biol, 2006/06/09 ed. 288:664– 675. Gambke C, Deppert W. 1981. Late nonstructural 100,000- and 33,000dalton proteins of adenovirus type 2. I. Subcellular localization during the course of infection. J Virol, 1981/11/01 ed. 40:585–593. Estmer Nilsson C, Petersen-Mahrt S, Durot C, Shtrichman R, Krainer AR, Kleinberger T, Akusjärvi G. 2001. The adenovirus E4-ORF4 splicing enhancer protein interacts with a subset of phosphorylated SR proteins. Embo J 20:864–871. James NJ, Howell GJ, Walker JH, Blair GE. 2010. The role of Cajal bodies in the expression of late phase adenovirus proteins. Virology, 2010/02/09 ed. 399:299–311. Sanford JR, Ellis JD, Cazalla D, Caceres JF. 2005. Reversible phosphorylation differentially affects nuclear and cytoplasmic functions of splicing factor 2/alternative splicing factor. Proc Natl Acad Sci U S A, 2005/10/08 ed. 102:15042–15047. Lucas JJ, Ginsberg HS. 1972. Transcription and transport of virus-specific ribonucleic acids in African green monkey kidney cells abortively infected with type 2 adenovirus. J. Virol. 10:1109–17. Mondesert G, Tribouley C, Kedinger C. 1992. Identification of a novel downstream binding protein implicated in late-phase-specific activation of the adenovirus major late promotor. Nucleic Acids Res, 1992/08/11 ed. 20:3881–3889. Lutz P, Rosa-Calatrava M, Kedinger C. 1997. The product of the adenovirus intermediate gene IX is a transcriptional activator. J. Virol. 
71:5102–9. Jansen-Durr P, Boeuf H, Kédinger C. 1988. Replication-induced stimulation of the major late promoter of adenovirus is correlated to the binding of a factor to sequences in the first intron. Nucleic Acids Res. 16:3771–86. Maderious A, Chen-Kiang S. 1984. Pausing and premature termination of human RNA polymerase II during transcription of adenovirus in vivo and in vitro. Proc. Natl. Acad. Sci. U. S. A. 81:5931–5. Mok M, Maderious A, Chen-Kiang S. 1984. Premature termination by human RNA polymerase II occurs temporally in the adenovirus major late transcriptional unit. Mol. Cell. Biol. 4:2031–40. Li L, Davie JR. 2010. The role of Sp1 and Sp3 in normal and cancer cell biology. Ann. Anat. 192:275–83. Won J, Yim J, Kim TK. 2002. Sp1 and Sp3 recruit histone deacetylase to repress transcription of human telomerase reverse transcriptase (hTERT) promoter in normal human somatic cells. J. Biol. Chem. 277:38230–8. Montes M, Becerra S, Sánchez-Álvarez M, Suñé C. 2012. Functional coupling of transcription and splicing. Gene 501:104–17. 73 238. 239. 240. 241. 242. 243. 244. 245. 246. 247. 248. 249. 250. 74 Kreivi JP, Akusjärvi G. 1994. Regulation of adenovirus alternative RNA splicing at the level of commitment complex formation. Nucleic Acids Res, 1994/02/11 ed. 22:332–337. Delsert C, Morin N, Klessig DF. 1989. cis-acting elements and a transacting factor affecting alternative splicing of adenovirus L1 transcripts. Mol Cell Biol, 1989/10/01 ed. 9:4364–4371. Singh R, Valcarcel J, Green MR. 1995. Distinct binding specificities and functions of higher eukaryotic polypyrimidine tract-binding proteins. Science (80-. )., 1995/05/26 ed. 268:1173–1176. Page-McCaw PS, Amonlirdviman K, Sharp PA. 1999. PUF60: a novel U2AF65-related splicing activity. RNA, 1999/12/22 ed. 5:1548–1560. Östberg S, Törmänen Persson H, Akusjärvi G. 2012. 
Serine 192 in the tiny RS repeat of the adenoviral L4-33K splicing enhancer protein is essential for function and reorganization of the protein to the periphery of viral replication centers. Virology, 2012/09/05 ed. 433:273–281. Stancheva I, Schirmer EC. 2014. Nuclear envelope: connecting structural genome organization to regulation of gene expression. Adv. Exp. Med. Biol. 773:209–44. Steglich B, Sazer S, Ekwall K. Transcriptional regulation at the yeast nuclear envelope. Nucleus 4:379–89. Guimet D, Hearing P. 2013. The adenovirus L4-22K protein has distinct functions in the posttranscriptional regulation of gene expression and encapsidation of the viral genome. J Virol, 2013/05/03 ed. 87:7688–7699. Kataoka N, Bachorik JL, Dreyfuss G. 1999. Transportin-SR, a nuclear import receptor for SR proteins. J Cell Biol, 1999/06/15 ed. 145:1145–1152. Lai MC, Lin RI, Huang SY, Tsai CW, Tarn WY. 2000. A human importin-beta family protein, transportin-SR2, interacts with the phosphorylated RS domain of SR proteins. J. Biol. Chem. 275:7950–7. Cáceres JF, Screaton GR, Krainer AR. 1998. A specific subset of SR proteins shuttles continuously between the nucleus and the cytoplasm. Genes Dev. 12:55–66. Gama-Carvalho M, Carvalho MP, Kehlenbach A, Valcarcel J, CarmoFonseca M. 2001. Nucleocytoplasmic shuttling of heterodimeric splicing factor U2AF. J Biol Chem, 2000/12/29 ed. 276:13104–13112. 2001. IUPAC-IUB joint commission on biochemical nomenclature abbreviations and symbols for the description of conformations of polynucleotide chains. Curr Protoc Nucleic Acid Chem, 2008/04/23 ed. Appendix 1:Appendix 1C. Acta Universitatis Upsaliensis Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1062 Editor: The Dean of the Faculty of Medicine A doctoral dissertation from the Faculty of Medicine, Uppsala University, is usually a summary of a number of papers. 
A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine. (Prior to January, 2005, the series was published under the title “Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine”.)

Distribution: publications.uu.se
urn:nbn:se:uu:diva-238487

ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2014