Homology searching of the amino acid sequence databases revealed that the 1,417 amino acid protein encoded by the BLM cDNA sequence is homologous to the RecQ helicases (Fig. 30-7), a subfamily of DExH box-containing DNA and RNA helicases.26 The RecQ helicases are members of a much larger group of proteins that contain seven amino acid motifs present in most DNA and RNA helicases. Helicases are defined biochemically by two activities: (a) they are DNA- or RNA-dependent ATPases, and (b) with ATP and Mg++ as co-factors, they catalyze the unwinding of duplex nucleic acids. Because nucleic acids are predominantly in duplex form in the cell, helicases are active in all processes of nucleic acid metabolism that require access to single-stranded molecules, namely, DNA replication, DNA repair, recombination, RNA transcription, and protein translation. Given that the BLM protein is a helicase,37 two questions arise: In which of these many processes does BLM participate? And what are the specific nucleic acid substrates in the cell on which BLM acts? Recent experimental evidence from the study of RecQ family members and BLM has provided some important insights.
Alignment of the amino acid sequences in the domains containing the seven helicase motifs (I, Ia, II, III, IV, V and VI) of selected RecQ helicases. The Megalign computer program (DNAStar) performed the sequence alignments. Numbers at left indicate the amino acid positions in each protein, and gene product names are at the right. Identities present in all six selected proteins are boxed. Overlined sequences mark the seven helicase motifs in the helicase domain. The DExH box is in helicase motif II. (Reprinted with permission from Ellis, NA, German J: Molecular genetics of Bloom's syndrome. Hum Mol Genet 5:1457, 1996.)
recQ was isolated as a mutation in E. coli that generated resistance to thymineless death and was identified as a member of the RecF pathway of DNA recombination.38 In appropriately marked bacterial strains, recQ mutants are hyporecombinogenic, which strongly suggests that RecQ carries out a step in DNA recombination. When the RecQ protein was purified, it was found to have DNA-dependent ATPase and DNA strand-displacement activities that defined it as a DNA helicase.39 Together with the RecA and SSB (single-stranded DNA-binding) proteins, RecQ can catalyze both the formation and the dissolution of recombinational intermediates in vitro.40 What might drive the reaction in the cell in one direction or the other is unknown; however, genetic evidence has been presented to suggest that RecQ can inhibit illegitimate recombination.41 So, in E. coli, RecQ could help maintain genomic stability by preventing the formation of duplexes between imperfectly homologous DNA sequences. Also, there is evidence that the RecF pathway operates during DNA replication to maintain and reactivate replication complexes that have stalled, e.g., when they have encountered sites of DNA damage such as cyclopyrimidine dimers.42 RecQ, as a member of the RecF pathway, could help maintain the integrity of the replication complex when fork progression is impeded.
In Saccharomyces cerevisiae, for which the entire genomic DNA sequence is known, there is a single RecQ family member. This gene, SGS1, was first identified as a mutation that is a slow-growth suppressor of a cell containing a mutation in its topoisomerase 3 (TOP3) gene.43 top3 mutants not only proliferate slowly but also have, by measures at several different loci, vastly elevated recombination frequencies, e.g., at the rDNA locus, near telomeres, and in diploids at genes marked by heteroallelic mutations. Suppression of these phenotypes by deletion of SGS1 suggests that top3p and sgs1p (the proteins) interact physically. Supporting this possibility, SGS1 was identified by TOP3 in a yeast two-hybrid screen.43 In addition to interacting with top3p, sgs1p interacts physically with top2p.44 Finally, sgs1/top1 double mutants exhibit a slow-growth phenotype that neither single mutant exhibits.42 Thus, sgs1p has genetic interactions, physical interactions, or both with each of the three topoisomerases in yeast. sgs1p possesses DNA-dependent ATPase and DNA strand-displacement activities.42
Mutation in SGS1 by itself also causes a hyperrecombination phenotype, one that is milder than that of top3 mutants.46 In addition, the cell-doubling time of sgs1 mutants is increased, and the mutant cells feature increased nondisjunction in both mitosis and meiosis.44,46 These attributes raise the possibility of a defect in genomic stability. It has been suggested that sgs1p suppresses recombination, for example, in highly repetitive DNA sequences,43,46 and recent genetic evidence suggests that sgs1p, like RecQ, can inhibit illegitimate recombination.47 E. coli RecQ in vitro can enter and unwind a closed circular duplex DNA molecule, and it can stimulate the activity of E. coli topoisomerase 3 to catenate such double-stranded molecules.48 These data suggest that at the very least a functional interaction exists between RecQ and topoisomerase 3 and that the function of the yeast and bacterial RecQ family proteins is potentially highly conserved.
The structure of S. cerevisiae's sgs1p differs from E. coli's RecQ in an important way: RecQ is a 610 amino acid protein with an N-terminal helicase domain (approximately 300 amino acids) and a C-terminal domain of unknown function. These two domains are highly positively charged. Sgs1p, on the other hand, is a 1,447 amino acid protein that, in addition to the helicase and C-terminal domains, contains a highly negatively charged N-terminal domain (approximately 650 amino acids). The regions of sgs1p that interact with topoisomerases have been mapped to this N-terminal domain.43,44 In its structure, BLM resembles sgs1p in having a highly negatively charged N-terminal domain (650 amino acids) along with positively charged helicase and C-terminal domains. Thus, as with sgs1p, the N-terminal domain of BLM could provide specificity to the function of the helicase by determining BLM's interactions with other proteins.
In Schizosaccharomyces pombe, an SGS1-like recQ gene, referred to as rqh1+, was identified by a mutation, rad12, that causes hypersensitivity to UV irradiation, and by a second, hus2, that causes hypersensitivity to hydroxyurea.49,50 These mutations also confer hyperrecombination and chromosome-nondisjunction phenotypes similar to those in S. cerevisiae sgs1. Molecular genetic evidence points to a function for rqh1+ protein in maintaining replication-fork integrity when DNA damage or fork-progression inhibition occurs during S phase,49,50 and possibly a function in signaling to the cell-cycle-control machinery.51 top3 mutants in S. pombe are viable for only a limited number of cell generations, and, like rqh1 mutants, they exhibit a ‘cut’ phenotype, which signifies aberrant chromosome segregation. Consistent with a conserved interaction between RecQ family proteins and topoisomerase 3s, rqh1 mutation suppresses the top3 lethal phenotype.52 Such observations support the hypothesis that the RecQ family proteins and topoisomerase 3s together facilitate sister-chromatid separation at the sites of termination of DNA replication.43,52 The combined genetic and biochemical evidence points to possible roles for RecQ family proteins in three critical processes: the suppression of illegitimate recombination events, the maintenance of replication-fork integrity during periods when the complex is stalled, and the separation of sister chromatids.
The cloning of rqh1+ uncovered another feature of the domain structure of the RecQ family. Immediately C-terminal of the central helicase domain, rqh1+ and BLM both contain a segment of 200 amino acids with approximately 20 percent identity, referred to as the C-terminal extended homology region. The other RecQ family members mentioned above contain this region, but in pairwise comparisons, the homology varies both in the number of amino acids and in the percent of identity.53 Additionally, by homology searching of the protein databases, a second motif was identified C-terminal of the extended homology region.54 This motif, called the HRDC (for helicase and RNase D C-terminal domain), is implicated in DNA binding. Because mutation in the C-terminal extended homology region can destroy helicase activity (see below), and because the HRDC is proposed to act in DNA binding,54 these regions may play a role in the recognition of specific substrates in vivo.
Mammalian cells have a RecQ-like protein consisting of 659 amino acids that is referred to as RECQL1 (also called RECQL). RECQL1 was isolated as a major ATPase of HeLa cells and was shown to have DNA helicase activity.55,56 The cellular role of RECQL1 is unknown. After RECQL1 and BLM, a third RecQ family member was identified: WRN, the gene that when mutated results in Werner syndrome (WS), a disorder defined clinically by premature aging (see Chap. 33). WRN encodes a 1,432 amino acid product having a domain structure similar to that of sgs1p and BLM.57 WRN is a DNA helicase,58,59 but, unlike BLM or the other known RecQ helicases, WRN contains a 5′ to 3′ exonuclease activity in its N-terminal domain.60 This difference in their N-terminal domain structures and functions could explain, in part, why clinical BS bears essentially no resemblance to clinical WS. WS does predispose to certain rare neoplasms, and WS cells exhibit what has been called “variegated translocation mosaicism.”61 Although excessive chromosome breakage as seen in BS is not present, fibroblasts cultured from WS skin grow as clones, with each clonal line marked by a distinctive chromosome translocation.61,62 WS cells also exhibit an increased frequency of mutations, mostly deletions, at the only specific locus tested so far, the HPRT locus. Thus, the identification of WRN may have established a connection between the aging process and the maintenance of genomic stability. Correspondingly, the sgs1 mutation in yeast gives a premature aging phenotype that is associated with the formation of extrachromosomal rDNA circles.63,64
Recently, two additional human RecQ helicase family members, RECQL4 and RECQL5, were identified by searching the cDNA sequence database.65 Mutations in RECQL4 have been reported in persons diagnosed with Rothmund-Thomson syndrome.66 For BLM and WRN we know that absence of a normal allele leads to genomic instability. For the remaining RecQ members, RECQL1 and RECQL5, there is no information to suggest that mutations in them produce viable phenotypes; it is possible, however, that entities caused by such mutations are already known in clinical medicine but remain unexplained.
Structure and Function of the BLM Helicase
Homology to the RecQ helicases strongly suggested that BLM itself is a DNA helicase. To demonstrate that BLM has this activity, however, it was expressed with a C-terminal hexahistidine tag in S. cerevisiae, partially purified by nickel-chelation chromatography, and tested in conventional assays for DNA-dependent ATPase and strand-displacement activities.37,67 BLM can unwind a number of different DNA duplex substrates, but it has a striking preference for G4 DNA (a tetrameric DNA structure that can form between runs of guanines).61
In the 108 persons with BS in whom the 48 different mutations have been identified, 4 of the 9 missense mutations identified (above) alter different amino acid residues in the helicase domain. One mutation replaces the glutamine at residue 672 with an arginine (Q672R; see Table 30-4); this glutamine lies 10 amino acid residues N-terminal of the helicase motif I, and it is conserved in all RecQ helicases (see Fig. 30-7). Two mutations have been identified in motif IV, and one at a conserved histidine residue between motifs V and VI (our unpublished observations). We expect that all four of these amino acid substitutions either reduce or destroy BLM's helicase activity. Experimentally, the Q672R mutation has been introduced into a BLM cDNA expression construct, mutant BLM produced in yeast, and the partially purified protein assayed for helicase activity; indeed, BLM Q672R protein has reduced DNA-dependent ATPase activity and lacks detectable DNA strand-displacement activity.67
In addition to the mutations inside the helicase domain, a cluster of 5 amino acid substitutions has been found in a 50 amino acid stretch of the RecQ C-terminal extended homology region (reference 69 and our unpublished observations). The first such mutation that has been studied in some detail replaces a conserved cysteine at residue 1055 with a serine (C1055S). This amino acid substitution has been introduced experimentally into the BLM cDNA expression construct, the mutant BLM then produced in yeast, and the partially purified protein assayed for helicase activity; the BLM C1055S protein lacks detectable ATPase and DNA strand-displacement activities.67 A similar result was obtained when this mutation was introduced at the same position of the mouse Blm gene.53 (Mouse and human BLM genes are highly conserved throughout this region.) Given the clustering of the amino acid substitutions in this region, we predict that the other mutations in the C-terminal extended homology region have similar effects on BLM's helicase activity. The observation that BLM mutations in persons with BS ablate its helicase activity may be interpreted to mean that the helicase activity is indispensable to the protein's normal function.
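The substitution labels used here (Q672R, C1055S) follow a compact convention: the one-letter code of the normal residue, its position in the protein, and the one-letter code of the replacing residue. The short Python sketch below illustrates how such a label can be decoded. The function and class names are hypothetical, chosen only for this illustration, and the residue table covers only the four amino acids named in the text.

```python
import re
from typing import NamedTuple

# One-letter codes for the residues mentioned in the text only (not a full table).
RESIDUE_NAMES = {"Q": "glutamine", "R": "arginine", "C": "cysteine", "S": "serine"}


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the normal residue
    position: int   # residue number in the protein
    mutant: str     # one-letter code of the replacing residue


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'Q672R' into its three parts."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a single-residue substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def describe(label: str) -> str:
    """Spell out a substitution label in words."""
    sub = parse_substitution(label)
    return f"{RESIDUE_NAMES[sub.wild_type]} {sub.position} -> {RESIDUE_NAMES[sub.mutant]}"


if __name__ == "__main__":
    for label in ("Q672R", "C1055S"):
        print(label, "=", describe(label))
```

Decoded this way, Q672R reads as glutamine 672 replaced by arginine, and C1055S as cysteine 1055 replaced by serine, matching the descriptions given in the text.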
Antibodies to an N-terminal segment of BLM have been raised in rabbits, and a protein with an apparent molecular weight of 180 kDa has been identified by Western blot analysis of fibroblast, lymphoblastoid, and HeLa cells. This 180-kDa molecule is absent from all the BS cell lines homozygous for premature translation-termination mutations that have been examined.67,70 This indicates that the 180-kDa molecule is BLM, the BS protein. At the same time, it demonstrates that anti-BLM antibody is useful for characterizing BLM, for defining its location in the cell, and for identifying proteins with which BLM may interact.
With the BLM antibodies available, it has been possible to introduce a BLM cDNA expression construct into BLM-lacking BS cells and to determine whether BLM becomes detectable by Western blot analysis, by cellular immunofluorescence, or both. The normal BLM expression construct was transfected into SV40-transformed BS fibroblasts (cell line GM08505). This cell line is derived from a diploid fibroblast line homozygous for blmAsh; it has the high-SCE phenotype of BS and lacks detectable BLM protein. Transfection of BLM restores the 180-kDa BLM molecule to GM08505 cells and concomitantly reduces the SCE rate of these cells from a mean of 58 SCEs per 46 metaphase chromosomes to a mean of 23 SCEs.70 The same level of SCE reduction has been observed when normal cells are hybridized to GM08505 cells or when a normal chromosome 15 is introduced into these cells by chromosome-mediated gene transfer,71 i.e., a reduction that does not completely reach the level seen in non-SV40-transformed non-BS fibroblasts. Consequently, the transfected BLM cDNA functions in GM08505 cells to reduce SCEs as efficiently as the BLM gene in its normal chromosomal location; similar correction results now have been reported by others.72
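The size of the correction can be made concrete with a little arithmetic. The Python sketch below assumes the two means quoted above are 58 and 23 SCEs per 46 metaphase chromosomes (with 70 being the reference number); the function name is hypothetical and used only for illustration.

```python
def fractional_reduction(before: float, after: float) -> float:
    """Return the fraction by which a mean SCE count fell after treatment."""
    if before <= 0:
        raise ValueError("baseline mean must be positive")
    return (before - after) / before


# Means taken from the text (SCEs per 46 metaphase chromosomes).
baseline_sce = 58.0   # untransfected GM08505 cells
corrected_sce = 23.0  # after transfection of the normal BLM cDNA

reduction = fractional_reduction(baseline_sce, corrected_sce)
fold_change = baseline_sce / corrected_sce

print(f"SCE reduction: {reduction:.1%}")
print(f"Fold decrease: {fold_change:.2f}")
```

On these figures the SCE rate falls by roughly 60 percent, about a 2.5-fold decrease, which is a substantial but incomplete correction, in line with the statement that the normal level is not reached.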
These complementation experiments have allowed the development of a system for studying structure-function relationships of BLM. Two BS-causing mutations mentioned above—the helicase-negative Q672R and the C1055S amino acid substitutions—were introduced experimentally into the BLM cDNA and transfected into GM08505 cells. Although a 180-kDa molecule was detectable by Western analysis after transfection and cloning of the cells, albeit present at levels lower than when normal BLM cDNA is transfected, expression of the mutant BLM proteins failed to reduce the high-SCE rate of these cells.67 Experiments are underway using the transfection system to investigate the effects of small, experimentally produced deletions in the nonhelicase domains and of exchanging homologous domains between other RecQ helicases and BLM, e.g., the N-terminal domain of WRN for that of BLM.
The intracellular localization of BLM has been determined by employing BLM antibodies and the indirect immunofluorescence technique in the study of various BS and non-BS cell lines. BLM protein is present in the nucleus of all cells examined save those from persons with BS. Consistent with BLM's presence in the nucleus, transient transfections of constructs in which amino acids C-terminal of residue 1341 were deleted demonstrated that BLM protein contains a nuclear localization signal (NLS) in its last 100 amino acids. Examination of the sequences there disclosed a bipartite NLS at residues 1334 to 1349 as found in numerous other nuclear proteins (e.g., DNA polymerase α and topoisomerase II).73 The WRN helicase contains an NLS at a similar location (residues 1370 to 1376).74 Because the NLSs of BLM and WRN are at the C-termini, premature translation-termination mutations N-terminal of the NLSs render the proteins nonfunctional, via the mutant protein's inability to be moved into the nucleus. Supporting this observation is the finding of a protein-truncating mutation in the BLM of a person with BS that encodes a BLM abnormal only in lacking its C-terminal 175 amino acids (our unpublished observation).
The abundance of BLM at different phases of the cell cycle varies strikingly, being at its lowest in early G1. Presently being defined by immunofluorescence microscopy are interesting focal concentrations that BLM makes before and during S and its association with chromatin and nuclear matrix, co-localization with other nuclear proteins or lack thereof, and its representation in various nucleoprotein complexes (e.g., nuclear bodies, DNA replication “centers”). These microscopic observations, complemented by appropriate immunoprecipitation and biochemical experiments, will eventually define BLM's role(s) in the various mechanisms that require opening of the DNA helix.
A knockout of the mouse Blm gene has been produced by homologous recombination and embryonic stem cell technology.75 Blm −/− fibroblasts exhibit a high-SCE phenotype, indicating that the mutation introduced into the mouse gene is a null. The Blm −/− embryos die at day 13.5, developmentally delayed and anemic. A wave of apoptosis occurs in the postimplantation embryo, which provides a possible explanation for the developmental delay observed later in gestation. That the Blm null mutation is lethal in mouse, whereas the human BLM null (BS) is not, points to some underlying variation during evolution in the requirement for BLM or the RecQ family genes relative to their physiological function. The development of a mouse model of BS would be desirable in order to permit physiological experimentation.