Saturday, March 3, 2012

Evolutionary genomic remodelling of the human 4q subtelomere (4q35.2). (Research article)

Authors: Beatrice Bodega [1]; Maria Francesca Cardone [2]; Stefan Müller [3]; Michaela Neusser [3]; Francesca Orzan [1]; Elena Rossi [1]; Elena Battaglioli [1]; Anna Marozzi [1]; Paola Riva [1]; Mariano Rocchi [2]; Raffaella Meneveri [4]; Enrico Ginelli (corresponding author) [1]

Background

The distal portion of the human 4q35 genomic region (4q35.2) contains a complex arrangement of repetitive sequences and several genes including facioscapulohumeral muscular dystrophy (FSHD) region genes 1 and 2 (FRG1 and FRG2) [1]. A polymorphic tandem array of 3.3 kb repeats (D4Z4) has also been detected distally to these genes [2]; the D4Z4 unit contains an open reading frame (ORF) encoding a homeobox protein (DUX4), but DUX4 transcription has not been demonstrated [3, 4].

Deletions within the subtelomeric D4Z4 array lead to autosomal-dominant facioscapulohumeral muscular dystrophy (FSHD MIM 158900) [2, 5]. Unaffected individuals have 11-100 copies of the D4Z4 repeat on both chromosomes 4q, whereas almost all FSHD patients have one chromosome 4q characterised by 1-10 tandem copies [3]. Moreover, 4q35.2 D4Z4 repeats are methylated in the general population, whereas the contracted array is hypomethylated in FSHD patients [6, 7]. The property of 4qter array methylation has led to the hypothesis that the pathogenesis of FSHD is associated with an epigenetic mechanism. In this regard, the array contraction might cause proximal or distal changes in chromatin structure, with the consequent up-regulation of one or more 4qter genes [8], and it is known that transgenic mice over-expressing the FRG1 gene (located 125 kb from the D4Z4 array) develop a muscular dystrophy resembling human FSHD.
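The copy-number ranges quoted above can be expressed as a small lookup. The following is only an illustrative sketch: the function name and the returned labels are hypothetical, the thresholds are simply the 1-10 and 11-100 ranges stated in the text, and it is not a diagnostic rule.

```python
def classify_d4z4_copies(copies: int) -> str:
    """Label a 4q D4Z4 allele by the copy-number ranges quoted in the text.

    Hypothetical helper for illustration; thresholds are the 1-10 and
    11-100 ranges from the Background, not a clinical criterion.
    """
    if copies < 1:
        raise ValueError("copy number must be at least 1")
    if copies <= 10:
        # Range reported for the contracted allele in almost all FSHD patients
        return "contracted"
    if copies <= 100:
        # Range reported for alleles of unaffected individuals
        return "typical"
    # The text gives no figure beyond 100 copies
    return "outside quoted ranges"


for n in (4, 10, 11, 60):
    print(n, classify_d4z4_copies(n))
```

The boundary between 10 and 11 copies is taken literally from the text; the text itself notes that the contracted range applies to almost all, not all, patients.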

Some studies of inappropriate gene activation in the FSHD region support this hypothesis [9, 10, 11], but other papers report a similar level of 4qter gene expression in FSHD patients and controls [12, 13]. Furthermore, recent findings indicate that FSHD may share some features with nuclear envelope dystrophies [14, 15, 16]. Unlike most other chromosome ends, normal and FSHD 4qter alleles are preferentially localised at the nuclear rim [14, 15]. It is assumed that this consistent and non-random localisation of 4qter in the periphery of human cell nuclei has some functional significance, and that FSHD may be due to improper interactions with transcription factors or chromatin modifiers at the nuclear envelope [15].

Despite these observations, studies aimed at defining the molecular mechanisms underlying the pathogenesis of FSHD are difficult to perform because the genomic sequences involved in the disease are a complex patchwork of duplications. The D4Z4 repeat is not restricted to chromosome 4; perfect arrays of D4Z4 units can also be detected on chromosome 10q, and there are additional homologous sequences interspersed with beta satellites on many heterochromatic loci, such as the short arms of acrocentric chromosomes and the pericentromeric region of chromosome 1q [17]. Moreover, a subset of 4q35.2 sequences proximal and distal to the D4Z4 array, including the FRG1 and FRG2 genes, are duplicated in the human genome [4, 18, 19]. This complex genomic scenario makes it difficult to understand the regulation of gene expression at 4q35.2, as well as its alteration in FSHD.

In order to obtain further insights into the function of the human 4qter genomic region, we studied its evolution by investigating genomic organisation, nuclear positioning, chromatin acetylation levels and gene expression in African apes. We chose the gorilla as a starting point because previous studies have indicated the presence of a chromosome-specific block of subtelomeric sequences in 4qter [20], and D4Z4 dispersion is less complex than in humans.

Results

Isolation and sequence characterisation of gorilla 4qter genomic clones

A gorilla genomic library was screened using a probe from a human subtelomeric block of sequences (defined as block 3 in [20]) that are located on chromosomes 1p, 8q, 15q and 19p in the human genome, but only on chromosome 4qter in gorilla and chimpanzee. A non-repetitive hybridisation probe was generated by PCR on DNA from BAC AC140725 mapped to 15q26.3, which includes subtelomeric block 3 [20] (Additional file 1). By this approach, six BAC clones were obtained. In the human reference sequence, one or both ends of five clones (CH255-11C6, CH255-18C5, CH255-23B19, CH255-39M12, and CH255-41H7) were similar within a region of approximately 35 kb on chromosomes 15q26.3 and 19p13.3 [21] (Additional file 1 and Additional file 2), which proved to be a complex patchwork of fragments from different LINE subfamilies (the LINE block) (Additional file 3). Subtelomeric block 3 and the LINE block in the human reference sequence overlap for ~10 kb (Additional file 1). Furthermore, one end of three clones (CH255-23B19, CH255-39M12 and CH255-39N14) showed similarity with the 4q35.2 locus, within a region of approximately 15 kb distal to the FRG2 gene (Additional file 2).

Additional File 1: Supplementary Figure 1. UCSC Human blat server [21] analysis of the location of gorilla BAC ends on chromosome 15q26.3, and partial overlapping on the same chromosome between the LINE block and subtelomeric block 3 [20]. A) Gorilla BAC ends (11C6-Sp6/T7, 18C5-Sp6/T7, 23B19-T7, 39M12-T7, 41H7-Sp6/T7) (GGO BAC ends, upper black rectangles) identify a LINE block of 35 kb on chromosome 15q26.3 (light grey bar). Similar repetitive blocks are also mapped on the 1p, 8p, and 19p subtelomeres. The repeat elements recognised by the Repeat Masker program are represented by rectangles in different shades of grey. B) Partial overlapping on chromosome 15q26.3 between the LINE block identified in A) and subtelomeric block 3 [20]. As in A), the repeat elements recognised by the Repeat Masker program are represented by rectangles in different shades of grey. The red rectangle (probe) identifies the location of the non-repetitive DNA sequence used as a probe for the screening of the gorilla genomic library.

Additional File 2: Supplementary Table 1. Similarity of gorilla BAC ends with the human genome reference sequence. The Sp6 and T7 BAC ends of six gorilla clones were analysed for their similarity with the human genome reference sequence [21]; the table shows the most similar human chromosome regions.

Additional File 3: Supplementary Table 2. Composition of the repeat sequences contained in the LINE block. Raw data obtained from the analysis of the LINE block repeat composition using Repeat Masker software [34].

After KpnI or EcoRI digestion and hybridisation with D4Z4 (LSau probe), all of the clones showed very similar restriction and hybridisation band patterns (Additional file 4). These results strongly suggested that all of the isolated BACs carry very similar DNA sequences, including an array of approximately 10-15 D4Z4 repeat units, as derived from the densitometric profile of KpnI-digested BAC DNAs.
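As a rough check on what an array of this size implies, the estimated unit count can be converted into an approximate array length. This is a back-of-the-envelope sketch that assumes the gorilla repeat unit has the same 3.3 kb length quoted for the human D4Z4 unit in the Background; the constant and function names are hypothetical.

```python
# Assumption: the gorilla unit is taken to be 3.3 kb, the length quoted
# for the human D4Z4 repeat; the text does not state the gorilla unit size.
D4Z4_UNIT_BP = 3300


def array_span_bp(copies: int) -> int:
    """Approximate length in bp of a tandem array with `copies` units."""
    return copies * D4Z4_UNIT_BP


low, high = array_span_bp(10), array_span_bp(15)
print(f"{low / 1000:.1f}-{high / 1000:.1f} kb")  # prints "33.0-49.5 kb"
```

Under that assumption, the 10-15 units estimated from the densitometric profile would correspond to roughly 33-50 kb of D4Z4 sequence per clone.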

Additional File 4: Supplementary Figure 2. Southern blot analysis of six D4Z4- and LINE-positive gorilla BAC clones. A) Agarose gel electrophoresis of EcoRI-digested DNA from six gorilla BACs (CH255-41H7, CH255-39N14, CH255-39M12, CH255-23B19, CH255-18C5 and CH255-11C6) positive for both D4Z4 and LINE sequences. Ethidium bromide staining (left), and Southern blot hybridisation with a LSau probe (right). B) Agarose gel electrophoresis of KpnI-digested DNA from BAC CH255-39M12 positive for both D4Z4 and LINE sequences. Ethidium bromide staining (OD left), and Southern blot hybridisation (right) with LSau and beta satellite probes. M = molecular weight marker; bp = base pair.

The sequence content of the gorilla BAC clones was investigated by means of orthologous PCR using primer pairs derived from the human reference sequence that defined the FRG1 and FRG2 promoters and marker 13E11 at 4q35.2, and 35 kb of the LINE block at 15q26.3 (the primer pairs are listed in Additional file 5). All of the clones were remarkably similar insofar as they included the FRG2 promoter, the 13E11 marker, and an almost complete LINE block of 35 kb (Additional file 6). The isolated gorilla genomic clones thus showed redundancy of essentially three classes of sequences: an array of D4Z4 repeats, a LINE block, and a region of non-repetitive 4q35.2 DNA between the FRG2 gene and the 13E11 marker (approx. 40 kb). Taken together, the molecular analyses strongly suggested that the isolated genomic sectors genuinely derived from the gorilla 4q35.2 locus.

Additional File 5: Supplementary Table 3. Primer pairs used for sequencing, PCR, ChIP and RT-PCR. All the primer pairs used for the sequencing and PCR-based analyses are listed …
