Collagen, Type I, Alpha-1

Retrieved
2019-09-22

Description

Collagen has a triple-stranded rope-like coiled structure. The major collagen of skin, tendon, and bone is the same protein containing 2 alpha-1 polypeptide chains and 1 alpha-2 chain. Although these are long (the procollagen chain has a molecular mass of about 120 kD, before the 'registration peptide' is cleaved off; see 225410), each messenger RNA is monocistronic (Lazarides and Lukens, 1971). Differences in the collagens from these 3 tissues are a function of the degree of hydroxylation of proline and lysine residues, aldehyde formation for cross-linking, and glycosylation. The alpha-1 chain of the collagen of cartilage and that of the collagen of basement membrane are determined by different structural genes. The collagen of cartilage contains only 1 type of polypeptide chain, alpha-1, and this is determined by a distinct locus. The fetus contains collagen of distinctive structure. The genes for types I, II, and III collagens, the interstitial collagens, exhibit an unusual and characteristic structure of a large number of relatively small exons (54 and 108 bp) at evolutionarily conserved positions along the length of the triple-helical gly-X-Y portion (Boedtker et al., 1983). The family of collagen proteins consists of a minimum of 9 types of collagen molecules whose constituent chains are encoded by a minimum of 17 genes (Ninomiya and Olsen, 1984).

Cloning and Expression

Tromp et al. (1988) characterized a full-length cDNA clone for the COL1A1 gene.

Mapping

Sundar Raj et al. (1977) used the methods of cell hybridization and microcell hybridization to assign a collagen I gene to chromosome 17. Solomon and Sykes (1979) concluded, incorrectly as it turned out, that both the alpha-1 and the alpha-2 genes of collagen I are on chromosome 7. They also presented evidence that the alpha-1 chains of collagen III are coded by chromosome 7. Church et al. (1981) assigned a structural gene for corneal type I procollagen to chromosome 7 by somatic cell hybridization involving corneal stromal fibroblasts. Because they had previously assigned a gene for skin type I procollagen to chromosome 17, they wondered whether skin and corneal type I collagen may be under separate control.

Huerre et al. (1982) used a cDNA probe in both mouse-man and Chinese hamster-man somatic cell hybrids to demonstrate cosegregation with human chromosome 17. In situ hybridization using the same probe indicated that the gene is in the middle third of the long arm, probably in band 17q21 or 17q22.

By chromosome-mediated gene transfer (CMGT), Klobutcher and Ruddle (1979) transferred the genes for thymidine kinase, galactokinase (604313), and type I procollagen (gene for alpha-1 polypeptide). The data indicated the following gene order: centromere--GALK--(TK1-COL1A1). Later studies (Ruddle, 1982) put the growth hormone gene cluster (see 139250) between GALK and (TK1-COL1A1).

A HindIII restriction site polymorphism in the alpha-1(I) gene was described by Driesel et al. (1982), who probably unjustifiably stated that the gene is on chromosome 7. By in situ hybridization, Retief et al. (1985) concluded that the alpha-1(I) and alpha-2(I) genes are located in bands 17q21.31-q22.05 and 7q21.3-q22.1, respectively.

Sippola-Thiele et al. (1986) commented on the limited number of informative RFLPs in the collagen genes, especially COL1A1. They proposed a method for assessing RFLPs that were otherwise undetectable in total human genomic DNA. Using the centromere-based locus D17Z1, Tsipouras et al. (1988) found a recombination fraction of 0.20 with COL1A1. Furthermore, they demonstrated that COL1A1 and GH1 (139250) show a recombination fraction of 0.10. They proposed that the most likely order is D17Z1--COL1A1--GH1.

Byrne and Church (1983) had concluded that both subunits of type I collagen, alpha-1 and alpha-2, are coded by chromosome 16 in the mouse. SOD1 (147450), which in man is on chromosome 21, is also carried by mouse 16. It may have been type VI collagen (120220, 120240) that they dealt with; both COL6A1 and COL6A2 are coded by human chromosome 21. (In fact, the Col6a1 and Col6a2 genes are carried by mouse chromosome 10 (Justice et al., 1990).) Munke et al. (1986) showed that the alpha-1 gene of type I collagen is located on mouse chromosome 11; the Moloney murine leukemia virus is stably integrated into this site when microinjected into the pronuclei of fertilized eggs. This insertion results in a lethal mutation through blockage of the developmentally regulated expression of the gene (Schnieke et al., 1983).

Molecular Genetics

Osteogenesis Imperfecta

Pope et al. (1985) described a substitution of cysteine in the C-terminal end of the alpha-1 collagen chain in a 9-year-old boy with mild osteogenesis imperfecta (OI) of Sillence type I. They assumed that this was a substitution for either arginine or serine (which could be accomplished by a single base change) because substitution of cysteine for glycine produced a much more drastic clinical picture. In a neonatal lethal case of OI congenita, Barsh and Byers (1981) demonstrated a defect in pro-alpha-1 chains (see OI type II, 166210).

Byers et al. (1988) found an insertion in one COL1A1 allele in an infant with OI II. One alpha-1 chain was normal in length, whereas the other contained an insertion of approximately 50-70 amino acid residues within the triple-helical domain defined by amino acids 123-220. The structure of the insertion was consistent with duplication of an approximately 600-bp segment in 1 allele.

Brookes et al. (1989) used an S1 nuclease directed cleavage of heteroduplex DNA molecules formed between genomic material and cloned sequences to search for mutations in the COL1A1 gene in 5 cases in which previous linkage studies had shown the mutation to be located in the COL1A1 gene and in 4 cases in which a COL1A1 null allele had been identified by protein and RNA studies. No abnormality was found in the complete 18 kb COL1A1 gene or in 2 kb of 5-prime flanking sequence. The method used was known to permit the detection of short length variations of the order of 4 bp in heterozygous subjects but not single basepair alterations. Thus, Brookes et al. (1989) suggested that single basepair alterations may be the predominant category of mutation in type I OI.

COL1A1 and NGFR (162010) are in the same restriction fragment. In a 3-generation family with OI type I, Willing et al. (1990) found that all affected members had one normal COL1A1 allele and another from which the intragenic EcoRI restriction site near the 3-prime end of the gene was missing. They found, furthermore, a 5-bp deletion at the EcoRI site which changed the translational reading frame and predicted the synthesis of a pro-alpha-1(I) chain that extended 84 amino acids beyond the normal termination. Although the mutant chain was synthesized in an in vitro translation system, they were unable to detect its presence in intact cells, suggesting that it is unstable and rapidly destroyed in one of the cell's degradative pathways.

Cohn et al. (1990) demonstrated a clear instance of paternal germline mosaicism as the cause of 2 offspring with OI type I by different women. Both affected infants had a G-to-A change that resulted in substitution of aspartic acid for glycine at position 883 of the alpha-1 chain of type I collagen. Although not detected in the father's skin fibroblasts, the mutation was detected in somatic DNA from the father's hair root bulbs and lymphocytes. It was also found in the father's sperm where about 1 in 8 sperm carried the mutation, suggesting that at least 4 progenitor cells populate the germline in human males. The father was clinically normal. In an infant with perinatal lethal OI (OI type II), Wallis et al. (1990) demonstrated both normal and abnormal type I procollagen molecules. The abnormal molecules had substitution of arginine for glycine at position 550 of the triple-helical domain as a result of a G-to-A transition in the first base of the glycine codon. The father was shown to be mosaic for this mutation, which accounted for about 50% of the COL1A1 alleles in his fibroblasts, 27% of those in blood cells, and 37% of those in sperm. The father was short of stature; he had bluish sclerae, grayish discoloration of the teeth (which were small), short neck, barrel-shaped chest, right inguinal hernia, and hyperextensible fingers and toes. A triangular-shaped head had been noted at birth and he was thought to have hydrocephalus. No broken bones had been noted at that time. He had had only 1 fracture, that of the clavicle at age 8 years.
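The step from "about 1 in 8 sperm" to "at least 4 progenitor cells" in the Cohn et al. (1990) case can be made explicit with a small calculation. The sketch below assumes a simple model that is not spelled out in this entry: the mutation arose in a single progenitor cell, all progenitors contribute equally to the germline, and a heterozygous lineage passes the mutant allele to half of its sperm. The function name is illustrative only.

```python
from fractions import Fraction


def progenitors_from_sperm_fraction(mutant_sperm_fraction: Fraction) -> Fraction:
    """Estimate the number of germline progenitor cells, N.

    Assumed model: one heterozygous progenitor among N equal contributors.
    Half of that lineage's sperm carry the mutant allele, so the expected
    mutant fraction is 1 / (2 * N); solving for N gives 1 / (2 * fraction).
    """
    if not 0 < mutant_sperm_fraction <= Fraction(1, 2):
        raise ValueError("fraction must lie in (0, 1/2] for a heterozygous lineage")
    return 1 / (2 * mutant_sperm_fraction)


# About 1 in 8 sperm carried the mutation in the father described above.
estimate = progenitors_from_sperm_fraction(Fraction(1, 8))
print(estimate)  # 4
```

Under this model the observed fraction of 1/8 corresponds to 4 progenitors; if the mutated lineage were under-represented among the contributors, the same fraction would imply more, which is consistent with the "at least 4" wording.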

Cole et al. (1990) reported the clinical features of 3 neonates with lethal perinatal OI resulting from a substitution of glycine by arginine in the COL1A1 gene product. The mutations were gly391-to-arg, gly667-to-arg, and gly976-to-arg. All 3 were small, term babies who died soon after birth. The ribs were broad and continuously beaded in the first, discontinuously beaded in the second, and slender with few fractures in the third. The overall radiographic classifications were type IIA, IIA/IIB, and IIB, respectively (based on an old classification by Sillence et al., 1984; see HISTORY in 166210). The findings suggested that there was a gradient of bone modeling capacity from the slender and overmodeled bones associated with the mutation nearest the C-terminal end of the molecule to absence of modeling with that nearest the N-terminal end.

Dermal fibroblasts from most persons with OI type I produce about half the normal amount of type I procollagen as a result of decreased synthesis of one of its constituent chains, namely, the alpha-1 chain. Willing et al. (1992) used a polymorphic MnlI restriction endonuclease site in the 3-prime untranslated region of COL1A1 to distinguish the transcripts of the 2 alleles in 23 heterozygotes from 21 unrelated families with OI type I. In each case there was marked diminution in steady-state mRNA levels from one COL1A1 allele. They demonstrated that loss of an allele through deletion or rearrangement was not the cause of the diminished COL1A1 mRNA levels. Primer extension with nucleotide-specific chain termination allowed identification of the mutant allele in cell strains that were heterozygous for an expressed polymorphism. Willing et al. (1992) suggested that the method is applicable to sporadic cases, to small families, and to large families in which key persons are uninformative at the polymorphic sites used in linkage analysis.

Willing et al. (1993) pointed out that the abnormally low ratio of COL1A1 mRNA to COL1A2 (120160) mRNA in fibroblasts cultured from OI type I patients is an indication of a defect in the COL1A1 gene in the great majority of patients with this form of OI.

Byers (1993) counted a total of approximately 70 point mutations identified in the helical portion of the alpha-1 peptide, approximately 10 exon skipping mutations, and about 6 point mutations in the C-propeptide.

Steady state amounts of COL1A1 mRNA are reduced in both the nucleus and cytoplasm of dermal fibroblasts from most subjects with type I osteogenesis imperfecta (166200). Willing et al. (1995) investigated whether mutations involving key regulatory sequences in the COL1A1 promoter, such as the TATAAA and CCAAAT boxes, are responsible for the reduced levels of mRNA. They used PCR-amplified genomic DNA in conjunction with denaturing gradient gel electrophoresis and SSCP to screen the 5-prime untranslated domain, exon 1, and a small portion of intron 1 of the COL1A1 gene. In addition, direct sequence analysis was performed on an amplified genomic DNA fragment that included the TATAAA and CCAAAT boxes. In a survey of 40 unrelated probands with OI type I in whom no causative mutation was known, Willing et al. (1995) identified no mutations in the promoter region and there was 'little evidence of sequence diversity among any of the 40 subjects.'

Whereas most cases of severe osteogenesis imperfecta result from mutations in the coding region of the COL1A1 or COL1A2 genes yielding an abnormal collagen alpha-chain, many patients with mild OI show evidence of a null allele due to a premature stop mutation in the mutant RNA transcript. As indicated in 120150.0046, mild OI in one case resulted from a null allele arising from a splice donor mutation where the transcript containing the included intron was sequestered in the nucleus. Nuclear sequestration precluded its translation and thus rendered the allele null. Using RT-PCR and SSCP of COL1A1 mRNA from patients with mild OI, Redford-Badwal et al. (1996) identified 3 patients with distinct null-producing mutations identified from the mutant transcript within the nuclear compartment. In a fourth patient with a gly-to-arg expressed point mutation, they found the mutant transcript in both the nucleus and the cytoplasm.

Willing et al. (1996) analyzed the effects of nonsense and frameshift mutations on steady state levels of COL1A1 mRNA. Total cellular and nuclear RNA was analyzed. They found that mutations which predict premature termination reduce steady-state amounts of COL1A1 mRNA from the mutant allele in both nuclear and cellular mRNA. The investigators concluded that premature termination mutations have a predictable and uniform effect on COL1A1 gene expression which ultimately leads to decreased production of type I collagen and to the mild phenotype associated with OI type I. Willing et al. (1996) reported that mutations which lead to premature translation termination appear to be the most common molecular cause of OI type I. They identified 21 mutations, 15 of which lead to premature termination as a result of translational frameshifts or single-nucleotide substitutions. Five mutations were splicing defects leading to cryptic splicing or intron retention within the mature mRNA. Both of these alternative splicing pathways indirectly lead to frameshifts and premature termination in downstream exons.

In 4 apparently unrelated patients with OI, Korkko et al. (1997) found 2 new recurrent nucleotide mutations in the COL1A1 gene, using a protocol whereby 43 exons and exon-flanking sequences were amplified by PCR and scanned for mutations by denaturing gradient gel electrophoresis. From an analysis of previous publications, they concluded that up to one-fifth of mutations causing OI are recurrent in the sense that they are identical in apparently unrelated probands. About 80% of these identical mutations were found to be in CpG dinucleotide sequences. Korkko et al. (1997) tabulated reported cases of recurrent mutations causing OI. The most frequent recurrent mutation was gly352ser (120150.0042), reported in 4 unrelated patients. They also reported a nonsense mutation in the codon for arginine-963 (120150.0055).

Since collagen I consists of 2 alpha-1 chains and 1 alpha-2 chain, a mutation in the COL1A1 gene might affect the function of the collagen molecule more than would a similar substitution in the COL1A2 gene, thereby causing more severe OI, for example. Lund et al. (1997) tested this hypothesis by comparing patients with identical substitutions in different alpha chains. They presented a G586V substitution in the alpha-1 gene (120150.0056) and compared it with a G586V substitution in the alpha-2 gene (120160.0023). Their patient had lethal OI type II. Patients with the same substitution in the alpha-2 chain had either OI type IV (166220) or type III (259420). Lund et al. (1997) pointed out that identical biochemical alterations in the same chain are known to have different phenotypic effects, both within families and between unrelated patients. They took this into account in their cautious proposal that substitutions in the alpha-1 chain may have more serious consequences than similar substitutions in the alpha-2 chain.

Kuivaniemi et al. (1997) summarized the data on 278 different mutations found in genes for types I, II, III, IX, X, and XI collagens from 317 apparently unrelated patients. Most mutations (217; 78% of the total) were single-base and either changed the codon of a critical amino acid (63%) or led to abnormal RNA splicing (13%). Most (155; 56%) of the amino acid substitutions were those of a bulkier amino acid replacing the obligatory glycine of the repeating Gly-X-Y sequence of the collagen triple helix. Altogether, 26 different mutations (9.4%) occurred in more than 1 unrelated individual. The 65 patients in whom the 26 mutations were characterized constituted almost one-fifth (20.5%) of the 317 patients analyzed. The mutations in these 6 collagens caused a wide spectrum of diseases of bone, cartilage, and blood vessels, including osteogenesis imperfecta, a variety of chondrodysplasias, types IV (130050) and VII (130060) Ehlers-Danlos syndrome, and, rarely, some forms of osteoporosis, osteoarthritis, and familial aneurysms.
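The percentages quoted from Kuivaniemi et al. (1997) use two different denominators: 278 mutations for most figures and 317 patients for the recurrence figure. A minimal sketch that recomputes them from the counts given above; the variable names are illustrative, and reading the 56% figure as a share of all 278 mutations is an assumption that happens to match the arithmetic.

```python
TOTAL_MUTATIONS = 278
TOTAL_PATIENTS = 317


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Shares of the 278 distinct mutations.
single_base = percent(217, TOTAL_MUTATIONS)            # quoted as 78%
bulky_for_glycine = percent(155, TOTAL_MUTATIONS)      # quoted as 56%
recurrent_mutations = percent(26, TOTAL_MUTATIONS)     # quoted as 9.4%

# Share of the 317 patients who carried one of the 26 recurrent mutations.
recurrent_patients = percent(65, TOTAL_PATIENTS)       # quoted as 20.5%

print(single_base, bulky_for_glycine, recurrent_mutations, recurrent_patients)
```

The distinction matters when comparing with the "up to one-fifth" estimate of Korkko et al. (1997): that figure lines up with the per-patient share (65 of 317), not with the per-mutation share (26 of 278).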

(The amino acid numbering system for collagen involves assigning number 1 to the first glycine of the triple-helical domain of an alpha chain. The numbers for the alpha-1 chain of type I collagen can be converted to positions in the pro-alpha-1 chain by adding 156, and the numbers for the alpha-2 chain can be converted to the human pro-alpha-2 chain by adding 68.)
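The conversion described above is a fixed offset per chain. A minimal sketch, assuming only the offsets of 156 and 68 stated in this entry; the dictionary and function names are illustrative and do not correspond to any published tool.

```python
# Offsets stated in this entry. Helical position 1 is the first glycine
# of the triple-helical domain of each alpha chain.
HELIX_TO_PRO_OFFSET = {
    "alpha-1(I)": 156,  # pro-alpha-1(I) position = helical position + 156
    "alpha-2(I)": 68,   # pro-alpha-2(I) position = helical position + 68
}


def helix_to_pro(chain: str, helical_position: int) -> int:
    """Convert a triple-helical residue number to a pro-alpha chain position."""
    if helical_position < 1:
        raise ValueError("helical positions start at 1")
    return helical_position + HELIX_TO_PRO_OFFSET[chain]


def pro_to_helix(chain: str, pro_position: int) -> int:
    """Inverse conversion; valid only for residues inside the helical domain."""
    helical_position = pro_position - HELIX_TO_PRO_OFFSET[chain]
    if helical_position < 1:
        raise ValueError("position lies before the triple-helical domain")
    return helical_position


# Example: the glycine at helical position 883 of the alpha-1(I) chain.
print(helix_to_pro("alpha-1(I)", 883))  # 1039
```

The same helical number therefore maps to different pro-chain positions in the two chains: helical position 586 corresponds to 742 in pro-alpha-1(I) and 654 in pro-alpha-2(I).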

Dalgleish (1997) described a mutation database for the COL1A1 and COL1A2 genes.

Mutations in the COL1A2 gene appear to be very rare causes of type I osteogenesis imperfecta. Korkko et al. (1998) developed a method for analysis of the COL1A1 and COL1A2 genes in 15 patients with type I OI and found only COL1A1 mutations. They described their protocols for PCR amplification of the exon and exon boundaries of all 103 exons in the COL1A1 and COL1A2 genes. As previously pointed out, most mutations found in patients with OI type I introduce either premature termination codons or aberrant RNA splicing and thereby reduce the expression of the COL1A1 gene. The mutations tend to occur in common sequence context. All 9 mutations, found by Korkko et al. (1998) to convert the arginine codon CGA to the premature-termination codon TGA, occurred in the sequence context of G/CCC CGA GG/T of the COL1A1 gene. None was found in 7 CGA codons for arginine in other sequence contexts of the COL1A1 gene. The COL1A1 gene has 6 such sequences, whereas the COL1A2 gene has none.
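The sequence-context observation of Korkko et al. (1998) can be illustrated by scanning a coding sequence for in-frame CGA codons and flagging those that sit in the stated context. This is a sketch under two assumptions: the notation G/CCC CGA GG/T is read as (G or C)CC, then CGA, then G followed by G or T, and the input sequence is a made-up toy string, not COL1A1 sequence. All names are hypothetical.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

# "G/CCC CGA GG/T" is read here as (G or C)CC, then CGA, then G(G or T).
UPSTREAM = re.compile(r"[GC]CC")
DOWNSTREAM = re.compile(r"G[GT]")


def cga_codons(coding_sequence: str, frame: int = 0):
    """Return (codon_start, in_context) for each in-frame CGA codon."""
    seq = coding_sequence.upper()
    sites = []
    for start in range(frame, len(seq) - 2, 3):
        if seq[start:start + 3] != "CGA":
            continue
        upstream = seq[max(0, start - 3):start]   # codon immediately 5-prime
        downstream = seq[start + 3:start + 5]     # first 2 bases 3-prime
        in_context = bool(UPSTREAM.fullmatch(upstream)) and bool(
            DOWNSTREAM.fullmatch(downstream)
        )
        sites.append((start, in_context))
    return sites


def c_to_t_first_base(codon: str) -> str:
    """Apply a C-to-T change at the first base of a codon."""
    if not codon.startswith("C"):
        raise ValueError("codon does not start with C")
    return "T" + codon[1:]


# Toy sequence (hypothetical): GGT CCC CGA GGT GCT CGA AAA
toy_sequence = "GGTCCCCGAGGTGCTCGAAAA"
print(cga_codons(toy_sequence))                       # [(6, True), (15, False)]
print(c_to_t_first_base("CGA") in STOP_CODONS)        # True
```

In the toy sequence the first CGA codon is flanked by CCC and GG and so matches the context, while the second is preceded by GCT and does not; in both cases a C-to-T change at the first base yields the TGA termination codon.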

Triple helix formation is a prerequisite for the passage of type I procollagen from the endoplasmic reticulum and secretion from the cell to form extracellular fibrils that will support mineral deposition in bone. In an analysis of cDNA from 11 unrelated individuals with osteogenesis imperfecta, Pace et al. (2001) found 11 novel, short in-frame deletions or duplications of 3, 9, or 18 nucleotides in the helical coding regions of the COL1A1 or COL1A2 collagen genes. Triple helix formation was impaired, type I collagen alpha chains were posttranslationally overmodified, and extracellular secretion was markedly reduced. With one exception, the obligate Gly-Xaa-Yaa repeat pattern of amino acids in the helical domains was not altered, but the Xaa and Yaa position residues were out of register relative to the amino acid sequences of adjacent chains in the triple helix. Thus, the identity of these amino acids, in addition to third position glycines, is important for normal helix formation. These findings expanded the repertoire of uncommon in-frame deletions and duplications in OI, and provided insight into normal collagen biosynthesis and collagen triple helix formation.
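Why a short in-frame deletion can leave the Gly-Xaa-Yaa pattern intact while still shifting the Xaa and Yaa residues can be seen on an idealized repeat. The sketch below uses a hypothetical toy peptide, not any patient sequence, and the helper names are illustrative. It removes 3, 9, or 18 nucleotides' worth of residues and checks whether glycine still occupies every third position.

```python
def glycine_every_third(residues: str) -> bool:
    """True if positions 0, 3, 6, ... of the peptide are all glycine (G)."""
    return all(residues[i] == "G" for i in range(0, len(residues), 3))


def delete_in_frame(residues: str, start: int, n_nucleotides: int) -> str:
    """Remove the residues encoded by an in-frame deletion of n_nucleotides."""
    if n_nucleotides % 3 != 0:
        raise ValueError("an in-frame deletion must be a multiple of 3 nucleotides")
    n_residues = n_nucleotides // 3
    return residues[:start] + residues[start + n_residues:]


# Hypothetical toy helix: five Gly-Xaa-Yaa triplets.
toy_helix = "GPAGEKGDRGPPGAK"

nine = delete_in_frame(toy_helix, start=1, n_nucleotides=9)       # removes P, A, G
eighteen = delete_in_frame(toy_helix, start=4, n_nucleotides=18)  # removes E, K, G, D, R, G
three = delete_in_frame(toy_helix, start=4, n_nucleotides=3)      # removes E only

print(nine, glycine_every_third(nine))
print(eighteen, glycine_every_third(eighteen))
print(three, glycine_every_third(three))
```

In this toy case the 9- and 18-nucleotide deletions remove whole multiples of a triplet, so glycine stays at every third position even though the deletion at start=1 joins the Xaa and Yaa residues of different triplets. The 3-nucleotide deletion removes a single residue and breaks the repeat. Which of the 11 reported mutations was the exception is not stated in this entry.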

Cabral et al. (2001) reported a 13-year-old girl with severe type III OI in whom they identified heterozygosity for a gly76-to-glu substitution in the COL1A1 gene (120150.0065). The authors stated that this was the first delineation of a glutamic acid substitution in the alpha-1(I) chain causing nonlethal osteogenesis imperfecta.

Chamberlain et al. (2004) used adeno-associated virus vectors to disrupt dominant-negative mutant COL1A1 collagen genes in mesenchymal stem cells, also known as marrow stromal cells, from individuals with severe OI, demonstrating successful gene targeting in adult human stem cells.

Ehlers-Danlos Syndromes

In 2 unrelated patients with classic EDS (EDSCL1; 130000), Nuytinck et al. (2000) identified an arg134-to-cys mutation (120150.0059) in the COL1A1 gene.

Cabral et al. (2005) identified 7 children with the combination of skeletal fragility and characteristics of Ehlers-Danlos syndrome. In each child they identified a mutation in the first 90 residues of the helical region of alpha-1(I) collagen. These mutations prevented or delayed removal of the procollagen N-propeptide by purified N-proteinase (ADAMTS2; 604539) in vitro and in pericellular assays. The mutant pN-collagen which resulted was efficiently incorporated into matrix by cultured fibroblasts and osteoblasts and was prominently present in newly incorporated and immaturely cross-linked collagen. Dermal collagen fibrils had significantly reduced cross-sectional diameters, corroborating incorporation of pN-collagen into fibrils in vivo. The mutations disrupted a distinct folding region of high thermal stability in the first 90 residues at the amino end of type I collagen and altered the secondary structure of the adjacent N-proteinase cleavage site. Thus, these mutations are directly responsible for the bone fragility of OI and indirectly responsible for EDS symptoms, by interference with N-propeptide removal.

Cabral et al. (2005) hypothesized that the nature of EDS-like symptoms in OI/EDS patients is similar to type VII EDS, which is caused primarily by deletions of the N-propeptide cleavage site in alpha-1(I) and alpha-2(I) (120160) chains, in EDS VIIA (EDSARTH1; 130060) and VIIB (EDSARTH2; 617821), respectively, or by N-proteinase deficiency in EDS VIIC (EDSDRMS; 225410). It remained unclear why alpha-1(I)-OI/EDS patients had a somewhat different EDS phenotype (e.g., pronounced early scoliosis and no bilateral hip dysplasia) and why their collagen fibrils had a more rounded cross-section under electron microscopy investigation. Makareeva et al. (2006) demonstrated that 85 N-terminal amino acids of the alpha-1(I) chain participate in a highly stable folding domain, acting as the stabilizing anchor for the amino end of the type I collagen triple helix. This anchor region is bordered by a microunfolding region, 15 amino acids in each chain, which includes no proline or hydroxyproline residues and contains a chymotrypsin cleavage site. Glycine substitutions and amino acid deletions within the N-anchor domain induced its reversible unfolding above 34 degrees C. The overall triple helix denaturation temperature was reduced by 5 to 6 degrees C, similar to complete N-anchor removal. N-propeptide partially restored the stability of mutant procollagen but not sufficiently to prevent N-anchor unfolding and a conformational change at the N-propeptide cleavage site. The ensuing failure of N-proteinase to cleave at the misfolded site led to incorporation of pN-collagen into fibrils. As in EDS VIIA/B, fibrils containing pN-collagen are thinner and weaker, causing EDS-like laxity of large and small joints and paraspinal ligaments. Makareeva et al. (2006) concluded that distinct structural consequences of N-anchor destabilization result in a distinct alpha-1(I)-OI/EDS phenotype.

Caffey Disease

In affected individuals and obligate carriers from 3 unrelated families with Caffey disease (114000), Gensure et al. (2005) identified heterozygosity for an arg836-to-cys mutation (R836C; 120150.0063) in the COL1A1 gene. Kamoun-Goldrat et al. (2008) identified heterozygosity for the R836C mutation in the COL1A1 gene in the pulmonary tissue of a fetus with a severe form of prenatal cortical hyperostosis (see 114000) from a terminated pregnancy at 30 weeks' gestation. The authors speculated that mutation in another gene might also be involved.

Susceptibility to Osteoporosis

Osteoporosis (166710) is a common disorder with a strong genetic component. One way in which the genetic component could be expressed is through polymorphism of COL1A1. Grant et al. (1996) described a novel G-to-T transversion at the first base of a binding site for the transcription factor Sp1 (189906) in intron 1 of COL1A1 (rs1800012; 120150.0051). They found that the polymorphism was associated with low bone density and increased appearance of osteoporotic vertebral fractures in 299 British women. In a study of 1,778 postmenopausal Dutch women, Uitterlinden et al. (1998) confirmed the association of the Sp1-binding site polymorphism and bone mineral density.

Lohmueller et al. (2003) performed a metaanalysis of 301 published genetic association studies covering 25 different reported associations. For 8 of the 25 associations, strong evidence of replication of the initial report was available. One of these 8 was the association between COL1A1 and osteoporotic fracture as first reported by Grant et al. (1996): osteoporotic fractures were associated with the T allele of a G/T SNP in intron 1.

In 1,873 Caucasian subjects from 405 nuclear families, Long et al. (2004) examined the relationship between 3 SNPs in the COL1A1 gene and bone size at the spine, hip, and wrist. They found suggestive evidence for an association with wrist size at SNP2 (p = 0.011): after adjusting for age, sex, height, and weight, subjects with the T allele of SNP2 had, on average, a 3.05% smaller wrist size than noncarriers. Long et al. (2004) concluded that the COL1A1 gene may have some effect on bone size variation at the wrist, but not at the spine or hip, in these families.

Jin et al. (2009) showed that the previously reported 5-prime untranslated region (UTR) SNPs in the COL1A1 gene (-1997G-T, rs1107946, 120150.0067; -1663indelT, rs2412298, 120150.0068; +1245G-T, rs1800012) affected COL1A1 transcription. Transcription was 2-fold higher with the osteoporosis-associated G-del-T haplotype compared with the common G-ins-G haplotype. The region surrounding rs2412298 recognized a complex of proteins essential for osteoblast differentiation and function including NMP4 (ZNF384; 609951) and Osterix (SP7; 606633), and the osteoporosis-associated -1663delT allele had increased binding affinity for this complex. Further studies showed that haplotype G-del-T had higher binding affinity for RNA polymerase II, consistent with increased transcription of the G-del-T allele, and there was a significant inverse association between carriage of G-del-T and bone mineral density (BMD) in a cohort of 3,270 Caucasian women. Jin et al. (2009) concluded that common polymorphic variants in the 5-prime UTR of COL1A1 regulate transcription by affecting DNA-protein interactions, and that increased levels of transcription correlated with reduced BMD values in vivo by altering the normal 2:1 ratio between alpha-1(I) and alpha-2(I) chains.

Genotype/Phenotype Correlations

Di Lullo et al. (2002) stated that binding sites on type I collagen had been elucidated for approximately half of the almost 50 molecules that had been found to interact with it. In addition, more than 300 mutations in type I collagen associated with human connective tissue disorders had been described. However, the spatial relationships between the ligand-binding sites and mutation positions had not been examined. Di Lullo et al. (2002) therefore created a map of type I collagen that included all of its ligand-binding sites and mutations. The map revealed several hotspots for ligand interactions on type I collagen and showed that most of the binding sites locate to its C-terminal half. Moreover, some potentially relevant relationships between binding sites were observed on the collagen fibril, including the following: fibronectin- and certain integrin-binding regions are near neighbors, which may mechanistically relate to fibronectin-dependent cell-collagen attachment; proteoglycan binding may influence collagen fibrillogenesis, cell-collagen attachment, and collagen glycation seen in diabetes and aging; and mutations associated with osteogenesis imperfecta and other disorders show apparently nonrandom distribution patterns within both the monomer and fibril, implying that mutation positions correlate with disease phenotype.

A missense mutation leading to the replacement of 1 Gly in the (Gly-Xaa-Yaa)n repeat of the collagen triple helix can cause a range of heritable connective tissue disorders that depend on the gene in which the mutation occurs. Persikov et al. (2004) found that the spectrum of amino acids replacing Gly was not significantly different from that expected for the COL7A1 (120120)-encoded collagen chains, suggesting that any Gly replacement will cause dystrophic epidermolysis bullosa (604129). On the other hand, the distribution of residues replacing Gly was significantly different from that expected for all other collagen chains studied, with a particularly strong bias seen for the collagen chains encoded by COL1A1 and COL3A1 (120180). The bias did not correlate with the degree of chemical dissimilarity between gly and the replacement residues, but in some cases a relationship was observed with the predicted extent of destabilization of the triple helix. Of the COL1A1-encoded chains, the most destabilizing residues (valine, glutamic acid, and aspartic acid) and the least destabilizing residue (alanine) were underrepresented. This bias supported the hypothesis that the level of triple-helix destabilization determines clinical outcome.

In an extensive review of published and unpublished sources, Marini et al. (2007) identified and assembled 832 independent mutations in the type I collagen genes (493 in COL1A1 and 339 in COL1A2). There were 682 substitutions of glycine residues within the triple-helical domains of the proteins (391 in COL1A1 and 291 in COL1A2) and 150 splice site mutations (102 in COL1A1 and 48 in COL1A2). One-third of the mutations that result in glycine substitutions in COL1A1 were lethal, whereas substitutions in the first 200 residues were nonlethal and had variable outcomes unrelated to folding or helix stability domains. Two exclusively lethal regions, helix positions 691-823 and 910-964, aligned with major ligand binding regions. Mutations in COL1A2 were predominantly nonlethal (80%), but lethal regions aligned with proteoglycan binding sites. Splice site mutations accounted for 20% of helical mutations, were rarely lethal, and often led to a mild phenotype.
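The mutation counts from Marini et al. (2007) are given per gene and per class, so they can be cross-checked against the stated totals. A minimal sketch using only the numbers quoted above; the data layout and function name are illustrative.

```python
# Counts quoted above from Marini et al. (2007).
MARINI_COUNTS = {
    "COL1A1": {"glycine_substitutions": 391, "splice_site": 102, "reported_total": 493},
    "COL1A2": {"glycine_substitutions": 291, "splice_site": 48, "reported_total": 339},
}


def summarize(counts: dict) -> dict:
    """Check per-gene totals and return the class totals across both genes."""
    for gene, row in counts.items():
        computed = row["glycine_substitutions"] + row["splice_site"]
        if computed != row["reported_total"]:
            raise ValueError(f"{gene}: {computed} != {row['reported_total']}")
    glycine = sum(row["glycine_substitutions"] for row in counts.values())
    splice = sum(row["splice_site"] for row in counts.values())
    return {
        "glycine_substitutions": glycine,
        "splice_site": splice,
        "all_mutations": glycine + splice,
        "splice_share_percent": round(100.0 * splice / (glycine + splice), 1),
    }


print(summarize(MARINI_COUNTS))
```

The per-gene and per-class counts agree with the stated totals of 682, 150, and 832. The splice-site share computed from these counts is about 18%, close to the rounded 20% quoted in the text.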

Rauch et al. (2010) compared the results of genotype analysis and clinical examination in 161 patients who were diagnosed as having OI type I, III, or IV according to the Sillence classification (median age: 13 years) and had glycine mutations in the triple-helical domain of alpha-1(I) (n = 67) or alpha-2(I) (n = 94). There were 111 distinct mutations, of which 38 affected the alpha-1(I) chain and 73 the alpha-2(I) chain. Serine substitutions were the most frequently encountered type of mutation in both chains. Overall, the majority of patients had a phenotypic diagnosis of OI type III or IV, had dentinogenesis imperfecta and blue sclera, and were born with skeletal deformities or fractures. Compared with patients with serine substitutions in alpha-2(I) (n = 40), patients with serine substitutions in alpha-1(I) (n = 42) on average were shorter (median height z-score -6.0 vs -3.4; P = 0.005), indicating that alpha-1(I) mutations cause a more severe phenotype. Height correlated with the location of the mutation in the alpha-2(I) chain but not in the alpha-1(I) chain. Patients with mutations affecting the first 120 amino acids at the N-terminal end of the collagen type I triple helix had blue sclera but did not have dentinogenesis imperfecta. Among patients from different families sharing the same mutation, about 90% and 75% were concordant for dentinogenesis imperfecta and blue sclera, respectively.

Takagi et al. (2011) reported 4 Japanese patients, including 2 unrelated patients with what the authors called 'classic OI IIC' and 2 sibs with features of 'OI IIC' but less distortion of the tubular bones (OI dense bone variant). No consanguinity was reported in their parents. In both sibs and 1 sporadic patient, they identified heterozygous mutations in the C-propeptide region of COL1A1 (120150.0069 and 120150.0070, respectively), whereas no mutation in this region was identified in the other sporadic patient. Familial gene analysis revealed somatic mosaicism of the mutation in the clinically unaffected father of the sibs, whereas their mother and healthy older sister did not have the mutation. Histologic examination in the 2 sporadic cases showed a network of broad, interconnected cartilaginous trabeculae with thin osseous seams in the metaphyseal spongiosa. Thick, cartilaginous trabeculae (cartilaginous cores) were also found in the diaphyseal spongiosa. Chondrocyte columnization appeared somewhat irregular. These changes differed from the narrow and short metaphyseal trabeculae found in other lethal or severe cases of OI. Takagi et al. (2011) concluded that heterozygous C-propeptide mutations in the COL1A1 gene may result in OI IIC with or without twisting of the long bones and that OI IIC appears to be inherited as an autosomal dominant trait.

Cytogenetics

COL1A1/PDGFB Fusion Gene

Dermatofibrosarcoma protuberans (DFSP; 607907), an infiltrative skin tumor of intermediate malignancy, presents specific cytogenetic features such as reciprocal translocations t(17;22)(q22;q13) and supernumerary ring chromosomes derived from t(17;22). Simon et al. (1997) characterized the breakpoints from translocations and rings in dermatofibrosarcoma protuberans and its juvenile form, giant cell fibroblastoma, on the genomic and RNA levels. They found that these rearrangements fuse the PDGFB gene (190040) and the COL1A1 gene. Simon et al. (1997) commented that PDGFB has transforming activity and is a potent mitogen for a number of cell types, but its role in oncogenic processes was not fully understood. They noted that neither COL1A1 nor PDGFB had hitherto been implicated in tumor translocations. The gene fusions deleted exon 1 of PDGFB and released this growth factor from its normal regulation; see 190040.0002.

Nakanishi et al. (2007) used RT-PCR to examine the COL1A1/PDGFB transcript using frozen biopsy specimens from 3 unrelated patients with DFSP and identified fusion of COL1A1 exon 25, exon 31, and exon 46, respectively, to exon 2 of the PDGFB gene. Clinical features and histopathology did not demonstrate any specific characteristics associated with the different transcripts.

Biochemical Features

Gauba and Hartgerink (2008) reported the design of a novel model system based upon collagen-like heterotrimers that can mimic the glycine mutations present in either the alpha-1 or alpha-2 chains of type I collagen. The design utilized an electrostatic recognition motif in 3 chains that can force the interaction of any 3 peptides, including AAA (all same), AAB (2 same and 1 different), or ABC (all different) triple helices. Therefore, the component peptides could be designed in such a way that glycine mutations were present in zero, 1, 2, or all 3 chains of the triple helix. They reported collagen mutants containing 1 or 2 glycine substitutions with structures relevant to native forms of OI. Gauba and Hartgerink (2008) demonstrated the difference in thermal stability and refolding half-life times between triple helices that vary only in the frequency of glycine mutations at a particular position.
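
Why triple helices carrying zero, 1, or 2 glycine substitutions are relevant to OI can be illustrated with a simple counting argument. The sketch below is not taken from Gauba and Hartgerink (2008); it assumes, purely for illustration, that a heterozygous individual produces mutant and normal chains in equal amounts and that chains assemble into trimers at random. The function name and the 1/2 mutant fraction are hypothetical.

```python
from fractions import Fraction
from itertools import product


def mutant_chain_distribution(n_chains, mutant_fraction):
    """Probability of each number of mutant chains among n_chains positions.

    Assumes (hypothetically) that each position is filled independently,
    with probability mutant_fraction of receiving a mutant chain.
    """
    dist = {}
    for combo in product((0, 1), repeat=n_chains):
        prob = Fraction(1)
        for is_mutant in combo:
            prob *= mutant_fraction if is_mutant else 1 - mutant_fraction
        k = sum(combo)
        dist[k] = dist.get(k, Fraction(0)) + prob
    return dist


if __name__ == "__main__":
    half = Fraction(1, 2)  # assumed equal output from mutant and normal alleles
    # Type I collagen has 2 alpha-1 chains and 1 alpha-2 chain per trimer.
    print("alpha-1 mutation:", mutant_chain_distribution(2, half))
    print("alpha-2 mutation:", mutant_chain_distribution(1, half))
```

Under these assumptions, a heterozygous alpha-1 mutation yields trimers with 0, 1, or 2 mutant chains in a 1:2:1 ratio, whereas a heterozygous alpha-2 mutation yields trimers with 0 or 1 mutant chain in equal proportions.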

By differential scanning calorimetry and circular dichroism, Makareeva et al. (2008) measured and mapped changes in the collagen melting temperature (delta-T(m)) for 41 different glycine substitutions from 47 OI patients. In contrast to peptides, they found no correlation of delta-T(m) with the identity of the substituting residue but instead observed regular variations in delta-T(m) with the substitution location on different triple helix regions. To relate the delta-T(m) map to peptide-based stability predictions, the authors extracted the activation energy of local helix unfolding from the reported peptide data, constructed a local helix unfolding map, and tested it by measuring the hydrogen-deuterium exchange rate for glycine NH residues involved in interchain hydrogen bonds. Makareeva et al. (2008) thereby delineated regional variations in the stability of the collagen triple helix. Two large, flexible regions deduced from the delta-T(m) map aligned with the regions important for collagen fibril assembly and ligand binding. One of these regions also aligned with a lethal region for Gly substitutions in the alpha-1(I) chain.

Animal Model

Pereira et al. (1993) established a line of transgenic mice that expressed moderate levels of an internally deleted human COL1A1 gene. The gene construct was modeled after a sporadic in-frame deletion that produced a lethal variant of OI. About 6% of the transgenic mice had a lethal phenotype with extensive fractures at birth, and 33% had fractures but were viable. The remaining 61% of the transgenic mice had no apparent fractures as assessed by x-ray examination on the day of birth. Brother-sister matings produced 8 litters in which approximately 40% of the mice had the lethal phenotype, indicating that expression of the transgene was more lethal in homozygous mice. The shortened collagen polypeptide chains synthesized from the human transgene were thought to bind to and cause degradation of the normal collagen chains synthesized from the normal mouse alleles.

Khillan et al. (1994) extended these studies using an antisense gene. The strategy of specifically inhibiting expression of a gene with antisense RNA generated from an inverted gene was introduced in 1984 (Izant and Weintraub, 1984; Mizuno et al., 1984; and Pestka et al., 1984). Khillan et al. (1994) assembled an antisense gene that was similar to the internally deleted COL1A1 minigene used by Pereira et al. (1993) except that the 3-prime half of the gene was inverted so as to code for an antisense RNA. Transgenic mice expressing the antisense gene had a normal phenotype, apparently because the antisense gene contained human sequences instead of mouse sequences. Two lines of mice expressing the antisense gene were bred to 2 lines of transgenic mice expressing the minigene. In mice that inherited both genes, the incidence of the lethal fragile bone phenotype was reduced from 92% to 27%. The effect of the antisense gene was directly demonstrated by an increase in the ratio of normal mouse pro-alpha-1(I) chains to human mini-chains in tissues from mice that inherited both genes and had a normal phenotype. The results raised the possibility that chimeric gene constructs that contain intron sequences and in which only the second half of a gene is inverted may be particularly effective as antisense genes.

Pereira et al. (1994) used an inbred strain of transgenic mice expressing a mutated COL1A1 gene to study phenotypic variability and incomplete penetrance. These phenomena are striking in families with osteogenesis imperfecta and are usually explained by differences in genetic background or in environmental factors. The inbred strain of transgenic mice expressing an internally deleted COL1A1 gene was bred to wildtype mice of the same strain so that the inheritance of proneness to fracture could be examined in a homogeneous genetic background. To minimize the effects of environmental factors, the phenotype was evaluated in embryos that were removed from the mother one day before term. Examination of stained skeletons from 51 transgenic embryos from 11 separate litters demonstrated that approximately 22% had a severe phenotype with extensive fractures of both long bones and ribs, approximately 51% had a mild phenotype with fractures of ribs only, and approximately 27% had no fractures. The ratio of steady-state levels of the mRNA from the transgene to the level of mRNA from the endogenous gene was the same in all transgenic embryos. The results demonstrated that the phenotypic variability and incomplete penetrance were not explained by variation in genetic background or in levels of gene expression. Pereira et al. (1994) concluded from these results that phenotypic variation may be an inherent characteristic of the mutated collagen gene.

Pereira et al. (1998) studied a transgenic mouse model of osteogenesis imperfecta (OI) in which the mice expressed a mini-COL1A1 gene containing a large in-frame deletion. Marrow stromal cells from wildtype mice were infused into OI-transgenic mice. In mice that had been irradiated at potentially lethal or sublethal levels, DNA from the donor marrow stromal cells was detected consistently in marrow, bone, cartilage, and lung at either 1 or 2.5 months after the infusion. The DNA also was detected, but less frequently, in the spleen, brain, and skin. There was a small but statistically significant increase in both collagen content and mineral content of bone 1 month after the infusion. In experiments in which male marrow stromal cells were infused into a female OI-transgenic mouse, fluorescence in situ hybridization assays for the Y chromosome indicated that after 2.5 months, donor male cells accounted for 4 to 19% of the fibroblasts or fibroblast-like cells obtained from primary cultures of the lung, calvaria, cartilage, long bone, tail, and skin. The results supported previous suggestions that marrow stromal cells or related cells in marrow serve as a source for continual renewal of cells in a number of nonhematopoietic tissues.

Aihara et al. (2003) evaluated intraocular pressure (IOP) in transgenic mice with a targeted mutation in the Col1a1 gene and found that the mice had ocular hypertension. The authors suggested an association between IOP regulation and fibrillar collagen turnover.

The mouse mutation 'abnormal gait-2' (Aga2) was identified in an N-ethyl-N-nitrosourea mutagenesis screen. Lisse et al. (2008) identified the Aga2 mutation as a T-to-A transversion within intron 50 of the Col1a1 gene, which introduced a novel 3-prime splice acceptor site that resulted in a frameshift. The mutant protein was predicted to have a novel C terminus that lacked a critical cysteine. Homozygosity for Aga2 was embryonic lethal. Heterozygous Aga2 (Aga2/+) animals showed early lethality, and surviving heterozygotes had widely variable phenotypes that included loss of bone mass, fractures, deformity, osteoporosis, and disorganized trabecular and collagen structures. Abnormal pro-Col1a1 chains accumulated intracellularly in Aga2/+ dermal fibroblasts and were poorly secreted. Intracellular accumulation of Col1a1 was associated with induction of an endoplasmic reticulum stress response and apoptosis characterized by caspase-12 (CASP12; 608633) and caspase-3 (CASP3; 600636) activation in vitro and in vivo.