Numerous applications in molecular biology and genomics require characterization of mutant DNA molecules present at low levels within a larger sample of non-mutant DNA. This is often achieved either by selectively amplifying mutant DNA, or by sequencing all the DNA followed by computational identification of the mutant DNA. However, selective amplification is challenging for insertions and deletions (indels). Additionally, sequencing all the DNA in a sample may not be cost-effective when only the presence of a mutation needs to be ascertained rather than its allelic fraction. The MutS protein evolved to detect DNA heteroduplexes in which the two DNA strands are mismatched. Prior methods have utilized MutS to enrich mutant DNA by hybridizing mutant to non-mutant DNA to create heteroduplexes. However, the purity of heteroduplex DNA these methods achieve is limited because they can only feasibly perform one or two enrichment cycles. We developed a MutS-magnetic bead system that enables rapid serial enrichment cycles. With six cycles, we achieve complete purification of heteroduplex indel DNA originally present at a 5% fraction and over 40-fold enrichment of heteroduplex DNA originally present at a 1% fraction. This system may enable novel approaches for enriching mutant DNA for targeted sequencing.
Choufani S*, McNiven V, Cytrynbaum C, Jangjoo M, Adam MP, Bjornsson HT, Harris J, Dyment D, Graham GE, Nezarati MM, Aul Ritu B, Castiglioni C, …, Chorin O, Evrony GD, Kraatari-Tiri M, Dudding-Byth T, Richardson A, Hunt D, Hamilton L, Dyack S, Mendelsohn B, Rodriguez N, Sanchez-Martinez R, Tenorio-Castano J, Nevado J, Lapunzina P, Tirado P, Rodrigues M, Quteineh L, Innes AM, Kline A, Au PYB*, Weksberg R*. American Journal of Human Genetics | 2021
Au-Kline syndrome (AKS) is a neurodevelopmental disorder associated with multiple malformations and a characteristic facial gestalt. The first individuals ascertained carried de novo loss-of-function (LoF) variants in HNRNPK. Here, we report 32 individuals with AKS (26 previously unpublished), including 13 with de novo missense variants. We propose new clinical diagnostic criteria for AKS that differentiate it from the clinically overlapping Kabuki syndrome and describe a significant phenotypic expansion to include individuals with missense variants who present with subtle facial features and few or no malformations. Many gene-specific DNA methylation (DNAm) signatures have been identified for neurodevelopmental syndromes. Because HNRNPK has roles in chromatin and epigenetic regulation, we hypothesized that pathogenic variants in HNRNPK may be associated with a specific DNAm signature. Here, we report a unique DNAm signature for AKS due to LoF HNRNPK variants, distinct from controls and Kabuki syndrome. This DNAm signature is also identified in some individuals with de novo HNRNPK missense variants, confirming their pathogenicity and the phenotypic expansion of AKS to include more subtle phenotypes. Furthermore, we report that some individuals with missense variants have an “intermediate” DNAm signature that parallels their milder clinical presentation, suggesting the presence of an epi-genotype phenotype correlation. In summary, the AKS DNAm signature may help elucidate the underlying pathophysiology of AKS. This DNAm signature also effectively supported clinical syndrome delineation and is a valuable aid for variant interpretation in individuals where a clinical diagnosis of AKS is unclear, particularly for mild presentations.
Over the past decade, genomic analyses of single cells—the fundamental units of life—have become possible. Single-cell DNA sequencing has shed light on biological questions that were previously inaccessible across diverse fields of research, including somatic mutagenesis, organismal development, genome function, and microbiology. Single-cell DNA sequencing also promises significant future biomedical and clinical impact, spanning oncology, fertility, and beyond. While single-cell approaches that profile RNA and protein have greatly expanded our understanding of cellular diversity, many fundamental questions in biology and important biomedical applications require analysis of the DNA of single cells. Here, we review the applications and biological questions for which single-cell DNA sequencing is uniquely suited or required. We include a discussion of the fields that will be impacted by single-cell DNA sequencing as the technology continues to advance.
Gruber C, Calis J, Buta S, Evrony GD, Martin JC, Uhl SA, Caron R, Jarchin L, Dunkin D, Phelps R, Webb B, Saland J, Merad M, Orange JS, Mace EM, Rosenberg BR, Gelb BD, Bogunovic D.
Autoinflammatory disease can result from monogenic errors of immunity. We describe a patient with early-onset multi-organ immune dysregulation resulting from a mosaic, gain-of-function mutation (S703I) in JAK1, encoding a kinase essential for signaling downstream of >25 cytokines. By custom single-cell RNA sequencing, we examine mosaicism with single-cell resolution. We find that JAK1 transcription was predominantly restricted to a single allele across different cells, introducing the concept of a mutational “transcriptotype” that differs from the genotype. Functionally, the mutation increases JAK1 activity and transactivates partnering JAKs, independent of its catalytic domain. S703I JAK1 is not only hypermorphic for cytokine signaling but also neomorphic, as it enables signaling cascades not canonically mediated by JAK1. Given these results, the patient was treated with tofacitinib, a JAK inhibitor, leading to the rapid resolution of clinical disease. These findings offer a platform for personalized medicine with the concurrent discovery of fundamental biological principles.
Chorin O*, Yachelevich N*, Mohamed K, Moscatelli I, Pappas J, Henriksen K, Evrony GD.
Background: Over half of children with rare genetic diseases remain undiagnosed despite maximal clinical evaluation and DNA‐based genetic testing. As part of an Undiagnosed Diseases Program applying transcriptome (RNA) sequencing to identify the causes of these unsolved cases, we studied a child with severe infantile osteopetrosis leading to cranial nerve palsies, bone deformities, and bone marrow failure, for whom whole‐genome sequencing was nondiagnostic.
Methods: We performed transcriptome (RNA) sequencing of whole blood followed by analysis of aberrant transcript isoforms and osteoclast functional studies.
Results: We identified a pathogenic deep intronic variant in CLCN7 creating an unexpected, frameshifting pseudoexon causing complete loss of function. Functional studies, including osteoclastogenesis and bone resorption assays, confirmed normal osteoclast differentiation but loss of osteoclast function.
Conclusion: This is the first report of a pathogenic deep intronic variant in CLCN7, and our approach provides a model for systematic identification of noncoding variants causing osteopetrosis, a disease for which molecular‐genetic diagnosis can be pivotal for potentially curative hematopoietic stem cell transplantation. Our work illustrates that cryptic splice variants may elude DNA‐only sequencing and supports broad first‐line use of transcriptome sequencing for children with undiagnosed diseases.
Urreizti R*, Mayer K*, Evrony GD*, Said E, Castilla-Vallmanya L, Cody NAL, Plasencia G, Gelb BD, Grinberg D, Brinkmann U, Webb BD, Balcells S.
DPH1 variants have been associated with an ultra-rare and severe neurodevelopmental disorder, mainly characterized by variable developmental delay, short stature, dysmorphic features, and sparse hair. We have identified four new patients (from two different families) carrying novel variants in DPH1, enriching the clinical delineation of the DPH1 syndrome. Using a diphtheria toxin ADP-ribosylation assay, we have analyzed the activity of seven identified variants and demonstrated compromised function for five of them [p.(Leu234Pro); p.(Ala411Argfs*91); p.(Leu164Pro); p.(Leu125Pro); and p.(Tyr112Cys)]. We have built a homology model of the human DPH1–DPH2 heterodimer and have performed molecular dynamics simulations to study the effect of these variants on the catalytic sites as well as on the interactions between subunits of the heterodimer. The results show that loss of activity, reduced size of the opening to the catalytic site, and changes in the size of the catalytic site correlate with clinical severity. This is the first report of functional tests of DPH1 variants associated with the DPH1 syndrome. We demonstrate that the in vitro assay for DPH1 protein activity and structural modeling are useful tools for assessing the effect of the variants on DPH1 function and may be used for predicting patient outcomes and prognoses.
Evrony GD*, Cordero DR*, Shen J*, Partlow JN, Yu TW, Rodin RE, Hill RS, Coulter ME, Lam AN, Jayaraman D, Gerrelli D, Diaz DG, Santos C, Morrison V, Galli A, Tschulena U, Wiemann S, Martel MJ, Spooner B, Ryu SC, Elhosary PC, Richardson JM, Tierney D, Robinson CA, Chibbar R, Diudea D, Folkerth R, Wiebe S, Barkovich AJ, Mochida GH, Irvine J, Lemire EG, Blakley P, Walsh CA.
While next-generation sequencing has accelerated the discovery of human disease genes, progress has been largely limited to the “low hanging fruit” of mutations with obvious exonic coding or canonical splice site impact. In contrast, the lack of high-throughput, unbiased approaches for functional assessment of most noncoding variants has bottlenecked gene discovery. We report the integration of transcriptome sequencing (RNA-seq), which surveys all mRNAs to reveal functional impacts of variants at the transcription level, into the gene discovery framework for a unique human disease, microcephaly-micromelia syndrome (MMS). MMS is an autosomal recessive condition described thus far in only a single First Nations population and causes intrauterine growth restriction, severe microcephaly, craniofacial anomalies, skeletal dysplasia, and neonatal lethality. Linkage analysis of affected families, including a very large pedigree, identified a single locus on Chromosome 21 linked to the disease (LOD > 9). Comprehensive genome sequencing did not reveal any pathogenic coding or canonical splicing mutations within the linkage region but identified several nonconserved noncoding variants. RNA-seq analysis detected aberrant splicing in DONSON due to one of these noncoding variants, showing a causative role for DONSON disruption in MMS. We show that DONSON is expressed in progenitor cells of embryonic human brain and other proliferating tissues, is co-expressed with components of the DNA replication machinery, and that Donson is essential for early embryonic development in mice as well, suggesting an essential conserved role for DONSON in the cell cycle. Our results demonstrate the utility of integrating transcriptomics into the study of human genetic disease when DNA sequencing alone is not sufficient to reveal the underlying pathogenic mutation.
We each begin life as a single cell harboring a single genome, which—over the course of development—gives rise to the trillions of cells that make up the body. From skin cells to heart cells to neurons of the brain, each bears a copy of the original cell’s genome. But as anyone who has used a copy machine or played the childhood game of “telephone” knows, copies are never perfect. Every cell in an individual actually has a unique genome, an imperfect copy of its cellular ancestor differentiated by inevitable somatic mutations arising from errors in DNA replication and other mutagenic forces (1). Somatic mutation is the fundamental process leading to all genetic diseases, including cancer; every inherited genetic disease also has its origins in such mutation events that occurred in an ancestor’s germline cells. Yet how many and what kinds of somatic mutations accumulate in our cells as we develop and age has long been unknown and a blind spot in our understanding of the origins of genetic disease.
The patient was rushed into the room, listless, intermittently trying to lift his head only to fall back down. Soon, he became unresponsive and cold. I placed the ultrasound probe on his chest and saw a barely contracting heart: heart failure. A massive clot filled the left atrium. The team became silent, then quickly regained its composure, and the supervising doctor began issuing orders in rapid fire. We stabilized the patient, though he remained in serious condition, and then we shifted to research mode. We emailed doctors across the country: “Have you seen acute dilated cardiomyopathy before in similar patients? He is a 3-year-old meerkat.”
Whether somatic mutations contribute functional diversity to brain cells is a long-standing question. Single-neuron genomics enables direct measurement of somatic mutation rates in human brain and promises to answer this question. A recent study (Upton et al., 2015) reported high rates of somatic LINE-1 element (L1) retrotransposition in the hippocampus and cerebral cortex that would have major implications for normal brain function, and suggested that these events preferentially impact genes important for neuronal function. We identify aspects of the single-cell sequencing approach, bioinformatic analysis, and validation methods that led to thousands of artifacts being interpreted as somatic mutation events. Our reanalysis supports a mutation frequency of approximately 0.2 events per cell, which is about fifty-fold lower than reported, confirming that L1 elements mobilize in some human neurons but indicating that L1 mosaicism is not ubiquitous. Through consideration of the challenges identified, we provide a foundation and framework for designing single-cell genomics studies.
Lodato MA*, Woodworth MB*, Lee S*, Evrony GD, Mehta BK, Karger A, Lee S, Chittenden TW, Cai X, Lovelace JL, Lee E, Park PJ, Walsh CA.
Neurons live for decades in a postmitotic state, their genomes susceptible to DNA damage. Here we survey the landscape of somatic single-nucleotide variants (SNVs) in the human brain. We identified thousands of somatic SNVs by single-cell sequencing of 36 neurons from the cerebral cortex of three normal individuals. Unlike germline and cancer SNVs, which are often caused by errors in DNA replication, neuronal mutations appear to reflect damage during active transcription. Somatic mutations create nested lineage trees, allowing them to be dated relative to developmental landmarks and revealing a polyclonal architecture of the human cerebral cortex. Thus, somatic mutations in the brain represent a durable and ongoing record of neuronal life history, from development through postmitotic function.
Somatic mutations occur during brain development and are increasingly implicated as a cause of neurogenetic disease. However, the patterns in which somatic mutations distribute in the human brain are unknown. We used high-coverage whole-genome sequencing of single neurons from a normal individual to identify spontaneous somatic mutations as clonal marks to track cell lineages in human brain. Somatic mutation analyses in >30 locations throughout the nervous system identified multiple lineages and sublineages of cells marked by different LINE-1 (L1) retrotransposition events and subsequent mutation of poly-A microsatellites within L1. One clone contained thousands of cells limited to the left middle frontal gyrus, whereas a second distinct clone contained millions of cells distributed over the entire left hemisphere. These patterns mirror known somatic mutation disorders of brain development and suggest that focally distributed mutations are also prevalent in normal brains. Single-cell analysis of somatic mutation enables tracing of cell lineage clones in human brain.
De novo copy-number variants (CNVs) can cause neuropsychiatric disease, but the degree to which they occur somatically, and during development, is unknown. Single-cell whole-genome sequencing (WGS) in >200 single cells, including >160 neurons from three normal and two pathological human brains, sensitively identified germline trisomy of chromosome 18 but found most (≥95%) neurons in normal brain tissue to be euploid. Analysis of a patient with hemimegalencephaly (HMG) due to a somatic CNV of chromosome 1q found unexpected tetrasomy 1q in ∼20% of neurons, suggesting that CNVs in a minority of cells can cause widespread brain dysfunction. Single-cell analysis identified large (>1 Mb) clonal CNVs in lymphoblasts and in single neurons from normal human brain tissue, suggesting that some CNVs occur during neurogenesis. Many neurons contained one or more large candidate private CNVs, including one at chromosome 15q13.2-13.3, a site of duplication in neuropsychiatric conditions. Large private and clonal somatic CNVs occur in normal and diseased human brains.
Reiff RE*, Ali BR*, Baron B, Yu TW, Ben-Salem S, Coulter ME, Schubert CR, Hill RS, Akawi NA, Al-Younes B, Kaya N, Evrony GD, Al-Saffar M, Felie JM, Partlow JN, Sunu CM, Schembri-Wismayer P, Alkuraya FS, Meyer BF, Walsh CA, Al-Gazali L, Mochida GH.
Whereas many genes associated with intellectual disability (ID) encode synaptic proteins, transcriptional defects leading to ID are less well understood. We studied a large, consanguineous pedigree of Arab origin with seven members affected with ID and mild dysmorphic features. Homozygosity mapping and linkage analysis identified a candidate region on chromosome 17 with a maximum multipoint logarithm of odds score of 6.01. Targeted high-throughput sequencing of the exons in the candidate region identified a homozygous 4-bp deletion (c.169_172delCACT) in the METTL23 (methyltransferase like 23) gene, which is predicted to result in a frameshift and premature truncation (p.His57Valfs*11). Overexpressed METTL23 protein localized to both nucleus and cytoplasm, and physically interacted with GABPA (GA-binding protein transcription factor, alpha subunit). GABP, of which GABPA is a component, is known to regulate the expression of genes such as THPO (thrombopoietin) and ATP5B (ATP synthase, H+ transporting, mitochondrial F1 complex, beta polypeptide) and is implicated in a wide variety of important cellular functions. Overexpression of METTL23 resulted in increased transcriptional activity at the THPO promoter, whereas knockdown of METTL23 with siRNA resulted in decreased expression of ATP5B, thus revealing the importance of METTL23 as a regulator of GABPA function. The METTL23 mutation highlights a new transcriptional pathway underlying human intellectual function.
Bae BI*, Tietjen I*, Atabay KD, Evrony GD, Johnson MB, Asare E, Wang PP, Murayama AY, Im K, Lisgo SN, Overman L, Sestan N, Chang BS, Barkovich AJ, Grant PE, Topcu M, Politsky J, Okano H, Piao X, Walsh CA.
The human neocortex has numerous specialized functional areas whose formation is poorly understood. Here, we describe a 15–base pair deletion mutation in a regulatory element of GPR56 that selectively disrupts human cortex surrounding the Sylvian fissure bilaterally including “Broca’s area,” the primary language area, by disrupting regional GPR56 expression and blocking RFX transcription factor binding. GPR56 encodes a heterotrimeric guanine nucleotide–binding protein (G protein)–coupled receptor required for normal cortical development and is expressed in cortical progenitor cells. GPR56 expression levels regulate progenitor proliferation. GPR56 splice forms are highly variable between mice and humans, and the regulatory element of gyrencephalic mammals directs restricted lateral cortical expression. Our data reveal a mechanism by which control of GPR56 expression pattern by multiple alternative promoters can influence stem cell proliferation, gyral patterning, and, potentially, neocortex evolution.
Genetic mutations causing human disease are conventionally thought to be inherited through the germ line from one’s parents and present in all somatic (body) cells, except for most cancer mutations, which arise somatically. Increasingly, somatic mutations are being identified in diseases other than cancer, including neurodevelopmental diseases. Somatic mutations can arise during the course of prenatal brain development and cause neurological disease—even when present at low levels of mosaicism, for example—resulting in brain malformations associated with epilepsy and intellectual disability. Novel, highly sensitive technologies will allow more accurate evaluation of somatic mutations in neurodevelopmental disorders and during normal brain development.
Yang YJ, Baltus AE, Matthew RS, Murphy EA, Evrony GD, Gonzalez DM, Wang EP, Marshall-Walker CA, Barry BJ, Jernej M, Tatarakis A, Mahajan MA, Samuels HH, Shi Y, Golden JA, Mahajnah M, Shenhav R, Walsh CA.
Microcephaly is a neurodevelopmental disorder causing significantly reduced cerebral cortex size. Many known microcephaly gene products localize to centrosomes, regulating cell fate and proliferation. Here, we identify and characterize a nuclear zinc finger protein, ZNF335/NIF-1, as a causative gene for severe microcephaly, small somatic size, and neonatal death. Znf335 null mice are embryonically lethal, and conditional knockout leads to severely reduced cortical size. RNA-interference and postmortem human studies show that ZNF335 is essential for neural progenitor self-renewal, neurogenesis, and neuronal differentiation. ZNF335 is a component of a vertebrate-specific, trithorax H3K4-methylation complex, directly regulating REST/NRSF, a master regulator of neural gene expression and cell fate, as well as other essential neural-specific genes. Our results reveal ZNF335 as an essential link between H3K4 complexes and REST/NRSF and provide the first direct genetic evidence that this pathway regulates human neurogenesis and neuronal differentiation.
Evrony GD*, Cai X*, Lee E, Hills LB, Elhosary PC, Lehmann HS, Parker JJ, Atabay KD, Gilmore EC, Poduri A, Park PJ, Walsh CA.
A major unanswered question in neuroscience is whether there exists genomic variability between individual neurons of the brain, contributing to functional diversity or to an unexplained burden of neurological disease. To address this question, we developed a method to amplify genomes of single neurons from human brains. Because recent reports suggest frequent LINE-1 (L1) retrotransposition in human brains, we performed genome-wide L1 insertion profiling of 300 single neurons from cerebral cortex and caudate nucleus of three normal individuals, recovering >80% of germline insertions from single neurons. While we find somatic L1 insertions, we estimate <0.6 unique somatic insertions per neuron, and most neurons lack detectable somatic insertions, suggesting that L1 is not a major generator of neuronal diversity in cortex and caudate. We then genotyped single cortical cells to characterize the mosaicism of a somatic AKT3 mutation identified in a child with hemimegalencephaly. Single-neuron sequencing allows systematic assessment of genomic diversity in the human brain.
Poduri A, Evrony GD, Cai X, Elhosary PC, Beroukhim R, Lehtinen MK, Hills LB, Heinzen EL, Hill A, Hill RS, Barry BJ, Bourgeois BFD, Riviello JJ, Barkovich AJ, Black PM, Ligon KL, Walsh CA.
Hemimegalencephaly (HMG) is a developmental brain disorder characterized by an enlarged, malformed cerebral hemisphere, typically causing epilepsy that requires surgical resection. We studied resected HMG tissue to test whether the condition might reflect somatic mutations affecting genes critical to brain development. We found that two out of eight HMG samples showed trisomy of chromosome 1q, which encompasses many genes, including AKT3, a gene known to regulate brain size. A third case showed a known activating mutation in AKT3 (c.49G→A, creating p.E17K) that was not present in the patient’s blood cells. Remarkably, the E17K mutation in AKT3 is exactly paralogous to E17K mutations in AKT1 and AKT2 recently discovered in somatic overgrowth syndromes. We show that AKT3 is the most abundant AKT paralog in the brain during neurogenesis and that phosphorylated AKT is abundant in cortical progenitor cells. Our data suggest that somatic mutations limited to the brain could represent an important cause of complex neurogenetic disease.
The use of nanoparticles for targeted drug delivery is often facilitated by specific conjugation of functional targeting molecules to the nanoparticle surface. We compared different biotin-binding proteins (avidin, streptavidin, or neutravidin) as crosslinkers to conjugate proteins to biodegradable nanoparticles prepared from poly(lactic-co-glycolic acid) (PLGA)–polyethylene glycol (PEG)-biotin polymers. Avidin gave the highest levels of overall protein conjugation, whereas neutravidin minimized protein non-specific binding to the polymer. The tetanus toxin C fragment (TTC), which is efficiently retrogradely transported in neurons and binds to neurons with high specificity and affinity, retained the ability to bind to neuroblastoma cells following amine group modifications. TTC was conjugated to nanoparticles using neutravidin, and the resulting nanoparticles were shown to selectively target neuroblastoma cells in vitro. TTC-conjugated nanoparticles have the potential to serve as drug delivery vehicles targeted to the central nervous system.