Hematopoiesis is a continuous process of blood and immune cell production. It is orchestrated by thousands of gene products that respond to extracellular signals by guiding cell fate decisions to meet the needs of the organism. Although much of our knowledge of this process comes from work in model systems, we have learned a great deal from studies on human genetic variation. Considerable insight has emerged from studies on presumed monogenic blood disorders, which continue to provide key insights into the mechanisms critical for hematopoiesis. Furthermore, the emergence of large-scale biobanks and cohorts has uncovered thousands of genomic loci associated with blood cell traits and diseases. Some of these blood cell trait-associated loci act as modifiers of what were once thought to be monogenic blood diseases. However, most of these loci await functional validation. Here, we discuss the validation bottleneck and emerging methods to more effectively connect variant to function. In particular, we highlight recent innovations in genome editing, which have paved the path forward for high-throughput functional assessment of loci. Finally, we discuss existing barriers to progress, including challenges in manipulating the genomes of primary hematopoietic cells.

“The life of the flesh is in the blood” (Leviticus 17:11). Every second, our body produces more than 2 million red blood cells to help deliver oxygen to our tissues, more than 1 million platelets to help maintain hemostasis, and more than 1 million white blood cells to enable our immune function. This production process, termed hematopoiesis, is highly organized and responds to extracellular signals to prioritize production of specific lineages in periods of injury, stress or acute illness. Chronic deficiencies in the production line or defects within any of the individual products may lead to pathological consequences for the individual.

Our understanding of the organization of the hematopoietic system and the mechanisms governing its operation is rooted in the hypothesis that it is a hierarchical process with all blood elements derived from a common precursor cell (Maximow, 1909) (Fig. 1). Since the discovery of the ‘polyblast’, now termed the hematopoietic stem cell (HSC), studies in mice and other model organisms have helped characterize factors responsible for the maintenance of HSCs and the differentiation process through which mature blood cells are formed (Alexander et al., 1996; Liggett and Sankaran, 2020). Although these studies have helped us understand this process at a high level, there are clearly aspects of hematopoiesis that are unique to humans and insights that can emerge through the study of human genetic variation (Medetgul-Ernar and Davis, 2022). The completion of the initial draft of the human genome and subsequent advances in sequencing technologies have helped accelerate the pace of discovery (Shendure et al., 2017). More recently, the establishment of large-scale biobanks and cohorts linking genomic variation and clinical data, such as the Trans-Omics for Precision Medicine (TOPMed) program (Taliun et al., 2021) and the UK Biobank (Bycroft et al., 2018), has provided the framework and data for increasingly well-powered genome-wide association studies (GWASs) and sequencing-based rare variant association studies. In the past few years alone, thousands of new germline genetic loci have been described that contribute to variation in various hematopoietic traits or increase the risk for different blood diseases, including blood and immune cell phenotypes (Chen et al., 2020; Vuckovic et al., 2020), clonal hematopoiesis (see Glossary, Box 1) (Bick et al., 2020; Brown et al., 2022 preprint; Kar et al., 2022) and blood cancers (Bao et al., 2020; Mitchell et al., 2016; Vijayakrishnan et al., 2019).
As the number of individuals in these databases grows, our ability to identify the genetic contribution to hematopoietic phenotypes and diseases will grow in tandem. However, until these variants are experimentally validated, and the underlying biological mechanisms are uncovered, there are limited insights to inform therapeutic and preventive strategies. Additionally, with increasing sequencing of patients with blood diseases, particularly those thought to arise due to monogenic causes, the growing list of variants of unknown significance and unknown mechanism represents a major bottleneck in human genetics. Although the global scientific community is growing, an increase in manpower is not enough to address the chasm between the exponentially growing identification of both monogenic and polygenic variation in disease, and our mechanistic insights. It is paramount that tools for experimental validation of putative causal variants keep pace with variant discovery. Somewhat fortuitously, a new generation of tools for high-throughput variant assessment has emerged, bolstered by the recent development of next-generation genome-editing technologies. Here, we discuss traditional and emerging experimental approaches that address the variant-to-function problem for blood phenotypes and diseases.


Box 1. Glossary

Acute lymphoblastic leukemia: a cancer of the blood and bone marrow affecting lymphoid progenitors, including immature T cells, B cells and natural killer (NK) cells.

Aldolase A: an enzyme involved in the fourth step of glycolysis and found predominantly in red blood cells and muscle tissue. Deficiency results in dysregulation of energy homeostasis within red blood cells, leading to membrane instability and rupture (see ‘hemolytic anemia’).

Alpha-thalassemia: an inherited hemoglobinopathy characterized by insufficient production of alpha-globin chains due to large deletions at the alpha-globin locus or point mutations in hemoglobin subunit alpha 1 (HBA1) and/or hemoglobin subunit alpha 2 (HBA2). Affected individuals have chronic anemia and are often dependent on blood transfusions.

B-cell lymphoma/leukemia 11A (BCL11A): a transcription factor involved in the regulation of gene expression at the beta-globin locus. In the postnatal period, BCL11A orchestrates the switch from high transcriptional activity at hemoglobin subunit gamma 1/2 (HBG1/2) loci to high transcriptional activity at the hemoglobin subunit beta (HBB) locus through changes in chromatin looping and enhancer–promoter interactions. An erythroid-specific enhancer of BCL11A located in an intron of the BCL11A gene is currently the target of multiple gene therapy trials aimed at restoring high levels of fetal hemoglobin to ameliorate disease in beta-hemoglobinopathies (see ‘fetal hemoglobin’ and ‘sickle cell disease’).

Beta-thalassemia: an inherited hemoglobinopathy characterized by insufficient production of beta-globin chains due to large deletions at the beta-globin locus or point mutations in HBB. Affected individuals have chronic anemia and are often dependent on blood transfusions.

Beta-2 microglobulin (B2M): a component of major histocompatibility complex (MHC) class I molecules that present intracellular peptide fragments to cytotoxic CD8+ T cells. Loss of B2M leads to near-complete loss of surface expression of MHC class I. Ablation of B2M is currently being used in a number of clinical trials to create ‘off the shelf’ CAR-T therapies that are resistant to allorejection.

C-C chemokine receptor type 5 (CCR5): a G-protein-coupled receptor on the surface of T lymphocytes, macrophages and immature dendritic cells that regulates trafficking and effector functions. It also acts as a co-receptor for membrane fusion and viral entry of human immunodeficiency virus (HIV) viral particles. Individuals with homozygous loss-of-function mutations in CCR5 are resistant to HIV infection. It is currently a target of gene-editing therapies for prevention and amelioration of HIV infection.

Clonal hematopoiesis: an age-related disorder characterized by the emergence of a detectable population of blood cells that share somatically acquired mutation(s). It is a risk factor for the development of leukemia and atherosclerotic cardiovascular disease.

Diamond-Blackfan anemia: a congenital disorder of the bone marrow characterized by ineffective production of red blood cell progenitors. It is a genetically heterogeneous disorder enriched for mutations in ribosomal protein genes that cause aberrant translation of key erythroid maturation factors.

DNA methyltransferase 3A (DNMT3A): a de novo DNA methyltransferase responsible for deposition of methyl groups on the C-5 carbon of cytosines in DNA. In humans, the enzyme preferentially methylates cytosines at CG dinucleotides, which are enriched at gene promoters. Loss-of-function mutations in DNMT3A are among the most common somatic mutations found in clonal hematopoiesis and hematopoietic malignancies, likely acting through aberrant epigenetic programs interfering with normal hematopoiesis.

Fetal hemoglobin: the predominant oxygen carrier protein during gestation and in the perinatal period. It is composed of two alpha-globin subunits (α2) encoded by HBA1/2 and two gamma-globin subunits (γ2) encoded by HBG1/2. After birth, BCL11A orchestrates an epigenetic reprogramming of the beta-globin locus to increase HBB expression at the expense of HBG1/2, resulting in the transition from fetal hemoglobin to adult hemoglobin. Individuals with sickle cell disease who have higher levels of fetal hemoglobin have less-severe symptoms, and drugs that increase baseline fetal hemoglobin levels represent the most effective disease-modifying agents for patients with sickle cell disease.

Hemolytic anemia: a form of anemia caused by increased breakdown of red blood cells either in the blood stream (intravascular) or in other organ systems (extravascular). It can present in patients as an acquired or inherited disorder. Inherited forms include defects in red blood cell membranes (membranopathies), such as hereditary spherocytosis, or defects in red blood cell metabolism, such as aldolase A deficiency (see ‘aldolase A’).

Immune dysregulation: a maladaptive process through which normal immune system functions are corrupted, leading to autoimmune disease, cancer and hyperinflammatory states. Patients often have co-existing immunodeficiencies.

Landing pad: a synthetic or endogenous segment of DNA that can be used for precise and efficient genomic integration of one or more genetic elements.

Myeloid neoplasia: a group of malignant disorders specifically affecting the myeloid lineage of the hematopoietic system. Examples include acute myeloid leukemia, myelodysplastic syndrome and myeloproliferative neoplasia.

RNA-binding motif protein 38 (RBM38): an RNA-binding protein that regulates alternative splicing during terminal erythropoiesis. Individuals with inherited variants in RBM38 are at increased risk for the development of anemia.

Sickle cell disease: a group of inherited hemoglobinopathies characterized by aberrant polymerization of hemoglobin secondary to missense mutations in HBB. Affected individuals have chronic anemia, vaso-occlusive episodes (pain crisis) and increased infection susceptibility.

Terminal erythropoiesis: the process by which nucleated red blood cell progenitor cells undergo maturation into anuclear erythrocytes (red blood cells).

Fig. 1.

Flow diagram of hematopoiesis. Schematic depicting the cellular trajectories of hematopoiesis. Disease processes affecting individual cell types at different stages of differentiation are highlighted in gray boxes. The combined use of small molecules (SR1 and UM171) and human cytokines (SCF, FLT3L and TPO) can expand hematopoietic stem and progenitor cell (HSPC) populations (top blue box) (Fares et al., 2014). Cytokines used to differentiate HSPCs into erythrocytes (Giani et al., 2016), platelets (Perdomo et al., 2017), eosinophils (Shalit et al., 1995), basophils/mast cells (Kepley et al., 1998), neutrophils (Rao et al., 2021), macrophages (Way et al., 2009), dendritic cells (Swartz and Nair, 2022), natural killer (NK) cells (Spanholtz et al., 2010), innate lymphoid cells (Hernández et al., 2021), T cells (Singh et al., 2019) and B cells (Dubois et al., 2020; Kraus et al., 2014; Luo et al., 2009; Spanholtz et al., 2010; Hernández et al., 2021) are highlighted in blue boxes. EPO, erythropoietin; FLT3L, FMS-related tyrosine kinase 3 ligand; G-CSF, granulocyte colony-stimulating factor; GM-CSF, granulocyte macrophage colony-stimulating factor; IL-2, interleukin-2; IL-3, interleukin-3; IL-5, interleukin-5; IL-7, interleukin-7; IL-15, interleukin-15; M-CSF, macrophage colony-stimulating factor; SCF, stem cell factor; SR1, StemRegenin 1; TPO, thrombopoietin.


The validation of genetic variants has long been hampered by difficulty in identifying the appropriate biological context for testing and the absence of high-throughput experimental approaches. This has led to significant investment into the development of computational tools for variant prioritization and mechanistic inference. One particularly valuable approach is fine-mapping, which leverages population genetics and linkage disequilibrium patterns to identify likely causal variants and also can be integrated with epigenomic annotations to help discern putative mechanisms at specific loci (Schaid et al., 2018).
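The core arithmetic behind statistical fine-mapping can be illustrated with a minimal sketch. It assumes exactly one causal variant at the locus and uses Wakefield-style approximate Bayes factors computed from GWAS effect sizes and standard errors; real fine-mapping tools additionally model linkage disequilibrium and multiple causal variants. The variant names, summary statistics and the prior variance of 0.04 are hypothetical values chosen for illustration.

```python
import math

def approx_bayes_factor(beta, se, prior_var=0.04):
    """Approximate Bayes factor in favor of association (Wakefield-style).

    prior_var is an assumed prior variance on the true effect size.
    """
    v = se ** 2
    z2 = (beta / se) ** 2
    shrink = prior_var / (v + prior_var)
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z2 * shrink)

def credible_set(variants, coverage=0.95):
    """Return posterior inclusion probabilities (PIPs) and a credible set.

    variants maps a variant name to a (beta, se) tuple. Assumes a single
    causal variant and equal prior probability for every variant.
    """
    abf = {name: approx_bayes_factor(b, s) for name, (b, s) in variants.items()}
    total = sum(abf.values())
    pips = {name: bf / total for name, bf in abf.items()}
    chosen, cumulative = [], 0.0
    for name in sorted(pips, key=pips.get, reverse=True):
        chosen.append(name)
        cumulative += pips[name]
        if cumulative >= coverage:
            break
    return pips, chosen

# Hypothetical summary statistics for three variants in one locus.
locus = {"rsA": (0.10, 0.02), "rsB": (0.095, 0.02), "rsC": (0.03, 0.02)}
pips, cs = credible_set(locus)
for name in cs:
    print(name, round(pips[name], 3))
```

In this toy locus the two strongly associated variants cannot be separated statistically, so both enter the 95% credible set; this is the situation in which epigenomic annotations or functional assays are needed to decide between them.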

Variant interpretation and prioritization tools often employ evolutionary conservation or variation constraint to estimate the pathogenicity of a variant (Adzhubei et al., 2010; Vaser et al., 2016). These tools are often more sensitive in coding regions, in which variant impact on coding sequence is more readily discerned, but this can also be applied to non-coding elements. Nonetheless, interpretation of non-coding variants is challenging, as regulatory elements are less conserved across species, and defining the consequences of these changes can be harder. Multiple collaborative efforts, such as the Encyclopedia of DNA Elements (ENCODE) project (ENCODE Project Consortium, 2004), the Roadmap Epigenomics project (Bernstein et al., 2010) and the International Human Epigenome Consortium (Stunnenberg et al., 2016), have helped address this problem through the generation of genome-wide epigenomic maps across diverse human tissues. These epigenomic maps label tissue-specific regulatory elements, which are enriched in causal variants for blood and immune phenotypes and can be used to prioritize GWAS hits for downstream functional validation (Cano-Gamez and Trynka, 2020). Integrating these epigenomic maps into variant prediction algorithms, such as Combined Annotation-Dependent Depletion (CADD) (Kircher et al., 2014), has led to improved accuracy in prioritizing functional and pathogenic variants. Emerging data on tissue-specific three-dimensional genome interactions should further improve variant prioritization efforts. For instance, Javierre et al. (2016) applied promoter capture Hi-C, which detects interactions between gene promoters and cis-regulatory elements, to 17 human primary hematopoietic cell types and computationally linked thousands of previously uncharacterized GWAS single-nucleotide polymorphisms to their putative target genes.

Early epigenomic datasets were derived from mixed populations of cells, masking important cell-type- or state-specific regulatory elements. Incorporating single-cell omics data into variant interpretation algorithms has led to improved sensitivity to detect causal variants driving blood cell traits (Ulirsch et al., 2019) and autoimmune diseases (Zhang et al., 2021). Unfortunately, the high sparsity and signal dropout in many single-cell datasets limit the power to accurately detect colocalization of a variant and an epigenomic state. To overcome this problem of sparsity and noise, our laboratory, alongside collaborators, recently developed an approach termed Single Cell Analysis of Variant Enrichment through Network propagation of GEnomic data (SCAVENGE) (Yu et al., 2022), which uses network propagation to better discern phenotype-relevant cells in sparse single-cell data and map trait- and disease-relevant genetic variation to the appropriate cellular context. Specifically, SCAVENGE allowed us to link variants associated with severe coronavirus disease (COVID-19) to immature CD14+ monocyte populations and map the dynamic changes of acute lymphoblastic leukemia (Box 1) risk predisposition along the B-cell developmental trajectory. These examples highlight the particular relevance of SCAVENGE to studying the influence of genetic variation on blood cells, as both phenotypes were restricted to rare hematopoietic populations with subtle global transcriptional differences compared to their neighbors that were missed in prior bulk and single-cell analyses.
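To make the idea of network propagation concrete, the sketch below runs a generic random walk with restart over a small cell-to-cell similarity graph. It is not the SCAVENGE implementation: the five-cell path graph, the single seed cell and the restart probability of 0.5 are all assumptions chosen so that the behavior is easy to follow.

```python
import numpy as np

def random_walk_with_restart(adjacency, seed_scores, restart=0.5,
                             tol=1e-8, max_iter=1000):
    """Propagate seed scores over a graph until the scores stabilize."""
    adjacency = np.asarray(adjacency, dtype=float)
    # Column-normalize so each column is a probability distribution
    # over the neighbors of that cell.
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    transition = adjacency / col_sums
    p0 = np.asarray(seed_scores, dtype=float)
    p0 = p0 / p0.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Hypothetical graph: five cells connected in a chain (0-1-2-3-4).
graph = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
# Only cell 0 shows strong trait enrichment before propagation.
scores = random_walk_with_restart(graph, seed_scores=[1, 0, 0, 0, 0])
print(np.round(scores, 3))
```

Cells that are close to the seed in the graph receive part of its score even if their own sparse profiles showed no enrichment, which is the intuition behind using propagation to recover signal lost to dropout.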

In the coming years, the emergence of high-resolution single-cell datatypes and datasets (Regev et al., 2017; van der Wijst et al., 2020), as well as the development of new computational tools, will undoubtedly improve our ability to fine-map and prioritize variants. With the increasing availability and affordability of whole-genome sequencing, it is inevitable that most individuals living in the developed world will soon have the opportunity to have their genomes sequenced, and some individuals will harbor variants that increase their risk of various hematopoietic disorders, such as clonal hematopoiesis and blood cancers. Ideally, this genetic information could be used clinically to guide more frequent screening or lifestyle modifications to reduce disease risk. However, to properly inform these clinical recommendations, we must experimentally recapitulate the phenotypic consequences of these genetic variants (MacArthur et al., 2014), which is a tall task given the growing list of putative variants.

Congratulations, you have your list of putative causal variants. Now an important and formidable challenge in validation emerges. The most straightforward approach for validation of a variant associated with blood phenotypes/diseases is to directly investigate the gene product at the RNA or protein level using blood or bone marrow cells from individuals harboring these variants (Wolfe et al., 1982). However, this presumes that appropriate samples are available and that the impacted gene has already been implicated in the phenotype/disease state. Most variants identified from GWASs do not meet these criteria and, often, neither do variants identified through sequencing of rare patients and cohorts. Instead, investigators have classically relied upon molecular cloning and transfection/transduction of the gene of interest into cell lines (Box 2).

Box 2. Traditional approaches for validating genetic variants

Through exogenous delivery of complementary DNA (cDNA), investigators have identified variant-specific defects in transcription, splicing, protein stability and protein/cellular function. This approach can be applied in a high-throughput manner through delivery of cDNA pools containing all possible variants in a gene (Coyote-Maestas et al., 2022; Majithia et al., 2016; Melnikov et al., 2014; Mighell et al., 2018). In theory, these saturation mutagenesis screens could be performed for every human gene, essentially solving the variant-to-function problem for coding variants. However, many coding sequences are too large to clone/transfect, and delivery of an exogenous product often leads to supraphysiologic expression, confounding the biologic interpretation. Furthermore, these high-throughput approaches have been difficult to apply to primary cells, precluding the ability to properly assess the effect of variants on cellular phenotype, particularly in the context of physiologic differentiation or transient cell states.
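The scale of a saturation mutagenesis library follows directly from the length of the coding sequence. The sketch below enumerates every single amino-acid substitution of a short peptide; the nine-residue sequence is an arbitrary toy input, and the function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def single_substitutions(protein):
    """Yield (label, mutant_sequence) for each single-residue substitution.

    Labels use 1-based positions, e.g. 'E7V' replaces the E at position 7 with V.
    """
    for i, ref in enumerate(protein):
        for alt in AMINO_ACIDS:
            if alt != ref:
                yield f"{ref}{i + 1}{alt}", protein[:i] + alt + protein[i + 1:]

toy_peptide = "MVHLTPEEK"  # toy input for illustration only
library = dict(single_substitutions(toy_peptide))
print(len(library))    # 19 substitutions at each of 9 positions
print(library["E7V"])
```

Because each position contributes 19 substitutions, library size grows linearly with protein length, so a 900-residue protein would already require more than 17,000 constructs before any nonsense or indel variants are considered.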

For non-coding variants, the most common classic validation approach has been to use reporter assays in which the regulatory element encompassing the variant is placed upstream of a minimal promoter and a reporter gene, often GFP or luciferase. The success of such assays has led to multiple high-throughput versions that incorporate nucleic acid barcodes and allow for thousands of putative variants/regulatory elements to be assayed in parallel. Aside from a few studies in primary T cells (Bourges et al., 2020; Mouri et al., 2022), these massively parallel reporter assays (MPRAs) have been performed using cell lines. Although MPRAs are effective at identifying variants that abrogate the activity of strong enhancers, the minimal genomic regions profiled preclude the identification of variants with more complicated effects on three-dimensional genome structure (Inoue et al., 2017).
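A barcoded reporter readout is typically summarized by comparing how often each barcode appears in RNA relative to the input DNA. The sketch below shows this calculation for one variant under simplifying assumptions: counts are taken as already normalized for sequencing depth, a pseudocount of 1 is used, and the barcode counts themselves are invented for illustration.

```python
import math
from statistics import median

def allele_activity(barcode_counts, pseudocount=1.0):
    """Median log2(RNA/DNA) across the barcodes tagging one allele.

    barcode_counts is a list of (dna_count, rna_count) tuples.
    """
    ratios = [
        math.log2((rna + pseudocount) / (dna + pseudocount))
        for dna, rna in barcode_counts
    ]
    return median(ratios)

# Hypothetical counts: three barcodes per allele of one enhancer variant.
ref_barcodes = [(100, 400), (80, 330), (120, 470)]
alt_barcodes = [(100, 110), (90, 95), (110, 100)]

ref_activity = allele_activity(ref_barcodes)
alt_activity = allele_activity(alt_barcodes)
allelic_skew = alt_activity - ref_activity
print(round(ref_activity, 2), round(alt_activity, 2), round(allelic_skew, 2))
```

A strongly negative skew, as in this toy case, would be consistent with the alternate allele reducing enhancer activity; using several barcodes per allele and taking the median limits the influence of any single noisy barcode.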

The use of exogenous assays for functional validation has proven extremely useful in identifying causative variants and facilitating a deeper exploration of the biology of specific hematopoietic cell types. Exogenous delivery of complementary DNA has led to the characterization of promoter variants affecting transcription in beta-thalassemia (Box 1) (Orkin et al., 1983) and splicing variants driving alpha-thalassemia (Box 1) (Felber et al., 1982), and has also been used to identify thermolabile variants in aldolase A (Box 1) causing hemolytic anemia (Box 1) (Kishi et al., 1987). More recently, systematic profiling of hundreds of variants in DNA methyltransferase 3A (DNMT3A; Box 1), a key driver of clonal hematopoiesis and myeloid neoplasia (Box 1), identified a key factor regulating DNMT3A turnover (Huang et al., 2022). Massively parallel reporter assays (MPRAs) have also been used to successfully validate putative non-coding variants linked to various blood/immune cell traits, including the identification of an enhancer variant causing downregulation of RNA-binding motif protein 38 (RBM38; Box 1) and a subsequent defect in terminal erythropoiesis (Box 1) (Ulirsch et al., 2016). Unfortunately, the validation rate of putative non-coding variants using MPRAs has been low (Tewhey et al., 2016), likely due to the non-physiological background in which the experiments were performed.

Investigators interested in using MPRAs to screen a set of putative non-coding variants should plan their experiments carefully to mitigate the shortcomings of the approach. These assays can be optimized for use in more physiologically relevant cell types, such as primary hematopoietic stem and progenitor cells (HSPCs), or run in the setting of a physiologically relevant perturbation, such as addition of a stress signal or infectious agent. Alternatively, a landing pad (Box 1) could be used to drop in the reporter construct at a specific genomic locus to mitigate cell-to-cell variation in reporter expression based on the local chromatin milieu (Durrant et al., 2022). However, the prudent approach would be to transition to an endogenous system for the validation of both coding and non-coding variants (Box 3).

Box 3. A Cas-cade of new technologies

Since tools were developed for homologous recombination (Thomas and Capecchi, 1987), endogenous gene targeting has become the gold standard for assessing the effect of mutations on cellular and organismal function. The low efficiency of homologous recombination placed a high cost and time burden on the approach, effectively preventing it from being applied in a high-throughput fashion. The discovery of RNA interference (RNAi) (Lee et al., 1993) and the subsequent hacking of the endogenous silencing machinery using custom small interfering RNA (siRNA)/short hairpin RNA (shRNA) molecules (Fire et al., 1998) created a cost-effective tool for the systematic disruption of endogenous gene activity. The approach is amenable to multiplexing through the creation of siRNA libraries (Moffat et al., 2006). However, its utility is limited to gene knockdown/knockout, and off-target effects can be frequent (Jackson and Linsley, 2010). Consequently, these approaches have been used less frequently since the discovery and adoption of the CRISPR/Cas9 system and derived screening tools by the research community (Deltcheva et al., 2011; Jinek et al., 2012).

The emergence of CRISPR technologies has revolutionized biomedical research (Doudna and Charpentier, 2014). The initial iteration used the Cas9 nuclease and guide RNA to introduce site-specific double-strand DNA (dsDNA) breaks that are repeatedly repaired by non-homologous end joining (NHEJ) before an error occurs, resulting in a short insertion/deletion (indel) at the target site (Fig. 2A) (Brinkman et al., 2018). In coding regions, this indel can cause a frameshift leading to the creation of downstream premature stop codons and consequent nonsense-mediated decay of the transcript or impaired translation into protein. However, unlike RNAi silencing approaches, it can also be used to probe regulatory elements enriched in non-coding variants through the use of multiple guides to generate deletions across such regions (Diao et al., 2017). As an alternative approach to probe regulatory regions, investigators engineered fusion proteins composed of an endonuclease-dead Cas9 (dCas9) fused to transcriptional repressors (CRISPRi) (Gilbert et al., 2013) or activators (CRISPRa) (Gilbert et al., 2014). New versions of the tools (CRISPRoff and CRISPRon) allow for more stable repression or activation of targeted regulatory elements, at least when regulatory elements can be targeted by introducing or removing DNA methylation marks (Amabile et al., 2016; Nuñez et al., 2021).
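Whether an NHEJ-derived indel disrupts a coding sequence depends largely on whether its length is a multiple of three. The sketch below classifies an edit by its change in length and reports the first in-frame stop codon before and after editing; the 18-nucleotide coding sequence and the one-base deletion are invented for illustration, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def classify_indel(ref_allele, edited_allele):
    """Describe an edit by its change in length relative to the reference."""
    delta = len(edited_allele) - len(ref_allele)
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{frame} {kind} ({abs(delta)} bp)"

def first_stop_codon(cds):
    """Return the 1-based codon number of the first in-frame stop, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None

# Hypothetical coding sequence: ATG GCA TTG ACC GGA TAA
reference = "ATGGCATTGACCGGATAA"
# Simulated repair outcome: loss of the A at 0-based index 5.
edited = reference[:5] + reference[6:]

print(classify_indel(reference, edited))
print(first_stop_codon(reference), first_stop_codon(edited))
```

Here the one-base deletion shifts the reading frame and brings a stop codon forward from codon 6 to codon 3, the type of premature termination that can trigger nonsense-mediated decay.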

The CRISPR/Cas9 system was quickly adapted to facilitate homology-directed repair (HDR), allowing for the precise introduction (or repair) of mutations at target loci (Fig. 2A) (Cong et al., 2013; Mali et al., 2013). For the first time, investigators could more readily introduce precise mutations in non-coding or coding elements and assess the effect on gene regulation and cellular phenotype (Ajore et al., 2022). Cas9-mediated HDR is also amenable to high-throughput screening (Findlay et al., 2014, 2018); however, competition between HDR and NHEJ repair pathways leads to complex readouts, complicating the downstream interpretation of these results. Next-generation CRISPR technologies have helped solve this problem by bypassing the need for repair of dsDNA breaks.

Base editing and prime editing are two novel approaches for precision gene editing that avoid creating dsDNA breaks, significantly improving the ratio of precision edits to random indels (Anzalone et al., 2019; Gaudelli et al., 2017; Komor et al., 2016). Base editors come in two categories: cytosine base editors (CBEs), which promote the conversion of a C:G to a T:A base pair, and adenine base editors (ABEs), which convert an A:T into a G:C base pair. Structurally, base editors consist of a deaminase enzyme (APOBEC1 in the case of CBEs and TadA in the case of ABEs) fused to a mutant Cas9 that either introduces single-strand cuts (Cas9 nickase) or has no cleavage activity (Fig. 2B). Because the deaminase activity of the fusion protein is dictated by proximity to the target nucleotide (A or C), only a specific set of positions within the guide sequence is ‘editable’, and combinatorial editing can occur, occasionally leading to the introduction of multiple non-synonymous mutations in the same allele.
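The consequence of an editing window is easiest to see by enumerating the possible outcomes for a given guide. The sketch below lists the editable bases and every combination of edits for a 20-nucleotide protospacer. The window of positions 4 to 8 (counted from the PAM-distal end) is an assumption that approximates commonly reported windows and varies between editor versions; the protospacer sequence is invented.

```python
from itertools import combinations

def base_edit_outcomes(protospacer, editor="CBE", window=(4, 8)):
    """List editable positions (1-based) and all possible edited sequences.

    The editing window is an assumed, editor-dependent parameter.
    """
    target, product = ("C", "T") if editor == "CBE" else ("A", "G")
    start, end = window
    positions = [i for i in range(start - 1, end) if protospacer[i] == target]
    outcomes = []
    # Every non-empty subset of editable bases is a possible allele.
    for k in range(1, len(positions) + 1):
        for combo in combinations(positions, k):
            seq = list(protospacer)
            for i in combo:
                seq[i] = product
            outcomes.append("".join(seq))
    return [i + 1 for i in positions], outcomes

guide = "GACCTAGCTTGGAACTGCAT"  # hypothetical protospacer
cbe_positions, cbe_outcomes = base_edit_outcomes(guide, editor="CBE")
abe_positions, abe_outcomes = base_edit_outcomes(guide, editor="ABE")
print(cbe_positions, cbe_outcomes)
print(abe_positions, abe_outcomes)
```

With two cytosines in the window, a CBE can yield three distinct edited alleles from this guide, so a screen readout has to distinguish the intended edit from bystander edits.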

Prime editors consist of a Cas9 nickase fused to a reverse transcriptase domain, utilizing the template provided on a modified guide RNA to introduce specific DNA changes at the target site (Fig. 2C) (Anzalone et al., 2019). Although the rate of NHEJ is slightly higher than with base editors, the ability to introduce specific mutations without restrictions with respect to position within the guide sequence represents a major advantage of this technology. The technology has recently been adapted to facilitate the creation of large desired deletions (Choi et al., 2022) and insertions (Anzalone et al., 2022), and it is likely that new versions will emerge with improved on-target efficiency and lower rates of NHEJ, similar to updated versions of base editors (Koblan et al., 2018).
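How the template on the modified guide RNA relates to the target sequence can be sketched for a single-base substitution. The example assumes an SpCas9-based editor that nicks the PAM-containing strand 3 nucleotides upstream of the PAM, a 13-nucleotide primer binding site and a 10-nucleotide reverse transcriptase template; these lengths, the function names and the target sequence are illustrative assumptions, and practical designs optimize them empirically.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]

def peg_extension(pam_strand, spacer_start, edit_index, new_base,
                  pbs_len=13, rtt_len=10):
    """Return the 3' extension (RT template + PBS) as RNA, written 5'->3'.

    pam_strand is the PAM-containing strand (5'->3'); spacer_start and
    edit_index are 0-based indices into it. Lengths are assumed defaults.
    """
    nick = spacer_start + 17  # first base 3' of the nick
    if not nick <= edit_index < nick + rtt_len:
        raise ValueError("edit must lie within the RT template span")
    if nick - pbs_len < 0 or nick + rtt_len > len(pam_strand):
        raise ValueError("sequence too short for the requested lengths")
    edited = pam_strand[:edit_index] + new_base + pam_strand[edit_index + 1:]
    # The PBS pairs with the nicked strand immediately 5' of the nick.
    pbs = reverse_complement(pam_strand[nick - pbs_len:nick])
    # The RT template encodes the edited sequence 3' of the nick.
    rtt = reverse_complement(edited[nick:nick + rtt_len])
    return (rtt + pbs).replace("T", "U")

# Hypothetical target: 10 nt flank + 20 nt protospacer + TGG PAM + 10 nt flank.
site = "ACGTACGTAC" + "GACCTAGCTTGGAACTGCAT" + "TGG" + "ACCTGATCGA"
extension = peg_extension(site, spacer_start=10, edit_index=28, new_base="G")
print(extension)
```

Because the edit only has to fall within the span copied by the reverse transcriptase, its position is not tied to a narrow window within the guide, in contrast to base editing.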

Fig. 2.

CRISPR/Cas9-based tools and delivery approaches. (A) Cas9 endonucleases form complexes with single-guide RNAs (sgRNA) by binding to the constant scaffold region. The Cas9/sgRNA complex then scans the genome until it identifies a region with complementary DNA to the spacer sequence (first 20 nt of the sgRNA). Transient binding of the sgRNA to its cognate target DNA facilitates the formation of a double-strand DNA (dsDNA) break. Repeat cycles of cleavage and repair with non-homologous end joining (NHEJ) eventually lead to mis-repair through the addition or removal of nucleotides (left arrow). By delivering an alternative homologous template for repair, one can introduce specific edits at the target site through homology-directed repair (HDR; right arrow). (B) Base editing avoids the formation of dsDNA breaks by using a mutant Cas9 ‘nickase’, which is only capable of creating a single-stranded nick. Adenine base editors (ABE; left) consist of a Cas9 nickase fused to two copies of tRNA adenine deaminase (TadA), one of which (TadA*) is mutated to accept DNA as a substrate, leading to deamination of adenines to inosines (treated as guanine by DNA polymerase). Cytosine base editors (CBE; right) are composed of Cas9 nickase fused to APOBEC1, a cytidine deaminase that converts cytidine to uracil (treated as thymine by DNA polymerase). The addition of two uracil-DNA glycosylase inhibitor (UGI) domains to the fusion protein prevents the conversion of uracil back to cytidine. Base-editor technologies use the same sgRNAs as the traditional Cas9 endonuclease approach. The deamination reactions lead to one or more DNA mismatches at the target site, which are repaired by the endogenous mismatch repair machinery (bottom). (C) Prime editors consist of a Cas9 nickase fused to a reverse transcriptase (RT), which utilizes the sequence provided in a specialized prime editing guide RNA (pegRNA) to introduce a specific edit at the target site. The pegRNA contains the spacer and scaffold sequences used in sgRNAs but adds a 3′ extension comprising an RT template that encodes the desired edit and a primer binding site (PBS) complementary to the nicked strand (top). Upon binding of the PBS to the free end of the nicked DNA strand, the fused reverse transcriptase incorporates the edit and any additional homologous sequence encoded in the RT template into the nicked DNA strand (middle). Upon DNA re-annealing, a 5′ flap is present, which is cleaved by endogenous exonucleases. The edit is then copied to the opposite strand by mismatch repair machinery or during DNA replication (bottom). A, adenine; APOBEC1, apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1; C, cytosine; Cas9, CRISPR-associated protein 9; G, guanine; I, inosine; U, uracil; T, thymine.

Endogenous gene targeting has traditionally been limited due to the low efficiency of homologous recombination. A solution emerged with the discovery of RNA silencing, which has been used to study the effect of putative genes driving Diamond-Blackfan anemia (Box 1) (Ebert et al., 2005), the regulation of fetal hemoglobin (Box 1) (Sankaran et al., 2008), and variant effects from GWASs on clotting disorders (de Vries et al., 2019), red blood cell traits (Nandakumar et al., 2019) and immune dysregulation (Box 1) (Peters et al., 2017).

The CRISPR/Cas9 system (Fig. 2) offers a more reliable and stable alternative to RNA silencing and has become the gold standard for validating the effect of variants on blood cell phenotypes through targeted gene knockout (Anderson et al., 2019; Giani et al., 2016), disruption of cis-regulatory elements (Guo et al., 2017) or the precise introduction of variants using homology-directed repair (HDR) (Wienert et al., 2017). The clinical applications of the technology are perhaps even more exciting. CRISPR/Cas9-based therapies are currently being tested for the amelioration of blood disorders in a rapidly growing number of clinical trials (Table 1) (Kanter et al., 2021), and a few are close to garnering U.S. Food and Drug Administration (FDA) approval. These include exa-cel, which uses CRISPR/Cas9 to disrupt the erythroid-specific B-cell lymphoma/leukemia 11A (BCL11A; Box 1) enhancer and thus reactivate the production of fetal hemoglobin in autologous CD34+ HSPCs. This approach is a promising treatment for sickle cell disease (Box 1) and beta-thalassemia (Frangoul et al., 2021).

Table 1.

CRISPR/Cas9 therapies targeting hematopoietic cells in clinical trials


Although CRISPR allows for the precise introduction of individual variants at endogenous loci, overcoming the variant-to-function bottleneck requires a way to edit thousands of putative sites in a single experiment. Fortunately, CRISPR is amenable to multiplexing, allowing investigators to probe multiple regulatory elements in parallel, such as BCL11A enhancers or the cis-regulatory elements in the HBS1L-MYB intergenic region, both associated with fetal hemoglobin and other red blood cell phenotypes (Canver et al., 2015, 2017). However, these nuclease-based screens are imprecise, relying on the random incorporation of small insertions/deletions at the site of a putative variant or the generation of large deletions surrounding the variant using a paired guide approach. Multiple second-generation CRISPR tools have been developed that help circumvent the imprecision of early CRISPR screens. CRISPRa and CRISPRi allow for site-specific recruitment of activating or repressive epigenetic machinery, respectively, allowing researchers to probe thousands of putative regulatory elements associated with blood cell traits in a single experiment (Morris et al., 2021 preprint; Nasser et al., 2021). Base editing has also proven to be an effective tool for massively parallel mutation scanning, enabling validation of GWAS hits (Cuella-Martin et al., 2021; Hanna et al., 2021) and saturation mutagenesis of individual genes (Lue et al., 2023; Sangree et al., 2022). Similar to CRISPR/Cas9, base editors have already entered the clinic, with two clinical trials for sickle cell disease and leukemia actively recruiting patients (Kingwell, 2022). Finally, prime editing, which represents the most precise endonuclease-mediated repair process to date, has been successfully applied to primary human hematopoietic cells (Petri et al., 2022) and used for saturation editing (Erwood et al., 2022), and will likely be used to screen GWAS hits in the near future.
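
To make the idea of a base-editor screen concrete, the following is a minimal sketch of how one might shortlist guides that place a variant of interest inside a base editor's activity window. It is an illustration under stated assumptions, not a method taken from the studies cited above: it assumes an NGG protospacer-adjacent motif immediately 3′ of a 20-nt protospacer (as for the commonly used Streptococcus pyogenes Cas9) and an editing window spanning protospacer positions 4-8, which varies between editors. The names `find_base_editor_guides`, `Guide` and the `window` parameter are hypothetical.

```python
from dataclasses import dataclass

# Base that each editor class deaminates on the protospacer strand.
EDITABLE_BASE = {"CBE": "C", "ABE": "A"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Guide:
    spacer: str      # 20-nt protospacer (DNA alphabet), 5'->3'
    strand: str      # "+" if the protospacer lies on the input strand, else "-"
    target_pos: int  # position of the variant in the protospacer (1 = PAM-distal)
    bystanders: int  # other editable bases inside the assumed window


def find_base_editor_guides(seq, variant_index, editor="CBE", window=(4, 8)):
    """List guides that place seq[variant_index] inside the editing window.

    Assumptions (illustrative defaults): an NGG PAM directly 3' of a 20-nt
    protospacer, and an activity window at protospacer positions 4-8.
    """
    base = EDITABLE_BASE[editor]
    seq = seq.upper()
    guides = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # Coordinate of the variant on the strand being scanned.
        idx = variant_index if strand == "+" else len(seq) - 1 - variant_index
        if s[idx] != base:
            continue  # this editor cannot act on the variant from this strand
        for start in range(len(s) - 22):
            if s[start + 21:start + 23] != "GG":
                continue  # no NGG PAM after this 20-mer
            pos = idx - start + 1
            if window[0] <= pos <= window[1]:
                in_window = s[start + window[0] - 1:start + window[1]]
                guides.append(
                    Guide(s[start:start + 20], strand, pos, in_window.count(base) - 1)
                )
    return guides


# Toy sequence: a single C at index 9, followed downstream by a TGG PAM.
example = "TTTT" + "GATTACAGATTGATTGATTA" + "TGG" + "TTTT"
for guide in find_base_editor_guides(example, variant_index=9, editor="CBE"):
    print(guide)
```

For the toy sequence, the sketch reports a single plus-strand guide with the target cytosine at protospacer position 6 and no bystander cytosines in the window. A real screen design would also need to account for off-target sites, editor-specific windows and sequence-context preferences, none of which are modeled here.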

Recent technological advances have facilitated more precise control in manipulating the genomes of mammalian cells. It is only a matter of time before high-throughput base or prime editor screens successfully interrogate the human genome at base-pair resolution. This will undoubtedly be a major step towards solving the variant-to-function problem in genomics. However, most variants uncovered through GWASs are non-coding variants predicted to fall into regulatory elements with tissue-specific and highly restricted activity. Indeed, this may underlie why many expression quantitative trait loci have had limited value for elucidating disease-causal variants identified through GWASs (Mostafavi et al., 2022 preprint). The use of primary hematopoietic cells is ideal in this circumstance, but has historically been limited due to difficulty in maintaining these cells in culture and low transfection efficiencies with considerable toxicity. Fortunately, new protocols for culture of HSPCs have been developed that allow short-term maintenance of stem and progenitor populations, as well as the directed differentiation into various lineages through the addition of cytokine cocktails, allowing investigators to study lineage-specific effects of genetic variation.

HSPCs can now be expanded more than 30-fold during short-term culture through the addition of the small molecules SR1, an aryl hydrocarbon receptor antagonist (Boitano et al., 2010), and UM171, a pyrimidoindole derivative (Fares et al., 2014) that promotes human HSPC self-renewal by altering epigenomic reprogramming (Subramaniam et al., 2020). This expansion enables large-scale screens of individual umbilical cord blood or mobilized peripheral blood samples. If left to their own devices, HSPCs maintained in vitro will eventually differentiate into mature blood cells with a significant, if not complete, myeloid skew. However, there are protocols for directed differentiation of HSPCs into almost every mature blood/immune cell lineage (Fig. 1).

With the ability to expand human HSPCs and differentiate them into almost any lineage, one would think that hacking hematopoiesis with CRISPR would be a matter of ‘plug and play’. However, it has taken time for CRISPR tools to be optimized for use in primary hematopoietic cells. Early attempts to edit HSPCs using the CRISPR/Cas9 system utilized lentiviral transduction (Heckl et al., 2014) or plasmid DNA transfection (Mandal et al., 2014) (Fig. 3A). These strategies allowed for targeted gene disruption of clinically relevant targets, such as beta-2 microglobulin (B2M; Box 1) and C-C chemokine receptor type 5 (CCR5; Box 1). However, high toxicity limited plasmid transfection approaches and low-efficiency knockout was observed with lentiviral approaches, possibly due to HSPC intolerance to constitutive expression of Cas9. These complications prevented the high-efficiency gene disruption needed to screen phenotypes on the bulk population of cells. An alternative strategy emerged, driven by success in mouse embryo knockout experiments, whereby the raw components of CRISPR [Cas9 protein and the single-guide RNA (sgRNA)] were directly delivered into HSPCs through electroporation (Gundry et al., 2016; Hendel et al., 2015) (Fig. 3A). This approach has proven effective for studying the phenotypic effect of gene disruption in HSPCs (Nakauchi et al., 2022), as well as for therapeutic gene editing of hematopoietic cells, including all of the clinical trials that use CRISPR to edit HSPCs in patients with beta-thalassemia and sickle cell disease (Frangoul et al., 2021) (Table 1). Importantly, the approach does not require phenotypic readout in HSPC populations, as investigators have edited HSPCs and differentiated them into erythroid (Wu et al., 2019) or neutrophil populations (Rao et al., 2021) prior to exploring a phenotype. The delivery of templates for HDR into HSPCs has also been fine-tuned (Fig. 3B). 
Initial attempts to deliver plasmids or single-stranded oligodeoxynucleotides (ssODNs) were successful, but demonstrated low-efficiency knock-in. More recently, investigators have found that adeno-associated virus (AAV)-mediated delivery of HDR templates leads to higher knock-in efficiencies and an improved ratio of HDR to non-homologous end joining (NHEJ) (Romero et al., 2019). However, there remains some controversy as to whether long-term HSCs are more effectively repaired by ssODN or AAV templates, as in vitro and in vivo data have not always correlated (Pattabhi et al., 2019).

Fig. 3.

CRISPR delivery systems. (A) Approaches for delivery of CRISPR products into hematopoietic cells. (B) Templates utilized for homology-directed repair (HDR). AAV, adeno-associated virus; Cas9, CRISPR-associated protein 9; sgRNA, single-guide RNA; ssODN, single-stranded oligodeoxynucleotide.

Michael Gundry (left) and Vijay G. Sankaran (right)

The final step in optimizing CRISPR or newer generation genome-editing tools for use in HSPCs is ongoing, with a number of groups currently working on developing protocols for high-content CRISPR or other editing screens in primary cells and with potential single-cell readouts (Bock et al., 2022). Nevertheless, the impact of efficient gene-editing of human HSPCs on the field of hematopoiesis has already been enormous. In a short timeframe, we have seen multiple instances in which these technologies have been used to model or explore disease pathogenesis and the same strategy used to model the disease has become the potential cure (Table 1). As we develop methods for targeting HSPCs in their native environment through in vivo delivery of CRISPR products, the true impact of these discoveries on our understanding and treatment of blood diseases will be felt.

The continued optimization of novel gene-editing approaches in primary cells will allow for the simultaneous assessment of thousands of variants in a physiologically relevant setting. By combining these techniques with new tools for cellular barcoding (Ludwig et al., 2019; Sankaran et al., 2022), we can better characterize the clonal dynamics of native hematopoiesis and discover the mechanisms through which germline or somatically acquired mutations perturb this complex and dynamic process (Qiu et al., 2022). For instance, single-cell omics and clonal tracking were recently used to study how a common somatically acquired mutation found in clonal hematopoiesis perturbs early myeloid progenitor states through alterations in CpG methylation (Nam et al., 2022).

It remains to be seen how effective this approach will be for modeling polygenic traits and diseases with subtle in vitro phenotypes. Many hematopoietic alterations, such as clonal hematopoiesis, take decades to manifest as a clinical phenotype, such as a blood cancer, if they do so at all (Fabre et al., 2022; Robertson et al., 2022). It is unclear whether phenotypic changes will declare themselves at early time points. Xenotransplantation models, whereby edited human hematopoietic stem or progenitor cells are transplanted into immunocompromised mice, can technically be used to maintain edited cells for longer intervals, but studies have shown that only a small fraction of transplanted human HSPCs engraft in these mice, at least in earlier models (Cheung et al., 2013; Sharma et al., 2021). This limits the ability to screen large numbers of variants, but may be improved with new models that enable higher levels of engraftment with more diverse cell types (Cosgun et al., 2014; Martinov et al., 2021; McIntosh et al., 2015; Sargent et al., 2022; Song et al., 2021). Alternatively, screening for molecular phenotypes using assays such as Perturb-seq (Adamson et al., 2016) may uncover perturbed gene regulatory networks within mutant clones that have not yet manifested as obvious cellular phenotypes.
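
The engraftment bottleneck described above can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a deliberately simplified model that is not drawn from the cited studies: each engrafting cell carries exactly one of N library variants, chosen uniformly at random and with no fitness differences, so the expected fraction of the library represented among k engrafting cells is 1 - (1 - 1/N)^k. The function names and the example numbers are hypothetical.

```python
import math


def expected_library_representation(n_variants: int, n_engrafting_cells: int) -> float:
    """Expected fraction of variants carried by at least one engrafting cell."""
    if n_variants < 1 or n_engrafting_cells < 0:
        raise ValueError("need n_variants >= 1 and n_engrafting_cells >= 0")
    return 1.0 - (1.0 - 1.0 / n_variants) ** n_engrafting_cells


def engrafting_cells_needed(n_variants: int, target_fraction: float) -> int:
    """Smallest number of engrafting cells expected to capture target_fraction."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    if n_variants == 1:
        return 1
    k = math.ceil(math.log(1.0 - target_fraction) / math.log(1.0 - 1.0 / n_variants))
    # Guard against floating-point rounding at the boundary.
    while expected_library_representation(n_variants, k) < target_fraction:
        k += 1
    while k > 0 and expected_library_representation(n_variants, k - 1) >= target_fraction:
        k -= 1
    return k


# Hypothetical library of 1,000 variants and a range of engrafting-cell numbers.
for cells in (100, 1_000, 10_000):
    fraction = expected_library_representation(1_000, cells)
    print(f"{cells} engrafting cells: {fraction:.1%} of variants expected to be represented")
```

Under this toy model, representation depends on the number of cells that engraft rather than on the number transplanted. A useful screen also needs many independent cells per variant, not just one, so the true requirement is considerably higher than this lower bound suggests.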

During the past 50 years, we have deciphered the human genome and developed tools for precise manipulation of its individual elements. The next 50 years will likely be defined by applying our knowledge of the genetic basis of disease through personalized pharmacogenomics and therapeutic genome editing. As we work to comprehensively define the contribution of common and rare genetic variation to blood and immune diseases, we should aim to keep the translational implications of our work front and center. The genomic era in medicine has arrived.

We thank members of the Sankaran laboratory for valuable comments and discussion on the manuscript. We apologize for our inability to cite many relevant papers in this field, given space limitations.

Funding

The Sankaran laboratory is supported by the New York Stem Cell Foundation (NYSCF), a gift from the Lodish Family to Boston Children's Hospital, the Edward P. Evans Foundation, the MPN Research Foundation, the Ellison Medical Foundation and the National Institutes of Health [R01 DK103794, R01 CA265726, R01 HL146500 (V.G.S.)]. V.G.S. is a NYSCF Robertson Investigator.

Adamson
,
B.
,
Norman
,
T. M.
,
Jost
,
M.
,
Cho
,
M. Y.
,
Nuñez
,
J. K.
,
Chen
,
Y.
,
Villalta
,
J. E.
,
Gilbert
,
L. A.
,
Horlbeck
,
M. A.
,
Hein
,
M. Y.
et al. 
(
2016
).
A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response
.
Cell
167
,
1867
-
1882.e21
.
Adzhubei
,
I. A.
,
Schmidt
,
S.
,
Peshkin
,
L.
,
Ramensky
,
V. E.
,
Gerasimova
,
A.
,
Bork
,
P.
,
Kondrashov
,
A. S.
and
Sunyaev
,
S. R.
(
2010
).
A method and server for predicting damaging missense mutations
.
Nat. Methods
7
,
248
-
249
.
Ajore
,
R.
,
Niroula
,
A.
,
Pertesi
,
M.
,
Cafaro
,
C.
,
Thodberg
,
M.
,
Went
,
M.
,
Bao
,
E. L.
,
Duran-Lozano
,
L.
,
Lopez de Lapuente Portilla
,
A.
,
Olafsdottir
,
T.
et al. 
(
2022
).
Functional dissection of inherited non-coding variation influencing multiple myeloma risk
.
Nat. Commun.
13
,
151
.
Alexander
,
W. S.
,
Roberts
,
A. W.
,
Nicola
,
N. A.
,
Li
,
R.
and
Metcalf
,
D.
(
1996
).
Deficiencies in progenitor cells of multiple hematopoietic lineages and defective megakaryocytopoiesis in mice lacking the thrombopoietic receptor c-Mpl
.
Blood
87
,
2162
-
2170
.
Amabile
,
A.
,
Migliara
,
A.
,
Capasso
,
P.
,
Biffi
,
M.
,
Cittaro
,
D.
,
Naldini
,
L.
and
Lombardo
,
A.
(
2016
).
Inheritable silencing of endogenous genes by hit-and-run targeted epigenetic editing
.
Cell
167
,
219
-
232.e14
.
Anderson
,
W.
,
Thorpe
,
J.
,
Long
,
S. A.
and
Rawlings
,
D. J.
(
2019
).
Efficient CRISPR/Cas9 disruption of autoimmune-associated genes reveals key signaling programs in primary human T cells
.
J. Immunol.
203
,
3166
-
3178
.
Anzalone
,
A. V.
,
Randolph
,
P. B.
,
Davis
,
J. R.
,
Sousa
,
A. A.
,
Koblan
,
L. W.
,
Levy
,
J. M.
,
Chen
,
P. J.
,
Wilson
,
C.
,
Newby
,
G. A.
,
Raguram
,
A.
et al. 
(
2019
).
Search-and-replace genome editing without double-strand breaks or donor DNA
.
Nature
576
,
149
-
157
.
Anzalone
,
A. V.
,
Gao
,
X. D.
,
Podracky
,
C. J.
,
Nelson
,
A. T.
,
Koblan
,
L. W.
,
Raguram
,
A.
,
Levy
,
J. M.
,
Mercer
,
J. A. M.
and
Liu
,
D. R.
(
2022
).
Programmable deletion, replacement, integration and inversion of large DNA sequences with twin prime editing
.
Nat. Biotechnol.
40
,
731
-
740
.
Bao
,
E. L.
,
Nandakumar
,
S. K.
,
Liao
,
X.
,
Bick
,
A. G.
,
Karjalainen
,
J.
,
Tabaka
,
M.
,
Gan
,
O. I.
,
Havulinna
,
A. S.
,
Kiiskinen
,
T. T. J.
,
Lareau
,
C. A.
et al. 
(
2020
).
Inherited myeloproliferative neoplasm risk affects haematopoietic stem cells
.
Nature
586
,
769
-
775
.
Bernstein
,
B. E.
,
Stamatoyannopoulos
,
J. A.
,
Costello
,
J. F.
,
Ren
,
B.
,
Milosavljevic
,
A.
,
Meissner
,
A.
,
Kellis
,
M.
,
Marra
,
M. A.
,
Beaudet
,
A. L.
,
Ecker
,
J. R.
et al. 
(
2010
).
The NIH roadmap epigenomics mapping consortium
.
Nat. Biotechnol.
28
,
1045
-
1048
.
Bick
,
A. G.
,
Weinstock
,
J. S.
,
Nandakumar
,
S. K.
,
Fulco
,
C. P.
,
Bao
,
E. L.
,
Zekavat
,
S. M.
,
Szeto
,
M. D.
,
Liao
,
X.
,
Leventhal
,
M. J.
,
Nasser
,
J.
et al. 
(
2020
).
Inherited causes of clonal haematopoiesis in 97,691 whole genomes
.
Nature
586
,
763
-
768
.
Bock
,
C.
,
Datlinger
,
P.
,
Chardon
,
F.
,
Coelho
,
M. A.
,
Dong
,
M. B.
,
Lawson
,
K. A.
,
Lu
,
T.
,
Maroc
,
L.
,
Norman
,
T. M.
,
Song
,
B.
et al. 
(
2022
).
High-content CRISPR screening
.
Nat. Rev. Methods Primers
2
,
8
.
Boitano
,
A. E.
,
Wang
,
J.
,
Romeo
,
R.
,
Bouchez
,
L. C.
,
Parker
,
A. E.
,
Sutton
,
S. E.
,
Walker
,
J. R.
,
Flaveny
,
C. A.
,
Perdew
,
G. H.
,
Denison
,
M. S.
et al. 
(
2010
).
Aryl hydrocarbon receptor antagonists promote the expansion of human hematopoietic stem cells
.
Science
329
,
1345
-
1348
.
Bourges
,
C.
,
Groff
,
A. F.
,
Burren
,
O. S.
,
Gerhardinger
,
C.
,
Mattioli
,
K.
,
Hutchinson
,
A.
,
Hu
,
T.
,
Anand
,
T.
,
Epping
,
M. W.
,
Wallace
,
C.
et al. 
(
2020
).
Resolving mechanisms of immune-mediated disease in primary CD4 T cells
.
EMBO Mol. Med.
12
,
e12112
.
Brinkman
,
E. K.
,
Chen
,
T.
,
de Haas
,
M.
,
Holland
,
H. A.
,
Akhtar
,
W.
and
van Steensel
,
B.
(
2018
).
Kinetics and fidelity of the repair of Cas9-induced double-strand DNA breaks
.
Mol. Cell
70
,
801
-
813.e6
.
Brown
,
D. W.
,
Cato
,
L. D.
,
Zhao
,
Y.
,
Nandakumar
,
S. K.
,
Bao
,
E. L.
,
Rehling
,
T.
,
Song
,
L.
,
Yu
,
K.
,
Chanock
,
S. J.
,
Perry
,
J. R. B.
et al. 
(
2022
).
Shared and distinct genetic etiologies for different types of clonal hematopoiesis
.
bioRxiv
2022.03.14.483644
.
Bycroft
,
C.
,
Freeman
,
C.
,
Petkova
,
D.
,
Band
,
G.
,
Elliott
,
L. T.
,
Sharp
,
K.
,
Motyer
,
A.
,
Vukcevic
,
D.
,
Delaneau
,
O.
,
O'Connell
,
J.
et al. 
(
2018
).
The UK Biobank resource with deep phenotyping and genomic data
.
Nature
562
,
203
-
209
.
Cano-Gamez
,
E.
and
Trynka
,
G.
(
2020
).
From GWAS to function: using functional genomics to identify the mechanisms underlying complex diseases
.
Front. Genet.
11
,
424
.
Canver
,
M. C.
,
Smith
,
E. C.
,
Sher
,
F.
,
Pinello
,
L.
,
Sanjana
,
N. E.
,
Shalem
,
O.
,
Chen
,
D. D.
,
Schupp
,
P. G.
,
Vinjamur
,
D. S.
,
Garcia
,
S. P.
et al. 
(
2015
).
BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis
.
Nature
527
,
192
-
197
.
Canver
,
M. C.
,
Lessard
,
S.
,
Pinello
,
L.
,
Wu
,
Y.
,
Ilboudo
,
Y.
,
Stern
,
E. N.
,
Needleman
,
A. J.
,
Galactéros
,
F.
,
Brugnara
,
C.
,
Kutlar
,
A.
et al. 
(
2017
).
Variant-aware saturating mutagenesis using multiple Cas9 nucleases identifies regulatory elements at trait-associated loci
.
Nat. Genet.
49
,
625
-
634
.
Chen
,
M.-H.
,
Raffield
,
L. M.
,
Mousas
,
A.
,
Sakaue
,
S.
,
Huffman
,
J. E.
,
Moscati
,
A.
,
Trivedi
,
B.
,
Jiang
,
T.
,
Akbari
,
P.
,
Vuckovic
,
D.
et al. 
(
2020
).
Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations
.
Cell
182
,
1198
-
1213.e14
.
Cheung
,
A. M. S.
,
Nguyen
,
L. V.
,
Carles
,
A.
,
Beer
,
P.
,
Miller
,
P. H.
,
Knapp
,
D. J. H. F.
,
Dhillon
,
K.
,
Hirst
,
M.
and
Eaves
,
C. J.
(
2013
).
Analysis of the clonal growth and differentiation dynamics of primitive barcoded human cord blood cells in NSG mice
.
Blood
122
,
3129
-
3137
.
Choi
,
J.
,
Chen
,
W.
,
Suiter
,
C. C.
,
Lee
,
C.
,
Chardon
,
F. M.
,
Yang
,
W.
,
Leith
,
A.
,
Daza
,
R. M.
,
Martin
,
B.
and
Shendure
,
J.
(
2022
).
Precise genomic deletions using paired prime editing
.
Nat. Biotechnol.
40
,
218
-
226
.
Cong
,
L.
,
Ran
,
F. A.
,
Cox
,
D.
,
Lin
,
S.
,
Barretto
,
R.
,
Habib
,
N.
,
Hsu
,
P. D.
,
Wu
,
X.
,
Jiang
,
W.
,
Marraffini
,
L. A.
et al. 
(
2013
).
Multiplex genome engineering using CRISPR/Cas systems
.
Science
339
,
819
-
823
.
Cosgun
,
K. N.
,
Rahmig
,
S.
,
Mende
,
N.
,
Reinke
,
S.
,
Hauber
,
I.
,
Schäfer
,
C.
,
Petzold
,
A.
,
Weisbach
,
H.
,
Heidkamp
,
G.
,
Purbojo
,
A.
et al. 
(
2014
).
Kit regulates HSC engraftment across the human-mouse species barrier
.
Cell Stem Cell
15
,
227
-
238
.
Coyote-Maestas
,
W.
,
Nedrud
,
D.
,
He
,
Y.
and
Schmidt
,
D.
(
2022
).
Determinants of trafficking, conduction, and disease within a K+ channel revealed through multiparametric deep mutational scanning
.
Elife
11
,
e76903
.
Cuella-Martin
,
R.
,
Hayward
,
S. B.
,
Fan
,
X.
,
Chen
,
X.
,
Huang
,
J.-W.
,
Taglialatela
,
A.
,
Leuzzi
,
G.
,
Zhao
,
J.
,
Rabadan
,
R.
,
Lu
,
C.
et al. 
(
2021
).
Functional interrogation of DNA damage response variants with base editing screens
.
Cell
184
,
1081
-
1097.e19
.
de Vries
,
P. S.
,
Sabater-Lleal
,
M.
,
Huffman
,
J. E.
,
Marten
,
J.
,
Song
,
C.
,
Pankratz
,
N.
,
Bartz
,
T. M.
,
de Haan
,
H. G.
,
Delgado
,
G. E.
,
Eicher
,
J. D.
et al. 
(
2019
).
A genome-wide association study identifies new loci for factor VII and implicates factor VII in ischemic stroke etiology
.
Blood
133
,
967
-
977
.
Deltcheva
,
E.
,
Chylinski
,
K.
,
Sharma
,
C. M.
,
Gonzales
,
K.
,
Chao
,
Y.
,
Pirzada
,
Z. A.
,
Eckert
,
M. R.
,
Vogel
,
J.
and
Charpentier
,
E.
(
2011
).
CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III
.
Nature
471
,
602
-
607
.
Diao
,
Y.
,
Fang
,
R.
,
Li
,
B.
,
Meng
,
Z.
,
Yu
,
J.
,
Qiu
,
Y.
,
Lin
,
K. C.
,
Huang
,
H.
,
Liu
,
T.
,
Marina
,
R. J.
et al. 
(
2017
).
A tiling-deletion-based genetic screen for cis-regulatory element identification in mammalian cells
.
Nat. Methods
14
,
629
-
635
.
Doudna
,
J. A.
and
Charpentier
,
E.
(
2014
).
Genome editing. The new frontier of genome engineering with CRISPR-Cas9
.
Science
346
,
1258096
.
Dubois
,
F.
,
Gaignerie
,
A.
,
Flippe
,
L.
,
Heslan
,
J.-M.
,
Tesson
,
L.
,
Chesneau
,
M.
,
Haspot
,
F.
,
Conchon
,
S.
,
David
,
L.
and
Brouard
,
S.
(
2020
).
Toward a better definition of hematopoietic progenitors suitable for B cell differentiation
.
PLoS One
15
,
e0243769
.
Durrant
,
M. G.
,
Fanton
,
A.
,
Tycko
,
J.
,
Hinks
,
M.
,
Chandrasekaran
,
S. S.
,
Perry
,
N. T.
,
Schaepe
,
J.
,
Du
,
P. P.
,
Lotfy
,
P.
,
Bassik
,
M. C.
et al. 
(
2022
).
Systematic discovery of recombinases for efficient integration of large DNA sequences into the human genome
.
Nat. Biotechnol.
Ebert
,
B. L.
,
Lee
,
M. M.
,
Pretz
,
J. L.
,
Subramanian
,
A.
,
Mak
,
R.
,
Golub
,
T. R.
and
Sieff
,
C. A.
(
2005
).
An RNA interference model of RPS19 deficiency in Diamond-Blackfan anemia recapitulates defective hematopoiesis and rescue by dexamethasone: identification of dexamethasone-responsive genes by microarray
.
Blood
105
,
4620
-
4626
.
ENCODE Project Consortium.
(
2004
).
The ENCODE (ENCyclopedia Of DNA Elements) project
.
Science
306
,
636
-
640
.
Erwood
,
S.
,
Bily
,
T. M. I.
,
Lequyer
,
J.
,
Yan
,
J.
,
Gulati
,
N.
,
Brewer
,
R. A.
,
Zhou
,
L.
,
Pelletier
,
L.
,
Ivakine
,
E. A.
and
Cohn
,
R. D.
(
2022
).
Saturation variant interpretation using CRISPR prime editing
.
Nat. Biotechnol.
40
,
885
-
895
.
Fabre
,
M. A.
,
de Almeida
,
J. G.
,
Fiorillo
,
E.
,
Mitchell
,
E.
,
Damaskou
,
A.
,
Rak
,
J.
,
Orrù
,
V.
,
Marongiu
,
M.
,
Chapman
,
M. S.
,
Vijayabaskar
,
M. S.
et al. 
(
2022
).
The longitudinal dynamics and natural history of clonal haematopoiesis
.
Nature
606
,
335
-
342
.
Fares
,
I.
,
Chagraoui
,
J.
,
Gareau
,
Y.
,
Gingras
,
S.
,
Ruel
,
R.
,
Mayotte
,
N.
,
Csaszar
,
E.
,
Knapp
,
D. J. H. F.
,
Miller
,
P.
,
Ngom
,
M.
et al. 
(
2014
).
Cord blood expansion. Pyrimidoindole derivatives are agonists of human hematopoietic stem cell self-renewal
.
Science
345
,
1509
-
1512
.
Felber
,
B. K.
,
Orkin
,
S. H.
and
Hamer
,
D. H.
(
1982
).
Abnormal RNA splicing causes one form of alpha thalassemia
.
Cell
29
,
895
-
902
.
Findlay
,
G. M.
,
Boyle
,
E. A.
,
Hause
,
R. J.
,
Klein
,
J. C.
and
Shendure
,
J.
(
2014
).
Saturation editing of genomic regions by multiplex homology-directed repair
.
Nature
513
,
120
-
123
.
Findlay
,
G. M.
,
Daza
,
R. M.
,
Martin
,
B.
,
Zhang
,
M. D.
,
Leith
,
A. P.
,
Gasperini
,
M.
,
Janizek
,
J. D.
,
Huang
,
X.
,
Starita
,
L. M.
and
Shendure
,
J.
(
2018
).
Accurate classification of BRCA1 variants with saturation genome editing
.
Nature
562
,
217
-
222
.
Fire
,
A.
,
Xu
,
S.
,
Montgomery
,
M. K.
,
Kostas
,
S. A.
,
Driver
,
S. E.
and
Mello
,
C. C.
(
1998
).
Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans
.
Nature
391
,
806
-
811
.
Frangoul
,
H.
,
Altshuler
,
D.
,
Cappellini
,
M. D.
,
Chen
,
Y.-S.
,
Domm
,
J.
,
Eustace
,
B. K.
,
Foell
,
J.
,
de la Fuente
,
J.
,
Grupp
,
S.
,
Handgretinger
,
R.
et al. 
(
2021
).
CRISPR-Cas9 gene editing for sickle cell disease and β-thalassemia
.
N. Engl. J. Med.
384
,
252
-
260
.
Gaudelli
,
N. M.
,
Komor
,
A. C.
,
Rees
,
H. A.
,
Packer
,
M. S.
,
Badran
,
A. H.
,
Bryson
,
D. I.
and
Liu
,
D. R.
(
2017
).
Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage
.
Nature
551
,
464
-
471
.
Giani
,
F. C.
,
Fiorini
,
C.
,
Wakabayashi
,
A.
,
Ludwig
,
L. S.
,
Salem
,
R. M.
,
Jobaliya
,
C. D.
,
Regan
,
S. N.
,
Ulirsch
,
J. C.
,
Liang
,
G.
,
Steinberg-Shemer
,
O.
et al. 
(
2016
).
Targeted application of human genetic variation can improve red blood cell production from stem cells
.
Cell Stem Cell
18
,
73
-
78
.
Gilbert
,
L. A.
,
Larson
,
M. H.
,
Morsut
,
L.
,
Liu
,
Z.
,
Brar
,
G. A.
,
Torres
,
S. E.
,
Stern-Ginossar
,
N.
,
Brandman
,
O.
,
Whitehead
,
E. H.
,
Doudna
,
J. A.
et al. 
(
2013
).
CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes
.
Cell
154
,
442
-
451
.
Gilbert
,
L. A.
,
Horlbeck
,
M. A.
,
Adamson
,
B.
,
Villalta
,
J. E.
,
Chen
,
Y.
,
Whitehead
,
E. H.
,
Guimaraes
,
C.
,
Panning
,
B.
,
Ploegh
,
H. L.
,
Bassik
,
M. C.
et al. 
(
2014
).
Genome-Scale CRISPR-mediated control of gene repression and activation
.
Cell
159
,
647
-
661
.
Gundry
,
M. C.
,
Brunetti
,
L.
,
Lin
,
A.
,
Mayle
,
A. E.
,
Kitano
,
A.
,
Wagner
,
D.
,
Hsu
,
J. I.
,
Hoegenauer
,
K. A.
,
Rooney
,
C. M.
,
Goodell
,
M. A.
et al. 
(
2016
).
Highly efficient genome editing of murine and human hematopoietic progenitor cells by CRISPR/Cas9
.
Cell Rep.
17
,
1453
-
1461
.
Guo
,
M. H.
,
Nandakumar
,
S. K.
,
Ulirsch
,
J. C.
,
Zekavat
,
S. M.
,
Buenrostro
,
J. D.
,
Natarajan
,
P.
,
Salem
,
R. M.
,
Chiarle
,
R.
,
Mitt
,
M.
,
Kals
,
M.
et al. 
(
2017
).
Comprehensive population-based genome sequencing provides insight into hematopoietic regulatory mechanisms
.
Proc. Natl. Acad. Sci. USA
114
,
E327
-
E336
.
Hanna
,
R. E.
,
Hegde
,
M.
,
Fagre
,
C. R.
,
DeWeirdt
,
P. C.
,
Sangree
,
A. K.
,
Szegletes
,
Z.
,
Griffith
,
A.
,
Feeley
,
M. N.
,
Sanson
,
K. R.
,
Baidi
,
Y.
et al. 
(
2021
).
Massively parallel assessment of human variants with base editor screens
.
Cell
184
,
1064
-
1080.e20
.
Heckl
,
D.
,
Kowalczyk
,
M. S.
,
Yudovich
,
D.
,
Belizaire
,
R.
,
Puram
,
R. V.
,
McConkey
,
M. E.
,
Thielke
,
A.
,
Aster
,
J. C.
,
Regev
,
A.
and
Ebert
,
B. L.
(
2014
).
Generation of mouse models of myeloid malignancy with combinatorial genetic lesions using CRISPR-Cas9 genome editing
.
Nat. Biotechnol.
32
,
941
-
946
.
Hendel
,
A.
,
Bak
,
R. O.
,
Clark
,
J. T.
,
Kennedy
,
A. B.
,
Ryan
,
D. E.
,
Roy
,
S.
,
Steinfeld
,
I.
,
Lunstad
,
B. D.
,
Kaiser
,
R. J.
,
Wilkens
,
A. B.
et al. 
(
2015
).
Chemically modified guide RNAs enhance CRISPR-Cas genome editing in human primary cells
.
Nat. Biotechnol.
33
,
985
-
989
.
Hernández
,
D. C.
,
Juelke
,
K.
,
Müller
,
N. C.
,
Durek
,
P.
,
Ugursu
,
B.
,
Mashreghi
,
M.-F.
,
Rückert
,
T.
and
Romagnani
,
C.
(
2021
).
An in vitro platform supports generation of human innate lymphoid cells from CD34+ hematopoietic progenitors that recapitulate ex vivo identity
.
Immunity
54
,
2417
-
2432.e5
.
Huang
,
Y.-H.
,
Chen
,
C.-W.
,
Sundaramurthy
,
V.
,
Słabicki
,
M.
,
Hao
,
D.
,
Watson
,
C. J.
,
Tovy
,
A.
,
Reyes
,
J. M.
,
Dakhova
,
O.
,
Crovetti
,
B. R.
et al. 
(
2022
).
Systematic profiling of DNMT3A variants reveals protein instability mediated by the DCAF8 E3 ubiquitin ligase adaptor
.
Cancer Discov.
12
,
220
-
235
.
Inoue
,
F.
,
Kircher
,
M.
,
Martin
,
B.
,
Cooper
,
G. M.
,
Witten
,
D. M.
,
McManus
,
M. T.
,
Ahituv
,
N.
and
Shendure
,
J.
(
2017
).
A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity
.
Genome Res.
27
,
38
-
52
.
Jackson
,
A. L.
and
Linsley
,
P. S.
(
2010
).
Recognizing and avoiding siRNA off-target effects for target identification and therapeutic application
.
Nat. Rev. Drug Discov.
9
,
57
-
67
.
Javierre
,
B. M.
,
Burren
,
O. S.
,
Wilder
,
S. P.
,
Kreuzhuber
,
R.
,
Hill
,
S. M.
,
Sewitz
,
S.
,
Cairns
,
J.
,
Wingett
,
S. W.
,
Várnai
,
C.
,
Thiecke
,
M. J.
et al. 
(
2016
).
Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters
.
Cell
167
,
1369
-
1384.e19
.
Jinek
,
M.
,
Chylinski
,
K.
,
Fonfara
,
I.
,
Hauer
,
M.
,
Doudna
,
J. A.
and
Charpentier
,
E.
(
2012
).
A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity
.
Science
337
,
816
-
821
.
Kanter, J., DiPersio, J. F., Leavey, P., Shyr, D. C., Thompson, A. A., Porteus, M. H., Intondi, A., Lahiri, P., Dever, D. P., Petrusich, A. et al. (2021). Cedar trial in progress: a first in human, phase 1/2 study of the correction of a single nucleotide mutation in autologous HSCs (GPH101) to convert HbS to HbA for treating severe SCD. Blood 138, 1864.
Kar, S. P., Quiros, P. M., Gu, M., Jiang, T., Mitchell, J., Langdon, R., Iyer, V., Barcena, C., Vijayabaskar, M. S., Fabre, M. A. et al. (2022). Genome-wide analyses of 200,453 individuals yield new insights into the causes and consequences of clonal hematopoiesis. Nat. Genet. 54, 1155-1166.
Kepley, C. L., Pfeiffer, J. R., Schwartz, L. B., Wilson, B. S. and Oliver, J. M. (1998). The identification and characterization of umbilical cord blood-derived human basophils. J. Leukoc. Biol. 64, 474-483.
Kingwell, K. (2022). Base editors hit the clinic. Nat. Rev. Drug Discov. 21, 545-547.
Kircher, M., Witten, D. M., Jain, P., O'Roak, B. J., Cooper, G. M. and Shendure, J. (2014). A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 46, 310-315.
Kishi, H., Mukai, T., Hirono, A., Fujii, H., Miwa, S. and Hori, K. (1987). Human aldolase A deficiency associated with a hemolytic anemia: thermolabile aldolase due to a single base mutation. Proc. Natl. Acad. Sci. USA 84, 8623-8627.
Koblan, L. W., Doman, J. L., Wilson, C., Levy, J. M., Tay, T., Newby, G. A., Maianti, J. P., Raguram, A. and Liu, D. R. (2018). Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nat. Biotechnol. 36, 843-846.
Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. and Liu, D. R. (2016). Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424.
Kraus, H., Kaiser, S., Aumann, K., Bönelt, P., Salzer, U., Vestweber, D., Erlacher, M., Kunze, M., Burger, M., Pieper, K. et al. (2014). A feeder-free differentiation system identifies autonomously proliferating B cell precursors in human bone marrow. J. Immunol. 192, 1044-1054.
Lee, R. C., Feinbaum, R. L. and Ambros, V. (1993). The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843-854.
Liggett, L. A. and Sankaran, V. G. (2020). Unraveling hematopoiesis through the lens of genomics. Cell 182, 1384-1400.
Ludwig, L. S., Lareau, C. A., Ulirsch, J. C., Christian, E., Muus, C., Li, L. H., Pelka, K., Ge, W., Oren, Y., Brack, A. et al. (2019). Lineage tracing in humans enabled by mitochondrial mutations and single-cell genomics. Cell 176, 1325-1339.e22.
Lue, N. Z., Garcia, E. M., Ngan, K. C., Lee, C., Doench, J. G. and Liau, B. B. (2023). Base editor scanning charts the DNMT3A activity landscape. Nat. Chem. Biol. 19, 176-186.
Luo, X. M., Maarschalk, E., O'Connell, R. M., Wang, P., Yang, L. and Baltimore, D. (2009). Engineering human hematopoietic stem/progenitor cells to produce a broadly neutralizing anti-HIV antibody after in vitro maturation to human B lymphocytes. Blood 113, 1422-1431.
MacArthur, D. G., Manolio, T. A., Dimmock, D. P., Rehm, H. L., Shendure, J., Abecasis, G. R., Adams, D. R., Altman, R. B., Antonarakis, S. E., Ashley, E. A. et al. (2014). Guidelines for investigating causality of sequence variants in human disease. Nature 508, 469-476.
Majithia, A. R., Tsuda, B., Agostini, M., Gnanapradeepan, K., Rice, R., Peloso, G., Patel, K. A., Zhang, X., Broekema, M. F., Patterson, N. et al. (2016). Prospective functional classification of all possible missense variants in PPARG. Nat. Genet. 48, 1570-1575.
Mali, P., Yang, L., Esvelt, K. M., Aach, J., Guell, M., DiCarlo, J. E., Norville, J. E. and Church, G. M. (2013). RNA-guided human genome engineering via Cas9. Science 339, 823-826.
Mandal, P. K., Ferreira, L. M. R., Collins, R., Meissner, T. B., Boutwell, C. L., Friesen, M., Vrbanac, V., Garrison, B. S., Stortchevoi, A., Bryder, D. et al. (2014). Efficient ablation of genes in human hematopoietic stem and effector cells using CRISPR/Cas9. Cell Stem Cell 15, 643-652.
Martinov, T., McKenna, K. M., Tan, W. H., Collins, E. J., Kehret, A. R., Linton, J. D., Olsen, T. M., Shobaki, N. and Rongvaux, A. (2021). Building the next generation of humanized hemato-lymphoid system mice. Front. Immunol. 12, 643852.
Maximow, A. A. (1909). Der Lymphozyt als gemeinsame Stammzelle der verschiedenen Blutelemente in der embryonalen Entwicklung und im postfetalen Leben der Säugetiere. Fol Haematol. 8, 125-134.
McIntosh, B. E., Brown, M. E., Duffin, B. M., Maufort, J. P., Vereide, D. T., Slukvin, I. I. and Thomson, J. A. (2015). Nonirradiated NOD,B6.SCID Il2rγ-/- Kit(W41/W41) (NBSGW) mice support multilineage engraftment of human hematopoietic cells. Stem Cell Rep. 4, 171-180.
Medetgul-Ernar, K. and Davis, M. M. (2022). Standing on the shoulders of mice. Immunity 55, 1343-1353.
Melnikov, A., Rogov, P., Wang, L., Gnirke, A. and Mikkelsen, T. S. (2014). Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res. 42, e112.
Mighell, T. L., Evans-Dutson, S. and O'Roak, B. J. (2018). A saturation mutagenesis approach to understanding PTEN lipid phosphatase activity and genotype-phenotype relationships. Am. J. Hum. Genet. 102, 943-955.
Mitchell, J. S., Li, N., Weinhold, N., Försti, A., Ali, M., van Duin, M., Thorleifsson, G., Johnson, D. C., Chen, B., Halvarsson, B.-M. et al. (2016). Genome-wide association study identifies multiple susceptibility loci for multiple myeloma. Nat. Commun. 7, 12050.
Moffat, J., Grueneberg, D. A., Yang, X., Kim, S. Y., Kloepfer, A. M., Hinkle, G., Piqani, B., Eisenhaure, T. M., Luo, B., Grenier, J. K. et al. (2006). A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell 124, 1283-1298.
Morris, J. A., Daniloski, Z., Domingo, J., Barry, T., Ziosi, M., Glinos, D. A., Hao, S., Mimitou, E. P., Smibert, P., Roeder, K. et al. (2021). Discovery of target genes and pathways of blood trait loci using pooled CRISPR screens and single cell RNA sequencing. bioRxiv 2021.04.07.438882.
Mostafavi, H., Spence, J. P., Naqvi, S. and Pritchard, J. K. (2022). Limited overlap of eQTLs and GWAS hits due to systematic differences in discovery. bioRxiv 2022.05.07.491045.
Mouri, K., Guo, M. H., de Boer, C. G., Lissner, M. M., Harten, I. A., Newby, G. A., DeBerg, H. A., Platt, W. F., Gentili, M., Liu, D. R. et al. (2022). Prioritization of autoimmune disease-associated genetic variants that perturb regulatory element activity in T cells. Nat. Genet. 54, 603-612.
Nakauchi, Y., Azizi, A., Thomas, D., Corces, M. R., Reinisch, A., Sharma, R., Cruz Hernandez, D., Köhnke, T., Karigane, D., Fan, A. et al. (2022). The cell type-specific 5hmC landscape and dynamics of healthy human hematopoiesis and TET2-mutant preleukemia. Blood Cancer Discov. 3, 346-367.
Nam, A. S., Dusaj, N., Izzo, F., Murali, R., Myers, R. M., Mouhieddine, T. H., Sotelo, J., Benbarche, S., Waarts, M., Gaiti, F. et al. (2022). Single-cell multi-omics of human clonal hematopoiesis reveals that DNMT3A R882 mutations perturb early progenitor states through selective hypomethylation. Nat. Genet. 54, 1514-1526.
Nandakumar, S. K., McFarland, S. K., Mateyka, L. M., Lareau, C. A., Ulirsch, J. C., Ludwig, L. S., Agarwal, G., Engreitz, J. M., Przychodzen, B., McConkey, M. et al. (2019). Gene-centric functional dissection of human genetic variation uncovers regulators of hematopoiesis. Elife 8, e44080.
Nasser, J., Bergman, D. T., Fulco, C. P., Guckelberger, P., Doughty, B. R., Patwardhan, T. A., Jones, T. R., Nguyen, T. H., Ulirsch, J. C., Lekschas, F. et al. (2021). Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238-243.
Nuñez, J. K., Chen, J., Pommier, G. C., Cogan, J. Z., Replogle, J. M., Adriaens, C., Ramadoss, G. N., Shi, Q., Hung, K. L., Samelson, A. J. et al. (2021). Genome-wide programmable transcriptional memory by CRISPR-based epigenome editing. Cell 184, 2503-2519.e17.
Orkin, S. H., Sexton, J. P., Cheng, T. C., Goff, S. C., Giardina, P. J., Lee, J. I. and Kazazian, H. H. Jr (1983). ATA box transcription mutation in beta-thalassemia. Nucleic Acids Res. 11, 4727-4734.
Pattabhi, S., Lotti, S. N., Berger, M. P., Singh, S., Lux, C. T., Jacoby, K., Lee, C., Negre, O., Scharenberg, A. M. and Rawlings, D. J. (2019). In vivo outcome of homology-directed repair at the HBB Gene in HSC using alternative donor template delivery methods. Mol. Ther. Nucleic Acids 17, 277-288.
Perdomo, J., Yan, F., Leung, H. H. L. and Chong, B. H. (2017). Megakaryocyte differentiation and platelet formation from human cord blood-derived CD34+ cells. J. Vis. Exp. 130, 56420.
Peters, L. A., Perrigoue, J., Mortha, A., Iuga, A., Song, W.-M., Neiman, E. M., Llewellyn, S. R., Di Narzo, A., Kidd, B. A., Telesco, S. E. et al. (2017). A functional genomics predictive network model identifies regulators of inflammatory bowel disease. Nat. Genet. 49, 1437-1449.
Petri, K., Zhang, W., Ma, J., Schmidts, A., Lee, H., Horng, J. E., Kim, D. Y., Kurt, I. C., Clement, K., Hsu, J. Y. et al. (2022). CRISPR prime editing with ribonucleoprotein complexes in zebrafish and primary human cells. Nat. Biotechnol. 40, 189-193.
Qiu, X., Zhang, Y., Martin-Rufino, J. D., Weng, C., Hosseinzadeh, S., Yang, D., Pogson, A. N., Hein, M. Y., Hoi Joseph Min, K., Wang, L. et al. (2022). Mapping transcriptomic vector fields of single cells. Cell 185, 690-711.e45.
Rao, S., Yao, Y., Soares de Brito, J., Yao, Q., Shen, A. H., Watkinson, R. E., Kennedy, A. L., Coyne, S., Ren, C., Zeng, J. et al. (2021). Dissecting ELANE neutropenia pathogenicity by human HSC gene editing. Cell Stem Cell 28, 833-845.e5.
Regev, A., Teichmann, S. A., Lander, E. S., Amit, I., Benoist, C., Birney, E., Bodenmiller, B., Campbell, P., Carninci, P., Clatworthy, M. et al. (2017). The human cell atlas. Elife 6, e27041.
Robertson, N. A., Latorre-Crespo, E., Terradas-Terradas, M., Lemos-Portela, J., Purcell, A. C., Livesey, B. J., Hillary, R. F., Murphy, L., Fawkes, A., MacGillivray, L. et al. (2022). Longitudinal dynamics of clonal hematopoiesis identifies gene-specific fitness effects. Nat. Med. 28, 1439-1446.
Romero, Z., Lomova, A., Said, S., Miggelbrink, A., Kuo, C. Y., Campo-Fernandez, B., Hoban, M. D., Masiuk, K. E., Clark, D. N., Long, J. et al. (2019). Editing the sickle cell disease mutation in human hematopoietic stem cells: comparison of endonucleases and homologous donor templates. Mol. Ther. 27, 1389-1406.
Sangree, A. K., Griffith, A. L., Szegletes, Z. M., Roy, P., DeWeirdt, P. C., Hegde, M., McGee, A. V., Hanna, R. E. and Doench, J. G. (2022). Benchmarking of SpCas9 variants enables deeper base editor screens of BRCA1 and BCL2. Nat. Commun. 13, 1318.
Sankaran, V. G., Menne, T. F., Xu, J., Akie, T. E., Lettre, G., Van Handel, B., Mikkola, H. K. A., Hirschhorn, J. N., Cantor, A. B. and Orkin, S. H. (2008). Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science 322, 1839-1842.
Sankaran, V. G., Weissman, J. S. and Zon, L. I. (2022). Cellular barcoding to decipher clonal dynamics in disease. Science 378, eabm5874.
Sargent, J. K., Warner, M. A., Low, B. E., Schott, W. H., Hoffert, T., Coleman, D., Woo, X. Y., Sheridan, T., Erattupuzha, S., Henrich, P. P. et al. (2022). Genetically diverse mouse platform to xenograft cancer cells. Dis. Model. Mech. 15, dmm049457.
Schaid, D. J., Chen, W. and Larson, N. B. (2018). From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet. 19, 491-504.
Shalit, M., Sekhsaria, S. and Malech, H. L. (1995). Modulation of growth and differentiation of eosinophils from human peripheral blood CD34+ cells by IL5 and other growth factors. Cell. Immunol. 160, 50-57.
Sharma, R., Dever, D. P., Lee, C. M., Azizi, A., Pan, Y., Camarena, J., Köhnke, T., Bao, G., Porteus, M. H. and Majeti, R. (2021). The TRACE-Seq method tracks recombination alleles and identifies clonal reconstitution dynamics of gene targeted human hematopoietic stem cells. Nat. Commun. 12, 472.
Shendure, J., Balasubramanian, S., Church, G. M., Gilbert, W., Rogers, J., Schloss, J. A. and Waterston, R. H. (2017). DNA sequencing at 40: past, present and future. Nature 550, 345-353.
Singh, J., Chen, E. L. Y., Xing, Y., Stefanski, H. E., Blazar, B. R. and Zúñiga-Pflücker, J. C. (2019). Generation and function of progenitor T cells from StemRegenin-1-expanded CD34+ human hematopoietic progenitor cells. Blood Adv. 3, 2934-2948.
Song, Y., Shan, L., Gbyli, R., Liu, W., Strowig, T., Patel, A., Fu, X., Wang, X., Xu, M. L., Gao, Y. et al. (2021). Combined liver-cytokine humanization comes to the rescue of circulating human red blood cells. Science 371, 1019-1025.
Spanholtz, J., Tordoir, M., Eissens, D., Preijers, F., van der Meer, A., Joosten, I., Schaap, N., de Witte, T. M. and Dolstra, H. (2010). High log-scale expansion of functional human natural killer cells from umbilical cord blood CD34-positive cells for adoptive cancer immunotherapy. PLoS One 5, e9221.
Stunnenberg, H. G., International Human Epigenome Consortium and Hirst, M. (2016). The International Human Epigenome Consortium: a blueprint for scientific collaboration and discovery. Cell 167, 1145-1149.
Subramaniam, A., Žemaitis, K., Talkhoncheh, M. S., Yudovich, D., Bäckström, A., Debnath, S., Chen, J., Jain, M. V., Galeev, R., Gaetani, M. et al. (2020). Lysine-specific demethylase 1A restricts ex vivo propagation of human HSCs and is a target of UM171. Blood 136, 2151-2161.
Swartz, A. M. and Nair, S. K. (2022). The in vitro differentiation of human CD141+CLEC9A+ dendritic cells from mobilized peripheral blood CD34+ hematopoietic stem cells. Curr. Protoc. 2, e410.
Taliun, D., Harris, D. N., Kessler, M. D., Carlson, J., Szpiech, Z. A., Torres, R., Taliun, S. A. G., Corvelo, A., Gogarten, S. M., Kang, H. M. et al. (2021). Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290-299.
Tewhey, R., Kotliar, D., Park, D. S., Liu, B., Winnicki, S., Reilly, S. K., Andersen, K. G., Mikkelsen, T. S., Lander, E. S., Schaffner, S. F. et al. (2016). Direct identification of hundreds of expression-modulating variants using a multiplexed reporter assay. Cell 165, 1519-1529.
Thomas, K. R. and Capecchi, M. R. (1987). Site-directed mutagenesis by gene targeting in mouse embryo-derived stem cells. Cell 51, 503-512.
Ulirsch, J. C., Nandakumar, S. K., Wang, L., Giani, F. C., Zhang, X., Rogov, P., Melnikov, A., McDonel, P., Do, R., Mikkelsen, T. S. et al. (2016). Systematic functional dissection of common genetic variation affecting red blood cell traits. Cell 165, 1530-1545.
Ulirsch, J. C., Lareau, C. A., Bao, E. L., Ludwig, L. S., Guo, M. H., Benner, C., Satpathy, A. T., Kartha, V. K., Salem, R. M., Hirschhorn, J. N. et al. (2019). Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nat. Genet. 51, 683-693.
van der Wijst, M., de Vries, D. H., Groot, H. E., Trynka, G., Hon, C. C., Bonder, M. J., Stegle, O., Nawijn, M. C., Idaghdour, Y., van der Harst, P. et al. (2020). The single-cell eQTLGen consortium. Elife 9, e52155.
Vaser, R., Adusumalli, S., Leng, S. N., Sikic, M. and Ng, P. C. (2016). SIFT missense predictions for genomes. Nat. Protoc. 11, 1-9.
Vijayakrishnan, J., Qian, M., Studd, J. B., Yang, W., Kinnersley, B., Law, P. J., Broderick, P., Raetz, E. A., Allan, J., Pui, C.-H. et al. (2019). Identification of four novel associations for B-cell acute lymphoblastic leukaemia risk. Nat. Commun. 10, 5348.
Vuckovic, D., Bao, E. L., Akbari, P., Lareau, C. A., Mousas, A., Jiang, T., Chen, M.-H., Raffield, L. M., Tardaguila, M., Huffman, J. E. et al. (2020). The polygenic and monogenic basis of blood traits and diseases. Cell 182, 1214-1231.e11.
Way, K. J., Dinh, H., Keene, M. R., White, K. E., Clanchy, F. I. L., Lusby, P., Roiniotis, J., Cook, A. D., Cassady, A. I., Curtis, D. J. et al. (2009). The generation and properties of human macrophage populations from hemopoietic stem cells. J. Leukoc. Biol. 85, 766-778.
Wienert, B., Martyn, G. E., Kurita, R., Nakamura, Y., Quinlan, K. G. R. and Crossley, M. (2017). KLF1 drives the expression of fetal hemoglobin in British HPFH. Blood 130, 803-807.
Wolfe, L. C., John, K. M., Falcone, J. C., Byrne, A. M. and Lux, S. E. (1982). A genetic defect in the binding of protein 4.1 to spectrin in a kindred with hereditary spherocytosis. N. Engl. J. Med. 307, 1367-1374.
Wu, Y., Zeng, J., Roscoe, B. P., Liu, P., Yao, Q., Lazzarotto, C. R., Clement, K., Cole, M. A., Luk, K., Baricordi, C. et al. (2019). Highly efficient therapeutic gene editing of human hematopoietic stem cells. Nat. Med. 25, 776-783.
Yu, F., Cato, L. D., Weng, C., Liggett, L. A., Jeon, S., Xu, K., Chiang, C. W. K., Wiemels, J. L., Weissman, J. S., de Smith, A. J. et al. (2022). Variant to function mapping at single-cell resolution through network propagation. Nat. Biotechnol. 40, 1644-1653.
Zhang, K., Hocker, J. D., Miller, M., Hou, X., Chiou, J., Poirion, O. B., Qiu, Y., Li, Y. E., Gaulton, K. J., Wang, A. et al. (2021). A single-cell atlas of chromatin accessibility in the human genome. Cell 184, 5985-6001.e19.

Competing interests

V.G.S. serves as an advisor to and/or has equity in Branch Biosciences, Ensoma, Novartis, Forma and Cellarity, all unrelated to the present work. The authors have no other competing interests to declare.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.