The Science Behind Genome Editing: Laying the Foundation for Ethical Analysis

BY SAM WU, BS and KEVIN T. FITZGERALD, SJ, PhD

Genome editing, or the ability to manipulate the DNA of an organism, has been facilitated by gene function studies made possible by the progress and affordability of genome sequencing (Gaj T, et al 2013; Ding Y, et al 2016). To make sound ethical decisions about evaluating and regulating new genome editing technologies, it is important to understand their mechanisms of action and potential applications.

The scientific community has developed several commonly used genome-editing techniques: zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR). ZFNs, TALENs, and CRISPR all use nucleases (enzymes that can cleave DNA), which can be customized to recognize a particular sequence of base pairs in DNA. More recently, Gao et al (2016) demonstrated the utility of yet another nuclease, Argonaute, for genome editing. In general, once the target DNA sequence is recognized, the system’s nuclease binds to the DNA and cuts both strands, creating what is known as a double-strand break (DSB).

The DSB can then be repaired within the cell through either of two DNA repair mechanisms: error-prone nonhomologous end joining (NHEJ) or homology-directed repair (HDR) (Wyman and Kanaar 2006). NHEJ is usually the default repair mechanism for DSBs in cells and is useful in research, but is prone to producing errors in the repaired DNA. When DSBs need to be repaired with greater precision (e.g. for therapeutic purposes), use of the more accurate repair mechanism, HDR, is recommended (Ding Y, et al 2016; Cortez 2015). HDR requires the addition of a fragment of DNA that is identical to the original, unbroken DNA sequence. The proteins involved in HDR use the DNA fragment as a template to ultimately repair the DSB, restoring the original DNA sequence. In genome editing applications, the DNA template can be manipulated such that HDR will introduce a new mutation, or change in the base pair sequence, into a particular gene (Cortez 2015). Depending upon the purpose of the experiment, the outcome of such gene editing could be to inactivate or activate a gene, or to induce or repress expression of a gene. This technology has potential applications in areas such as disease modeling, disease treatment and prevention, and agriculture.
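The difference between the two repair pathways can be illustrated with a toy string model. This is only a sketch: the function names, the fixed one-base deletion for NHEJ, and the five-base homology arms are assumptions made for illustration, and real repair is far more variable than this.

```python
def cut(seq, pos):
    """Model a double-strand break by splitting the sequence at index pos."""
    return seq[:pos], seq[pos:]


def nhej_repair(left, right, deleted=1):
    """Toy NHEJ: rejoin the two ends, losing `deleted` bases at the break."""
    return left[:len(left) - deleted] + right


def hdr_repair(left, right, template, arm=5):
    """Toy HDR: copy the template sequence lying between two homology arms."""
    if arm <= 0:
        raise ValueError("arm must be a positive number of bases")
    left_arm, right_arm = left[-arm:], right[:arm]
    start = template.find(left_arm)
    end = template.find(right_arm, start + arm) if start >= 0 else -1
    if start < 0 or end < 0:
        raise ValueError("template lacks homology to the break site")
    # Keep DNA outside the arms, and take the arms plus anything between them
    # from the template.
    return left[:-arm] + template[start:end + arm] + right[arm:]


original = "ATGGCCTTAGCAGGTCCA"          # hypothetical sequence
left, right = cut(original, 9)

print(nhej_repair(left, right))                    # one base lost at the break
print(hdr_repair(left, right, original))           # original sequence restored
print(hdr_repair(left, right, "GGCCTTATGCAGGTC"))  # template carries an extra T
```

With an unmodified template the break is repaired back to the original sequence, while a template that carries a change between its homology arms writes that change into the repaired DNA, which is the property genome editing relies on.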

Zinc Finger Nucleases

Zinc finger nucleases (ZFNs) were the first genome editing technology to be popularized in the scientific community (Pabo CO, et al 2001). Zinc finger proteins, major components of ZFNs, were first identified in a species of African aquatic frog, X. laevis (Miller J, McLachlan AD, and Klug A 1985). They are approximately 30 amino acids long and are characterized by the spacing of two particular types of amino acids: cysteine and histidine. In a ZFN, each zinc finger binds to three base pairs on the DNA strand (Pabo CO, et al 2001).

After determining the structure of the zinc finger-binding domain, researchers recognized the utility of the zinc finger framework for the design and selection of new DNA-binding proteins. Kim, Cha, and Chandrasegaran (1996) were the first to fuse zinc finger proteins with the cleavage domain of another protein, Fok I. This fusion created a hybrid protein that could be customized to cleave DNA at a chosen site. Still, zinc finger interactions with DNA were complex and unpredictable. To improve target specificity, more zinc fingers were added to the hybrid. Each zinc finger recognizes three base pairs on the target DNA site, so adding more zinc fingers lengthens the recognition sequence and thereby increases target specificity (Bartsevich VV and Juliano RL 2000; Laity JH, Dyson HJ, and Wright PE 2000).

Zinc finger nucleases (ZFNs) are composed of two parts: zinc finger proteins and a DNA cleavage domain (derived from Fok I). An active ZFN (see Figure 1 below) requires two different ZFN complexes, one for each strand of the double-stranded target DNA. This requirement lengthens the recognition site, further increasing target specificity. Researchers can alter the DNA-binding domain of zinc finger proteins to customize them to recognize a genomic target of choice (Desjarlais and Berg 1992). Additionally, ZFNs are relatively small, which makes them easier to deliver into cells than other genome editing tools (Lee J et al 2015). Despite these advantages and improvements in ZFN performance, challenges with efficiency, target availability, off-target effects, and specificity remain (Bae K, et al 2003; Kim et al 2009; Ramirez et al 2008; Cornu et al 2008).

Figure 1. Schematic representation of a ZFN dimer bound to DNA. Each ZFN is composed of a zinc finger protein and the Fok I nuclease. Two ZFN complexes (“Left Zinc Fingers” and “Right Zinc Fingers”) are required for an active ZFN, one for each strand of the double-stranded DNA target. (Source: Lee et al 2015).
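The specificity argument in this section can be checked with rough arithmetic. The sketch below assumes a genome of about 3 billion base pairs made of independent, equally likely bases, and it ignores the spacer between the two half-sites; these simplifications, along with the function names, are assumptions for illustration rather than part of the cited studies.

```python
def recognition_length(fingers_per_zfn, bp_per_finger=3, zfns_per_site=2):
    """Total base pairs recognized when ZFNs bind as a pair."""
    return fingers_per_zfn * bp_per_finger * zfns_per_site


def expected_random_matches(site_length, genome_size):
    """Expected number of chance matches to one specific site of this length,
    assuming independent, equally likely bases."""
    return genome_size / 4 ** site_length


GENOME_SIZE = 3e9  # assumed approximate genome size in base pairs

single = recognition_length(3, zfns_per_site=1)  # one three-finger ZFN: 9 bp
pair = recognition_length(3)                     # a pair of them: 18 bp

print(single, expected_random_matches(single, GENOME_SIZE))
print(pair, expected_random_matches(pair, GENOME_SIZE))
```

Under these assumptions a 9 bp site is expected to occur by chance thousands of times in such a genome, whereas an 18 bp site is expected to occur less than once, which is why requiring a ZFN pair sharpens targeting so much.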

Transcription Activator-Like Effector Nucleases (TALENs)

Transcription activator-like effector nucleases (TALENs) are similar to ZFNs, in that they consist of a DNA-binding domain and a DNA cleavage domain (also derived from Fok I) (Miller JC, et al 2011). The DNA-binding domains of TALENs are proteins called transcription activator-like effectors (TALEs), derived from bacteria that infect plants. TALEs are composed of repeats of 33-35 amino acids, each of which recognizes a single base pair on the DNA (Deng D, et al 2012). Usually, TALENs are used in pairs. See Figure 2 below for the structure of TALENs.

Figure 2. Schematic of a TALEN pair. Each TALEN is composed of a DNA-binding domain, the transcription activator-like effector (TALE), and the Fok I nuclease domain. TALEN pairs are typically used to target genomic sequences approximately 30-40 base pairs long. (Source: Lee, et al 2015).

TALENs are relatively easy to design and construct. Large-scale, systematic studies indicate that TALE repeats can be combined to recognize any user-defined sequence (Reyon et al 2012). The assembly of these relatively large protein complexes has been facilitated by the development of systems such as FLASH (fast ligation-based automatable solid-phase high-throughput). FLASH is a rapid and cost-effective way to assemble TALENs on a large scale for use in genome editing. Cermak et al (2015) used another system, the Golden Gate method, demonstrating that it could be used to construct TALENs within five days. On the other hand, delivery and optimization of these constructs can be complicated by their large size and repetitive nature (Holkers, et al 2013). Targeting multiple sites on DNA with TALENs may also be hindered by the large size of the nuclease, although this issue could be ameliorated through diversification of the TALEs (Yang, et al 2013).
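The size difference between the two DNA-binding domains follows from the figures given above: a zinc finger of roughly 30 amino acids reads three base pairs, while a TALE repeat of 33-35 amino acids reads one. The sketch below is back-of-the-envelope arithmetic only. It assumes 34 amino acids per TALE repeat (the midpoint of the stated range) and ignores linkers, flanking regions, and the Fok I domain; the function names are hypothetical.

```python
import math

ZF_RESIDUES = 30           # approximate amino acids per zinc finger (from the text)
ZF_BP = 3                  # base pairs read per zinc finger
TALE_REPEAT_RESIDUES = 34  # assumed midpoint of the 33-35 range
TALE_BP = 1                # base pairs read per TALE repeat


def zinc_finger_binding_residues(target_bp):
    """Amino acids of zinc finger needed to read target_bp base pairs."""
    return math.ceil(target_bp / ZF_BP) * ZF_RESIDUES


def tale_binding_residues(target_bp):
    """Amino acids of TALE repeat needed to read target_bp base pairs."""
    return math.ceil(target_bp / TALE_BP) * TALE_REPEAT_RESIDUES


for bp in (9, 18):
    print(bp, zinc_finger_binding_residues(bp), tale_binding_residues(bp))
```

On these assumptions a TALE array needs roughly three to four times as many amino acids as a zinc finger array to read the same number of base pairs, which is consistent with the delivery difficulties described for TALENs.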

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)

Ishino et al (1987) were the first to observe the clustered repeats now known as CRISPR, in E. coli. More research followed, with Francisco Mojica characterizing CRISPR as a microbial immune system that adapts to and eliminates foreign genetic material (Mojica FJ, et al 2005; Reiss A, et al 2014). The novel Cas9 (CRISPR-associated protein-9) gene, found to code for a nuclease, was discovered next by Bolotin, et al (2005) in the bacterium S. thermophilus. The team also found that a particular sequence of approximately two to five nucleotides, the PAM (protospacer adjacent motif), is required for target recognition. The PAM sequence itself may differ depending upon the type of Cas9 being used, but it must always be present in proximity to the DNA target site.

Over the next decade, researchers sought to understand the role of CRISPR and how it interferes with invading genetic material (Makarova et al 2006), focusing mostly on the CRISPR type II system that requires only one Cas protein to introduce DSBs (Barrangou et al 2007). Brouns et al (2008) found that small RNAs (CRISPR RNAs or crRNAs) guide Cas proteins to target DNA sites. Further research (Marraffini and Sontheimer 2008; Garneau et al 2010) informed Emmanuelle Charpentier’s work that identified yet another guide RNA: trans-activating CRISPR RNA (tracrRNA) (Deltcheva et al 2011), which forms a duplex with crRNA to guide the Cas9 nuclease to its DNA targets. In June 2012, Charpentier and Doudna published findings confirming the role of Cas9 as an endonuclease (enzyme that cleaves DNA) and demonstrating the fusion of crRNA and tracrRNA to generate a single guide RNA (sgRNA) (Jinek et al 2012), significantly increasing the simplicity and specificity of the gene-editing system. In 2013, a team led by Feng Zhang successfully demonstrated targeted genome cleavage using Cas9 in both human and mouse cells, in addition to showing the system’s utility in targeting multiple DNA sites and driving HDR DNA repair (Cong et al 2013). See Figure 3 below for the Cas9 mechanism of action.

Figure 3. Cas9 mechanism of action. A. The three necessary components for the CRISPR/Cas9 system: Cas9, crRNA, and tracrRNA; sgRNA is the fusion of crRNA and tracrRNA. B. The sgRNA guides the Cas9 to the target site, and when Cas9 recognizes the PAM sequence, cleavage is initiated. C. Resultant cleaved fragments of DNA. (Source: Riordan et al 2015).
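The PAM requirement described above can be sketched as a simple sequence search. The example assumes the commonly cited NGG PAM of S. pyogenes Cas9 and a 20-nucleotide target, neither of which is specified in this post, and it scans only the strand it is given. The function name and the example sequence are hypothetical.

```python
import re


def find_candidate_sites(dna, protospacer_len=20, pam="[ACGT]GG"):
    """Return (start, protospacer, pam) for every stretch of protospacer_len
    bases that is immediately followed by a PAM on the given strand."""
    dna = dna.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})({pam}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


example = "TTGACCTGAAGCTTGACCATGCAAGGTTCA"  # made-up sequence
for start, protospacer, pam_seq in find_candidate_sites(example):
    print(start, protospacer, pam_seq)
```

A fuller search would also scan the reverse complement, and a Cas9 with a different PAM would need a different `pam` pattern, in line with the point above that the PAM depends on the type of Cas9 being used.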

The targeting efficiency of CRISPR/Cas9 relative to other techniques such as ZFNs or TALENs is high, in part due to the guidance of the sgRNA and the PAM requirement. Still, there has been much work to improve the system’s performance, including increasing the efficiency of HDR DNA repair for use with CRISPR, specifically by suppressing factors associated with the NHEJ pathway (Chu et al 2015). Furthermore, other variants of Cas9 (e.g. Cas9s from other species or synthetically developed mutant Cas9s) have been identified and can be used in different contexts to better serve the needs of a particular experiment (Jiang et al 2013). The process of targeting multiple DNA sites with CRISPR has been facilitated by the development of a single plasmid (fragment of double-stranded DNA separate from a cell’s chromosomal DNA; plasmids are replicable and are frequently used to deliver genes into a cell) that can contain multiple sgRNAs, each guiding Cas9 to a different DNA target site (Sakuma et al 2014; Ma et al 2014; Guo et al 2015). Polstein et al (2015) have also developed a method to activate the CRISPR-Cas9 system at particular points in time via light stimulation, giving researchers more control over when genes are expressed.

While the CRISPR system has its benefits, it also has limitations. There is concern over off-target effects, although efforts have been made to reduce such effects through improved design of guide RNAs and the development of Cas9 variants (Cong et al 2013). Guide RNAs can still bind to unintended sites that differ from the target by a few mismatched bases, which remains an obstacle to the effectiveness of CRISPR/Cas9-based targeted genome editing. Lastly, delivery of the necessary components of the system may be difficult because of their relatively large size.
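One simple way to think about the mismatch problem is to count the positions at which a guide differs from each candidate genomic site. The sketch below flags sites that differ from the guide by only a few bases. The threshold of three mismatches, the function names, and the sequences are assumptions for illustration, and real off-target prediction also weighs where in the guide each mismatch falls.

```python
def count_mismatches(guide, site):
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def flag_potential_off_targets(guide, candidate_sites, max_mismatches=3):
    """Return (site, mismatches) for imperfect matches within the threshold,
    closest matches first. Perfect matches are treated as the intended target."""
    flagged = []
    for site in candidate_sites:
        n = count_mismatches(guide, site)
        if 0 < n <= max_mismatches:
            flagged.append((site, n))
    return sorted(flagged, key=lambda item: item[1])


guide = "ACGTACGTAC"                                  # shortened, made-up guide
sites = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT"]
print(flag_potential_off_targets(guide, sites))
```

Here only the second site is flagged: the first is the intended target and the third differs at too many positions to be a plausible off-target under the assumed threshold.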

Despite these limitations, CRISPR remains the most affordable and versatile genome editing technique available. The CRISPR/Cas9 system has therefore seen huge investment and wide publicity, owing to its relative ease of application and its versatility for targeted genome editing in multiple species and for many research, therapeutic, diagnostic, public health, and agricultural applications. Some of these applications have proved more controversial than others. Ousterout, et al (2015) used CRISPR/Cas9 to correct multiple mutations associated with Duchenne muscular dystrophy, a genetic disorder that leads to muscle degeneration over time (Muscular Dystrophy Association 2016). CRISPR has also been used to facilitate the development of a cheaper Zika virus diagnostic test (Gregory 2016).

On the more controversial side, four proof-of-concept studies (in yeast, fruit flies, and two species of mosquitoes) have been published, demonstrating successful development of gene drives in the lab (Committee on Gene Drive Research 2016). A gene drive is a mechanism by which the odds of passing a particular gene on to the next generation of offspring are increased well beyond those of normal inheritance (Ledford and Callaway 2015). Developments in agriculture include CRISPR-Cas9-edited mushrooms that can be cultivated and sold without further oversight from the US Department of Agriculture (USDA) (Waltz 2016). Adding to the list of controversial developments with CRISPR, UK studies involving editing human embryos with CRISPR technology were approved by the Human Fertilisation and Embryology Authority in early 2016 (Callaway 2016). More recently, proposals have been made for human trials that would use CRISPR/Cas9 genome editing in cancer treatment for patients (Begley 2016).
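The definition of a gene drive given above can be made concrete with a toy population model. Under ordinary inheritance, a parent carrying one copy of a gene passes it on half the time; the sketch assumes a drive raises that to (1 + c) / 2, where c is a hypothetical conversion efficiency between 0 and 1. It also assumes random mating, a very large population, and no fitness cost, so it illustrates the principle rather than any of the cited studies.

```python
def next_frequency(p, conversion):
    """Drive allele frequency after one generation of random mating.

    Carriers of two copies (frequency p*p) always transmit the allele.
    Carriers of one copy (frequency 2*p*(1-p)) transmit it with
    probability (1 + conversion) / 2.
    """
    return p * p + 2 * p * (1 - p) * (1 + conversion) / 2


def simulate(p0, conversion, generations):
    """Return the allele frequency at generation 0, 1, ..., generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], conversion))
    return freqs


ordinary = simulate(0.1, 0.0, 10)  # normal inheritance: frequency stays put
drive = simulate(0.1, 0.9, 10)     # assumed 90% conversion: frequency climbs

for generation, (a, b) in enumerate(zip(ordinary, drive)):
    print(generation, round(a, 3), round(b, 3))
```

In this simplified model an allele with no drive stays at its starting frequency, while the same allele with a strong drive approaches fixation within about ten generations, which is why gene drives raise questions that conventional genetic modification does not.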

Argonaute

Argonautes are another family of endonucleases that are also important in defending the cell against foreign genetic material (Gao et al 2016). These endonucleases use single-stranded nucleic acids as guides, and exist in nearly all organisms. Some Argonautes have been shown to bind single-stranded DNA guides and cleave target DNA (Swarts et al 2014). Gao, et al (2016) reported that NgAgo (Natronobacterium gregoryi Argonaute) is programmable with single-stranded DNA guides to be a “precise and efficient tool for genome editing in mammalian cells.”

In response to the Gao et al (2016) study, some have speculated that NgAgo may pose a challenge to proponents of the CRISPR/Cas9 genome editing system. In the CRISPR/Cas9 system, Cas9 requires that the guide RNA exhibit a specific structure for correct binding to occur. In the Argonaute system, however, the guides need not have any specific structure for binding. The CRISPR system is also limited by its PAM requirement: DNA can be cleaved only at sites in proximity to a PAM sequence (Garneau et al 2010). Argonautes, on the other hand, do not require the presence of any specific sequence on targets, and researchers suggest that these endonucleases have a broad targeting range and low tolerance for mismatches (Gao et al 2016).

Looking Forward

Our previous posts discussed recent breakthroughs in genomics, and provided some recommendations on how to address the difficult questions that these scientific advancements raise. Here, we have attempted to provide our readers with an overview of the technologies in question to emphasize the importance of context as the scientific and political communities grapple with regulating these gene-editing technologies.

ABOUT THE AUTHORS

Sam Wu, BS is a research associate at the Pellegrino Center for Clinical Bioethics at Georgetown University Medical Center.

 

Kevin T. FitzGerald, MDiv, PhD, PhD, Associate Professor, David Lauler Chair for Catholic Health Care Ethics

Kevin T. FitzGerald, SJ, PhD is a research associate professor at the Pellegrino Center for Clinical Bioethics, GUMC.
