The ability to precisely edit any part of an organism’s genome has long been sought by scientists, and today we are closer to that goal than ever before. With the discovery of the CRISPR Cas9 system (Clustered Regularly Interspaced Short Palindromic Repeats / CRISPR-Associated Protein 9), scientists are now able to efficiently knock out or knock in genes of interest. Given its ease of use, the RNA-directed CRISPR Cas9 genome-editing system offers a versatile platform and has the potential to supplant the more labor-intensive zinc finger and TALEN approaches.
The CRISPR Cas9 system was originally observed in bacteria and archaea as an adaptive microbial immune system that provides acquired immunity against foreign viruses and plasmids (1). CRISPR loci are observed in 40% of sequenced bacteria and 90% of sequenced archaea, and more than one locus may be present in a given bacterium (2). Invading foreign DNA is cleaved by the Cas nucleases, then captured and integrated into the CRISPR locus in the form of spacer sequences interspaced by conserved repeated sequences (Figure 1-a). The acquired spacers serve as templates to create short CRISPR RNAs (crRNAs), which form a complex with the trans-activating crRNA (tracrRNA); together they function as guiding strands to direct the Cas9 nuclease to the complementary invading DNA (3). Once bound, the Cas9 protein cleaves the “crRNA complementary” strand and the opposite strand through its HNH and RuvC1-like nuclease domains, respectively (Figure 1-b) (3).
Since its discovery, 45 different Cas protein families have been described in literature, each with different roles in the synthesis of crRNA, incorporation of new spacer sequences, and the cleavage of invading DNA (4). The CRISPR Cas system is generally divided into three categories, depending on the Cas protein sequence and structure: type I, II, and III (5). The CRISPR Cas9 system that is commonly used today for genome editing is a type II CRISPR Cas system adapted from Streptococcus pyogenes.
Our CRISPR Cas9 system was adapted from the Type II CRISPR System from S. pyogenes.
In the modern system, targeted genome editing using CRISPR Cas9 technology has two components: 1) an endonuclease; and 2) a short guide RNA (gRNA) (Figure 2). The endonuclease is the bacterial Cas9 nuclease protein from, for example, Streptococcus pyogenes (see Table 1 for a list of other CRISPR nucleases). The Cas9 nuclease possesses two DNA cleavage domains (the RuvC1 and HNH-like nuclease domains) that together cleave double-stranded DNA, making double strand breaks (DSBs) (4). The gRNA is an engineered single-stranded chimeric RNA, combining the scaffolding function of the bacterial tracrRNA with the specificity of the bacterial crRNA (6). The 20 nt at the 5’ end of the gRNA act as a homing device, recruiting the Cas9/gRNA complex to a specific DNA target site, directly upstream of a protospacer adjacent motif (PAM), through RNA-DNA base pairing. The PAM sequence differs between strains and types of CRISPR Cas proteins; the sequence for the S. pyogenes Cas9 is 5’-NGG (see Protospacer Adjacent Motif). The adapted CRISPR Cas9 system available today can, therefore, be directed towards any 5’-N20-NGG DNA sequence and create a precise double strand break (7). The DSB is then repaired by one of two universal repair mechanisms found in nearly all cell types and organisms: the non-homologous end-joining (NHEJ) pathway or the homology-directed repair (HDR) pathway.
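To make the 5’-N20-NGG targeting rule concrete, the following Python sketch scans both strands of a DNA string for 20-nt protospacers that are immediately followed by an NGG PAM. The helper names (reverse_complement, find_ngg_targets) are hypothetical and were written only for this illustration; the sketch does not score off-targets or consider sequence context, so it should be read as a starting point rather than a gRNA design tool.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_ngg_targets(seq: str, guide_len: int = 20):
    """Return (strand, start, protospacer, pam) for every N20-NGG site.

    'start' is a 0-based index into the strand that was scanned; for the
    '-' strand this is an index into the reverse complement of the input.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits

# Toy example: one site on the '+' strand, none on the '-' strand.
for hit in find_ngg_targets("A" * 20 + "TGG"):
    print(hit)  # ('+', 0, 'AAAAAAAAAAAAAAAAAAAA', 'TGG')
```

Scanning the reverse complement as well matters because a target on either DNA strand can be used, as long as the PAM lies immediately 3’ of the protospacer on that strand.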
The Non-Homologous End Joining (NHEJ) Repair Pathway
The NHEJ repair pathway is an error-prone repair mechanism used to repair double strand breaks in the absence of a suitable repair template. The NHEJ pathway attempts to ligate the cleaved ends of a DSB together. However, this process often results in insertion or deletion (InDel) mutations at the DSB site, causing frameshifts or introducing premature stop codons that permanently disrupt the open reading frame of the targeted gene (Figure 3). Although the InDel outcome of NHEJ remains largely random, scientists can maximize gene disruption by targeting the gRNA towards the 5’ end of the coding sequence (corresponding to the N-terminus of the protein); this makes it less likely that a frameshift mutation leaves a partially functional gene product. It is good practice to design the gRNA in the first or second exon and to avoid intronic regions of the targeted gene when exploiting the NHEJ repair pathway (7).
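The link between InDel size and gene disruption can be shown with a small Python sketch: an InDel whose net length change is not a multiple of three shifts the reading frame, which may expose a premature stop codon. The function names and the 15-nt toy coding sequence are made up for illustration and do not correspond to any real gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def classify_indel(net_length_change: int) -> str:
    """Classify an InDel by its net length change (insertions > 0, deletions < 0)."""
    if net_length_change == 0:
        return "no length change"
    return "in-frame" if net_length_change % 3 == 0 else "frameshift"

def first_stop_codon(cds: str):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None

# Hypothetical toy coding sequence: ATG CTG AGC GGA TAA
wild_type = "ATGCTGAGCGGATAA"
# Simulate a 1-bp deletion (the C at index 3), as NHEJ might produce.
mutant = wild_type[:3] + wild_type[4:]

print(classify_indel(len(mutant) - len(wild_type)))  # frameshift
print(first_stop_codon(wild_type))                   # 12
print(first_stop_codon(mutant))                      # 3
```

In this toy case the single-base deletion moves the first stop codon from the end of the sequence to the second codon, which is why early-exon targets are usually preferred for knockouts.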
Homology Directed Repair (HDR) with Cas9 Nuclease
In addition to NHEJ, cells are able to utilize a more precise repair mechanism known as the Homology Directed Repair (HDR) pathway. This repair mechanism can be exploited to introduce specific nucleotide modifications to genomic DNA (8). In this method, a DNA repair template that has a high degree of homology to the sequence immediately upstream and downstream of the intended editing site is introduced into the cell along with the appropriate gRNA and Cas9 nuclease. In the presence of this suitable template, the less error-prone HDR mechanism can faithfully introduce the desired changes at the Cas9-induced DSB site through recombination (Figure 4). When designing the repair template, ensure that it does not contain the target sequence immediately followed by the PAM sequence: the PAM should be either excluded or mutated. This prevents the repair template from being cleaved by the same CRISPR Cas9 system.
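The repair template rule above can be checked programmatically. The following Python sketch reports whether a donor sequence still carries the gRNA target immediately followed by an NGG PAM on either strand. The function names and the example sequences are hypothetical and chosen only for illustration; in a coding region, any PAM mutation would additionally need to be checked for its effect on the encoded protein, which this sketch does not do.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def donor_is_recleavable(donor: str, protospacer: str) -> bool:
    """True if the donor contains the protospacer directly followed by NGG."""
    site = re.compile(re.escape(protospacer.upper()) + "[ACGT]GG")
    donor = donor.upper()
    return bool(site.search(donor) or site.search(reverse_complement(donor)))

# Hypothetical 20-nt target and two candidate donor fragments.
target = "GACGTTACCGGATCAAGCTA"
donor_with_pam = "TT" + target + "TGG" + "AA"     # PAM left intact
donor_mutated_pam = "TT" + target + "TGC" + "AA"  # PAM changed from TGG to TGC

print(donor_is_recleavable(donor_with_pam, target))     # True
print(donor_is_recleavable(donor_mutated_pam, target))  # False
```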
The CRISPR Cas9 system can be targeted towards any genomic region through the design of a gRNA; however, the specificity of the system depends on the protospacer adjacent motif (PAM) located immediately downstream of the target sequence (Figure 5) (3). In the native bacterial and archaeal immune system, the PAM recognition sequence activates the nuclease domains of Cas9 and thereby serves as a means of distinguishing self from non-self (i.e. it prevents the CRISPR loci from being targeted) (7).
The PAM recognition sequence differs depending on the species and the type of bacteria from which the Cas9 nuclease is derived. The most commonly used Type II CRISPR system uses the Cas9 nuclease from S. pyogenes. This particular nuclease recognizes 5'-NGG immediately 3’ of the sequence targeted by the gRNA. Other commercially available Cas9 nucleases may recognize other PAM sequences (Table 1).
Table 1: PAM Sequences from Different Species and Subtypes of Cas Nucleases
Species | Subtype | PAM Sequence |
S. pyogenes | II | 5'-NGG |
S. aureus | II-A | 5'-NNGRRT |
S. solfataricus | I-A1 | 5'-CCN |
S. solfataricus | I-A2 | 5'-TCN |
H. walsbyi | I-B | 5'-TTC |
E. coli | I-E | 5'-AWG |
E. coli | I-F | 5'-CC |
P. aeruginosa | I-F | 5'-CC |
S. thermophilus | II-A | 5'-NNAGAA |
S. agalactiae | II-A | 5'-NGG |
F. novicida | V-A | 5'-TTTN-3' |
Acidaminococcus sp. | V-A | 5'-TTTN-3' |
Source: adapted from Protospacer Recognition Motifs - Mixed Identities and Functional Diversity. 5, s.l. : RNA Biology, May 2013, Vol. 10, pp. 891-899.
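The PAM sequences in Table 1 use IUPAC degenerate nucleotide codes (for example N = any base, R = A or G, W = A or T). As a minimal sketch, the Python snippet below converts such a PAM into a regular expression and tests candidate sites against it. The function names are hypothetical helpers for this illustration; only the IUPAC code meanings are standard.

```python
import re

# Standard IUPAC nucleotide codes (subset sufficient for the PAMs in Table 1).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def pam_to_regex(pam: str):
    """Compile an IUPAC PAM string such as 'NNGRRT' into a regex."""
    return re.compile("".join(IUPAC[base] for base in pam.upper()))

def matches_pam(site: str, pam: str) -> bool:
    """True if 'site' (same length as the PAM) satisfies the PAM pattern."""
    return pam_to_regex(pam).fullmatch(site.upper()) is not None

print(matches_pam("TGG", "NGG"))        # True  (S. pyogenes)
print(matches_pam("ACGAAT", "NNGRRT"))  # True  (S. aureus)
print(matches_pam("ACGCAT", "NNGRRT"))  # False (C is not a purine)
```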
Our CRISPR Cas9 system utilizes the Type II CRISPR System from S. pyogenes, which recognizes the 5'-NGG PAM sequence.
An important consideration when using CRISPR Cas9 as a genome editing tool is the extent to which off-target cleavage occurs. Off-target events can lead to InDel mutations at unintended sites and therefore compromise the phenotypic results obtained. Several studies have assessed the specificity of the CRISPR Cas9 system and have shown that, generally, mismatches towards the 5’ end of the 20 base pair targeting region of the gRNA are tolerated (6)(9)(10)(11)(12). However, it is hard to predict how these mismatches affect off-target activity. There have been instances reported where a mismatch at the 5’ end of the targeting region of the gRNA was not tolerated, and other instances where mismatches at the 3’ end of the targeting region were tolerated (13). Overall, off-target effects introduced by the CRISPR Cas9 system are variable in frequency and challenging to predict (7).
Several methods are available to minimize these off-target effects. One is to carefully design gRNA sequences for the intended gene (see guidelines for designing gRNA sequences). Another commonly applied method is the use of a variant form of the Cas9 endonuclease. This variant harbours a mutation which deactivates one of the cleavage domains of wild type Cas9 (Cas9 Nickase) and will be discussed in detail shortly. A final method is to shorten the 20 base pair targeting sequence of the gRNA. These truncated gRNAs (tru-gRNAs) have been shown to function as effectively as full-length gRNAs in directing the Cas9 nuclease to the intended target site. Tru-gRNAs exhibit decreased mutagenic effects at off-target sequences and are more sensitive to single or double mismatches (14).
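When comparing a gRNA against a candidate off-target site, it helps to record where each mismatch falls relative to the PAM. The Python sketch below lists mismatch positions (1 = 5’, PAM-distal end; 20 = 3’, PAM-proximal end) and splits them into two groups. The 12-nt size of the PAM-proximal window is an assumption made for this illustration, not a value taken from the studies cited above, and the function name is hypothetical. Given how variable mismatch tolerance is, the output should be treated as descriptive rather than predictive.

```python
def mismatch_profile(guide: str, site: str, proximal_len: int = 12):
    """Group mismatch positions into PAM-distal and PAM-proximal sets.

    Positions are 1-based from the 5' end of the guide. 'proximal_len' is
    an illustrative assumption for the size of the PAM-proximal window.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    positions = [
        i + 1
        for i, (g, s) in enumerate(zip(guide.upper(), site.upper()))
        if g != s
    ]
    boundary = len(guide) - proximal_len
    return {
        "distal": [p for p in positions if p <= boundary],
        "proximal": [p for p in positions if p > boundary],
    }

guide = "GACGTTACCGGATCAAGCTA"           # hypothetical 20-nt targeting region
# Candidate site differing from the guide at positions 2 and 19.
site = guide[:1] + "T" + guide[2:18] + "A" + guide[19:]

print(mismatch_profile(guide, site))  # {'distal': [2], 'proximal': [19]}
```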
Our Genome-wide sgRNA Libraries were carefully designed to minimize their off-target effects.
There are two variations on the system introduced above that are also commonly available today: the Cas9 Nickase and the Cas9 Double Mutant. Each of these variants has its own benefits and applications.
Cas9 Nickase
One concern with the current CRISPR Cas9 technology is the potential off-target effects that could arise from Cas9 nuclease activity. To reduce the off-target mutagenic effects of this system, the Cas9 Nickase was developed. The Cas9 Nickase is a mutant form of Cas9 with either the D10A or H840A mutation in its RuvC1 or HNH-like nuclease domain, respectively (4). This mutant generates a single-stranded nick instead of a double strand break at the target site (Figure 6). Since a single-stranded break (or nick) is normally repaired quickly and with high fidelity using the intact complementary DNA strand as the repair template, off-target effects of the Cas9 Nickase are minimized.
To utilize Cas9 Nickase for genome editing, two gRNAs instead of one are required. The two gRNAs are designed on opposite DNA strands but in close proximity, to ensure that a DSB is induced once both strands are nicked by the Cas9 Nickase (7). This paired Cas9 Nickase approach reduces off-target effects because the two gRNAs need to work together to produce a DSB (Figure 7). Once the DSB is created, either the NHEJ or HDR pathway will be activated to complete the genome editing process.
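The two conditions for a paired-nickase design (opposite strands, close proximity) can be expressed as a simple check. In the Python sketch below, the NickSite class, the function name, and the default max_offset of 100 bp are all hypothetical placeholders chosen for illustration; the text above does not specify a distance limit, so an appropriate value should be taken from the relevant design guidelines.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NickSite:
    strand: str    # "+" or "-": the strand nicked by this gRNA/nickase complex
    position: int  # nick coordinate on the reference sequence, in bp

def is_valid_nickase_pair(a: NickSite, b: NickSite, max_offset: int = 100) -> bool:
    """True if the two nicks are on opposite strands and close together.

    'max_offset' is an illustrative placeholder, not a recommended value.
    """
    on_opposite_strands = a.strand != b.strand
    close_together = abs(a.position - b.position) <= max_offset
    return on_opposite_strands and close_together

print(is_valid_nickase_pair(NickSite("+", 100), NickSite("-", 140)))  # True
print(is_valid_nickase_pair(NickSite("+", 100), NickSite("+", 140)))  # False
print(is_valid_nickase_pair(NickSite("+", 100), NickSite("-", 400)))  # False
```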
The Cas9 Nickase can also be used to create nucleotide modifications by homologous recombination if a repair template DNA containing the desired modification is introduced along with the gRNA and Cas9 nickase. Table 2 summarizes the recommended applications of Cas9 nuclease and nickase:
Table 2: Cas9 Nickase and Nuclease Usage in Gene Disruption and Specific Gene Modification
Enzyme | gRNA Requirement | Gene Disruption (NHEJ) | Specific Gene Modification (HDR) |
Cas9 Nuclease | 1 gRNA per Target | ••••• | •••• |
Cas9 Nickase | 1 gRNA per Target | •• | •• |
Cas9 Nickase | 2 gRNA per Target | ••••• | •••• |
Source: Adapted from CRISPR-Cas system for editing, regulation and targeting genomes. Sander, Jeffry D and Joung, J Keith. 4, s.l. : Nature Biotechnology, March 2, 2014, Vol. 32.
Cas9 Double Mutant
The Cas9 Double Mutant, or Null Mutant, is created by mutating both cleavage domains of the wild type Cas9. Such a Cas9 protein retains its ability to bind to genomic DNA through gRNA:genomic DNA base pairing. Unlike the permanent gene disruption that can be achieved by the Cas9 Nuclease and Cas9 Nickase, the Cas9 Null mutant does not introduce any genome modifications. By fusing the Cas9 double mutant with other effector proteins, the CRISPR Cas9 system can expand its role to gene regulation, genome imaging, chromatin or DNA modifications, and chromatin immunoprecipitation (Figure 8) (15)(16)(17)(18)(19).
To address the limitations of spCas9, namely its large size and G-rich PAM sequence, several other CRISPR enzymes have been proposed and studied as potential alternatives, most notably Cpf1 and saCas9. These nucleases have unique advantages which may open up new possibilities for genome editing. Table 3 summarizes some key differences and similarities between spCas9, saCas9, and Cpf1.
Table 3: Comparison of spCas9, saCas9, and Cpf1 nucleases
Property | spCas9 | saCas9 | Cpf1 |
Gene Length | ~4.1 kb | ~3.3 kb | ~3.8 kb |
Cleavage Type | Blunt ends | Blunt ends | 5' overhangs |
Repeated Cleavage | No | No | Yes |
Cleavage Site | Within recognition sequence | Within recognition sequence | Downstream of recognition sequence |
PAM Sequence | 5’-NGG-3’ | 5’-NNGRRT-3’ | 5’-TTN-3’ or 5’-TTTN-3’ |
RNA Required | tracrRNA + crRNA | tracrRNA + crRNA | crRNA |
Cpf1 Nuclease
A new CRISPR nuclease, Cpf1, was discovered in late 2015 by Feng Zhang’s group at MIT. Of 16 candidate Cpf1 proteins, two were found to have the most potential as highly specific genome editing tools in mammalian cells: asCpf1 and lb2Cpf1 (20). These nucleases were found to have on-target cleavage efficiencies in human cells comparable with that of the commonly-used spCas9 (21).
Cpf1 allows for new targeting possibilities, as it recognizes T-rich PAM sites. This significantly expands the breadth of possible genomic targets compared to the G-rich PAM requirements of Cas9. While Cas9 may be the preferred nuclease to target G-rich areas, Cpf1 can be used to target T-rich areas. In mammalian cells, Cpf1 can target such difficult stretches of DNA as scaffold/matrix-attachment regions and centromere DNA (22). Using Cpf1 would also enable the exploration of genomes that are dominated by AT-rich domains, such as that of the malaria-causing parasite Plasmodium falciparum. As well, Cpf1 may be a useful alternative to Cas9 in cases where expression of Cas9 is toxic, as has been seen in Corynebacterium glutamicum and several species of Cyanobacteria (23)(24).
Cpf1 requires a shorter guide RNA to operate. While Cas9 requires the presence of a tracrRNA to process crRNA, Cpf1 can process the pre-crRNA by itself (25). This is of particular interest to biotech, since in comparison to the ~100 nt tracrRNA/crRNA hybrids used with Cas9, Cpf1 can be targeted with only a ~42 nt crRNA. This reduces the size of the engineered guide RNA by more than half, while also simplifying the methods and costs associated with synthesis and (if desired) chemical modification (26). This self-processing capability also makes Cpf1 an excellent choice for multiplexed genome editing, as more guide RNAs may fit in one vector (27).
Another striking difference is that Cas9 generates blunt ends after cleavage while Cpf1 leaves sticky 5’ overhangs that may be used for directional cloning. Other methods which produce sticky ends, such as restriction enzyme digestion, have shorter recognition sequences than Cpf1 and thus cut less specifically. This property of Cpf1 has been exploited to perform highly specific DNA assembly in vitro (28). Using this method in vivo, scientists could perform DNA knock-ins into non-dividing cells such as neurons, for which genome editing via HDR is particularly challenging.
Cpf1 cleaves DNA 18-23 bp downstream from the PAM site, resulting in no disruption to the recognition sequence after NHEJ repair of the double-strand DNA break. As a result, Cpf1 enables multiple rounds of DNA cleavage, and an increased opportunity for the desired genomic editing to occur. By contrast, since Cas9 cuts only 3 bp upstream of the PAM site (29), the NHEJ pathway results in indel mutations which destroy the recognition sequence, thereby preventing further rounds of cutting. In theory, repeated rounds of DNA cleavage should cause Cpf1 to result in increased silencing of the targeted gene, and, in the case of knock-in experiments, a greater chance for homology directed repair to occur.
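The difference in cut position can be worked through with simple coordinate arithmetic. The Python sketch below assumes a target on the + strand, 0-based coordinates, a 3-nt Cas9 PAM that follows the protospacer, and a 4-nt Cpf1 PAM (TTTN) that precedes it; the cut offsets (3 bp upstream of the PAM for Cas9, 18 and 23 bp downstream of the PAM for Cpf1) are taken from the paragraph above. The function names are hypothetical, and the coordinate conventions are assumptions made for this illustration.

```python
def cas9_cut_index(pam_start: int, upstream: int = 3) -> int:
    """Index of the first base after the blunt Cas9 cut.

    The cut lies 'upstream' bp 5' of the PAM, inside the protospacer.
    """
    return pam_start - upstream

def cpf1_cut_indices(pam_start: int, pam_len: int = 4, offsets=(18, 23)):
    """Indices of the two staggered Cpf1 cuts, counted from the end of the PAM."""
    pam_end = pam_start + pam_len
    return tuple(pam_end + offset for offset in offsets)

# Cas9: 20-nt protospacer at [0, 20), PAM starting at index 20.
print(cas9_cut_index(20))        # 17, i.e. 3 bp from the PAM

# Cpf1: PAM at [0, 4), protospacer starting at index 4.
near, far = cpf1_cut_indices(0)
print(near, far, far - near)     # 22 27 5
```

Under these assumptions the Cas9 cut sits next to the PAM, where small InDels readily disturb the sequence the gRNA must recognize, while the Cpf1 cuts fall at the far end of the target, staggered by a few bases.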
Recent trials with Cpf1-mediated genome editing are promising. A study by Zhang et al. showed that Cpf1-mediated genome editing could be used to correct muscular dystrophy gene mutations in induced pluripotent stem cells and in mice. This resulted in restored gene expression, and the correction of symptoms in mice (30). Cpf1 has also been used successfully in a protein-sgRNA complex to mutate genes in soybean and tobacco plants (31).
saCas9 Nuclease
While S. pyogenes Cas9 (spCas9) is the most commonly used CRISPR nuclease, recent attention has turned to a miniature Cas9 nuclease isolated from S. aureus (saCas9). This small nuclease has significant potential to change the biomedical research industry by making CRISPR-based gene editing in living organisms more feasible. SaCas9 and spCas9 are able to cleave eukaryotic DNA in vivo with comparable efficiency (32). However, saCas9 has several characteristics which make it more useful for certain applications.
saCas9 is approximately 1 kb smaller than spCas9, allowing it to more effectively package into the smaller capacity adeno-associated viruses (AAVs) leaving extra space for regulatory elements, crRNAs, and tracrRNAs. By including the Cas9 and sgRNA in one construct (called an All-In-One vector), Cas9 expression and sgRNA targeting can be accomplished in one transfection/transduction step. AAV is a preferred method of gene delivery for in vivo studies due to its low immunogenicity and ability to selectively infect certain tissue types. For more information on the advantages and disadvantages of using AAV for gene expression, see our Introduction to Adeno-Associated Virus.
As well, saCas9 opens up new possibilities for genome editing due to its different targeting capabilities compared to spCas9. spCas9 recognizes a PAM sequence of 5’-NGG-3’, while saCas9 recognizes 5’-NNGRRT-3’. A greater variety in PAM sequences available for use means an increased number of loci are available for genome editing. This is of particular benefit when precise editing of a gene using homology directed repair is required, as HDR is most effective when recognition sequences are in very close proximity to the region to be edited.
One disadvantage of saCas9’s longer PAM is that this sequence occurs less frequently in the genome than spCas9’s PAM sequence, limiting its potential utility. However, saCas9’s longer PAM sequence should help prevent off-target cleavage due to its greater specificity.
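The frequency argument can be made quantitative under a simplifying assumption: if the four bases were equally likely and independent at every position, the chance that a given position matches a PAM is the product of the fractions of bases each PAM letter allows. The Python sketch below computes this; the function name is hypothetical, and real genomes deviate from uniform base composition, so the numbers are only a rough guide.

```python
from fractions import Fraction

# Number of bases permitted by each IUPAC code.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "W": 2, "S": 2, "K": 2, "M": 2,
    "N": 4,
}

def pam_probability(pam: str) -> Fraction:
    """Chance that a random site matches the PAM, assuming uniform bases."""
    probability = Fraction(1)
    for base in pam.upper():
        probability *= Fraction(DEGENERACY[base], 4)
    return probability

sp = pam_probability("NGG")      # spCas9
sa = pam_probability("NNGRRT")   # saCas9
print(sp, sa, sp / sa)           # 1/16 1/64 4
```

Under this assumption, an saCas9 PAM is expected about once every 64 positions per strand, compared with once every 16 positions for spCas9, a four-fold difference.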
A study done in 2015 by MIT’s Zhang lab investigated saCas9 performance in vivo. They injected AAVs carrying saCas9 into mice to disrupt the PCSK9 gene, which is linked to familial hypercholesterolemia. This resulted in >40% gene modification in liver tissue after 1 week, with no signs of toxicity 4 weeks after injection (33). Other recent work has been done to develop a variant saCas9 with relaxed PAM-recognition specificity using molecular evolution. This variant allows for a larger range of potential editing targets than wild type saCas9, while retaining the advantages its small size conveys (34).
There are two common protein-based genome editing tools: 1) Zinc-finger Nucleases (ZFNs), and 2) Transcription Activator-Like Effector Nucleases (TALENs). Table 4 compares and contrasts the two protein-based methods with the RNA-based CRISPR Cas9 System (35). You can also read more about the advantages and disadvantages of these three gene editing systems in our Gene Silencing Methods: CRISPR vs. RNAi vs. TALENs knowledge base article.
Table 4: Comparison of current genome editing technologies
Feature | ZFNs | TALENs | CRISPR Cas9 |
DNA Binding Domain | Cys2-His2 DNA Binding Protein | Conserved Amino Acid Repeated Motif | Single Stranded gRNA |
DNA Cleavage Domain | FokI Restriction Endonuclease | FokI Restriction Endonuclease | Cas9 Endonuclease |
Guiding Mechanism | Protein Guided | Protein Guided | RNA Guided |
Ease of Design | • | ••• | ••••• |
Minimized off-target Effects | ••••• | ••••• | ••• |
Multiplexibility | •• | •• | ••••• |
Source: Adapted from CRISPR/Cas9 for genome editing: progress, implications and challenges. Zhang, Feng, Wen, Yan and Guo, Xiong. 1, s.l. : Human Molecular Genetics, March 17, 2014, Vol. 23.; CRISPR-Cas system for editing, regulation and targeting genomes. Sander, Jeffry D and Joung, J Keith. 4, s.l. : Nature Biotechnology, March 2, 2014, Vol. 32.; and ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Gaj, Thomas, Gersbach, Charles A and Barbas III, Carlos F. 7, s.l. : Cell, July 2013, Vol. 31, pp. 397-406.
CRISPR Cas9 was first shown to work as a genome editing tool in human cell culture in 2012 (36). It has since been used in a wide range of organisms including baker's yeast (37), plants (38), zebrafish (39), fruit flies (40), nematodes (41), and mice (42).
CRISPR Cas9 has the potential to become a therapeutic agent in curing genetic disorders and viral infection. One research group used a gRNA to target the long-terminal repeat promoter of the HIV-1 genome to significantly repress its expression in infected human cells (43). With rapid growth in research being conducted to use and improve the CRISPR Cas9 system, it won’t be long until a viable gene therapeutic agent is made available. In fact, companies such as Editas Medicine and CRISPR Therapeutics have already been established with the goal of developing gene therapies using the CRISPR Cas9 system.
For more information on our experiences with the CRISPR system, read our CRISPR Cas9 Knockout and Knock-in Case Studies.