A breakdown of CRISPR-mediated genome editing

This article will go over complex ideas from repair pathways to Cas 9 variants. These are the things you need to understand if you want to dive deep into this area.

Aditya M
students x students

--

In this article, I will go in-depth about many technical concepts in CRISPR-mediated genome editing.

Table of Contents

DNA, RNA, Amino Acids, Peptides, Enzymes;
Gene regulation, transcription, plasmids, translation;
Double Stranded Breaks, Single Stranded Breaks;
Deletions, Insertions, Substitutions;
Non-homologous end joining; homology directed repair;
CRISPR-mediated genome editing;
Prime editing;
Base editing;
CRISPRa/i;
Cas 9 Variants: SpCas9, SaCas9, eSpCas9, HypaCas9, xCas9, Cas3, Cpf1, Cas13;
Transfection/Transformation Protocols: Electroporation, Micro-injection, Gene guns, Lipofection, Viral Vectors

DNA

DNA, or deoxyribonucleic acid, is a molecule containing the genetic code that makes everyone unique. Genetic code influences your phenotype, the physical traits that make you. This includes hair colour, chances of getting a disease, and limits in physical capability. The power of gene editing grants us the ability to prevent or cure diseases, or to increase heat resistance for plants in areas where climate change is taking a toll.

There are four bases that make up DNA: adenine, cytosine, guanine, and thymine. Adenine (A) pairs with Thymine (T) and Cytosine (C) pairs with Guanine (G).

DNA is a double-stranded molecule built with nucleotides. Nucleotides have a phosphate group, a sugar group, and a base. Within each strand, nucleotides are joined by phosphodiester bonds, and the two strands are held together by hydrogen bonds between paired bases. This complex forms a double helix structure.

The direction of nucleic acids (molecules made up of repeating nucleotides) is described using prime ends (5' and 3').

Link

The carbons in each nucleotide's sugar are labelled 1' through 5' as shown in the diagram to the right. When the strand is extended on both sides, the end of the strand in the direction where the 3' carbon points is referred to as the 3' end, and the end in the direction where the 5' carbon points is referred to as the 5' end.
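To make base pairing and strand direction concrete, here is a minimal Python sketch. The function name `reverse_complement` and the example sequence are made up for illustration.

```python
# Pairing rules for DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    # The two strands run in opposite directions, so we swap each base
    # for its partner and read the result backwards.
    return "".join(PAIRS[base] for base in reversed(strand))

top = "ATGCCG"                     # read 5' -> 3'
bottom = reverse_complement(top)   # the partner strand, also read 5' -> 3'
print(bottom)                      # CGGCAT
```

Applying the function twice gives back the original strand, which is a quick way to check it.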

RNA

RNA stands for ribonucleic acid and is similar to DNA.

Rather than having thymine bases, RNA has uracil (U) bases. Additionally, RNA nucleotides have a ribose sugar group while DNA has a deoxyribose sugar group. Ribose has a hydroxyl (OH) group on its 2' carbon, while deoxyribose has only a hydrogen there.

Thymine contains a methyl (CH3) group at carbon 5 of its ring, whereas uracil has a hydrogen (H) atom at that position.

Another major difference is that DNA usually forms a double helix, while RNA is usually single-stranded.

Amino acids and Proteins

Amino acids are molecules that are the building blocks of proteins. Proteins do all the actions needed to be done within and outside cells for an organism to live. Genetic diseases are caused by malfunctioning or unregulated expression of proteins. So by gene engineering, many diseases can be cured.

There are 20 amino acids classified into two groups: (1) essential amino acids and (2) nonessential amino acids.

Essential amino acids cannot be made by the body. So they must come from food. The 9 essential amino acids are histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine.

Nonessential amino acids can be made by the body. The nonessential amino acids are alanine, arginine, asparagine, aspartic acid, cysteine, glutamic acid, glutamine, glycine, proline, serine, and tyrosine.

Some of these nonessential amino acids are called conditionally essential, because the body may not make enough of them during illness or stress.

The diagram below shows the structure of an amino acid. R represents a side chain which makes each amino acid unique. Every amino acid has a carboxyl group and an amino group. The connecting carbon is called alpha-carbon.

https://www.astrochem.org/sci_img/Amino_Acid_Structure.jpg

Peptides

A chain of amino acids of any length. Peptides are usually categorized in 3 ways, based on the number of amino acids they contain.

link

Oligopeptides

Peptides that have fewer than 20 amino acids.

Polypeptides

Peptides that typically have between 20 and 50 amino acids.

Dipeptides, tripeptides, and tetrapeptides

Respectively… have two, three, and four amino acids.

Enzymes

Proteins that speed up reactions or chemical changes.

Endonuclease

An enzyme that cuts the phosphodiester bond (the bond between adjacent nucleotides) within a nucleic acid.

Genes

Protein-coding sequences make up only about 1 to 2% of the human DNA code. Genes contain all the genetic information to make the proteins that do the functions of life.

Making a protein takes two steps: transcription and translation.

Transcription

The process of synthesizing messenger RNA (mRNA) based on genes. An enzyme called RNA polymerase does the synthesis. mRNA travels outside the nucleus to give information to ribosomes (cell organelles that make proteins).
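As a rough sketch of what transcription produces, the snippet below turns a DNA sequence into mRNA. It assumes the input is the coding strand (the strand whose sequence matches the mRNA), and the function name `transcribe` is only illustrative.

```python
def transcribe(coding_strand: str) -> str:
    """mRNA reads like the coding strand, with uracil (U) in place of thymine (T)."""
    return coding_strand.replace("T", "U")

print(transcribe("ATGGCCTAA"))  # AUGGCCUAA
```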

Gene regulation

Genes are regulated by keeping the process of transcription controlled. This is done through regulatory elements. What are the parts of the gene? Genes have a promoter, multiple exons (protein-coding sequences) and introns (non-coding sequences between the exons), regulatory elements, and a terminator.

Link

For transcription to take place, RNA polymerase, the enzyme responsible for synthesizing mRNA, binds to the promoter region. The promoter region is upstream of protein-coding sequences and is responsible for initiating or “telling” the RNA polymerase to transcribe “here”.

Promoters can be identified by the TATA box (which defines the direction of transcription and indicates which strand is to be read) and the transcription start site (the TSS is the position where RNA polymerase starts transcription). Promoters can be specific to tissue types and to particular RNA polymerases.

Once initiated, the RNA polymerase synthesizes the mRNA. It knows that it is finished at the terminator region following protein-coding sequences.

link

How is the gene regulated? How can the making of the protein be controlled?

Gene expression is regulated through the regulatory region, which is upstream or downstream of the promoter region. Three major elements in the region include the Enhancer, the Silencer, and the Insulator. In order to increase or decrease gene expression, the locus bends so that regulatory elements can bind to the promoter region.

Enhancer: An activator protein binds to the enhancer. Then the complex binds to the promoter which informs RNA polymerases to increase gene transcription.

Silencer: A repressor protein binds to the silencer. After binding, the RNA polymerase is inhibited. So the gene is shut off.

Insulator: This element acts as a buffer between the promoter and the other regulatory elements. After binding, other elements are blocked from binding to the promoter.

To avoid leaky transcription (transcription when it is not supposed to occur) synthetic promoters can be made in labs and then used.

Plasmid

Plasmids are genetic structures in a cell that can replicate independently of the chromosomes (they are usually found in prokaryotes). They are not a part of the chromosomal DNA and are usually circular. Compared to chromosomal DNA, they are smaller and not essential (they give bacteria perks like antibiotic resistance). An entire genome is made up of millions or billions of base pairs, while plasmids typically contain only a few thousand to a few hundred thousand.

Plasmids can be easily engineered as they are small, stable, and easy to manipulate. Plasmids created in a lab are called vectors.

Plasmid diagram

All plasmids contain an origin of replication (ORI). It tells the plasmid where to begin replication. They often contain genes that are advantageous for survival. One of the most common naturally occurring genes is the antibiotic resistance gene.

Scientists can engineer plasmids to insert genes (at blue) that they want to be expressed in a cell. To engineer a plasmid, scientists use restriction enzymes, which cut at restriction sites, and then replace the gene with the gene they want to express. Restriction sites are specific to restriction enzymes. Typically, plasmids have Multiple Cloning Sites (MCS), which contain multiple restriction sites, allowing different restriction enzymes to cut.
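Here is a small sketch of how a restriction enzyme finds and cuts its sites. It uses EcoRI, which recognizes GAATTC and cuts after the first G. The sequence is made up, and for simplicity the code treats it as a linear stretch of DNA rather than a circular plasmid; the function names are illustrative.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between the G and the AATTC

def find_sites(sequence: str, site: str) -> list[int]:
    """Return the start index of every occurrence of `site`."""
    hits = []
    i = sequence.find(site)
    while i != -1:
        hits.append(i)
        i = sequence.find(site, i + 1)
    return hits

def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear sequence at every site and return the fragments."""
    cut_points = [i + cut_offset for i in find_sites(sequence, site)]
    edges = [0] + cut_points + [len(sequence)]
    return [sequence[a:b] for a, b in zip(edges, edges[1:])]

region = "TTGAATTCAAGGCCGAATTCTT"   # made-up sequence with two EcoRI sites
print(find_sites(region, ECORI_SITE))                # [2, 14]
print(digest(region, ECORI_SITE, ECORI_CUT_OFFSET))  # ['TTG', 'AATTCAAGGCCG', 'AATTCTT']
```

The middle fragment is the piece that would be swapped out for the gene of interest.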

Translation

Translation is the process of translating mRNA into peptides or proteins. Ribosomes are organelles responsible for this process. Ribosomes have two parts: a small subunit and a large subunit.

mRNA diagram link

mRNA contains a methylated Cap found on the 5' ends, a start codon, code for specific amino acids, a stop codon, and an end called the Poly-A tail. Codons translate to specific amino acids.

  1. The ribosome’s small subunit attaches to the cap and moves to the translation initiation site (first codon)
  2. tRNAs (transfer RNA) are structures that hold amino acids. They are found freely in the cytoplasm and contain anti-codons which are complementary to the codons on the mRNA. If anti-codons and codons match, the tRNA releases its amino acid to the growing polypeptide chain.
  3. The first mRNA codon is typically AUG. Attached to the end of the tRNA is the corresponding amino acid (Methionine which corresponds to the AUG codon)
  4. The ribosome’s large subunit now binds, creating the A-site and P-site. The A (aminoacyl) site is the location where the anticodon pairs up with the mRNA codon, ensuring that the correct amino acid is added to the growing polypeptide chain. The P (peptidyl) site is the location at which the amino acid is transferred from its tRNA to the growing polypeptide chain.
  5. The ribosome moves along the mRNA building a peptide chain. This step is known as elongation.
link
  6. When the stop codon is encountered in the A site, a release factor enters the site and translation is over. This process is called termination.
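The steps above can be sketched in a few lines of Python. This is a simplified model, not how a ribosome really works: it just looks for the first AUG, reads three bases at a time, and stops at a stop codon. The codon table is the standard genetic code written in a compact form, with amino acids as one-letter codes; the function name `translate` and the example mRNA are made up.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")            # initiation: find the start codon
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):   # elongation: one codon at a time
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":                  # termination: stop codon
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("GGAUGGCCUGGUAAGG"))  # MAW (methionine, alanine, tryptophan)
```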

Breaks

Double Strand Breaks (DSB)

When both strands of DNA are broken. Double-stranded breaks are natural in a cell. They can occur in DNA due to exposure to external factors (e.g. radiation and chemicals) and internal processes, like DNA replication.

However, unrepaired breaks can be fatal to a cell as they shatter the integrity of DNA. They can potentially cause cellular senescence, the activation of the p53 gene (which tells cells to commit suicide), cancer, translocations in DNA, and more. Double-stranded breaks are dangerous. So, it is not surprising to hear that cells have their own repair mechanisms for double-stranded breaks in DNA.

Single Strand Breaks (SSB)

When only one strand of DNA is broken.

Cells have their own repair mechanisms for DNA breaks. Repairing SSBs is easier, as there is a template for enzymes to repair from (the opposite strand).

Mutations

Mutations are changes in DNA sequences, and they happen all the time. Processes dealing with DNA like DNA replication (for mitosis), transcription, and DNA repair happen perfectly most of the time (but not all the time). Gene editing techniques like CRISPR mutate gene sequences and change protein expression (solving the problem at hand).

Deletions

A change in the DNA sequence where there is a removal of nucleotides. During gene therapy, deletions that make a gene unusable are called knockouts.

Insertions

A change in the DNA sequence where there is an insertion of nucleotides. During gene therapy, insertions are called knock-ins.

Substitutions

A change in the DNA sequence where a base pair is swapped out for another.

www.yourgenome.org
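The three mutation types can be shown on a short made-up sequence. The helper names below are only for illustration, and positions are counted from 0.

```python
def delete(seq: str, start: int, length: int) -> str:
    """Remove `length` nucleotides beginning at `start`."""
    return seq[:start] + seq[start + length:]

def insert(seq: str, pos: int, new: str) -> str:
    """Insert the nucleotides `new` before position `pos`."""
    return seq[:pos] + new + seq[pos:]

def substitute(seq: str, pos: int, base: str) -> str:
    """Swap the nucleotide at `pos` for `base`."""
    return seq[:pos] + base + seq[pos + 1:]

original = "ATGCCGTA"
print(delete(original, 3, 2))        # ATGGTA
print(insert(original, 3, "TT"))     # ATGTTCCGTA
print(substitute(original, 3, "A"))  # ATGACGTA
```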

Repair pathways

Most gene editing systems make intentional double-stranded breaks (using endonucleases) in DNA and rely on cellular repair pathways within a cell to make a mutation.

The key repair pathways that are used include non-homologous end joining and homology-directed repair.

Non-Homologous End Joining

NHEJ is the most common cellular repair pathway. It is a quick-fix technique that occurs in all of the phases of the cell cycle. However, it is pretty inaccurate as it causes random insertions and deletions (indels) at the double-stranded break site. Around 1 to 10 nucleotides are indeled (a word I made).

NHEJ can lead to extra or lost pieces. So, it does not restore the DNA to its pre-break sequence. The resulting gene is often unusable and turned off. Typically this repair pathway is taken advantage of in gene therapy to stop a gene from working.

This is how it works:

  1. After a double-stranded break, there is usually degradation of nucleotides from the ends of the strands, called a resection. Enzymes try to fix the breaks. However, they cause random indels at the site.
  2. Ku protein binds around the broken ends, leaving the ends of the strand exposed. It also engages the DNA-PK catalytic subunit (DNA-PKcs)
  3. DNA-PKcs recruits Artemis, a nuclease (an enzyme that degrades nucleic acids). Artemis trims any single-stranded tails at the break.
  4. Ligase IV joins the ends of the broken strands together, and the double-stranded break is fixed (with a lot of new base pairs and a lot of lost ones).
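One reason NHEJ so often leaves a gene unusable (a detail added here for context) is that codons are read three bases at a time, so an indel whose length is not a multiple of three shifts every codon after it. The sketch below is a toy model with explicit, hand-picked indels rather than random ones; `nhej_repair` and `shifts_reading_frame` are hypothetical helper names.

```python
def nhej_repair(seq: str, cut: int, n_deleted: int = 0, inserted: str = "") -> str:
    """Rejoin the two ends at `cut`, losing `n_deleted` bases and gaining `inserted`."""
    return seq[:cut] + inserted + seq[cut + n_deleted:]

def shifts_reading_frame(before: str, after: str) -> bool:
    """True when the length change is not a multiple of 3 (a frameshift)."""
    return (len(after) - len(before)) % 3 != 0

gene = "ATGGCCTGGAAATAA"                          # made-up coding sequence
two_lost = nhej_repair(gene, cut=6, n_deleted=2)
three_lost = nhej_repair(gene, cut=6, n_deleted=3)

print(two_lost, shifts_reading_frame(gene, two_lost))      # ATGGCCGAAATAA True
print(three_lost, shifts_reading_frame(gene, three_lost))  # ATGGCCAAATAA False
```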

Homology Directed Repair

Homology-directed repair uses a template (usually a sister chromatid) to repair DSBs. Because of this, HDR is naturally only active in the S and G2 phases, when sister chromatids are available in the cell. Although very accurate with far fewer indels, this repair pathway is inefficient and time-consuming. For gene therapy, a template is provided along with an endonuclease to make edits.

This is how it works:

  1. DNA on each side of the DSB is unwound and the 5' sections are cut away, creating a 3' overhang at each broken end. The tails can be hundreds of base pairs long.
  2. The DNA invades the template (usually a sister chromatid). The invasion binds the overhang to one strand of the template, creating a D-Loop. A DNA polymerase then extends the overhang by copying the template, so the DNA is repaired with edits based on the template provided.
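A very simplified way to picture HDR with a lab-provided template: the donor has two "homology arms" that match the DNA on either side of the break, and whatever sits between the arms gets copied in. The toy function below works on plain strings and ignores resection and strand invasion entirely; `hdr_repair`, the arm length, and the sequences are assumptions for illustration.

```python
def hdr_repair(genome: str, donor: str, arm_len: int) -> str:
    """Replace the region between two homology arms with the donor's payload."""
    left_arm = donor[:arm_len]
    payload = donor[arm_len:-arm_len]
    right_arm = donor[-arm_len:]

    left = genome.find(left_arm)
    if left == -1:
        return genome                      # no homology, nothing is copied
    right = genome.find(right_arm, left + arm_len)
    if right == -1:
        return genome
    return genome[:left + arm_len] + payload + genome[right:]

genome = "CC" + "AAAGGG" + "TTT" + "ACGTCC" + "GG"   # TTT is the part to replace
donor = "AAAGGG" + "CAT" + "ACGTCC"                  # arms around the new sequence
print(hdr_repair(genome, donor, arm_len=6))          # CCAAAGGGCATACGTCCGG
```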

Genetic Engineering Techniques

The original CRISPR-Cas 9 technique

In bacteria…
The CRISPR-Cas 9 method for editing genes was developed by Jennifer Doudna and Emmanuelle Charpentier, based on a system found in the genome of the bacterium S. pyogenes. Bacteria use a sequence in their genomes called a CRISPR array as a method to arm themselves against viruses.

Viruses are very harmful to bacteria. If a virus were to inject its DNA into a bacterium, replicate inside, and push out, the bacterium would go 💥. So, over millions of years, bacteria evolved to use a mechanism called CRISPR as an immune system to keep themselves safe.

CRISPR stands for clustered regularly interspaced short palindromic repeats. In simple words, it is an array of nucleotides with alternating repeated sequences and target-specific spacers (non-coding DNA).

The spacers contain the DNA of viruses from past infections. When a new virus invades, a new spacer sequence (containing viral genes) is added to the array (an immune memory).

These spacer fragments are placed between repeated palindromic sequences, which gives CRISPR its name. The CRISPR array acts like a vaccination. If the virus invades again the bacteria is ready to defend itself.

When a virus invades the bacterium, a protein complex called Cas1-Cas2 identifies invading DNA and cuts out a specific segment, called the protospacer, which is inserted at the front of the CRISPR array. Cas stands for CRISPR-associated protein; Cas 9 is the one responsible for making double-strand DNA breaks. The protospacer that is chosen is one that is followed by a PAM, or protospacer adjacent motif.
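The layout of a CRISPR array (repeat, spacer, repeat, spacer, ...) and the way new spacers are added at the front can be modelled with strings. The repeat below is a made-up, shortened sequence and the spacers are invented; the function names are only illustrative.

```python
REPEAT = "GTTTTAGA"   # made-up, shortened repeat sequence

def build_array(spacers: list[str]) -> str:
    """Lay out spacers with a repeat before, between, and after them."""
    return REPEAT + "".join(spacer + REPEAT for spacer in spacers)

def extract_spacers(array: str) -> list[str]:
    """Recover the spacers by splitting the array at every repeat."""
    return [piece for piece in array.split(REPEAT) if piece]

def acquire(array: str, protospacer: str) -> str:
    """Add a newly captured protospacer at the front of the array."""
    return build_array([protospacer] + extract_spacers(array))

array = build_array(["ACGTACGTAA", "TTGCAATCGG"])   # memories of two old infections
array = acquire(array, "CCCATGCATG")                # a new virus shows up
print(extract_spacers(array))  # ['CCCATGCATG', 'ACGTACGTAA', 'TTGCAATCGG']
```

Because new spacers go in at the front, the order of spacers works like a timeline of past infections, newest first.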

If a virus reinfects, bacteria use enzymes to transcribe long RNA sequences called pre-crRNA from the CRISPR array.

tracrRNA (trans-activating crRNA, a scaffold RNA) then links up with the pre-crRNA through base pairing. tracrRNA is crucial for processing the pre-crRNA as well as activating Cas 9.

The dual RNA complex is known as the gRNA, which complements the target gene. Cas 9 binds to the gRNA creating a complex called the ribonucleoprotein or RNP. The complex is then trimmed to a manageable size.

The RNP patrols the cell. If a virus re-infects, the complex recognizes the viral DNA and makes a double-stranded break. Double-stranded breaks destroy the virus, as viruses lack DNA repair mechanisms.

Cas 9 acts like scissors, and the gRNA acts like the hand that guides it to where it should cut.

The editing system…

CRISPR-Cas 9 genome editing is a gene editing method inspired by this design. An RNP complex is designed and sent into the target cell (the gRNA is designed and Cas 9 is taken from a bacterium), the guide RNA leads Cas 9 to the target gene, and Cas 9 makes a double-stranded break. Once the gene is cut, repair pathways are taken advantage of. For gene knockout, NHEJ is usually used. For accurate edits, HDR is used (so a template is inserted along with the RNP).

There are three main steps for CRISPR-Cas 9: recognition, cleavage, and repair.

Recognition
Cas 9 scans the DNA to find the PAM, or protospacer adjacent motif (which tells Cas 9 that it is OK to cut DNA). If found, Cas 9 checks the region that precedes the PAM, the protospacer. If the protospacer matches the gRNA, Cas 9 knows it is at the target gene and will make a double-stranded break.

What is the Protospacer Adjacent Motif?

Let’s assume a virus revisits a bacterium. How would the bacterium distinguish between the spacer in its own CRISPR array (which must be left alone for the bacterial genome to survive) and the protospacer in the invading viral genome? The PAM lets Cas 9 differentiate between what it should cut and what it should not cut.

The PAM is a short DNA sequence of around 2–6 base pairs in length that follows the protospacer in the viral genome. For instance, in the bacterium S. pyogenes, Cas 9 recognizes the PAM “GG” in the viral genome with one additional nucleotide before it (NGG). The PAM sequence tells the Cas 9 protein that it is “ok” for it to make a double-stranded break at that location. As a result, spacer sequences in the bacterial genome are protected, as they precede a constant repeated sequence GTT (that is not the PAM).

How can you guarantee that there is always a PAM in the viral DNA?

Cas 9 helps lead the Cas1-Cas2 complex to pieces of the viral genome that sit right before a PAM, so every stored spacer comes from a site that has one.

In the lab, scientists look for a PAM sequence that immediately follows the target location they want to edit. They use the target sequence to engineer a gRNA. If there is no PAM next to the target location, different Cas 9 proteins (that have different PAMs) are used.
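This search for "a target followed by a PAM" is easy to sketch in code. The example below looks for 20-base protospacers followed by NGG, the SpCas 9 PAM. It only scans the strand it is given (a real search would also scan the reverse complement), and the 20-base length, the function name `find_targets`, and the test sequence are assumptions for illustration.

```python
import re

def find_targets(dna: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every protospacer followed by NGG."""
    # The lookahead (?=...) lets overlapping candidates all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

dna = "CC" + "GATTACAGATTACAGATTAC" + "TGG" + "AA"   # made-up sequence
for start, protospacer, pam in find_targets(dna):
    guide = protospacer.replace("T", "U")            # the gRNA is RNA, so T becomes U
    print(start, protospacer, pam, guide)
```

If the list comes back empty, there is no usable NGG site on that strand, which is exactly the situation where a Cas 9 with a different PAM is needed.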

Cleavage
How does Cas 9 cut a gene? It has many parts (domains) that allow it to do so.

  • Recognition (REC 1–3) domains. REC1 is the largest and is responsible for binding with gRNA. REC3 senses mismatches between the target strand and the gRNA. Cas 9 is activated by REC domains.
  • PAM interacting (PI) domain. Cas 9 searches for the targeted sequence using its PAM interacting domain. The PAM speeds up the recognition process.
  • HNH and RuvC domains. If the gRNA matches the target DNA, then the HNH and RuvC domains cleave the target DNA like scissors. HNH cuts the target strand (the strand that pairs with the gRNA) and RuvC cuts the non-target strand.

After the double-stranded break, HDR or NHEJ takes place depending on the situation, mutating the genome.

This is how the original gene-editing method with CRISPR works.

Prime editing

The precise gene editing method using Homology Directed Repair is good but it is not the best. The technique is prone to off-target changes and uses double-stranded breaks which are dangerous, inefficient, and can cause indels.

The ideal approach for precise gene editing is to have a technique that does not use DSB but still enables insertions, deletions, or substitutions at an efficient rate. Prime editing does just this.

It uses single-stranded breaks, together with a modified guide RNA.

Prime editing has three main parts:

  1. Cas9 Nickase
  2. pegRNA
  3. Reverse Transcriptase

Cas9 Nickase (Cas 9n)
For prime editing, Cas 9 nicks (makes a single-strand break in) one strand of the target DNA. In the wild type Cas 9, the HNH domain cuts the target strand, and RuvC cuts the non-target strand.

By mutating one or both of these domains, we can make a single-strand break or no break at all. The RuvC mutant D10A can only cut with HNH, so it produces a nick on the target strand. The HNH mutant H840A can only cut with RuvC, so it generates a nick on the non-target strand. These variations are called Cas 9 nickases. Prime editing uses the H840A variant.

The use of Cas9n solves the two biggest problems that canonical Cas 9 faces: (1) off-target editing and (2) accidental indels from double-stranded breaks. With the Cas9 nickase, off-target edits are less likely to occur, as SSB repair pathways in the cell are very accurate.

pegRNA
pegRNA stands for prime editing guide RNA. It is a modified version of the traditional gRNA and is much more complicated.

Having the ability to stretch, pegRNA can link to both strands of the target DNA. This gives the Cas 9 complex much more stability and accuracy than the canonical CRISPR system. Rather than relying on Homology Directed Repair for somewhat accurate edits, a template is provided in the pegRNA.

Reverse Transcriptase
Viruses come in DNA and RNA forms. DNA viruses have an easier time taking over cells (like bacteria). But some RNA viruses… don’t work like that.

Their RNA needs to be converted to DNA. To do that, these viruses (called retroviruses) use an enzyme called reverse transcriptase, which does the opposite of transcription. They use this enzyme to convert their RNA to DNA to then take over a cell.

Prime editing uses the same enzyme.

How it works
Scientists engineer a pegRNA with the edits they need. Then, they attach it to a Cas9 nickase.

The pegRNA Cas 9 complex identifies the target DNA using its guide sequence.

The pegRNA pairs with both strands of the target DNA. One side, the traditional guide sequence, pairs with the target strand.

Next, Cas 9 nickase makes a single-stranded break on the non-target strand, a few bases upstream of the PAM. The nick frees a flap of DNA on the non-target strand.

The other side of the pegRNA, the primer binding site, pairs with this freed flap on the non-target strand (where the edit will be written). The paired flap acts as a primer (brown), which initiates the reverse transcriptase.

Reverse transcriptase, after initiation, begins constructing DNA onto the end of the flap. It uses the reverse transcription template (pink) in the pegRNA to build DNA. The edits that need to be made are found in the reverse transcription template.

There are now two “flaps” of DNA at the nick. One flap is the newly made, edited portion (green), and the other flap is the original, unedited DNA (blue).

These two flaps compete for the same space in the non-target strand. Cellular enzymes that trim DNA flaps tend to cut away the unedited flap. As a result, single-stranded break repair pathways can fit the edited portion into the strand.

But now, there is a mismatch between the edited strand and the unedited strand opposite it.

Cellular repair pathways try to fix this. They can change either strand so that the base pairs match. To make sure the edit is not reverted, the Cas 9 nickase is sent with a second guide RNA to nick the unedited strand.

The damage tricks repair pathways into thinking the unedited strand is unhealthy, so enzymes rebuild it using the edited strand as the template.
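To see how the two extra parts of a pegRNA relate to the DNA, here is a hedged sketch that builds the pegRNA's extension (reverse transcription template followed by primer binding site) from the non-target strand, which is the strand that carries the PAM. The nick position (3 bases upstream of the PAM) and the 13-base primer binding site are common values used here as assumptions, and the sequences and function names are made up.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(dna: str) -> str:
    return "".join(PAIRS[base] for base in reversed(dna))

def peg_extension(pam_strand: str, nick: int, edited_downstream: str,
                  pbs_len: int = 13) -> str:
    """Return the pegRNA extension, 5' -> 3', as RNA: RT template, then PBS."""
    flap = pam_strand[nick - pbs_len:nick]               # DNA just upstream of the nick
    pbs = reverse_complement(flap)                       # pairs with the freed flap
    rt_template = reverse_complement(edited_downstream)  # encodes the edited sequence
    return (rt_template + pbs).replace("T", "U")

pam_strand = "CC" + "GATTACAGATTACAGATTAC" + "TGG" + "AA"
pam_start = 22
nick = pam_start - 3             # assumed: the nick sits 3 bases before the PAM
original_downstream = pam_strand[nick:]    # TACTGGAA
edited_downstream = "TAGTGGAA"             # desired edit: C -> G at the third base

extension = peg_extension(pam_strand, nick, edited_downstream)
print(extension)
```

Reverse-transcribing the extension should give back the flap followed by the edited sequence, which is a handy sanity check on the design.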

Base editing

Base editing is another CRISPR-associated method that uses Cas9n. It fits the ideal approach.

The ideal approach for precise gene editing is to have a technique that does not use DSB but still enables insertions, deletions, or substitutions at an efficient rate.

Base editing is different as it is an approach that uses Cas9n, through which scientists can make minute changes, often substitutions of just individual bases in a genome.

The Cas9n is fused with a deaminase domain that chemically modifies the base. Why do we need a nickase? The use of a Cas 9 nickase has been shown to increase the efficiency of the approach while decreasing off-target effects. Although base editing is safe, deaminase enzymes have some limitations.

This is how it works.

  1. Cas 9 nickase searches for its associated PAM sequence and the DNA that matches its gRNA.
  2. Once the target is found, Cas 9 unravels the DNA.
  3. The deaminase domain makes the change to an individual base
  4. Cas 9n (D10A variant) nicks the target strand
  5. Repair pathways repair the target strand to match the induced change.

There are two types of DNA base editors: (1) the cytosine base editor and (2) the adenine base editor.

Adenine base editor (ABE)
ABE has a deaminase domain that converts adenosine (the nucleotide containing the adenine base) to inosine. Inosine is read by the cell as guanosine because the two are chemically identical, with the exception that guanosine has an amino group. After replication of the DNA strand, this results in an Adenine to Guanine base change.

To the left is guanosine and to the right is inosine.

Cytosine base editor (CBE)
CBE has a deaminase domain that converts cytidine (the nucleotide containing the cytosine base) to uridine (the nucleotide containing the uracil base). Uracil is read as thymine, so after replication of the DNA strand, this results in a Cytosine to Thymine base change.
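The net effect of the two editors can be sketched as a simple rewrite of the protospacer. Real deaminases only reach bases inside a small "editing window"; the window used below (positions 4 to 8, counted from the end far from the PAM) is an assumed, approximate value that differs between editors, and `base_edit` is a hypothetical helper.

```python
# What each editor ends up changing after repair and replication.
EDITS = {"CBE": ("C", "T"), "ABE": ("A", "G")}

def base_edit(protospacer: str, editor: str, window: tuple[int, int] = (4, 8)) -> str:
    """Convert every editable base inside the window (1-based, inclusive)."""
    source, result = EDITS[editor]
    first, last = window
    return "".join(
        result if base == source and first <= position <= last else base
        for position, base in enumerate(protospacer, start=1)
    )

site = "GATCCAGATTACAGATTACA"    # made-up protospacer
print(base_edit(site, "CBE"))    # GATTTAGATTACAGATTACA
print(base_edit(site, "ABE"))    # GATCCGGGTTACAGATTACA
```

Notice that every C (or A) in the window changes, not just the one you care about. These extra "bystander" edits are one of the limitations of deaminases.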

CRISPRa and CRISPRi

Every cell in our body has the same DNA. What makes cells unique are their expression levels of genes. Sometimes, diseases are caused by the expression levels of a gene, not just the arrangement of bases.

A slight tweak in the CRISPR Cas 9 complex can be used to increase gene expression with CRISPR activation (CRISPRa) or to decrease gene expression (gene knockdown) with CRISPR interference (CRISPRi).

Both of these techniques use dCas 9, or dead Cas 9. Dead Cas 9 has both of its endonuclease domains mutated so that it cannot cut DNA strands. However, it can still recognize and guide things to the target DNA.

The dCas9 gRNA complex is fused to inhibitor or activator proteins (effectors). These are like regulatory elements upstream or downstream of a promoter region in a gene. The dCas9 complex takes itself to the target gene’s promoter region. The effectors act on the promoter region, either activating or inhibiting the downstream gene.

For example, dCas 9 fused to the histone acetyltransferase p300 is able to specifically acetylate histones (DNA is spooled around histone proteins) near the target DNA. This activates gene expression.

When dCas9 is fused to the demethylase LSD1, it can demethylate histones near the target DNA sequence.

Methylation is the process of adding methyl groups. When methyl groups are added to the DNA of a gene's promoter, expression levels generally go down. Demethylation does the opposite. Acetylation of histone proteins works in the other direction. The more acetyl groups there are, the less tightly DNA is spooled around the proteins, and the gene is expressed more.

The two best-known dCas 9 fusions for methylation are Tet1, which demethylates DNA and DNMT3A which methylates DNA.

While the traditional CRISPR method makes permanent changes in DNA, CRISPRa and CRISPRi change gene expression. What makes this system incredible is that changes in gene expression made are reversible. In treatment, this creates a sense of safety. And for drug research, this technology lets scientists understand more about specific gene functions in cells.

If you want to learn more about the applications of altering gene expression, check out this article: https://medium.com/@adityamahes/endometriosis-is-painful-but-the-cure-isnt-out-of-reach-our-idea-to-solve-the-issue-c013cc7788bc, where Arjun Mahes, Mehr Bains, Joshua Dendy, Pedro Carrión, and I outline a solution to Endometriosis using CRISPR.

Cas 9 Variants

SpCas 9

SpCas 9 is currently the most commonly used nuclease for genome editing. This is the original Cas 9 (described as regular Cas 9) that Jennifer Doudna and her team worked with. It is a naturally occurring protein in the bacterium Streptococcus pyogenes.

This variant has its limitations:

  • Like all Cas 9 nucleases, PAM is essential for target DNA recognition. Cas 9 takes validation from the PAM sequence to cleave DNA. SpCas 9 enzymes have a specific PAM sequence that they look for (NGG). Often this limits scientists in finding target DNA regions as they must precede a specific PAM sequence. Only 1 in 16 possible target sites on average are able to be used.
  • It can cause off-target edits, as its PAM and target strand recognition mechanism is not fully stringent. The enzyme can accidentally recognize other PAM sequences (NAG or NGA rather than NGG). As the PAM is crucial for the nuclease to identify the target DNA, non-specific PAMs can cause off-target changes. As you can expect, off-target changes are dangerous to cells, and thus dangerous in treatments for humans.
  • SpCas 9 is relatively large in size, and current transport technologies are so far limited. Hence, it is a challenge to transport the protein into cells.
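The "1 in 16" figure for NGG can be checked with a quick count. The sketch assumes all four bases are equally likely at every position, which real genomes only approximate.

```python
from fractions import Fraction
from itertools import product

three_mers = ["".join(bases) for bases in product("ACGT", repeat=3)]
ngg = [k for k in three_mers if k.endswith("GG")]   # any base, then G, then G

print(ngg)                                   # ['AGG', 'CGG', 'GGG', 'TGG']
print(Fraction(len(ngg), len(three_mers)))   # 1/16
```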

To solve the PAM challenge, different Cas 9 proteins that have different PAM sequences are being isolated from different bacterial species. This allows for targeting a wider range of target DNA. Other naturally occurring nucleases come from

  • Streptococcus thermophilus (StCas9)
  • Neisseria meningitidis (NmCas9)
  • Francisella novicida (FnCas9)
  • Campylobacter jejuni (CjCas9)
  • Streptococcus canis (ScCas9)
  • Staphylococcus auricularis (SauriCas9).

SaCas 9

SaCas 9 is a naturally derived Cas 9 from Staphylococcus aureus. This variant is significantly smaller than SpCas 9 (a whole kb smaller!), and as a result, is easier to transport using current technologies. It recognizes the PAM sequence NNGRRT (R is A or G) and generates DSBs.
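PAMs like NNGRRT are written with shorthand letters (N for any base, R for A or G). A small helper can turn that shorthand into a pattern and check whether a stretch of DNA is a valid PAM. Only the shorthand letters needed here are included, and the function name `matches_pam` is made up.

```python
import re

# Shorthand letters used in PAM notation (only the ones needed here).
SHORTHAND = {"A": "A", "C": "C", "G": "G", "T": "T",
             "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def matches_pam(site: str, pam: str) -> bool:
    """True when `site` fits the PAM written in shorthand, e.g. NGG or NNGRRT."""
    pattern = "".join(SHORTHAND[letter] for letter in pam)
    return re.fullmatch(pattern, site) is not None

print(matches_pam("TGG", "NGG"))        # True  (SpCas 9)
print(matches_pam("ACGAGT", "NNGRRT"))  # True  (SaCas 9)
print(matches_pam("ACGACT", "NNGRRT"))  # False (C is not A or G)
```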

Researchers have also engineered variants for certain experiments rather than relying on naturally found nucleases. Great examples are Cas9n, used in prime editing and base editing, and dCas9, used in CRISPRa and CRISPRi.

eSpCas9

eSpCas9 is an enhanced S. pyogenes endonuclease developed in Feng Zhang’s lab. It has higher accuracy and specificity (the narrowness of the range of genes in which the RNP complex can act) than regular Cas 9.

Regular Cas 9 creates double-strand breaks in DNA that matches the gRNA. Cas9 however, can cut at incorrect sites that do not fully match the gRNA. This is one of the major concerns with traditional Cas9.

Feng Zhang’s lab created a solution for this.

Once the guide RNA reaches the target, DNA strands separate as the RNA binds to the target strand. Feng Zhang and his team hypothesized that there is a groove in the Cas 9, in between the HNH, RuvC, and PAM-interacting domains, that stabilizes the nontarget strand of the target DNA. They thought so because the amino acids in the groove are positively charged and the non-target DNA strand is negatively charged.

look at the nt-groove

eSpCas 9 is an engineered version of SpCas9 that has a neutrally charged groove. This weakens non-target strand binding with the endonuclease protein.

If the gRNA does not match the target strand perfectly, the weakened stabilizing force between Cas 9 and the non-target strand lets that strand reconnect with the target strand. This happens because there is a better connection available. This leaves eSpCas 9 to look for another target sequence.

Here is an analogy. Let’s say you like chocolate chip cookies the best. If you have two edible options, a raisin cookie and a chocolate chip cookie, which would you take? Of course the chocolate chip cookie.

On the other hand, if the gRNA matches perfectly, there is no need for the DNA strands to reconnect. This technique greatly minimizes off-target edits, increasing accuracy.

If you have the best cookie in the world already, would you want to replace it with an equally good cookie? Probably not.
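A site that "does not fully match the gRNA" can be made concrete by counting mismatches between the guide and a candidate site. The sketch below compares sequences letter by letter in DNA notation; the guide, the candidate sites, and the function name `mismatches` are made up for illustration.

```python
def mismatches(guide: str, site: str) -> int:
    """Count positions where the guide and the site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must be the same length")
    return sum(a != b for a, b in zip(guide, site))

guide = "GATTACAGATTACAGATTAC"
candidates = {
    "on_target":  "GATTACAGATTACAGATTAC",
    "off_site_1": "GATTACAGATTACAGATTAG",
    "off_site_2": "GATAACAGATTACCGATTAC",
}

for name, site in candidates.items():
    print(name, mismatches(guide, site))
# on_target 0
# off_site_1 1
# off_site_2 2
```

Regular Cas 9 may still cut sites with only a few mismatches, which is the behaviour the engineered variants try to suppress.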

HypaCas 9

HypaCas 9 or Hyper accurate Cas 9 is an engineered Cas 9 that was developed in Jennifer Doudna’s lab.

Doudna and her team found that the HNH domain in Cas 9 is controlled by the Rec3 domain, which senses mismatches. They hypothesized that off-target effects were in part caused by a lenient Rec3 domain.

Her team mutated the Rec3 domain to make it more strict, to avoid cutting mismatched DNA sequences. This engineered variant better checked if the gRNA matched perfectly with the target strand before making a DSB. As a result, off-target mutations were minimized. This method has successfully edited mouse zygotes with hyper-accuracy.

xCas9

As discussed before, the PAM requirement is a huge limitation. For regular Cas9, only about 1 in 16 sites in the genome is a potential target site.

xCas9 is an engineered variant, developed by researchers at Harvard and MIT, that eases the PAM limitation. The xCas9 nuclease recognizes a broad range of PAM sequences. This increases potential target sites to about 1 in 4. xCas9 also showed lower off-target rates.

Two other important PAM-relaxed variants of SpCas9 were also developed: (1) SpG, which has an expanded range of PAM sequences, and (2) SpRY, which can work with almost all known PAM sequences.
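The "1 in 16" and "1 in 4" figures can be reproduced with a quick calculation. This sketch assumes that all four bases are equally likely at every position (real genomes are not this uniform) and uses an NG PAM to stand in for the relaxed case. The function name and table are illustrative choices.

```python
from fractions import Fraction

# How many of the four bases each PAM symbol can stand for.
BASE_CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 4}


def pam_frequency(pam: str) -> Fraction:
    """Chance that a random position matches the PAM, assuming equal base usage."""
    freq = Fraction(1)
    for symbol in pam:
        freq *= Fraction(BASE_CHOICES[symbol], 4)
    return freq


print(pam_frequency("NGG"))  # 1/16
print(pam_frequency("NG"))   # 1/4
```

Each fixed base in the PAM cuts the number of usable sites by a factor of four, which is why dropping one required G makes such a large difference.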

Cas 3

Cas3 shreds DNA rather than making a clean double-stranded break. It moves along a DNA strand, degrading it as it goes. Cas3 itself does not recognize a PAM.

How is it activated then?

It is activated by CASCADE, which stands for CRISPR-associated complex for antiviral defence. Once CASCADE recognizes the target, Cas3 is brought in and can shred extremely large regions, about 10kb, in a target genome. This technique works well in bacteria for removing antibiotic-resistance genes.

Cpf1 (Cas 12a)

Cpf1 is gaining a lot of attention lately. Its name stands for CRISPR from Prevotella and Francisella 1, and it is used to target genes that regular Cas 9 has trouble dealing with. Here are its properties:

  • Regular Cas 9 requires PAM sequences that are rich in Guanine (NGG), so it is not the best choice for AT-rich genes. Cpf1 works with PAM sequences that are AT-rich. For example, the FnCas12a variant recognizes (TTN) as its PAM.
  • Cpf1 also creates staggered cuts (sticky ends) when making a double-stranded break. This makes it useful for experiments using Homology Directed Repair.
  • It is smaller than a regular Cas 9, so it can be packaged using small transportation mechanisms (like the AAV vectors in the next section). It doesn't require a scaffold RNA (tracrRNA), as it processes its own gRNA.
  • After Cpf1 makes a DSB at the target DNA, it becomes activated to cut any single-stranded DNA. This can be taken advantage of by inserting a single-stranded reporter DNA into the cell: after the DSB, the endonuclease will cleave the reporter DNA.
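To see why an AT-rich PAM helps with AT-rich genes, here is a minimal sketch that counts PAM matches on one strand of a short sequence. The sequence is made up for illustration, the function name is an assumption, and the sketch ignores the opposite strand and which side of the target the PAM sits on.

```python
import re


def find_pam_sites(seq: str, pam: str) -> list[int]:
    """Return the 0-based start of every PAM match, including overlapping ones."""
    # N means "any base"; the lookahead lets overlapping matches be counted.
    pattern = "(?=" + pam.replace("N", "[ACGT]") + ")"
    return [m.start() for m in re.finditer(pattern, seq)]


at_rich = "ATTTAGATTTACGGTTTAATATTC"  # hypothetical AT-rich stretch

cas9_sites = find_pam_sites(at_rich, "NGG")  # SpCas9-style PAM
cpf1_sites = find_pam_sites(at_rich, "TTN")  # FnCas12a-style PAM

print(len(cas9_sites), "NGG sites")
print(len(cpf1_sites), "TTN sites")
```

In this toy sequence the TTN PAM appears far more often than NGG, so Cpf1 has many more places it could bind.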

Cas 13

All the variants explained so far target DNA. This causes permanent changes in the genome, which in some cases can be unsafe. Editing RNA (mRNA, but not only mRNA), on the other hand, does not affect the genome of the cell, and because RNA only exists for a short time, the effect is temporary. This is a great alternative to permanently modifying genomes, which can be dangerous or unethical.

Cas 13 edits RNA rather than DNA. Here are some interesting properties.

  • The endonuclease showed cleavage of additional surrounding RNA after binding to the target site. Like with Cpf1, a reporter molecule (here, a reporter RNA) can be used for diagnostics research.
  • A dead version of this nuclease was also combined with base editing to convert Adenine to Inosine in Feng Zhang’s lab.
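The Adenine to Inosine conversion matters because the cell reads inosine as if it were guanine. Here is a small sketch of that idea. The function names and the example RNA are made up for illustration, and inosine is written as the letter I.

```python
def deaminate(rna: str, position: int) -> str:
    """Convert the adenine at a 0-based position to inosine (written as 'I')."""
    if rna[position] != "A":
        raise ValueError("the base at this position is not adenine")
    return rna[:position] + "I" + rna[position + 1:]


def as_read_by_cell(rna: str) -> str:
    """Inosine pairs like guanine, so it is read as G."""
    return rna.replace("I", "G")


original = "AUGUAGCUU"            # hypothetical RNA; UAG is a stop codon
edited = deaminate(original, 4)   # edit the A inside UAG
print(edited)                     # AUGUIGCUU
print(as_read_by_cell(edited))    # AUGUGGCUU
```

In this example the stop codon UAG is read as UGG after the edit, so a single base change alters how the message is translated without touching the DNA.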

Types of gene therapies

in vivo

Gene therapy is given directly to a patient through an IV or an injection.

ex vivo

Cells are removed from the patient and grown in a lab. Treatment is given to these cells to make permanent changes. Then cells are infused back into the patient.

Transfection and Transformation Protocols

It is surprisingly hard to get things into cells. Transfection and transformation protocols are methods used to insert foreign nucleic acids into cells. Transfection usually refers to eukaryotic cells, and transformation usually refers to bacteria.

The following are popular and emerging methods to get gene editing mechanisms into cells.

Electroporation

Electroporation is a process where a voltage is applied across a cell. This leads to the formation of temporary holes in the cell membrane, through which nucleic acids and proteins can enter the cell.

Gene editing mechanisms are typically inserted in one of two ways.

  • The Cas 9 and guide RNA complex is electroporated directly into the cell, where it immediately carries out the gene editing process. If using Homology-directed repair, a template is also inserted into the cell.
  • Multiple plasmids are electroporated into the cell. One contains a Cas 9 coding gene. Another contains a gRNA coding gene. And, if applicable, a third one contains a Homology-directed repair template. Once inside the cell, the genes are transcribed into RNA (and then translated into protein for Cas 9), which carry out the gene editing process.

This is how it works technically. An electroporator consists of a pulse generator connected by electrodes to a cuvette. A cuvette is a transparent rectangular container used for holding liquid samples. The cuvette holds the target cells suspended in a buffer solution that mimics the cytoplasm. The electric current creates pores in the cell membrane through which nucleic acids, proteins, plasmids, or vectors can enter.

Nucleofection is an improved kind of electroporation. It enables highly efficient transfection of cells that have been difficult to transfect with traditional electroporation.

Microinjection

Microinjection, just like it sounds, involves the use of a micropipette to inject plasmids, proteins, etc. directly into the cell nucleus. It is very precise, but time-consuming and expensive. As you can expect, it also has very low throughput because cells are injected one at a time.

Gene guns

Gene guns bombard target cells with nucleic acid-coated gold particles at extreme speeds. They are like machine guns, but with no aim. The high speeds and pressure are achieved through pressurized gas or high-voltage electricity.

The use of a gene gun is fairly simple. First, you make a huge number of copies of the gene (plasmid) you want to transfect, along with a reporter gene. A reporter gene produces a signal, like a colour change, if it is properly transfected and expressed, which can be used to determine if the process was successful. The copying can be done through a process called PCR.

Then, gold particles are coated with copies of the gene. Finally, the particles are loaded into the gun and fired into a mass of target cells.

Nonetheless, there are many limitations.

  • Particles sent from the gene gun can obliterate cells. The hope is that there is at least one successful transfection among the survivors.
  • Gene guns are like machine guns with no aim. So, you are relying on chance for a successful transfection.
  • You also never know if the gene of interest entered the nucleus.
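The reliance on chance can be put into numbers. This is a rough sketch that assumes every cell has the same, independent chance of being successfully transfected. The probability used in the example is a made-up value, not a measured gene gun efficiency.

```python
def chance_of_at_least_one(p_single: float, n_cells: int) -> float:
    """Probability that at least one of n_cells is successfully transfected."""
    # The chance that every single cell fails is (1 - p) ** n.
    return 1.0 - (1.0 - p_single) ** n_cells


# Hypothetical per-cell success chance of 0.1%.
for n in (100, 1_000, 10_000):
    print(n, "cells:", round(chance_of_at_least_one(0.001, n), 3))
```

This is why gene guns are fired into a large mass of cells: even if each individual cell is very unlikely to be transfected, the chance of at least one success grows quickly with the number of cells hit.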

Lipofection


Lipofection is plasmid transfection through liposomes. Liposomes are tiny vesicles (transporting structures in a cell) that have a composition similar to the cell membrane.

Most of the membrane is made of molecules called phospholipids. Phospholipids have a head that is a phosphate group and tails that are fatty acids. The phosphate group is negatively charged, making the heads hydrophilic, or attracted to water. The lipids (fatty acids), on the other hand, are uncharged, making the tails hydrophobic, or water fearing. The phospholipids form two layers known as the phospholipid bilayer. The tails face one another, and the heads face the aqueous external environment and the cytoplasm. This makes the membrane semi-permeable. The smaller and less charged something is, the easier it is for it to pass through the phospholipid bilayer.

Nucleic acids are hydrophilic molecules (negatively charged) and cannot pass through the hydrophobic interior of the cell membrane. During lipofection, the positively charged heads of the lipids used to make the liposome associate with the negatively charged phosphate groups of nucleic acids, creating a lipid nucleic acid complex that can pass through the membrane. This can happen either through endocytosis (the cell membrane moves around the complex and eats it like an amoeba) or by fusion with the cell membrane.


Micelles can also be used. They are similar to liposomes. The difference is that there isn’t a phospholipid bilayer. There is only one layer. The micelle core is hydrophobic, and the micelle shell is hydrophilic, which protects the nucleic acids it can carry. Micelles work especially well with delivering hydrophobic drugs.

Viral Vectors

Another way to get genes into cells is through the use of viruses. Over millions of years, viruses have adapted to bypass cellular defence mechanisms, insert their genetic material, rapidly replicate, and then take over. Taking advantage of this, viruses can be used as delivery vehicles to get genetic material for gene therapy into a cell.

All viruses have RNA or DNA genomes that are surrounded by a protein shell called a capsid. Some viruses have a membrane that protects the capsid. Like a plasmid, viruses can be modified by scientists. First, the viral genes responsible for causing disease and for replication are removed. Removing these genes creates space for therapeutic genes.

There are four main types of viral vectors: lentivirus, adenovirus, retrovirus, and adeno-associated virus. Each is different and is used for different needs.

Adeno-Associated Viruses, or AAVs, are typically used to deliver smaller DNA packages, which limits which diseases they can target.

  • They are non-integrating, meaning that their DNA doesn’t put itself into the cell’s genome. This means during mitosis (cell division), the inserted gene will not be copied. So, the effect of the therapy can be lost over time. That is why they are typically used to target cells that have slower mitosis rates like neurons, liver cells, or skeletal muscle cells. For such cells, the effect of gene therapy can last a lifetime.
  • Many people are already immune to some AAV types because it is likely that they have encountered them before. So, the immune system can attack the virus, destroying the therapeutic gene along with it.
  • They are very small and can only carry packages smaller than 4.4kb. Typically, smaller Cas 9 proteins like SaCas9 are used.

The Inverted Terminal Repeats (ITRs) identify the AAV plasmid. The sequence in between the ITRs gets packaged into the AAV particle. The transgene is the gene that will be delivered by the AAV. This website has a full explanation of the parts of an AAV plasmid.

Adenoviral Vectors are similar to AAVs, but they can deliver packages up to 8 times the size of what AAVs can carry.
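The size limits explain why smaller Cas 9 proteins are preferred for AAV delivery. The sketch below uses the 4.4kb limit and the "8 times" figure from this article, along with the commonly cited protein lengths of SpCas9 (1368 amino acids) and SaCas9 (1053 amino acids). The 1000 bases reserved for extras such as a promoter and the gRNA is an assumed, illustrative number.

```python
AAV_CAPACITY_BP = 4400                        # limit quoted in this article
ADENOVIRUS_CAPACITY_BP = 8 * AAV_CAPACITY_BP  # "up to 8 times" the AAV size

# Commonly cited protein lengths, in amino acids.
CAS9_LENGTH_AA = {"SpCas9": 1368, "SaCas9": 1053}


def coding_length_bp(n_amino_acids: int) -> int:
    """Three bases per amino acid, plus one stop codon."""
    return 3 * (n_amino_acids + 1)


def fits(cas: str, capacity_bp: int, extras_bp: int = 1000) -> bool:
    """Check whether the Cas 9 gene plus assumed extras fits in the vector."""
    return coding_length_bp(CAS9_LENGTH_AA[cas]) + extras_bp <= capacity_bp


for name in CAS9_LENGTH_AA:
    print(name, coding_length_bp(CAS9_LENGTH_AA[name]), "bp,",
          "fits in AAV:", fits(name, AAV_CAPACITY_BP),
          "| fits in adenovirus:", fits(name, ADENOVIRUS_CAPACITY_BP))
```

Under these assumptions the SpCas9 gene alone takes up almost the entire AAV, leaving no room for anything else, while SaCas9 leaves some space to spare.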

Lentiviral and Retroviral Vectors carry RNA packages, unlike Adenoviral or AAV vectors. They can carry large packages, which are converted into DNA within the cell (through reverse transcription). This allows the engineered gene in the vectors to integrate into the genome of the target cell. Because an integrated gene is copied along with the rest of the genome every time the cell divides, lentiviral and retroviral vectors are best suited for dividing cells.
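The difference between integrating and non-integrating vectors shows up when cells divide. Here is a toy sketch under a strong simplifying assumption: a non-integrated gene is never copied, so at each division it ends up in only one of the two daughter cells. The function name is illustrative.

```python
def fraction_with_gene(divisions: int, integrating: bool) -> float:
    """Fraction of descendant cells that still carry the delivered gene."""
    if integrating:
        # An integrated gene is copied with the genome at every division.
        return 1.0
    # A non-integrated gene is passed to only one of two daughters each time.
    return 0.5 ** divisions


for n in (0, 1, 5, 10):
    print(n, "divisions:",
          "integrating =", fraction_with_gene(n, True),
          "| non-integrating =", fraction_with_gene(n, False))
```

After 10 divisions, fewer than 1 in 1000 cells would still carry a non-integrated gene in this model, which is why non-integrating vectors are a better match for cells that rarely divide.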

And… that’s a pretty condensed introduction to how CRISPR-mediated genome editing works!

CRISPR Cas 9 is going to change the world, and you now have a pretty good understanding of how it works. We talked about nucleic acids, gene regulation, breaks, mutations, repair pathways, CRISPR-mediated genome editing, Cas 9 variants, transfection protocols, and more.

My name is Aditya, and I am a grade 10 student who’s going to change the world. I love playing ping pong, running, and learning about in-depth scientific topics. Hopefully, you have learned a lot, and are inspired to learn more! If this article provided you with some insight, clap it, share it, and connect with me on LinkedIn! If you have any suggestions, questions or just want to talk, you can email me at: 4aditya.m@gmail.com. Also make sure to sign up for my monthly newsletter here! 🔥

