How gene editing works under the hood

CRISPR/Cas 9 is changing the biological world. But how does it really work?

Aditya M
students x students


Photo by Sangharsh Lohakare on Unsplash

The original CRISPR-Cas9 method for editing genes was developed by Jennifer Doudna and Emmanuelle Charpentier in 2012.

In recognition of this work in genetic engineering, they were awarded the 2020 Nobel Prize in Chemistry. 👏

CRISPR was first noticed in bacterial genomes, and the Cas9 system used for gene editing comes from S. pyogenes (a bacterium). Bacteria use CRISPR as a method to arm themselves against viruses.

Why are viruses bad for bacteria?

If a bacteriophage (a virus that infects bacteria) can inject its DNA into a bacterium, replicate inside, and burst out, the bacterium will go 💥

Bacteria needed a solution… And CRISPR was their saviour!!!

CRISPR stands for clustered regularly interspaced short palindromic repeats. 📝 In simple words, it is a stretch of DNA in which repeated sequences alternate with target-specific spacers (non-coding DNA).

The spacers contain DNA from viruses that infected the bacterium in the past. When a new virus invades, a new spacer (a fragment of the viral DNA) is added to the array as an immune memory. 🦟

These spacer fragments are placed between repeated palindromic sequences, which gives CRISPR its name.
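To make the repeat-spacer layout concrete, here is a minimal Python sketch that builds a toy CRISPR array and pulls the spacers back out of it. The repeat and spacer sequences are made up and much shorter than real ones, and `extract_spacers` is a hypothetical helper, not part of any library.

```python
def extract_spacers(array: str, repeat: str) -> list[str]:
    """Return the spacers that sit between consecutive copies of `repeat`."""
    pieces = array.split(repeat)
    # pieces[0] precedes the first repeat and pieces[-1] follows the last one,
    # so only the pieces in between are spacers.
    return [piece for piece in pieces[1:-1] if piece]


REPEAT = "GTTTTAGA"                     # hypothetical, shortened repeat
SPACERS = ["ACGTACGTAC", "TTGACCATGG"]  # hypothetical viral fragments

# Build repeat-spacer-repeat-spacer-repeat, then recover the spacers from it.
array = REPEAT + REPEAT.join(SPACERS) + REPEAT
print(array)
print(extract_spacers(array, REPEAT))
```

The split only works here because none of the toy spacers happens to contain the repeat. Real arrays need a more careful search.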

The CRISPR array acts like a vaccination. If the virus invades again, the bacterium is ready to defend itself. 🦸‍♂️

When a virus invades the bacterium, a protein complex called Cas1-Cas2 identifies the invading DNA and cuts out a specific segment. This segment is known as the protospacer. It is inserted at the front of the CRISPR array, where it becomes a new spacer.

Cas1 is an endonuclease, which means it produces breaks in DNA. Cas2 helps the complex add the spacer to the CRISPR array.

A Cas9 protein (an endonuclease responsible for cutting DNA) also works with the Cas1-Cas2 protein complex to deal with the PAM (more on that later).

To 🦸‍♂️ themselves against viruses, bacteria use enzymes to transcribe long RNA sequences called pre-crRNA (pre-CRISPR RNA) from the CRISPR array.

Trans-activating CRISPR RNA (tracrRNA), which functions as a scaffold, then links up with the pre-crRNA through base pairing. tracrRNA is crucial for processing the pre-crRNA as well as for activating Cas9.

The dual RNA complex is known as the guide RNA (gRNA).

Cas9 binds to the gRNA. The long pre-crRNA is then trimmed into short pieces, each carrying a single spacer.

The Cas9 complex patrols the cell. If a virus re-infects, the complex recognizes the viral DNA and makes a double-stranded break.

The double-stranded break destroys the viral DNA, as the virus lacks DNA repair mechanisms.

Here is an analogy.

Cas9 acts like scissors, and the gRNA acts like the hand that guides it to where it should cut.

CRISPR/Cas9 genome editing is a gene editing method inspired by this design. It has three steps: recognition, cleavage, and repair.

To edit a gene, scientists design a guide RNA that matches the gene they want to edit. They attach the gRNA to Cas 9.

The guide RNA leads Cas 9 to the target gene, and Cas 9 makes a double-stranded break. Once the gene is cut, the cell will try to repair it. By exploiting these repair pathways scientists can make accurate edits.

🍻 to you, bacteria

Recognition and Cleavage

Cas9 scans the viral DNA to find the PAM, or protospacer adjacent motif (which tells Cas9 that it is OK to cut the DNA). If it finds one, Cas9 checks whether the region that precedes the PAM matches the gRNA. If it does, Cas9 knows it is at the target and makes a double-stranded break.

What is the Protospacer Adjacent Motif?

Let’s assume a virus revisits a bacterium. How would the bacterium distinguish between the spacer in its own CRISPR array (which must be left alone so that the bacterial genome survives) and the matching protospacer in the invading viral genome?

The PAM.

The PAM is a short DNA sequence, around 2–6 base pairs long, that follows the protospacer in the viral genome. For instance, the Cas9 of the bacterium S. pyogenes recognizes the PAM “GG” preceded by any one nucleotide (NGG). The PAM sequence tells the Cas9 protein that it is “ok” to make a double-stranded break at that location.

As a result, spacer sequences in the bacterial genome are protected: each one is followed by the constant repeated sequence, which begins with GTT (that is not the PAM).
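Here is a small Python sketch of that self versus non-self check, assuming the S. pyogenes NGG PAM. The sequences are invented, `find_targets` is a hypothetical helper, and only one DNA strand is searched to keep things short.

```python
def find_targets(dna: str, guide: str) -> list[int]:
    """Return start positions where `guide` is directly followed by an NGG PAM."""
    hits = []
    start = dna.find(guide)
    while start != -1:
        pam = dna[start + len(guide): start + len(guide) + 3]
        # N can be any base, so only the last two bases of the PAM are checked.
        if len(pam) == 3 and pam[1:] == "GG":
            hits.append(start)
        start = dna.find(guide, start + 1)
    return hits


GUIDE = "ATGCCGTAAGCTTGACCATA"  # hypothetical 20-base guide sequence

# In the virus, the matching sequence is followed by a PAM (TGG).
viral_dna = "TT" + GUIDE + "TGG" + "AAC"
# In the CRISPR array, the same sequence is followed by the repeat (GTT...).
crispr_array = "GTTTTAGA" + GUIDE + "GTTTTAGA"

print(find_targets(viral_dna, GUIDE))     # the viral copy is a valid target
print(find_targets(crispr_array, GUIDE))  # the bacterium's own copy is not
```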

How can you guarantee that there is always a PAM in the viral DNA?

Cas9 leads the Cas1-Cas2 complex to a location in the viral genome that precedes a PAM, so every protospacer that gets cut out is one that had a PAM next to it.

In the lab, scientists look for a PAM sequence that immediately follows the target location they want to edit. They use the target sequence to engineer a gRNA. If there is no PAM next to the target location, different Cas9 proteins (that recognize different PAMs) are used.
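As a rough illustration of that search, the Python sketch below lists every spot in a short sequence where a 20-base target is followed by an NGG PAM, checking both strands. The 20-base guide length, the function names, and the example sequence are assumptions for illustration. Real guide design tools also score off-target risk, which this sketch ignores.

```python
def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_guides(dna: str, guide_len: int = 20) -> list[tuple[str, str, str]]:
    """List (strand, target sequence, PAM) for every NGG site with room for a guide."""
    results = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        # i is the position of the first PAM base; the target sits right before it.
        for i in range(guide_len, len(seq) - 2):
            pam = seq[i:i + 3]
            if pam[1:] == "GG":
                results.append((strand, seq[i - guide_len:i], pam))
    return results


gene_fragment = "ATGCCGTAAGCTTGACCATATGGAT"  # hypothetical sequence
for strand, target, pam in candidate_guides(gene_fragment):
    print(strand, target, pam)
```

If the list comes back empty, that is the case where a Cas9 with a different PAM would be needed.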

What are the domains or parts of the Cas 9 protein?

(1) Recognition (REC)

Cas9 is activated when its REC domains bind the gRNA.

(2) PAM interacting (PI)

Cas 9 then searches for the targeted sequence using its PAM interacting domain.

(3) HNH and RuvC

If the gRNA matches the viral DNA, then the HNH and RuvC domains cleave the target DNA like scissors. HNH cuts the target strand (the strand that base-pairs with the gRNA) and RuvC cuts the non-target strand.
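To picture where the scissors land, here is a minimal Python sketch. It assumes the usual S. pyogenes behaviour of cutting both strands about 3 bases before the PAM, a detail that is not stated above, and it shows only one strand since the other is cut at the same position. The function name and sequences are hypothetical.

```python
def cleave(dna: str, guide: str) -> tuple[str, ...]:
    """Cut `dna` 3 bases before the PAM if `guide` matches and an NGG PAM follows."""
    start = dna.find(guide)
    if start == -1:
        return (dna,)  # no match with the guide, so nothing is cut
    pam_start = start + len(guide)
    if dna[pam_start + 1:pam_start + 3] != "GG":
        return (dna,)  # no PAM next to the match, so nothing is cut
    cut = pam_start - 3  # assumed cut position: 3 bases upstream of the PAM
    return (dna[:cut], dna[cut:])


GUIDE = "ATGCCGTAAGCTTGACCATA"            # hypothetical guide sequence
viral_dna = "TT" + GUIDE + "TGG" + "AAC"  # hypothetical target with a TGG PAM

for piece in cleave(viral_dna, GUIDE):
    print(piece)
```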


Repair pathways

OK, now we’ve made a double-stranded break in the DNA… 👏

To make the edits, CRISPR/Cas9 takes advantage of repair pathways in a cell (the key ones being non-homologous end joining and homology-directed repair).

Double-stranded breaks occur naturally in a cell, so cells must be able to repair the damage. Double-stranded breaks can occur in DNA due to exposure to external factors (e.g. radiation and chemicals) and internal processes, like DNA replication.

Unrepaired breaks can be fatal to a cell, as they shatter the integrity of its DNA. They can potentially cause cellular senescence, activation of the p53 gene (which can trigger apoptosis), cancer, translocations in DNA, and … 🤔

You get the point: double-stranded breaks are dangerous. So, it is not surprising to hear that cells have their own repair mechanisms for double-stranded breaks in DNA.

Nevertheless, these mechanisms are not very accurate.

P.S. That is why we have prime editing! Coming soon

Non-homologous end joining (NHEJ)

NHEJ is the most common cellular repair pathway. It is a quick-fix technique that occurs in all phases of the cell cycle. However, NHEJ is error-prone, as it often causes random insertions and deletions (indels) at the double-stranded break site. These indels typically range from 1 to 10 nucleotides.

So NHEJ has a good chance of affecting a gene’s associated protein. This is how it works:

After a double-stranded break, nucleotides are usually degraded from the ends of the strands, a process called resection. Repair proteins then rejoin the ends. However, they often cause random insertions and/or deletions at the site.

  1. Ku protein binds around the broken ends, leaving the ends of the strand exposed. It also engages the DNA-PK catalytic subunit (DNA-PKcs).
  2. DNA-PKcs recruits Artemis, a nuclease that degrades nucleic acids. Artemis trims any single-stranded tails at the break.
  3. Ligase IV joins the ends of the broken strands together.

🎉🎈🎊

Non-Homologous End Joining can lead to extra or lost pieces. So, it does not restore the DNA to its pre-break sequence. The resulting gene is often unusable and turned off.
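To see why NHEJ so often breaks a gene, here is a toy Python simulation. It assumes a random insertion or deletion of 1 to 10 bases with a 50/50 split between the two, which is a simplification for illustration and not a measured distribution. The function names and sequences are hypothetical.

```python
import random


def nhej_repair(left: str, right: str, rng: random.Random, max_indel: int = 10):
    """Rejoin two broken ends with a random indel; return (sequence, length change)."""
    size = rng.randint(1, max_indel)
    if rng.random() < 0.5:
        # Insertion: random bases are added at the break.
        inserted = "".join(rng.choice("ACGT") for _ in range(size))
        return left + inserted + right, size
    # Deletion: bases are chewed back from the left end before joining.
    size = min(size, len(left))
    return left[:len(left) - size] + right, -size


def causes_frameshift(length_change: int) -> bool:
    """A gene is read in groups of 3 bases, so other indel sizes shift the frame."""
    return length_change % 3 != 0


rng = random.Random(42)  # fixed seed so the run is repeatable
left, right = "ATGGCCAAGCTTGACC", "ATATGGAACGTTAA"
repaired, change = nhej_repair(left, right, rng)
print(repaired, change, causes_frameshift(change))
```

Any indel whose size is not a multiple of 3 shifts the reading frame, so most random indels garble the protein downstream of the break.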

Homology-directed repair (HDR)

Homology-directed repair uses a template (usually a sister chromatid) to repair the broken DNA accurately. Because of that, HDR is naturally only active in the S and G2 phases, when sister chromatids are present in the cell. This technique is inefficient and time-consuming. Nonetheless, it is accurate, with less indel formation. In the lab, scientists provide a template that contains the edits.

This is how it works:

  1. DNA on each side of the DSB is unwound and the 5’ sections are cut away, creating a 3’ overhang on each end. The tails can be hundreds of bases long.
  2. The DNA invades the template (usually a sister chromatid). The invasion binds the overhang to one strand of the template, creating a D-loop. Through DNA synthesis, the broken strand is extended by copying the template, so the DNA is repaired with any edits the template provides.

🎉🎈🎊
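The template idea can be sketched in a few lines of Python. This toy version assumes a lab-provided template made of a left homology arm, the new DNA, and a right homology arm, with arms that match the sequence on each side of the break exactly. The arm length of 8 is far shorter than real arms, and `hdr_repair` is a hypothetical helper.

```python
def hdr_repair(left: str, right: str, template: str, arm: int = 8) -> str:
    """Repair a break by copying `template`, anchored by its two homology arms."""
    left_arm, right_arm = template[:arm], template[-arm:]
    i = left.rfind(left_arm)    # where the left arm matches, before the break
    j = right.find(right_arm)   # where the right arm matches, after the break
    if i == -1 or j == -1:
        raise ValueError("template arms do not match the DNA around the break")
    # Everything between the outer edges of the two arms is copied from the template.
    return left[:i] + template + right[j + arm:]


left, right = "CCATGGCCAAGCTTGA", "CCATATGGTTAACG"  # hypothetical broken ends
insert = "GGATCC"                                   # hypothetical new DNA
template = left[-8:] + insert + right[:8]

print(hdr_repair(left, right, template))
```

Unlike NHEJ, the result here is fully determined by the template, which is why HDR is the pathway used for knock-ins.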

CRISPR Methods and techniques

CRISPR Knock-out (KO) and Knock-in (KI)

After a double-stranded break, the gene will most likely be repaired by non-homologous end joining. Since NHEJ is error-prone, it usually results in random indels. The resulting gene is often unusable and turned off (knock-out).

After a double-stranded break, cells can also repair the DNA through homology-directed repair, which allows researchers to insert a new piece of DNA (knock-in).

CRISPRa and CRISPRi

Every cell in our body has the same DNA. What makes cells unique are their expression levels of genes. CRISPR activation (CRISPRa) is used to increase gene expression, and CRISPR interference (CRISPRi) is used to reduce gene expression.

A similar approach to regulating gene expression is CRISPRon and CRISPRoff. It adds or removes methyl groups on DNA, which changes the level of gene transcription (and, as a result, gene expression).

We now know that double-stranded breaks are dangerous to cells.

Is it possible that we can edit genomes without DSBs?

Kind of… 🤷‍♀️

Making gene editing safer with fewer undesirable effects sacrifices efficiency.

The HNH domain cuts the target strand, and RuvC cuts the non-target strand. By mutating one or both of these domains, we can make a single-strand break or no break at all. The RuvC mutant D10A leaves only HNH active, so it produces a nick on the target strand. The HNH mutant H840A leaves only RuvC active, so it generates a nick on the non-target strand.
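The mapping from mutations to outcomes is easy to lose track of, so here it is as a small Python lookup. The class and field names are hypothetical. Treating dCas9 as the combination of both mutations (D10A and H840A) is the common convention rather than something stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cas9Variant:
    name: str
    hnh_active: bool   # HNH cuts the target strand
    ruvc_active: bool  # RuvC cuts the non-target strand

    def cut_strands(self) -> list[str]:
        strands = []
        if self.hnh_active:
            strands.append("target")
        if self.ruvc_active:
            strands.append("non-target")
        return strands

    def outcome(self) -> str:
        return ["no break", "nick", "double-stranded break"][len(self.cut_strands())]


VARIANTS = [
    Cas9Variant("wild-type Cas9", hnh_active=True, ruvc_active=True),
    Cas9Variant("Cas9n D10A", hnh_active=True, ruvc_active=False),   # RuvC mutated
    Cas9Variant("Cas9n H840A", hnh_active=False, ruvc_active=True),  # HNH mutated
    Cas9Variant("dCas9", hnh_active=False, ruvc_active=False),       # both mutated
]

for variant in VARIANTS:
    print(variant.name, variant.cut_strands(), variant.outcome())
```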

CRISPRa and CRISPRi use a variant of Cas9, dead Cas9 (dCas9), which makes no breaks in the DNA. dCas9 cannot make double-stranded breaks, but it can still patrol the cell looking for the gene of interest. It leads transcriptional effectors to the target DNA, increasing or decreasing gene expression.

On the other hand, methods involving nicks, or single-strand breaks (e.g. base editing and prime editing), use a Cas9 with one mutated domain. This variant is called the Cas9 nickase (Cas9n).

For increased efficiency, the Cas9 nickase is fused to other domains. For instance, this paper describes a method where hRad51 mutants fused to the Cas9 nickase (D10A) minimize accidental insertions or deletions in homology-directed repair.

Base editing is an approach that uses Cas9n, through which scientists can make minute changes, often just individual bases.

The Cas9n is fused with a deaminase domain that chemically modifies the base. The use of a Cas9 nickase increases the efficiency of the approach. Although base editing is relatively safe, deaminase enzymes have some limitations. P.S. I will write an article soon just focussing on base editing.
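As a rough picture of what a deaminase does, the Python sketch below models a cytosine base editor, which turns C into T. It assumes the enzyme only reaches positions 4 to 8 of the target sequence (an approximate window reported for early cytosine base editors, not a figure from this article). The function name and sequence are hypothetical.

```python
def cytosine_base_edit(target: str, window: tuple[int, int] = (4, 8)) -> str:
    """Convert every C inside the editing window to T (positions count from 1)."""
    first, last = window
    return "".join(
        "T" if base == "C" and first <= position <= last else base
        for position, base in enumerate(target, start=1)
    )


target = "ATGCCGTAAGCTTGACCATA"  # hypothetical 20-base target sequence
print(target)
print(cytosine_base_edit(target))
```

Note that both Cs inside the window change, not just one. Edits to neighbouring bases like this are one of the limitations of deaminases.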

The precise gene editing methods we’ve looked at including

(1) Canonical CRISPR/Cas 9 using Homology Directed Repair and

(2) Base editing

are amazing. But they are still not the best.

They are prone to off-target changes.

Canonical CRISPR/Cas 9 uses double-stranded breaks (dangerous), is inefficient, and can cause accidental indels.

Base editing is so far limited in what it can do.

The ideal approach for precise gene editing is a technique that does not use DSBs but still enables insertions, deletions, or substitutions at an efficient rate.

Introducing…..🥁🥁🥁🥁

Prime editing

Prime editing doesn’t use double-stranded breaks.

It doesn’t need donor DNA.

It comes close to the ideal approach.

It uses the single-stranded break method with slight modifications to the guide RNA.

Prime editing has three main parts:

  1. Cas9 Nickase
  2. pegRNA
  3. Reverse Transcriptase

Cas9 Nickase

For prime editing, Cas9 must nick only the non-target strand (the one that carries the PAM), so its HNH domain is switched off with the H840A mutation.

The use of Cas9n tackles the two biggest problems that canonical Cas9 faces: (1) off-target editing and (2) accidental indels from double-stranded breaks. With the Cas9 nickase, unwanted changes are less likely to occur, as only one strand is cut.

pegRNA

pegRNA stands for prime editing guide RNA. It is a modified version of the traditional gRNA and is much more complicated.

The pegRNA is longer than a normal gRNA, so it can pair with both strands of the target DNA: the guide sequence pairs with one strand, and an extra extension pairs with the other strand once it has been nicked. This extra pairing makes edits at the wrong site less likely than with the canonical CRISPR system. Rather than relying on homology-directed repair and donor DNA, the template for the edit is provided in the pegRNA itself.

Reverse Transcriptase

Viruses come in DNA and RNA forms. DNA viruses have an easier time taking over cells (like bacteria). But for some RNA viruses… it doesn’t work like that.

Their RNA needs to be converted to DNA first. To do that, these viruses (retroviruses) use an enzyme called reverse transcriptase, which does the opposite of transcription: it builds DNA from an RNA template. The resulting DNA can then take over the cell.

Prime editing uses the same enzyme.

How it works

Scientists engineer a pegRNA with the edits they need. Then, they attach it to a Cas9 nickase that is fused to a reverse transcriptase.

The pegRNA Cas 9 complex identifies the target DNA using its guide sequence.

The pegRNA latches onto both strands of the target DNA. One side, the traditional gRNA, latches onto the target strand.

Next, the Cas9 nickase makes a single-stranded break on the non-target strand, a few bases before the PAM. The break frees one end of the non-target strand.

The other side of the pegRNA latches onto that free end of the nicked strand (where the edit needs to be made). This part of the pegRNA is the primer binding site (brown), and the DNA end bound to it acts as the primer that initiates the reverse transcriptase.

Reverse transcriptase, after initiation, begins constructing DNA onto the free end. It uses the reverse transcription template (pink) to build DNA. The edits that need to be made are found in the reverse transcription template.

The nicked strand now has two “flaps” of DNA. One flap is the newly built, edited portion (green), and the other flap is the regular, unedited DNA (blue).

These two flaps compete for the same space in the nicked strand. The cell’s own flap-cutting enzymes tend to remove the unedited flap. As a result, single-stranded break repair pathways can fit the edited portion into the strand.

Oh no! 🙀

There is now a mismatch between the edited strand and the other, unedited strand.

Cellular repair pathways try to fix this. They can change either strand so that the base pairs match. To make sure no changes are made on the edited strand, a second gRNA can direct the Cas9 nickase to nick the unedited strand.

The nick tricks repair pathways into treating the unedited strand as the damaged one, so enzymes change bases on the unedited strand to match our edits.
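The whole sequence of events can be sketched in Python. This is a simplified model under several assumptions: the DNA is written as the nicked, PAM-containing strand, the PAM is NGG, the nick falls 3 bases before the PAM, the edit is a substitution that keeps the length the same, and flap removal and mismatch repair always go our way. All names and sequences are hypothetical.

```python
def dna_to_pairing_rna(dna: str) -> str:
    """Return the RNA that base-pairs with `dna` (its reverse complement, with U)."""
    return dna.translate(str.maketrans("ACGT", "UGCA"))[::-1]


def reverse_transcribe(rna: str) -> str:
    """Return the DNA that reverse transcriptase builds from `rna`."""
    return rna.translate(str.maketrans("ACGU", "TGCA"))[::-1]


def prime_edit(pam_strand: str, protospacer: str, pbs_rna: str, rtt_rna: str) -> str:
    """Return the edited strand after a (simplified) prime edit."""
    start = pam_strand.find(protospacer)
    pam_start = start + len(protospacer)
    if start == -1 or pam_strand[pam_start + 1:pam_start + 3] != "GG":
        raise ValueError("no target sequence followed by an NGG PAM")
    nick = pam_start - 3  # assumed nick position: 3 bases before the PAM
    # The primer binding site must pair with the DNA just before the nick.
    if reverse_transcribe(pbs_rna) != pam_strand[nick - len(pbs_rna):nick]:
        raise ValueError("primer binding site does not pair with the nicked strand")
    new_flap = reverse_transcribe(rtt_rna)  # edited DNA built onto the free end
    # The edited flap replaces the same number of original bases after the nick.
    return pam_strand[:nick] + new_flap + pam_strand[nick + len(new_flap):]


protospacer = "ATGCCGTAAGCTTGACCATA"
pam_strand = "TT" + protospacer + "TGG" + "AACGT"
nick = 2 + len(protospacer) - 3

desired_flap = "ATTTGGAACG"  # original is ATATGGAACG: one A becomes T
pbs_rna = dna_to_pairing_rna(pam_strand[nick - 8:nick])
rtt_rna = dna_to_pairing_rna(desired_flap)

print(pam_strand)
print(prime_edit(pam_strand, protospacer, pbs_rna, rtt_rna))
```

The pegRNA carries everything needed here: the guide finds the site, the primer binding site grabs the free end, and the template spells out the edit.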

Done! 💃🎉🎈🎊 you just learned… PRIME EDITING!

Gene editing is now safer and simpler than ever before thanks to prime editing! Prime editing enables researchers to edit DNA more precisely than ever.

Researchers are looking into how to make CRISPR and prime editing as efficient and safe as possible.

The future of the 🌎 is Crispr and prime editing 🧬!

Before you go…

You’ve learned the fundamentals of how gene editing with CRISPR works!

🎉🎊

We explored: (1) how CRISPR/Cas9 is used in bacteria (2) the Cas 9 protein (3) the two main double-strand break repair pathways within a cell (4) the CRISPR/Cas 9 gene editing system (5) Crispr methods like Crispr KO and CRISPRi (6) base editing (7) and finally PRIME EDITING.

If you enjoyed reading through this article, feel free to connect with me on

LinkedIn


15 y/o student with a vision of making a difference in the world. Looking to learn at labs!