What the heck is Comirnaty? Or Tozinameran?

Well, it has certainly been a rough start to 2021. But the Covid-19 vaccinations, however slowly they are progressing in many places, give some hope. I was asked to write something on this subject. After all, the vaccines developed so rapidly by Moderna and by BioNTech and Pfizer are RNA vaccines. So are they giving us a gene? And what does that mean?

Tozinameran is the generic name of a modRNA vaccine against Covid-19

This article focuses on the vaccine that BioNTech developed in collaboration with Pfizer, literally in a “lightspeed” operation. “Lightspeed” is the name of the program that was launched on January 10, 2020, with the publication of the genome of the new SARS-CoV2 by the Chinese Center for Disease Control and Prevention. The first approvals were granted in December 2020. With approval, the child also gets a proper name, because BNT162b2, as the vaccine was called during development, is really not a reasonable name, except maybe for a second child of Elon Musk and Grimes. From now on, BioNTech and Pfizer will sell their product as Comirnaty. In addition to the brand name, every newly approved medicinal product is also assigned an international nonproprietary name (INN). For this Covid-19 vaccine, that generic name is Tozinameran.

So what is Tozinameran? The active ingredient is RNA, more precisely messenger RNA (mRNA). mRNA is the messenger: it is a copy of a gene in the DNA that migrates from the cell nucleus into the cytoplasm, where it serves as a template for building a protein. While DNA is made up of the bases A, C, G and T, RNA uses A, C, G and U. But the sequence of Tozinameran looks like this: GAGAAΨAAAC ΨAGΨAΨΨCΨΨ CΨGGΨCCCCA CAGACΨCAGA GAGAACCCGC CACCAΨGΨΨC GΨGΨΨ… (the first 65 of a total of 4284 nucleotides are shown; the entire code can be found here at WHO). Instead of the expected U, one finds this strange psi, Ψ. What’s going on here?

Conventional uridine (left) as a component of RNA and pseudouridine (right). Pseudouridine (Ψ) synthase can convert uridine into pseudouridine after transcription.

The psi in the RNA code stands for pseudouridine. Pseudouridine is a modified uridine in which the uracil is bound to the ribose not with the usual C-N bond but with a C-C bond. Tozinameran is thus a so-called nucleoside-modified mRNA, or modRNA for short. The reason for the modification is that our body has a number of immune defense mechanisms against foreign RNA. These mechanisms react only to a very limited extent to the modified RNA, which is why most vaccinated persons do not show the strong immune reaction that normal RNA would trigger.
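To make the notation concrete, here is a minimal Python sketch that works only on the 65 nucleotides quoted above. It counts the Ψ positions, rewrites them as plain U, and looks for the first AUG triplet. The variable names are my own, and the snippet says nothing about the remaining nucleotides of the full sequence.

```python
# First 65 nucleotides of Tozinameran exactly as quoted in the text above.
# The spaces only group the letters into blocks of ten.
QUOTED = "GAGAAΨAAAC ΨAGΨAΨΨCΨΨ CΨGGΨCCCCA CAGACΨCAGA GAGAACCCGC CACCAΨGΨΨC GΨGΨΨ"

modrna = QUOTED.replace(" ", "")        # remove the grouping spaces
psi_count = modrna.count("Ψ")           # how many modified positions are there?
plain_rna = modrna.replace("Ψ", "U")    # the same sequence in the usual A/C/G/U alphabet
start_index = plain_rna.find("AUG")     # 0-based index of the first AUG triplet

print(f"length: {len(modrna)} nucleotides, of which {psi_count} are Ψ")
print(f"unmodified U present: {'U' in modrna}")
print(f"first AUG begins at position {start_index + 1} (counting from 1)")
```

In this excerpt every position where one would expect a U carries a Ψ instead, so the conversion to the plain alphabet is a simple one-to-one replacement.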

How Tozinameran ends up in our cells and what happens there

First, however, our cells have to take up the RNA vaccine at all so that the encoded protein can be built inside the cell. For this purpose, the Tozinameran RNA is packed in liposomes, i.e. small fat droplets. After the vaccine is injected into the muscle, these nanoparticles can fuse with cell membranes. Above all, dendritic cells and macrophages of the immune system take up these liposomes together with the foreign RNA. It is mainly these cell types that ultimately produce the encoded protein. But what protein is that actually?

Rendering of coronaviruses with the spike protein in green.

Encoded on the Tozinameran mRNA is the spike protein of the new coronavirus SARS-CoV2. The spike protein sits in the fatty envelope of the coronavirus and is responsible for binding to the ACE2 receptor and thus for fusion with the cell membrane of the host cells. However, if you look at the sequence in the vaccine, you can see that two of the 1273 amino acids in the spike protein have been changed. At positions 986 and 987, the amino acids lysine and valine encoded in the virus sequence have been exchanged for two prolines. Why?

Our cells should build only the spike protein and not the rest of the coronavirus envelope. It turned out, however, that the spike protein refolds when it does not sit in the virus envelope, i.e. it takes on a different spatial structure. Since this structure does not correspond to that of the spike protein on the virus surface, it would be stupid if we developed immunity against this refolded form. Fortunately, a US research group showed as early as 2017 that replacing the lysine at position 986 and the valine at position 987 with the strongly structure-breaking proline prevents this refolding, so that the spike protein keeps a shape similar to the one it has in the virus envelope.
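The two-proline exchange can be written down as a small bookkeeping exercise in Python. The sketch below is only an illustration: the `substitute` helper is hypothetical, and the protein string is a placeholder of the right length with K and V at positions 986 and 987, not the real spike sequence.

```python
def substitute(protein, changes):
    """Apply point substitutions to a protein given in one-letter code.

    `changes` maps a 1-based position to a pair (expected_old, new).
    The expected residue is checked first so that an off-by-one error
    in the numbering is noticed instead of silently applied.
    """
    residues = list(protein)
    for position, (expected_old, new) in changes.items():
        found = residues[position - 1]
        if found != expected_old:
            raise ValueError(f"position {position}: expected {expected_old}, found {found}")
        residues[position - 1] = new
    return "".join(residues)


# Placeholder only, NOT the real spike protein: 1273 residues with
# lysine (K) at position 986 and valine (V) at position 987.
placeholder_spike = "A" * 985 + "KV" + "A" * (1273 - 987)

# The two exchanges described in the text: K986P and V987P.
stabilised = substitute(placeholder_spike, {986: ("K", "P"), 987: ("V", "P")})

changed = [i + 1 for i, (a, b) in enumerate(zip(placeholder_spike, stabilised)) if a != b]
print("changed positions:", changed)
```

Everything except these two positions stays untouched, which is the point of the design: the immune system should see a protein that is as close as possible to the one on the virus surface.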

Sequence optimization by BioNTech and reverse engineering by citizen science

If one compares the sequence of Tozinameran with the portion of the viral genome that encodes the spike protein, one finds that the two differ considerably. How can that be, if I have just explained that only two amino acids have been exchanged? The answer lies in a property of the genetic code that biologists call degeneracy. As I have already explained in more detail elsewhere, there are only 20 amino acids in our proteins, whereas DNA and RNA offer 64 possible triplets to code for them, because an amino acid is always determined by three bases, each chosen from the four bases A, C, G and T / U. As a result, the last base of such a triplet often no longer plays a role: CCA, CCC, CCG and CCT / U, for example, all code for the amino acid proline. Thus CCA in an RNA can simply be rewritten as CCG without changing anything in the encoded protein; one speaks of a “silent mutation”. But why should you do that?

The code sun is read from the inside out: the innermost ring gives the first base of an RNA triplet, the middle ring the second and the third ring the third. Note that the identity of the third base often no longer matters for the amino acid encoded by the triplet (shown on the outside).
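What the code sun shows graphically can also be written as a lookup table. The following Python sketch builds the standard genetic code from a compact 64-letter string (a common shorthand, with `*` marking stop codons) and checks the statements made above: 64 triplets, 20 amino acids, four synonymous triplets for proline. The function and variable names are my own choice.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per triplet in the order UUU, UUC, UUA, UUG, UCU, ...
# "*" marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna):
    """Translate an RNA string triplet by triplet; leftover bases at the end are ignored."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


proline_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "P")
distinct_amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE), "triplets code for", len(distinct_amino_acids), "amino acids")
print("proline:", proline_codons)
print("CCA ->", translate("CCA"), "| CCG ->", translate("CCG"))  # a silent mutation
```

Because the table maps many triplets onto the same letter, two RNA strings can differ in many positions and still give the same output from `translate`.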

Amazingly, it seems that GC-rich mRNA is simply a lot more durable and is translated into protein more efficiently than AU-rich mRNA (see e.g. here). Most of the differences between the sequence of Tozinameran and the sequence in the SARS-CoV2 genome can therefore be traced back to the developers increasing the GC content of the RNA through silent mutations. Another way to make the sequence more efficient through silent mutations is to follow what is known as human codon usage, because every organism uses the different synonymous triplets with different frequencies in its coding genome. Optimizing DNA and RNA sequences according to these criteria is quite a sport in itself, and apparently not only laboratory scientists play it.
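As a rough illustration of the first idea, the Python sketch below replaces every triplet with the synonymous triplet that contains the most G and C. This is a deliberately naive strategy of my own and certainly not BioNTech's actual procedure, which also has to weigh codon usage and other constraints. The input is a short toy sequence and all names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ...; "*" marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def gc_content(seq):
    """Fraction of G and C among all bases of the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def translate(rna):
    """Translate triplet by triplet (length must be a multiple of three)."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def maximise_gc(rna):
    """Naive recoding: per triplet, pick the synonymous triplet with the most G/C.

    Ties are broken alphabetically so that the result is reproducible.
    """
    recoded = []
    for i in range(0, len(rna), 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
        recoded.append(max(synonyms, key=lambda c: (sum(b in "GC" for b in c), c)))
    return "".join(recoded)


toy = "AUGUUUGUUUUUCUUGUU"   # toy input, six triplets
recoded = maximise_gc(toy)

print(toy, f"GC = {gc_content(toy):.2f}", translate(toy))
print(recoded, f"GC = {gc_content(recoded):.2f}", translate(recoded))
```

Both lines end in the same protein letters while the GC fraction rises, which is exactly what a series of silent mutations is supposed to achieve.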

Bert Hubert has already described much of what I am writing here in his blog post, which is well worth reading. His post encouraged readers to optimize the RNA sequence of the spike protein themselves with suitable software and to check how closely the result matches the sequence of Tozinameran. The second part of his blog article shows the leaderboard, which is currently (as of January 2, 2021, 10 p.m.) led by Erik Brauer with 91.08% identity, achieved with the help of the DNAChisel algorithm.
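A percentage like the one on the leaderboard can be thought of as the share of matching positions between two sequences of equal length. The Python sketch below implements that simple definition. How exactly the leaderboard score is computed is an assumption on my part, and the two input strings are toy examples, not the real sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equally long sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Two toy sequences that encode the same six amino acids with different triplets.
candidate = "AUGUUUGUUUUUCUUGUU"
reference = "AUGUUCGUGUUCCUGGUG"

print(f"{percent_identity(candidate, reference):.2f}% identical")
```

Since the protein is fixed, a participant can only gain points by choosing, for each amino acid, the same synonymous triplet as the vaccine developers did.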

A Conclusion and the Question of the “Gene of the Week”

In the context of The Gene of the Week, this article is quite special. As the Gene of the Week, one could simply consider the section of the RNA genome of SARS-CoV2 that encodes the spike protein. Or the synthetic gene at the core of Tozinameran. We have seen that the two sequences are quite different, not only in the order of the nucleotides but, and this is very unusual, even in their chemical nature. And yet both nucleic acids encode almost the same protein.

The development of such a vaccine in such an incredibly short time was only possible thanks to our deep understanding of the way DNA and RNA store information. We can hope that this vaccine represents a sustainable and effective measure against the Covid-19 pandemic. What’s more, this first approval of an RNA vaccine should mean that we will be able to react even faster in the future, both to new and to well-known infectious diseases, because the RNA in the liposomes can be exchanged relatively easily.

Of course, future RNA vaccines will also have to be tested for safety, as was done extensively with Tozinameran and is still being done (phase III studies will run until August 2021). Anyone who doubts whether “long-term effects” can be estimated in the shortened time before the recent emergency approval of Tozinameran should watch Martin Moder’s video on the subject (in German).

1 Response

  1. Ben says:

    Thanks for this. It is really nice that you make an effort to explain the fine details of the mRNA vaccine. However, to my knowledge the modified base is m1Ψ = 1-methyl-3′-pseudouridine and not just pseudouridine. Interestingly, the WHO structure of the BioNTech vaccine, while being correct in naming this base, also contains an error. The artificial base structure there is wrong.
