SARS-CoV-2 Structure and Infection Cycle

The severe acute respiratory syndrome (SARS) coronavirus, SARS-CoV-2, is responsible for COVID-19, the worst global pandemic in the 21st century to date, crippling health and economies worldwide. This has triggered a global effort in the scientific community to focus research efforts on SARS-CoV-2 in the hopes that better detection tools, vaccines, and antivirals can be developed to combat the virus. In this article, we will review what is currently known about the SARS-CoV-2 structure and mechanism of infection.

  • Diagnostic tests are used to detect active coronavirus infections and determine whether quarantine or isolation is needed.
  • Antibody tests determine the long-term immune status of a previously-infected individual.

1. Structure of SARS-CoV-2

The coronavirus SARS-CoV-2 is a large enveloped virus with a single-stranded, non-segmented, positive-sense RNA genome (1). The defining feature of coronaviruses is the crown-shaped spike projection (2) on the surface of the virion (Figure 1).

The SARS-CoV-2 structure is composed of:

  • 4 main structural proteins: spike (S), membrane (M), envelope (E), and nucleocapsid (N) proteins
  • 16 non-structural proteins
  • 5 to 8 accessory proteins (3,4) (Figure 2).

These proteins work collaboratively to infect, replicate, and produce structurally complete viral particles.

Figure 1 – Transmission electron micrograph of the Avian coronavirus capturing the characteristic crown of “spikes” on the virion surface. Photo credit: CDC/Dr. Fred Murphy.

Figure 2 – Basic structure of the SARS-CoV-2 virus responsible for the COVID-19 pandemic.

A. SARS-CoV-2 Structural Proteins

i. The Spike (S) protein is a trimeric protein found on the surface of the viral envelope that mediates host receptor recognition, cell attachment, and membrane fusion (2). The S protein has an extracellular N-terminus, a transmembrane domain anchored in the viral membrane, and an intracellular C-terminus (5). Each monomer has an S1 and an S2 domain, which together form the crown-like halo observed by cryo-electron microscopy (2).

The N-terminal S1 subunit contains the receptor binding domain (RBD) that is responsible for virus-host cell binding (3). RBD is also the immunogenic core that is targeted by neutralizing antibodies (6). The C-terminal S2 subunit contains the fusion peptide (FP), HR1, HR2, Transmembrane (TM), and C-terminus domain, and facilitates membrane fusion and viral genome release (7).
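The subunit and domain layout described above can be written down as a small lookup table. This is only a sketch of the text's description: the names `SPIKE_SUBUNITS` and `subunit_of` are hypothetical, "CT" is shorthand introduced here for the C-terminus domain, and no residue coordinates are included because the text does not give any.

```python
# Domain layout of the S protein as described in the text above.
# The dictionary and helper names are illustrative, not a standard API.
SPIKE_SUBUNITS = {
    "S1": ["RBD"],                           # receptor binding
    "S2": ["FP", "HR1", "HR2", "TM", "CT"],  # membrane fusion; "CT" = C-terminus domain
}


def subunit_of(domain: str) -> str:
    """Return the subunit (S1 or S2) that contains the given domain."""
    for subunit, domains in SPIKE_SUBUNITS.items():
        if domain in domains:
            return subunit
    raise KeyError(f"unknown domain: {domain}")


print(subunit_of("RBD"))
print(subunit_of("HR2"))
```

Keeping the domains in N-terminal to C-terminal order within each list preserves the order given in the text.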

Figure 3 – The basic structure of the Spike (S) protein trimer.

ii. The M glycoprotein is the most abundant structural protein in the coronavirus and defines the shape of the viral envelope (1). It has three transmembrane domains that contain a T cell epitope cluster. It has also been shown to stabilize nucleocapsid proteins and promote viral assembly by binding to the N protein-RNA complex inside the virion (8). In addition, it is regarded as a key factor in stimulating a virus-specific humoral response, including the production of neutralizing antibodies in the host.

iii. The E protein is the smallest SARS-CoV-2 structural protein, and it is involved in virus assembly, release and maturation. It is abundantly expressed inside the infected host cell, with the majority localized at the endoplasmic reticulum (ER), Golgi, and endoplasmic reticulum-Golgi intermediate compartment (ERGIC). The glycosylated viral E proteins will be inserted into the host cell membrane, acting as an ion channel to facilitate virus maturation (8).

iv. The N protein is the only structural protein that primarily interacts with the coronavirus RNA genome, binding and packing the helical nucleocapsid (8). This heavily phosphorylated N protein binds to the viral RNA genome and localizes in the ER-Golgi region of host cells, where it assists in the translation, transcription, and encapsulation of genomic material into viral particles (1).

Browse abm’s collection of SARS-CoV-2 receptors, ACE2 and TMPRSS2, available as recombinant proteins or in a variety of expression systems (lenti-, adeno-, aav, and more) or as stable cell lines.

2. Life Cycle of SARS-CoV-2
A. Host Immune System Evasion, Cell Attachment and Entry

The pathogenicity of a virus is determined by its ability to evade the host immune system and enter host cells. Studies show that SARS-CoV-2 is more infectious, induces a weaker host immune response (6), and possesses higher fusogenic activity than SARS-CoV and MERS-CoV (9). Functional differences in the SARS-CoV-2 S protein may explain its enhanced pathogenicity.

In a nutshell, the SARS-CoV-2 spike protein is responsible for cell attachment and host cell entry, and it is also the protein targeted by the host’s immune system for removal. This is what is currently known about the mechanism of SARS-CoV-2 attachment and entry:

  • 1. The receptor binding domain (RBD) of the S1 subunit binds to angiotensin-converting enzyme 2 (ACE2) receptors in the lower respiratory tract, such as those on the surfaces of alveolar and lung epithelial cells (10).
  • 2. The S protein is cleaved by the transmembrane serine protease 2 (TMPRSS2) (7).
  • 3. Cleavage induces a conformational change that exposes the hydrophobic fusion peptide (FP) of S2.
  • 4. S2 then inserts into the host membrane to initiate membrane fusion (5). The HR1 and HR2 helical fragments of the S2 subunit also form helical bundles that are essential for viral fusion (2).

Figure 4 – The RBD of the S protein binds to the host cell’s ACE2 enzyme. This triggers the host cell’s TMPRSS2 protease to cleave the S protein and initiate membrane fusion with the virion. (Step 1-4)

Browse abm’s collection of SARS-CoV-2 proteins, available as RNA, DNA, recombinant proteins or in a variety of expression systems (lenti-, adeno-, aav, and more) or as stable cell lines.

There are several unique features of the SARS-CoV-2 S protein that may explain its enhanced infectivity and specificity to the host cell’s ACE2 receptor. Firstly, SARS-CoV-2 has a unique multibasic motif, “RRAR”, at the S1/S2 boundary (Figure 3) that is absent in SARS-CoV and other bat SARS-like coronaviruses (11). The RRAR motif can be cleaved by proprotein convertases (PPC) such as furin, a protease abundant in the respiratory tract, which proteolytically pre-activates SARS-CoV-2 (12). Cleavage of this motif gives the host cell’s TMPRSS2 enzyme easier access to cleave the S2 domain and initiate membrane fusion (6). As a result, furin processing of SARS-CoV-2 has been shown to enhance infection and cell-cell fusion (11, 13).
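As a rough illustration of how the multibasic motif might be located in a protein sequence, the sketch below scans a short fragment for an Arg-x-x-Arg pattern and splits the fragment after the match. Several things here are assumptions made for the example: the 12-residue fragment is meant to represent the S1/S2 junction and should be checked against a reference sequence before being relied on, the R-x-x-R pattern is a simplified stand-in for protease site preferences, and the function names are hypothetical.

```python
import re

# Short fragment assumed to span the S1/S2 boundary (illustration only).
FRAGMENT = "TNSPRRARSVAS"

# Simplified multibasic pattern: Arg, any two residues, Arg.
# The lookahead lets overlapping candidates be reported as well.
MULTIBASIC = re.compile(r"(?=(R..R))")


def find_multibasic_sites(seq):
    """Return (start_index, motif) for each R-x-x-R match in seq."""
    return [(m.start(), m.group(1)) for m in MULTIBASIC.finditer(seq)]


def cleave_after(seq, start, motif_len=4):
    """Split seq immediately after the motif that begins at `start`."""
    cut = start + motif_len
    return seq[:cut], seq[cut:]


sites = find_multibasic_sites(FRAGMENT)
print(sites)                      # [(4, 'RRAR')]
s1_side, s2_side = cleave_after(FRAGMENT, sites[0][0])
print(s1_side, s2_side)           # TNSPRRAR SVAS
```

A pattern this loose will also match many unrelated sequences, so in practice a hit would only be a candidate site to examine further.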

Secondly, the RBD of SARS-CoV-2 is structurally very different from other CoVs (14). Unlike SARS-CoV, the immunogenic SARS-CoV-2 RBD hides in a “lying down” conformation, packed closely into the protein’s central cavity (15). It is believed that this helps the virus escape host immune system recognition and small molecule inhibitors, although it also reduces RBD’s binding efficiency to ACE2. PPC pre-activation cleaves and exposes the RBD, significantly enhancing binding affinity to ACE2 and therefore, cell fusion (6). In fact, SARS-CoV-2 S protein’s binding affinity has been shown to be as much as 10 to 20 times higher than that of SARS-CoV (16).

Aside from the furin and TMPRSS2 pathway for host cell entry, researchers have also explored the role of the Cathepsin L protease in virion endocytosis (see step 2b in Figure 6). This lysosomal enzyme can also cleave the S protein to induce a conformational change that triggers endosomal membrane fusion (20).

Figure 5 – SARS-CoV-2 can also be taken into the cell via endocytosis where it will be cleaved by the lysosomal enzyme, Cathepsin L, which activates endosomal membrane fusion.

Since the spike protein is targeted by the immune system and is responsible for cell attachment and membrane fusion, scientists are studying the protein extensively for COVID-19 research.

To learn more about the sequences and recombinant expression of S protein subunits, RBD, and the multibasic motif, browse abm’s collection of S protein plasmids, and S protein cell lines.
B. Replicase Protein Expression

  • 5. After cell attachment and membrane fusion, the viral ssRNA genome is released into the host’s cellular cytosol.
  • 6. Host ribosomes bind to the ssRNA and begin translation at the ORF1a and ORF1b region to produce 2 key polyproteins, pp1a and pp1ab (15), which are precursors for non-structural proteins (NSP). pp1ab is the longer form of pp1a and results from a -1 ribosomal frameshift during translation near the NSP10 region of the viral genome.
  • 7. Chymotrypsin-like protease (3CLpro/NSP5) and papain-like protease (PLpro/NSP3) within pp1a and pp1ab self-cleave and then proteolytically release other NSP proteins from the polyprotein (15).
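The frameshift in step 6 can be illustrated with a toy reader that walks an RNA string codon by codon. The sketch assumes a -1 shift, meaning the ribosome slips back one nucleotide and continues in the new frame, which lets it bypass the stop codon that ends the shorter product. The RNA string is invented for the example and is not taken from the viral genome, and the function names and the `slip_at` parameter are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_codons(rna, start):
    """Read codons from `start` until a stop codon or the end of the RNA."""
    codons = []
    for i in range(start, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


def read_with_minus1_shift(rna, slip_at):
    """Read in frame 0 up to index `slip_at`, slip back one base, continue."""
    before = read_codons(rna[:slip_at], 0)
    after = read_codons(rna, slip_at - 1)  # base at slip_at - 1 is read twice
    return before + after


TOY_RNA = "AUGGCUUUAAACUAAGGCCUGAUUGA"  # invented sequence, illustration only

short_product = read_codons(TOY_RNA, 0)                       # analogous to pp1a
long_product = read_with_minus1_shift(TOY_RNA, slip_at=12)    # analogous to pp1ab

print(short_product)  # ['AUG', 'GCU', 'UUA', 'AAC']
print(long_product)   # ['AUG', 'GCU', 'UUA', 'AAC', 'CUA', 'AGG', 'CCU', 'GAU']
```

Both products share the same first four codons; only the read that shifts frame continues past the in-frame stop codon, which mirrors how pp1ab extends pp1a.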

Figure 6 – SARS-CoV-2 viral ssRNA is translated to produce pp1a and pp1ab, precursors for non-structural proteins (NSP). (Step 5-7)

Non-structural proteins (NSP) have a wide range of functions, including:

  • interaction with host cell factors to control the replication cycle,
  • arresting host protein translation
  • building the replicase-transcriptase complexes (RTC)
  • RNA synthesis and genome replication (4, 7, 18).

The RNA-dependent RNA polymerase (NSP12/RdRp) is one of the key NSP enzymes of the RTC because it is required for replication and transcription of the viral RNA genome. NSP3, NSP5, and NSP12 are all popular targets for drug development and therapeutic research.

Figure 7 – The SARS-CoV-2 genome encodes for 4 main structural proteins, 16 non-structural proteins (NSPs), and 5 to 8 accessory proteins.

C. Replication, Transcription and Translation

  • 8. The replicase-transcriptase complex (RTC) and RdRp make full-length genomic RNAs and short subgenomic mRNAs.
  • 9a. The RTC transcribes the positive-sense ssRNA genomic strand into a minus-sense RNA. The minus-sense RNA can be replicated into positive-sense RNAs that are then packaged into new viral progeny as genomic RNA (18).
  • 9b. Alternatively, the minus-sense RNA is used as a template for subgenomic mRNA production by a process called discontinuous transcription (19). RdRp can initiate transcription at various locations along the minus-sense RNA and create mRNAs that code for the various viral proteins needed for packaging new virus particles.
  • 10. Structural proteins, such as M, S, and E, are translated at the rough endoplasmic reticulum (RER) and subsequently move to the endoplasmic reticulum-Golgi intermediate compartment (ERGIC) for post-translational modification (18). The N protein is translated in the cytoplasm, where it directly joins the replicated viral genome before moving to the ERGIC (8).
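The strand relationship in step 9a can be sketched with simple base pairing: copying a positive-sense strand gives its minus-sense complement, and copying that complement regenerates the original sequence. The fragment below is invented for illustration, and `complementary_strand` is a hypothetical helper that models only Watson-Crick pairing, not the enzymatic process or discontinuous transcription.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementary_strand(rna):
    """Return the complementary RNA strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


genome = "AUGGCUUUAAAC"                   # invented positive-sense fragment
minus = complementary_strand(genome)      # minus-sense template
progeny = complementary_strand(minus)     # new positive-sense copy

print(minus)              # GUUUAAAGCCAU
print(progeny == genome)  # True
```

Applying the same operation twice returns the starting sequence, which is why the minus-sense intermediate can serve as a template for new genomic RNA.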

Figure 8 – The replicase-transcriptase complex (RTC) transcribes positive-sense ssRNA into minus-sense RNA, which is either replicated into positive-sense RNA for packaging into new viruses or transcribed into mRNA used to produce the viral proteins needed for packaging. (Step 8-10)

D. Assembly, Release and Maturation

The viral genome and structural proteins are assembled at the ERGIC. The virus is released from the host cell primarily via budding, exocytosis, or cell death.

  • 11. In order to release the virion, the N protein interacts with the M proteins in the ERGIC to help orient the virion properly (8). The glycosylated E protein is also inserted into the host cell membrane, where it acts as an ion channel for virus maturation.
  • 12. The virus matures at the ERGIC and is finally released from the host cell via the constitutive exocytic pathway, after which it can infect and replicate in other host cells.

Figure 9 – Viral RNA and viral proteins are assembled and packed and the virion is matured at the ERGIC before it is exocytosed. (Step 11-12)

Check out our S Protein Pseudotype Custom Lentivirus Service for generating lentivirus pseudotyped with the S Protein for your research needs.

Figure 10 – The SARS-CoV-2 infection and replication cycle.

  • Astuti, I., & Ysrafil. (2020). Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2): An overview of viral structure and host response. Diabetes & Metabolic Syndrome Clinical Research & Reviews, 14(4), 407-412. doi:10.1016/j.dsx.2020.04.020
  • Huang, Y., Yang, C., Xu, X., Xu, W., & Liu, S. (2020). Structural and functional properties of SARS-CoV-2 spike protein: Potential antivirus drug development for COVID-19. Acta Pharmacologica Sinica, 41(9), 1141-1149. doi:10.1038/s41401-020-0485-4
  • Jiang, S., Hillyer, C., & Du, L. (2020). Neutralizing antibodies against SARS-CoV-2 and other human coronaviruses. Trends in Immunology.
  • Chen, Y., Liu, Q., & Guo, D. (2020). Emerging coronaviruses: Genome structure, replication, and pathogenesis. Journal of Medical Virology, 92(4), 418-423. doi:10.1002/jmv.25681
  • Bosch, B. J., van der Zee, R., de Haan, C. A., & Rottier, P. J. (2003). The coronavirus spike protein is a class I virus fusion protein: Structural and functional characterization of the fusion core complex. Journal of Virology, 77, 8801–8811.
  • Shang, J., Wan, Y., Luo, C., Ye, G., Geng, Q., Auerbach, A., & Li, F. (2020). Cell entry mechanisms of SARS-CoV-2. Proceedings of the National Academy of Sciences, 117(21), 11727-11734. doi:10.1073/pnas.2003138117
  • Poduri, R., Joshi, G., & Jagadeesh, G. (2020). Drugs targeting various stages of the SARS-CoV-2 life cycle: Exploring promising drugs for the treatment of COVID-19. Cellular Signalling, 74, 109721. doi:10.1016/j.cellsig.2020.109721
  • Malik, Y. A. (2020). Properties of coronavirus and SARS-CoV-2. Malaysian Journal of Pathology, 42(1), 3-11.
  • Papa, G., Mallery, D. L., Albecka, A., Welch, L. G., Cattin-Ortolá, J., Luptak, J., Paul, D., McMahon, H. T., Goodfellow, I. G., Carter, A., Munro, S., & James, L. C. (2021). Furin cleavage of SARS-CoV-2 Spike promotes but is not essential for infection and cell-cell fusion. PLOS Pathogens, 17(1), e1009246.
  • Xu, H., Zhong, L., Deng, J., Peng, J., Dan, H., & Zeng, X. (2020). High expression of ACE2 receptor of 2019-nCoV on the epithelial cells of oral mucosa. International Journal of Oral Science, 12, 1–5.
  • Papa, G., Mallery, D. L., Albecka, A., Welch, L. G., Cattin-Ortolá, J., Luptak, J., Paul, D., McMahon, H. T., Goodfellow, I. G., Carter, A., Munro, S., & James, L. C. (2021). Furin cleavage of SARS-CoV-2 Spike promotes but is not essential for infection and cell-cell fusion. PLOS Pathogens, 17(1), e1009246.
  • Laporte, M., Raeymaekers, V., Van Berwaer, R., Vandeput, J., Marchand-Casas, I., Thibaut, H. J., Van Looveren, D., Martens, K., Hoffmann, M., Maes, P., Pöhlmann, S., Naesens, L., & Stevaert, A. (2021). The SARS-CoV-2 and other human coronavirus spike proteins are fine-tuned towards temperature and proteases of the human airways. PLOS Pathogens, 17(4), e1009500.
  • Bestle, D., Heindl, M. R., Limburg, H., Van Lam van, T., Pilgram, O., Moulton, H., Böttcher-Friebertshäuser, E. (2020). TMPRSS2 and furin are both essential for proteolytic activation of SARS-CoV-2 in human airway cells. Life Science Alliance, 3(9). doi:10.26508/lsa.202000786
  • Hoffmann, M., Hofmann-Winkler, H., & Pöhlmann, S. (2018). Priming Time: How Cellular Proteases Arm Coronavirus Spike Proteins. Activation of Viruses by Host Proteases, 71–98.
  • Naqvi, A. A. T., Fatima, K., Mohammad, T., Fatima, U., Singh, I. K., Singh, A., Atif, S. M., Hariprasad, G., Hasan, G. M., & Hassan, M. I. (2020). Insights into SARS-CoV-2 genome, structure, evolution, pathogenesis and therapies: Structural genomics approach. Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease, 1866(10), 165878.
  • Wrapp, D., et al. (2020). Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science, 367, 1260–1263.
  • Poduri, R., Joshi, G., & Jagadeesh, G. (2020). Drugs targeting various stages of the SARS-CoV-2 life cycle: Exploring promising drugs for the treatment of COVID-19. Cellular Signalling, 74, 109721. doi:10.1016/j.cellsig.2020.109721
  • V’kovski, P., Kratzel, A., Steiner, S., Stalder, H., & Thiel, V. (2020). Coronavirus biology and replication: implications for SARS-CoV-2. Nature Reviews Microbiology, 19(3), 155–170.
  • Sawicki, S. G., & Sawicki, D. L. (1995). Coronaviruses use Discontinuous Extension for Synthesis of Subgenome-Length Negative Strands. Advances in Experimental Medicine and Biology, 499–506.
  • Gomes, C. P., et al. (2020). Cathepsin L in COVID-19: From pharmacological evidences to genetics. Frontiers in Cellular and Infection Microbiology.