Coronavirus Host Range Expansion and Middle East Respiratory Syndrome Coronavirus Emergence: Biochemical Mechanisms and Evolutionary Perspectives.


Coronaviruses have frequently expanded their host range in recent history, with two events resulting in severe disease outbreaks in human populations. Severe acute respiratory syndrome coronavirus (SARS-CoV) emerged in 2003 in Southeast Asia and rapidly spread around the world before it was controlled by public health intervention strategies. The 2012 Middle East respiratory syndrome coronavirus (MERS-CoV) outbreak represents another prime example of virus emergence from a zoonotic reservoir. Here, we review the current knowledge of coronavirus cross-species transmission, with particular focus on MERS-CoV. MERS-CoV is still circulating in the human population, and the mechanisms governing its cross-species transmission have been only partially elucidated, highlighting a need for further investigation. We discuss biochemical determinants mediating MERS-CoV host cell permissivity, including virus spike interactions with the MERS-CoV cell surface receptor dipeptidyl peptidase 4 (DPP4), and evolutionary mechanisms that may facilitate host range expansion, including recombination, mutator alleles, and mutational robustness. Understanding these mechanisms can help us better recognize the threat of emergence for currently circulating zoonotic strains.


Coronaviruses are a diverse family of viruses that infect a wide range of avian and mammalian hosts. Although bats, rodents, and birds act as the natural reservoir species for many coronaviruses (1–3), host range expansion into other species has been prevalent over the course of their evolutionary history. Known human coronaviruses likely originated as zoonotic pathogens that underwent host range expansion. These include coronaviruses associated with mild respiratory disease, such as HCoV-229E, HCoV-HKU1, HCoV-NL63, and HCoV-OC43, as well as strains that cause severe disease, including severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). These latter two viruses combined have resulted in over 1,100 deaths, and MERS-CoV is still circulating in the human population, causing heightened concern due to the lack of vaccines and therapeutics. The threat to public health caused by the emergence of these highly pathogenic strains into humans draws attention to the importance of understanding both the biochemical and evolutionary mechanisms of coronavirus host range expansion.

Coronaviruses are enveloped, single-stranded, positive-sense RNA viruses. Genomes contain a 5′ cap and 3′ poly(A) tail and are divided into nonstructural protein genes and structural and accessory protein genes. The core structural proteins include spike (S), envelope (E), matrix (M), and nucleocapsid (N) proteins. Although accessory proteins vary among coronaviruses and may also include some strain-specific structural glycoproteins, the order of the structural protein genes is highly conserved as S, E, M, and N. The ∼180-kDa spike glycoprotein mediates entry into the host cell and surrounds the virus particle, yielding a crown-like appearance. Coronaviruses utilize a variety of cellular proteins as receptors (4, 5); cleavage of the spike protein is crucial for mediating virus-host membrane fusion and subsequent entry into the cell.

High mutation rates seem likely to play a role in host range expansion. RNA viruses in general have inherently higher mutation rates than DNA viruses due to the decreased fidelity of the RNA-dependent RNA polymerase (RdRp) (6). This allows them to evolve at a faster rate, particularly when coupled with their large population sizes and short generation times. In fact, RNA viruses are responsible for the majority of high-profile viral emergence events into the human population within the past few decades (7). Higher mutation rates allow virus populations to produce higher levels of genetic variation, which not only allows for a more diverse pool of phenotypes that natural selection can act upon but also increases the probability of novel phenotypes. One of these novel phenotypes can be the ability to infect a new host. Changes can occur either within the virus spike protein to facilitate compatibility with a new host cell receptor or elsewhere in the viral genome to allow it to overcome alternate species-specific barriers. Some emerging coronaviruses have a broad host range, demonstrating an enhanced capacity to overcome host range expansion barriers and adapt to new host species. This increased capacity may be related to key characteristics that allow viral populations to produce novel variants, such as through recombination, mutator alleles, or mutational robustness, as discussed below.

Whereas high mutation rates likely facilitate host range expansion, it is less clear whether features unique to coronaviruses also play a role in the high frequency of host range expansion seen in this family. Two such unique features include their uncharacteristically large genome sizes (28–32 kb) and the presence of a proofreading mechanism. Coronaviruses are the only RNA viruses that have evolved a mechanism for proofreading their genomes. The nsp14 protein, known as ExoN, complexes with nsp10 to mediate a robust 3′-to-5′ exoribonuclease activity (8). This activity is similar to the proofreading activity of DNA polymerases (9), as highlighted by the conservation of the DEDD superfamily motif, a hallmark of exonuclease activity among DNA organisms, in nsp14. When this motif in nsp14 is mutated, the virus has a 15- to 20-fold increase in mutation rate (10–12). The increased fidelity provided by the nsp10-nsp14 complex, and the consequently lower relative mutation rate, appears to have allowed wild-type coronaviruses to escape error catastrophe and expand their genomes to almost double the size of the next-largest RNA virus genomes (13). Gene acquisition has been found to occur through recombination (14, 15), gene duplication and paralogous gene evolution (16, 17), and de novo generation by utilizing overlapping reading frames (18). Understanding the mechanisms of genome expansion and the functions of accessory genes will help elucidate whether these genes facilitate coronavirus host range expansion events.

Here we present a review of coronavirus host range expansions, particularly for MERS-CoV emergence into humans. We summarize current work showing that interactions between MERS-CoV and its host cell receptor are major determinants of mammalian cell permissivity for MERS-CoV. Additionally, we describe three evolutionary mechanisms that may promote host range expansion: recombination, mutator alleles, and mutational robustness. We discuss the relevance of these mechanisms for RNA viruses generally, and for coronaviruses such as SARS-CoV and MERS-CoV in particular. By improving our understanding of these mechanisms, we can increase the potential to predict which virus strains will be most likely to emerge into humans next.


Coronavirus Host Range Expansion

Coronaviruses infect a wide range of species, from birds to mice to pigs to humans (Figure 1). Most human coronaviruses are hypothesized to have originated from bats, although HCoV-OC43 and perhaps HCoV-HKU1 deviate from this pattern. HCoV-OC43 likely emerged from a bovine reservoir species (19), although the original host of this lineage and HCoV-HKU1 may have been a murine species (20). Some coronaviruses appear to be generalists, capable of infecting many different orders of mammals. For example, Betacoronavirus 1 has been detected in dogs, humans, and numerous ungulate species (21–23). Other coronaviruses have been detected in only a single mammalian order, such as the many SARS-like coronaviruses that have been found only in bats (24, 25). With a focus centered on bats as reservoirs, metagenomics analyses have found varying levels of coronavirus diversity in bat populations in North America (26) and China (27, 28), as well as detecting individual strains in bat populations worldwide (reviewed in 29). Novel coronaviruses continue to be discovered in bat populations globally; recent examples include samples from Mexico (30), Brazil (31), and South Africa (32). A recent estimate of viral diversity in the bat species Pteropus giganteus from Bangladesh identified 55 viruses, four of which were coronaviruses (33). The prevalence of many emerging viruses in bats has been attributed to bat diversity (species richness and ecology), immunology, physiology, ability to traverse wide geographic regions along with seasonal migrations, and high-density roosting behavior (34, 35).

Figure 1 


A number of established human coronaviruses have been circulating in the human population for hundreds of years (36, 37). At present, most strains cause only mild respiratory symptoms. However, both SARS-CoV and MERS-CoV recently emerged into the human population to cause severe disease. Before infection and transmission were controlled, SARS-CoV infected over 8,000 people, with a 9.6% mortality rate (38). Whereas SARS-CoV is very closely related to bat coronaviruses, with up to 92% overall nucleotide sequence identity (1), no virus identical to SARS-CoV has yet been isolated from bats. However, analysis of the SARS-CoV receptor, angiotensin-converting enzyme 2 (ACE2), showed recurrent positive selection (dN/dS > 1) on the bat ACE2 gene (Figure 2a), which suggests that a SARS-like coronavirus was circulating in the bat population for a long period of time before jumping into humans (39). This led to the hypothesis that, although bats served as the original reservoir species, SARS-CoV emergence into humans may have been facilitated by an intermediate host, such as palm civets, which were prevalent in the marketplaces where SARS-CoV originated (reviewed in 40). This hypothesis was supported by the fact that, initially, no bat coronaviruses were found to utilize ACE2 or any ACE2 ortholog (41). However, a recent study isolated a SARS-like coronavirus that is able to utilize human, civet, and Chinese horseshoe bat ACE2 for cell entry (27). This virus (bat SL-CoV-WIV1) offers strong evidence that SARS-CoV originated from a bat reservoir and suggests that an intermediate host may not have been required to facilitate adaptation to human ACE2. Instead of providing the necessary selective pressure for SARS-CoV spike adaptation, the civet may have played a crucial role in an epidemiological and ecological context by amplifying the virus and placing it in close proximity to humans (42). Further research will be needed to determine the precise mutational and cross-species path of SARS-CoV, and these data should provide crucial insight into how viral adaptation facilitates the emergence of new human pathogens.
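
The dN/dS criterion cited above for bat ACE2 can be read with a small sketch. The ratio, often written ω, compares the rate of nonsynonymous substitutions (dN) with the rate of synonymous substitutions (dS). Values above 1 are conventionally interpreted as positive selection, values near 1 as neutral evolution, and values below 1 as purifying selection. The function name, the tolerance band, and the numeric inputs are illustrative assumptions and are not taken from the cited analysis (39); estimating dN and dS from real alignments requires dedicated codon-model software.

```python
def classify_selection(dn: float, ds: float, tol: float = 0.05) -> str:
    """Interpret omega = dN/dS for a gene or codon site.

    dn, ds: nonsynonymous and synonymous substitution rates (hypothetical inputs).
    tol: illustrative width of the band treated as "approximately neutral".
    """
    if ds <= 0:
        raise ValueError("dS must be positive for omega to be defined")
    omega = dn / ds
    if omega > 1 + tol:
        return "positive selection"
    if omega < 1 - tol:
        return "purifying selection"
    return "approximately neutral"


# Hypothetical rates, chosen only to exercise each branch.
print(classify_selection(dn=0.30, ds=0.12))  # omega = 2.5
print(classify_selection(dn=0.02, ds=0.20))  # omega = 0.1
```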

Figure 2 


Emergence of MERS-CoV

In 2012, MERS-CoV emerged into the human population. As of June 2015, there were 1,227 laboratory-confirmed cases of MERS-CoV infection, with at least 449 related deaths, resulting in a 37% mortality rate (43). MERS-CoV is grouped phylogenetically into the C betacoronavirus clade along with the bat coronaviruses BtCoV-HKU4 and BtCoV-HKU5 (Figure 1) (29, 44). Unlike SARS-CoV, MERS-CoV utilizes dipeptidyl peptidase 4 (DPP4) as an entry receptor (45). Among the coronaviruses with characterized entry receptors, only MERS-CoV and BtCoV-HKU4 have been found to utilize DPP4 (46, 47). Whereas MERS-CoV binds to human DPP4 (hDPP4) more efficiently than to bat DPP4 (bDPP4), BtCoV-HKU4 can use both hDPP4 and bDPP4 efficiently (47), suggesting that MERS-CoV has adapted specifically to hDPP4. BtCoV-HKU4 replication in human cells may be restricted by the lack of appropriate entry proteases as well (47). Despite BtCoV-HKU5's sequence similarity to MERS-CoV, it is unable to utilize DPP4 for cell entry (46, 47), and its actual receptor for entry remains unknown.
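
The 37% figure above follows directly from the reported counts. The short sketch below reproduces the arithmetic; the function name is an illustrative choice. Because the text reports "at least" 449 deaths among laboratory-confirmed cases, the result is best read as a lower bound on mortality among confirmed cases rather than a precise population-level estimate.

```python
def case_fatality_percent(deaths: int, cases: int) -> float:
    """Deaths as a percentage of laboratory-confirmed cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * deaths / cases


# Counts reported in the text as of June 2015 (43).
mers_cfr = case_fatality_percent(deaths=449, cases=1227)
print(f"MERS-CoV case fatality: {mers_cfr:.1f}%")  # rounds to 37%
```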

The similarity between MERS-CoV, bat coronaviruses BtCoV-HKU4 and BtCoV-HKU5, and other group 2c bat coronaviruses (Figure 1) suggests that MERS-CoV originated from bats. As with SARS-CoV, however, the detection of a full-length MERS-CoV sequence in bat populations has been elusive, although one study analyzing bats for the presence of MERS-CoV did identify a 190-nt fragment from the RdRp gene with 100% sequence identity to MERS-CoV (48). Like ACE2 (39), bat DPP4 genes have been found to be under strong positive selection (Figure 2b), suggesting that DPP4-utilizing coronaviruses have been circulating in bats for a long time (49). It is unknown whether this signal of positive selection results from the MERS-CoV progenitor, from another DPP4-utilizing bat coronavirus, such as BtCoV-HKU4, or from a noncoronavirus that also utilizes DPP4. Bat coronaviruses with high sequence similarity to MERS-CoV have been detected in geographic regions very distant from its original emergence location in Saudi Arabia. RdRp gene sequences with 96.5% and 99.6% amino acid identity to MERS-CoV were detected from bat populations in Mexico (30) and South Africa (32), respectively. Although full-length sequences will be needed to define the exact phylogenetic similarities across the group 2c strains, these discoveries emphasize the continued importance of metagenomics analyses of bat viromes in widespread geographic locations.

The field's inability to identify MERS-CoV sequences with high frequency in bat populations suggests that an intermediate host may have facilitated the host range expansion event. However, it took nearly a decade to identify bat coronaviruses with high homology to SARS-CoV (27), revealing that sampling is the greatest limiting factor. Still, the origin of MERS-CoV in Saudi Arabia led to the detection of serum antibodies against MERS-CoV (or MERS-like coronaviruses) in local camel populations (50). Since the 1960s, major changes in commercial camel practices have led to large numbers of camels living in close proximity to human populations (51). Camels have been targeted as the most probable intermediate host, though this possibility remains controversial (52). A significantly greater seroprevalence of MERS-CoV antibodies has been found in individuals exposed to camels than in the general population (53). Additionally, sequences nearly identical to MERS-CoV have been isolated from a number of camels in Qatar (54) and Saudi Arabia (55, 56). Sequences obtained from dromedary camels in Saudi Arabia showed variants within single samples; one variant was found within the receptor-binding domain (RBD) of the spike protein (A520S); the remaining variants were outside of the RBD (55). The functional significance of these variants is unknown, and more data are needed to reveal which mutations were important for allowing MERS-CoV to be successful in humans.



As detailed below, in vitro and in vivo studies show that a major limitation on the permissivity of mammalian cells for MERS-CoV is the functional interaction of the virus with its cell surface receptor, DPP4 (45). DPP4 is a ubiquitously expressed cell surface protease that has a catalytic role in selectively removing N-terminal dipeptides from certain proteins. It has been well studied due to the role it plays in glucose metabolism, immune responses, adhesion, and apoptosis (57). Whereas MERS-CoV can enter cells using DPP4 from humans, nonhuman primates, bats (58), camels, horses, and to a lesser extent goats (59, 60), it is unable to enter cells using DPP4 from traditional small animal model species such as mice (61, 62), ferrets (63), and hamsters (58, 64). This species restriction prevents the adequate study of MERS-CoV pathogenesis and limits the development of vaccine strategies or alternate therapeutics. Below we explore present evidence on whether this species restriction is based on host cell receptor interactions with the MERS-CoV RBD or on other species-specific restriction factors.

The crystal structure of human DPP4 (hDPP4) complexed with the MERS-CoV RBD (65, 66) shows that blades 4 and 5 of the DPP4 N-terminal β-propeller domain primarily interact with the MERS-CoV RBD (Figure 3a). Cockrell et al. (61) found that introducing part of this region (residues 279 to 346) from hDPP4 into mouse DPP4 (mDPP4) resulted in successful infection in vitro. This result is supported by a similar experiment conducted using ferret DPP4 (fDPP4), in which swapping residues 247 to 504 of hDPP4 into the fDPP4 backbone supported MERS-CoV infection (63). In addition, transfecting hDPP4 into mouse or hamster cell lines resulted in successful infection (61), suggesting that it is host cell receptor interactions that restrict MERS-CoV host range, rather than the presence of other host-specific restriction factors. This conclusion is further supported by in vivo studies, with the first potential MERS-CoV mouse model showing that transient adenovirus-mediated hDPP4 expression in mice results in susceptibility to MERS-CoV (67). Since then, a transgenic mouse with global expression of hDPP4 has been produced that also yielded successful MERS-CoV infection (68). However, this model results in high viral titers in most organs, including the brain, suggesting that additional improvements are needed to more faithfully phenocopy the human disease model. In addition, the enzymatic activity of DPP4 can have detrimental effects, particularly when the protein is overexpressed (69), and the impact of these effects in the transgenic mouse model should be explored.

Figure 3 


Identifying which DPP4 residues mediate MERS-CoV permissivity has become a priority for informing the development of additional small animal models and potential therapeutics. Although the aforementioned swaps of large regions of DPP4 indicate that DPP4 orthologs can act as general scaffolds to support MERS-CoV infection, they do not reveal which residues are the most important. To address this, a mutagenesis study identified several key residues mediating the binding between the MERS-CoV RBD and hDPP4 (70). These include hDPP4 residues 267 and 336, which contribute to a positively charged patch on blade 4 (Figure 3b), and residues 294 and 295, which form an important hydrophobic region on blade 5 (Figure 3c). This study suggests that these two regions play an important role in mediating MERS-CoV entry. Experiments with DPP4 orthologs support this finding. For mDPP4, simultaneous substitutions at two residues were capable of mediating permissivity to MERS-CoV: residues 288 and 330 (aligning to 294 and 336 in hDPP4) (61), which lie within the aforementioned hydrophobic and positively charged regions, respectively. Similar experiments performed with hamster DPP4 found that changes at five residues can mediate permissivity (64); insertion of the human amino acids at five sites within blades 4 and 5 (Figure 3d) allowed MERS-CoV to successfully enter and replicate in cells expressing the humanized hamster DPP4. These studies again support the importance of residues on both blades 4 and 5 of DPP4 for mediating permissivity.

Residue 330 in mDPP4 is part of a putative glycosylation site, which is absent in hDPP4. Removal of this glycosylation site was particularly influential in conferring MERS-CoV permissivity (71). Glycosylation may act as a broader mechanism of resistance to MERS-CoV infection. The nonpermissive hamster DPP4 has the same putative glycosylation site as mDPP4, and fDPP4 has a putative glycosylation site just upstream of the site in mDPP4 (see 71, figure 1a). There is precedent for glycosylation changing the permissivity of host cells to other coronaviruses. For example, modifying a glycosylation site in rat ACE2, combined with a point mutation, allows SARS-CoV to utilize it as an efficient receptor (72). Additionally, HCoV-229E utilization of its receptor human aminopeptidase N (hAPN) can be abolished by the insertion of a glycosylation site into hAPN (73). The influence of receptor glycosylation on the host range of a number of coronaviruses raises the possibility that these modifications may be linked to viral selective pressures.

Because BtCoV-HKU4 also utilizes DPP4 as an entry receptor, we can compare its RBD to that of MERS-CoV. The spike proteins of MERS-CoV and BtCoV-HKU4 show 67% amino acid identity (accession numbers AHX00731.1 and YP_001039953.1, respectively, aligned using Vector NTI). MERS-CoV RBD residues important for facilitating the interaction between the virus and DPP4 include residues L506, W553, and V555 [which form a hydrophobic core that interacts with hDPP4 residue L294 (Figure 4a)], as well as residue Y499 (which engages the hDPP4 residue R336) (66). The amino acid identities at these locations in the RBD are partially conserved in BtCoV-HKU4 (Y503, L510, L558, and I560) (Figure 4b,c), suggesting that utilization of hDPP4 and bDPP4 is robust to variation for some of these key interactions. Further studies can determine whether or not this characteristic is sufficient to allow BtCoV-HKU4 to utilize other host species receptors with the same efficiency as MERS-CoV. It seems likely that additional metagenomics studies will reveal clusters of MERS-like group 2c strains that are more closely related to either BtCoV-HKU4 or MERS-CoV (Figure 1) and that will also utilize DPP4 receptors for entry. Thus, the structural mechanisms regulating DPP4 species-specificity and usage will become more clear over the next few years.
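
The notion of amino acid identity used above can be illustrated with a minimal sketch. It computes percent identity over two already-aligned sequences, skipping gap columns (one of several conventions for the denominator). As a toy input it uses only the four key RBD positions named in the text, pairing MERS-CoV Y499, L506, W553, and V555 with BtCoV-HKU4 Y503, L510, L558, and I560 in positional order. That pairing is an assumption made for illustration, and the function name is hypothetical; a full-length comparison would require a proper alignment of the two spike sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns, ignoring columns with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Key RBD residues from the text, in assumed positional correspondence.
mers_key = "YLWV"   # Y499, L506, W553, V555
hku4_key = "YLLI"   # Y503, L510, L558, I560
print(percent_identity(mers_key, hku4_key))  # two of four positions identical
```

On this toy input, two of the four key positions are identical, which is consistent with the "partially conserved" description in the text.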

Figure 4 


Although the interactions between the MERS-CoV RBD and DPP4 are important, other host factors also likely influence productive MERS-CoV infection. Binding to DPP4 is the first step in entry; however, host cell proteases also play a crucial role by proteolytically activating the spike and facilitating spike fusion. Coronaviruses have evolved the ability to use a variety of host proteases to process the spike protein. The SARS-CoV spike can be activated by type II transmembrane serine protease (TMPRSS2), cathepsin L, trypsin, elastase, and human airway trypsin-like protease (HAT) (reviewed in 74). Similarly, the MERS-CoV spike can be processed by TMPRSS2 at the cell surface and by cathepsin L in the endosome (75, 76). Furin can also cleave the MERS-CoV spike protein, with increased furin expression resulting in enhanced susceptibility to MERS-CoV (77, 78). This finding has important host range implications based on the presence or absence of these proteases in putative host species and the potential differences in cleavage site recognition that might occur between them. In fact, expressing human TMPRSS2 in vitro was found to enhance MERS-CoV spike–mediated pseudovirus entry, but did not enhance entry by BtCoV-HKU4 spike pseudoviruses (47). Additionally, exogenous proteases are not essential for MERS-CoV entry into human cells (59) but are required for BtCoV-HKU4 spike–mediated entry into cells (47). This suggests that MERS-CoV has adapted to human proteases and also has evolved the ability to enter cells using atypical proteases or in a protease-independent manner. The mechanisms of MERS-CoV entry and the role of host proteases are ongoing areas of investigation, and further knowledge can help us understand the cross-species transmission of MERS-CoV.

Evolutionary Implications of Virus–Host Cell Receptor Interactions

Among viruses for which the host receptor has been identified, there is an association between host range and phylogenetic conservation of that receptor (79). This result is consistent with previous species-level studies that have shown that the more phylogenetically related two species are, the more likely it is that a virus will be able to jump between them (80). These observations confirm that the host receptor is a primary determinant of host range expansion and also that receptor conservation can potentially act as a screen to identify viruses that are likely to jump into humans.

The link between DPP4 sequence conservation across species and permissivity to MERS-CoV infection, however, is not obvious. Analysis of the DPP4 gene tree shows no clear phylogenetic clustering of permissive and nonpermissive hosts (Figure 5; see also 49). In Figure 5, blue indicates permissive hosts and red indicates nonpermissive hosts (58–64, 81–84). The lack of a clear pattern of permissivity among closely related DPP4 genes suggests that other aspects of DPP4 may be more important than the linear amino acid sequence, such as structural similarity or conservation of posttranslational modifications (e.g., glycosylation). Alternately, or in addition, receptor-independent host restriction mechanisms may operate in some species.

Figure 5 


Experiments in other coronaviruses have shown that receptor differences can be easily overcome. For example, studies in murine hepatitis virus (MHV) found the emergence of host range variants following persistent infection (85), with four amino acid substitutions in the MHV spike protein responsible for the shift in tropism (86). Three mutations in a strain of SARS-CoV isolated from a civet were required for successful infection and replication in human cells (87), although this was accompanied by a trade-off in binding affinity to either civet or human ACE2 (88). Additionally, a single mutation in the spike protein was associated with increased mouse ACE2 receptor usage (89). These data suggest that coronaviruses can easily adapt to utilize receptor orthologs and engage in cross-species transmission. As we improve our understanding of the DPP4 biochemical interface, we will have a better grasp on how to predict which orthologs MERS-CoV may be most likely to adapt to, and which species may be the most likely candidates for MERS-CoV host range expansion.

In addition to considering the adaptation of MERS-CoV to nonpermissive DPP4 orthologs, we can also consider the possibility of MERS-CoV gaining the ability to utilize a new receptor. Experimental evidence has shown that coronaviruses can adapt to an alternate receptor; MHV can shift from using its natural receptor CEACAM1a to using heparan sulfate to enter the host cell in persistently infected cell cultures (90). MERS-CoV could potentially evolve to utilize another dipeptidyl peptidase family member, such as DPP8, DPP9, or fibroblast activation protein (FAP). All three of these proteins share high structural homology with DPP4 (Figure 6), despite low amino acid sequence identity (22%, 20%, and 52%, respectively). FAP appears to be the most likely candidate, with the highest sequence and structural homology (Figure 6c); FAP can also be expressed on the surface of cells, particularly at sites of tissue remodeling (91). Although only some of the key residues identified in DPP4 studies match the amino acid identities/properties of FAP (Figure 6d), it could still act as a candidate for MERS-CoV adaptation. This shift in receptor usage could, for example, result from increased selective pressure following a reduction in expression of DPP4. Reduction in expression has been shown to occur in vitro, where persistent MERS-CoV infection induces downregulation of DPP4 expression in bat cells (92). Additionally, selection for an alternate receptor could come from therapeutics that block DPP4 from being bound by the virus. Although the use of antibodies against DPP4 is impractical due to the importance of DPP4 in other roles (57), the administration of soluble DPP4 to prevent cell entry has been proposed (93); results show that soluble DPP4 can reduce and even block infection of cells by a pseudotyped MERS-CoV (45, 66). Further experiments should be considered to determine whether selection imposed by reduced DPP4 expression or DPP4-based therapeutics would push MERS-CoV to utilize an alternate receptor.

Figure 6 


The biochemical determinants of MERS-CoV permissivity can inform the potential evolution of host range by the virus. The knowledge that other DPP4 orthologs can act as backbones to support infection (61, 63, 64) suggests that, in theory, it is possible for MERS-CoV to expand into currently nonpermissive hosts. However, experimental results to date indicate that it may not be easy for the virus to adapt to these alternate receptors. First, the presence of glycans as barriers can dramatically disrupt the interaction between the MERS-CoV RBD and DPP4. Second, changing the interactions on both blades 4 and 5 of nonpermissive DPP4 orthologs, as seen with mouse and hamster studies (61, 64), is crucial. Both of these observations suggest that multiple mutations in the spike protein would be required for a virus to efficiently utilize these orthologous receptors. The probability of these mutations occurring simultaneously in the same genome may be low, depending on the actual number of changes needed. If MERS-CoV does readily adapt to orthologous receptors, a bigger question may be what changes are needed to promote increased transmission between species or between individuals within a species. Currently, no transmission model exists for coronaviruses, creating a gap in our ability to experimentally evaluate the evolution of enhanced transmissibility. Still, as discussed in the following section, we can posit the various evolutionary mechanisms that may uniquely equip coronaviruses with the ability to expand their host range and to be successful in new hosts.
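
The point about simultaneous mutations can be made concrete with a back-of-the-envelope sketch. If each required substitution arises independently with probability mu per site per replication, the chance that one newly copied genome carries all k of them is roughly mu ** k, so the expected number of such genomes falls off steeply as k grows. The mutation rate and population size below are hypothetical placeholders rather than measured MERS-CoV values, and the model ignores selection, recombination, and stepwise accumulation over multiple generations.

```python
def prob_all_present(mu: float, k: int) -> float:
    """Probability that k specific, independent substitutions co-occur in one genome copy."""
    return mu ** k


def expected_carriers(mu: float, k: int, n_genomes: float) -> float:
    """Expected number of genomes carrying all k substitutions among n_genomes copies."""
    return n_genomes * prob_all_present(mu, k)


MU = 1e-5          # hypothetical per-site, per-replication substitution probability
N_GENOMES = 1e9    # hypothetical number of genome copies produced

for k in range(1, 6):
    print(f"k={k}: expected carriers ~ {expected_carriers(MU, k, N_GENOMES):.1e}")
```

Under these assumed numbers, single mutants are abundant, whereas genomes carrying three or more specific changes at once are expected far less than once per population, which is why the number of required changes matters so much.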


Understanding the evolutionary mechanisms facilitating viral host range expansion is crucial for prevention of, and preparation for, new emerging pathogens. Unfortunately, the forces that drive host range expansion events are still relatively unknown. Even for well-studied viruses such as influenza viruses, SARS-CoV, and MERS-CoV, the path of emergence and the selective pressures preceding emergence are often not clear. Here we discuss three potential evolutionary mechanisms that may influence coronavirus host range expansion: recombination, mutator alleles, and mutational robustness (i.e., the ability to remain phenotypically constant or functional in the face of genetic perturbations). These topics can be explored in future studies to determine their potential impact on the emergence of MERS-CoV.


Recombination can act to create new viral variants and is a common event in the adaptive evolution of RNA viruses (14). The ability to generate new genetic variants allows viral populations to explore the adaptive landscape at a faster rate than permitted by mutation alone (94). Recombination has been implicated as a major evolutionary force for HIV-1 (95) and has been suggested to contribute to host range expansion for many viruses, such as influenza A virus (96), nuclear polyhedrosis virus (97), and cauliflower mosaic virus (98). Coronaviruses can have in vitro recombination rates approaching 25% in progeny after a single round of coinfection at high multiplicity of infection (99), likely due to the ability of the RdRp to switch templates during replication (100). In fact, recombination has been implicated in the sharing of homologous genes by distantly related members of the Coronaviridae family (14). Phylogenetic analyses have revealed that HCoV-NL63 may have undergone many recombination events, including two sites of recombination in the S gene and potential recombination between HCoV-NL63 and porcine epidemic diarrhea virus (PEDV) in the M gene (101). Additionally, evidence suggests that natural avian infectious bronchitis virus (AIBV) isolates recombined with a vaccine strain (Holland 52), specifically in the N gene (102). Not only has recombination been detected between coronaviruses, but traces of gene acquisition from highly distinct viruses have been detected, such as the transfer of the influenza C–like hemagglutinin esterase gene to coronaviruses in the 2a subgroup (15). The potential for inter- and intraspecies recombination among coronaviruses gives them the ability to create a greater panel of novel and potentially pathogenic genomes.

The role of recombination in SARS-CoV emergence is controversial. Initially, it was hypothesized that SARS-CoV was a product of recombination between two circulating bat coronaviruses, producing a hybrid virus capable of jumping into other species (103, 104). Studies suggest that the SARS-CoV M and N genes share a common ancestor with the lineage that eventually became the AIBVs (105, 106). By contrast, the PP1ab polyprotein is most similar to murine-bovine coronaviruses, whereas the S gene is most similar to both the avian and group 1 (feline, canine, and porcine) coronaviruses (106). The apparent homology of SARS-CoV with very different clades suggests that extensive recombination has occurred in its evolutionary history. However, others argue that SARS-CoV is distinct and not a product of recombination at all (39, 100), and that the apparent signal for recombination is a false positive caused by variation in evolutionary rates among the coronavirus lineages (107). Still, recombination occurs frequently in natural populations; recent sequence analyses of SARS-like coronaviruses in horseshoe bats in China detected high levels of recombination between coronavirus strains and even between strains from different geographical locations (108). Although phylogenetic analyses have not yet revealed whether recombination played a role in the emergence of MERS-CoV, this possibility should be considered. Further sampling of bat coronaviruses will help to determine the contribution of this evolutionary mechanism.

Mutator Alleles

The high mutation rates of RNA viruses give them the ability to evolve rapidly, particularly when combined with their large population sizes and short generation times. However, most mutations are lethal or deleterious (109, 110). As a result, high mutation rates can be disadvantageous: they push the population closer to the extinction threshold, the point at which the deleterious mutation load lowers the mean fitness of the population so far that the population goes extinct (111). RNA viruses exist close to this threshold, as shown through lethal mutagenesis experiments in many different RNA viruses [e.g., foot and mouth disease virus (112), poliovirus (113)]. In fact, it has been suggested that most RNA viruses replicate at an optimum mutation rate that maximizes fitness, virulence (114), and evolvability (115) without crossing the extinction threshold.
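The extinction-threshold argument can be made concrete with a toy calculation. The sketch below is not taken from the studies cited here. It assumes a common simplification from lethal mutagenesis theory: deleterious mutations arrive as a Poisson process at a genomic rate U, so mean fitness relative to a mutation-free genome is exp(-U). The parameter R, the number of progeny genomes an unmutated genome would leave, and all function names are hypothetical and chosen only for illustration.

```python
import math


def mean_relative_fitness(deleterious_rate_u: float) -> float:
    """Mean fitness relative to a mutation-free genome.

    Assumption: deleterious mutations per genome per replication are Poisson
    distributed with mean U, and the population is at mutation-selection
    balance, giving a mean relative fitness of exp(-U).
    """
    return math.exp(-deleterious_rate_u)


def extinction_threshold(max_progeny_r: float) -> float:
    """Genomic deleterious mutation rate above which the population declines.

    Assumption: the population persists only while R * exp(-U) > 1,
    so the threshold is U* = ln(R).
    """
    return math.log(max_progeny_r)


def persists(deleterious_rate_u: float, max_progeny_r: float) -> bool:
    """True if the mutation-free growth factor outweighs the mutation load."""
    return max_progeny_r * mean_relative_fitness(deleterious_rate_u) > 1.0


if __name__ == "__main__":
    r = 10.0  # hypothetical progeny per unmutated genome
    for u in (0.5, 1.0, 2.0, 3.0):
        print(f"U={u:.1f} mean fitness={mean_relative_fitness(u):.3f} "
              f"persists={persists(u, r)}")
    print(f"threshold U* = {extinction_threshold(r):.3f}")
```

In this simplified picture, a mutagen or a mutator allele that raises U moves the population toward U* = ln(R), which is one way to read the claim that RNA viruses replicate close to the threshold without crossing it.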

Despite the proximity of RNA viruses to the extinction threshold, mutator alleles have the potential to play a role in promoting host range expansion. A mutator allele increases the inherent mutation rate of a virus and thus enhances the genetic variation of the population; this variation may harbor mutations that are beneficial or required for expansion into a new host. If the benefit of a mutation that allows a virus to jump into a new host outweighs the cost of an increased deleterious mutation load compared with the wild type, the mutator allele has the potential to be successful. This success comes in the form of hitchhiking: the mutator allele rises in frequency along with the beneficial mutation that it produced. Note that hitchhiking is dependent on asexual reproduction (i.e., no recombination) so that the mutator remains linked to the beneficial mutation it produced. The hitchhiking of mutator alleles has been well documented in other organisms, such as bacteria (116, 117), but has yet to be experimentally demonstrated in a viral system. Still, it is well understood that mutators will be favored by natural selection in the face of novel environments (118, 119), such as that presented by a new host species.
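The trade-off between the benefit of a host range mutation and the cost of extra mutation load can be illustrated with a deliberately simple deterministic model. This is a sketch under stated assumptions, not a model from the cited literature: the mutator lineage is assumed to carry a linked beneficial mutation with selective advantage s, each lineage is assumed to pay an equilibrium load of exp(-U), there is no recombination, and the names `relative_growth` and `frequency_after` are hypothetical.

```python
import math


def relative_growth(benefit_s: float, u_wild_type: float, fold_increase: float) -> float:
    """Per-generation growth of the mutator lineage relative to the wild type.

    Assumptions: the mutator carries a linked beneficial mutation with
    advantage s, its genomic deleterious rate is fold_increase * U_wt, and
    each lineage pays an equilibrium load of exp(-U).
    """
    u_mutator = u_wild_type * fold_increase
    return (1.0 + benefit_s) * math.exp(-(u_mutator - u_wild_type))


def frequency_after(p0: float, benefit_s: float, u_wild_type: float,
                    fold_increase: float, generations: int) -> float:
    """Mutator frequency after deterministic selection with no recombination."""
    w = relative_growth(benefit_s, u_wild_type, fold_increase)
    p = p0
    for _ in range(generations):
        # Standard haploid selection update for two competing lineages.
        p = p * w / (p * w + (1.0 - p))
    return p


if __name__ == "__main__":
    # Hypothetical values: large benefit, modest load (mutator spreads) ...
    print(frequency_after(0.01, 0.5, 0.02, 10, 50))
    # ... versus small benefit, heavy load (mutator is lost).
    print(frequency_after(0.01, 0.1, 0.02, 20, 50))
```

In this toy model the mutator spreads only when ln(1 + s) exceeds the extra load, U_mutator minus U_wild-type. Recombination, which the model omits, would separate the beneficial mutation from the mutator allele and weaken the hitchhiking effect.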

Mutator phenotypes have been isolated from natural populations of HIV-1 (120) and influenza A virus (121). Additionally, targeted mutations have resulted in mutator strains for poliovirus (114, 122), coxsackievirus B3 (123), chikungunya virus (124), foot and mouth disease virus (125), and coronaviruses (10–12). However, coronaviruses are the only RNA viruses capable of proofreading their genomes, in a process mediated by the nsp10-nsp14 complex (8); nsp14 contains a conserved DEDD superfamily motif that facilitates exoribonuclease activity (9). This proofreading capability has several implications. First, it likely allowed coronaviruses to expand their genome size past that of other viruses. Second, it allows the production of mutator alleles that are independent of the polymerase. In coronaviruses, mutator alleles have been identified for both SARS-CoV and MHV by mutating nsp14, elevating the mutation rate by 15- to 20-fold (10, 11). A MERS-CoV mutator, however, has not yet been produced. Experiments with the SARS-CoV mutator show that in direct competition with the wild type, the mutators quickly die out when no selection is present (12). This suggests a cost to the mutator allele, whether directly through the mutation itself or indirectly through the accumulation of deleterious mutations. However, the fact that viable mutators are likely to easily arise through mutations in nsp14 suggests that they may be influential in shaping the evolution of coronaviruses when a selective pressure is present, such as that of a new host or immune system. One caveat, however, is that if the in vivo multiplicity of infection is high, coinfection of cells can result in recombination of the mutator with wild-type genomes, thus uncoupling the mutator allele from the beneficial mutations that it produced. Better in vivo recombination and coinfection rate estimates will help inform the potential for mutators to succeed in wild-type populations.

Mutators have been implicated as an influential feature of host range expansions. For example, phylogenetic data suggest that an avian influenza virus (H1N1) jumped into pigs and then into humans about 100 years ago (126), with a mutator allele as the primary hypothesis for how the virus could rapidly cross two species barriers (127). When the H1N1 avian influenza virus jumped into European swine again in 1979, a mutator allele was thought to be responsible. However, even though the evolution rate was higher in the new influenza strain compared with the ancestral strain, the mutation rate was not (128). Although the role of a mutator in this host range expansion event is highly speculative and controversial, mutators may still play a role in influenza evolution given that clinical isolates have been readily identified with mutation rates 3–4 times greater than in the ancestral strain (121), suggesting that natural populations could contain high frequencies of mutators at any given time. Additionally, mutator strains of norovirus have been suggested to play an important role in the recent norovirus pandemics. Using in vitro RdRp assays, the mutation rates of various genogroup II genotype 4 (GII.4) strains in addition to less common GII.b/GII.3, GII.3, and GII.7 strains (129) were measured. The mutation rates of the predominant GII.4 strains were 5- to 36-fold higher than those of the less frequently detected lineages. This evidence suggests that an increased inherent mutation rate can enhance the epidemiological fitness of circulating virus strains.

Mutators have not yet been implicated for host range expansion in coronaviruses, but this may simply be due to a lack of discovery. The mutation rate of the original SARS-CoV isolate has been measured [9 × 10⁻⁷ substitutions per nucleotide per replication cycle (10)], but the mutation rates of closely related bat coronaviruses have not. Furthermore, the mutation rate of MERS-CoV has not yet been experimentally determined, although the evolution rate has been estimated at 1.12 × 10⁻³ substitutions per site per year (130), which is comparable to the SARS-CoV estimate of 2.82 × 10⁻³ (108). It would be interesting to compare the MERS-CoV mutation rate with those of BtCoV-HKU4 and BtCoV-HKU5. If a mutator allele was responsible for producing the variation necessary for MERS-CoV to emerge in humans, we would expect it to have a higher inherent mutation rate than those of closely related lineages circulating in bats. Furthermore, mutators may play an important role in coronavirus vaccine development. The SARS-CoV mutator has been shown to be attenuated in vivo with no reversion to virulence following passaging or persistent infection (12), providing a promising starting candidate for further development. Additionally, whereas SARS-CoV is resistant to ribavirin (131, 132), the mutator strain is not (133). This suggests that a possible therapeutic strategy could be to combine mutagens with inhibitors of nsp14 activity. Further studies can help advance the use of mutators in the development of vaccines and therapeutics.
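To put the per-nucleotide figure in perspective, it can be converted into an expected number of substitutions per genome copy. The following is a back-of-the-envelope sketch: the per-site rate and the 15- to 20-fold mutator elevation are the values quoted in this review, while the genome length of roughly 29,700 nucleotides is an approximate figure assumed here for illustration, and the function names are hypothetical.

```python
SARS_COV_PER_SITE_RATE = 9e-7  # substitutions per nucleotide per replication cycle (10)
GENOME_LENGTH_NT = 29_700      # assumption: approximate SARS-CoV genome length


def per_genome_rate(per_site_rate: float, genome_length: int = GENOME_LENGTH_NT) -> float:
    """Expected substitutions per genome per replication cycle."""
    return per_site_rate * genome_length


def mutator_range(per_site_rate: float, low_fold: float = 15.0, high_fold: float = 20.0,
                  genome_length: int = GENOME_LENGTH_NT) -> tuple[float, float]:
    """Per-genome rate range for a mutator with a 15- to 20-fold elevated rate."""
    return (per_genome_rate(per_site_rate * low_fold, genome_length),
            per_genome_rate(per_site_rate * high_fold, genome_length))


if __name__ == "__main__":
    wild_type = per_genome_rate(SARS_COV_PER_SITE_RATE)
    low, high = mutator_range(SARS_COV_PER_SITE_RATE)
    print(f"wild type: {wild_type:.4f} substitutions per genome per cycle")
    print(f"mutator:   {low:.3f} to {high:.3f} substitutions per genome per cycle")
```

Under these assumptions the wild type would carry a new substitution in roughly one of every 37 genome copies, whereas a mutator would carry one in roughly every second or third copy. These per-replication mutation rates are a different quantity from the per-year evolution rates quoted above, which also reflect selection and the number of replication cycles per year.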

Mutational Robustness

Robustness is the ability of a phenotype to remain constant or functional in the face of perturbation, whether environmental or genetic (i.e., mutations). Mutational robustness enables a population to explore more sequence space (i.e., sample more mutations) without losing viability, thus potentially yielding novel functions or life-cycle strategies. In terms of host range expansion, these novel functions could include changes within the spike protein that allow the virus to utilize new receptor molecules or evade a new host's immune defenses. Although experimentally measuring robustness in virus populations can be complex, competition experiments in vesicular stomatitis virus have shown that a more mutationally robust virus lineage can have an advantage over a faster-replicating lineage when the mutation rate is increased via a mutagen (134). This work, combined with theoretical studies, suggests that natural selection can favor mutational robustness in viral populations (135) and can even facilitate adaptation to a novel host under certain conditions (110).

Mutational robustness has been suggested to play a role in influenza A virus evolution. H3N2 isolates were found to accumulate genetic variation by moving through neutral networks (i.e., mutationally connected genotypes that produce the same phenotype). These periods of phenotypic constancy were then disrupted by a sudden phenotypic change in the virus antigenic structure. The exploration of sequence space facilitated by high mutational robustness allowed these changes to accumulate and result in epochal shifts that are well matched by epidemiological models (136, 137). Whereas the H3N2 study found a role for robustness in immune evasion specifically, the same process can apply to host range expansion. A virus population that could explore a neutral network within the context of the spike protein could potentially accumulate changes that increase the likelihood of producing a host range mutant. This process could be particularly applicable to MERS-CoV, which may require multiple mutations to expand its host range to currently nonpermissive species, as discussed above.

Mutational robustness among RNA viruses varies both among distantly related families (138) and among closely related strains (134). Mutational robustness has not been measured in coronaviruses; however, the higher resistance of SARS-CoV to the mutagen ribavirin (131, 132) compared with that of MERS-CoV (139) suggests a difference in mutational robustness between these viruses (137, 140). Even if coronaviruses are not robust to mutations during the infection of a host cell by a single virus, coinfection of host cells by two or more viruses can buffer individuals from deleterious mutations. During peak infection (i.e., high viral load), coinfection can occur frequently, enabling complementation. When multiple genomes contribute to the same protein pool, mutant genomes can persist because their fitter coinfection partners compensate for their deleterious mutations at the protein level. Experimental evidence has shown that complementation results in increased mutational robustness in the RNA bacteriophage Φ6 (141) and the maintenance of lethal mutations in several different types of viruses (142, 143), including coronaviruses (144, 145). By conferring mutational robustness, coinfection could enable the virus population to sample a broader sequence space, increasing the probability of producing a variant with the ability to infect a new host species.

At the moment, the role of mutational robustness in host range expansion is purely speculative. Although robustness may increase the likelihood of producing a variant with an expanded host range, the success of this variant relies on how well it can adapt to the new host. Because robustness can either hinder or facilitate adaptation, depending on various conditions, the role of robustness in host range expansion is complex (110, 146). Further experimental studies are needed both to confirm the mutational robustness of coronaviruses and to determine its contribution to host range expansion.


The increasing prevalence of emerging pathogens heightens the need to understand the biochemical and evolutionary dynamics of host range expansion events. Although coronaviruses have been circulating in the human population for hundreds of years, the recent emergence of highly pathogenic strains and the lack of effective vaccines or therapeutics emphasize the need to prioritize these studies. For SARS-CoV and MERS-CoV, the specific factors contributing to the host range jump into humans are still unclear. Elucidating the selective pressures imposed on the virus populations will help reveal the specific path of emergence for SARS-CoV and MERS-CoV. Additionally, the isolation of MERS-CoV and BtCoV-HKU4 reveals an increased variation in the receptors that coronaviruses can utilize, adding DPP4 to the already identified human receptors, APN (HCoV-229E) and ACE2 (SARS-CoV, HCoV-NL63). This discovery shows the enhanced variation and functionality of coronavirus spike proteins, but it also reduces the expected power of screening for potentially emerging coronaviruses on the basis of receptor utilization.

According to our current knowledge, MERS-CoV permissivity is primarily mediated by the interactions between the host cell receptor and the MERS-CoV RBD. Other areas of research that require additional investigation include the role of host cell proteases and innate immune antiviral defense pathways, which for the most part are understudied in coronavirus host range expansions. Permissive and nonpermissive DPP4 orthologs have been identified and tested experimentally (58, 61–64, 70, 83), generating a useful toolkit for identifying the specific residues or biochemical characteristics that allow or prevent MERS-CoV infection. These studies can inform the development of small animal models, vaccines, and therapeutics, in addition to revealing the mutational path that allowed MERS-CoV to expand its host range into humans.

Although ecological factors play an important role in host range expansion, several inherent biological properties may increase the emergence probability of a viral strain. These evolutionary mechanisms can include recombination, mutator phenotypes, and mutational robustness. Recombination occurs at high frequencies in vitro (99), suggesting that it has the potential to shape the evolutionary trajectories of coronavirus populations. Recombination has been implicated in the SARS-CoV jump (103, 104; but also see 39, 100), and its role in the emergence of MERS-CoV should be measurable as the number of bat coronavirus sequence samples increases. Similarly, mutator phenotypes have been readily produced in MHV and SARS-CoV (10, 11). In the quasispecies pool, mutations in nsp14 that disrupt its exoribonuclease activity may be quite common in natural populations. These mutators have the potential to allow coronavirus populations to access unique mutation combinations and subsequently jump and adapt to new host species. Measuring the mutation rate of MERS-CoV and closely related viruses can help reveal whether an increased production of genetic variation played a role in its emergence into humans. Finally, high mutational robustness has the potential to allow coronaviruses to tolerate the mutations that would allow them to expand their host range. Evaluating the mutational robustness of SARS-CoV and MERS-CoV can help detail whether robustness plays a role in coronavirus evolution.

Whereas the genetics and biochemistry of coronaviruses have been well studied, we know very little about the underlying evolutionary mechanisms that result in cross-species transmission and emergence into humans. Coronaviruses can act as an important model system for addressing the fundamental principles of host range evolution. The SARS-CoV and MERS-CoV infectious clone systems (147, 148) allow robust reverse genetics experiments, and the presence of a proofreading mechanism allows for the production and study of mutator phenotypes. These mutator phenotypes also provide a unique tool for studying robustness, because strains with higher mutation rates are expected to evolve robustness greater than that of a wild-type strain (135). Additionally, SARS-CoV and MERS-CoV have well-characterized virus–host cell receptor interactions as well as known human monoclonal antibodies that target the RBD of each virus. Experimental evolution has been scarce in the coronavirus world, but the tool set is present for strong collaborations with evolutionary biologists to help better understand the evolutionary mechanisms that govern highly relevant pathogens. Ideally, understanding host range expansion in emerging pathogens would enable us to screen circulating virus populations and identify strains that are most likely to emerge into humans. Although the limited number of outbreak events in coronaviruses reduces our ability to create a comprehensive picture of key host range expansion determinants, understanding the potential contribution of different evolutionary mechanisms is a useful place to start.

Disclosure Statement

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.


Acknowledgments

K.M.P. is supported by a National Science Foundation Graduate Research Fellowship. This work was supported by National Institutes of Health awards R01 AI108197, U19 109761, and U19 109680 and National Science Foundation award DEB-0922111.


Jawahar Raina
