The V1V2 Region of HIV-1 gp120 Forms a Five-Stranded Beta Barrel
The HIV-1 envelope (Env) glycoprotein complex, consisting of glycoproteins gp120 and gp41, mediates virus entry into the host cell. This trimer of gp120-gp41 heterodimers is the major target for neutralizing antibodies (Abs) induced in HIV-1-infected patients and for HIV/AIDS vaccine development (1). Glycoprotein gp120 has five variable and five conserved regions (2); the region consisting of the first and second variable regions (V1V2) is the most diverse in both sequence and length. V1V2, comprising residues 126 to 196 in the HXB2 numbering scheme (3), usually has two nested disulfide bonds; the V1 disulfide bond (between cysteine residues 131 and 157) is located within the V2 disulfide bond (between residues 126 and 196) (4). The average length of V1V2 is about 80 amino acids (aa), with large length variations arising mostly from two regions, one in the middle of V1 and the other near the C-terminal end of V2. Recent data have shown that V1V2 is located at the distal apex of the Env trimer and that the V1V2 regions from the three gp120s, at least in the stabilized BG505 SOSIP.664 construct, join at the center to form a top layer of the trimer (5–7). This layer can shield the third variable region (V3) and the coreceptor binding site in the prefusion state, and it can also make large movements upon CD4 binding to expose the coreceptor binding site.
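The nested disulfide arrangement described above can be expressed as simple interval logic. The following is a minimal illustrative sketch, not part of the original analysis; the function names `is_nested` and `sequence_span` are hypothetical, and the residue numbers are the HXB2 positions quoted in the text.

```python
# HXB2 positions of the two V1V2 disulfide bonds quoted in the text.
V1_DISULFIDE = (131, 157)
V2_DISULFIDE = (126, 196)


def is_nested(inner, outer):
    """Return True if both cysteines of `inner` lie strictly between those of `outer`."""
    return outer[0] < inner[0] and inner[1] < outer[1]


def sequence_span(bond):
    """Number of HXB2 positions from the first to the second cysteine, inclusive."""
    return bond[1] - bond[0] + 1


print(is_nested(V1_DISULFIDE, V2_DISULFIDE))  # True
print(sequence_span(V2_DISULFIDE))            # 71
```

The span of 71 counts HXB2 positions only; insertions relative to HXB2 in other strains are what bring the average V1V2 length to about 80 aa.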
Due to the nested disulfide bond configuration, it was difficult to predict the structural conformation of V1V2 from sequence analysis and epitope mapping (8). V1V2 was excluded in the early attempts to crystallize monomeric forms of gp120 (9, 10). The first breakthrough in visualizing the V1V2 structure was achieved by Kwong and coworkers, who crystallized complexes of broadly neutralizing monoclonal antibodies (MAbs) PG9 and PG16 with the V1V2 inserted in a scaffold molecule (Protein Data Bank identifier [PDB ID] 1FD6); two clade C V1V2s, CAP45 (referred to here as V1V2CAP45-1FD6) and ZM109 (V1V2ZM109-1FD6), were used (11, 12). The 1FD6 scaffold molecule was among many tried and is a small protein with a region that can accommodate the disulfide bond and glycans of V1V2 (11). These structures showed that V1V2 contains four beta strands (named strands A, B, C, and D) forming a Greek key motif, an elegant solution to allow a conserved motif with two sub-variable regions. However, regions of V1V2 were still missing from these structures. These include part of V1 in V1V2ZM109-1FD6 and the sub-variable C-terminal V2 region in both structures. The missing V2 region contains a highly conserved integrin-binding motif, 179Leu-Asp-Ile/Val181, which can bind the α4β7 integrin, the gut mucosal homing receptor for peripheral T cells (13). Although the precise function of this motif is still being questioned (14), this region may play an important role in facilitating attachment of the virion to cells before entry, and targeting α4β7 reduces mucosal transmission in an animal model (15).
The V1V2 region is immunogenic in humans (8, 16), and many V1V2-specific MAbs have been isolated from HIV-1-infected patients. We have recently classified V1V2 epitopes into three major types: V2q, V2p, and V2i (17, 18). The V2q type is defined by quaternary neutralizing epitope (QNE) MAbs, including human MAbs 2909, PG9, and PG16, and also a panel of rhesus MAbs, including 2.5B (19–21). Crystal structures of PG9 and PG16 in complex with the V1V2 scaffolds have shown that these MAbs recognize an N-linked glycan at residue 160 and another at either 156 or 173 as well as the C-terminal region of strand B of V1V2 through a beta sheet interaction using long complementarity-determining-region (CDR) H3s harbored by these QNE MAbs (11, 12). The V2p type is defined by human MAbs CH58 and CH59 isolated from a vaccinee of the phase III RV144 human vaccine trial (22). Their epitopes are also located in strand B, overlapping the V2q epitope region, but have a helical or helical-coil structure, different from the beta strand structure of the V2q type (22). The V2i type is defined by a panel of seven human MAbs, including 830A, 697-30D, and 2158 (17, 23). Extensive immunological, mutagenesis, and computational modeling data have shown that the V2i epitopes overlap the integrin-binding site and likely recognize discontinuous regions in V1V2 (17, 18, 23). However, the precise nature of the V2i epitopes is still unknown.
Data from the correlation analysis of the RV144 human clinical vaccine trial have suggested that the IgGs targeting V1V2 induced by the vaccine inversely correlate with the risk of infection (24), and immunologic data delineated the cross-reactivity of these Abs (25, 26). A subsequent sequence sieve analysis that compared the V1V2 sequences from viruses infecting placebo and vaccine recipients identified two positions in V2, 169 and 181, that distinguished viruses from vaccine recipients. The RV144 vaccine efficacy against viruses matching the Lys vaccine residue at position 169 was 48%, whereas the vaccine efficacy against viruses mismatching the Ile vaccine residue at 181 was 78% (27). We recently modeled the full-length V1V2, including the V2i epitope region, and provided a mechanistic understanding of mismatching sieve residue 181 (18). To further understand the structure and function of V1V2 and to precisely map the V2i epitopes, we determined the structure of the antigen-binding fragment (Fab) of V2i MAb 830A (an IgG3 MAb with VH4-34 heavy chain and VL kappa 3-15 light chain genes) in complex with a V1V2 scaffold (V1V2ZM109-1FD6). Our structure reveals the atomic details of the 830A epitope and the integrin-binding site. In addition, our data showed that the V1V2 region forms a five-stranded beta barrel, a structure that is uniquely suitable for the functions of V1V2.
MATERIALS AND METHODS
Fab 830A production and complex formation.
Fully glycosylated V1V2ZM109-1FD6 was produced by transient transfection using a mammalian expression system as previously reported (11). Briefly, the V1V2 scaffold protein with all the glycan sites was produced in 293S GnTI−/− cells that produce only high-mannose glycans, which are more homogeneous than complex glycans and better suited to crystallization (11, 28). It was then purified using nickel-nitrilotriacetic acid (Ni-NTA) beads. The DNAs of the heavy and light chains of Fab 830A were synthesized by Genewiz (South Plainfield, NJ) and subcloned into the pVRC-8400 expression vector with the secretion signal (MRPTWAWWLFLVLLLALWAPARG) at their N termini and a 6×His tag at the C terminus of the heavy chain. Fab 830A was expressed and likewise purified on His tag affinity columns. Purified V1V2ZM109-1FD6 and Fab 830A were mixed at a 1:1 molar ratio and further purified by size exclusion chromatography.
Crystallization, data collection, and structure determination and refinement.
Initial crystallization conditions for the Fab 830A/V1V2ZM109-1FD6 complex were identified by robotic screening using the hanging-drop vapor diffusion method. Diffracting crystals of the Fab/epitope complex were obtained with a well solution of 16.5% polyethylene glycol 8000 and 0.1 M Tris (pH 8.5). Many crystals were produced, but they rarely diffracted well. Final data were collected at the General Medical Sciences and Cancer Institutes Structural Biology Facility Collaborative Access Team (GM/CA CAT) beamline at the Advanced Photon Source (APS), Argonne National Laboratory, and processed using the HKL 2000 package (29) and XDS (30). The structure was determined by molecular replacement using the structures of PG9-bound V1V2ZM109-1FD6 (PDB ID 3U2S) and a homologous Fab fragment (PDB ID 3KDM) as the starting models. Cycles of refinement for the structure were carried out in Coot and Phenix (31, 32). Final structural analysis was carried out using ICM (33), and figures were generated using ICM and PyMOL (Schrödinger, LLC).
RESULTS
Structure determination of the Fab 830A/V1V2ZM109-1FD6 complex.
To crystallize human MAb 830A in complex with V1V2, we cloned its Fab fragment and expressed it in 293 cells, as it was difficult to obtain a large amount of IgG from the hybridoma cells. Fully glycosylated (with all 6 glycosylation sites) V1V2ZM109-1FD6 was also produced in 293 cells, and the complex of Fab 830A and V1V2ZM109-1FD6 was obtained by mixing them at a 1:1 molar ratio and further purifying by size exclusion chromatography. The crystals of the complex diffracted to better than 3-Å resolution and belonged to the orthorhombic space group P2₁2₁2₁ with two Fab/epitope complexes in the asymmetric unit. We determined the structure by molecular replacement and refined the model to 3.0-Å resolution with a final Rwork value of 22% (Rfree = 29.5%) (Table 1 and Fig. 1). Since the two complexes in the asymmetric unit, related by noncrystallographic symmetry, are highly similar (Fig. 1A and C), we chose only one for description. We numbered the residues of the light and heavy chains following the Kabat and Wu convention, preceded by “L” and “H,” respectively, and the residues of V1V2ZM109-1FD6 by a “G” (for “gp120”).
|Parameter|Value|
|---|---|
|a, b, c (Å)|64.73, 159.17, 170.36|
|α, β, γ (°)|90, 90, 90|
|Resolution (Å)|3.00 (3.18–3.00)|
|CC (1/2)|99.9 (79.4)|
|Completeness (%)|99.6 (99.5)|
|No. of reflections|35,968|
|Bond length (Å)|0.010|
|Bond angle (°)|1.422|
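As a worked check on the cell parameters in the table, the unit cell volume can be computed with the general cell-volume formula, which reduces to a·b·c for an orthorhombic cell. This is an illustrative sketch rather than part of the reported analysis. It assumes the standard multiplicity of four asymmetric units per cell for space group P2₁2₁2₁ and uses the two complexes per asymmetric unit stated in the text; the function and constant names are hypothetical.

```python
import math

# Unit cell parameters from the table (lengths in Å, angles in degrees).
A, B, C = 64.73, 159.17, 170.36
ALPHA, BETA, GAMMA = 90.0, 90.0, 90.0

# Assumption: P2(1)2(1)2(1) has four general positions, i.e. four asymmetric units per cell.
ASYMMETRIC_UNITS_PER_CELL = 4
# Stated in the text: two Fab/epitope complexes per asymmetric unit.
COMPLEXES_PER_ASYMMETRIC_UNIT = 2


def cell_volume(a, b, c, alpha, beta, gamma):
    """General cell volume in Å^3; equals a*b*c when all three angles are 90°."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


volume = cell_volume(A, B, C, ALPHA, BETA, GAMMA)
complexes_per_cell = ASYMMETRIC_UNITS_PER_CELL * COMPLEXES_PER_ASYMMETRIC_UNIT
volume_per_complex = volume / complexes_per_cell

print(f"cell volume: {volume:.3e} Å^3")
print(f"complexes per cell: {complexes_per_cell}")  # 8
print(f"volume per complex: {volume_per_complex:.3e} Å^3")
```

The volume per complex includes solvent; converting it to a solvent content would require the molecular mass of the complex, which is not given here.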
Structure of V1V2ZM109.
The V1V2 structure we observed in the 830A complex retains the four beta strands previously described in the PG9 complex, i.e., strand A (from ThrG128 to AsnG130), strand B (from ArgG153 to AsnG160), strand C (from LysG171 to TyrG177), and strand D (from LeuG190 to LeuG193) (Fig. 1C). Each strand in the 830A complex structure is slightly shorter than those observed in the PG9 complex (Fig. 1D). However, we have observed an additional beta strand from IleG181 to LeuG184 located between strands C and D, and we have named it strand C′ (Fig. 1C and 2A). Together with strand C′, these five beta strands now form a complete beta barrel structure (Fig. 2). The shorter lengths of the beta strands allow them to curve around the barrel so that it has a cylindrical shape. The V1V2 beta barrel has several characteristics (Fig. 2). V1V2 of ZM109 has six N-linked glycosylation sites, and we observed three of them (as single N-acetylglucosamine [NAG] moieties only) on AsnG130, AsnG160, and AsnG173, all extending out from the barrel toward one side (Fig. 2C). The remaining glycosylation sites, AsnG138, AsnG188a, and AsnG188d, are in the flexible strand-connecting loops (SCLs) (Fig. 2), which are partly missing in the structure. Combining the five-stranded beta barrel structure with the sequence variation of V1V2, we can now redefine the highly variable regions in V1V2 more precisely: the V1 variable region is the strand-connecting loop between strands A and B (SCLA-B), while the V2 variable region is the strand-connecting loop between strands C′ and D (SCLC′-D) (Fig. 2D). These two strand-connecting loops have high sequence diversity and a high percentage of N-linked glycosylation sites among the different HIV-1 strains (Fig. 2D).
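The strand boundaries given above can be collected into a small lookup table from which the strand lengths and the intervening segments follow directly. This is a sketch for orientation only; the names `STRANDS`, `strand_length`, and `connecting_segments` are hypothetical, and the integer ranges ignore insertion codes such as 188a and 188d, so the true loop lengths in ZM109 are longer than the numbered ranges suggest.

```python
# Strand boundaries (residue numbers as given in the text) for the 830A-bound structure.
STRANDS = {
    "A": (128, 130),
    "B": (153, 160),
    "C": (171, 177),
    "C'": (181, 184),
    "D": (190, 193),
}


def strand_length(span):
    """Number of residues in a strand, counting both end points."""
    return span[1] - span[0] + 1


def connecting_segments(strands):
    """Return the numbered range between each pair of consecutive strands."""
    ordered = sorted(strands.items(), key=lambda item: item[1][0])
    segments = {}
    for (name1, (_, end1)), (name2, (start2, _)) in zip(ordered, ordered[1:]):
        segments[f"{name1}-{name2}"] = (end1 + 1, start2 - 1)
    return segments


for name, span in STRANDS.items():
    print(name, span, strand_length(span))

for name, span in connecting_segments(STRANDS).items():
    print(name, span)
```

In this numbering the A-B segment (131 to 152) corresponds to SCLA-B, the C′-D segment (185 to 189, plus insertions) to SCLC′-D, and the short C-C′ segment (178 to 180) contains the kink.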
The V1V2 beta barrel has a hydrophobic core formed by a number of hydrophobic residues (Fig. 2B), including ValG127 and LeuG129 from strand A, the disulfide bond of V1 between CysG131 and CysG157, PheG159 and IleG161 near the end of strand B, ValG172, AlaG174, and PheG176 of strand C, IleG181 of strand C′, and TyrG191 and LeuG193 of strand D. The side chain of RV144 sieve residue IleG181 on strand C′ is buried in the core, with a buried surface area of ∼80 Å². A mutation of 181 to Val, a mismatching amino acid in the RV144 sieve analysis (27), would have a reduced buried surface area of 67 Å². On the other hand, a Met or a Leu, also mismatching amino acids in the RV144 sieve analysis, at this position would be slightly too bulky to pack into the hydrophobic core. Thus, the Ile amino acid seems ideal for position 181 of the hydrophobic core of the beta barrel.
The integrin-binding site.
The sequence of the ZM109 α4β7 integrin-binding motif is LeuG179-AspG180-IleG181. The first two residues, LeuG179 and AspG180, are located in a short 3₁₀ helical turn between strands C and C′ (named “kink”), while IleG181 is located on strand C′ (Fig. 2B and D). In contrast to the buried IleG181, the highly conserved LeuG179 and the 100% conserved AspG180 are exposed on the surface of the kink region (Fig. 3A). However, attempts to computationally dock the ZM109 integrin-binding motif onto the α4β7 head structure were not successful; the integrin-binding motif in our structure is too recessed to reach the deep binding site shown in the α4β7 head structure (34).
Our structure revealed the details of the epitope of 830A (Fig. 3). It comprises three components: (i) residues ArgG153 and ValG154 from V1, (ii) residues ThrG175 and TyrG177 from strand C and LeuG179 and AspG180 from the kink region, and (iii) IleG194 from the V2 stem just after the C-terminal strand D (Fig. 3A). The side chain of ArgG153 in the first component forms a hydrogen bond with the hydroxyl group of TyrH32 from CDR H1 of 830A, and ValG154 forms minor contacts against two residues, TryH99 and ValH100B, at the tip of CDR H3 (Fig. 3D). The most prominent interaction between 830A and V1V2 is in the second component, including TyrG177, the last residue of strand C, and LeuG179 and AspG180, the first two residues of the α4β7 integrin-binding motif in the kink region. The side chains of TyrG177 and LeuG179 pack against a wall formed by the backbone of GlyH98 and the side chains of TyrH100H and TyrL49 from the antibody side. These three residues have a total contact surface area of 147 Å² with 830A, which is more than 70% of the total buried surface area on the V1V2 side (Fig. 3C). In addition, the side chain of AspG180 can form hydrogen bonds with the backbone nitrogen of IleH100D and the hydroxyl group of TyrH100H from the CDR H3 of 830A, and the carbonyl oxygen of LeuG179 forms a hydrogen bond with the hydroxyl group of TyrL49 from CDR L1. Similarly to residues in the first component of the epitope, IleG194 in the third component, located in the stem of V2, forms van der Waals contacts with hydrophobic residue IleH100D of CDR H3. We did not observe any contacts between 830A and glycans of V1V2ZM109, consistent with enzyme-linked immunosorbent assay (ELISA) data showing that deglycosylation of V1V2ZM109-1FD6 did not affect 830A binding (data not shown).
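The relationship between the three epitope components and the integrin-binding motif can be summarized with set operations. The sketch below is illustrative and uses only the residue numbers and surface areas quoted in the text; the names `EPITOPE`, `INTEGRIN_MOTIF`, and `epitope_residues` are hypothetical.

```python
# Epitope components of 830A on V1V2 (residue numbers as given in the text).
EPITOPE = {
    "i (V1)": {153, 154},
    "ii (strand C and kink)": {175, 177, 179, 180},
    "iii (V2 stem)": {194},
}

# Integrin-binding motif of ZM109: Leu179-Asp180-Ile181.
INTEGRIN_MOTIF = {179, 180, 181}


def epitope_residues(components):
    """Flatten the per-component residue sets into one set."""
    return set().union(*components.values())


overlap = epitope_residues(EPITOPE) & INTEGRIN_MOTIF
print(sorted(overlap))  # [179, 180]

# The text states that 147 Å^2 is more than 70% of the buried area on the V1V2 side,
# so the total buried area must be below 147 / 0.70.
upper_bound = 147 / 0.70
print(round(upper_bound))  # 210
```

Only the two surface-exposed motif residues, 179 and 180, fall within the epitope; the buried Ile181 does not, consistent with the description of the kink region above.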
The CDR H3 of 830A is 18 amino acids long (Kabat definition), and it stands tall in the center of the antigen-binding site, protruding about 10 Å from its base (Fig. 3B). This is different from two other V2i MAbs, 697 and 2158, whose CDR H3s are collapsed, making their antigen-binding sites flat (18, 23). The CDR H3 of 830A contributes the majority of its antigen binding, and all residues of the 830A paratope are from the heavy chain except TyrL49 (Fig. 3B).
DISCUSSION
Here we have defined the epitope of one of the human V2i MAbs, 830A, at atomic resolution, and the epitope of 830A overlaps the α4β7 integrin-binding motif. It consists of multiple structural components, including regions from V1, the kink, and the C terminus of V2 (Fig. 3A, C, and D and 4A). Data from one of our previous mutagenesis studies suggested that mutations of residues in strand B of V1V2, including residues 168 and 169, can also influence 830A binding (17), but we did not observe direct contacts between these residues and 830A in the crystal structure (Fig. 3). It is therefore possible that mutations of those residues can alter the global conformation of V1V2, thus affecting the conformation of the 830A epitope. The multicomponent epitope of 830A is likely conformationally flexible, as the kink and strand C′ region were not observed in several previously published crystal structures (5, 11, 12). This conformational flexibility of the V2i epitopes provides a possible explanation for why the neutralizing activity of 830A and other V2i MAbs is substantially improved with prolonged incubation (unpublished data and reference 35).
Although the seven human V2i MAbs (830A, 2297, 697, 1361, 1393A, 1357, and 2158) have been described as a family (17, 18), their gene usages differ. Monoclonal antibodies 830A and 2297 are encoded by VH4, while the other five are encoded by VH1 (23). In particular, four of these MAbs (697, 1361, 1393A, and 2158) are encoded by VH1-69, and we have determined the Fab structures for two of them, 697 and 2158, and have shown that they have a hydrophobic surface patch in their antigen-binding sites (18, 23). This is not observed in the case of 830A. Functionally, VH4 MAbs 830A and 2297 have a narrower neutralization breadth even though 830A has at least 5% more somatic mutations than the other V2i MAbs (23). It is therefore conceivable that the seven V2i MAbs can be divided into subgroups, each with a distinct V1V2 binding mode, which can be elucidated only by future structural studies.
Stabilization of the V2 kink region by Fab 830A binding likely helped the visualization of the strand C′ region and the beta barrel structure of V1V2, although the full glycosylation of the construct we used may also have helped to stabilize the beta barrel structure. In the previously determined PG9- and PG16-bound V1V2ZM109-1FD6 structures (11, 12), the Fab molecules bound the N-terminal region of strand C, which is at the other end of the beta barrel from the kink region. Thus, the kink/strand C′ was flexible in those complexes, preventing its visualization in the crystal structures (11, 12). Overall, regions of V1V2, such as the connecting loops, are likely quite flexible and can adopt various conformations. However, the five-stranded beta barrel core structure of V1V2 represents a more complete and comprehensive solution for the structure and function of V1V2 than the four-stranded Greek key motif (11). The cylindrical shape consists of a hydrophobic core with side chains of charged/polar residues projecting out from the barrel surface, and glycosylation sites decorating the outer surface. The two extreme sequence- and length-variable regions of V1 and V2 (SCLA-B and SCLC′-D) are precisely defined, and both point to one end of the barrel (Fig. 2). This configuration allows length variations for these SCLs (Fig. 2D) without affecting the main barrel structure. Interestingly, data have suggested that immune pressure can drive changes in the length of these regions, which may serve as a tool for the virus to evade the immune response (36–38). However, these SCLs are located at the periphery of the trimer and it is not yet known if they can protect any key sites of Env vulnerability. In view of the high sequence conservation of the kink/strand C′ (residues 175 to 184; Fig. 2D), one may speculate that at least one of the functions of these SCLs is to protect this conserved region of V2. Thus, increasing the length of SCLC′-D would increase the flexibility of strand C′, leading to conformational masking of the integrin-binding site and strand C′ (39).
A beta barrel structure also allows V1V2 to function as an individual module. V1V2 is located on the distal apex of the trimer, packed on top of V3, but it has to make a large movement upon CD4 binding to expose V3 and the coreceptor binding site. As individual functional modules, V1V2 and V3 can pack against each other loosely but can also move away independently from the gp120 core when CD4 binds. Such functional modules are common in signal transduction protein molecules (40). During preparation of this manuscript, a 3.5-Å resolution structure of the trimeric form of envelope BG505 SOSIP.664 was published (7), and its V1V2 region has essentially the same overall structure as we observed here; thus, the V1V2 structure that we resolved and describe in the context of an antibody-bound V1V2 scaffold likely represents V1V2 in its native conformation. We noticed that the structures of the strand C region observed in the CH58 and CH59 complexes are very different from those of the scaffolded V1V2s or the SOSIP trimer. The CH58/CH59 conformations are likely present in free gp120 molecules, which were used for the RV144 vaccine.
The antigenic sites on V1V2 have become crucial targets for vaccine design since the correlate study and sieve analysis of the RV144 clinical vaccine trial highlighted the role of antibodies to this region in reducing the risk of HIV infection (24–27). The structural data presented here provide additional support for the important role of these antibodies. The central component of the 830A epitope, the kink region overlapping with the integrin-binding site, is highly conserved in V1V2 (41, 42). This immunogenic region is likely another vulnerable site in V1V2, in addition to the strand C targeted by PG9 and other QNE MAbs. Precise structural mapping of this epitope region provides an opportunity to rationally design scaffold immunogens to focus antibody responses on this region (43, 44). One challenge of targeting this epitope region is the flexibility of strand C′, which may render this region less immunogenic than the strand C region. However, strand C′ can be stabilized by an additional disulfide bond as in the case of HIV-2 and some simian immunodeficiency virus (SIV) envelopes (45). In fact, mutations to introduce such a disulfide bond into certain virus strains improved the neutralization potency of 830A (C. Hioe, personal communication). Thus, stabilizing this region may improve the antigenicity and immunogenicity of this conserved region in V1V2. Taken together, these results indicate that precise epitope mapping of V1V2-specific MAbs such as 830A will improve our ability to design immunogens targeting key antigenic epitopes in the V1V2 region of HIV-1 gp120.
Protein structure accession number.
Atomic coordinates and structure factors for the Fab 830A/V1V2ZM109-1FD6 complex have been deposited in the RCSB Protein Data Bank under ID code 4YWG.
ACKNOWLEDGMENTS
We thank Peter Kwong for the construct of 1FD6-V1V2ZM109 and Brian Foley for help with downloading the Env sequence alignment from the LANL database.
This work was supported in part by the National Institutes of Health under awards AI100151, AI082274, AI084119, and HL059725 and by funds from the Department of Veterans Affairs. GM/CA-CAT at APS has been funded in whole or in part with federal funds from the National Cancer Institute (Y1-CO-1020) and the National Institute of General Medical Sciences (Y1-GM-1104). Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Basic Energy Sciences, Office of Science, under contract DE-AC02-06CH11357.
The content is solely our responsibility and does not necessarily represent the official views of the National Institutes of Health or the Department of Veterans Affairs.
Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)