Construction and Characterization of Single-Chain Variable Fragment Antibody Library Derived from Germline Rearranged Immunoglobulin Variable Genes
Antibody repertoires for library construction are conventionally harvested from mRNAs of immune cells. To examine whether germline rearranged immunoglobulin (Ig) variable region genes could be used as source of antibody repertoire, an immunized phage-displayed scFv library was prepared using splenocytic genomic DNA as template. In addition, a novel frame-shifting PCR (fsPCR) step was introduced to rescue stop codon and to enhance diversity of the complementarity-determining region 3 (CDR3). The germline scFv library was initially characterized against the hapten antigen phenyloxazolone (phOx). Sequence analysis of the phOx-selective scFvs indicated that the CDRs consisted of novel as well as conserved motifs. In order to illustrate that the diversity of CDR3 was increased by the fsPCR step, a second scFv library was constructed using a single scFv clone L3G7C as a template. Despite showing similar binding characteristics towards phOx, the scFv clones that were obtained from the L3G7C-derived antibody library gave a lower non-specific binding than that of the parental L3G7C clone. To determine whether germline library represented the endogenous immune status, specific scFv clones for nucleocapsid (N) protein of SARS-associated coronavirus (SCoV) were obtained both from naïve and immunized germline scFv libraries. Both libraries yielded specific anti-N scFvs that exhibited similar binding characteristics towards recombinant N protein, except the immunized library gave a larger number of specific anti-N scFv, and clones with identical nucleotide sequences were found. In conclusion, highly diversified antibody library can be efficiently constructed using germline rearranged immunoglobulin variable genes as source of antibody repertoires and fsPCR to diversify the CDR3.
Phage-displayed antibody library has been widely used to derive high-affinity target-specific antibodies, such as antibodies that were specific for angiogenesis marker fibronectin , melanoma-specific B3 and B4 antigens , epidermal growth factor receptor , HIV Vpr protein , and spike protein of SCoV , . Antibody repertoires of phage-displayed library are conventionally created by harvesting mRNAs from peripheral blood lymphocytes, spleen, bone marrow, tonsil or similar sources using RT-PCR and family-based oligonucleotides , , , . The heavy and light chains are then randomly combined and cloned to construct a combinatorial scFv library, from which specific antibodies against not-yet-encountered antigen are selected , , , .
Although the use of mRNAs ensures functional antibody genes retrieval, there are some limitations. Potential Ig genes may not be recovered from antibody cDNA library due to reading frame shifted or the presence of stop codon(s) which are generated by imprecise somatic recombination and P- and N- additions , ; or the immunoglobulins are self-reactive and thus eliminated by the host immune system , . Besides, like other somatic cells, B cells are diploid and therefore rearranged Ig genes can only be expressed from one of the sister chromosomes while the other is concealed , . In addition, chance of getting Ig genes against poor immunogenic targets is hampered by poor humoral response of immunized host. On the other hand, activated B cells undergo clonal expansion and therefore the antibody repertoires of a cDNA-derived scFv library would be dominated by antigen-stimulated humoral response. Hence recombinant antibody repertoires of a cDNA-derived antibody library are limited.
Germline Ig variable region (V) genes, which are selected over millions of years for their compatibility with many different antigens, are poly-functional and capable of orchestrating an effective immune-response , . Indeed, antibodies that encoded by a very limited VH and VL genes of inbred mice were found to react with different haptens, polysaccharides, and even protein antigens , , , , . However, the potential use of rearranged germline Ig genes as source of antibody repertoires for construction of antibody library has never been explored.
Structural analysis of antibody binding site suggests that only a few canonical conformations exist within the five CDRs except VH CDR3 loop which shows a wide range of variations in both length and space , . Previous studies indicate that CDR3 diversity is the principle determinant of antigen-specificity and binding-affinity , , and the diversity of CDR3-FR4 junction determines how antibody undergoes affinity maturation . Hence, modifications in CDR3 seem to be an efficient way of expanding antibody diversity beyond what are encoded by the germline Ig genes.
In the present study, we described the construction and characterization of germline scFv antibody fragment libraries using variable region of rearranged Ig genes as source of antibody repertoires. The germline Ig variable regions were further diversified using a novel frame-shifting PCR step to rescue stop codon(s) as well as to enhance diversity in the CDR3 loop. Indeed, in this proof-of-concept study, we showed that the germline scFv library was highly diverse, and specific antibodies against hapten and protein were obtained.
Materials and Methods
Oligos and primers
All oligos and primers were custom synthesized by Invitrogen. The sequences and the relative position with respect to an scFv of various oligos and primers were detailed in Table 1 and Figure S1, respectively.
|A. Mouse VH Forward Primers (FR1 Region)|
|B. Mouse VH Reverse Primers (FR4 Region)|
|RH1||5′- gAcDgTgASHGWRGTYC -3′|
|C. Mouse VLκ Forward Primers (FR1 Region)|
|D. Mouse VLκ Reverse Primers (FR4 Region)|
|E. Frame-shifting primers|
|F. Linker primers for VH|
|G. Linker primers for VLκ|
Isolation of splenocytes
The protocol for animal work was approved by the Animal Experimentation Ethics Committee of the Chinese University of Hong Kong (Permit Number: 03/015/MIS). BALB/c mice (∼25 g) were sacrificed by cervical dislocation and the spleens were immediately removed. After punching holes with a G21 needle, splenocytes were squeezed out and resuspended in an ice-cold phosphate buffered saline (PBS) containing 4% BSA (Sigma). For isolation of CD19+ cells, splenocytes (2×107 cells/ml, 5 ml) were incubated with a fluorescein isothiocyanate (FITC)-conjugated rat anti-mouse CD19+ monoclonal antibody (0.5 mg/ml, 100 µl, BD Bioscience) on ice for 30 min to label B-lymphocytes carrying rearranged immunoglobulin genes. After washing to remove unbound antibodies, labeled cells were resuspended in 5 ml of RPMI-1640 medium (Gibco) supplemented with 10% fetal bovine serum (FBS, Gibco). The CD19+ splenocytes were sorted and captured using a FACS VANTAGE SE cell sorter (Becton Dickinson).
Retrieval of variable regions from germline rearranged Ig genes
Splenocytic genome DNA was extracted using DNAzol® reagent (Invitrogen) as described by the manufacture. The variable regions of germline rearranged Ig genes were then retrieved using semi-nested PCR (snPCR). The 1st round snPCR aimed at amplifying the rearranged Ig variable regions using genome DNA as template, and reaction was carried out in a final volume of 50 µl containing 1X PCR buffer (Perkin-Elmer), 1.5 mM MgCl2 (Perkin-Elmer), 0.2 mM dNTP (Invitrogen), 5% DMSO (MB grade from Sigma), 2.5 U non-proof reading Taq polymerase, 200 ng of extracted splenocytic genome DNA and VH degenerate primer pair of 0.3 µM FH1 and 1.5 µM RH1, or VLκ degenerate primer pair of 0.45 µM FK11 and 0.3 µM RK11. Following pre-denatured at 94°C for 60 s, the reaction mixture was subjected to PCR amplification that consisted of two stages. The first stage was composed of 10 amplification cycles each consisting of 4 amplification steps, including denaturation at 94°C for 30 s, annealing at 40°C for 60 s, a ramping up extension step from 40°C to 72°C at a rate of 0.2°C/s, and then followed by an extended incubation step at 72°C for 1 min. Then the samples were subjected to another stage of 20 amplification cycles each consisting of denaturation at 94°C for 30 s, a touch up annealing step from 40°C to 50°C at a rate of 0.5°C/cycle for 1 min, and an extension step at 72°C for 60 s. Following the amplification cycles, the reaction mixture was further subjected to a post-extension step at 72°C for 2 min. The PCR reactions were carried out in a GeneAmp® 9700 PCR thermocycler with 96-well aluminum plate (Applied Biosystems). The PCR products were stored at 4°C until use.
To enrich Ig variable region fragments, PCR products of 1st round snPCR were used as templates for 2nd round of snPCR with primer pair carrying additional oligonucleotides at the 3′-end that bracketed only Ig-like templates being further amplified (Figure S1). The 2nd round snPCR was carried out in a final volume of 50 µl containing 1X PCR buffer; 1.5 mM MgCl2, 0.2 mM dNTP, 5% DMSO, 2.5 U non-proof reading Taq polymerase, and 4 µl of the 1st round PCR products. For retrieving VH segments, 0.3 µM FH2 and 1.5 µM RH2 were used. For VLκ semi-nested PCR, 0.5 µM FK12 and 1.125 µM RK12 were used. Reaction mixture was pre-denatured at 94°C for 2 min, and then subjected to PCR amplification that consisted of three stages. The first stage is composed of 5 amplification cycles each consisting of denaturation at 94°C for 30 s, annealing at 65°C for 30 s and extension at 72°C for 30s. The second stage was composed of 25 amplification cycles each consisting of a denaturation step at 94°C for 30s, a touch-down annealing step from 65°C to 55°C at a rate of 0.4°C/cycle for 30 s and an extension step at 72°C for 30 s. The third stage was composed of 5 amplification cycles each consisting of a denaturation step at 94°C for 30s, an annealing step at 55°C for 30 s and an extension step at 72°C for 30 s. Following the amplification cycles, the reaction mixture was further subjected to a post-extension step at 72°C for 2 min. The PCR products were stored at 4°C until use.
To purify the Ig variable region gene fragments, PCR products were pooled and subjected to agarose gel electrophoresis. Agarose gel stripes containing the Ig gene fragments were placed into a dialysis tubing (MWCO 3,500 from Spectrum) together with 1 ml of TAE buffer, DNA fragments were electro-eluted, purified by phenol chloroform extraction and then precipitated by ethanol. The purified PCR products were either cloned into TOPO TA (Invitrogen) or pGEM-T Easy (Promega) cloning vectors for nucleotide sequence determination, or used for subsequent CDR3 diversity enhancement.
Frame-shifting PCR for CDR3 diversity enhancement
The frame-shifting PCR (fsPCR) was performed as described in our US patent . Briefly, the frame-shifting reaction was carried out in a final volume of 50 µl containing 1X PCR buffer, 1.5 mM MgCl2, 0.2 mM dNTP, 2.5 U non-proof reading Taq polymerase, 60-100 ng gel-extracted VH or VLκ DNA fragments, and VH primer pair of 0.5 µM FH1 and 4 µM VH frame-shifting reverse primer RFSH; or VLκ primer pair of 0.5 µM FK12 and 4 µM VLκ frame-shifting reverse primer RFSK. For both VH and VLκ fsPCR, the reaction mixture was pre-denatured at 94°C for 2 min. The 25 fsPCR cycles were carried out in a GeneAmp® 9700 PCR thermocycler with 96-well aluminum plate (Applied Biosystems). Each fsPCR cycle consisted of a denaturation step at 94°C for 30 s, an annealing step at 20°C for 2 min and an extension step with temperature ramping up from 20 to 94°C at a rate of +0.1°C per second (or 5% ramping-speed of GeneAmp® 9700 PCR thermocycler with 96-wells aluminum plate). The fsPCR products were stored at 4°C until use. After separation on a 1.5% agarose gel, DNA fragments with a size range of 300–400 bp were electro-eluted, phenol-chloroform purified, and ethanol precipitated. The purified PCR products were either cloned into TOPO TA (Invitrogen) or pGEM-T Easy (Promega) cloning vectors for nucleotide sequence determination, or used for subsequent scFv construction.
The CDR3-diversified germline Ig VH and VLκ DNA fragments were randomly linked together using a two-stage overlap extension PCR (oePCR). The 1st-stage oePCR was carried out in a reaction volume of 50 µl containing 1X PCR buffer, 1.5 mM MgCl2, 0.2 mM dNTP, 2.5 U non-proof reading Taq polymerase, 120 ng of frame-shifted Ig VH or VLκ DNA fragments, and 0.2 µM each SBS1 and L1JP (VH primer pair); or L2JP and KN1 (VLκ primer pair). The 1st-stage oePCR was used to add an adaptor and a linker to the retrieved Ig variable region of heavy or light chains. After a pre-denaturing step at 94°C for 2 min, 15 amplification cycles were carried out. Each cycle consisted of a denaturation, an annealing, and an extension steps at 94°C, 54°C, and 72°C for 30 s, respectively. After an extended incubation at 72°C for 5 min, the PCR products were stored at 4°C until use. The PCR products (5 µl) were used as templates for the subsequent 2nd-stage oePCR for scFv construction.
The 2nd-stage of oePCR was used to join the heavy and light chains randomly into an scFv. The reaction was carried out in a reaction volume of 50 µl containing 1X PCR buffer, 1.5 mM MgCl2, 0.2 mM dNTP, 2.5 U non-proof reading Taq polymerase, 5 µl of PCR products of the 1st-stage oePCR (both VH and VLκ) and 0.2 µM each SBS1 and KN1 primers. The mixture was pre-denatured at 94°C for 2 min, then subjected to 20 amplification cycles. Each cycle consisted of denaturation at 94°C for 30 s, annealing at 62°C for 30s, and extension at 72°C for 90s. After an extended incubation at 72°C for 5 min, the PCR products were stored at 4°C until use.
ScFv Library construction and specificity assessment
The Not 1- and Sfi 1- restricted scFv repertoires were cloned into pCANTAB 5E phagemid vector (GE Healthcare). Library was constructed by electroporation of the ligated products into competent TG1 E. coli as described in our previous publication . Subsequently, log-phase TG1 transformants were super-infected with M13KO7 helper phage (GE Healthcare) in a multiplicity of infection (moi) ratio of 3∶1, and phages were rescued at 30°C overnight with gentle shaking. The overnight culture of scFv-phages was purified by polyethylene glycol precipitation (20% PEG8000 and 2.5 M NaCl). Purified phages were resuspended in 4 ml of a pre-blocking buffer (1X PBS, 0.2% Triton X-100, 0.01% NaN3, 0.1% BSA and 10% non-fat milk) and incubated at room temperature for 30 min prior panning. Phages (0.5 ml/well) were panned against immobilized antigen as described previously , . Briefly, antigen (5 – 200 µg) was immobilized in a 24-well plate and incubated with the scFv library. After incubation at room temperature for 2 hr with gentle shaking, bound scFv-phages were eluted with 100 µl of 0.1 M glycine-HCl, pH 2.2. After 10 min acid-incubation at room temperature, the eluant was neutralized with 10 µl of 1 M Tris-HCl, pH 8.0. Specificity of eluted phages was confirmed by phageELISA and competitive phageELISA with free ligand in antigen-competition manner, bound phages were detected with an anti-M13 peroxidase-conjugated antibody (GE Healthcare).
Nucleotide sequence analysis
Nucleotide sequence determinations were performed by dye-terminator cycle sequencing using Beckman CEQ DTCS Kit as recommended by manufacturer, or by custom sequencing service. Sequences obtained were compared with NCBI IgBLAST® and analyzed against VBASE2 Ig database . Multiple sequence alignment was performed by ClustalW® from EMBL-EBI server with following default conditions: matrix, BLOSUM; gap opening penalty, 10.0; gap extension penalty, 0.05; gap separation penalty, 8; maxdiv, default; no end gap separation penalty. Alignment in the CDR3 was further adjusted manually in accordance with physical property of amino acid residues.
PhageELISA was performed as described previously . Briefly, the phageELISA assay was carried out in a 96-well ELISA plate which was pre-coated with an antigen (0.3–50 µg). After incubation with 100 µl of scFv-phages at 37°C for 1 hr, in the absence or presence of free ligand, bound phages were detected by incubation with 100 µl of a horseradish peroxidase-conjugated anti-M13 mouse antibody (GE Healthcare) at 37°C for 1 hr. Activity of horseradish peroxidase was measured by a colorimetric method with o-phenylenediamine/H2O2 as substrates. Color was allowed to develop for 1 hr at room temperature, and absorbance at 450 nm was measured with a µQuant™ micro-plate reader (Bio-Tek).
His-tagged scFv expression
For expression analysis, scFv clones were subcloned into pCantab His vector, transformed into E Coli HB2151, and periplasmic His-tagged scFvs were purified using Ni-NTA agarose beads as described in our previous publication .
Retrieval of variable regions of germline rearranged Ig genes
To retrieve the variable regions of rearranged Ig genes from genomic DNA, degenerate primer pair that is complementary to the antibody framework segment 1 (FR1) and FR4 of immunoglobulin chain was used for semi-nested PCR. As shown in Figure 1A, semi-nested PCR generated DNA fragments that displayed a size corresponding to VH (lanes 2 and 5) and VLκ (lanes 4 and 6) variable segments. To confirm the DNA fragments being Ig genes, the PCR products deriving from splenocytic genomic DNA of naïve non-immunized mice were gel-purified and then cloned into vectors. The VH or VLκ transformants (∼100 each) were randomly picked for sequencing. Based on successfully sequenced clones, 92.0% (96/104) and 100% (84/84) of clones were found to be murine Ig VH, and murine Ig VLκ, respectively (Figure 1B).
To analyze the diversity of retrieved Ig genes and to confirm their germline origin, VH and VLκ clones were searched for homology with germline Ig variable region genes against the VBASE2 database (Table S1). Sequence analysis indicated that the retrieved VH and VLκ genes were mainly derived from 6 out of the 15 VH (Figure 1C) and 4 out of the 19 VLκ (Figure 1D) subfamilies, respectively. Intriguingly, in some case the number of Ig clone with unique sequence exceeded the number of germline variable region genes. For instance, a total of 59 unique VLκ clones were identified as members of the VLκ 21 subfamily while the estimated germline Ig gene number of the VLκ 21 subfamily is only 6-13, suggesting retrieved Ig genes might have derived from both rearranged germline and hypermutated Ig genes.
Diversification of CDR3 by frame-shifting PCR
To rescue the non-functional rearranged Ig genes and to enhance the diversity of CDR3, retrieved variable regions of Ig genes were subjected to frame-shifting PCR (fsPCR) that aimed at introducing sequences variation in CDR3. Essentially, the frame-shifting reverse primers (RFSH for VH and RFSK for VLκ) are composed of two portions. In the 5′-portion of the frame-shifting reverse primer is a 5′-degenerate sequence of J gene-segments while the 3′-portion is a random hexamer. We reasoned that the 5′-degenerate J gene sequence would bring the frame-shifting reverse primer close to the CDR3-J region of the Ig templates while the 3′-random hexamer would imperfectly anneal to the CDR3 (Figure 2A). As for the low sequence similarity and the sequence heterogeneity of 3′-random hexamer, annealing between templates and the frame-shifting reverses primer would not be precise and resulted in generating a library of immunoglobulin chains with different CDR3 nucleotide sequences. Indeed, as illustrated in Figure 2B, fsPCR generated DNA fragments that spread around 220-396 bp for VH and 250-344 bp for VLκ Ig genes.
To analyze the nucleotide diversity in CDR3, PCR products of fsPCR were gel-purified, subcloned and sequenced. Multiple sequence alignment of the derived VH and VLκ sequences indicated a significant sequence variation within the CDR3 among various clones. It is of interest to note that sequence variation occurred mainly at the 3′-end of CDR3 which the random hexamer portion of the degenerate reverse primer was predicted to anneal to during fsPCR (Figure 3).
Library construction and selection of phOx-selective scFvs
To evaluate the feasibility of using germline rearranged Ig genes for making a target-specific antibody, an immunized germline antibody library was constructed and then characterized against the hapten antigen 4-ethoxymethylene-2-phenyl-2-oxazolin-5-one (phOx). Three days after the third boost injection of mixture consisting of phOx-chicken serum albumin (CSA) conjugate-and incomplete Freund's adjuvant, splenocytic CD19+ B cells (∼0.8×106) were isolated and used for retrieving variable regions of germline rearranged Ig genes by semi-nested PCR. Following CDR3 diversification by fsPCR, scFvs were assembled by overlap extension PCR and later cloned into pCANTAB 5E phagemid to establish a small scFv phage display library (5.16×105 recombinants).
The phage-displayed scFv library was panned against immobilized phOx-BSA conjugate of which amount was reduced ten-fold in each successive round of panning. After 5 rounds of panning, 288 clones were randomly picked. Specificity of candidate scFv-phages was assessed using phageELISA against phOx-BSA conjugate, and 15 phOx-BSA binders were identified from which 5 high binding clones were selected for further characterization including L3B8C, L3B10B, L3E4C, L3G7C and L3H11A (Table S2). The phOx-selective phage clones bound phOx-BSA in a concentration-dependent manner (Figure 4A). The binding of these phage clones to phOx heptan was further confirmed by competitive phageELISA in which free phOx competed with immobilized phOx-BSA for phage binding. Phage binding towards immobilized phOx-BSA was dose-dependently suppressed by free phOx at concentrations that spread more than two log10 scales, suggesting each phage clone exhibited distinct binding characteristics towards phOx (Figure 4B). Furthermore, the phage clones bound significantly higher to phOx-BSA than to other irrelevant immobilized antigens including BSA, insulin, thyroglobulin, angiotensin II (Ang II), LPS and ssDNA (Figure 4C), suggesting the isolated phage clones were selective towards phOx. To compare the phOx binding pocket, putative amino acid sequences of the phOx-selective scFvs were translated, and their CDRs were identified and aligned. It is noted that some of the identified scFvs shared conserved CDR motifs with previously characterized phOx-binders in the Ig database , , while others showed conservative amino acid substitutions (Figure 4, panels D and E, Table S2).
ScFv library derived from a single scFv template
It was reasoned that if fsPCR could truly diversify CDR3, an scFv library could be established from a single scFv template. To test this hypothesis, a new scFv library was constructed using a single scFv clone L3G7C that exhibited a moderate phOx-specific binding and a noticeable non-specific binding towards LPS (Figure 4). After VH and VLκ amplification and CDR3 diversity enhancement by fsPCR, the newly constructed scFv library was panned against phOx-BSA conjugate again. Two rounds of panning were carried out and 48 candidate clones were randomly picked (Figure 5A). The nucleotide sequence of selected phage clones were determined, and sequence alignment indicated that sequence variations were found only within the CDR3 of both VH (Figure 5B) and VLκ (Figure 5C). Furthermore, it was noted that some of the selected phOx-binders exhibited frame-shift in the Ig CDR3 coding sequences, and resulting in lacking of consensus CDR3 boundary sequence and without recognizable framework 4 (FR4) of Ig (Figure 5 and Table S3).
Reactivity of the selected clones was further evaluated by phageELISA, and 7 distinct clones that exhibited high phageELISA signal were identified (Table S4). PhageELISA indicated the selected clones bound phOx avidly with minimum binding to other control antigens, except LPS (Figure 6A). There was no apparent changes in specific binding (Figure 6B), but clones L10D5A and L10D10A gave a significant lower binding than the parental L3G7C to the unrelated antigens insulin, angiotensin and ssDNA (Figure 6A). Despite clones L10D5A and L10D10A still gave noticeable non-specific binding towards LPS, the binding was substantially lower than the parental L3G7C. These results suggested that fsPCR diversified the CDR3 sequences, which resulted in decreasing the non-specific binding and therefore enhancing the signal-to-noise ratio of antigen binding. Hence, fsPCR could be used for optimizing the binding selectivity of antibody.
Consistent with previous reports that unexpected frame-shifts and stop codons are commonly found in gene expressed in phage-display system , , sequence analysis indicated that some of the selected phage clones contained stop codon or displayed a frame-shift in the coding region (Table S4). To confirm the expression of scFv proteins, those selected phage clones were subcloned into a pCantab His vector which also fused a His6 tag to the C-terminus of scFv protein. The clones were bacterially expressed and the periplasmic proteins were collected. Following Western protein analysis, His6-tagged recombinant scFv proteins with a predicted molecular size of ∼30 kDa were detected (Figure 6C), suggesting the L3G7C-derived phOx-selective phage clones were expressed properly in spite of the coding region containing stop codon and/or frame-shift.
Antigen-specific antibody fragment derived from naïve verse immunized splenocytic germline scFv library
To determine whether germline antibody library represented the immune status, a naïve phage-displayed scFv library (L4; 3.5×105 recombinants) was constructed using splenocytic DNA collected from non-immunized mice, and the library was screened against recombinant N protein of SARS-associated coronavirus (SCoV). For comparison, an immunized germline scFv library (L9; 3×106 recombinants) was also prepared using splenocytic genomic DNA from mice immunized with heat-inactivated SARS associated coronavirus. After two rounds of biopanning, 2100 and 96 specific anti-N scFvs were isolated from both the immunized and naïve libraries, respectively. Following confirmation with phageELISA, six strong binders for SCoV-N protein were identified from naive (L4A3A; L4A3B, L4A8B) and immunized (L9A6B, L9A11B, and L9N01) libraries. It is of interest to note that not only a larger number of phage clones was isolated, 9 clones with identical sequence were found from immunized library L9 and those clones were subsequently referred as L9N01.
In addition to recognizing recombinant SCoV-N protein in its native form, the selected anti-N scFv-phages also bound to the recombinant N protein in Western blot (Figure 7A). Binding of the selected scFv-phages was specific as it only exhibited high binding towards recombinant N protein but not to any of the non-specific controls (Figure 7B). However, it was noted that the phage clone L9N01 cross-reacted with interleukin 11 . Sequence analysis indicated that those selected anti-N scFvs were derived from different rearranged Ig variable region genes, and higher sequence homology was found amongst the light chains (Figure 8, Table S5). However, there was on distinct difference between clones derived from naïve or immunized libraries, except more positive clones were isolated from the immunized library.
In this proof-of-concept study, we have demonstrated the possibility of using rearranged germline immunoglobulin variable region genes as the source of antibody repertoires for construction of germline scFv library. In order to retrieve the rearranged variable region genes from genomic DNA, degenerate primers were used. Though changes in the nucleotide sequences of FR1 and FR4 is a major disadvantage of using degenerate primers, sequence-alteration by degenerate primers was considered as a minor influence . On the other hand, the use of family-specific primers assures less change in nucleotide sequences, but a large number of family specific primer pairs will be required for retrieving the Ig variable region genes from genomic DNA. Moreover cross-family amplification cannot be avoided if all the primers are used at the same time.
The incorporation of a random hexamer in the fsPCR reverse primer may cause a bias towards GC-rich sequences. Nucleotide sequence analysis revealed that the AT and GC contents in the VH CDR3 were 53 and 47% (data not shown), respectively, suggesting that there was no significant bias towards GC-rich sequences. Furthermore, the bias of PCR condition that favored certain nucleotide sequences was minimized by adding 5% DMSO. Although an addition of both DMSO (5%) and betaine (1 M) was suggested to achieve a uniform amplification of heterogeneous templates , we used 5% DMSO alone in the present study because the yield of PCR amplification from genomic templates was not satisfactory in the presence of both chemicals.
Sequence analysis indicated that the variable regions of germline Ig genes were successfully retrieved from genomic DNA. Furthermore, retrieved Ig variable regions could be grouped into 4 VLκ and 7 VH subgroups with reference to germline V gene-segment (Table S6). It is of interest to note that more VH (7 out 15) subfamilies were retrieved than that of VLk (4 out of 19) using present strategy. However, when comparing the distributions of Ig clones in each V genes subfamily, distributions of Ig clones of the naive germline library were different from that of the target-specific Ig clones (Table S6). These results are consistent with the notion that Ig repertoires of naive germline scFv library are highly diverse while library derived from immunized mice was an image of humoral response to antigens.
In silico translation of the coding sequences indicated that some selected scFvs contained unexpected frame-shifts and stop codons. However, protein expression study showed that those clones with defective ORF produced normal scFv recombinant protein in E. Coli. Previously, it has been reported that unexpected frame-shifts and stop codons were found in genes expressed in the phage-displayed system , . A eukaryotic yeast-displayed system has been used to identify antigen-specific scFv clones from a human non-immune antibody library, overcoming the various shortcomings of phage-displayed system . Hence, to facilitate the cloning and production of antigen-specific scFvs, the germline antibody library could alternatively be expressed on the yeast-displayed system.
In agreement with previous report that the length of mice heavy chain CDR3 ranges from 2-19 residues with an average size of 10 residues , the heavy chain CDR3 length of those germline-derived anti-phOx and anti-N scFvs ranged from 2 to 23 residues with a predominant distributions at 8 - 11 residues (Table S7). Consistent with light chain lacking the D segment of heavy chain, the light chain CDR3 length was smaller than that of heavy chain and ranged from 8 to 18 residues with a peak value at 9 residues (Table S7). The similarity in CDR3 length distribution between native and recombinant antibodies suggested the hypervariable regions that derived from germline rearranged Ig genes were folded into the canonical structure of immunoglobulins .
Multiple sequence alignment of those isolated anti-phOx scFv clones indicated that the nucleotide sequences in CDR1 and CDR2 of both heavy and light chains were highly conservative, and indeed they shared significant homology with previously characterized anti-phOx antibodies. This result is consistent with previous reports that mouse anti-phOx IgG response is highly restricted and most antibodies are preferentially generated via the recombination of VH-Ox1 and Vκ-Ox1 variable gene segments , . In comparison with the CDR3 of heavy chain, the light chain CDR3 was more conservative. This result suggests that light chain CDR3 of anti-phOx antibody may play a critical role in phOx binding. The less variability in light chain CDR3 is also in line with previous observation that in the course of affinity maturation of anti-phOx antibodies, hypermutations of amino acid residues in light chain are much concentrated in the CDR1 rather than the CDR3 . It has been reported that the CDR3 of anti-phOx heavy chain consists of a sequence motif Asp-X-Gly-X-X (where X is any amino acid) ,  and the length of CDR3 ranges from 4-11 amino acid residues . In the present study, it is noted that majority of the heavy chain CDR3 of anti-phOx scFvs were 8-11 amino acid residues in length (Tables S2 and S6). The heavy chain CDR3 of L3G7C consisted of the sequence motif Asp-X-Gly-X-X, but other clones were not. These results suggest that the anti-phOx scFvs prepared from genomic DNA with fsPCR displayed similar characteristics of anti-phOx antibody derived from hybridoma or mRNA. However, by comparing the CDR3 composition, anti-phOx scFvs derived from genomic DNA with fsPCR were more heterogeneous in nature.
Moreover, with fsPCR mimicking somatic recombination, we successfully generated CDR3 heterogeneity in length and nucleotide sequence among those anti-phOx scFvs that were obtained from the L3G7C-derived scFv library (Table S4). In comparison with the parental L3G7C, those single template-derived anti-phOx scFvs gave a lower non-specific binding, suggesting fsPCR can be used in antibody engineering for optimizing antibody selectivity.
Sequence variation was found mainly localized at the 3′-end of CDR3 among the anti-phOx scFvs which exhibited different non-specific binding, suggesting the amino acids in the C-terminal end of CDR3 played a critical role in determining the reactivity of antibody. Indeed, the result is consistent with the observation that the last three amino acids of CDR3 in llama antibody determining the reactivity towards different subtype of HIV strain . However, the molecular nature of how these terminal amino acids modulate the binding between antibody and antigen remains to be examined.
Consistent with the results of anti-phOx scFvs, light chain of the anti-N phage clones shared higher homology. On the other hand, higher diversity of amino acids in the CDR3 region of heavy chain was noted. In spite of lacking noticeable difference among the anti-N phage clones that obtained from the naïve and immunized libraries, clones with identical sequence were enriched in the immunized library. These results suggest that antigen-specific monoclonal antibody could be easily obtained from the immunized germline library bypassing the labor-intensive hybridomas preparation.
In principle, the germline scFv library is also an effective means for profiling an antibody response by retrieving rearranged immunoglobulin genes from the CD19+ splenocytes that are stimulated by antigen and undergo clonal expansion, which provides a much-in-demand tool for analyzing the humoral responses to antigens. Indeed, using a germline scFv library derived from splenocytic DNA of mice immunized with heat-inactivated SARS-coronavirus, we have identified an anti-N scFv that cross-reacts with interleukin 11 .
Comparing with the conventional mRNA-derived antibody libraries, germline rearranged Ig gene-derived library may consist of antibodies that have not undergone hypermutations, which results in generating polyreactive antibodies . Indeed, despite phOx-selective binders were successfully isolated from the germline antibody libraries, some selected anti-phOx scFvs exhibited noticeable binding to unrelated antigens. On the other hand, the use of genomic DNA will benefit from its higher stability in handling and larger feasibility in sampling. In addition, the V gene repertoires in the germline antibody library are predicted to be wider than that of mRNA-derived library. For instance, heavy and light chain clones deriving from V pseudogenes (Table S1) were found in the present study. Furthermore, among those isolated anti-phOx scFvs, we found CDR sequences that had been reported previously, and we also identified new CDR motifs. However, in order to evaluate the repertoire diversity of the germline scFv library and to unveil the molecular nature that determines antibody binding specificity, the germline antibody library should be panned against a panel of antigens, and selected antibodies could then be compared for their binding specificity and binding characteristics.
Insertions and deletions of CDR3 nucleotides, which expand the repertoires of antibody hypervariable loop both in length and in structure, have recently been shown to be an additional mechanism how immunoglobulin V region genes are diversified , , . Moreover, lines of evidence suggest that diversity of CDR3 plays a critical role in antigen recognition , . These results support the notion that diversity enhancement of CDR3 by fsPCR can expand antibody diversity beyond what is encoded by germline Ig genes . Current strategy can achieve the effects of gene-shuffling, site-directed mutagenesis, and nucleotide(s) deletion/insertion in one fsPCR step, which not only rescues stop-codon containing Ig genes from the genome, but also modify the repertoires of Ig genes, simultaneously. On the other hand, previous studies have indicated that the CDR3 loop of heavy chain is potentially immunogenic , , diversity enhancement of CDR3 by fsPCR may further increase the immunogenicity of antibody.
Taken together, a simple PCR-based method has been developed to use rearranged germline Ig genes as source of antibody repertoires for library construction. This germline antibody library allows us effectively to prepare target-specific antibody and to profile antibody response.
Schematic diagram showing primer locations. In reference to an scFv molecule, locations of primers used in present study are indicated together with their nucleotide sequences. Restriction sites for cloning in the primers are underlined and restriction enzymes are indicated.
Analysis of immunoglobulin clones derived from mouse genomic DNA.
Analysis of anti-phOx scFv clones from immunized mice-derived scFv Lib03.
Putative reading framework of Library 10 clones that derived from a single scFv template L3G7C.
Analysis of anti-phOx scFv from L3G7C-derived scFv Lib10.
Analysis of anti-SCoVN scFv derived from scFv Lib04 (naive) and Lib09 (immunized).
Germline V-minigene Usage.
Number of amino acid residue in CDR3.
Special thanks to Dr. Shawn S.O. Leung, and Dr. Yiu-Loon Chiu for their suggestions and helpful discussion. Thanks also go to Thomas L.H. Mak and Yick-Keung Suen for their excellent technical assistance.
Competing Interests: The authors have read the journal's policy and have the following conflicts: A United States patent (US7241598 B2) on frame-shifting PCR had been granted in 2007, which is indicated in the manuscript (Materials and Methods) and the patent is listed in the References (Ref. 32). The authors declare that at the present moment, no product is in development. Neither direct or indirect interests exist, nor funding, consultancy, nor employment provided by companies for commercial-related purposes and activities. This does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials.
Funding: This work was partly supported by a CUHK Strategic Research Program Grant (44M4028), an Earmarked research grant (CUHK4536/03M) from Research Grants Council of Hong Kong Special Administrative Region (HKSAR) and a research grant from HKSAR Innovation and Technology Commission (GHP/030/06). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Articles from PLoS ONE are provided here courtesy of Public Library of Science