In the current work, we investigate the impact of two library design elements: tailored diversity at putative paratope positions and partial wild-type conservation at structurally important positions within the paratope region. We hypothesize that the amino acid distribution observed in expressed immunoglobulin complementarity-determining regions is superior to both a uniform distribution and a focused tyrosine/serine degeneracy. We have identified a set of skewed nucleotide mixtures that yields codons approximately matching this antibody-inspired distribution while enabling library synthesis at substantially lower cost than triphosphoramidite-based construction. Our second design hypothesis is that wild-type bias at structurally critical positions within the paratope region can improve library design by increasing the functional diversity of the library, reducing the entropic cost of binding, and focusing diversity on positions with a higher likelihood of antigen contact.
We investigated these hypotheses in the context of the tenth type III domain of human fibronectin (Fn3). Fn3 is a small, stable protein with a beta-sandwich fold and three solvent-exposed loops, and it is structurally similar to an antibody variable domain. The two design elements were incorporated into a new Fn3 library. Stability, structural, and genetic analyses were conducted to determine the extent of wild-type bias, if any, desired at each position. Diversity was achieved using skewed nucleotide mixtures to yield the antibody-inspired distribution. Direct competition of the new library against two previous library designs yielded preferential selection of binders from the new library. Sequence analysis revealed the relative importance of the multiple design elements. The impact of wild-type bias was further characterized by stability analysis of library clones, which provided insight into the structural tolerance of the domain and into potential refinements of the library design. The library design principles and methodology should be directly applicable to other protein engineering efforts, both in molecular recognition and elsewhere.
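To make the idea of skewed nucleotide mixtures concrete, the following is a minimal sketch of how the amino acid distribution encoded by a degenerate codon can be computed from the nucleotide fractions at each of its three positions. It assumes the standard genetic code and independent incorporation of bases at each position. The function name `amino_acid_distribution` and the mixture named `HYPOTHETICAL_SKEW` are illustrative assumptions and are not the mixtures used in this work; the uniform NNK codon is included only as a familiar reference point.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_distribution(pos1, pos2, pos3):
    """Return {amino acid: probability} for a degenerate codon.

    Each argument maps a base (A, C, G, T) to its fraction at that codon
    position. Bases are assumed to be incorporated independently, so the
    probability of a codon is the product of its three base fractions.
    Stop codons are reported under the key "*".
    """
    for mixture in (pos1, pos2, pos3):
        if abs(sum(mixture.values()) - 1.0) > 1e-6:
            raise ValueError("nucleotide fractions at a position must sum to 1")
    distribution = {}
    for b1, b2, b3 in product(BASES, repeat=3):
        p = pos1.get(b1, 0.0) * pos2.get(b2, 0.0) * pos3.get(b3, 0.0)
        if p > 0.0:
            aa = CODON_TABLE[b1 + b2 + b3]
            distribution[aa] = distribution.get(aa, 0.0) + p
    return distribution


# Reference point: the common NNK codon (N = equal A/C/G/T, K = equal G/T).
UNIFORM_N = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
GT_K = {"G": 0.5, "T": 0.5}
nnk = amino_acid_distribution(UNIFORM_N, UNIFORM_N, GT_K)

# Hypothetical skewed mixture, chosen only to illustrate the calculation.
HYPOTHETICAL_SKEW = (
    {"A": 0.20, "C": 0.10, "G": 0.30, "T": 0.40},
    {"A": 0.50, "C": 0.20, "G": 0.20, "T": 0.10},
    {"C": 0.50, "T": 0.50},
)
skewed = amino_acid_distribution(*HYPOTHETICAL_SKEW)

for aa in sorted(set(nnk) | set(skewed)):
    print(f"{aa}: NNK={nnk.get(aa, 0.0):.3f}  skewed={skewed.get(aa, 0.0):.3f}")
```

In a design setting, a calculation of this kind could be wrapped in a search over the base fractions to minimize the difference between the encoded distribution and a target distribution, although the optimization procedure itself is not specified here.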