PUBMED FOR HANDHELDS


  • Title: A bioinformatics pipeline to build a knowledge database for in silico antibody engineering.
    Author: Zhao S, Lu J.
    Journal: Mol Immunol; 2011 Apr; 48(8):1019-26. PubMed ID: 21310488.
    Abstract:
    A challenge in antibody engineering is the large number of positions and kinds of variation to consider, weighed against the risk of introducing unfavorable biochemical properties. While large libraries are quite successful in identifying antibodies with improved binding or activity, only a fraction of the possibilities can be explored, and even that requires considerable effort. The vast array of natural antibody sequences provides a potential wealth of information on (1) selecting hotspots for variation, and (2) designing mutants to mimic natural variations seen in hotspots. The human immune system can generate an enormous diversity of immunoglobulins against an almost unlimited range of antigens by gene rearrangement of a limited number of germline variable, diversity and joining genes, followed by somatic hypermutation and antigen selection. All the antibody sequences in the NCBI database can be assigned to different germline genes. As a result, a position-specific scoring matrix for each germline gene can be constructed by aligning all its member sequences and calculating the amino acid frequencies for each position. The position-specific scoring matrix for each germline gene characterizes "hotspots" and the nature of variations, and thus reduces the sequence space to explore in antibody engineering. We have developed a bioinformatics pipeline to analyze human antibody sequences, and generated a comprehensive knowledge database for in silico antibody engineering. The pipeline is fully automatic, and the knowledge database can be refreshed at any time by re-running the pipeline. The refresh process is fast, typically taking 1 min on a Lenovo ThinkPad T60 laptop with 3 GB of memory. Our knowledge database consists of (1) the individual germline gene usage in generation of natural antibodies; (2) the CDR length distributions; and (3) the position-specific scoring matrix for each germline gene. The knowledge database provides comprehensive support for antibody engineering, including de novo library design in selection of favorable germline V gene scaffolds and CDR lengths. In addition, we have developed a web application framework to present our knowledge database; its web interface helps users easily retrieve a variety of information from the knowledge database.
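
The abstract describes building a per-germline matrix by aligning member sequences and calculating amino acid frequencies at each position. The Python sketch below illustrates that idea only; it is not the authors' pipeline. The toy sequences, the function names `position_frequencies` and `hotspots`, and the 0.9 conservation threshold are all assumptions made for illustration, and the matrix holds plain frequencies rather than the log-odds scores that many position-specific scoring matrices use.

```python
from collections import Counter

# Hypothetical toy alignment: equal-length sequences assumed to be already
# aligned and assigned to a single germline gene. Invented for illustration.
ALIGNED = [
    "QVQLVQSG",
    "QVQLVESG",
    "EVQLVQSG",
    "QVQLVKSG",
]


def position_frequencies(aligned):
    """Return one dict per alignment column, mapping residue -> frequency."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    n = len(aligned)
    matrix = []
    # zip(*aligned) walks the alignment column by column.
    for column in zip(*aligned):
        counts = Counter(column)
        matrix.append({residue: count / n for residue, count in counts.items()})
    return matrix


def hotspots(matrix, threshold=0.9):
    """Return 0-based positions whose most common residue falls below threshold.

    The threshold is an assumed cutoff, not a value taken from the abstract.
    """
    return [i for i, freqs in enumerate(matrix) if max(freqs.values()) < threshold]


matrix = position_frequencies(ALIGNED)
print(hotspots(matrix))  # [0, 5]
```

In this toy input, positions 0 and 5 vary across sequences and are reported as candidate hotspots, while the fully conserved columns are excluded. The residue frequencies stored at each hotspot indicate which substitutions occur naturally there.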