Pubmed for Handhelds
Title: Fingerprint-based clustering applied to define a QSAR model use radius.
Author: Sprous DG.
Journal: J Mol Graph Model; 2008 Sep; 27(2):225-32.
PubMed ID: 18556228.
Abstract: In ongoing research, QSAR has been a tool applied to evaluate compound qualities associated with skin permeability and membership in either a druglike class or specific nondruglike type classes. A need that arose from this pursuit was to know the boundaries of the QSAR models within which molecules could be analyzed. To satisfy this need, a method of QSAR model validation was developed which moves away from the simple declaration of correlation to a description of expected correlation as a function of similarity to the training set. This extension of the "validation" and "predictive" concepts to include a border is referred to henceforth as the QSAR model use radius. By defining this metric, it is possible to select for models which have predictivity exterior to their training sets. The heart of this approach is the common use of division into training sets and test sets to demonstrate an ability to successfully predict outside of the training set. The new rigor introduced is to repetitively cluster and systematically increase the permitted dissimilarity within those clusters. The training sets are assembled by taking one and only one compound from each cluster at a specific level of permitted dissimilarity. The QSAR model is developed over these training sets and applied to predict the remaining compounds. In this manner, it is possible to point where there is adequate similarity to predict a compound and where there is not. This method is especially useful for large, chemically redundant systems of greater than 250 compounds where leave-one-out crossvalidation is of limited use.
To illustrate this technique, the results of defining the use radius for (a) a skin permeability model (based on 276 compounds), (b) a drug compound and "safe" compound partition (3000 compounds) and (c) a kinase inhibitor and drug compound partition (approximately 1300 compounds) are discussed.
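
The procedure outlined in the abstract (cluster at a permitted dissimilarity, take one compound per cluster as the training set, predict the remaining compounds, then repeat at a larger dissimilarity) might be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the abstract does not name the clustering algorithm, the fingerprint type, or the model form, so greedy leader clustering on Tanimoto distance and a nearest-neighbour predictor are stand-ins, and all function names (`tanimoto_distance`, `leader_cluster`, `use_radius_profile`, and so on) are hypothetical.

```python
import math
import random


def tanimoto_distance(a, b):
    """1 - Tanimoto similarity for fingerprints given as sets of on-bits."""
    union = len(a | b)
    similarity = len(a & b) / union if union else 1.0
    return 1.0 - similarity


def leader_cluster(fps, max_dist):
    """Greedy leader clustering (an assumed stand-in for the paper's method).

    Each compound joins the first cluster whose leader lies within max_dist;
    otherwise it starts a new cluster. Returns lists of compound indices.
    """
    clusters = []
    for i, fp in enumerate(fps):
        for members in clusters:
            if tanimoto_distance(fp, fps[members[0]]) <= max_dist:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def nearest_neighbour_predict(fp, train_idx, fps, y):
    """Placeholder 'QSAR model': return the activity of the closest training compound."""
    nearest = min(train_idx, key=lambda j: tanimoto_distance(fp, fps[j]))
    return y[nearest]


def pearson(xs, ys):
    """Pearson correlation, or None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((v - my) ** 2 for v in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def use_radius_profile(fps, y, thresholds, seed=0):
    """For each permitted dissimilarity, train on one compound per cluster
    and report (threshold, training-set size, correlation on the rest)."""
    rng = random.Random(seed)
    profile = []
    for t in thresholds:
        clusters = leader_cluster(fps, t)
        train = [rng.choice(members) for members in clusters]  # one per cluster
        train_set = set(train)
        test = [i for i in range(len(fps)) if i not in train_set]
        preds = [nearest_neighbour_predict(fps[i], train, fps, y) for i in test]
        actual = [y[i] for i in test]
        profile.append((t, len(train), pearson(actual, preds)))
    return profile


if __name__ == "__main__":
    # Tiny made-up data set: two chemical families with distinct activities.
    fps = [{1, 2}, {1, 3}, {4, 5}, {4, 6}]
    y = [1.0, 1.2, 3.0, 3.3]
    for t, n_train, r in use_radius_profile(fps, y, [0.0, 0.7, 1.0]):
        print(f"max dissimilarity={t:.1f}  training size={n_train}  r={r}")
```

Reading the resulting profile from small to large thresholds gives a rough picture of how the held-out correlation changes as the test compounds are allowed to sit further from the training set, which is the idea behind the use radius. In practice the stand-in predictor would be replaced by the actual QSAR model and the set-based fingerprints by those produced by a cheminformatics toolkit.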