PubMed for Handhelds
Title: Representative distance: a new similarity measure for class discovery from gene expression data.
Author: Yu Z, You J, Li L, Wong HS, Han G.
Journal: IEEE Trans Nanobioscience; 2012 Dec; 11(4):341-51.
PubMed ID: 22893451.
Abstract: Similarity measurement is one of the most important stages in the process of cancer discovery from gene expression data. Traditional distance functions, such as the Euclidean distance, the correlation coefficient measure, and the cosine distance, are typically used to quantify the similarity between two cancer samples. However, these measures take into account neither the properties of cancer samples nor the relationships among the genes in gene expression data. To explore the properties of cancer samples and the relationships among genes, we design a new similarity measure called representative distance (RD) to identify cancer samples in gene expression data. Specifically, RD does not compute the distance between two cancer samples using all the genes, but calculates the similarity using only representative genes selected by the affinity propagation algorithm. Then, a similarity matrix is constructed based on the representative distance. Finally, the spectral clustering algorithm is adopted to partition the similarity matrix and discover biologically meaningful samples. To our knowledge, this is the first time that the representative distance has been applied to class discovery for gene expression data. Experiments on real cancer datasets indicate that our similarity measure can (i) outperform most of the traditional distance measures and (ii) identify cancer samples correctly in most of the datasets.
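The three steps named in the abstract (select representative genes with affinity propagation, build a similarity matrix from distances over those genes, partition it with spectral clustering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which distance is computed over the representative genes, how distances are converted to similarities, or which parameters are used. The sketch assumes Euclidean distance, a Gaussian kernel whose bandwidth is the median pairwise distance, and scikit-learn's `AffinityPropagation` and `SpectralClustering`. The function names and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation, SpectralClustering


def representative_genes(X, random_state=0):
    """Select representative genes by clustering genes with affinity propagation.

    X has shape (n_samples, n_genes). Each gene is described by its
    expression profile across samples, i.e. one column of X.
    """
    ap = AffinityPropagation(random_state=random_state).fit(X.T)
    idx = ap.cluster_centers_indices_
    # Assumption: if no exemplars are found, fall back to using all genes.
    if idx is None or len(idx) == 0:
        return np.arange(X.shape[1])
    return np.sort(idx)


def representative_distance(X, gene_idx):
    """Pairwise distance between samples using only the selected genes.

    Euclidean distance is an assumption; the abstract does not name the metric.
    """
    R = X[:, gene_idx]
    diff = R[:, None, :] - R[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def discover_classes(X, n_classes, random_state=0):
    """Cluster samples by spectral clustering on an RD-based similarity matrix."""
    gene_idx = representative_genes(X, random_state)
    D = representative_distance(X, gene_idx)
    # Assumption: Gaussian kernel with the median nonzero distance as bandwidth.
    sigma = np.median(D[D > 0])
    S = np.exp(-(D ** 2) / (2.0 * sigma ** 2))
    model = SpectralClustering(
        n_clusters=n_classes, affinity="precomputed", random_state=random_state
    )
    return model.fit_predict(S), gene_idx


# Synthetic stand-in for expression data: 40 samples, 30 genes, two classes.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.5, size=(40, 30))
X[:20, :10] += 8.0     # genes 0-9 are elevated in the first class
X[20:, 10:20] += 8.0   # genes 10-19 are elevated in the second class

labels, gene_idx = discover_classes(X, n_classes=2)
print("representative genes:", gene_idx)
print("sample labels:", labels)
```

The number of classes is passed in explicitly here; how the paper chooses it, and how it evaluates the resulting partition, is not stated in the abstract.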