Evaluation of data integration strategies based on kernel method of clinical and microarray data
Journal Title: Bioinformation - Year 2012, Vol 8, Issue 3
Abstract
Cancer classification is one of the most challenging problems in bioinformatics. The data set provided by the Netherlands Cancer Institute consists of 295 breast cancer patients: 101 with distant metastases and 194 without. Combinations of feature sets based on kernel methods are investigated for classifying patients as with or without distant metastases. Each single data set (clinical or microarray) is compared with three data integration strategies and with weighted versions of those strategies, all based on kernel methods. The Least Squares Support Vector Machine (LS-SVM) is chosen as the classifier because it can handle very high-dimensional features such as microarray data. The experimental results show that weighted late integration and microarray data alone perform almost equally well. In this case, data integration is therefore not always better than using a single data set. Classification performance depends strongly on the features used to represent the object.
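To make the idea of kernel-based integration concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common way of combining two data sources in a kernel framework: a weighted sum of per-source kernel matrices, fed to an LS-SVM solved as a single linear system. The abstract does not specify the kernel, the weights, or the regularization constant, so the linear kernel, the weights `[0.3, 0.7]`, `gamma=1.0`, the synthetic data, and all function names are illustrative assumptions.

```python
import numpy as np


def linear_kernel(a, b):
    """Linear kernel matrix between the rows of a and the rows of b."""
    return a @ b.T


def weighted_kernel(kernels, weights):
    """Weighted sum of same-shaped kernel matrices (assumed integration rule)."""
    combined = np.zeros_like(kernels[0], dtype=float)
    for k, w in zip(kernels, weights):
        combined += w * k
    return combined


def lssvm_fit(K, y, gamma=1.0):
    """Fit an LS-SVM by solving the linear system
        [0   1^T        ] [b    ]   [0]
        [1   K + I/gamma] [alpha] = [y]
    with labels y in {-1, +1}. Returns (alpha, b)."""
    n = K.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]


def lssvm_predict(K_new, alpha, b):
    """Predict labels in {-1, +1}; K_new has shape (n_new, n_train)."""
    scores = K_new @ alpha + b
    return np.where(scores >= 0, 1.0, -1.0)


if __name__ == "__main__":
    # Synthetic stand-ins for the two sources (NOT the real patient data).
    rng = np.random.default_rng(0)
    y = np.array([1.0] * 20 + [-1.0] * 20)
    clinical = y[:, None] + 0.1 * rng.standard_normal((40, 5))
    microarray = 0.5 * y[:, None] + 0.1 * rng.standard_normal((40, 50))

    K = weighted_kernel(
        [linear_kernel(clinical, clinical), linear_kernel(microarray, microarray)],
        [0.3, 0.7],  # hypothetical source weights
    )
    alpha, b = lssvm_fit(K, y, gamma=1.0)
    accuracy = np.mean(lssvm_predict(K, alpha, b) == y)
    print(f"training accuracy: {accuracy:.2f}")
```

In practice the source weights and `gamma` would be chosen by cross-validation, and performance would be reported on held-out patients rather than on the training set.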
Authors and Affiliations
Ary Noviyanto, Ito Wasito