Date of Award
8-2008
Document Type
Thesis
Degree Name
Master of Science (MS)
College/School
College of Science and Mathematics
Department/Program
Computer Science
Thesis Sponsor/Dissertation Chair/Project Chair
Jing Peng
Committee Member
Stefan Robila
Committee Member
Dajin Wang
Abstract
In this thesis we propose the use of sparse Principal Component Analysis (PCA) to represent high-dimensional data for classification. A sparse transformation reduces data volume and dimensionality without loss of critical information, so that the data can be processed efficiently and assimilated by a human. We obtained sparse representations of high-dimensional datasets using Sparse Principal Component Analysis (SPCA) and the Direct formulation of Sparse Principal Component Analysis (DSPCA). We then performed classification using the k-nearest neighbor (kNN) method and compared the results with those obtained using regular PCA. The experiments were performed on hyperspectral data and on various datasets obtained from the University of California, Irvine (UCI) Machine Learning Repository. The results suggest that sparse data representation is desirable because it makes the components easier to interpret. It also improves classification performance for certain numbers of features, and in most cases classification performance is similar to that of regular PCA.
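The pipeline described in the abstract (project the data onto a dense or a sparse principal component, then classify with kNN) can be illustrated with a minimal sketch. This is not the thesis's implementation. The sparse component here is computed with a simple truncated power iteration, used as a stand-in for SPCA and DSPCA, which solve different optimization problems. The toy data, the function names, and the `keep` parameter are all assumptions made for illustration.

```python
from collections import Counter
import math


def center(X):
    """Return the mean-centered rows of X together with the column means."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - mean[j] for j in range(d)] for row in X], mean


def covariance(Xc):
    """Sample covariance matrix of already-centered data."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_component(C, keep=None, iters=100):
    """Power iteration for the leading eigenvector of C.

    If `keep` is given, all but the `keep` largest-magnitude loadings are
    zeroed at every step (truncated power iteration). This is an assumed
    stand-in for a sparse PCA solver, not SPCA or DSPCA themselves.
    """
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        if keep is not None:
            order = sorted(range(d), key=lambda i: abs(w[i]), reverse=True)
            top = set(order[:keep])
            w = [w[i] if i in top else 0.0 for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(x, mean, v):
    """Score of one sample on component v after centering."""
    return sum((xi - mi) * vi for xi, mi, vi in zip(x, mean, v))


def knn_predict(train_scores, train_labels, score, k=3):
    """Majority vote among the k training scores closest to `score`."""
    ranked = sorted(zip(train_scores, train_labels),
                    key=lambda t: abs(t[0] - score))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


# Invented toy data: only the first two features separate the classes.
X_train = [
    [1.0, 1.2, 0.5, 0.4], [1.2, 0.9, 0.4, 0.6],
    [0.8, 1.1, 0.6, 0.5], [1.1, 1.0, 0.5, 0.5],
    [3.0, 3.1, 0.5, 0.6], [3.2, 2.9, 0.6, 0.4],
    [2.9, 3.0, 0.4, 0.5], [3.1, 3.2, 0.5, 0.4],
]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_test = [[1.0, 1.0, 0.5, 0.5], [3.0, 3.0, 0.5, 0.5]]

Xc, mean = center(X_train)
C = covariance(Xc)
dense_v = leading_component(C)             # regular PCA direction
sparse_v = leading_component(C, keep=2)    # at most two nonzero loadings


def classify(v):
    """Project train and test data onto v, then label the test data by kNN."""
    train_scores = [project(x, mean, v) for x in X_train]
    return [knn_predict(train_scores, y_train, project(x, mean, v))
            for x in X_test]


dense_pred = classify(dense_v)
sparse_pred = classify(sparse_v)
print("dense loadings: ", [round(x, 3) for x in dense_v])
print("sparse loadings:", [round(x, 3) for x in sparse_v])
print("dense kNN:", dense_pred, "sparse kNN:", sparse_pred)
```

In this sketch the sparse component has exact zeros on the features it discards, so the features that drive the projection can be read directly from the loadings, while the kNN step is identical for both representations.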
Recommended Citation
Siddiqui, Salman, "Sparse Representation of High Dimensional Data for Classification" (2008). Theses, Dissertations and Culminating Projects. 1265.
https://digitalcommons.montclair.edu/etd/1265