Date of Award


Document Type


Degree Name

Master of Science (MS)


College of Science and Mathematics


Computer Science

Thesis Sponsor/Dissertation Chair/Project Chair

Jing Peng

Committee Member

Stefan Robila

Committee Member

Dajin Wang


In this thesis we propose the use of sparse Principal Component Analysis (PCA) to represent high-dimensional data for classification. A sparse transformation reduces the data volume and dimensionality without loss of critical information, so that the data can be processed efficiently and assimilated by a human. We obtained sparse representations of high-dimensional datasets using Sparse Principal Component Analysis (SPCA) and the Direct formulation of Sparse Principal Component Analysis (DSPCA). We then performed classification with the k-nearest neighbor (kNN) method and compared the results with those obtained using regular PCA. The experiments were performed on hyperspectral data and on various datasets from the University of California, Irvine (UCI) machine learning dataset repository. The results suggest that sparse data representation is desirable because it enhances interpretation. It also improves classification performance for certain numbers of features, and in most cases its classification performance is similar to that of regular PCA.
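The pipeline described in the abstract (project the data onto a sparse or a dense principal direction, then classify with kNN and compare) can be illustrated with a minimal sketch. It does not reproduce the SPCA or DSPCA algorithms used in the thesis: as a stand-in, it obtains a sparse loading vector with a truncated power iteration that keeps only the largest-magnitude loadings. The synthetic data, the function names (`leading_component`, `knn_predict`, and so on), and the parameter values are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def make_data(n_per_class, rng):
    """Synthetic two-class data: 2 informative features, 4 noise features (assumed)."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
            noise = [rng.gauss(0.0, 0.5) for _ in range(4)]
            X.append(informative + noise)
            y.append(label)
    return X, y


def covariance(X):
    """Return the sample covariance matrix and the feature means of the rows of X."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    C = [[0.0] * d for _ in range(d)]
    for row in X:
        c = [row[j] - mean[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                C[a][b] += c[a] * c[b] / (n - 1)
    return C, mean


def leading_component(C, keep, iters=200):
    """Power iteration that zeroes all but the `keep` largest-magnitude loadings.

    With keep == len(C) this is ordinary power iteration (the dense PCA direction).
    This is a stand-in for SPCA/DSPCA, not an implementation of either.
    """
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        top = set(sorted(range(d), key=lambda j: abs(w[j]), reverse=True)[:keep])
        w = [w[j] if j in top else 0.0 for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(X, mean, v):
    """Project mean-centered rows of X onto the loading vector v."""
    return [sum((row[j] - mean[j]) * v[j] for j in range(len(v))) for row in X]


def knn_predict(train_z, train_y, z, k=3):
    """Majority vote among the k nearest training points in the 1-D projection."""
    nearest = sorted(range(len(train_z)), key=lambda i: abs(train_z[i] - z))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


rng = random.Random(0)
X_train, y_train = make_data(40, rng)
X_test, y_test = make_data(20, rng)
C, mean = covariance(X_train)

results = {}
for name, keep in (("dense", 6), ("sparse", 2)):
    v = leading_component(C, keep)
    z_train = project(X_train, mean, v)
    z_test = project(X_test, mean, v)
    preds = [knn_predict(z_train, y_train, z) for z in z_test]
    acc = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    results[name] = (v, acc)
    print(name, [round(x, 2) for x in v], "accuracy:", acc)
```

The sparse loading vector has at most two nonzero entries, so the projected feature can be read directly as a combination of two original variables. This is the interpretability benefit the abstract refers to, while the dense direction mixes all six variables.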
