Date of Award


Document Type


Degree Name

Master of Science (MS)


College of Science and Mathematics


Mathematical Sciences

Thesis Sponsor/Dissertation Chair/Project Chair

Haiyan Su

Committee Member

Meiyin Wu

Committee Member

Andrada Ivanescu


Swimming or fishing in contaminated water can cause numerous diseases, so it is important to monitor water quality before recreational activities take place. To model water quality at any given time from available predictors, data from bodies of water across New Jersey from 1999 to 2013 were collected from the STORET database. The water quality parameters studied were Escherichia coli (E. coli) and enterococcus, with dissolved oxygen (DO), pH, salinity, temperature, total dissolved solids (TDS), and total suspended solids (TSS) as predictors. The data were broken down by habitat type: river, coastal, inland coastal, and ponded. For each habitat, statistical models were built with E. coli and enterococcus as response variables and the predictors listed above. The techniques used to analyze the data were multiple regression, logistic regression, and lasso regression. Likelihood ratio tests and deviance tests were used for model selection. The results of each method were compared for each habitat. With the best selected model, water quality can be predicted from the available predictors.
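
As a minimal illustration of the likelihood ratio test mentioned above for model selection, the sketch below compares two nested logistic regression models. Everything in it is an assumption made for illustration: the data are synthetic (not taken from STORET), the binary exceedance response, the coefficient values, and the helper names `fit_logistic` and `likelihood_ratio_test` are hypothetical, and the p-value formula is valid only when the full model adds exactly one predictor. It is not the thesis's own code.

```python
import math
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson.

    Returns (coefficients, maximized log-likelihood). An intercept
    column is added automatically.
    """
    X = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                        # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    eps = 1e-12  # guards against log(0)
    loglik = float(np.sum(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps)))
    return beta, loglik


def likelihood_ratio_test(X_reduced, X_full, y):
    """Likelihood ratio test for nested models differing by ONE predictor.

    Returns (statistic, p-value). The p-value uses the chi-square
    distribution with 1 degree of freedom, whose survival function
    is erfc(sqrt(x / 2)).
    """
    _, ll_reduced = fit_logistic(X_reduced, y)
    _, ll_full = fit_logistic(X_full, y)
    stat = max(2.0 * (ll_full - ll_reduced), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Synthetic, standardized predictors (illustrative only, not STORET data).
rng = np.random.default_rng(0)
n = 500
temperature = rng.normal(size=n)
ph = rng.normal(size=n)

# Hypothetical truth: exceedance probability rises with temperature only.
true_prob = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * temperature)))
exceeds = (rng.random(n) < true_prob).astype(float)

# Reduced model: pH only. Full model: pH plus temperature.
X_reduced = ph.reshape(-1, 1)
X_full = np.column_stack([ph, temperature])

lr_stat, p_value = likelihood_ratio_test(X_reduced, X_full, exceeds)
print(f"LR statistic = {lr_stat:.2f}, p-value = {p_value:.3g}")
```

A small p-value favors keeping the extra predictor in the model. Comparing models that differ by more than one predictor would require the chi-square distribution with the corresponding degrees of freedom, which this sketch does not implement.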

Included in

Mathematics Commons