IS 517 Methods of Data Science

A dramatic increase in computing power has enabled new areas of data science to develop in statistical modeling and analysis. These areas cover predictive and descriptive learning and bridge ideas and theory in statistics, computer science, and artificial intelligence. We will cover many of these new methods, including predictive learning, which estimates models from data to predict future outcomes, notably through regression and classification models. Regression topics include linear regression with recent advances for handling large numbers of variables, smoothing techniques, additive models, and local regression. Classification topics include discriminant analysis, logistic regression, support vector machines, generalized additive models, naive Bayes, mixture models, and nearest neighbor methods. Lastly, we develop neural networks and deep learning techniques, connecting the theory introduced in the earlier parts of the class to purely empirical methods. We situate these methods in the "data science lifecycle" as part of the larger set of practices in the discovery and communication of scientific findings.

Recent syllabus