IS 490RB Foundations of Data Science

This course builds a practical foundation for data science by teaching students basic tools and techniques that can scale to large computational systems and massive data sets. Students will first learn how to work at a Unix command prompt, then learn about version control software such as Git and the GitHub site. Next, the course covers the Python programming language, with a focus on the aspects of the language and the associated Python modules that are relevant to data science. Python will be introduced and used primarily through IPython (or Jupyter) Notebooks, and the course will cover the NumPy, SciPy, Matplotlib, pandas, seaborn, and scikit-learn modules. These capabilities will be demonstrated through simple data science tasks such as obtaining data, cleaning data, visualizing data, and basic data analysis.

Recent syllabus

Textbooks and Course Materials

