Aleksandar (Alex) Vakanski

 

CS 404/504 Special Topics: Python Programming for Data Science

 

Course Syllabus

Course GitHub page

https://github.com/avakanski/Fall-2022-Python-Programming-for-Data-Science

Course Description

As organizations increasingly rely on data science projects to improve their functions and operations, the tools for managing such projects have matured as well. This course introduces students to Python tools and libraries that organizations commonly use to manage the different phases in the life cycle of data science projects. The content is divided into four main themes. The first theme reviews the basics of Python programming and extends them with advanced concepts. The second theme focuses on data engineering and covers Python tools for data exploration and preprocessing. The third theme gives an overview of model engineering, including model training, testing, fine-tuning, and selection. The fourth theme introduces Data Science Operations (DSOps) and covers techniques for model deployment, performance monitoring, and reproducibility of data science projects in production environments. The course provides hands-on Python programming experience in data science workflow management. Additional work is required for graduate credit.

Learning Outcomes

Upon completion of the course, students should be able to:

1.  Understand and describe commonly used Python frameworks for life cycle management of data science projects.

2.  Apply advanced Python tools for data collection, analysis, and visualization.

3.  Design, validate, and justify the selection of data science models using statistical approaches, data mining, and machine learning methods.

4.  Implement algorithms for processing tabular, image, and natural language data using Python-based frameworks.

5.  Understand the main characteristics of existing Python libraries for deployment, continuous delivery, and monitoring of data science projects.

6.  Deploy data science projects on cloud servers and as web applications.

Course Materials

Textbooks:

  1. Joel Grus, "Data Science from Scratch: First Principles with Python," 2nd Edition, O'Reilly Media, 2019, ISBN: 9781492041139.
  2. Chip Huyen, "Designing Machine Learning Systems," O'Reilly Media, 2022, ISBN: 9781098107963.

Topics

Prerequisites

CS 212 Practical Python - OR - CS 477/577 Python for Machine Learning - OR - Instructor Permission

The course requires basic programming skills in Python. Knowledge of data science approaches is recommended but not required.

Evaluation Procedure

Quizzes (3): 30%

Assignments (4): 60%

Class participation: 10%
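
As an illustration only, the weights listed above can be combined into a final score with a short Python sketch. It assumes that every component is scored on a 0-100 scale and that the quizzes and the assignments are each averaged with equal weight; the syllabus does not state either, and the function name `final_score` and the sample scores are hypothetical.

```python
# Weights taken from the evaluation breakdown in this syllabus.
WEIGHTS = {"quizzes": 0.30, "assignments": 0.60, "participation": 0.10}


def final_score(quizzes, assignments, participation):
    """Combine component scores (each on a 0-100 scale) into a weighted total.

    Assumption (not stated in the syllabus): scores within a component
    are averaged with equal weight.
    """
    quiz_avg = sum(quizzes) / len(quizzes)
    assignment_avg = sum(assignments) / len(assignments)
    return (
        WEIGHTS["quizzes"] * quiz_avg
        + WEIGHTS["assignments"] * assignment_avg
        + WEIGHTS["participation"] * participation
    )


# Hypothetical scores: three quizzes, four assignments, one participation score.
example = final_score([80, 90, 100], [85, 95, 90, 90], 100)
print(f"Final score: {example:.1f}")
```

The actual grading computation is at the instructor's discretion; the sketch only shows how the three percentages add up to 100% of the grade.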