SYST/OR 568. Applied Predictive Analytics - Mason Analytics MS

George Mason University
Fall 2023

Course Material
Video Lectures

TA: Raina Joy Saha (rsaha3 (at) gmu.edu)
Instructor: Vadim Sokolov

Location: Krug Hall 7
Time: Mondays 4:30 pm - 7:10 pm
Office hours: By appointment
Prerequisites: Graduate standing (undergraduate engineering math: calculus, probability theory, statistics, and some basic computer programming skills).
HW Logistics: You will submit your HW and projects to Blackboard.

Content and goals

Introduces predictive analytics with applications in engineering, business, finance, health care, and social economic areas. Topics include time series and cross-sectional data processing, data visualization, correlation, linear and multiple regressions, classification and clustering, time series decomposition, factor models and causal models, predictive modeling performance analysis, and a case study. Provides a foundation of basic theory and methodology with applied examples to analyze large engineering, social, and econometric data for predictive decision making. Hands-on experiments with R will be emphasized.

List of Topics

  • Predicting with probability (3 Ch 2,)
  • Data and Statistics (5 Ch 4,)
  • Linear regression (3 Ch)
  • Classification (ISLR Ch 4, APM Ch 12)
  • Lasso and Model Selection (ISLR Ch 6)
  • Tree-based methods (ISLR Ch 8)
  • Estimation
  • Bayesian Inference
  • Time series forecasting (html notes, pdf notes)

Schedule

  • Aug 21: First Class
  • Sep 4: Labor Day (University Closed)
  • Sep 11: HW 1 Due
  • Sep 25: HW 2 Due
  • Oct 9: Fall Break (Classes Do Not Meet)
  • Oct 10 (Tuesday): Class meets
  • Oct 16: HW 3 Due
  • Oct 23: In-class Midterm
  • Oct 30: Final Project Proposal
  • Nov 6: HW 4 Due
  • Nov 20: HW 5 Due
  • Nov 27: Last class, project presentations
  • Dec 3: Final Projects Due

Assignments

Students will have an in-class midterm exam and a final project. There are 5 homework assignments; students are encouraged to work in small groups. Each homework has 2-3 theoretical questions and 2-3 hands-on problems. Theoretical questions will be based on the material covered in class. Hands-on problems will require using R and routines provided by the instructor to perform data analysis tasks. For the final project, a student or a group of students can choose their own data set and a hypothesis to verify. The instructor will have 1-2 data sets/analysis problems available in case students have a hard time identifying one on their own. Work on the final project can begin as soon as class starts. Each group will submit one final report.

Computing

You can choose which software you use. I recommend investing the time to learn R; Python is a good choice as well. R is the dominant software package for real-world predictive analytics and is used throughout other courses. This open-source software is available for free download at www.r-project.org, where you can also find documentation.

A great way to start learning is to buy a book and start working through tutorials. A good guide is Adler’s R in a Nutshell, which includes many tutorials to help you get up to speed. You can browse other options by searching ‘R statistics’ on Amazon. If you are new to R (and even if not), you should complete a tutorial to familiarize yourself with the language. A great option is the Try R course from Code School.

Grading

The grade is based entirely on participation in class, homework assignments, the in-class midterm, and the final project.

Midterm 35% + Final project 35% + Homework 30%. Scores of each component are normalized to be out of 100. Grades will be posted. Cut-offs: 97 (A+), 93 (A), 90 (A-), 87 (B+), 82 (B), 79 (B-), 77 (C+), 73 (C), 70 (C-), 67 (D+), 60 (D).
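As a rough illustration of how the weights and cut-offs above combine, here is a minimal Python sketch. It is not an official grade calculator. The function names and the example scores are hypothetical, and three readings are assumptions rather than stated policy: that the final cut-off of 60 corresponds to a D, that anything below 60 is an F, and that a score exactly at a cut-off earns that letter.

```python
# Component weights as stated in the syllabus (each score is out of 100).
WEIGHTS = {"midterm": 0.35, "project": 0.35, "homework": 0.30}

# Cut-offs from the syllabus, highest first.
# Assumption: the last cut-off, 60, is read as a D.
CUTOFFS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (82, "B"), (79, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (60, "D"),
]


def weighted_score(scores):
    """Combine normalized component scores into one score out of 100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Return the first letter whose cut-off the score meets or exceeds."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:  # assumption: cut-offs are inclusive
            return letter
    return "F"  # assumption: below 60 is failing


# Hypothetical student scores, for illustration only.
example = {"midterm": 88, "project": 92, "homework": 95}
total = weighted_score(example)
print(round(total, 2), letter_grade(total))
```

For the hypothetical scores shown, the weighted total is 0.35 × 88 + 0.35 × 92 + 0.30 × 95 = 91.5, which falls between the 90 and 93 cut-offs.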

Optional Textbooks

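  • Diez, Barr and Cetinkaya-Rundel, OpenIntro Statistics, OpenIntro, 2015.
  • James, Witten, Hastie and Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, 2009.
  • Kuhn and Johnson, Applied Predictive Modeling, Springer, 2013.
  • Hyndman and Athanasopoulos, Forecasting: Principles and Practice, OTexts, 2013.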

Mason Honor Code

To promote a stronger sense of mutual responsibility, respect, trust, and fairness among all members of the George Mason University community and with the desire for greater academic and personal achievement, we, the student members of the university community, have set forth this honor code: Student members of the George Mason University community pledge not to cheat, plagiarize, steal, or lie in matters related to academic work. Students are responsible for their own work, and students and faculty must take on the responsibility of dealing with violations. The tenet must be a foundation of our university culture.

All work performed in this course will be subject to Mason’s Honor Code. Students are expected to do their own work in the course. For the group project, students are expected to collaborate with their assigned group members. In papers and project reports, students are expected to write in their own words.

Individuals with Disabilities

The university is committed to providing equal access to employment and educational opportunities for people with disabilities.

Mason recognizes that individuals with disabilities may need reasonable accommodations to have equally effective opportunities to participate in or benefit from the university educational programs, services, and activities, and have equal employment opportunities. The university will adhere to all applicable federal and state laws, regulations, and guidelines with respect to providing reasonable accommodations as necessary to afford equal employment opportunity and equal access to programs for qualified people with disabilities.

Applicants for admission and students requesting reasonable accommodations for a disability should call the Office of Disability Services at 703-993-2474. Employees and applicants for employment should call the Office of Equity and Diversity Services at 703-993-8730. Questions regarding reasonable accommodations and discrimination on the basis of disability should be directed to the Americans with Disabilities Act (ADA) coordinator in the Office of Equity and Diversity Services.