Introduces predictive analytics with applications in engineering, business, finance, health care, and socioeconomic areas. Topics include time series and cross-sectional data processing, data visualization, correlation, linear and multiple regression, classification and clustering, time series decomposition, factor models and causal models, predictive modeling performance analysis, and a case study. Provides a foundation of basic theory and methodology with applied examples to analyze large engineering, social, and econometric data for predictive decision making. Hands-on experiments with R will be emphasized.
Predicting with probability (OpenIntro Statistics Ch 2, 3)
Data and Statistics (OpenIntro Statistics Ch 4, 5)
Linear regression (ISLR Ch 3)
Classification (ISLR Ch 4, APM Ch 12, Tennis)
Lasso and Model Selection (ISLR Ch 6)
Tree-based methods (ISLR Ch 8)
Estimation
Bayesian inference
Time series forecasting (FPP)
Aug 29: No class
Sep 5: Labor Day (University Closed)
Sep 12: HW 1 Due
Sep 26: HW 2 Due
Oct 10: HW 3 Due
Oct 10: Fall Break (Classes Do Not Meet)
Oct 17: In-class Midterm
Oct 24: Final Project Proposal
Oct 31: HW 4 Due
Nov 14: HW 5 Due
Nov 28: Last class, project presentations
Dec 5: Final Projects Due
Students will have an in-class midterm exam and a final project. There are 5 homework assignments; students are encouraged to work in small groups. Each homework has 2-3 "theoretical questions" and 2-3 "hands-on" problems. Theoretical questions will be based on the material covered in class. Hands-on problems will require using R and routines provided by the instructor to perform data analysis tasks. For the final project, a student or a group of students can choose their own data set and a hypothesis to verify. The instructor will have 1-2 data sets/analysis problems available in case students have a hard time identifying one on their own. Work on the final project can begin as soon as class starts. Each group will submit a final report.
You can choose which software you use. I recommend investing the time to learn R. Python is a good choice as well. R is the dominant software package for real-world predictive analytics and is used throughout other courses. This open-source software is available for free download at www.r-project.org, and you can find documentation there. A great way to start learning is to buy a book and work through tutorials. A good guide is Adler's R in a Nutshell. They have many tutorials to help you get up to speed. You can browse other options by searching "R statistics" on Amazon. If you are new to R (and even if not), you should complete a tutorial to familiarize yourself with the language. A great option is the Try R course from Code School.
The grade is based entirely on participation in class, homework assignments, the in-class midterm, and the final project.
Midterm 35% + Final project 35% + Homework 30%. Scores of each component are normalized to be out of 100. Grades will be posted on Bb. Cut-offs: 97 (A+), 93 (A), 90 (A-), 87 (B+), 82 (B), 79 (B-), 77 (C+), 73 (C), 70 (C-), 67 (D+), 60 (D)
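As an illustration of how the weights and cut-offs combine, here is a minimal Python sketch. It rests on assumptions not stated in the syllabus: each cut-off is treated as an inclusive lower bound, anything below 60 is labeled F, and the function names and example scores are hypothetical. It is not the official grading procedure.

```python
# Component weights from the grading formula: midterm 35%, final project 35%, homework 30%.
WEIGHTS = {"midterm": 0.35, "project": 0.35, "homework": 0.30}

# Letter-grade cut-offs from the syllabus, listed from highest to lowest.
CUTOFFS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (82, "B"), (79, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (60, "D"),
]


def course_score(scores):
    """Weighted total, assuming each component score is already out of 100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(total):
    """Map a weighted total to a letter.

    Assumption: cut-offs are inclusive lower bounds, and totals below 60
    map to F (the syllabus does not list a grade below D).
    """
    for cutoff, letter in CUTOFFS:
        if total >= cutoff:
            return letter
    return "F"


# Hypothetical component scores, for illustration only.
example = {"midterm": 88, "project": 92, "homework": 95}
total = course_score(example)
print(round(total, 2), letter_grade(total))
```

With these example scores the weighted total is 0.35 × 88 + 0.35 × 92 + 0.30 × 95 = 91.5, which falls in the A- band under the inclusive-lower-bound assumption.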
Diez, Barr and Cetinkaya-Rundel, OpenIntro Statistics, OpenIntro, 2015.
James, Witten, Hastie and Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, 2013.
Kuhn and Johnson, Applied Predictive Modeling, Springer, 2013.
Hyndman and Athanasopoulos, Forecasting: Principles and Practice, OTexts, 2013.
To promote a stronger sense of mutual responsibility, respect, trust, and
fairness among all members of the George Mason University community and
with the desire for greater academic and personal achievement, we, the
student members of the university community, have set forth this honor code:
Student members of the George Mason University community pledge not to
cheat, plagiarize, steal, or lie in matters related to academic work.
Students are responsible for their own work, and students and faculty must take
on the responsibility of dealing with violations. The tenet must be a foundation
of our university culture.
All work performed in this course will be subject to Mason's Honor Code.
Students are expected to do their own work in the course. For the group project,
students are expected to collaborate with their assigned group members.
In papers and project reports, students are expected to write in their own words,
The university is committed to providing equal access to employment and
educational opportunities for people with disabilities.
Mason recognizes that individuals with disabilities may need reasonable
accommodations to have equally effective opportunities to participate in or
benefit from the university educational programs, services, and activities, and
have equal employment opportunities. The university will adhere to all
applicable federal and state laws, regulations, and guidelines with respect to
providing reasonable accommodations as necessary to afford equal
employment opportunity and equal access to programs for qualified people
with disabilities.
Applicants for admission and students requesting reasonable accommodations
for a disability should call the Office of Disability Services at 703-993-2474.
Employees and applicants for employment should call the Office of Equity and
Diversity Services at 703-993-8730. Questions regarding reasonable
accommodations and discrimination on the basis of disability should be
directed to the Americans with Disabilities Act (ADA) coordinator in the Office
of Equity and Diversity Services.