CGS602A: Basic Statistics Data Analysis & Inference

Course Description

In this course we will study experiment design, basic data analysis and inference, hypothesis testing, and some modelling. This involves exploring data through visualization to help propose viable hypotheses, testing those hypotheses, and constructing and validating predictive models for the data. The basic statistics and probability concepts required will also be covered, but quickly.

The language and platform used will be Python. The Python language itself will not be taught in the course; see the reference material on the links page.

At the end of the course students should be proficient in the platform used and should be able to use it to do basic data analysis and inference, and to construct simple statistical models for data. They will also be expected to understand the basic statistical theory underlying data analysis. However, the emphasis will be on applying statistical methods for data analysis and not on the theory behind the methods themselves.

The topics below will not be covered in sequence but braided together so that the course as a whole makes sense.

Course Content

Topics

  1. Types of studies, experiments and their design. Independent and dependent variables. Sources of error and variation.
  2. Basic statistics and probability fundamentals.
  3. Basic visualization of data (scatter, box, density and other plots).
  4. Probability laws.
  5. Discrete and continuous distributions.
  6. Parameters of a distribution.
  7. Moments, expectation, linearity of expectation.
  8. Inferring the parameters of a distribution from a sample. Likelihood function and maximum likelihood estimates. Biased and unbiased estimates.
  9. Basic overview of the Python libraries needed.
  10. Sampling and the sampling distribution, confidence intervals.
  11. Null and alternative hypotheses.
  12. Significance level, P-values. Basic statistical inference: various statistics and tests based on them. Single and two population tests for mean, proportion, difference of means and proportions.
  13. Simple, multiple regression models; testing and validating models; model selection.
  14. One-way and two-way analysis of variance. Analysis of longitudinal studies and repeated measures.
  15. Basic non-parametric tests.
  16. Bayesian methods and modelling.
  17. (Optional - if time permits) Some basic computational models e.g. accumulator models.
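
As a rough illustration of the kind of analysis topics 8 and 10 point to, here is a minimal sketch using only the Python standard library. The helper name mean_confidence_interval, the simulated data, and the use of a normal approximation are assumptions made for this example and are not taken from the course material.

```python
import random
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean.

    Hypothetical helper for illustration only.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error uses the unbiased (n - 1) sample standard deviation.
    std_err = statistics.stdev(sample) / n ** 0.5
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * std_err, mean + z * std_err


rng = random.Random(0)  # fixed seed so the run is repeatable
# Simulated data (an assumption): 200 draws from a normal distribution.
sample = [rng.gauss(10.0, 2.0) for _ in range(200)]

# Biased estimate divides by n (the maximum likelihood estimate for a normal model).
biased_var = statistics.pvariance(sample)
# Unbiased estimate divides by n - 1.
unbiased_var = statistics.variance(sample)

low, high = mean_confidence_interval(sample)
print(f"variance: biased={biased_var:.3f}, unbiased={unbiased_var:.3f}")
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

The two variance estimates differ only by the factor (n - 1)/n, so the gap shrinks as the sample grows. For small samples a t-based interval would usually be preferred over the normal approximation used here.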

Reference material

Links to internet resources are at: useful links.

  1. Scott E Maxwell, Harold D Delaney, Ken Kelley, Designing Experiments and Analyzing Data: A Model Comparison Perspective, 3rd Ed, Routledge, 2018.
  2. Gary W Oehlert, A First Course in Design and Analysis of Experiments, Creative Commons License, 2010.
  3. Marc C Paolella, Fundamental Probability: A Computational Approach, Wiley, 2006.
  4. Marc C Paolella, Fundamental Statistical Inference: A Computational Approach, Wiley, 2018.
  5. R Barker Bausell, The Design and Analysis of Meaningful Experiments Involving Human Participants, Oxford, 2015.
  6. D S Moore, G P McCabe, B A Craig, Introduction to the Practice of Statistics, 9th Ed., WH Freeman and Co., 2017.