A first course in probability and statistics emphasizing statistical reasoning and basic concepts. Topics include visual and numerical summaries of data, representative and non-representative samples, elementary discrete probability theory, the normal distribution, sampling variability, elementary statistical inference, and measures of association. Examples and applications are drawn from the popular press and the life, social, and physical sciences. No prerequisites.
Spring 2014 (180), Spring 2015 (144), Spring 2016 (184)
This course covers both the formal mathematical and the informal intuitive foundations of modern applications of statistics and related fields. The emphasis is on foundational concepts rather than practical applications.
The main topics are:
(1) the maximum entropy principle for large systems and large deviations
(2) the bias-variance dilemma for nonparametric inference
(3) computation and inference for graphical models
Topic (1) will touch on ideas from statistical physics, large deviations, and information theory. Topic (2) will introduce some of the key concepts from classical statistics and then focus on more modern techniques like kernel methods and support vector machines. Topic (3) will introduce graphical models and highlight some important tools like dynamic programming, MCMC, and EM.
1. Gibbs ensembles, maximum entropy principle, large deviations, relative entropy, Sanov's theorem, conditional limit theorem, exponential families, lossless source coding, entropy, asymptotic equipartition property, lossless source coding theorem, exchangeability, asymptotic independence, Maxwell's distribution
2. Statistical estimation, consistency, bias, variance, mean squared error, mean integrated squared error, kernel density estimation, bandwidth, cross-validation, Stone's theorem, curse of dimensionality, maximum likelihood estimation, classification, error probabilities, Bayesian classification rule, Neyman-Pearson lemma, ROC curves, statistical classification, generative models, linear discriminant analysis, quadratic discriminant analysis, naive Bayes, discriminative models, logistic regression, k nearest neighbors, maximum margin classifiers, support vector machines, kernel methods
3. Undirected graphical models, Gibbs random fields, Markov random fields, Hammersley-Clifford theorem, derived dependency graphs, dynamic programming, hidden Markov models, Gibbs sampling, Markov chain Monte Carlo, exponential families, latent variable models, minorization-maximization, expectation-maximization, Gaussian mixture models
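As a concrete illustration of one technique from topic 2, here is a minimal sketch of kernel density estimation with a Gaussian kernel. This is not course material, just an illustration; the bandwidth value is an arbitrary choice (in the course, bandwidth selection is treated via cross-validation).

```python
import numpy as np

def kde(x, data, h):
    """Gaussian kernel density estimate at points x from samples data, bandwidth h."""
    x = np.asarray(x, dtype=float)[:, None]   # evaluation points, shape (m, 1)
    d = np.asarray(data, dtype=float)[None, :]  # samples, shape (1, n)
    z = (x - d) / h
    # Average of Gaussian bumps centered at each sample.
    return np.exp(-0.5 * z**2).sum(axis=1) / (d.size * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
samples = rng.normal(size=500)            # draws from N(0, 1)
grid = np.linspace(-3, 3, 7)
density = kde(grid, samples, h=0.4)       # should roughly track the N(0,1) density
```

The estimate integrates to one by construction, and the bias-variance tradeoff shows up directly: small h gives a spiky, high-variance estimate, large h an oversmoothed, biased one.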
Fall 2009 (21), Spring 2011 (36), Spring 2012 (43), Spring 2016 (57)
Probability and statistics are increasingly computational fields. This course exposes students to several topics that use computers to solve challenging problems in probability and statistics. Students will also use computers to develop intuitions about classical analytic results in probability and statistics.
0. Review (probability, random variables, expectation, hypothesis testing, confidence intervals, linear algebra, eigenvectors)
1. Simulating randomness (pseudo-random number generation, transformation of random variables, rejection sampling)
2. Stochastic approximation (law of large numbers, central limit theorem, convolution, Monte Carlo integration, importance sampling)
3. Random walks (recurrence properties, exit probabilities)
4. Graphical models (Gibbs random fields, Markov random fields, Bayes nets, hidden Markov models, dynamic programming, Gibbs sampling, 2D Ising model)
5. Dimensionality reduction (principal components analysis, projection pursuit, independent components analysis, blind source separation)
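To give a flavor of item 2 above, here is a minimal sketch of Monte Carlo integration (not an assignment from the course, just an illustration of the idea): estimate E[g(X)] by averaging g over random draws, with a standard error that shrinks like 1/sqrt(n) by the central limit theorem.

```python
import numpy as np

# Monte Carlo estimate of E[X^2] for X ~ Uniform(0, 1); the exact value is 1/3.
rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(size=n)
g = x**2
estimate = g.mean()                        # sample mean, converges by the LLN
std_err = g.std(ddof=1) / np.sqrt(n)       # CLT-based standard error
```

With n = 100,000 samples the standard error is about 0.001, so the estimate typically agrees with 1/3 to two or three decimal places.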
Fall 2010 (64), Fall 2011 (102)
Information theory is the study of the fundamental limits of information transmission and storage. The concepts of information theory extend far beyond communication theory, however, and have influenced diverse fields from physics to computer science to biology. This course, intended primarily for advanced undergraduates and beginning graduate students, offers a broad introduction to information theory and its applications: Entropy and information; lossless data compression; communication in the presence of noise, channel capacity, and channel coding; lossy compression and rate-distortion theory; Kolmogorov complexity.
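The central quantity of the course, Shannon entropy, can be computed in a few lines. A hedged sketch (the function name is my own, not from the course):

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy H(p) = -sum_i p_i log2 p_i, in bits (0 log 0 taken as 0)."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]                      # drop zero-probability outcomes
    return float(-(nz * np.log2(nz)).sum())

# A fair coin carries 1 bit per flip; a biased coin carries less,
# which is exactly why biased sources compress below 1 bit per symbol.
entropy_bits([0.5, 0.5])   # 1.0
entropy_bits([0.9, 0.1])   # ~0.469
```

Entropy is the fundamental limit for lossless compression: no code can achieve an expected length below H bits per symbol, and good codes approach it.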
Fall 2013 (25), Fall 2014 (35)
Many modern data sets involve observations about a network of interacting components. Probabilistic and statistical models for graphs and networks play a central role in understanding these data sets. This is an area of active research across many disciplines. Students will read and discuss primary research papers and complete a final project. I will occasionally lecture on background material for the readings. The required readings will focus on breadth over depth, and the selection of readings may depend somewhat on student interest. Students will acquire depth through their individual projects.
Spring 2010 (16)
Roughly 1% of the US adult population is incarcerated. Many will return to prison after their release. Educational programs for inmates, particularly GED and college-level courses, can reduce recidivism rates. There are only a few existing programs for Brown-affiliated individuals to get involved in correctional education. I, along with several graduate students, have been teaching degree-granting math courses (accredited by the Community College of Rhode Island) within Rhode Island's state prisons. Talk with me if you would like more information.
Spring 2016 Office Hours
182 George St Rm 327
APMA 0650 Office Hrs
182 George St Rm 327
APMA 1740/2610 Office Hrs
182 George St Rm 327
Last updated February 8, 2016
© Matthew Harrison
Any opinions, findings, conclusions or recommendations expressed in this material or material obtained from this website are solely those of the author and do not necessarily reflect the views of Brown University, NSF, DARPA, NIH or any other sponsor.