Department of Statistics & Actuarial Science, Simon Fraser University
Author: Carl James Schwarz, Professional Statistician, P.Stat. (Statistical Society of Canada), PStat® (American Statistical Association). Phone: Retired. Office: Retired.

The course notes below illustrate methods of analysis using JMP, R, or SAS. Instructions for installing R, JMP, SAS (and other software) are also available.

Sample code used in the notes is available at:
**Sample Program Library**.
Note that even though a chapter may not have a version for a package, the program code
for the example is often available -- I just haven't had time yet to update the notes
to include the code and output directly in the notes.

The suggested citation for a chapter of notes is:

Schwarz, C. J. (2014). Chapter Name. In *Course Notes for Beginning and Intermediate Statistics*. Available at http://www.stat.sfu.ca/~cschwarz/CourseNotes. Retrieved yyyy-mm-dd.

Package | Chapter and sections | ||||
---|---|---|---|---|---|

JMP | R | SAS | 1 In the beginning... |
||

1.1 My favorite papers | |||||

1.2 Introduction | |||||

1.3 Effective note taking strategies | |||||

1.4 It's all $\Gamma \rho \epsilon \epsilon \kappa $ to me | |||||

1.5 Which computer package? | |||||

1.6 FAQ - Frequently Asked Questions | |||||

JMP | R | SAS | 2 Introduction to Statistics |
||

2.1 TRRGET - An overview of statistical inference | |||||

2.2 Parameters, Statistics, Standard Deviations, and Standard Errors | |||||

2.3 Confidence Intervals | |||||

2.4 Hypothesis testing | |||||

2.5 Meta-data | |||||

2.6 Bias, Precision, Accuracy | |||||

2.7 Types of missing data | |||||

2.8 Transformations | |||||

2.9 Standard deviations and standard errors revisited | |||||

2.10 Other tidbits | |||||

JMP | R | SAS | 3 Sampling |
||

3.1 Introduction | |||||

3.2 Overview of Sampling Methods | |||||

3.3 Notation | |||||

3.4 Simple Random Sampling Without Replacement (SRSWOR) | |||||

3.5 Sample size determination for a simple random sample | |||||

3.6 Systematic sampling | |||||

3.7 Stratified simple random sampling | |||||

3.8 Ratio estimation in SRS - improving precision with auxiliary information | |||||

3.9 Additional ways to improve precision | |||||

3.10 Cluster sampling | |||||

3.11 Multi-stage sampling - a generalization of cluster sampling | |||||

3.12 Analytical surveys - almost experimental design | |||||

3.13 References | |||||

3.14 Frequently Asked Questions (FAQ) | |||||

JMP | R | SAS | 4 Designed Experiments - Terminology and Introduction |
||

4.1 Terminology and Introduction | |||||

4.2 Applying some General Principles of Experimental Design | |||||

4.3 Some Case Studies | |||||

4.4 Key Points in Design of Experiments | |||||

4.5 A Road Map to What is Ahead | |||||

JMP | R | SAS | 5 Single Factor - Completely Randomized Designs (a.k.a. One-way design) |
||

5.1 Introduction | |||||

5.2 Randomization | |||||

5.3 Assumptions - the overlooked aspect of experimental design | |||||

5.4 Two-sample $t$-test - Introduction | |||||

5.5 Example - comparing mean heights of children - two-sample $t$-test | |||||

5.6 Example - Fat content and mean tumor weights - two-sample $t$-test | |||||

5.7 Example - Growth hormone and mean final weight of cattle - two-sample $t$-test | |||||

5.8 Power and sample size | |||||

5.9 ANOVA approach - Introduction | |||||

5.10 Example - Comparing phosphorus content - single-factor CRD ANOVA | |||||

5.11 Example - Comparing battery lifetimes - single-factor CRD ANOVA | |||||

5.12 Example - Cuckoo eggs - single-factor CRD ANOVA | |||||

5.13 Multiple comparisons following ANOVA | |||||

5.14 Prospective Power and sample size - single-factor CRD ANOVA | |||||

5.15 Pseudo-replication and sub-sampling | |||||

5.16 Frequently Asked Questions (FAQ) | |||||

5.17 Table: Sample size determination for a two sample $t$-test | |||||

5.18 Table: Sample size determination for a single factor, fixed effects, CRD | |||||

5.19 Scientific papers illustrating the methods of this chapter | |||||

JMP | R | SAS | 6 Single factor - pairing and blocking |
||

6.1 Introduction | |||||

6.2 Randomization protocol | |||||

6.3 Assumptions | |||||

6.4 Comparing two means in a paired design - the Paired $t$-test | |||||

6.5 Example - effect of stream slope upon fish abundance | |||||

6.6 Example - Quality check on two laboratories | |||||

6.7 Example - Comparing two varieties of barley | |||||

6.8 Example - Comparing prep of mosaic virus | |||||

6.9 Example - Comparing turbidity at two sites | |||||

6.10 Power and sample size determination | |||||

6.11 Single Factor - Randomized Complete Block (RCB) Design | |||||

6.12 Example - Comparing effects of salinity in soil | |||||

6.13 Example - Comparing different herbicides | |||||

6.14 Example - Comparing turbidity at several sites | |||||

6.15 Power and Sample Size in RCBs | |||||

6.16 Example - BPK: Blood pressure at presyncope | |||||

6.17 Final notes | |||||

6.18 Frequently Asked Questions (FAQ) | |||||

JMP | R | SAS | 7 Incomplete block designs |
||

7.1 Introduction | |||||

7.2 Example: Investigate differences in water quality | |||||

JMP | R | SAS | 8 Estimating an overall mean with subsampling |
||

8.1 Average flagellum length | |||||

JMP | R | SAS | 9 Single Factor - Sub-sampling and pseudo-replication |
||

9.1 Introduction | |||||

9.2 Example - Fat levels in fish - balanced data in a CRD | |||||

9.3 Example - fat levels in fish - unbalanced data in a CRD | |||||

9.4 Example - Effect of UV radiation - balanced data in RCB | |||||

9.5 Example - Monitoring Fry Levels - unbalanced data with sampling over time | |||||

9.6 Example - comparing mean flagella lengths | |||||

9.7 Final Notes | |||||

JMP | R | SAS | 10 Two Factor Designs - Single-sized Experimental units - CR and RCB designs |
||

10.1 Introduction | |||||

10.2 Example - Effect of photo-period and temperature on gonadosomatic index - CRD | |||||

10.3 Example - Effect of sex and species upon chemical uptake - CRD | |||||

10.4 Power and sample size for two-factor CRD | |||||

10.5 Unbalanced data - Introduction | |||||

10.6 Example - Stream residence time - Unbalanced data in a CRD | |||||

10.7 Example - Energy consumption in pocket mice - Unbalanced data in a CRD | |||||

10.8 Example: Use-Dependent Inactivation in Sodium Channel Beta Subunit Mutation - BPK | |||||

10.9 Blocking in two-factor CRD designs | |||||

10.10 FAQ | |||||

JMP | R | SAS | 11 Two-factor split-plot designs |
||

11.1 Introduction | |||||

11.2 The three basic structures | |||||

11.3 Data and labeling experimental units. | |||||

11.4 Assumptions | |||||

11.5 Example - Tensile strength of paper - main plots in CRD | |||||

11.6 Example - Biomass of trees - main plots in an RCB | |||||

11.7 Example - Tenderness of meat - main plots in an RCB | |||||

11.8 Example - Fungi degrading organic solvents - a split-plot in time | |||||

11.9 Example - Home range - an unbalanced split-site plot in time | |||||

11.10 Example - Floral scents and learning - pseudo-replication | |||||

11.11 Example - Pheromone effects upon wild type and anarchist colonies of bees | |||||

11.12 Repeated Measure Designs analyzed as a Split-Plot Analysis | |||||

11.13 Example - Holding your breath at different water temperatures - BPK | |||||

11.14 Example - Systolic blood pressure before presyncope - BPK | |||||

11.15 Final notes | |||||

11.16 Frequently Asked Questions (FAQ) | |||||

JMP | R | SAS | 12 Analysis of BACI experiments |
||

12.1 Introduction | |||||

12.2 Before-After Experiments - prelude to BACI designs | |||||

12.3 Simple BACI - One year before/after; one site impact; one site control | |||||

12.4 Example: Change in density in crabs near a power plant - one year before/after; one site impact; one | |||||

12.5 Simple BACI design - limitations | |||||

12.6 BACI with Multiple sites; One year before/after | |||||

12.7 Example: Density of crabs - BACI with Multiple sites; One year before/after | |||||

12.8 BACI with Multiple sites; Multiple years before/after | |||||

12.9 Example: Counting fish - Multiple years before/after; One site impact; one site control | |||||

12.10 Example: Counting chironomids - Paired BACI - Multiple-years B/A; One Site I/C | |||||

12.11 Example: Fry monitoring - BACI with Multiple sites; Multiple years before/after | |||||

12.12 A statistical diversion | |||||

12.13 Closing remarks about the analysis of BACI designs | |||||

12.14 BACI designs power analysis and sample size determination | |||||

JMP | R | SAS | 13 Comparing proportions - Chi-square ($\chi ^2$) tests |
||

13.1 Introduction | |||||

13.2 Response variables vs. Frequency Variables | |||||

13.3 Overview | |||||

13.4 Single sample surveys - comparing to a known standard | |||||

13.5 Comparing sets of proportions - single factor CRD designs | |||||

13.6 Pseudo-replication - Combining tables | |||||

13.7 Simpson's Paradox - Combining tables | |||||

13.8 More complex designs | |||||

13.9 Final notes | |||||

13.10 Appendix - how the test statistic is computed | |||||

13.11 Fisher's Exact Test | |||||

JMP | R | SAS | 14 Correlation and simple linear regression |
||

14.1 Introduction | |||||

14.2 Graphical displays | |||||

14.3 Correlation | |||||

14.4 Single-variable regression | |||||

14.5 A no-intercept model: Fulton's Condition Factor $K$ | |||||

14.6 Frequently Asked Questions - FAQ | |||||

JMP | R | SAS | 15 Detecting trends over time |
||

15.1 Introduction | |||||

15.2 Simple Linear Regression | |||||

15.3 Transformations | |||||

15.4 Pseudo-replication | |||||

15.5 Introduction | |||||

15.6 Power/Sample Size | |||||

15.7 Power/sample size examples | |||||

15.8 Testing for common trend - ANCOVA | |||||

15.9 Example: Degradation of dioxin - multiple locations | |||||

15.10 Example: Change in yearly average temperature with regime shifts | |||||

15.11 Dealing with Autocorrelation | |||||

15.12 Dealing with seasonality | |||||

15.13 Seasonality and Autocorrelation | |||||

15.14 Non-parametric detection of trend | |||||

15.15 Summary | |||||

15.16 ? | |||||

JMP | R | SAS | 16 Regression with pseudo-replication |
||

16.1 Introduction | |||||

16.2 Example: Selenium concentration in fish tissue | |||||

16.3 Pseudo-replication when regression is over time | |||||

16.4 Comparing slopes after environmental impact | |||||

JMP | R | SAS | 17 Regression - hockey sticks, broken sticks, piecewise, change points |
||

17.1 Hockey-stick, piecewise, or broken-stick regression | |||||

17.2 Searching for the change point | |||||

17.3 What is the first time that a treatment mean differs from a control mean | |||||

JMP | R | SAS | 18 Analysis of Covariance - ANCOVA |
||

18.1 Introduction | |||||

18.2 Assumptions | |||||

18.3 Comparing individual regression lines | |||||

18.4 Comparing means after covariate adjustments | |||||

18.5 Power and sample size | |||||

18.6 Example: Degradation of dioxin - multiple locations | |||||

18.7 Example: Change in yearly average temperature with regime shifts | |||||

18.8 Example - More refined analysis of stream-slope example | |||||

18.9 Example: Comparing Fulton's Condition Factor $K$ among groups | |||||

18.10 Final Notes | |||||

JMP | NA | NA | 19 Multiple linear regression |
||

19.1 Introduction | |||||

19.2 Example: Blood pressure vs. age, weight, and stress | |||||

19.3 Regression problems and diagnostics | |||||

19.4 Polynomial, product, and interaction terms | |||||

19.5 The general linear test | |||||

19.6 Indicator variables | |||||

19.7 Example: Predicting PM10 levels | |||||

19.8 Variable selection methods | |||||

JMP | R | SAS | 20 Regression - hockey sticks, broken sticks, piecewise, change points |
||

20.1 Hockey-stick, piecewise, or broken-stick regression | |||||

20.2 Searching for the change point | |||||

20.3 What is the first time that a treatment mean differs from a control mean | |||||

JMP | R | SAS | 21 Logistic Regression |
||

21.1 Introduction | |||||

21.2 Data Structures | |||||

21.3 Assumptions made in logistic regression | |||||

21.4 Example: Space Shuttle - Single continuous predictor | |||||

21.5 Example: Predicting Sex from physical measurements - Multiple continuous predictors | |||||

21.6 Retrospective and Prospective odds-ratio | |||||

21.7 Example: Parental and student usage of recreational drugs - $2 \times 2$ table. | |||||

21.8 Example: Effect of selenium on tadpole deformities - $2 \times k$ table. | |||||

21.9 Example: Pet fish survival - Multiple categorical predictors | |||||

21.10 Example: Horseshoe crabs - Continuous and categorical predictors. | |||||

21.11 Assessing goodness of fit | |||||

21.12 Variable selection methods | |||||

21.13 Complete Separation in Logistic Regression | |||||

21.14 Final Words | |||||

JMP | R | SAS | 22 Logistic Regression - Advanced Topics |
||

22.1 Introduction | |||||

22.2 Sacrificial pseudo-replication | |||||

22.3 Example: Fox-proofing mice colonies - dealing with sacrificial pseudo replication | |||||

22.4 Example: Over-dispersed Seed Germination Data | |||||

22.5 Example: Are mosquitos choosy? A preference experiment. | |||||

22.6 Example: Reprise: Are mosquitos choosy? A preference experiment with complete blocks. | |||||

22.7 Example: Reprise: Are mosquitos choosy? A preference experiment with INCOMPLETE blocks. | |||||

JMP | NA | NA | 23 Poisson Regression |
||

23.1 Introduction | |||||

23.2 Experimental design | |||||

23.3 Data structure | |||||

23.4 Single continuous $X$ variable | |||||

23.5 Single continuous $X$ variable - dealing with overdispersion | |||||

23.6 Single Continuous $X$ variable with an OFFSET | |||||

23.7 ANCOVA models | |||||

23.8 Categorical $X$ variables - a designed experiment | |||||

23.9 Log-linear models for multi-dimensional contingency tables | |||||

23.10 Variable selection methods | |||||

23.11 Summary | |||||

JMP | R | SAS | 24 A short primer on residual plots |
||

24.1 Linear Regression | |||||

24.2 ANOVA residual plots | |||||

24.3 Logistic Regression residual plots - Part I | |||||

24.4 Logistic Regression residual plots - Part II | |||||

24.5 Poisson Regression residual plots - Part I | |||||

24.6 Poisson Regression residual plots - Part II | |||||

JMP | NA | NA | 25 Time Series - a VERY brief introduction |
||

25.1 Introduction | |||||

25.2 Fundamental material | |||||

25.3 White noise and autocorrelation in regression | |||||

25.4 Detrending, Differencing and Integration | |||||

25.5 Autoregressive Models on stationary series | |||||

25.6 Moving Average Models on stationary series | |||||

25.7 Combining Moving Average and Autoregressive Models | |||||

25.8 Model Selection - I | |||||

25.9 Estimation | |||||

25.10 Model Selection - II - AIC | |||||

25.11 Model checking | |||||

25.12 Forecasting | |||||

25.13 Summary | |||||

JMP | R | SAS | 26 Tables |
||

26.1 A table of uniform random digits | |||||

26.2 Selected | |||||

26.3 Selected | |||||

26.4 Cumulative probability for the | |||||

26.5 Selected percentiles from the | |||||

26.6 Selected percentiles from the | |||||

26.7 Sample size determination for a two sample $t$-test | |||||

26.8 Power determination for a two sample $t$-test | |||||

26.9 Sample size determination for a single factor, fixed effects, CRD | |||||

26.10 Power determination for a single factor, fixed effects, CRD | |||||

JMP | R | SAS | 27 THE END! |
||

27.1 Statisfaction - with apologies to Jagger/Richards | |||||

27.2 ANOVA Man with apologies to Lennon/McCartney | |||||

JMP | R | SAS | 28 An overview of environmental field studies |
||

28.1 Introduction | |||||

28.2 Analytical surveys | |||||

28.3 Impact Studies | |||||

28.4 Conclusion | |||||

28.5 References | |||||

28.6 Selected journal articles | |||||

28.7 Examples of studies for discussion - good exam questions! |

- Medians, Percentiles, Prediction and Tolerance Intervals
- Random Effects - Random effects in two factor models
- Introduction to Maximum Likelihood Estimation and AIC model selection
- A short course on the analysis of Air Quality data. The examples and R code are available here.
- A short course on the analysis of Water Quality data. The examples and code are available here.
- A presentation on the Design and analysis of BACI experiments given at UBC on 2012-10-15.
- A presentation on Mixed Effect Models - Theory and Applications given at UBC on 2012-10-15.

- Statistics for Biologists - A Refresher Course. Available using JMP, SAS, or R.
- Ordinary and Logistic Regression. Available using JMP, SAS, or R.
- Advanced experimental design and analysis. (See relevant chapters above). Available using JMP, SAS, or R.
- Trend Analysis and Environmental Impact Assessment. Available using JMP, SAS, or R.
- Design and analysis of Mark-Recapture Studies. Uses MARK and other specialized software.
- Occupancy models. Uses Presence, GenPres, MARK, and other specialized software.
- Distance Sampling. Uses DISTANCE.
- Introduction to Bayesian Methods for Ecologists. Uses WinBugs/OpenBugs/JAGS and R. The course material is available here.
- Resource selection functions and utilization distributions. Uses R.
- Introduction to *R* Software and Becoming an R-expert

Both versions have been offered through the CMIAE as linked above. Here is a copy of the COMBINED slides (and other material)

*SAS* Tips and Tricks. (A two-day course on SAS tips and tricks for intermediate users of SAS. An introductory course is also available; refer to my Stat-340 course notes for more details.)

- Stat 300 Statistical Communication
- Stat 342 Introduction to Statistical Computing (SAS)
- Stat 430 Design and Analysis of Experiments
- Stat 805 Analysis of Discrete Data (Logistic Regression) (skeleton notes only)
- Stat Linear Models


Email comments or suggestions to Carl Schwarz (cschwarz@stat.sfu.ca)

© 2012 Carl James Schwarz Last updated 2012-01-04