What is the difference between LS Means and Observed Means?

This article highlights the difference between Least Squares Means computed from linear models such as ANOVA and traditional observed means. It also develops an illustration using Excel and XLSTAT.

Some definitions: Observed Means and Least Squares Means

In this article, we will frequently refer to two types of means defined as follows:

Observed Means: Regular arithmetic means that can be computed by hand directly on your data without reference to any statistical model.
Least Squares Means (LS Means): Means that are estimated from a fitted linear model such as ANOVA, rather than computed directly from the raw data.

Dataset to illustrate the difference between Observed Means & LS Means

The data correspond to several ratings given by two judges for two products A & B. The data are unbalanced as the number of ratings for each product differs according to the judge.

One-way ANOVA: Observed Means & LS means are always the same

Imagine a situation where two judges are rating the same product. Each judge rates the product several times. We want to compare the mean grade per judge. In this case, the mean grade of each judge computed by hand will be exactly the same as LS Means arising from a 1-way ANOVA.

Judge	Grade
1	4
1	10
1	4
1	5
1	6
1	8
2	9
2	5
2	7
2	9
2	5
2	6
2	10

Judge 1 has a mean grade of 6.2 and judge 2 has a mean grade of 7.3.
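As a quick check, the observed means can be recomputed from the table above with a few lines of Python. This is only a sketch; the variable names are illustrative. In a one-way ANOVA with Judge as the only factor, the value fitted for each judge is that judge's group mean, so these numbers are also the LS means.

```python
from statistics import mean

# Grades from the one-way example above, grouped by judge.
grades = {
    1: [4, 10, 4, 5, 6, 8],
    2: [9, 5, 7, 9, 5, 6, 10],
}

# Observed mean: plain arithmetic mean of each judge's grades.
observed_means = {judge: mean(values) for judge, values in grades.items()}

for judge, value in observed_means.items():
    print(f"Judge {judge}: {value:.1f}")
# Judge 1: 6.2
# Judge 2: 7.3
```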
Observed means and LS means differ in somewhat more complex models, such as unbalanced multi-way ANOVAs that include interactions.

Unbalanced multi-way designs: Observed Means & LS Means differ

Consider now the original dataset, where each judge rates two products several times, with the following numbers of ratings:

Judge 1 x Product A: 6 ratings
Judge 1 x Product B: 10 ratings
Judge 2 x Product A: 7 ratings
Judge 2 x Product B: 4 ratings

A typical way to analyze such a design is to use a 2-way ANOVA with an interaction term between the two factors (Judge x Product). This is an unbalanced design, as the number of replicates is not the same across the Judge & Product category combinations.
Let’s get back to comparing the mean rating per judge, using observed means first and LS means second.

Using the regular observed means:

Mean of Judge 1 is the mean of the 16 ratings performed by judge 1 (6 for Product A and 10 for Product B). Mean of Judge 2 is the mean of the 11 ratings performed by judge 2 (7 for Product A and 4 for Product B).

Using the LS mean based on a Two-way ANOVA with an interaction:

Mean of Judge 1 is the mean of two numbers:
1. The mean of the 6 replicates of Product A tested by Judge 1
2. The mean of the 10 replicates of Product B tested by Judge 1
Mean of Judge 2 is the mean of two numbers:
1. The mean of the 7 replicates of Product A tested by Judge 2
2. The mean of the 4 replicates of Product B tested by Judge 2
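The two calculations for Judge 1 can be sketched in Python as follows. The ratings below are hypothetical values invented for illustration; only the cell sizes (6 ratings for Product A, 10 for Product B) come from the dataset described above. The function names are also illustrative.

```python
from statistics import mean

# HYPOTHETICAL ratings for Judge 1, invented for illustration only.
# Only the cell sizes (6 for Product A, 10 for Product B) match the article.
judge1 = {
    "A": [4, 5, 6, 5, 4, 6],              # 6 ratings, cell mean 5
    "B": [8, 9, 7, 8, 9, 7, 8, 8, 9, 7],  # 10 ratings, cell mean 8
}

def observed_mean(cells):
    """Arithmetic mean of all ratings pooled together."""
    return mean(r for ratings in cells.values() for r in ratings)

def ls_mean(cells):
    """Unweighted mean of the per-product cell means."""
    return mean(mean(ratings) for ratings in cells.values())

print(observed_mean(judge1))  # 6.875 (110 / 16, pulled toward Product B)
print(ls_mean(judge1))        # 6.5   ((5 + 8) / 2)
```

With these made-up numbers, the observed mean sits closer to the Product B cell mean because Product B contributes 10 of the 16 ratings, whereas the LS mean treats both products equally.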

Summary

Here are the values for the two types of means:

Observed Means vs LS Means

Why should you prefer LS Means compared to Observed Means?

In unbalanced, multi-way designs, LS means are often considered the more reliable estimate because they correct for the design’s imbalance. In our case, the LS mean gives the same weight to both products when estimating the mean rating of each judge. Conversely, for judge 1, the observed mean gives a weight of 6 to product A and a weight of 10 to product B, so the estimate of the judge’s rating is biased toward the ratings of product B.
In balanced designs, or in unbalanced 1-way ANOVA designs, observed means and least squares means are the same.
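The weighting argument can be written out for one judge in the two-product case. The symbols are assumptions introduced here for illustration: $n_A$ and $n_B$ are the numbers of ratings the judge gave to products A and B, and $\bar{y}_A$ and $\bar{y}_B$ are the corresponding cell means.

```latex
\[
\text{Observed mean} = \frac{n_A\,\bar{y}_A + n_B\,\bar{y}_B}{n_A + n_B},
\qquad
\text{LS mean} = \frac{\bar{y}_A + \bar{y}_B}{2}
\]
\[
\text{Observed mean} - \text{LS mean}
  = \frac{(n_A - n_B)\,(\bar{y}_A - \bar{y}_B)}{2\,(n_A + n_B)}
\]
```

The difference vanishes when $n_A = n_B$ (a balanced design) or when the two cell means are equal. For judge 1, with $n_A = 6$ and $n_B = 10$, it reduces to $(\bar{y}_B - \bar{y}_A)/8$, so the observed mean is shifted toward the mean of product B.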

How to obtain LS Means in Excel using XLSTAT?

When running an ANOVA in XLSTAT, the software computes LS means by default.
After opening XLSTAT, go to Modeling Data / ANOVA.

In the General tab, select Grade as the Quantitative dependent variable. Select Judge and Product as the Qualitative explanatory variables.