# Groundhog Day 2016: Predicting the Future

**Groundhog Day is next Tuesday, February 2.**

Will we see an early Spring this year, or 6 more weeks of Winter?

**According to tradition, a groundhog emerging from its burrow on that morning can show us what to expect.**

The groundhog will have one of two reactions:

- Sees its own shadow and retreats back into its burrow. Expect 6 more weeks of Winter.
- Doesn’t see its own shadow. Expect an early Spring.

**What does the evidence show?**

To find out, we combined 167 results from 46 different Groundhog Day events with weather data reported by the National Climatic Data Center between 2008 and 2015.

We also looked separately at the longer history (1988-2015) of the most popular Groundhog Day celebration in Punxsutawney, PA.

**Are Groundhog Day predictions more than just random chance?**

**Using Fisher’s Exact Test, we find:**

- Probability that Groundhog Day predictions are just random chance: **12%**
- Probability that Punxsutawney Phil’s predictions are just random chance: **37%**

In general, a pattern would only be considered “statistically significant” if the probability that it’s due to random chance is **less than 5%**. We should remain skeptical.
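
For readers curious how a Fisher’s Exact Test p-value is computed, here is a minimal Python sketch that uses only the standard library. The 2×2 counts are not published in this article. They are an assumption, inferred from the Punxsutawney percentages reported further down (28 years, 1988-2015), and the function name `fisher_exact_two_sided` is ours. Under that assumption the result comes out at about 37%, in line with the figure above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1 = a + b            # total of the first row
    col1 = a + c            # total of the first column
    col2 = b + d            # total of the second column
    n = a + b + c + d

    def weight(k):
        # Unnormalised hypergeometric weight of the table whose top-left
        # cell equals k, with every row and column total held fixed.
        return comb(col1, k) * comb(col2, row1 - k)

    k_min = max(0, row1 - col2)
    k_max = min(row1, col1)
    observed = weight(a)
    # Add up every table that is no more likely than the observed one.
    # Comparing integer weights avoids floating-point tie problems.
    extreme = sum(
        weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed
    )
    return extreme / comb(n, row1)


# ASSUMED counts (inferred from the article's percentages, not published data):
#                early Spring   more Winter
# no shadow            7             1
# shadow              13             7
p_value = fisher_exact_two_sided(7, 1, 13, 7)
print(f"p-value: {p_value:.1%}")
```

A p-value this far above 5% is why the pattern is not treated as statistically significant.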

(We kept this part of the analysis brief, but you can also check out last year’s article for a more in-depth discussion.)

**Was that what we really wanted to know?**

**Remember—the actual goal was to make a prediction.**

What is the probability of an early Spring in 2016?

If a groundhog predicts an early Spring in 2016, how does this information change our prediction? What is the probability of an early Spring now that we know this?

**We can use Bayes’ theorem to figure this out with three relevant pieces of information.**

**p(Early Spring)** The overall probability of an early Spring between 2008 and 2015

This was **50%**

**p(No Shadow | Early Spring)** The probability that groundhogs saw no shadow in previous years that also had an early Spring

This was **64%**

**p(No Shadow)** The overall probability that groundhogs see no shadow

This was **58%**

**If a groundhog predicts an early Spring next Tuesday, we can update the probability of an early Spring:**

p(Early Spring | No Shadow) = p(No Shadow | Early Spring) × p(Early Spring) ÷ p(No Shadow)

64% × 50% ÷ 58% = **55%**
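
The same arithmetic can be written as a short Python sketch. The function name `bayes_posterior` and the variable names are ours, chosen for illustration. The three inputs are the percentages listed above.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Bayes' theorem: p(A | B) = p(B | A) * p(A) / p(B)."""
    if evidence <= 0:
        raise ValueError("the evidence probability p(B) must be positive")
    return likelihood * prior / evidence


p_early_spring = 0.50            # p(Early Spring)
p_no_shadow_given_early = 0.64   # p(No Shadow | Early Spring)
p_no_shadow = 0.58               # p(No Shadow)

posterior = bayes_posterior(p_no_shadow_given_early, p_early_spring, p_no_shadow)
print(f"p(Early Spring | No Shadow) = {posterior:.0%}")
```

Swapping in p(Shadow | Regular Winter), p(Regular Winter) and p(Shadow) gives the corresponding update for a shadow sighting.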

**And if instead a groundhog sees its shadow and predicts 6 more weeks of Winter next Tuesday, we can update the probability of more Winter:**

p(Regular Winter | Shadow) = p(Shadow | Regular Winter) × p(Regular Winter) ÷ p(Shadow)

49% × 50% ÷ 42% = **58%**
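
As a sanity check, the overall shadow probabilities can be rebuilt from the conditional ones with the law of total probability. The sketch below is our own addition, with illustrative variable names. It assumes each year is either an early Spring or a regular Winter, and each groundhog either sees a shadow or does not, so the complements are simply one minus the reported values.

```python
p_early = 0.50                     # p(Early Spring)
p_winter = 1 - p_early             # p(Regular Winter)
p_no_shadow_given_early = 0.64     # p(No Shadow | Early Spring)
p_shadow_given_winter = 0.49       # p(Shadow | Regular Winter)

# Complements of the two reported conditional probabilities.
p_shadow_given_early = 1 - p_no_shadow_given_early
p_no_shadow_given_winter = 1 - p_shadow_given_winter

# Law of total probability: weight each conditional by how often
# its condition occurs, then add the pieces together.
p_no_shadow = (p_no_shadow_given_early * p_early
               + p_no_shadow_given_winter * p_winter)
p_shadow = (p_shadow_given_early * p_early
            + p_shadow_given_winter * p_winter)

print(f"p(No Shadow) = {p_no_shadow:.1%}, p(Shadow) = {p_shadow:.1%}")
```

The results, 57.5% and 42.5%, agree with the 58% and 42% used in the calculations once rounding is taken into account.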

**What about Punxsutawney Phil?**

**Maybe he’s famous for a reason. Either way, he has a longer record, from 1988 to 2015.**

Note that we should expect the pattern to look different from what we saw between 2008 and 2015. The trade-off is between more groundhog predictions per year and more years of weather data. The overall probability of an early Spring in 1988-2015 was much higher: **71.4%**.

**If Punxsutawney Phil predicts an early Spring, we can update the probability of an early Spring:**

p(Early Spring | No Shadow) = p(No Shadow | Early Spring) × p(Early Spring) ÷ p(No Shadow)

35% × 71.4% ÷ 28.6% = **87.5%**

**If Punxsutawney Phil predicts 6 more weeks of Winter, we can update the probability of more Winter:**

p(Regular Winter | Shadow) = p(Shadow | Regular Winter) × p(Regular Winter) ÷ p(Shadow)

87.5% × 28.6% ÷ 71.4% = **35%**

**The groundhogs DO seem to improve predictions.**

The changes are mostly modest, but in every case tested the outcome predicted by a groundhog is *more likely to happen* than the overall probability of that outcome.
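
To make the comparison concrete, this short Python sketch lines up each overall probability with its updated value and reports the difference in percentage points. The numbers are the ones calculated in this article. The labels and variable names are ours.

```python
cases = [
    # (label, overall probability, probability after the groundhog's prediction)
    ("All groundhogs, early Spring", 0.50, 0.55),
    ("All groundhogs, more Winter", 0.50, 0.58),
    ("Punxsutawney Phil, early Spring", 0.714, 0.875),
    ("Punxsutawney Phil, more Winter", 0.286, 0.35),
]

for label, prior, posterior in cases:
    change_points = (posterior - prior) * 100
    print(f"{label}: {prior:.1%} -> {posterior:.1%} ({change_points:+.1f} points)")

# True only if the updated probability beats the overall one in every case.
all_improved = all(posterior > prior for _, prior, posterior in cases)
print("Improved in every case:", all_improved)
```

The largest shift is for Punxsutawney Phil predicting an early Spring, and the smallest is for the combined groundhogs predicting an early Spring.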

Maybe there is something to it.

And if not, we can continue to refine and update our predictions as new data come in.

**Bayesian statistics** allow us to take probabilities observed in the past, update them as new information comes in, and then predict future events more accurately.

You may not realize it, but there is a good chance that you already use Bayesian statistics every day.

They are behind the rating systems on streaming services, the spam filter on your email, Google searches, and of course, weather forecasts.

If there is a smart way to interpret Groundhog Day, this is it.

Now, let’s see what happens…