“Scorecasting”: Are the Nationals hot or cold?

 


“There are three kinds of lies: lies, damned lies, and statistics.”  –Mark Twain

This quote sets the tone for one of my favorite chapters of Scorecasting: The Hidden Influences Behind How Sports Are Played and Games Are Won by Tobias J. Moskowitz and L. Jon Wertheim.

Scorecasting is sold as a “Freakonomics” of sports, using statistics and common sense to debunk some of sports’ biggest myths and misconceptions.  The above-noted chapter is titled “Damned Statistics” and subtitled “Why ‘four out of his last five’ almost surely means four of six.”  In it, the authors make the simple argument that people selectively choose numbers to support their point.  To quote:

We are bombarded by stats when we watch games, but the data are chosen selectively and often focus on small samples and short-term numbers.  When we’re told that a player has reached base in “four of his last five at-bats,” we should assume right away that it’s four of his last six.  Otherwise, rest assured, we’d have been told that streak was five out of six.  Clearly, a team that “has lost three in a row” has dropped only three of its last four–and possibly three of five or three of six or…otherwise it would have been reported as a four-game losing streak.

To apply this principle to the current Nationals, realize that going into Friday night’s game against the Pirates:

  • The Nationals have won 2 games in a row
  • The Nationals have lost 3 of their last 5
  • The Nationals have won 5 of their last 8
  • The Nationals have lost 7 of their last 12

All of the above statistics are correct, but people can pick and choose whatever numbers they want to support their position.  Are the Nats hot?  Sure.  Are the Nats cold?  Yeah, look at the “numbers”.
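To see how all four numbers can be true at once, here is a minimal Python sketch.  The win/loss string is my own reconstruction, not an official game log: it is simply the one 12-game sequence (oldest game first) that fits all four bullets above.  The function name `record_over_last` is made up for this illustration.

```python
# Assumed reconstruction, not an official game log: the 12-game sequence
# (oldest game first) that is consistent with all four bullets above.
results = "LLLLWWWLLLWW"


def record_over_last(results, n):
    """Return (wins, losses) over the most recent n games."""
    window = results[-n:]
    return window.count("W"), window.count("L")


# Same team, same games, four different window sizes.
for n in (2, 5, 8, 12):
    wins, losses = record_over_last(results, n)
    print(f"Last {n:>2} games: {wins}-{losses}")
```

Nothing about the games changes between lines of output; only the window does.  Whoever picks the window picks the story.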

Beware small sample sizes.  To get an accurate picture, you need to take a step back.

In any event, the statistic you’re looking at is probably a lie.  Or a damned lie.

 


4 thoughts on ““Scorecasting”: Are the Nationals hot or cold?”

  1. That’s a good point. Small sample sizes abound when folks try to explain trends in the sports world.

    But another way to look at it, and my preferred way, is that the most important sample is the most recent one. I’m less interested in your BABIP this year compared to last year than I am in your last week’s BABIP. I always feel that 7-14 days is the best rolling sample size, since it shows the most recent stats backed up by a good 10 or so games (usually, in baseball) of data. That’s enough time to have a streak or slump.

    1. Danny, I agree that rolling small sample sizes are not useless. You can definitely use 10 games or so to figure out if someone is streaking or slumping. My problem is when people try to manipulate the data. People say things like “they’ve lost four of five” because a five-game sample supports their predetermined narrative. In any event, a small sample size is not likely to predict future performance, especially in the long term.
