ValleyCats Playoff Odds

Note: I have learned that some of my assumptions regarding tie-breakers and makeup games were inaccurate. I’ll post revised numbers later today.

Update as of Sunday afternoon: Each team split its last two games, and the playoff odds predictably changed little. I have the ValleyCats at 36.13%, Vermont at 51.40% and Connecticut at 12.46%.

===
We know the ValleyCats are in the playoff hunt. Tri-City is 1.5 games behind Vermont and a half-game back of Connecticut in the Stedler Division, playing its best baseball as we head towards the home stretch. But what kind of chance do the ‘Cats really have of reaching the postseason?

I put together a quick-and-dirty simulation for the rest of the season in an attempt to answer that question. I’ll try not to go into too many details about how I made the simulation, because I don’t expect that many of you care; leave a comment or email me if you want to know more. But a quick and fairly technical summary: I first figured each team’s Pythagorean record, which uses a team’s run differential to date to estimate how it should perform going forward. Then I plugged those records into Bill James’s log5 formula to figure the odds that each team wins each game. I then used these odds to simulate the Tri-City, Vermont and Connecticut games for the rest of the season*, and played out the season 1,000,000 times. (This task is made a lot easier by the fact that the wild card will almost certainly not come out of the Stedler Division, so I only had to worry about three teams.)
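For anyone who does care, here is a minimal sketch of the two formulas mentioned above. It assumes the classic Pythagorean exponent of 2 (the summary does not say which exponent was used), and the function names and sample run totals are made up for illustration rather than taken from the actual simulation.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    exponent=2.0 is the classic Bill James value; it is an assumption here.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def log5(p_a, p_b):
    """Bill James's log5: chance that a team with true winning
    percentage p_a beats a team with true winning percentage p_b."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


# Hypothetical run totals, for illustration only.
team_a = pythagorean_pct(runs_scored=300, runs_allowed=280)
team_b = pythagorean_pct(runs_scored=260, runs_allowed=290)
print(f"A: {team_a:.3f}  B: {team_b:.3f}  P(A beats B): {log5(team_a, team_b):.3f}")
```

A useful sanity check on log5 is that any team facing a .500 opponent should win at exactly its own winning percentage, and two equal teams should come out at 50-50.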

*I included makeups for games that have been lost to rain this year – Tri-City vs Jamestown, Vermont vs Batavia and Staten Island – because they will be played if they affect the pennant race at the season’s end. (My mistake – these games will not be made up.)
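The season simulation itself can be sketched roughly as follows. This is an illustration of the general approach, not the actual code: the records, the schedule and the per-game win probabilities below are placeholders, and comparing teams by wins minus losses (equivalent to games behind) is my assumption about how to handle teams that play different numbers of games.

```python
import random
from collections import Counter


def simulate_division(records, schedule, trials=10_000, seed=0):
    """Play out the remaining schedule many times.

    records:  {team: (wins, losses)} for the contenders only.
    schedule: list of (team_a, team_b, p_a), where p_a is the chance that
              team_a wins; either team may be a non-contender.
    Returns {tuple of teams finishing first: percent of trials}.
    """
    rng = random.Random(seed)
    outcomes = Counter()
    for _ in range(trials):
        # Wins minus losses orders teams the same way games behind does.
        margin = {team: w - l for team, (w, l) in records.items()}
        for team_a, team_b, p_a in schedule:
            if rng.random() < p_a:
                winner, loser = team_a, team_b
            else:
                winner, loser = team_b, team_a
            if winner in margin:
                margin[winner] += 1
            if loser in margin:
                margin[loser] -= 1
        best = max(margin.values())
        leaders = tuple(sorted(t for t, m in margin.items() if m == best))
        outcomes[leaders] += 1
    return {leaders: 100.0 * n / trials for leaders, n in outcomes.items()}


# Placeholder inputs: NOT the real standings, schedule or odds.
records = {"TRI": (30, 28), "VER": (32, 27), "CT": (31, 28)}
schedule = [("TRI", "VER", 0.52), ("TRI", "OTHER", 0.55),
            ("VER", "OTHER", 0.50), ("CT", "OTHER", 0.45)] * 4
for leaders, pct in sorted(simulate_division(records, schedule).items()):
    print(" + ".join(leaders), f"{pct:.2f}%")
```

Each key in the result is either a single team (an outright division win) or a group of teams tied for first, which is the same breakdown as the results that follow.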

Here were the results:

TRI wins: 30.3244%
VER wins: 45.3815%
CT wins: 8.3277%
TRI + VER tie: 9.5976%
TRI + CT tie: 2.2061%
CT + VER tie: 2.9302%
3-way tie: 1.2325%

That comes out to about a 16% chance that we’ll end up in some sort of tie. The same log5 process I used above can create odds that each team wins a head-to-head play-in game (there is no play-in game; the tiebreaker is divisional record). That lets us estimate the full odds that each team makes the playoffs (for simplicity’s sake, I assumed that each team would win a three-way tie one-third of the time):

Tri-City: 37.38%
Vermont: 51.83%
Connecticut: 10.80%
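The arithmetic behind that last step can be sketched as below. The outcome shares are the simulation results listed earlier; the head-to-head tiebreak odds are not given in this post, so the 50-50 values here are placeholders and the output will not exactly match the percentages above. The names are hypothetical.

```python
# Simulated outcome shares, in percent of 1,000,000 seasons.
OUTRIGHT = {"TRI": 30.3244, "VER": 45.3815, "CT": 8.3277}
TWO_WAY = {("TRI", "VER"): 9.5976, ("TRI", "CT"): 2.2061, ("CT", "VER"): 2.9302}
THREE_WAY = 1.2325


def tie_chance():
    """Total chance that the division ends in any kind of tie."""
    return sum(TWO_WAY.values()) + THREE_WAY


def playoff_odds(first_team_wins_tie):
    """Fold the tie scenarios into each team's outright share.

    first_team_wins_tie maps each two-way pairing to the chance (0 to 1)
    that the first-listed team wins the tiebreak.
    """
    odds = dict(OUTRIGHT)
    for (team_a, team_b), share in TWO_WAY.items():
        p = first_team_wins_tie[(team_a, team_b)]
        odds[team_a] += share * p
        odds[team_b] += share * (1 - p)
    for team in odds:
        # Each team is assumed to win a three-way tie one-third of the time.
        odds[team] += THREE_WAY / 3
    return odds


# Placeholder coin-flip tiebreak odds, NOT the log5 values used in the post.
coin_flip = {pair: 0.5 for pair in TWO_WAY}
print(f"Chance of any tie: {tie_chance():.4f}%")
for team, pct in playoff_odds(coin_flip).items():
    print(f"{team}: {pct:.2f}%")
```

Whatever tiebreak odds are plugged in, the three teams’ totals should still add up to 100%, which is a quick way to check the bookkeeping.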

I was surprised that Connecticut’s odds are so low. But if you look at run differential, the Tigers just haven’t been very good this season. They rank dead last in runs scored and have a worse run differential than all but two teams; their Pythagorean record pegs Connecticut as a .426 team, rather than a .500 one. The Tigers are 10-5 in one-run games, and will probably not be as lucky going forward.

The ValleyCats have a better run differential and expected record than Vermont, but the 1.5-game edge in the standings is enough for Vermont to remain the favorite. Still, I can assure you that the ValleyCats’ playoff odds are as high as they’ve been all season.

Two major caveats come with these results. The first is that my simulation does not currently discriminate between home and road games, treating them all equally. I will probably build in an adjustment for this in the next edition of my playoff odds. The second is less clear-cut. Right now, all of my predictions are based on full-season data, so games in June count just as much as games in August. I am not sure this is optimal, particularly in a league where players get promoted relatively frequently; when I do this again, I’ll consider weighting recent results more heavily. It clearly makes a difference in this race: Vermont has played terribly of late, while the ValleyCats are hot. If you think recent results are more predictive than early-season games, you should consider Tri-City somewhat more likely to make the playoffs than these numbers suggest, and Vermont somewhat less likely.

Kevin Whitaker

1 Comment

I’m still debating whether or not to add an adjustment for recent play. On the one hand, I do think recent play is more predictive than play earlier in the year, particularly in a low-level league where rosters, player usage and player performance all tend to fluctuate relatively steeply. But on the other hand, a team that has just faced an easy stretch of schedule would be overrated by a system that rewards recent performance, because the recent schedule varies much more between teams than the year-to-date schedule does. I’m toying with adding a recent performance factor with some sort of back-of-the-envelope strength of schedule adjustment, but I’m not sure if that will work.
