Monday, June 15, 2009

MLB: Wins and Payroll, a fast fast fast midseason look

Ok, so this is a really bad study (It took all of three minutes to do), but I found it interesting.

The data was sketchy at best and definitely would not hold up even in a class, let alone a paper, but like I said, it took three minutes.


Y is the number of wins as of 1 AM Eastern today (June 15) (and yes, I know some teams have played more games), X1 is payroll as reported by USA Today, and X2 is the number of players CURRENTLY on the disabled list.

Wins= 26.85 + (4.75E-08)Payroll
Payroll is significant at the 4% level (t=2.23)

Which is expected: pay more for talent, get more wins.
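For anyone who wants to redo the three minutes, here is a rough Python sketch of the payroll-only fit. It assumes the spreadsheet at the bottom of this post is exactly the data behind the equation, and the helper name `simple_ols` is made up for illustration. It uses the textbook least-squares formulas, not whatever spreadsheet function was originally used.

```python
import math

# (team, payroll in USD, wins) copied from the spreadsheet in this post
DATA = [
    ("Yankees", 201449189, 36), ("Mets", 148373987, 32),
    ("Cubs", 134809000, 30), ("Red Sox", 121745999, 38),
    ("Tigers", 115085145, 34), ("Angels", 113709000, 32),
    ("Phillies", 113004046, 36), ("Astros", 102996414, 29),
    ("Mariners", 98904166, 30), ("Braves", 96726166, 30),
    ("White Sox", 96068500, 30), ("Giants", 82616450, 34),
    ("Indians", 81579166, 29), ("Blue Jays", 80538300, 34),
    ("Brewers", 80182502, 34), ("Cardinals", 77605109, 34),
    ("Rockies", 75201000, 31), ("Reds", 73558500, 31),
    ("Diamondbacks", 73516666, 27), ("Royals", 70519333, 28),
    ("Rangers", 68178798, 35), ("Orioles", 67101666, 27),
    ("Twins", 65299266, 32), ("Rays", 63313034, 34),
    ("A's", 62310000, 27), ("Nationals", 60328000, 16),
    ("Pirates", 48693000, 30), ("Padres", 43734200, 28),
    ("Marlins", 36834000, 32),
]


def simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x; also returns the slope's t-stat."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return intercept, slope, slope / se_slope


payrolls = [row[1] for row in DATA]
wins = [row[2] for row in DATA]
intercept, slope, t_stat = simple_ols(payrolls, wins)
print(f"Wins = {intercept:.2f} + ({slope:.3g})Payroll, t = {t_stat:.2f}")
```

If the spreadsheet really is the input, the printed intercept, slope, and t-stat should land close to the numbers quoted above.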

What is interesting is to see how well this simple, simple, simple model predicts wins (spreadsheet below). The best relative to payroll? The Red Sox, with about 5 more wins than expected. The worst? The Nationals, with about 14 fewer. The Mets? About 2 fewer games than would be expected (given they have one more game to play than many teams AND won at least 8/9ths of the Friday game, I found this a good sign). LOL.
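To make "expected wins" concrete, here is a tiny Python sketch that plugs a payroll into the rounded equation above and subtracts the result from actual wins. The function name `expected_wins` is just for illustration. Because it uses the rounded coefficients, the results differ from the spreadsheet in the third decimal place or so.

```python
# Rounded coefficients from the payroll-only equation in this post.
INTERCEPT = 26.85
SLOPE = 4.75e-8  # wins per payroll dollar


def expected_wins(payroll):
    """Wins predicted by the payroll-only model."""
    return INTERCEPT + SLOPE * payroll


# (payroll in USD, actual wins) taken from the spreadsheet in this post
teams = {
    "Red Sox": (121745999, 38),
    "Nationals": (60328000, 16),
    "Mets": (148373987, 32),
}

# Residual = actual wins minus model-predicted wins.
residuals = {
    name: won - expected_wins(payroll)
    for name, (payroll, won) in teams.items()
}

for name, gap in residuals.items():
    print(f"{name}: {gap:+.2f} wins versus expectation")
```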

Then using a very noisy variable for injuries (Hey I am a Met fan, I get to use injuries as an excuse):

Wins= 29.93 + (4.11E-08)(Payroll) + (-0.51627)(number of players CURRENTLY on DL)

t-stat for payroll = 1.93, t-stat for players on DL = -1.45

So the point estimates are in the expected direction but not significant at conventional levels. This might be because of the failure to differentiate injury severity (15- vs. 60-day DL), the importance of the player (a mop-up player has the same weight as a Jose Reyes or Carlos Delgado), or the total number of games lost (literally, this is just the number of players reported by Yahoo who are CURRENTLY on the DL).
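The two-variable version can be sketched the same way. The Python below assumes the injuries column in the spreadsheet is the DL count used in the fit, and `ols_two_predictors` is a made-up helper name. It solves the least-squares equations for two predictors directly from centered sums and does not compute the t-stats.

```python
# (team, payroll in USD, players on DL, wins) copied from the spreadsheet in this post
DATA = [
    ("Yankees", 201449189, 5, 36), ("Mets", 148373987, 8, 32),
    ("Cubs", 134809000, 3, 30), ("Red Sox", 121745999, 3, 38),
    ("Tigers", 115085145, 3, 34), ("Angels", 113709000, 3, 32),
    ("Phillies", 113004046, 3, 36), ("Astros", 102996414, 4, 29),
    ("Mariners", 98904166, 6, 30), ("Braves", 96726166, 7, 30),
    ("White Sox", 96068500, 2, 30), ("Giants", 82616450, 2, 34),
    ("Indians", 81579166, 7, 29), ("Blue Jays", 80538300, 5, 34),
    ("Brewers", 80182502, 2, 34), ("Cardinals", 77605109, 4, 34),
    ("Rockies", 75201000, 4, 31), ("Reds", 73558500, 3, 31),
    ("Diamondbacks", 73516666, 6, 27), ("Royals", 70519333, 7, 28),
    ("Rangers", 68178798, 8, 35), ("Orioles", 67101666, 4, 27),
    ("Twins", 65299266, 3, 32), ("Rays", 63313034, 9, 34),
    ("A's", 62310000, 7, 27), ("Nationals", 60328000, 7, 16),
    ("Pirates", 48693000, 5, 30), ("Padres", 43734200, 7, 28),
    ("Marlins", 36834000, 5, 32),
]


def ols_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 using centered sums."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


payrolls = [row[1] for row in DATA]
on_dl = [row[2] for row in DATA]
wins = [row[3] for row in DATA]
b0, b_payroll, b_dl = ols_two_predictors(payrolls, on_dl, wins)
print(f"Wins = {b0:.2f} + ({b_payroll:.3g})(Payroll) + ({b_dl:.4f})(players on DL)")
```

A fancier injury variable (days lost, or who is hurt) would slot in as a replacement for the `on_dl` list without changing the rest.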

Expected wins are based on the payroll-only model.

Team          Payroll ($)  Injuries  Wins  Expected wins  Wins - expected
Yankees         201449189         5    36      36.414706         -0.41471
Mets            148373987         8    32      33.896147         -1.89615
Cubs            134809000         3    30      33.252453         -3.25245
Red Sox         121745999         3    38      32.632579         5.367421
Tigers          115085145         3    34      32.316504         1.683496
Angels          113709000         3    32      32.251202          -0.2512
Phillies        113004046         3    36       32.21775          3.78225
Astros          102996414         4    29      31.742861         -2.74286
Mariners         98904166         6    30      31.548673         -1.54867
Braves           96726166         7    30      31.445322         -1.44532
White Sox        96068500         2    30      31.414114         -1.41411
Giants           82616450         2    34      30.775778         3.224222
Indians          81579166         7    29      30.726556         -1.72656
Blue Jays        80538300         5    34      30.677165         3.322835
Brewers          80182502         2    34      30.660281         3.339719
Cardinals        77605109         4    34      30.537977         3.462023
Rockies          75201000         4    31      30.423895         0.576105
Reds             73558500         3    31      30.345955         0.654045
Diamondbacks     73516666         6    27      30.343969         -3.34397
Royals           70519333         7    28      30.201738         -2.20174
Rangers          68178798         8    35      30.090673         4.909327
Orioles          67101666         4    27      30.039561         -3.03956
Twins            65299266         3    32      29.954032         2.045968
Rays             63313034         9    34       29.85978          4.14022
A's              62310000         7    27      29.812183         -2.81218
Nationals        60328000         7    16      29.718132         -13.7181
Pirates          48693000         5    30      29.166021         0.833979
Padres           43734200         7    28      28.930713         -0.93071
Marlins          36834000         5    32       28.60328          3.39672

1 comment:

John said...

You should try to re-run it with the payroll of the players on the injured reserve (if the data is available). I would expect that a player with a higher salary who is on the injured list has a greater negative effect on the record than a no-name player.