The Illusion of Control


Image courtesy Nationals101

I had scarcely posted my latest projection on Twitter when I got hit by a barrage of tweets asking me to adjust the relief pitching innings.

That led to another discussion as to whether I had too many innings in total. The revised model has a total of 1475.2 IP for the whole Nats staff. That’s a lot of innings.

But I’ve decided to freeze my projections at 98 wins, and 1475.2 innings, because, at this point, I feel like I’m tinkering at the edges.

The hard thing about doing this, I find, is that as I keep running the model, something else comes up that I think I can control for. Pretty soon, I’m drowning in complications.

The whole exercise reminds me of a flyer I saw once in the Old Building of the London School of Economics. The flyer was promoting a special lecture that was to be given on “The Illusion of Control in the Social Sciences,” and it had an illustration of a man, hunched over, buried in charts, graphs, data tables, volumes of historical statistics, adding-machine-tape, etc. The point was that we can model and predict, and in doing so, we may fall prey to the notion that we can actually control the universe we’re trying to observe. Of course, that’s not entirely true.

Projecting the 2013 Nationals: Extra Innings

When I projected that the 2013 Nats were going to win 94 games, I did so with a bit of trepidation. Not only did this mean projecting a performance so good that it would have been literally unbelievable only a few years before, but I also have certain doubts about the construction of my model.

As you might have gathered from the title of this post, I think my model has been systematically under-counting playing time for pitchers and hitters. In keeping with the Top of the Inning/Bottom of the Inning structure of the Natstradamus projections, I’ll deal with the pitching issues first, and then with the batting problems in the bottom of the inning.

EDIT: Astute readers noted that I should have reduced relief pitcher innings by as much as I increased starting pitching innings. I have amended the relevant analysis. This results in a 98-win total. 

Executive Summary for the TL;DR Crowd: Our earlier projection wasn’t as accurate as it should have been in counting playing time. A slight adjustment in innings pitched for starters, with a corresponding reduction in relief pitching innings, yielded a decrease of about 2 runs allowed. A more nuanced look at plate appearances by the starting line-up yielded an astonishing increase in runs scored, from 692 to 725. This revises our win projection for the 2013 Nats to 98 wins.

Innings, Limits, and Other Stuff to Tear Your Hair Out With

First, pitching. If you look back at the projected innings pitched column in my pitching runs allowed projections, you will notice that I assume that pitchers in the starting rotation will pitch about 190 innings each, with Strasburg pitching only 180. How does that stack up with reality?

  • Gio Gonzalez (199.1 IP);
  • Jordan Zimmermann (195.2 IP)
  • Edwin Jackson (189.2 IP).

Looking at things like this, it’s starting to look like our 180-inning starting rotation baseline is off by a little bit. Is it really, though? None of the top three for the Braves (Minor, Hudson, Hanson) pitched over 180 innings last year. The Phillies had Hamels (215.1) and Lee (211.0), then a sharp drop-off (injuries). The Mets had Dickey (232.2) and Niese (190.1), and then a precipitous dropoff to Santana (117.0).

Things get a bit better when we look at the Reds, whose top five were remarkably consistent as far as innings, with Cueto (217), Latos (209.1), Bailey (208), Arroyo (202), and Leake (179). Likewise, the Giants got a lot of innings out of their starters, with Cain (219.1), Bumgarner (208.1), Vogelsong (189.2), Lincecum (186), and Zito (184.1).

In fact, it’s the rare National League team that gets more than 180 innings from all of its top five starters–only the Giants managed this in 2012, and we all know how that worked out for them, right?

Anyway, returning to our projections: is there a better way we can match the innings expectations for Nationals starting pitchers? Maybe we can. During the height of the Strasburg Shutdown hysteria last year, I wrote that the organization has a general innings-limiting principle:

The Nats have a policy–and a remarkably enlightened one, at that–of limiting starting-pitcher workloads to 120% of the innings a pitcher had pitched the previous year, wherever those innings happened (whether as an amateur, the minor leagues, or the majors). For pitchers returning from major injuries, the innings limit seems to be about 120% of the pitcher’s previous single-season career high total innings pitched.

The conventional wisdom is that this limit may not apply to pitchers like Gio Gonzalez (age 27) and Dan Haren (age 32). Jordan Zimmermann (age 26) might have arguably “aged out” of this system, too, since he pitched 195.2 innings last year. Detwiler (age 26) might have aged out, as well, but last year’s 164.1 IP represented his professional maximum, so let’s assume we’re stretching him out more carefully and put him on the limit. Strasburg (age 24), it should go without saying, is probably under this silent limit as well.
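For what it’s worth, the 120% rule is easy to sketch in code. The snippet below is only an illustration: the function names are mine, the 150-inning input is a made-up workload rather than any pitcher’s actual total, and it assumes the usual box-score notation in which 195.2 means 195 and two-thirds innings.

```python
def ip_to_innings(ip: float) -> float:
    """Convert box-score notation (195.2 = 195 and 2/3 innings) to decimal innings."""
    whole = int(ip)
    outs = round((ip - whole) * 10)  # the digit after the point counts outs: 0, 1, or 2
    return whole + outs / 3


def innings_limit(previous_ip: float, factor: float = 1.2) -> float:
    """Innings cap as a multiple (120% by default) of the previous workload."""
    return factor * ip_to_innings(previous_ip)


# Hypothetical pitcher who threw 150 innings the year before.
example_limit = innings_limit(150.0)
print(f"Limit: {example_limit:.1f} innings")
```

The cap comes back in decimal innings, so a result of 181.2 here would mean 181 and one-fifth innings, not box-score notation.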

Applying those limits, and looking at last year’s performances, we get the following:

  • Stephen Strasburg. 120% of last year’s innings works out to 190.2 innings for Strasburg. Plugging that into our model yields 54.23 runs allowed, an increase of 3.03 runs.
  • Jordan Zimmermann. JZ pitched 195.2 innings. It would be foolish to assume he would pitch any more. Let’s assume he pitches 195 innings, then. That works out to 80.38 runs allowed, an increase of 2.06 runs.
  • Gio Gonzalez. 199 innings is a lot, but he pitched over 200 innings in the two preceding years, so I don’t think it’s too much of a stretch to give Gio 200 innings in 2013. Ten more innings of Gio than in our initial model yields 84.67 runs, an increase of 4.24 runs.
  • Ross Detwiler. Detwiler’s 151 innings in 2012 was a career high for him. Taking 120% of that yields 181 innings. Fortunately, the old model pegged him at 180 innings to begin with. We’ll leave well enough alone, then.
  • Dan Haren. Haren’s a little harder to judge. He only pitched 176.2 innings in 2012, but before his back got balky, he pitched well in excess of 200 innings for seven consecutive seasons. Various projections have him pitching as many as 218 innings and as few as 170. Let’s say he recovers form and pitches 190 innings–which is what we had in the original model. Great.

After adjusting for an increase in innings pitched, we see that the Nats give up a few more runs– 9.33 runs. That’s enough to cost them one full game in the Natstradamus projection–so that leaves them with 93 wins, instead.

Not so fast. You will notice that we’ve increased Gio’s innings by 10, Strasburg’s innings by 10, and Zimmermann’s innings by 5. That means we need to reduce relief pitcher innings accordingly. If we reduce Craig Stammen’s 110 innings to 95 innings (-6.6 runs allowed) and Zach Duke’s innings from 90 to 80 (-4.8 runs allowed), we actually end up saving about 2 runs. That keeps us steady at 94 wins for now. But how about the hitting?
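Here is a rough sketch of that bookkeeping. It assumes, as the figures above imply, that the model charges each pitcher innings × xFIP / 9 runs. The xFIP values are the ones from the pitching table in Part 2, and the function and variable names are just for illustration.

```python
def runs_allowed(innings: float, xfip: float) -> float:
    """Projected runs allowed, assuming runs = innings * xFIP / 9."""
    return innings * xfip / 9


# (pitcher, xFIP, old innings, new innings); 190.2 IP is 190 and 2/3 innings
changes = [
    ("Strasburg", 2.56, 180, 190 + 2 / 3),
    ("Zimmermann", 3.71, 190, 195),
    ("Gonzalez", 3.81, 190, 200),
    ("Stammen", 3.96, 110, 95),
    ("Duke", 4.34, 90, 80),
]

deltas = {
    name: runs_allowed(new, xfip) - runs_allowed(old, xfip)
    for name, xfip, old, new in changes
}
starter_delta = sum(deltas[n] for n in ("Strasburg", "Zimmermann", "Gonzalez"))
net_delta = sum(deltas.values())

for name, delta in deltas.items():
    print(f"{name:<11}{delta:+6.2f}")
print(f"Starters: {starter_delta:+.2f}  Net: {net_delta:+.2f}")
```

The starters add a bit over 9 runs and the two long relievers give back a bit over 11, which is where the net saving of about 2 runs comes from.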

Batters: Up.

The crude assumption built into the model was that each of the starting position players got 600 plate appearances. This is, of course, false. The ever-astute David Huzzard reminded me that the number of plate appearances varies with position in the batting order. Fortunately, Baseball Reference lets us look at exactly how many plate appearances, on average, each batting order position got in the National League in 2012. As you can see, the lead-off batter gets, on average, 750 plate appearances: 25% more than our model assumed! What does it look like?

Split PA
Batting 1st 750
Batting 2nd 732
Batting 3rd 716
Batting 4th 699
Batting 5th 684
Batting 6th 666
Batting 7th 647
Batting 8th 625
Batting 9th 606
Generated 2/18/2013.

In fact, we see that in the NL, the only batting order position that gets even close to 600 plate appearances is the number 9 spot, which is usually the pitcher’s spot! Safe to say, then, that the model is broken as far as runs scored. To fix it, we need to figure out what the batting order is going to be and award plate appearances in proportion to that player’s spot in the batting order. To keep things consistent with our defensive statistics, we’ll assume that each “every day” position player appears in 150 games. With that in mind, let’s assign some plate appearances to a hypothetical order:

Player PA
Denard Span 695
Jayson Werth 678
Bryce Harper 663
Adam LaRoche 647
Ryan Zimmerman 633
Ian Desmond 617
Danny Espinosa 599
Wilson Ramos/Kurt Suzuki 579
Pitchers 561

That leaves us with some 453 plate appearances to distribute among the other bench players. Let’s assume, crudely, that we distribute them evenly among Tracy, Moore, Lombardozzi, and Bernadina, giving them 113 plate appearances each. Let’s further assume that the “Pitchers” plate appearances are evenly distributed among the five starting pitchers, giving each of them 112.
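The plate-appearance arithmetic can be sketched roughly as follows. This assumes the everyday figures come from scaling each league-average slot total by 150/162, which reproduces every number in the list above except the lead-off spot (694 rather than 695, presumably because the published averages are themselves rounded). The variable names are mine.

```python
# NL average plate appearances by batting-order slot, 2012 (from the table above)
nl_pa_by_slot = [750, 732, 716, 699, 684, 666, 647, 625, 606]

GAMES_PLAYED = 150   # assumed games for an "every day" player
SEASON_GAMES = 162

# Scale each slot's full-season total down to 150 games
scaled = [round(pa * GAMES_PLAYED / SEASON_GAMES) for pa in nl_pa_by_slot]

# Plate appearances actually assigned in the post, slots 1 through 9
assigned = [695, 678, 663, 647, 633, 617, 599, 579, 561]

leftover = sum(nl_pa_by_slot) - sum(assigned)   # PA left for the bench
bench_each = leftover / 4                       # Tracy, Moore, Lombardozzi, Bernadina
pitcher_each = assigned[-1] / 5                 # five starting pitchers share slot 9

print(scaled)
print(leftover, bench_each, pitcher_each)
```

The leftover works out to the 453 plate appearances mentioned above, or a little over 113 per bench player and 112 per starting pitcher.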

The results are shocking:

Player Name        4-yr total PA  4-yr total wRC  4-yr avg wRC/PA  Projected PA  Projected wRC
Jayson Werth                2803             425         0.151623           678         102.80
Ryan Zimmerman              2844             426         0.149789           633          94.82
Tyler Moore                  171              26         0.152047           113          17.18
Bryce Harper                 597              86         0.144054           663          95.51
Adam LaRoche                2622             361         0.137681           647          89.08
Denard Span                 2671             334         0.125047           695          86.91
Wilson Ramos                 613              76         0.123980           290          35.95
Ian Desmond                 1849             214         0.115738           617          71.41
Danny Espinosa              1428             164         0.114846           599          68.79
Roger Bernadina             1150             121         0.105217           113          11.89
Chad Tracy                   845              85         0.100592           113          11.37
Kurt Suzuki                 2703             274         0.101369           290          29.40
Steve Lombardozzi            448              42         0.093750           113          10.59
Stephen Strasburg             83               3         0.036145           112           4.05
Drew Storen                    2               0         0.000000             0           0.00
Dan Haren                    240              19         0.079167           112           8.87
Craig Stammen                 90               3         0.033333            30           1.00
Jordan Zimmermann            166               4         0.024096           112           2.70
Zach Duke                    226               1         0.004425             0           0.00
Tyler Clippard                14               0         0.000000             0           0.00
Gio Gonzalez                  84              -5        -0.059524           112          -6.67
Ross Detwiler                 97              -9        -0.092784           112         -10.39
Ryan Mattheus                  1               0         0.000000             0           0.00
Rafael Soriano                 0               0         0.000000             0           0.00
Bill Bray                      0               0         0.000000             0           0.00
Team Total wRC                                                                             725

That’s a huge jump in runs scored, from 692 up to 725!
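For anyone who wants to check the arithmetic, here is a minimal sketch of how the table appears to be built: each player’s four-year wRC/PA rate times his projected plate appearances, summed over the roster. The names are mine, and relievers projected for zero plate appearances are left out because they contribute nothing to the total.

```python
# (player, 4-year PA, 4-year wRC, projected PA), copied from the table above
roster = [
    ("Jayson Werth", 2803, 425, 678), ("Ryan Zimmerman", 2844, 426, 633),
    ("Tyler Moore", 171, 26, 113), ("Bryce Harper", 597, 86, 663),
    ("Adam LaRoche", 2622, 361, 647), ("Denard Span", 2671, 334, 695),
    ("Wilson Ramos", 613, 76, 290), ("Ian Desmond", 1849, 214, 617),
    ("Danny Espinosa", 1428, 164, 599), ("Roger Bernadina", 1150, 121, 113),
    ("Chad Tracy", 845, 85, 113), ("Kurt Suzuki", 2703, 274, 290),
    ("Steve Lombardozzi", 448, 42, 113), ("Stephen Strasburg", 83, 3, 112),
    ("Dan Haren", 240, 19, 112), ("Craig Stammen", 90, 3, 30),
    ("Jordan Zimmermann", 166, 4, 112), ("Gio Gonzalez", 84, -5, 112),
    ("Ross Detwiler", 97, -9, 112),
]

# Projected wRC = (4-year wRC / 4-year PA) * projected PA
projected = {
    name: wrc / pa * proj_pa
    for name, pa, wrc, proj_pa in roster
}
team_total = sum(projected.values())

for name, value in projected.items():
    print(f"{name:<18}{value:7.2f}")
print(f"Team total wRC: {team_total:.0f}")
```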

Putting it Together

Having adjusted our playing-time expectations somewhat, our revised projection has the 2013 Nats allowing about 589 runs (the original 591, less the roughly 2 runs saved above), while scoring 725 runs. Running that through the Pythagorean Win Expectation Formula gives us a revised win projection for the 2013 season of 98 wins, or four more than we had initially projected. The vast undercount of offensive plate appearances made a huge difference in terms of runs scored, and added two whole wins. The increase in starting pitching at the expense of middle relief yields two more wins.

There are a few caveats, of course. Naturally, this all assumes that every player involved will stay healthy all year, and that they all perform according to their four-year trailing average performances. A realignment of the batting order will affect runs scored in very real ways: this is particularly true in the case of Bryce Harper. The current line up puts two left-handed power hitters, Harper and LaRoche, back-to-back, which may be suboptimal in matchup situations. But moving Harper down in the order will deprive him of plate appearances and run-creating chances.

I have goosebumps just thinking about this.

Projecting the 2013 Nationals, Part 4: 94 Wins or Bust?

Having determined that the 2013 Nationals are projected to allow 591 runs and score 692 runs, how many games does that mean they will win?

This is a job for the Pythagorean win expectation formula:

Win % = 1 / (1 + (runs allowed / runs scored)^2)

Which yields us the shocking total: The 2013 Nationals are projected to win 94 games. That’s right. They’re projected to have a record of 94-68, playing .579 baseball.
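As a quick sanity check, the formula can be sketched in a few lines. The function name and the exponent parameter are my own; the exponent of 2 is the classic Pythagorean form used here, and 162 is the length of the regular season.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and runs allowed."""
    return 1 / (1 + (runs_allowed / runs_scored) ** exponent)


SEASON_GAMES = 162

win_pct = pythagorean_win_pct(runs_scored=692, runs_allowed=591)
expected_wins = win_pct * SEASON_GAMES

print(f"Win %: {win_pct:.3f}, expected wins: {expected_wins:.1f}")
```

With 692 runs scored and 591 allowed, this comes out a little under 94 wins, which rounds to the 94 quoted above.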

Just let that wash over you for a second. I just projected this team to win over 90 games. This is exhilarating. This is terrifying.

And the thing that gets me about this is that these are all fairly conservative estimates. I’ll be playing with these numbers from time to time over Spring Training. But I’m going to go with this as my baseline estimate for 2013.

To be honest, I sat on these results for about a month, looking at them over and over and over again, utterly terrified of posting them. I am not used to being this optimistic about the Washington Nationals, ever. And now, suddenly, I am in the position of rooting not for the underdog, or the lovable loser–no, this year, I am rooting for the favorite. This is bizarre and wonderful and terrifying at once.

Projecting the 2013 Nationals, Part 3: Offense

Now we come to the fun part of the inning: how many runs does the home team score? The model projects that the 2013 Nationals will score 693 runs.

Assuming that an everyday position player will get about 600 plate appearances, and assuming that the plate appearances of the two catchers, Suzuki and Ramos, are divided evenly, we end up with a table that looks something like this:

Player Name        4-yr total PA  4-yr total wRC  4-yr avg wRC/PA  Projected PA  Projected wRC
Jayson Werth                2803             425         0.151623           600          90.97
Ryan Zimmerman              2844             426         0.149789           600          89.87
Tyler Moore                  171              26         0.152047           150          22.81
Bryce Harper                 597              86         0.144054           600          86.43
Adam LaRoche                2622             361         0.137681           600          82.61
Denard Span                 2671             334         0.125047           600          75.03
Wilson Ramos                 613              76         0.123980           300          37.19
Ian Desmond                 1849             214         0.115738           600          69.44
Danny Espinosa              1428             164         0.114846           600          68.91
Roger Bernadina             1150             121         0.105217           150          15.78
Chad Tracy                   845              85         0.100592           100          10.06
Kurt Suzuki                 2703             274         0.101369           300          30.41
Steve Lombardozzi            448              42         0.093750           150          14.06
Stephen Strasburg             83               3         0.036145           150           5.42
Drew Storen                    2               0         0.000000             0           0.00
Dan Haren                    240              19         0.079167           150          11.88
Craig Stammen                 90               3         0.033333            30           1.00
Jordan Zimmermann            166               4         0.024096           150           3.61
Zach Duke                    226               1         0.004425            30           0.13
Tyler Clippard                14               0         0.000000             0           0.00
Gio Gonzalez                  84              -5        -0.059524           150          -8.93
Ross Detwiler                 97              -9        -0.092784           150         -13.92
Ryan Mattheus                  1               0         0.000000             0           0.00
Rafael Soriano                 0               0         0.000000             0           0.00
Bill Bray                      0               0         0.000000             0           0.00
Team Total wRC                                                                             693

As excited as we’ll all be to follow Bryce Harper in his quest to beat Mike Trout’s insane age-20 season, it’s instructive to look at this table. Jayson Werth and Ryan Zimmerman are projected to get 91 and 90 wRC respectively. Harper is expected to do great things–86 wRC–but it’s worth noting just how much a healthy Werth and Zimmerman mean to the Nationals line-up.

Notice also that the line-up is remarkably deep. Let’s look at it from the point of view of a possible batting order:

  1. Denard Span, wRC 75.03
  2. Jayson Werth, wRC 90.97
  3. Bryce Harper, wRC 86.43
  4. Adam LaRoche, wRC 82.61
  5. Ryan Zimmerman, wRC 89.87
  6. Ian Desmond, wRC 69.44
  7. Danny Espinosa, wRC 68.91
  8. Wilson Ramos, wRC 37.19; plus Kurt Suzuki, wRC 30.41

Those first five batters, however you order them, are pretty impressive. That should make for a much deeper line-up than we’re used to seeing here in DC.

So, what does this all mean? Tune in next time as we discuss how this all fits together in Part 4.

Projecting the 2013 Nationals, Part 2: Pitching and Defense

In Part 1, we announced the starting line-up. Let’s see how many runs the pitching allows in 2013. My model conservatively estimates that in 2013, Nats pitching will account for 609 runs scored against the Nats, but defense will “save” 18 runs. Thus, the model conservatively predicts that 591 runs will be scored against the 2013 Nationals.

Here’s the table for pitching:

Pitcher Name        Projected IP  4-Yr Moving Avg xFIP  Projected Runs Allowed
Stephen Strasburg            180                  2.56                   51.20
Gio Gonzalez                 190                  3.81                   80.43
Jordan Zimmermann            190                  3.71                   78.32
Ross Detwiler                180                  4.44                   88.80
Dan Haren                    190                  3.37                   71.14
Rafael Soriano                70                  3.60                   28.00
Drew Storen                   70                  3.46                   26.91
Tyler Clippard                70                  3.54                   27.53
Ryan Mattheus                 70                  4.48                   34.84
Craig Stammen                110                  3.96                   48.40
Zach Duke                     90                  4.34                   43.40
Bill Bray                     65                  4.19                   30.26
TOTAL RUNS ALLOWED                                                      609.25

You will notice that my initial guesses for innings pitched for starting pitchers are quite low. We’ll tweak those later, but for now, I’m going to assume that these are good enough to go by.
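The numbers in the table are consistent with a very simple rule: projected runs allowed equals projected innings times xFIP, divided by nine. The sketch below rebuilds the total on that assumption and then subtracts the 18 runs the defense is projected to save. The names are mine, not part of any published model.

```python
# (pitcher, projected IP, 4-year xFIP), copied from the table above
staff = [
    ("Stephen Strasburg", 180, 2.56), ("Gio Gonzalez", 190, 3.81),
    ("Jordan Zimmermann", 190, 3.71), ("Ross Detwiler", 180, 4.44),
    ("Dan Haren", 190, 3.37), ("Rafael Soriano", 70, 3.60),
    ("Drew Storen", 70, 3.46), ("Tyler Clippard", 70, 3.54),
    ("Ryan Mattheus", 70, 4.48), ("Craig Stammen", 110, 3.96),
    ("Zach Duke", 90, 4.34), ("Bill Bray", 65, 4.19),
]

DEFENSIVE_RUNS_SAVED = 18  # from the defensive projection described in this post

# xFIP is on a per-nine-innings scale, hence the division by 9
pitching_runs = sum(ip * xfip / 9 for _, ip, xfip in staff)
total_innings = sum(ip for _, ip, _ in staff)
runs_against = pitching_runs - DEFENSIVE_RUNS_SAVED

print(f"Innings: {total_innings}")
print(f"Pitching runs: {pitching_runs:.2f}")
print(f"Runs against after defense: {runs_against:.0f}")
```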

A similar table of the defensive statistics would be tedious to recount, so let me sum it up with a few general notes:

  • According to these projections, the three biggest defensive assets on the 2013 Nationals are Ryan Zimmerman, Denard Span, and Danny Espinosa.
  • Ryan Zimmerman should save 7.6 runs–best on the team. The high number of defensive runs saved here underscores just how important it is for the Nats to keep him healthy.
  • Danny Espinosa has been the target of a lot of fan frustration lately, especially given his struggles at the plate. His defense, however, is outstanding. The model projects that he will save 5.2 runs.
  • The newest addition to the Nats defense, center fielder (and noted ichthyophobe) Denard Span, is projected to save 4.6 runs. Bryce Harper had a UZR of 9.7 as a center fielder last year, so just looking at that, you might think that Span is a lousy center fielder compared to Harper. You’d be wrong. UZR is notoriously unstable; we need at least 3 years of data to get a good sample. Span actually posted a UZR of 9.0 as a center fielder for the Twins in 2011; likewise, as Twins CF in 2012, he posted a UZR of 8.5. As you can see, the projection for Span seems very conservative, but it takes into account some bad defensive years for Span (2008 and 2009). I would expect Span actually to outperform this projection.

Right, that wraps up the top of the inning. Tune in to Part 3, where we’ll discuss how the offense looks.

Projecting the 2013 Nationals, Part 1: Ground Rules & Starting Line-ups

Spring Training is well underway down in Viera. This is the season, then, of portents and omens–the latest and most amusing of which was the story of an osprey dropping a fish onto the Nats outfield. Not having any expertise in the art of augury, I don’t think I can really comment about the auspiciousness or inauspiciousness of such an omen for the upcoming season.

What I can offer you, however, is the results of my own admittedly crude projection system. Long-time readers will know that I like to think of the baseball season as a single inning of a baseball game writ very very large. In the top of the inning, we see the home team take the field, and see how good the pitching and the defense are at getting opposing batters out. In the bottom of the inning, we watch the home team at bat, and see how well they drive in runs. Then we count the runs allowed in the top of the inning and the runs scored in the bottom of the inning–if the home team scored more runs than the other team, they win.

If you want the nuts and bolts of my projection system, please, read the post I’ve linked above. It describes the general outline of the system as clearly as I can.

This year, however, I’m making a few changes to the Natstradamus projection system.

First, in pitching, I have replaced FIP with xFIP. I don’t know enough about home run/fly ball rates to tell, really, which pitchers are “lucky” or “unlucky” with respect to how many home runs they give up on fly balls. xFIP fixes that for me by normalizing runs allowed by a pitcher to a league-average home run/fly ball rate. Some pitchers get better; other pitchers get worse; but I think over all that might be a more fair way of evaluating pitchers for the purposes of this projection system.

Second, I have tweaked the defensive calculations slightly. Instead of using raw UZR, I have calculated a UZR/game and then multiplied that by the number of games in which I expect each player to appear. Again, this is crude, and defensive metrics are highly unstable anyway, but hey, it’s all I’ve got.

Remember, my projections are based on four-year trailing averages for each stat. That is, they’re the averages of the past four years.
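One detail worth spelling out: the offense table in Part 3 lists four-year totals, which suggests the rate stats are pooled (total wRC divided by total PA) rather than a simple mean of four yearly rates. The sketch below shows the difference. The season lines are invented purely for illustration and do not belong to any real player.

```python
# Hypothetical (PA, wRC) lines for four seasons; NOT real player data
seasons = [(650, 100), (600, 90), (300, 30), (550, 80)]

total_pa = sum(pa for pa, _ in seasons)
total_wrc = sum(wrc for _, wrc in seasons)

# Pooled rate: every plate appearance counts equally
pooled_rate = total_wrc / total_pa

# Simple mean of yearly rates: every season counts equally
simple_mean = sum(wrc / pa for pa, wrc in seasons) / len(seasons)

print(f"Pooled wRC/PA:       {pooled_rate:.4f}")
print(f"Mean of yearly rates: {simple_mean:.4f}")
```

Pooling weights each season by playing time, so a short, injury-hit year pulls the rate around less than it would in a simple average.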

With those preliminaries out of the way, let’s start this year’s predictions off by going through the 2013 Nationals’ projected 25 man roster:

Starting Rotation

  • Stephen Strasburg, xFIP 2.56
  • Gio Gonzalez, xFIP 3.81. I do not believe Gio will be subject to a suspension for his alleged involvement in the Biogenesis scandal. I explained my view on the situation here.
  • Jordan Zimmermann, xFIP 3.71.
  • Ross Detwiler, xFIP 4.44
  • Dan Haren, xFIP 3.37.

Starting Position Players

  • Adam LaRoche, 1B.
  • Danny Espinosa, 2B
  • Ryan Zimmerman, 3B
  • Ian Desmond, SS
  • Bryce Harper, LF
  • Denard Span, CF.
  • Jayson Werth, RF
  • Wilson Ramos, C
  • Kurt Suzuki, C. I have Ramos and Suzuki splitting playing time evenly.

Bench

  • Chad Tracy, OF/3B
  • Tyler Moore, OF/1B
  • Steve Lombardozzi, IF/OF
  • Roger Bernadina, OF

Bullpen

  • Rafael Soriano, xFIP 3.6, Primary Closer
  • Drew Storen, xFIP 3.46, Primary Set-up, Back-up Closer
  • Tyler Clippard, xFIP 3.54
  • Ryan Mattheus, xFIP 4.48
  • Zach Duke, xFIP 4.34, Left-handed long reliever/Spot starter
  • Craig Stammen, xFIP 3.96, Right-handed long reliever/Spot Starter
  • Bill Bray, xFIP 4.19, Left-handed one-out guy. This is probably the most controversial pick; others might put Henry Rodriguez or Christian Garcia here instead. But I’m going to assume Bray heads north with the club.

No surprises, then. Stay tuned as we discuss pitching and defense in Part 2 of our projections.