And now the moment you’ve all been waiting for: the Natstradamus projections for the 2015 season!

This year, the projection comes with a major caveat: If Ryan Zimmerman is no worse than a league-average defensive first baseman, the Washington Nationals are projected to win between 95 and 98 games.

Just to refresh your recollection (because Lord knows I need to refresh mine every year), I use a pretty simple projection system to come up with the Nats’ won/loss totals for the year. The whole thing is based off Bill James’s Pythagorean Expectation, and it’s a satisfyingly intuitive way to figure out how good your team is. In baseball, you win if you score more runs than you allow. The Pythagorean Win Expectation model reflects that.

Imagine the whole baseball season compressed into two halves of one inning at Nats Park. First, we need to fill up the lineup card of players. Then, we need to know who these players are–I use a four-year trailing average as the basis for these calculations. Then, we need to figure out who plays where and when. This is the greatest acknowledged weakness of my system, as I have somewhat arbitrarily assigned playing time based on my impressions of injuries, etc.

At the top of the inning, the visiting team comes to bat. The result of that half-inning will be runs allowed. Any upper-deck crank will tell you that there are two ways you can allow a run, generally: by pitching badly (giving up tons of walks and homers) or by fielding badly (not getting to balls hit in the gap, dropping fly balls, committing errors). The same upper-deck crank will tell you that you can get out of the inning with good pitching (striking everyone out) and great fielding (robbing home runs, showing ridiculous range, gunning down runners with your arm). In my model, I base pitching runs allowed on a pitcher’s FIP (I also use xFIP as an alternative, which normalizes pitcher home runs allowed to a league-average home-run/fly-ball rate). Defense is handled by UZR, which handily expresses defense as the number of extra runs allowed or saved.

At the bottom of the inning, the Nats come to bat: time to score some runs. I use Weighted Runs Created for each batter. Since that’s a counting stat, I divide that by the number of plate appearances over the last four years to get the number of runs created per plate appearance. I multiply this by the number of projected plate appearances (an everyday player will get about 600 plate appearances). That’s the number of runs on the board.
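
The batting arithmetic just described can be sketched in a few lines. This is only an illustration: the function name and the sample batter are hypothetical, not the actual inputs to the projection.

```python
def projected_runs(wrc_4yr: float, pa_4yr: int, projected_pa: int) -> float:
    """Scale a batter's four-year wRC rate to a projected number of plate appearances."""
    if pa_4yr <= 0:
        raise ValueError("need at least one plate appearance to compute a rate")
    wrc_per_pa = wrc_4yr / pa_4yr      # runs created per plate appearance
    return wrc_per_pa * projected_pa   # runs created over the projected playing time


# Hypothetical everyday player: 320 wRC over 2,400 PA in the last four years,
# projected for 600 PA this season.
print(round(projected_runs(320, 2400, 600), 1))
```

Summing this figure over every batter on the lineup card gives the team's projected runs scored.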

When that’s over, I come to some conclusions.

The 2015 Nats pitching staff is projected to allow between 530 (using FIP) and 562 (using xFIP) runs. The 2014 Nats actually allowed 555 runs–and we were already amazed at how good the pitching was last year.

This is a good thing because there is too much uncertainty about the defense to have any real confidence. UZR is notorious in that it needs a pretty big sample size to stabilize–the rule of thumb is that 3 years’ worth of data for an everyday player is what you’d need for the stat to be of any real use. Unfortunately, Ryan Zimmerman, first baseman, is a relatively new creation. His limited time at first base resulted in a comically bad UZR/150 (i.e., what UZR would be if he played 150 games at first base for a year) of -109.1. If true, it would mean that Ryan Zimmerman’s first base defense would be costing the Nats over 20 more runs than he would stand to get them at the plate (~88, by my calculations). That’s hard to stomach. If we follow the model blindly, though, we end up with the defense costing the Nats’ excellent pitching just over 97 runs. If we back off and assume Ryan Zimmerman is at least a league-average first baseman, the defense improves significantly, actually saving just under 12 runs.

So, if you think Ryan Zimmerman is a 100-run liability at first base (and I doubt very much that this is the case), the pitching and defense combined concede between 627-659 runs (totals not seen since 2011, when the Nats allowed 643 runs). If you think Ryan Zimmerman is a league-average first baseman, the pitching and defense combine to allow between 518 and 550 runs (As good or better than the 2014 Nats).

Turning now to the batting, things are more straightforward. The model projects the Nats will score 652 runs. This is lower than last year’s observed total of 686 runs. The projection reflects my pessimism regarding Rendon’s playing time and the speed at which Span and Werth can return to the lineup. I will be very happily proven wrong on this point, though.

Add it all together, and you end up with 95 to 98 wins if Zim is at least a league-average first baseman. If he is the nightmare that the tiny and highly unreliable sample of data UZR has to work with suggests, things are much less rosy, with the Nats winning between 80 and 84 games, and likely missing the playoffs.
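
For anyone who wants to play with the inputs, here is a rough sketch of the final step. The post does not say which exponent the model uses, so the classic exponent of 2 from Bill James's original formula is an assumption here, and the defensive figures are rounded from "just over 97" and "just under 12."

```python
def pythagorean_wins(runs_scored: float, runs_allowed: float,
                     games: int = 162, exponent: float = 2.0) -> float:
    """Expected wins from runs scored and allowed (Pythagorean expectation)."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return games * scored / (scored + allowed)


RUNS_SCORED = 652
PITCHING_RUNS = {"FIP": 530, "xFIP": 562}
# Runs the defense adds (+) or removes (-), rounded from the figures above.
DEFENSE_SCENARIOS = {"Zimmerman as bad as his UZR": 97,
                     "Zimmerman league-average": -12}

for scenario, defense_runs in DEFENSE_SCENARIOS.items():
    for basis, pitching_runs in PITCHING_RUNS.items():
        runs_allowed = pitching_runs + defense_runs
        wins = pythagorean_wins(RUNS_SCORED, runs_allowed)
        print(f"{scenario} ({basis}): {runs_allowed} runs allowed, {wins:.1f} wins")
```

With exponent 2, the pessimistic scenario comes out at roughly 80 to 84 wins, and the optimistic one at roughly 95 to 99, so the quoted ranges may rest on slightly different rounding or inputs.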

# What the hell is the matter with Bryce Harper?

Bursitis.

Oh, you wanted more? Fine:

Bryce Harper doesn’t know how to play outfield very well yet. He makes up for this by being unbelievably fast. But where his bad route to a ball intersects with a wall, that same speed results in painful collisions.

That’s heresy, right? How dare I impugn the defensive skills of the National League’s fifth-best outfielder (by UZR) in 2012? Yes, Bryce Harper posted a ridiculous 9.5 UZR in 2012–that is, his outfield defense prevented 9.5 runs from scoring on the 2012 Nats. That’s pretty good, right?

Sure. But let’s remember that UZR is a highly unstable measure of defensive ability–that is, we need a pretty big sample size to be sure of what we’re looking at:

How many UZR opportunities do you need for UZR to be reliable? There isn’t any magic number. If I asked you how many AB you need before a player’s BA becomes reliable, you would likely answer, “I don’t know. The more the merrier I guess.” That is true with UZR and with all metrics. Of course, for some metrics, you need more or less data than for other metrics for an equivalent reliability. It depends on the sampling error and the spread in underlying talent, and other things that are inherent in that metric. Most of you are familiar with OPS, on base percentage plus slugging average. That is a very reliable metric even after one season of performance, or around 600 PA. In fact, the year-to-year correlation of OPS for full-time players, somewhat of a proxy for reliability, is almost .7. UZR, in contrast, depending on the position, has a year-to-year correlation of around .5. So a year of OPS data is roughly equivalent to a year and half to two years of UZR.

This makes intuitive sense, in a way. To gather information about a player’s defense, we have to put a ball in play somewhere near that player and give him a chance to make a defensive play. In some cases, that happens pretty often–think of a second baseman or a shortstop taking ground balls. Other times, it’s less often–think of an outfielder (like, say, Harper!) standing around as his starting pitcher (like, say, Strasburg!) strikes out batter after batter.

All of this is to say that even though UZR says Harper likely saved 9.5 runs for the 2012 Nats, that might not really be the truest measure of Harper’s defensive prowess in the outfield–although, again, it’s the best we can do for now.

But let’s just take the one year of data and look at it more closely, OK?

Now, UZR is broken up into components, each of which makes sense if you imagine yourself playing baseball. First, as the ball is put in play, you have to react to the ball, get to where it’s going, and put yourself in a position to make the play. The distance you cover to get to that ball is your range. Thus, the runs that you save because you can get to the ball (instead of letting it go by you) are Range runs, denoted RngR. Bigger is better here–this means you’re actually getting to the ball and getting a glove on it. That’s good news for your ball club.

Next, once you’ve got the ball, you might need to throw it somewhere in a hurry. Maybe you need to turn a double play, or maybe you need to hit a cutoff man, or maybe you’re trying to gun down a runner at the plate. You need a pretty good and accurate arm to do any of those things. The runs you save because of your good and accurate arm are Arm runs, denoted ARM.

Finally, things don’t always go your way. Maybe you get to the ball, then boot it. Or maybe your arm is strong, but not accurate; or accurate, but nowhere near strong enough. To err is human, of course. Runs you cost your team because of your errors are–shocker–error runs, denoted ErrR.

Now, let’s look at each of those components for 2012 Bryce Harper. Harper has 5.4 RngR, which tells us that he’s got pretty good range for an outfielder. His arm is absurd: 6.2 ARM, best in the National League in 2012. He does goof every so often, though, giving up -2.1 ErrR. That all adds up to his 9.5 UZR.

Now let’s do the hack thing and compare Harper to Mike Trout, another phenomenal young outfielder.

Trout posts a higher UZR of 13.3, fourth-best in the American League in 2012. What’s interesting is that he does this despite a not-so-great ARM (-3.8). So, if Trout costs his team runs with a weak/inaccurate arm, whence comes this outrageous defensive skill? Well, Trout doesn’t make many mistakes. In fact, he makes fewer mistakes than average, so that’s worth 0.4 ErrR. The real story is that Trout has absurd range, with RngR of 16.7–best in the American League!
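
To make the bookkeeping concrete, here is a small sketch that adds up the three outfield components quoted above. The class and field names are just illustrative.

```python
from dataclasses import dataclass


@dataclass
class OutfieldUZR:
    rng_r: float  # range runs (RngR)
    arm: float    # arm runs (ARM)
    err_r: float  # error runs (ErrR)

    @property
    def total(self) -> float:
        """Sum of the three outfield components described in the text."""
        return self.rng_r + self.arm + self.err_r


harper_2012 = OutfieldUZR(rng_r=5.4, arm=6.2, err_r=-2.1)
trout_2012 = OutfieldUZR(rng_r=16.7, arm=-3.8, err_r=0.4)

for name, player in [("Harper", harper_2012), ("Trout", trout_2012)]:
    print(f"{name}: UZR {player.total:.1f} (range {player.rng_r:+.1f}, "
          f"arm {player.arm:+.1f}, errors {player.err_r:+.1f})")
```

Laid side by side, the split is plain: Harper's total leans on his arm, while Trout's comes almost entirely from range.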

Now, think about what that means, for a second. What does it mean when we say that an outfielder has good range? It means he gets to balls that other outfielders might not be able to reach. There are three parts to fielding a ball in play in the outfield: first, you have to know the ball is coming to you. Then, you need to figure out where that ball is going to be, and how best to get there. Finally, you have to run to that spot and make the play.

Mike Trout has been playing the outfield for quite some time. He played the outfield as an amateur. He has, in his short life thus far, seen many more balls hit towards him in the outfield than Bryce Harper has. No wonder, too–Harper was a catcher as an amateur, and was only turned into an outfielder after he turned professional.

Now, Trout and Harper are built similarly. I don’t have the data, but let’s assume that they have similar reaction times. They can see equally well. They run more or less the same speed (fast!), jump more or less the same height (high!). I submit that the vast difference between Harper and Trout’s range has nothing to do with the raw physical part of fielding–the running to where the ball is going to be. It has everything to do with the first and second parts of the process–seeing the ball and picking the best route to the ball.

This is a long way of saying that Harper’s propensity to run into, at, or through outfield walls has nothing at all to do with his willingness to play hard, or play “the right way,” or whatever. Harper runs into walls, or pulls up at warning tracks, or sprints towards fly balls in the gap because he just doesn’t know where he is on the outfield. He makes up for his lack of skill by employing that prodigious physical gift of speed. It is a testament to Harper’s raw speed that his range is as good as it is at all.

The trouble is, of course, when those sub-optimal routes, taken at breath-taking speed, intersect with walls. That’s why he has bursitis now.

The good news here is that there is every indication that Harper will learn to be a better outfielder as he gains more experience. This is exciting, because if he can get better jumps on balls and make fewer mistakes, he can bring his absurdly powerful arm into play even more often.

Bryce Harper is good at baseball. He is not yet good at playing outfield. He is probably going to become very good at that soon, though–and that will be fun to watch.

Again, notice: this isn’t about Harper’s mentality, or whether he’s playing “too hard,” or whether he believes he can blast through walls like the goddamn Kool-Aid Man. This is just about a 20 year old kid learning to play baseball better tomorrow than he did today.

# Rendon doesn’t play second base yet

This is, of course, an obvious statement. The Nats third-base prospect, currently with AA Harrisburg, doesn’t play second base yet. He didn’t play second base as a collegiate ballplayer at Rice. And, although the Nats intend to have him take reps at second base and shortstop, that’s not the same as playing second base.

Why am I wasting your time repeating the obvious?

Because the Nats played awful infield defense in this weekend’s series with the Reds. Ian Desmond was charged with a staggering six errors. Chad Tracy had another.

The vagaries of the rulebook meant that Danny Espinosa escaped without an error–but is still largely responsible for the margin of defeat in Sunday’s loss, when, in the sixth inning, he chose to launch a wayward throw that failed to get the runner coming home. A run scored, leaving two runners on with no outs recorded–both of whom subsequently scored, too. That goes as a fielder’s choice in the scorebook, and it’s a terrible choice, but it’s not an error.

All these misadventures, and more, were enough to get the “CALL UP RENDON NOW” brigade active on twitter.

To whom I have this to say: you mean to tell me that, to fix a series where the main problem was lousy infield defense, you want to call up a young player with extremely limited experience playing precisely the infield positions (shortstop, second base) where all the bad defense was happening?

Wait, what?

Anthony Rendon is a talented player, and, if reports are to be believed, a fine third baseman. He may yet become a second baseman or a shortstop. He is not yet that–and, until he is, you’ve got to hope that the current middle infield of Desmond and Espinosa shrugs off this weekend’s performance and regains its usual defensive form.

# Projecting the 2013 Nationals, Part 2: Pitching and Defense

In Part 1, we announced the starting line-up. Let’s see how many runs the pitching allows in 2013. My model conservatively estimates that in 2013, Nats pitching will account for 609 runs scored against the Nats, but defense will “save” 18 runs. Thus, the model conservatively predicts that 591 runs will be scored against the 2013 Nationals.

Here’s the table for pitching:

| Pitcher Name | Projected IP | 4-Yr Moving Avg xFIP | Projected Runs Allowed |
|---|---|---|---|
| Stephen Strasburg | 180 | 2.56 | 51.20 |
| Gio Gonzalez | 190 | 3.81 | 80.43 |
| Jordan Zimmermann | 190 | 3.71 | 78.32 |
| Ross Detwiler | 180 | 4.44 | 88.80 |
| Dan Haren | 190 | 3.37 | 71.14 |
| Rafael Soriano | 70 | 3.6 | 28.00 |
| Drew Storen | 70 | 3.46 | 26.91 |
| Tyler Clippard | 70 | 3.54 | 27.53 |
| Ryan Mattheus | 70 | 4.48 | 34.84 |
| Craig Stammen | 110 | 3.96 | 48.40 |
| Zach Duke | 90 | 4.34 | 43.40 |
| Bill Bray | 65 | 4.19 | 30.26 |
| TOTAL RUNS ALLOWED | | | 609.25 |

You will notice that my initial guesses for innings pitched for starting pitchers are quite low. We’ll tweak those later, but for now, I’m going to assume that these are good enough to go by.
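
If you want to tweak the innings yourself, the table can be rebuilt with a short script. This is a sketch that assumes each row is simply xFIP × IP / 9, which is what the figures in the table are consistent with; the variable names are mine.

```python
# (pitcher, projected IP, four-year moving average xFIP), taken from the table above
STAFF_2013 = [
    ("Stephen Strasburg", 180, 2.56), ("Gio Gonzalez", 190, 3.81),
    ("Jordan Zimmermann", 190, 3.71), ("Ross Detwiler", 180, 4.44),
    ("Dan Haren", 190, 3.37), ("Rafael Soriano", 70, 3.60),
    ("Drew Storen", 70, 3.46), ("Tyler Clippard", 70, 3.54),
    ("Ryan Mattheus", 70, 4.48), ("Craig Stammen", 110, 3.96),
    ("Zach Duke", 90, 4.34), ("Bill Bray", 65, 4.19),
]


def runs_allowed(innings: float, xfip: float) -> float:
    """xFIP is scaled like ERA (runs per nine innings), so divide by nine."""
    return xfip * innings / 9


pitching_total = sum(runs_allowed(ip, xfip) for _, ip, xfip in STAFF_2013)
defense_runs_saved = 18  # from the summary at the top of this post

print(f"Pitching: {pitching_total:.2f} runs")
print(f"After defense: {pitching_total - defense_runs_saved:.2f} runs")
```

Changing a pitcher's innings in `STAFF_2013` shows immediately how much the team total moves.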

A similar table of the defensive statistics would be tedious to recount, so let me sum it up with a few general notes:

• According to these projections, the three biggest defensive assets on the 2013 Nationals are Ryan Zimmerman, Denard Span, and Danny Espinosa.
• Ryan Zimmerman should save 7.6 runs–best on the team. The high number of defensive runs saved here underscores just how important it is for the Nats to keep him healthy.
• Danny Espinosa has been the target of a lot of fan frustration lately, especially given his struggles at the plate. His defense, however, is outstanding. The model projects that he will save 5.2 runs.
• The newest addition to the Nats defense, center fielder (and noted ichthyophobe) Denard Span, is projected to save 4.6 runs. Bryce Harper had a UZR of 9.7 as a center fielder last year, so just looking at that, you might think that Span is a lousy center fielder compared to Harper. You’d be wrong. UZR is notoriously unstable–we need at least 3 years of data to get a good sample. Span actually posted a UZR of 9.0 as a center fielder for the Twins in 2011; likewise, as Twins CF in 2012, he posted a UZR of 8.5. As you can see, the projection for Span seems very conservative–but it takes into account some bad defensive years for Span (2008 and 2009). I would expect Span actually to outperform this projection.

Right, that wraps up the top of the inning. Tune in to Part 3, where we’ll discuss how the offense looks.

# Projecting the 2013 Nationals, Part 1: Ground Rules & Starting Line-ups

Spring Training is well underway down in Viera. This is the season, then, of portents and omens–the latest and most amusing of which was the story of an osprey dropping a fish onto the Nats outfield. Not having any expertise in the art of augury, I don’t think I can really comment about the auspiciousness or inauspiciousness of such an omen for the upcoming season.

What I can offer you, however, is the results of my own admittedly crude projection system. Long-time readers will know that I like to think of the baseball season as a single inning of a baseball game writ very very large. In the top of the inning, we see the home team take the field, and see how good the pitching and the defense are at getting opposing batters out. In the bottom of the inning, we watch the home team at bat, and see how well they drive in runs. Then we count the runs allowed in the top of the inning and the runs scored in the bottom of the inning–if the home team scored more runs than the other team, they win.

If you want the nuts and bolts of my projection system, please, read the post I’ve linked above. It describes the general outline of the system as clearly as I can.

This year, however, I’m making a few changes to the Natstradamus projection system.

First, in pitching, I have replaced FIP with xFIP. I don’t know enough about home run/fly ball rates to tell, really, which pitchers are “lucky” or “unlucky” with respect to how many home runs they give up on fly balls. xFIP fixes that for me by normalizing runs allowed by a pitcher to a league-average home run/fly ball rate. Some pitchers get better; other pitchers get worse; but I think over all that might be a more fair way of evaluating pitchers for the purposes of this projection system.

Second, I have tweaked the defensive calculations slightly. Instead of using raw UZR, I have calculated a UZR/game, and then multiplied that by the number of games in which I expect each player to appear. Again, this is crude, and defensive metrics are highly unstable anyway, but hey, it’s all I’ve got.
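
In code, the defensive tweak amounts to something like the following. The function name and the sample fielder are hypothetical; the real inputs are the four-year trailing totals.

```python
def projected_uzr(uzr_4yr: float, games_4yr: int, projected_games: int) -> float:
    """Convert a multi-year UZR total to a per-game rate, then scale to expected games."""
    if games_4yr <= 0:
        raise ValueError("need at least one game played to compute a rate")
    uzr_per_game = uzr_4yr / games_4yr
    return uzr_per_game * projected_games


# Hypothetical fielder: +20 UZR over 500 games, expected to appear in 150 games.
print(round(projected_uzr(20.0, 500, 150), 1))
```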

Remember, my projections are based on four-year trailing averages for each stat. That is, they’re the averages of the past four years.

With those preliminaries out of the way, let’s start this year’s predictions off by going through the 2013 Nationals’ projected 25 man roster:

Starting Rotation

• Stephen Strasburg, xFIP 2.56
• Gio Gonzalez, xFIP 3.81. I do not believe Gio will be subject to a suspension for his alleged involvement in the Biogenesis scandal. I explained my view on the situation here.
• Jordan Zimmermann, xFIP 3.71.
• Ross Detwiler, xFIP 4.44
• Dan Haren, xFIP 3.37.

Starting Position Players

• Danny Espinosa, 2B
• Ryan Zimmerman, 3B
• Ian Desmond, SS
• Bryce Harper, LF
• Denard Span, CF.
• Jayson Werth, RF
• Wilson Ramos, C
• Kurt Suzuki, C. I have Ramos and Suzuki splitting playing time evenly.

Bench

• Tyler Moore, OF/1B
• Steve Lombardozzi, IF/OF

Bullpen

• Rafael Soriano, xFIP 3.6, Primary Closer
• Drew Storen, xFIP 3.46, Primary Set-up, Back-up Closer
• Tyler Clippard, xFIP 3.54
• Ryan Mattheus, xFIP 4.48
• Zach Duke, xFIP 4.34, Left-handed long reliever/Spot starter
• Craig Stammen, xFIP 3.96, Right-handed long reliever/Spot Starter
• Bill Bray, xFIP 4.19, Left-handed one-out guy. This is probably the most controversial pick; others might put Henry Rodriguez or Christian Garcia here instead. But I’m going to assume Bray heads north with the club.

No surprises, then. Stay tuned as we discuss pitching and defense in Part 2 of our projections.

# Hunting the Dreaded Sun Monster

Nats fans are all too well acquainted with the dreaded Sun Monster that ate up both Bryce Harper and Jayson Werth during a Sunday afternoon horror show against the Brewers at Nats Park on September 23. (Of course, this creature already has its own Twitter account.)

Well, the Nats will travel to Saint Louis to play the Cardinals in the National League Division Series. Because television is run by media elites who write idiotic hit pieces about DC, the Nats will play at 3:00 P.M., Eastern time. Thanks to the New York DamnYankees and their stranglehold on prime-time television scheduling, the Nats will have to play an afternoon game for the benefit of the legions of unemployed television-viewing baseball fans everywhere who would otherwise be numbing their pain with vicodin and bourbon cocktails while watching Dr. Phil.

This also means that Harper, Werth, and possibly Morse might have to contend with a Saint Louis Sun Monster. James Wagner of the Post has already written a fairly good piece on the difficulties of the sun in Saint Louis. I commend that piece to you if you want to read about how players felt about the sun.

But here at Natstradamus, we like verifiable phenomena where we can find them. So the question is: when is the worst sun field time at New Busch Stadium in Saint Louis?

If you don’t want to be blinded with science, here’s your short answer: the Sun Monster is going to gobble up whoever is standing in center field at 4:02 P.M. Central Time (5:02 Eastern).

Let’s start with the ballpark orientation. You should all bookmark this diagram by the brilliant FlipFlopFlyBall. That’s a graphical representation of the direction a batter faces in all MLB ballparks, relative to True North.

Let’s assume that a center fielder in straight-away center field lines up so that he could stare at the batter directly in the eyes–that is, they would be on the same line, facing each other. (I know this isn’t how real defensive alignment works, but go with me on this, OK?) That means that the center fielder would have to be facing 180 degrees opposite the batter’s facing.

Refer again to that diagram and look for Busch Stadium. If you plug Busch Stadium into Google Earth and measure the angle from home plate to straight away center field, you will see that the batter faces about 68 degrees from true North. The Center Fielder, then, would have to be facing the other direction (180 degrees opposite!) so the Center Fielder’s facing is about 248 degrees from true North.

Thanks to the hard work of scientists at the National Oceanic and Atmospheric Administration, the general public has access to an excellent Solar position calculator. The value we’re looking for is the Solar Azimuth: the position of the sun in degrees clockwise from north.

At the start of the game (2:00 P.M. Central Time, 3:00 P.M. Eastern), we can already see that the solar azimuth is at 221.94 degrees. Things get progressively worse as the afternoon goes on, however. At about 4:02 local time, things are at their worst: the solar azimuth reaches the dreaded 248.06 degrees: right into the eyes of the center fielder.
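
Here is a small sketch of that comparison. The azimuth values are the ones quoted above from the NOAA calculator; the helper functions are my own and simply handle the wrap-around at 360 degrees.

```python
def opposite_bearing(bearing_deg: float) -> float:
    """Bearing faced by someone looking straight back along a given bearing."""
    return (bearing_deg + 180) % 360


def angular_gap(a_deg: float, b_deg: float) -> float:
    """Smallest angle between two compass bearings, in degrees."""
    diff = abs(a_deg - b_deg) % 360
    return min(diff, 360 - diff)


BATTER_FACING_DEG = 68  # measured from Google Earth, per the text
cf_facing = opposite_bearing(BATTER_FACING_DEG)

# Solar azimuths quoted in the text (degrees clockwise from north).
for label, azimuth in [("2:00 P.M. CT", 221.94), ("4:02 P.M. CT", 248.06)]:
    gap = angular_gap(cf_facing, azimuth)
    print(f"{label}: sun is {gap:.2f} degrees off the center fielder's line of sight")
```

The smaller the gap, the closer the sun sits to the line between the center fielder and the batter.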

Yikes. How about a shadow? That’s going to need a bit of trigonometry.

At the start of the game, the solar elevation (the angle of the sun, measured from the horizon) is 36.06 degrees. I don’t have good measurements of the height of the stands at Busch Stadium, so I’m going to assume the stands are about 100 feet tall at their highest point. We’ll also imagine that the center fielder is playing about medium depth (start of the inning, no baserunners, no defensive shifts on) which puts him maybe 375 feet from home plate. I don’t know the measurement of the foul ground between the plate and the backstop. Let’s assume it’s 12 feet.

The length of the shadow at any given time, then, assuming that the sun is shining from directly behind home plate, is the horizontal leg of a right triangle formed by the base of the backstop (A), the top of the stands (B), and the tip of the shadow on the field (C). If that length is equal to or greater than 387 feet (375+12), the shadow reaches the center fielder; if less, the Sun Monster has him.

So how long will the shadow be at 4:02 P.M. Central Time? Well, that’ll be

$\tan{\theta} = \frac{\text{height of stands}}{\text{length of shadow}}$

Which means

$\text{Length of shadow} = \frac{\text{height of stands}}{\tan \theta}$

Where

$\theta = \text{angle of elevation}$

With an angle of elevation of 16.51 degrees at 4:02 PM local time,

$\text{Length of shadow} = \frac{100 \text{ feet}}{\tan 16.51^\circ} \approx 337 \text{ feet}$

Our center fielder will get no help from the shadows, then. If he stands 375 feet from the plate, he’s 387 feet from the backstop, and in the full sun. Fifty feet ahead of him (in what would now doubtless be infield-fly territory), the relief of the shadows beckons. But he must live with the full sun.

At the start of the game, by the way, the shadows are much shorter–a mere 137 feet–so even if the sun isn’t directly in the center fielder’s eyes, pretty much the whole outfield is in direct sunlight.
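
The shadow arithmetic is easy to rerun if you have better measurements than my guesses. In this sketch the 100-foot stands, the 375-foot depth, and the 12 feet of foul ground are the same assumptions made above; only the function and constant names are new.

```python
import math


def shadow_length(height_ft: float, elevation_deg: float) -> float:
    """Length of the shadow cast on level ground by an object of the given height."""
    if not 0 < elevation_deg < 90:
        raise ValueError("solar elevation must be between 0 and 90 degrees")
    return height_ft / math.tan(math.radians(elevation_deg))


STANDS_HEIGHT_FT = 100          # assumed height of the stands
CF_FROM_BACKSTOP_FT = 375 + 12  # assumed depth plus assumed foul ground

for label, elevation in [("2:00 P.M. CT", 36.06), ("4:02 P.M. CT", 16.51)]:
    length = shadow_length(STANDS_HEIGHT_FT, elevation)
    in_shade = length >= CF_FROM_BACKSTOP_FT
    print(f"{label}: shadow {length:.0f} ft, center fielder in shade: {in_shade}")
```

Doubling the assumed height of the stands doubles the shadow, so the conclusion is sensitive to that guess.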

There you have it, Nats fans. We had better hope that there are no fly balls hit to Nats outfielders tomorrow.

# The Limits of Prescience

A thread over at the Washington Nationals Fan Forums pushed back against some of my projections here and raised a few points that I neglected to address in my 2012 projections.

## Margins of Error

Interesting projections but the missing piece would be an estimate of how much of a margin of error there would be for both the offensive and defensive estimates that would provide a range for the expected number of wins as opposed to a hard number.

This was a serious omission on my part. All projections have a certain degree of uncertainty built into them, and I really should have discussed the degree of uncertainty built into mine.

I took my method for calculating the projected runs allowed by pitching and defense from this site. The author tested this method against 7 years of complete season data from 2002 through 2008. As he writes:

I found the R^2 value. Not to oversimplify things too much, but this value basically shows what percentage of the variation can be accounted for by the model. The value ranges from 0 (worthless) to 1 (perfect). For my 210 data points, I had an R^2 value of about 0.78 (i.e. 78% of the variation).

That means that my defense and pitching runs allowed projections should be good for plus or minus 22%. Applied to the 619.02 runs projected for the pitching staff, that gives a lower bound of 482.84 runs allowed and an upper bound of 755.20 runs allowed.
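
For the record, here is how those bounds fall out. This sketch takes the plus-or-minus 22% reading at face value as a rule of thumb rather than a formal confidence interval, and assumes the base figure is the 619.02 pitching runs from Part 2, which is the number the quoted bounds are consistent with.

```python
def error_band(projection: float, r_squared: float) -> tuple[float, float]:
    """Lower and upper bounds using (1 - R^2) as a crude fractional margin."""
    margin = 1 - r_squared
    return projection * (1 - margin), projection * (1 + margin)


PITCHING_RUNS_2012 = 619.02  # projected pitching runs allowed, from Part 2
R_SQUARED = 0.78             # reported by the author of the method

lower, upper = error_band(PITCHING_RUNS_2012, R_SQUARED)
print(f"Runs allowed: {lower:.2f} to {upper:.2f}")
```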

If we assume that my offensive predictions are correct (a problem I’ll get to in a second), that means the 2012 Nats will win anywhere between 68 and 103 games.

I know that’s an immense difference. I’m not sure how I could close that gap. UZR doesn’t account for pitcher or catcher defense, for instance. But even then, I think the method at least gets us in the ballpark.

The offense numbers are a lot more troublesome. I haven’t been able to do any real regression analysis to determine how good my model is–I simply haven’t had the time.

On the other hand our offense has way too many question marks to estimate the total number of runs scored with enough precision to come up with a meaningful value that can be used in a secondary projection as you did in calculating our win total.

Any type of future projection is likely to involve more than a little handwaving. Here, I’ve drawn an arbitrary line: all players included in this analysis are players on the Nats’ 25-man roster as of January 27, 2012, some 23 days before pitchers and catchers are due to report at Viera.

## Individual Players and the Projections

Will Werth stay Werthless?

2011 Jayson Werth was astonishingly bad. I’m going to believe that his 2011 numbers are aberrations and not indicative of a “new normal.” I’m fairly confident that the 4-year average from 2008-2011 is a fair picture of what kind of player Werth is now–somewhere between his Philly days and the debacle of 2011.

Will Desmond, Ramos, and Espi improve or stagnate?

As far as Desmond and Espinosa, I have no idea. I don’t think I have nearly enough data about them to make any predictions going forward. Ramos, however, gets a nice bump from more playing time and more PAs. His wRC/PA isn’t terrible, so that’s to be expected.

Will Morse fall back to Earth?

I’m going to go ahead and say No. As I said in Part 3, Morse’s modest offensive outputs in 2008-2010 might make you think that he’s going to crash down to Earth in 2012. But, remember, I’ve taken a four-year average of his wRC/PA over the same period. Giving Morse 600 plate appearances in 2012 gives a projected wRC of 97.00: exactly the same as his breakout 2011 “beastmode” year. Indeed, even if we throw out Morse’s 2011 season, running the same calculation over data from 2008-2010 yields a projected wRC of 90.00: seven runs short of our prior projection and of the 2011 total, but still enough to make him almost as good as Ryan Zimmerman (projected for 90.69 wRC). Indeed, all of this taken together seems like pretty persuasive evidence that “beastmode” has been lurking inside him the whole time, and only needed to see enough PAs.

Will Zimm get hurt again? Will LaRoche bounce back?

My response: Dammit, Jim, I’m a baseball fan, not a doctor! I have really no good way of figuring out LaRoche’s prognosis post-surgery, nor can I really know anything about the state of Zimmerman’s joints and muscles. The only real response I have here is that the four-year interval I picked should be fair to both men in terms of their expected production.

Who plays centerfield?

Again, I had to draw an arbitrary line and go with who was in the organization as of the day I began compiling the statistics. That means that for now, we’re looking at a DeRosa/Bernadina platoon in center field. This might not be ideal, but I didn’t want to mix players who weren’t officially in the organization into these projections. Blown Save, Win, however, has attempted to address the center field question in a recent post, where he suggests that perhaps the short-term answer is Rick Ankiel. I’ll have to go back and study this, obviously.

# Projecting the 2012 Nationals, Part 2: Top of the Inning: Pitching, Defense, and Runs Allowed

In part 1 of this project, I sketched out how we might arrive at a projected win-loss total for the 2012 Nationals by using the Pythagorean win expectation formula. Again, let’s suppose the whole 2012 season is like a day at Nats Park. The visitors get to bat first. As Nats fans, then, the first thing we have to watch is the effectiveness of the home team’s pitching and defense.

# Total Runs Allowed: 615.72

Let’s get this out of the way quickly: I project that the opponents of the 2012 Nationals will score just under 616 runs against the Nats.

To be pedantic, the “visiting” team in our calculations will score 615.72 runs in 2012. Don’t be bothered too much about the fractional runs–they’ll all come out in the wash.

You might ask yourself: “Well, how did I get here?”

The short answer is this: we need to figure out how many runs the pitching staff allows–that means using FIP. In your mind’s eye, imagine the 5-run 9th-inning debacle against the Marlins on July 26th of last year.

Then we need to figure out if the defense can take any of those runs away. In your mind’s eye, think of a happier moment–Roger “The Shark” Bernadina’s unbelievable catch at Nats Park, robbing Mike Stanton of at least a couple of runs.

The gist is: Runs allowed is the sum of each individual pitcher’s runs allowed, minus the sum of all the runs saved by each defender.
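
As a sketch, the gist looks like this in code. For brevity the lists below hold only the two section totals from this post (619.02 pitching runs, 3.30 defensive runs saved); in the real calculation each list would have one entry per pitcher or defender.

```python
def team_runs_allowed(pitcher_runs: list[float], defender_runs_saved: list[float]) -> float:
    """Sum of each pitcher's projected runs allowed, minus runs saved by each defender."""
    return sum(pitcher_runs) - sum(defender_runs_saved)


# Section totals from this post, standing in for the per-player lists.
pitching = [619.02]
defense = [3.30]

print(f"{team_runs_allowed(pitching, defense):.2f}")
```

A defender with a negative UZR has a negative "runs saved" entry, which adds to the runs allowed.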

## Pitching: 619.02 Runs Allowed

You might have noticed that FIP looks an awful lot like the “traditional” pitching effectiveness statistic, Earned Run Average or ERA. This is not an accident. FIP is meant to remove the troublesome “earned/unearned” distinction and get to the question of whether the pitcher “caused” the opposing team to score.

ERA, of course, is calculated like this:

$\text{Earned Run Average} = 9 \times \frac{\text{Earned Runs Allowed}}{\text{Innings Pitched}}$

FIP, by contrast, is calculated like this:

$\text{Fielding Independent Pitching} = \frac{13HR + 3BB - 2K}{IP} + \text{scaling constant}$

Where “scaling constant” is some constant figure (around 3.20 or so) to normalize things to a league average and make it look like ERA.

Notice that FIP only cares about things that are in the pitcher’s control: Home Runs, Walks, Strikeouts, and Innings Pitched. The rest is up to the defense (which we’ll get to). Notice also that it looks an awful lot like ERA, so we can use it like ERA. FIP tells us how many runs a pitcher is likely to give up, on average, for every 9 innings he pitches.
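If you would rather see that as code, here is a minimal sketch of the FIP formula above. The default constant of 3.20 is just the "around 3.20 or so" figure from this post, and the stat line in the example is made up for illustration, not a real pitcher's season:

```python
def fip(hr, bb, k, ip, constant=3.20):
    """Fielding Independent Pitching: (13*HR + 3*BB - 2*K) / IP + scaling constant.

    `ip` is innings pitched as a true decimal (e.g. 63 2/3 innings = 63.667).
    `constant` is an approximation; the real value shifts a bit every season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * bb - 2 * k) / ip + constant


# Hypothetical season: 20 HR, 50 BB, 180 K over 200 innings.
example = fip(hr=20, bb=50, k=180, ip=200.0)
print(round(example, 2))  # 3.45
```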

The only thing we don’t know for sure is the number of innings each pitcher will pitch–that’s what we have to project. But we already know, more or less, how “good” each pitcher is from the FIP data.

To figure out how many runs each pitcher is likely to give up, we calculate an expected runs allowed this way:

$\text{Pitcher's Projected Runs Allowed} = \frac{\text{FIP} \times \text{Projected Innings Pitched}}{9}$

Adding each of those numbers together for each pitcher will give you a total number of runs likely to be given up.
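A small sketch of that calculation, using the rotation's projected innings and FIP figures from the table below. The function and variable names are mine, chosen for illustration:

```python
def projected_runs_allowed(fip, projected_ip):
    """FIP is runs per nine innings, so scale it by projected innings / 9."""
    return fip * projected_ip / 9


# Stephen Strasburg: 1.87 FIP over a projected 160 innings.
strasburg = projected_runs_allowed(fip=1.87, projected_ip=160)
print(f"{strasburg:.2f}")  # 33.24

# The five starters as (projected IP, FIP) pairs.
rotation = [(160, 1.87), (180, 3.59), (200, 4.06), (180, 4.35), (180, 4.57)]
rotation_total = sum(projected_runs_allowed(f, ip) for ip, f in rotation)
print(f"{rotation_total:.2f}")  # 373.67
```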

## Starting Pitchers

| Pitcher Name | 2012 IP (Projected) | FIP (2008-2011 Average) | 2012 Runs Allowed per pitcher (projected) |
|---|---|---|---|
| Stephen Strasburg | 160.00 | 1.87 | 33.24 |
| Jordan Zimmermann | 180.00 | 3.59 | 71.80 |
| Gio Gonzalez | 200.00 | 4.06 | 90.22 |
| Chien-Ming Wang | 180.00 | 4.35 | 87.00 |
| John Lannan | 180.00 | 4.57 | 91.40 |

## Bullpen

| Pitcher Name | 2012 IP (Projected) | FIP (2008-2011 Average) | 2012 Runs Allowed per pitcher (projected) |
|---|---|---|---|
| Ross Detwiler | 63.2 | 4.30 | 30.42 |
| Tom Gorzelanny | 98.1 | 4.64 | 50.69 |
| Craig Stammen | 61.0 | 4.23 | 28.67 |
| Sean Burnett | 62.0 | 4.20 | 28.93 |
| Brad Lidge | 60.0 | 3.72 | 24.80 |
| Henry Rodriguez | 72.2 | 3.22 | 26.00 |
| Tyler Clippard | 72.2 | 3.61 | 29.15 |
| Drew Storen | 73.0 | 3.29 | 26.69 |
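One wrinkle if you want to check the arithmetic yourself: the innings figures appear to use box-score notation, where the digit after the decimal point counts outs, so 63.2 means 63 and two-thirds innings. That is my reading of the tables (the per-pitcher run totals only work out that way), so treat the conversion below as an assumption. The helper name is hypothetical:

```python
def innings(ip_notation):
    """Convert box-score innings ('63.2' = 63 innings plus 2 outs) to a decimal."""
    whole, _, outs = ip_notation.partition(".")
    outs = int(outs or 0)
    if outs not in (0, 1, 2):
        raise ValueError(f"not valid innings notation: {ip_notation!r}")
    return int(whole) + outs / 3


# (pitcher, projected IP in box-score notation, 2008-2011 FIP), from the two tables above.
staff = [
    ("Stephen Strasburg", "160.0", 1.87),
    ("Jordan Zimmermann", "180.0", 3.59),
    ("Gio Gonzalez", "200.0", 4.06),
    ("Chien-Ming Wang", "180.0", 4.35),
    ("John Lannan", "180.0", 4.57),
    ("Ross Detwiler", "63.2", 4.30),
    ("Tom Gorzelanny", "98.1", 4.64),
    ("Craig Stammen", "61.0", 4.23),
    ("Sean Burnett", "62.0", 4.20),
    ("Brad Lidge", "60.0", 3.72),
    ("Henry Rodriguez", "72.2", 3.22),
    ("Tyler Clippard", "72.2", 3.61),
    ("Drew Storen", "73.0", 3.29),
]

# Projected runs allowed for the whole staff: FIP * IP / 9, summed over every pitcher.
total = sum(fip * innings(ip) / 9 for _, ip, fip in staff)
print(f"{total:.2f}")  # 619.02
```

Summing the rounded per-pitcher figures in the tables gives 619.01 rather than 619.02; the difference is just rounding.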

## Defense: 3.30 Runs Saved.

Accounting for defense in these projections is, paradoxically, both easier to do and harder to explain.

It’s easier, because there’s not much to be done. We take our UZR data and add them up.

Yup, it’s really that simple. The end result tells us how many of the runs allowed by the pitchers the defense saves. Thus, a positive value means that the defense took away that number of runs that might have scored otherwise. A negative value, on the other hand, means that the defense bungled enough to allow more runs to score than they otherwise would have done.

We can say this, of course, because built into the pitching statistic (FIP) is the assumption that the pitcher would perform exactly the way FIP would expect him to perform in front of a perfectly average defense. UZR measures how much above or below average defense is.

I won’t reproduce the position player tables here–that would be too tedious, and you can read them here anyway. When you add them all together, the 2012 Nats defense will prevent 3.30 runs from scoring that might otherwise have scored.

There are a couple of quirks to this calculation. If you’ve been paying attention, you’ll notice that UZR is a “counting” statistic, not a rate. So over four years, the totals you’ll see in the tables are aggregates: the number of runs, total, in the last four years that that player is responsible for saving (or letting through). For the purposes of this calculation, I’ve had to divide that figure by four, to get a rough estimate of how many runs the player saves, on average, for each year under consideration.
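As a rough sketch of that step, here are three of the four-year UZR totals from the position-player tables in Part 1, each divided by four. The function name is mine:

```python
def annual_uzr(four_year_total, years=4):
    """Turn a multi-year UZR total (a counting stat) into a per-season average."""
    return four_year_total / years


# (player, position) -> 2008-2011 UZR total, taken from the Part 1 tables.
four_year_totals = {
    ("Ryan Zimmerman", "3B"): 30.20,
    ("Jayson Werth", "RF"): 17.40,
    ("Roger Bernadina", "CF"): -8.40,
}

for (player, position), uzr_total in four_year_totals.items():
    print(f"{player} ({position}): {annual_uzr(uzr_total):+.2f} runs per year")
# Ryan Zimmerman (3B): +7.55 runs per year
# Jayson Werth (RF): +4.35 runs per year
# Roger Bernadina (CF): -2.10 runs per year
```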

I should note a few things I learned while looking at the defensive statistics:

• Ryan Zimmerman is every bit the defender that I thought he was, apparently. In each of the four years in my study, the Nats could expect Zimmerman to save 7.55 runs, on average. That’s phenomenal.
• As a right fielder, Jayson Werth’s average UZR in the period under study is a respectable 4.35. As a center fielder, he’s perfectly average, with a 0.00 UZR in the period under study. In left field, Werth is less than ideal, but serviceable, with a -1.60 UZR (allowing, on average, 1.6 “extra” runs to score).
• By UZR, Roger Bernadina might be the worst center fielder on this roster (-2.10 UZR). He’s much, much better in left field (1.70 UZR). This surprised me. After all, it’s his spectacular diving catch in center field that I linked to above as an example of saving runs.
• On the bench, Mark DeRosa and Steve Lombardozzi are, overall, perfectly average defenders, but they can play an excellent spread of positions. If I were managing the Nats, I’d appreciate the degree of flexibility they can bring to a lineup.

Well, that does it for the top of the inning. The pitchers would have allowed 619.02 runs, but the defense took 3.30 of those away from the opposition. The 2012 score stands with the visiting team at 615.72 and the Nats coming to bat. We’ll find out just how well they hit in Projecting the 2012 Nationals, Part 3, Bottom of the Inning: Offense.

# Projecting the 2012 Nationals, Part 1: Ground Rules & Starting Line-Ups

In keeping with the prophetic nature of the blog, I promised you all some projections about the 2012 Nationals. As you might imagine, trying to see the future is a fair bit of work, and I wanted to be able to walk you all through my reasoning step by step, so I’m going to break my analysis up into a 4-post series.

And because this is about baseball, after all, I’ll break it down in a baseball-like fashion. Imagine yourselves in Davey Johnson’s shoes, stepping out towards home plate at Nats Park, line-up card in hand, ready to meet the umpire and the opposing manager. You’d have to discuss the ground rules first, and then exchange line-up cards. That’s what we’ll be doing in this post: sketching out the outlines of my method and telling you just who’s in the starting line-up.

# Ground Rules: What Are We Doing and How Are We Doing It?

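A baseball team’s winning percentage can be estimated fairly accurately using Bill James’s Pythagorean win expectation formula:

$\text{Winning Percentage} = \frac{\text{Runs Scored}^2}{\text{Runs Scored}^2 + \text{Runs Allowed}^2} = \frac{1}{1 + \left(\text{Runs Allowed} / \text{Runs Scored}\right)^2}$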

This is of course pretty intuitive, particularly in its simplified form on the right. The team that scores more runs than it gives up will win a baseball game. A 162-game season is thus just a day in the [ball]park, but in macrocosm. Our calculations feel pretty much like watching a ballgame, too:

1. Figure out who makes the team.
2. Watch the top of the inning: how many runs do the pitchers give up? To do this, we’ll need a stat called FIP, or Fielding-Independent Pitching.
3. Still in the top of the inning: how well is the team defending? To answer that, we’ll need an esoteric stat: UZR, or Ultimate Zone Rating [Yes, I know it’s a dumb name. The sad thing is that if Sabermetricians were more articulate, they’d be baseball writers–and thereby deprive us of their statistical insights].
4. Finally, at the bottom of the inning, we figure out if the home team can score more runs than the visiting team did in the top of its inning. To find that out, we’ll need wRC, weighted Runs Created.
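Once the runs scored and runs allowed are in hand, the last step is a one-liner. Here is a minimal sketch, assuming the classic Bill James exponent of 2 (other versions of the formula tune that exponent slightly). The function names are mine, and the runs-scored figure in the example is a placeholder, not a projection from this series:

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win expectation: RS^x / (RS^x + RA^x), with x = 2 in James's original."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the expected winning percentage up to a full season."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# 615.72 is the runs-allowed projection from Part 2 of this series.
# 700.0 runs scored is a made-up placeholder, NOT the series' offensive projection.
print(round(projected_wins(700.0, 615.72), 1))
```

A team that scores exactly as many runs as it allows comes out at .500, or 81 wins over 162 games, which is a handy sanity check.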

Projections should be pretty straightforward, right? There are a couple of pitfalls. UZR is notoriously unstable, and needs at least 3 years of data to be any good at all in calculations like this. Because we’re dealing with a pretty mixed bunch of ballplayers here, I can’t just use career UZR figures and take an annual average. Jayson Werth’s figure would have to be divided by 9, while Danny Espinosa’s would only be divided by 2. To even things up, I’ve decided to use a four-year average of each of the stats above. That gives just about enough of a sample size, I think, to be useful. It’s also fair: the four-year moving average sweeps from 2008 through the end of 2011–good news for Jayson Werth, who gets to count his phenomenal run with the Phillies alongside his near-abysmal 2011 campaign.

# The Starting Lineup: Meet your 2012 Washington Nationals!

With today’s acquisition of veteran relief pitcher Brad Lidge, I think it’s pretty safe to say that the Hot Stove League is at an end. Without further ado, meet your 2012 Washington Nationals! [All of the data here, by the way, is from Fangraphs.]

## Starting Rotation

| Pitcher Name | 2012 IP (Projected) | FIP (2008-2011 Average) | Remarks |
|---|---|---|---|
| Stephen Strasburg | 160.00 | 1.87 | Strasburg’s coming back after Tommy John surgery, so he’ll be on an innings limit, just like Jordan Zimmermann was in 2011. I’ve set his limit at 160 innings, around about where J.Z. was limited last year. |
| Jordan Zimmermann | 180.00 | 3.59 | Now that J.Z. is healthy again, I’ve allocated him what I feel is a fair load for a starting pitcher. |
| Gio Gonzalez | 200.00 | 4.06 | Gio’s had a few 200 IP seasons, and he comes billed as an inning-eater, so I’ve given him a heavier IP load. |
| Chien-Ming Wang | 180.00 | 4.35 | Wang is also coming off a long injury. I wonder if giving him a regular starting pitcher’s load isn’t a bit ambitious. Also, Wang gets hurt by my somewhat arbitrary 4-year window. His career FIP is really 4.04, but for now I’m going to accept the 4.35 number because… |
| John Lannan | 180.00 | 4.57 | Lannan’s FIP is really really high compared with the rest of the rotation. I’ll get a lot of flak for putting him in the rotation at all, especially from Detwiler’s (4.30 FIP) partisans. On a wholly subjective level, though I think Lannan’s pitched well enough for long enough to land a spot in the rotation. Detwiler, to me, anyway, seems to have a much harder time the second and third time through an opposing batting order, but I don’t have any data to confirm that at the moment. |

## Bullpen

| Pitcher Name | 2012 IP (Projected) | FIP (2008-2011 Average) | Remarks |
|---|---|---|---|
| Ross Detwiler | 63.2 | 4.30 | Long relief. |
| Tom Gorzelanny | 98.1 | 4.64 | Long relief. |
| Craig Stammen | 61.0 | 4.23 | Middle relief |
| Sean Burnett | 62.0 | 4.20 | Middle relief |
| Brad Lidge | 60.0 | 3.72 | Middle relief. Lidge figures to be a 6th-inning pitcher to get to Clippard & Storen. Also, as far as I can tell, Lidge has never had a plate appearance, so he doesn’t mess with my offensive calculations. |
| Henry Rodriguez | 72.2 | 3.22 | Middle relief; alternate closer; last-ditch pitcher in losing efforts. |
| Tyler Clippard | 72.2 | 3.61 | Clip’s 2008-2011 FIP is better than his career FIP of 3.91 |
| Drew Storen | 73.0 | 3.29 | Closer. |

## Starting Position Players

Note on position players: because UZR is calculated per-position, players will appear more than once on each table. In effect, it’s like having lots of players, one at each position, on defense, but having them form like Voltron into a single batter for offense. Also, I’ve omitted the pitchers’ offensive numbers from these tables–they were getting too cluttered, anyway. Don’t worry, I’ve factored the pitchers’ offensive contributions, such as they might be, into my final projections, but it would be tiresome to list them here. Also, UZR ignores defense from pitchers & catchers, so you won’t see any UZR numbers by Ramos or Flores.

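| Player | Position | UZR 2008-2011 (total) | wRC 2008-2011 annual average | wRC/PA 2008-2011 annual average | 2012 PA (projected) | 2012 wRC (projected) |
|---|---|---|---|---|---|---|
| Adam LaRoche | 1B | 4.30 | 65.50 | 0.132658 | 600 | 79.59 |
| Danny Espinosa | 2B | 3.00 | 22.50 | 0.116883 | 600 | 70.13 |
| | SS | -0.20 | | | | |
| Ryan Zimmerman | 3B | 30.20 | 83.25 | 0.151158 | 600 | 90.69 |
| Ian Desmond | SS | -13.70 | 33.25 | 0.102151 | 600 | 61.29 |
| | RF | -0.70 | | | | |
| | 2B | -2.80 | | | | |
| Michael Morse | LF | -6.90 | 37.75 | 0.161670 | 600 | 97.00 |
| | 1B | -3.50 | | | | |
| | RF | -7.50 | | | | |
| | 3B | 0.40 | | | | |
| Roger Bernadina | CF | -8.40 | 22.25 | 0.100112 | 400 | 40.04 |
| | RF | -4.10 | | | | |
| | LF | 6.60 | | | | |
| Jayson Werth | RF | 17.40 | 95.25 | 0.154941 | 600 | 92.96 |
| | CF | 0.00 | | | | |
| | LF | -1.60 | | | | |
| Wilson Ramos | C | | 15.75 | 0.121857 | 400 | 48.74 |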
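The last column in the table above appears to be nothing more than each player’s four-year wRC-per-plate-appearance rate multiplied by his projected 2012 plate appearances. A quick sketch, with a function name of my own choosing, reproduces two of the rows:

```python
def projected_wrc(wrc_per_pa, projected_pa):
    """Projected weighted Runs Created: per-PA rate times projected plate appearances."""
    return wrc_per_pa * projected_pa


# Rates and plate appearances taken from the table above.
laroche = projected_wrc(0.132658, 600)
bernadina = projected_wrc(0.100112, 400)
print(f"{laroche:.2f}")    # 79.59
print(f"{bernadina:.2f}")  # 40.04
```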

## Bench Players

| Player | Position | UZR 2008-2011 (total) | wRC 2008-2011 annual average | wRC/PA 2008-2011 annual average | 2012 PA (projected) | 2012 wRC (projected) |
|---|---|---|---|---|---|---|
| Mark DeRosa | RF | 6.10 | 44.50 | 0.129927 | 400 | 51.97 |
| | LF | 2.70 | | | | |
| | SS | 0.00 | | | | |
| | 1B | -1.20 | | | | |
| | 2B | -2.80 | | | | |
| | 3B | -4.50 | | | | |
| Steve Lombardozzi | 3B | 1.10 | 0.25 | 0.031250 | 350 | 10.94 |
| | 2B | 0.10 | | | | |
| | SS | -0.90 | | | | |
| Jesus Flores | C | | 13.25 | 0.101727 | 300 | 30.52 |

Unless something unusual happens in the next couple of days, I don’t see the Nats’ opening-day 25-man roster looking too different from this. How will they do in 2012? Stay tuned for the next part of my 2012 projection series, Top of the Inning: Pitching, Defense, and Runs Allowed.