# Projecting the 2013 Nationals, Part 1: Ground Rules & Starting Line-ups

Spring Training is well underway down in Viera. This is the season, then, of portents and omens–the latest and most amusing of which was the story of an osprey dropping a fish onto the Nats outfield. Not having any expertise in the art of augury, I don’t think I can really comment about the auspiciousness or inauspiciousness of such an omen for the upcoming season.

What I can offer you, however, are the results of my own admittedly crude projection system. Long-time readers will know that I like to think of the baseball season as a single inning of a baseball game writ very, very large. In the top of the inning, the home team takes the field, and we see how good the pitching and the defense are at getting opposing batters out. In the bottom of the inning, we watch the home team at bat, and see how well they drive in runs. Then we count the runs allowed in the top of the inning and the runs scored in the bottom of the inning–if the home team scored more runs than the other team, they win.

If you want the nuts and bolts of my projection system, please, read the post I’ve linked above. It describes the general outline of the system as clearly as I can.

This year, however, I’m making a few changes to the Natstradamus projection system.

First, in pitching, I have replaced FIP with xFIP. I don’t know enough about home run/fly ball rates to tell, really, which pitchers are “lucky” or “unlucky” with respect to how many home runs they give up on fly balls. xFIP fixes that for me by replacing a pitcher’s actual home runs allowed with the number he would be expected to allow at a league-average home run/fly ball rate. Some pitchers get better; other pitchers get worse; but I think overall that might be a fairer way of evaluating pitchers for the purposes of this projection system.
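For the curious, here is a minimal sketch of the xFIP calculation as I understand the commonly published formula. The default league HR/FB rate and the FIP constant below are placeholder assumptions, not the values used for the numbers in this post; both change from season to season.

```python
def xfip(fly_balls, walks, hit_by_pitch, strikeouts, innings,
         lg_hr_per_fb=0.105, fip_constant=3.10):
    """Expected FIP: like FIP, but home runs are replaced by
    fly balls times a league-average HR/FB rate.

    lg_hr_per_fb and fip_constant are illustrative defaults (assumptions).
    """
    expected_home_runs = fly_balls * lg_hr_per_fb
    numerator = (13 * expected_home_runs
                 + 3 * (walks + hit_by_pitch)
                 - 2 * strikeouts)
    return numerator / innings + fip_constant


# Hypothetical pitcher: 150 fly balls, 50 BB, 5 HBP, 200 K in 200 IP.
example_xfip = xfip(150, 50, 5, 200, 200)
```

Because the pitcher's actual home run total never enters the calculation, two pitchers with identical fly ball, walk, and strikeout numbers get the same xFIP no matter how many of those fly balls left the yard.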

Second, I have tweaked the defensive calculations slightly. Instead of using raw UZR, I calculate a UZR/game and then multiply that by the number of games in which I expect each player to appear. Again, this is crude, and defensive metrics are highly unstable anyway, but hey, it’s all I’ve got.

Remember, my projections are based on four-year trailing averages for each stat. That is, they’re the averages of the past four years.
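A rough sketch of how a per-game rate and a four-year window might combine into a single defensive projection. One assumption to flag: this pools total UZR over total games across the window rather than averaging four separate seasonal rates, and the post doesn't say which was used. The function name and the sample numbers are made up for illustration.

```python
def project_uzr(seasons, expected_games):
    """Project UZR from trailing seasons.

    seasons: list of (uzr, games) tuples, one per past season.
    expected_games: games the player is expected to appear in.
    Pools the window into one UZR-per-game rate (an assumption).
    """
    total_uzr = sum(uzr for uzr, _ in seasons)
    total_games = sum(games for _, games in seasons)
    if total_games == 0:
        return 0.0  # no track record: treat as an average defender
    return total_uzr / total_games * expected_games


# Hypothetical fielder over four seasons, projected for 150 games.
trailing = [(4.0, 150), (-2.0, 140), (6.0, 155), (0.0, 155)]
projected = project_uzr(trailing, 150)
```

Pooling gives seasons with more games more weight, which seems sensible for a stat as noisy as UZR, but it is a design choice rather than anything canonical.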

With those preliminaries out of the way, let’s start this year’s predictions off by going through the 2013 Nationals’ projected 25-man roster:

Starting Rotation

• Stephen Strasburg, xFIP 2.56
• Gio Gonzalez, xFIP 3.81. I do not believe Gio will be subject to a suspension for his alleged involvement in the Biogenesis scandal. I explained my view on the situation here.
• Jordan Zimmermann, xFIP 3.71.
• Ross Detwiler, xFIP 4.44
• Dan Haren, xFIP 3.37.

Starting Position Players

• Danny Espinosa, 2B
• Ryan Zimmerman, 3B
• Ian Desmond, SS
• Bryce Harper, LF
• Denard Span, CF.
• Jayson Werth, RF
• Wilson Ramos, C
• Kurt Suzuki, C. I have Ramos and Suzuki splitting playing time evenly.

Bench

• Tyler Moore, OF/1B
• Steve Lombardozzi, IF/OF

Bullpen

• Rafael Soriano, xFIP 3.6, Primary Closer
• Drew Storen, xFIP 3.46, Primary Set-up, Back-up Closer
• Tyler Clippard, xFIP 3.54
• Ryan Mattheus, xFIP 4.48
• Zach Duke, xFIP 4.34, Left-handed long reliever/Spot starter
• Craig Stammen, xFIP 3.96, Right-handed long reliever/Spot Starter
• Bill Bray, xFIP 4.19, Left-handed one-out guy. This is probably the most controversial pick; others might put Henry Rodriguez or Christian Garcia here instead. But I’m going to assume Bray heads north with the club.

No surprises, then. Stay tuned as we discuss pitching and defense in Part 2 of our projections.

# Baseball Eve!

## The Boys Are Back in Town!

No real insights for you today on the day before Pitchers & Catchers report to Viera. Federal Baseball already has some early photographic evidence of baseball returning to Viera. Highlights include Jordan Zimmermann and Drew Storen rocking the quasi-official Beastmode T-Shirt introduced to the ’11 Nats by Ian Desmond and made famous by Michael Morse. But who’s that shaking hands with Tyler Clippard? The #tigerbeatbaseball girls want to know. (It’s not Ryan Tatusko, though. I checked that already.)

Two statistically-related things that I’ve been thinking about lately, though:

## Lost in Translation

Given the number of major league players and prospects who play in the Latin American winter-ball leagues in Venezuela, the Dominican Republic, and Mexico, it’s remarkable to me how hard it is to get reliable statistical information out of those leagues. The leagues have their own stats pages, to be sure. For instance, the Venezuelan League’s stats pages are pretty comprehensive. But it’s not exactly easy to find the player you’re looking for. Moreover, calculating advanced statistics like wOBA and wRC is pretty much impossible. The worst has got to be wRC, because it depends on calculating a league average wOBA. To do that for the Venezuelan league, I’d have to key in all the data for all players into another spreadsheet and run the calculations from there. The calculating isn’t too bad, but the data entry will take more time than I’m willing to commit (it’s not like sabermetrics is my job, y’know–and if it were, I’d be pretty terrible at it).
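To make the dependency concrete, here is a sketch of the wRC calculation using the commonly published formula. The league inputs (league wOBA, the wOBA scale, and league runs per plate appearance) are exactly the pieces that would have to be built by hand for a winter league. The PA-weighted league average below is an approximation, since wOBA's true denominator isn't quite plate appearances, and every sample number is hypothetical.

```python
def league_woba(players):
    """Approximate league wOBA as a PA-weighted mean.

    players: list of (woba, plate_appearances) tuples.
    """
    total_pa = sum(pa for _, pa in players)
    return sum(woba * pa for woba, pa in players) / total_pa


def wrc(woba, plate_appearances, lg_woba, woba_scale, lg_runs_per_pa):
    """Weighted runs created for one batter."""
    runs_per_pa = (woba - lg_woba) / woba_scale + lg_runs_per_pa
    return runs_per_pa * plate_appearances


# Hypothetical two-player "league" and a hypothetical batter.
lg = league_woba([(0.300, 100), (0.400, 300)])
batter_wrc = wrc(0.350, 600, lg_woba=0.320, woba_scale=1.25,
                 lg_runs_per_pa=0.115)
```

The arithmetic is trivial once the league table exists; it is keying in every player's line to build that table that eats the time.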

As an aside: reading statistical tables and box scores in Spanish reminded me that my Spanish isn’t as good as it ought to be. Baseball stats are cryptic enough in English, but they can be pretty opaque in Spanish. Glossaries do exist, but I’ve had to bring in an outside consultant for help with a few.

If you’re at all interested in Latin American baseball stats, PuraPelota has the most complete database I’ve been able to find, but they can be a bit slow on the update cycle. I haven’t been able to find anything nearly as complete or helpful for any of the Asian leagues (Japan, Korea, Taiwan). I can’t understand why that would be so–surely the Sabermetric revolution has spread all across the baseball world? Nothing makes you appreciate the excellent work that Baseball Reference and Fangraphs do quite like dealing with the sparse data available for foreign baseball leagues.

## Eye in the Sky

I’ve already written about this post at Línea de Fair, but I can’t help but take a closer look at one of the author’s objections to UZR:

…UZR, the measure employed to determine whether a fielder has more range than his teammates, and whether, on the whole, he can prevent opponents from creating more runs. Joey Cora used to remind me how an infielder could be better depending on which pitcher was on the mound. This was due not only to the pitches, but also to the control the pitcher has over them. “What happens if a catcher calls for a sinker inside,” Cora asked. The shortstop moves a little, almost imperceptibly, towards the hole if the batter is right-handed. But if the pitcher leaves the ball outside, the roller could go up the middle of the infield. Result? A higher probability that the batted ball goes up the middle of the field and finds the shortstop further away from it–thus raising his UZR.

My initial reaction is that complaining that UZR may not describe one particular defensive alignment and situation is like complaining that the Ideal Gas Law won’t tell you exactly where to look for one particular carbon dioxide molecule in a tank full of compressed air.

Part of the problem, I think, is that UZR is the one baseball statistic in (quasi-) common use that is flat-out impossible to derive from other published statistics. As far as I can tell, the whole process depends on individual human beings watching game footage, noting where fielders are positioned, and noting where the fielder meets (or doesn’t meet) the ball.

Because I’m lazy, I figure that there must be a better way to do things–or at least one that isn’t so unbearably tedious. We already have fairly sophisticated software that can track the location of, say, baseballs and baseball gloves as they move across a camera’s field of view. It should be a fairly simple matter to fix a wide-angle camera (or several) across a baseball field, record the whole game, and only have human intervention whenever the ball strikes the bat. An observer might tap one button when he sees the impact of the ball on the bat, and then tap another when the ball comes to rest (either in the glove of the fielder, or out of play). The end result might look something like FlipFlopFlyball’s defensive positioning infographic.

The genius of computing, however, would allow us to track each defensive move as a vector, with an origin point at wherever the defender started when the ball was put in play, and an endpoint at wherever he was standing when the play was over. I’m not so great at mathematics, but I imagine the resulting graphical representations (and statistical inferences!) that could be made from those data would be extremely useful in evaluating the range of any individual defender. Heck, maybe it wouldn’t be too hard to explain– if I only had a brain!
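A toy sketch of what one of those records might look like, assuming the tracking software hands back (x, y) positions in feet and the two button taps give the elapsed time. The class and field names are invented for illustration, not taken from any real tracking system.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FielderMove:
    """One defensive move, from ball-in-play to end of play."""
    start: Tuple[float, float]  # position at bat-on-ball, in feet
    end: Tuple[float, float]    # position when the play ends, in feet
    seconds: float              # time between the two button taps

    def displacement(self) -> Tuple[float, float]:
        """Vector from the starting point to the endpoint."""
        return (self.end[0] - self.start[0], self.end[1] - self.start[1])

    def distance(self) -> float:
        """Straight-line distance covered, in feet."""
        dx, dy = self.displacement()
        return math.hypot(dx, dy)

    def average_speed(self) -> float:
        """Feet per second along the straight line."""
        return self.distance() / self.seconds


# Hypothetical play: fielder ends up 30 ft over and 40 ft back in 4 seconds.
move = FielderMove(start=(0.0, 0.0), end=(30.0, 40.0), seconds=4.0)
```

Pile up a season of these for one defender and the spread of distances and directions starts to look like a picture of his range, though a straight line will understate the ground covered on any play with a curved route.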

# How Good Does Bryce Harper Have to Be?

Keen readers of this blog–both of you–will have noticed one glaring omission among all of my calculations. I have thus far decided not to include a certain 19-year-old catcher-turned-outfielder who last saw limited playing time at AA Harrisburg.

In a recent column, the Washington Post’s Jason Reid suggested that Bryce Harper needs to grow up. Given that this is the same Jason Reid whose journalistic insight into the Redskins’ quarterback situation early in the 2011 season gave Washington sports fans–and journalism as a whole– the biggest “Doh!” moment since the night Dewey beat Truman, I was moved to tweet:

The fact that @JReidPost raised doubts about @BHarper3407 making the team leads me to conclude Harper WILL make the #nats opening day team

Well, if Harper does make the Opening Day roster, how good does he have to be to do no harm to a squad already projected for 86 wins?

Let’s assume Harper is an everyday player. There’s no indication so far that he can play center field. The Nats don’t have anyone available with a positive UZR as a center fielder except Werth. So let’s put Harper in right field. Here’s the most dangerous assumption of them all: assume Harper is a totally average defender.

## Assuming a Healthy LaRoche

Let’s also assume that Adam LaRoche is healthy and ready to be his usual self at first base. That rounds out the outfield as Morse, Werth, and Harper.

Someone needs to get bumped off the bench. Given that the Nats went out and got DeRosa and Ankiel, that leaves Roger “The Shark” Bernadina the odd man out, so we need to assume that The Shark doesn’t break camp with the Nats.

Assuming everybody’s an every-day type player, we’ll need to cut down DeRosa’s plate appearances, to reflect his status as a real bench player and not half of a platoon. Let’s give him 250 plate appearances. Same with Ankiel.

As constructed and run through my model, this Harper-less squad is good for 83 wins. Were he to join the Nats as the opening-day right fielder, Harper would need to have a wRC of 25–that is, “create” 25 runs over 162 games.

What does 25 wRC look like? It looks like an outfielder not much worse than Aaron Rowand of the Giants, who posted 27 wRC in 2011. Rowand batted .233/.274/.347 with 30 extra-base hits (including 4 home runs) in 2011. That’s a pretty low bar to clear.

## But What If LaRoche Isn’t on the Team?

The situation becomes more complicated if LaRoche is not healthy. Morse has to move to first base. Werth slides to center, Harper moves into right. Left field sees a Bernadina/DeRosa platoon. Cameron and Ankiel come along for the ride as bench players. What does this look like now? Not too good, I’m afraid: 73 wins.

To do no harm to the team in this situation, Harper would need to be worth 90 wRC. What does a 90 wRC outfielder look like? Consider Matt Holliday of the Cardinals, who posted exactly 90 wRC in 2011. In 2011, Holliday batted .296/.388/.525 with 36 extra-base hits (including 22 home runs). That’s a much taller order.

To put the sheer magnitude of that task into perspective consider this: in 147 plate appearances with AA Harrisburg, Bryce Harper posted a wRC of 18. Normalizing that to the 600 plate appearances one might expect to see out of an every-day player, that would have given Harper an expected wRC of 83.72. Harper would have to hit major-league pitching better than he hit AA pitching to even have a chance of doing no harm to the team in this situation.

Fangraphs’ RotoChamp projection sees Harper with 259 plate appearances in 2012, projecting a wRC of 36 from those plate appearances. Even if we normalize this to 600 plate appearances, that only gets us to 83.39 wRC–not quite good enough for our purposes.
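The normalization here is just a per-plate-appearance rate scaled up to a full season. A minimal sketch, using the RotoChamp figures quoted above (the function name is made up):

```python
def scale_wrc(wrc, plate_appearances, target_pa=600):
    """Scale a wRC total to a different number of plate appearances,
    assuming the per-PA rate holds steady."""
    return wrc / plate_appearances * target_pa


# RotoChamp projection: 36 wRC in 259 PA, stretched to 600 PA.
full_season = scale_wrc(36, 259)

# Gap to the 90 wRC break-even mark in the no-LaRoche scenario.
shortfall = 90 - full_season
```

That works out to roughly 83.4 wRC, which leaves Harper about six and a half runs shy of the 90 he would need. The big assumption is that the per-PA rate holds steady over a workload more than twice as large.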

That’s how good Bryce Harper has to be. The real question is: how good is Bryce Harper? Only he can show us if he’s as good as he has to be. For the sake of Nats fans everywhere, I hope he shows us he’s much better than even that.

# Reason, Passion–and Reasonable Expectations

If you read Spanish at all, read this post over at Línea de Fair. It discusses baseball, the philosophy of science, semiotics, Sabermetrics, and the experience of being a fan all in a single post. One paragraph in particular caught my attention (translation is mine):

The baseball fan and the baseball analyst–sometimes the roles are confused, but both are delighted to see a good ballgame–try to explain the logic of the game and to predict what might happen next in the same way that man used to try to find the reason why the sun rose every day, or why the rain fell. The dynamism and insight of the Society for American Baseball Research (SABR, by its English initials) has generated new explanations, very much in vogue these days, which have been the origin of a feverish debate similar to that between the Apollonians and the Dionysians….

The author points to a divide in the philosophy of science between those who believe that reality can be described by the application of reason (Apollonians) and those who doubt that human reason can possibly explain the whole world (Dionysians). This is a tension that I as a baseball fan feel very strongly.

On the one hand, there is a certain unknowable, aesthetic quality to baseball. When I see Danny Espinosa leap and pluck a line drive out of the air, turning as he lands to double off the runner taking a lead off first base, I am watching something no less beautiful or graceful than a ballet. But even though I might witness that play at Nats Park with twenty or thirty or forty thousand of my closest friends, not one of them will feel quite the same way as I do when I see it. We can communicate those memories to each other, and compare them, but those emotions are really ours alone. And no matter how many times Debbi Taylor (or her successor) asked the Nats’ hero of the day to describe what was going through his mind as he made a game-changing play, neither Debbi nor anyone watching will ever really know how it felt to make that play. That’s a wholly subjective, unknowable experience. Our emotional bond with baseball is made of countless such memories–each of them precious, each of them irreplaceable, and each of them utterly incommunicable.

But then, I spend an awful lot of time perusing statistics. The cynic might suggest that this kills the joy of going down to the baseball game at all. After all, stats don’t tell stories as much as they open windows into specific questions: Which is the most effective pitcher? Who bats better, overall? How good is this player’s defense? Indeed, on this blog, I’ve tried to use my rudimentary grasp of statistics to open a window on the 2012 Nationals season yet-to-be.

All of this mucking about with cold rationality has affected me as a fan–but, I think, for the better. I started my 2012 projections project because I was sick and tired of hearing all the emotional overreactions to the Prince Fielder free agency drama on my Twitter feed. The Nationals, so it went, were going to be world-beaters with Fielder and terrible without him. That looked like a proposition I could test, so I did, the best way I could.

As FDR might have said, the only thing Nats fans have to fear is “Fear itself: nameless, unreasoning, unjustified terror, which paralyzes needed efforts to convert retreat into advance.” My calculations put the Nationals anywhere between 84 and 86 wins–on track for their best season since arriving in DC. And, even in a “doomsday scenario” without Adam LaRoche, the Nats look to get anywhere between 79 and 81 wins.

Think about what that means. It means that the worst I can expect from the 2012 Nationals is that they’ll have an even chance of winning any given ballgame on any given night. As a fan, that’s all I can reasonably ask for, anyway. If that’s the worst I can expect, I can put my unreasoning, unjustified terror aside and enjoy the visceral joys of watching Espinosa and his teammates doing beautiful things on a baseball field. It might not be a perfect synthesis of reason and passion, but I’ll take it.

# Round and Round it Goes

It’s been a dizzying day in Natstown. First came the news that the Nationals had prevailed in the salary arbitration case against John Lannan, netting Lannan a \$5 million salary instead of the \$5.7 million he asked for.

Just when everybody thought it would be time to put the arbitration proceedings behind us and focus on baseball, Natstown was rocked by the news that Bavarian-born St. Louis hurler Edwin Jackson had joined the Nats for a one-year deal valued at about \$10 million.

Wait, WHAT?

I guess that means Rizzo’s going to trade Lannan and acquire that mythical center fielder, right? Well, not really. “We did not acquire Edwin Jackson to trade another starting pitcher,” said Rizzo.

If we’re to take Rizzo at his word–something I myself am loath to do–where does that leave the 2012 Nationals?

The Nats can’t possibly break camp with all of that starting pitching. Someone has got to go on the pitching staff. It can’t be Detwiler, who’s out of minor-league options. I very much doubt that it will be Wang. The only pitcher in the rotation that comes to mind with minor league options left is…the five-million dollar man, John “Long Ball” Lannan.

So Jackson must replace Lannan. What does that mean? Well, between 2008 and 2011, Edwin Jackson had a FIP of 4.13 (as opposed to Lannan’s 4.57). Assigning him the 180 innings that I gave Lannan in my previous projections, Edwin Jackson’s better pitching is worth one extra win. That’s right, Natstown: with Edwin Jackson instead of John Lannan, the 2012 Nationals would be projected for 85 wins.
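A back-of-the-envelope check on that extra win. FIP is scaled like ERA, so the gap between the two pitchers converts to runs over a given number of innings. The ten-runs-per-win conversion is a common rule of thumb swapped in here for illustration; the projection itself runs through the Pythagorean formula rather than this shortcut.

```python
def runs_saved(fip_old, fip_new, innings):
    """Runs saved by swapping pitchers, treating FIP as runs per nine."""
    return (fip_old - fip_new) * innings / 9


RUNS_PER_WIN = 10.0  # rule-of-thumb assumption, not from the projection model

saved = runs_saved(fip_old=4.57, fip_new=4.13, innings=180)
extra_wins = saved / RUNS_PER_WIN
```

That comes to about 8.8 runs saved, or a shade under one win, which squares with the one-win bump above.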

It could potentially get better. Edwin Jackson is a much better batter than Lannan. I project he will be worth 1.16 wRC in 2012. So what? That nudges the win total up to 86.

That’s a lot of wins for a ballclub that’s only broken even once (the magical 2005 Nationals!). But that hasn’t stopped some observers from envisioning big things for the Nats. Who would have believed that Buster Olney was going to put the Nats in the wildcard?

My best guess here is that the fans in Syracuse will be treated to John Lannan for a good while…until a deal can be struck trading Lannan, Bernadina, Detwiler, and possibly Lombardozzi for a capable center fielder.

Also, what shouldn’t be lost in all this is that John Lannan has been a pretty good pitcher for the Nats, all in all. As a friend of mine remarked right after the arbitration award was announced, “Lannan has done yeoman’s work for the Nationals during some of their darkest years.” Even if he didn’t get all of what he asked for, he deserved at least some of it.

# The Limits of Prescience

A thread over at the Washington Nationals Fan Forums pushed back against some of my projections here and raised a few points that I neglected to address in my 2012 projections.

## Margins of Error

Interesting projections but the missing piece would be an estimate of how much of a margin of error there would be for both the offensive and defensive estimates that would provide a range for the expected number of wins as opposed to a hard number.

This was a serious omission on my part. All projections have a certain degree of uncertainty built into them, and I really should have discussed the degree of uncertainty built into mine.

I took my method for calculating the projected runs allowed by pitching and defense from this site. The author tested this method against 7 years of complete season data from 2002 through 2008. As he writes:

I found the R^2 value. Not to oversimplify things too much, but this value basically shows what percentage of the variation can be accounted for by the model. The value ranges from 0 (worthless) to 1 (perfect). For my 210 data points, I had an R^2 value of about 0.78 (i.e. 78% of the variation).

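Strictly speaking, an R^2 of 0.78 doesn’t translate directly into a margin of error, but as a crude stand-in I’ll treat the unexplained 22% as a plus-or-minus band on my defense and pitching runs allowed projection. That gives a lower bound of 482.84 runs allowed and an upper bound of 755.20 runs allowed.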

If we assume that my offensive predictions are correct (a problem I’ll get to in a second), that means the 2012 Nats will win anywhere between 68 and 103 games.
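Here is a sketch of how the runs-allowed band turns into a band of wins, holding the offense fixed. The runs-scored figure is a placeholder chosen for illustration, since the offensive projection isn't restated in this post; the two runs-allowed bounds are the ones quoted above.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162):
    """Expected wins from the Pythagorean formula with exponent 2."""
    rs2 = runs_scored ** 2
    return games * rs2 / (rs2 + runs_allowed ** 2)


RUNS_SCORED = 644.0  # hypothetical placeholder, not a figure from the post

LOW_RUNS_ALLOWED = 482.84   # optimistic bound quoted above
HIGH_RUNS_ALLOWED = 755.20  # pessimistic bound quoted above

best_case = pythagorean_wins(RUNS_SCORED, LOW_RUNS_ALLOWED)
worst_case = pythagorean_wins(RUNS_SCORED, HIGH_RUNS_ALLOWED)
```

With that placeholder offense, the two bounds land in the neighborhood of 68 and 103 wins. The takeaway is how sensitive the win total is: a swing of a couple hundred runs allowed moves the projection by about 35 games.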

I know that’s an immense difference. I’m not sure how I could close that gap. UZR doesn’t account for pitcher or catcher defense, for instance. But even then, I think the method at least gets us in the ballpark.

The offense numbers are a lot more troublesome. I haven’t been able to do any real regression analysis to determine how good my model is–I simply haven’t had the time.

On the other hand our offense has way too many question marks to estimate the total number of runs scored with enough precision to come up with a meaningful value that can be used in a secondary projection as you did in calculating our win total.

Any type of future projection is likely to involve more than a little handwaving. Here, I’ve drawn an arbitrary line: all players included in this analysis are players on the Nats’ 25-man roster as of January 27, 2012, some 23 days before pitchers and catchers are due to report at Viera.

## Individual Players and the Projections

Will Werth stay Werthless?

2011 Jayson Werth was astonishingly bad. I’m going to believe that his 2011 numbers are aberrations and not indicative of a “new normal.” I’m fairly confident that the 4-year average from 2008-2011 is a fair picture of what kind of player Werth is now–somewhere between his Philly days and the debacle of 2011.

Will Desmond, Ramos, and Espi improve or stagnate?

As far as Desmond and Espinosa, I have no idea. I don’t think I have nearly enough data about them to make any predictions going forward. Ramos, however, gets a nice bump from more playing time and more PAs. His wRC/PA isn’t terrible, so that’s to be expected.

Will Morse fall back to Earth?

I’m going to go ahead and say No. As I said in Part 3, Morse’s modest offensive outputs in 2008-2010 might make you think that he’s going to crash down to Earth in 2012. But, remember, I’ve taken a four-year average of his wRC/PA over the same period. Giving Morse 600 plate appearances in 2012 gives a projected wRC of 97.00: exactly the same as his breakout 2011 “beastmode” year. Indeed, even if we throw out Morse’s 2011 season, running the same calculation over data from 2008-2010 yields a projected wRC of 90.00: seven runs short of our prior projection and of the 2011 total, but still enough to make him almost as good as Ryan Zimmerman (projected for 90.69 wRC). Indeed, all of this taken together seems like pretty persuasive evidence that “beastmode” has been lurking inside him the whole time, and only needed to see enough PAs.

Will Zimm get hurt again? Will LaRoche bounce back?

My response: Dammit, Jim, I’m a baseball fan, not a doctor! I have really no good way of figuring out LaRoche’s prognosis post-surgery, nor can I really know anything about the state of Zimmerman’s joints and muscles. The only real response I have here is that the four-year interval I picked should be fair to both men in terms of their expected production.

Who plays centerfield?

Again, I had to draw an arbitrary line and go with who was in the organization as of the day I began compiling the statistics. That means that for now, we’re looking at a DeRosa/Bernadina platoon in center field. This might not be ideal, but I didn’t want to mix players who weren’t officially in the organization into these projections. Blown Save, Win, however, has attempted to address the center field question in a recent post, where he suggests that perhaps the short-term answer is Rick Ankiel. I’ll have to go back and study this, obviously.

# Projecting the 2012 Nationals, Part 4: Setting Expectations.

In this fourth and final installment of my series on projecting the 2012 Nationals season, we’ll put together everything we’ve learned about the 2012 Nationals so far and make a final, bold prediction of the Nats’ won/loss record.

Actually, what the hell, let’s get the prediction out of the way first: The 2012 Nationals will win 84 games and lose 78, for a winning percentage of .520 on the season.

Remember I said back in Part 1 that a baseball team’s winning percentage can be estimated fairly accurately using the Pythagorean win expectation formula:

$\text{Win} = \frac{\text{Runs Scored}^2}{\text{Runs Scored}^2 + \text{Runs Allowed}^2} = \frac{1}{1 +(\text{Runs Allowed}/ \text{Runs Scored} )^2}$

Plugging the data we collected in Part 2 and Part 3, that comes out to a final winning percentage of .520. Multiply that by a 162-game season, and that gives you 84 wins.
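For anyone who wants to play along at home, a minimal sketch of that last step. It uses the right-hand form of the formula, which needs only the ratio of runs allowed to runs scored. The function names are made up for illustration.

```python
def win_pct(runs_scored, runs_allowed):
    """Pythagorean winning percentage with exponent 2."""
    return 1 / (1 + (runs_allowed / runs_scored) ** 2)


def projected_wins(pct, games=162):
    """Convert a winning percentage to whole wins over a season."""
    return round(pct * games)


# A .520 winning percentage over a 162-game season.
wins = projected_wins(0.520)
losses = 162 - wins
```

A .520 clip times 162 games is 84.24, which rounds to the 84-78 record above. A team that scores exactly as many runs as it allows comes out at .500, as you would hope.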

That’s not bad! In fact, it’s 4 more wins than the Nats got in 2011, and it would be more wins than the Nationals have ever gotten in a season since coming to D.C. I don’t know about the rest of you, but I’m very excited by this.

Now all we need to do is convince Ted Lerner to bring back fireworks at Nats Park so we can all hear Charlie Slowes make his signature “Bang, Zoom Go the fireworks! A Curly ‘W’ is in the books!” call as God intended.

