Natstradamus

Analyze Facts, Not Entrails

Natstradamus Projections for 2015: REPLY HAZY TRY AGAIN

Posted on April 4, 2015 by ouij

And now the moment you’ve all been waiting for: the Natstradamus projections for the 2015 season!

This year, the projection comes with a major caveat: If Ryan Zimmerman is no worse than a league-average defensive first baseman, the Washington Nationals are projected to win between 95 and 98 games.

Just to refresh your recollection (because Lord knows I need to refresh mine every year), I use a pretty simple projection system to come up with the Nats’ won/loss totals for the year. The whole thing is based off Bill James’s Pythagorean Expectation, and it’s a satisfyingly intuitive way to figure out how good your team is. In baseball, you win if you score more runs than you allow. The Pythagorean Win Expectation model reflects that.

Imagine the whole baseball season compressed into two halves of one inning at Nats Park. First, we need to fill up the lineup card of players. Then, we need to know who these players are–I use a four-year trailing average as the basis for these calculations. Then, we need to figure who plays where and when. This is the greatest acknowledged weakness of my system, as I have somewhat arbitrarily assigned playing time based on my impressions of injuries, etc.

At the top of the inning, the visiting team comes to bat. The result of that half-inning will be runs allowed. Any upper-deck crank will tell you that there are two ways you can allow a run, generally: by pitching badly (giving up tons of walks and homers) or by fielding badly (not getting to balls hit in the gap, dropping fly balls, committing errors). The same upper-deck crank will tell you that you can get out of the inning with good pitching (striking everyone out) and great fielding (robbing home runs, showing ridiculous range, gunning down runners with your arm). In my model, I base pitching runs allowed off a pitcher’s FIP (I also use xFIP as an alternative, which normalizes pitcher home runs allowed to a league-average home-run/fly-ball rate). Defense is handled by UZR, which handily expresses defense as the number of extra runs allowed or saved.

At the bottom of the inning, the Nats come to bat: time to score some runs. I use Weighted Runs Created for each batter. Since that’s a counting stat, I divide that by the number of plate appearances over the last four years to get the number of runs created per plate appearance. I multiply this by the number of projected plate appearances (an everyday player will get about 600 plate appearances). That’s the number of runs on the board.
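As a rough sketch of that arithmetic in Python: the function name is my own shorthand for illustration, and the example figures are Jayson Werth's four-year totals from the 2013 projection table further down this page.

```python
def projected_wrc(total_wrc: float, total_pa: int, projected_pa: int) -> float:
    """Scale a trailing wRC total to a projected number of plate appearances."""
    if total_pa == 0:
        # No track record at the plate: project nothing rather than divide by zero.
        return 0.0
    wrc_per_pa = total_wrc / total_pa
    return wrc_per_pa * projected_pa


# Werth: 425 wRC over 2803 PA, projected for an everyday 600 PA.
werth = projected_wrc(total_wrc=425, total_pa=2803, projected_pa=600)
print(f"{werth:.2f}")  # 90.97
```

Summing that figure over everyone on the roster gives the runs-scored side of the ledger.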

When that’s over, I come to some conclusions.

The 2015 Nats pitching staff is projected to allow between 530 (using FIP) and 562 (using xFIP) runs. The 2014 Nats actually allowed 555 runs–and we were already amazed at how good the pitching was last year.

This is a good thing because there is too much uncertainty about the defense to have any real confidence. UZR is notorious in that it needs a pretty big sample size to stabilize–the rule of thumb is that 3 years’ worth of data for an everyday player is what you’d need for the stat to be of any real use. Unfortunately, Ryan Zimmerman, first baseman, is a relatively new creation. His limited time at first base resulted in a comically bad UZR/150 (i.e., what UZR would be if he played 150 games at first base for a year) of -109.1. If true, it would mean that Ryan Zimmerman’s first base defense would be costing the Nats over 20 more runs than he would stand to get them at the plate (~88, by my calculations). That’s hard to stomach. If we follow the model blindly, though, we end up with the defense costing the Nats’ excellent pitching just over 97 runs. If we back off and assume Ryan Zimmerman is at least a league-average first baseman, the defense improves significantly, actually saving just under 12 runs.

So, if you think Ryan Zimmerman is a 100-run liability at first base (and I doubt very much that this is the case), the pitching and defense combined concede between 627 and 659 runs (totals not seen since 2011, when the Nats allowed 643 runs). If you think Ryan Zimmerman is a league-average first baseman, the pitching and defense combine to allow between 518 and 550 runs (as good as or better than the 2014 Nats).

Turning now to the batting, things are more straightforward. The model projects the Nats will score 652 runs. This is lower than last year’s observed total of 686 runs. The projection reflects my pessimism regarding Rendon’s playing time and the speed at which Span and Werth can return to the lineup. I will be very happily proven wrong on this point, though.

Add it all together, and you end up with 95 to 98 wins if Zim is at least a league-average first baseman. If he is the nightmare that the tiny and highly unreliable sample of data UZR has to work with suggests, things are much less rosy, with the Nats winning between 80 and 84 games, and likely missing the playoffs.
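For anyone who wants to play with the numbers, here is a minimal sketch of the last step in Python. It assumes the classic Pythagorean exponent of 2 and a 162-game season; the exact exponent is an assumption on my part for this sketch, and the function and variable names are just labels I made up.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed (Bill James's Pythagorean Expectation)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


RUNS_SCORED = 652
PITCHING_RUNS = (530, 562)  # FIP-based and xFIP-based projections
# Runs the defense costs (positive) or saves (negative), per the two scenarios above.
DEFENSE_COST = {"UZR at face value": 97, "league-average Zimmerman": -12}

for label, cost in DEFENSE_COST.items():
    for pitching in PITCHING_RUNS:
        allowed = pitching + cost
        wins = pythagorean_wins(RUNS_SCORED, allowed)
        print(f"{label}: {allowed} runs allowed -> {wins:.1f} wins")
```

With an exponent of 2, the pessimistic scenario comes out at roughly 80 to 84 wins and the optimistic one in the mid-to-high 90s. The top end moves by a win or so depending on the exponent chosen and on how the inputs are rounded.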

I’m sorry, Nats Town.

Posted on July 24, 2013 by ouij

I try very hard not to blog out of emotion. There’s a lot of feeling out there–mostly on sports talk radio–and not enough thinking. If you follow me on twitter, you know that I’m pretty emotional when I watch Nats games. Lately, most of those emotions are bad.

So I want to apologize, Nats Town. I’m sorry. As winter thawed to spring, I projected the Nats for an unbelievable 98 wins and the division crown. As I write this, the Nats are 48-52, in third place behind Philadelphia, and eight games back from the division-leading Braves.

I don’t think they’re going to catch up.

When I projected the Nats to win all those games, I assumed two things: the starting lineup would be healthy, and everybody was going to perform in line with their four-year trailing averages.

By this time, my model would have expected the Nats to have scored some 448 runs. They have scored only 367 to date. The disappointments are all across the board.

Let’s look at the differences:

According to my pre-season model, Jayson Werth should have 63 wRC by now. He has accumulated only 44–a difference of 19 runs over 100 games. Given his tremendous performance since his return from the disabled list, we can safely assume that his time away accounts for the difference in runs.

Injury also robbed the Nats of nearly a month of Bryce Harper’s services. By now, according to my model, he should have accounted for 59 wRC. He has accumulated only 42: a difference of 17 runs.

More perplexing is the offensive decline of Denard Span. According to my model, he should have accounted for 54 wRC by now; he only has 40, a difference of 14 runs. Every time I see him, he seems to ground out sharply to second base–a gut feeling reinforced by the fact that his BABIP (batting average on balls in play) stands at .300, down from his career BABIP of .315. Perhaps he was due for a regression in BABIP eventually? I don’t know.

The single biggest offensive failure of the Nats in the first 100 games of the 2013 season was their stubborn insistence on Danny Espinosa. We now know that Espinosa was suffering through a number of injuries that sapped him of power–his ISO (Isolated power) numbers dropped from a career .165 to .114 this year. The power outage, coupled with his high strikeout rate (28.1% this year, slightly up from his career K% of 27.1%), rendered him an offensive black hole and an automatic “out” for opposing pitchers. Had Espinosa been at least as healthy as he was in 2011 and 2012, my model expected him to have accumulated 42 wRC by now. He accumulated 4. That’s a difference of 38 runs.

Put another way: if Danny Espinosa had been as I expected him to be this year, and if the Nats had allowed exactly as many runs as they have to this point (392 runs), the Nats would be five wins better.
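Here is a small sketch of that counterfactual in Python, assuming a Pythagorean exponent of 2 (the exponent and the names are assumptions for illustration). It takes the 367 runs scored and 392 runs allowed through 100 games, then adds back the 38 runs of missing Espinosa production.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


GAMES_PLAYED = 100
RUNS_SCORED = 367
RUNS_ALLOWED = 392
ESPINOSA_SHORTFALL = 38  # projected 42 wRC minus the 4 he actually produced

actual = GAMES_PLAYED * pythagorean_pct(RUNS_SCORED, RUNS_ALLOWED)
with_espinosa = GAMES_PLAYED * pythagorean_pct(RUNS_SCORED + ESPINOSA_SHORTFALL, RUNS_ALLOWED)

print(f"{with_espinosa - actual:.1f} extra expected wins")
```

The gap works out to a shade under five wins, which is where the "five wins better" figure comes from.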

Put yet one more way: Danny Espinosa was so bad compared to how I projected him that the shortfall that he created in my projections is greater than the shortfall created by the injuries to Harper and Werth combined.

Espinosa’s offensive failings, we can say, helped put the Nats in a very deep hole–one that they might not manage to climb out of. Every team struggles. Nobody in the NL East seems to be winning as I write this. And yet, the Nats have fallen into third place because of their lousy start.

This wasn’t Rick Eckstein’s fault, or Davey Johnson’s, or, really, Danny Espinosa’s fault. This was General Manager Mike Rizzo’s fault. His “scout’s eye” might have told him something was wrong with Espinosa. The power outage was, in hindsight, evident from the beginning and showed no sign of abating. He knew that Espinosa had at least two injuries that were likely causing his offensive struggles. And yet, for months, Rizzo did nothing, despite the fact that Espinosa had a minor-league option left.

Instead, we kept telling ourselves that it was early, that things were going to come around. For some, things did come around–notably Jayson Werth and Wilson Ramos (the latter of whom, I should add, is 10 wRC better than my model had him at this point of the year, despite having played a fraction of the time due to an extended DL stint). But for Espinosa, they never did.

I wish I could offer some hope. I wish I could tell you that, no, the Nats offense had every chance of breaking out. I can’t. This is what we’ve got to look forward to.

I’m sorry, everybody. I’m really, really sorry.

Projecting the 2013 Nationals, Part 3: Offense

Posted on February 18, 2013 by ouij

Now we come to the fun part of the inning: how many runs does the home team score? The model projects that the 2013 Nationals will score 693 runs.

Assuming that an everyday position player will get about 600 plate appearances, and assuming that the plate appearances of the two catchers, Suzuki and Ramos, are divided evenly, we end up with a table that looks something like this:


Player Name	4-year total PA	4-year total wRC	4-year moving avg wRC/PA	Projected PA	Projected wRC
Jayson Werth	2803	425	0.1516	600	90.97
Ryan Zimmerman	2844	426	0.1498	600	89.87
Tyler Moore	171	26	0.1520	150	22.81
Bryce Harper	597	86	0.1441	600	86.43
Adam LaRoche	2622	361	0.1377	600	82.61
Denard Span	2671	334	0.1250	600	75.03
Wilson Ramos	613	76	0.1240	300	37.19
Ian Desmond	1849	214	0.1157	600	69.44
Danny Espinosa	1428	164	0.1148	600	68.91
Roger Bernadina	1150	121	0.1052	150	15.78
Chad Tracy	845	85	0.1006	100	10.06
Kurt Suzuki	2703	274	0.1014	300	30.41
Steve Lombardozzi	448	42	0.0938	150	14.06
Stephen Strasburg	83	3	0.0361	150	5.42
Drew Storen	2	0	0.0000	0	0.00
Dan Haren	240	19	0.0792	150	11.88
Craig Stammen	90	3	0.0333	30	1.00
Jordan Zimmermann	166	4	0.0241	150	3.61
Zach Duke	226	1	0.0044	30	0.13
Tyler Clippard	14	0	0.0000	0	0.00
Gio Gonzalez	84	-5	-0.0595	150	-8.93
Ross Detwiler	97	-9	-0.0928	150	-13.92
Ryan Mattheus	1	0	0.0000	0	0.00
Rafael Soriano	0	0	0.0000	0	0.00
Bill Bray	0	0	0.0000	0	0.00

Team total wRC					692.78
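If you'd rather check the team total than take my spreadsheet's word for it, here is a minimal Python sketch that rebuilds it from the PA and wRC columns above. The variable names are mine, and relievers with no projected plate appearances are left out because they contribute nothing to the sum.

```python
# (player, 4-year PA, 4-year wRC, projected PA), copied from the table above.
ROSTER = [
    ("Jayson Werth", 2803, 425, 600), ("Ryan Zimmerman", 2844, 426, 600),
    ("Tyler Moore", 171, 26, 150), ("Bryce Harper", 597, 86, 600),
    ("Adam LaRoche", 2622, 361, 600), ("Denard Span", 2671, 334, 600),
    ("Wilson Ramos", 613, 76, 300), ("Ian Desmond", 1849, 214, 600),
    ("Danny Espinosa", 1428, 164, 600), ("Roger Bernadina", 1150, 121, 150),
    ("Chad Tracy", 845, 85, 100), ("Kurt Suzuki", 2703, 274, 300),
    ("Steve Lombardozzi", 448, 42, 150), ("Stephen Strasburg", 83, 3, 150),
    ("Dan Haren", 240, 19, 150), ("Craig Stammen", 90, 3, 30),
    ("Jordan Zimmermann", 166, 4, 150), ("Zach Duke", 226, 1, 30),
    ("Gio Gonzalez", 84, -5, 150), ("Ross Detwiler", 97, -9, 150),
]

team_total = 0.0
for name, pa, wrc, projected_pa in ROSTER:
    # Trailing wRC per plate appearance, scaled to projected playing time.
    rate = wrc / pa if pa else 0.0
    team_total += rate * projected_pa

print(f"Projected team wRC: {team_total:.2f}")
```

Changing any player's projected PA in the list shows how sensitive the total is to the playing-time guesses.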

As excited as we’ll all be to follow Bryce Harper in his quest to beat Mike Trout’s insane age-20 season, it’s instructive to look at this table. Jayson Werth and Ryan Zimmerman are projected to get 91 and 90 wRC respectively. Harper is expected to do great things–86 wRC–but it’s worth noting just how much a healthy Werth and Zimmerman mean to the Nationals line-up.

Notice also that the line-up is remarkably deep. Let’s look at it from the point of view of a possible batting order:

Denard Span, wRC 75.03
Jayson Werth, wRC 90.97
Bryce Harper, wRC 86.43
Adam LaRoche, wRC 82.61
Ryan Zimmerman, wRC 89.87
Ian Desmond, wRC 69.44
Danny Espinosa, wRC 68.91
Wilson Ramos, wRC 37.19; plus Kurt Suzuki, wRC 30.41

Those first five batters, however you order them, are pretty impressive. That should make for a much deeper line-up than we’re used to seeing here in DC.

So, what does this all mean? Tune in next time as we discuss how this all fits together in Part 4.

Projecting the 2013 Nationals, Part 1: Ground Rules & Starting Line-ups

Posted on February 18, 2013 by ouij

Spring Training is well underway down in Viera. This is the season, then, of portents and omens–the latest and most amusing of which was the story of an osprey dropping a fish onto the Nats outfield. Not having any expertise in the art of augury, I don’t think I can really comment about the auspiciousness or inauspiciousness of such an omen for the upcoming season.

What I can offer you, however, is the results of my own admittedly crude projection system. Long-time readers will know that I like to think of the baseball season as a single inning of a baseball game writ very very large. In the top of the inning, we see the home team take the field, and see how good the pitching and the defense are at getting opposing batters out. In the bottom of the inning, we watch the home team at bat, and see how well they drive in runs. Then we count the runs allowed in the top of the inning and the runs scored in the bottom of the inning–if the home team scored more runs than the other team, they win.

If you want the nuts and bolts of my projection system, please, read the post I’ve linked above. It describes the general outline of the system as clearly as I can.

This year, however, I’m making a few changes to the Natstradamus projection system.

First, in pitching, I have replaced FIP with xFIP. I don’t know enough about home run/fly ball rates to tell, really, which pitchers are “lucky” or “unlucky” with respect to how many home runs they give up on fly balls. xFIP fixes that for me by normalizing runs allowed by a pitcher to a league-average home run/fly ball rate. Some pitchers get better; other pitchers get worse; but I think overall that might be a more fair way of evaluating pitchers for the purposes of this projection system.
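For the curious, here is a hedged Python sketch of the difference between the two stats, using the standard published FIP formula. The FIP constant (3.10) and league home run/fly ball rate (10%) are placeholder assumptions that change from season to season, and the pitcher's line is invented purely for illustration.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """Fielding Independent Pitching from home runs, walks, hit batters and strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, lg_hr_per_fb, bb, hbp, k, ip, constant):
    """FIP with actual home runs replaced by fly balls times the league HR/FB rate."""
    expected_hr = fly_balls * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


FIP_CONSTANT = 3.10   # assumption: varies by season
LG_HR_PER_FB = 0.10   # assumption: varies by season

# Hypothetical pitcher: 200 IP, 12 HR on 180 fly balls, 60 BB, 5 HBP, 200 K.
lucky_fip = fip(hr=12, bb=60, hbp=5, k=200, ip=200, constant=FIP_CONSTANT)
lucky_xfip = xfip(fly_balls=180, lg_hr_per_fb=LG_HR_PER_FB, bb=60, hbp=5,
                  k=200, ip=200, constant=FIP_CONSTANT)
print(f"FIP {lucky_fip:.2f}, xFIP {lucky_xfip:.2f}")
```

This made-up pitcher allowed fewer homers than his fly balls would predict, so his xFIP comes out higher than his FIP. That is the "some pitchers get worse" case.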

Second, I have tweaked the defensive calculations slightly. Instead of using raw UZR, I have calculated a UZR/game, and then multiplied that by the number of games in which I expect each player to appear. Again, this is crude, and defensive metrics are highly unstable anyway, but hey, it’s all I’ve got.
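The defensive tweak is the same kind of scaling as on offense. A minimal Python sketch, with a made-up fielder (the 20 runs saved over 500 games is hypothetical, not anyone's real record):

```python
def projected_uzr(total_uzr: float, games_played: int, projected_games: int) -> float:
    """Scale a trailing UZR total to the number of games a player is expected to appear in."""
    if games_played == 0:
        # No defensive record: treat the player as neutral.
        return 0.0
    uzr_per_game = total_uzr / games_played
    return uzr_per_game * projected_games


# Hypothetical fielder: +20 UZR over 500 games, expected to play 150 games.
example = projected_uzr(total_uzr=20.0, games_played=500, projected_games=150)
print(f"{example:.1f} runs saved")
```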

Remember, my projections are based on four-year trailing averages for each stat. That is, they’re the averages of the past four years.

With those preliminaries out of the way, let’s start this year’s predictions off by going through the 2013 Nationals’ projected 25 man roster:

Starting Rotation

Stephen Strasburg, xFIP 2.56
Gio Gonzalez, xFIP 3.81. I do not believe Gio will be subject to a suspension for his alleged involvement in the Biogenesis scandal. I explained my view on the situation here.
Jordan Zimmermann, xFIP 3.71.
Ross Detwiler, xFIP 4.44
Dan Haren, xFIP 3.37.

Starting Position Players

Adam LaRoche, 1B.
Danny Espinosa, 2B
Ryan Zimmerman, 3B
Ian Desmond, SS
Bryce Harper, LF
Denard Span, CF.
Jayson Werth, RF
Wilson Ramos, C
Kurt Suzuki, C. I have Ramos and Suzuki splitting playing time evenly.

Bench

Chad Tracy, OF/3B
Tyler Moore, OF/1B
Steve Lombardozzi, IF/OF
Roger Bernadina, OF

Bullpen

Rafael Soriano, xFIP 3.6, Primary Closer
Drew Storen, xFIP 3.46, Primary Set-up, Back-up Closer
Tyler Clippard, xFIP 3.54
Ryan Mattheus, xFIP 4.48
Zach Duke, xFIP 4.34, Left-handed long reliever/Spot starter
Craig Stammen, xFIP 3.96, Right-handed long reliever/Spot Starter
Bill Bray, xFIP 4.19, Left-handed one-out guy. This is probably the most controversial pick; others might put Henry Rodriguez or Christian Garcia here instead. But I’m going to assume Bray heads north with the club.

No surprises, then. Stay tuned as we discuss pitching and defense in Part 2 of our projections.

Jim Vance: Telling Truth to Power

Posted on July 28, 2012 by ouij

WRC-TV evening news anchor and Washington DC legend Jim Vance delivered a scathing editorial on-air recently, lambasting the local sports media–including his own WRC-TV!–for overhyping Redskins training camp and ignoring the first-place Nationals.

This is a significant moment in Nats fandom. When a local media legend like Jim Vance says it’s time to get behind the Nats, you know it’s serious.

Here’s his editorial, transcribed in full:

Okay. So. Did you notice the lead story in our sports segment a couple of minutes ago? Did you see the front page of the Post today? Are you wondering, like I am, what the hell is wrong with you people?

RGIII and the Redskins have been dominating local sports coverage for weeks now, way out of proportion, in my view, to the place they deserve–to the place that they’ve earned–on the current DC sports landscape.

Did I just now commit heresy? Did I just even suggest that there might be another professional sports franchise in this ‘Skins-crazed town? Yeah, I did. And I have a feeling that I’m not alone.

Allow me to make something clear before I go any further. I am lovin’ me some RGIII. I think his might be the most refreshing and exciting athletic presence in this town in years. I love the way that he’s been handling himself and the media, and I am especially thrilled that that maturity and articulation are so obviously the result of a mother and a father who would expect nothing else from their boy.

That being said, the kid has yet to play a down in the NFL, for goodness’ sake! While back in the city–where a sports franchise that carries the city’s name ought to be, by the way–the Nationals are on fire. There is not one team in Major League Baseball with a better record. The Nationals–our team–they are 20 games above .500.

You’ve heard, haven’t you, that the last time that happened was in 1945, when Doreen [Gentzler] was looking at her driver’s license? You want some front-page material? There it is. And this is with a team that’s been banged-up and injured all season long! You want a headline? Davey Johnson ought to be Time Magazine’s “Man of the Year” for masterful stewardship of that team. Our team.

Listen, I am not even a baseball fan. And I am jacked up over this team.

Here’s my problem with the ‘Skins training camp overkill hype: That’s what it is–hype! I was at all four of our Super Bowl appearances, and for twenty years since then, that team has set me up in August and cut my heart out in November.

I was also at RFK for the last Senators game back in ’71. Truth is, I didn’t really care if they left. I didn’t know anything about ’em. But now, forty-one years later, I have never been more excited and filled with hope for a baseball team.

The ‘Skins promise. The Nats deliver. And, until that changes, that’s my sports headline.

Why the Nats Didn’t Re-Sign Pudge

Posted on February 22, 2012 by ouij

I got into a pretty lively discussion on Twitter today about the Nats catcher situation, sparked off by this tweet:

The #nats are horribly thin at catcher. They don’t know how Ramos will recover from the kidnapping. They need to bring Pudge back.

Let me refute the proposition that [the Nats] need to bring Pudge back by refuting, in turn, each of the statements upon which it was premised.

The Nats are horribly thin at catcher

There are a few assumptions embedded in this statement. Mostly, the objection boils down to this: Jesús Flores is not a good hitter.

This is an opinion. I’ll answer with facts. In his 2011 Venezuelan League regular season, Flores batted .332/.369/.516, with 16 doubles and 8 home runs. He posted a wRC of 27. Yeah, I can hear you saying, that’s Venezuela, a Double-A league at best. He didn’t hit so good as a big-leaguer! OK, that’s true. In 2008, his last long, uninjured season, Flores batted .256/.296/.402 with 8 home runs. Not impressive–he was only worth 32 wRC to the ’08 Nats. That’s a wRC+ of 79, which is below average.

I concede that there’s a very big drop-off from Flores’s best wRC+ of 79 to Wilson Ramos’s worst wRC+ of 91 (in 2010). But, as we’ll see later, Flores stacks up very nicely against the competition–especially when that competition is Pudge Rodriguez.

[The Nats] don’t know how Ramos will recover from the kidnapping

This is a true statement in the very strict sense. We’ll never really know, because Ramos himself won’t talk much about it. The only thing we have to judge him on is his Venezuelan league performance. As I said on Sunday, the Venezuelan League numbers aren’t as bad as they might seem. Sure, Ramos batted a comparatively lousy .216/.274/.273 with 2 doubles and only 1 home run, accounting for only 11 wRC. But Ramos only got 98 plate appearances (his regular season having been disrupted by the kidnapping, naturally). When we normalize his offensive numbers to the 200 plate appearances he would have otherwise gotten (and which he did get in 2010), he would have gotten 23 wRC. Yes, exactly the same wRC as he got in 2010, a Venezuelan season in which he hit .322/.390/.567 with 17 doubles and 9 home runs.

And we cannot help but be encouraged by his performance in the Championship Series, in which he helped the Tigres de Aragua to victory batting .450/.550/.478 with 2 home runs over 20 at-bats in 6 games.

For all intents and purposes, the Wilson Ramos that walked out of the jungle a free man seems to have been the same Wilson Ramos that was taken into the jungle at gunpoint. We should expect the same from him.

The Nats Need to Sign Pudge

No they don’t.

OK, you’re saying, but what’s the harm in signing Pudge? He’s a future hall-of-famer, calls a great game, and is generally awesome. Why not have Pudge back up for Ramos instead of Flores? Well, I hate to say it, but Pudge is too old, bats too poorly, and costs too much to put him on the team instead of Flores.

Remember when I said Flores’s wRC+ of 79 in his best year made him a below-average hitter? Have a look at Pudge’s wRC+ since 2009. It’s not pretty: 69, 68, 63. In the 2010 season, the last season Pudge was the every-day catcher, Pudge hit into 25 double plays (leading a friend of mine to dub him, not so fondly, GiDPudge). There’s no denying it–Pudge has entered the autumn of the patriarch. Rest assured that having Pudge as a back-up catcher instead of Flores will mean less offense on a ballclub that desperately needs offense.

Fine, but Pudge is the best defensive catcher in the game! Yes he is. But using the same wRC projection method I use for making my 2012 season projections, Pudge is worth 24 wRC. Flores is worth 31. Is Pudge’s defense good enough to save 7 additional runs? Maybe not.

Even if Pudge’s defense could make up for his declining offense, there’s the question of money. In 2011, Pudge earned a cool $3,000,000 from the Nats. Even if he decided to take a significant discount and play for half that–$1,500,000, Pudge would cost nearly twice as much as the $815,000 the Nats are paying Flores for 2012.

If you think Flores’s future looks more like his 2011 Venezuelan League numbers, why would you pay twice as much for a catcher who will net you less offense? And even if Pudge’s defensive skills equal the difference between his offensive numbers and Flores’s, why would you pay twice as much to achieve the same net result?

It’s not that Pudge has not been an excellent catcher. But the Nats have two catchers who are perfectly adequate for their purposes right now–especially at their salary levels. If I were GM, I would worry less about catchers and more about the outfield.

Pitchers & Catchers Report!

Posted on February 19, 2012 by ouij

Nats pitchers and catchers officially report to Viera today!

Of course, many of their teammates have already been in Viera for quite some time, getting extra work in before the official start to spring training.

Notably, however, a few Nats have been doing a lot more with their winter vacations than that. Henry Rodriguez, along with his fellow Venezuelans Jesús Flores and Wilson Ramos, spent the winter playing in the Venezuelan League. However many off-season workouts you can do, I imagine it’s very different to be able to work on your skills in a situation where real games are on the line, in front of stadiums packed with thousands of adoring fans.

While beat writers will be busy asking other ballplayers what they did on their winter vacation–and while those other ballplayers will reply with endless variations on “I worked really hard; I’m in the best shape of my life now,” the Nationals’ three Venezuelan ballplayers can get on with their business and let their records speak for themselves. Well, what do those records say?

First, a note about the Venezuelan League season. There is a 63-game regular season, followed by a 16-game round-robin “semifinal” that determines the two teams that face each other in the final championship series. I’m only looking at regular-season statistics here. After all, that’s all I look at when I look at a player’s MLB statistics. The round-robin and championship series phases are “post-season,” and so won’t be counted. Besides, as I said yesterday, I’m lazy. Getting proper offensive statistics would require more data entry than I have time or inclination to do.

Henry Rodriguez: Tan Capaz de Ser Feo como Fenómeno

A few days ago, I tweeted that Henry Rodriguez was going to be someone I’ll be watching carefully over the course of the 2012 season. In his time with the Nats so far, he has shown himself capable of unbelievable feats of relief pitching dominance. But to say he had some issues getting his considerable power under control might be something of an understatement:

According to SB Nation, the 10th-worst Pitch of 2011. I still cringe just thinking about this.

The Hot Rod’s 2011 season with the Nationals split the difference between those two extremes. In 59 appearances and 65.2 innings pitched, the Hot Rod recorded an ERA of 3.56, a FIP of 3.24, and a WHIP of 1.51. On average, in any given nine-inning stretch, you could have expected him to strike out 9.59 batters, and walk 6.17 of them–and give up a measly 0.14 home runs.
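A quick Python sketch of how those per-nine rates relate to raw counts. One wrinkle is that "65.2" innings in box-score notation means 65 innings plus two outs, not 65.2 as a decimal. The helper names are mine, and the counts printed are back-calculated from the rates quoted above rather than looked up.

```python
def innings(notation: str) -> float:
    """Convert box-score innings notation ("65.2" = 65 innings and 2 outs) to a decimal."""
    whole, _, outs = notation.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def implied_count(rate_per_nine: float, ip: float) -> float:
    """Invert a per-nine-innings rate back into an approximate raw count."""
    return rate_per_nine * ip / 9


ip_2011 = innings("65.2")
strikeouts = round(implied_count(9.59, ip_2011))
walks = round(implied_count(6.17, ip_2011))
home_runs = round(implied_count(0.14, ip_2011))
print(strikeouts, walks, home_runs)  # 70 45 1
```

Seen as counts, the picture is stark: roughly two walks for every three strikeouts, and a single home run all year.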

How did he do in Venezuela this winter? In 23 appearances and 23.2 innings pitched, he recorded an ERA of 3.80, a FIP of 3.88, and a WHIP of 1.39. On average, in any given nine-inning stretch, you could have expected him to strike out 9.39 batters, walk 6.46, and give up 0.38 home runs.

The one thing that kills Rodriguez is his walks. His walk rate crept up during the 2011 Venezuelan league regular season, and that’s not something Nats fans wanted to see. The 1.39 WHIP is lower than his 2011 MLB WHIP of 1.51, despite an increase in walk rate and decrease in strikeout rate, so it looks like Venezuelan-league batters had a harder time reaching base safely after making contact. I can’t verify this without better information, but I’m betting the sheer speed of his pitches leaves hitters making weak, late contact–they must not have been catching up to the fastball. Of course, when they do time him, they can do serious damage. Witness the increase in home run rates (although I wonder if that’s just bad luck, rather than bad pitching).

In many ways, the 2011 Venezuelan regular season has been a disappointment for Hot Rod, because in the 2010 Venezuelan league regular season, he put up dominant numbers. The numbers speak for themselves. In 21.1 IP over 18 appearances, Hot Rod posted absolutely Strasburg-like stats: 1.69 ERA, 1.84 FIP, 0.94 WHIP. Strikeouts per 9 innings? 14.00. And, most importantly of all: 3.80 walks per 9 innings. Oh, and zero home runs.

When Henry Rodriguez is locked-in, as he was in Venezuela in 2010, he’s one of the most fearsome relievers in the game, capable of totally destroying opposing batting. But when he’s not locked-in, he puts up performances that are, well, not nearly so dominant. We saw that in DC all last summer, and fans in Venezuela saw it this winter. It will be interesting to see whether Nats pitching coach Steve McCatty can work with Henry to get his fearsome power under control. If the 2010 Venezuelan League model of the Hot Rod rolls out of the bullpen for the 2012 Nats, the National League is in for a nasty surprise. But if the 2011 Hot Rod coughs and sputters to life, fans seated behind home plate should, for their safety, carefully inspect the netting, and maybe consider buying a half-smoke while Henry goes to work.

Ramos y Flores

Let’s move on to the Nats’ two botanically-surnamed catchers. In Venezuela this winter, one of them batted .332/.369/.516, with 16 doubles and 8 home runs, posting a wRC of 27. The other batted .216/.274/.273, with 2 doubles and 1 home run, with a wRC of 11. Which is which?

If you guessed that the flourishing catcher was Jesús Flores, you are right. Flores didn’t see much action with the Nats in 2011, and we had pretty much forgotten about him in DC after he was hurt in 2009. The last good look we’d gotten at Flores was in 2008, when he batted .256/.296/.402 with 18 doubles, a triple, and 8 home runs. If his Venezuelan league offensive figures are any indication of his readiness for the 2012 MLB season, I think the Nats can expect very good things from Flores. If Flores bats in 2012 the way that he did in Venezuela, we can project him to have a wRC of 34 in 2012–4 more runs than we would have expected from his recent past.

Ramos’s Venezuelan season got off to the worst possible start–he was kidnapped at gunpoint by masked men, and then freed in what was supposed to have been a fierce gunfight. Only he can know how he was affected, but his offensive production, at first glance, looks to have dropped off considerably. If Ramos bats as well in 2012 for the Nats as he did in Venezuela, I’d project him to post a wRC of 46–3 runs fewer than I have him projected this year.

But look again. During the 2010 Venezuelan season, he batted .322/.390/.567 with 17 doubles and 9 home runs, posting a wRC of 23. But, crucially, Ramos got 200 plate appearances in 2010, as opposed to only 95 in 2011. If we give him 200 plate appearances in 2011, he ends up with a wRC of… yup, 23!
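The scaling is nothing fancy. A minimal Python sketch, using the 11 wRC and 95 plate appearances quoted here (the function name is just a label for illustration):

```python
def scale_wrc(wrc: float, actual_pa: int, target_pa: int) -> float:
    """Pro-rate a wRC total to a different number of plate appearances."""
    return wrc / actual_pa * target_pa


# Ramos, 2011 Venezuelan regular season: 11 wRC in 95 PA, scaled to his 2010 workload.
scaled = scale_wrc(wrc=11, actual_pa=95, target_pa=200)
print(round(scaled))  # 23
```

Bear in mind that 95 plate appearances is a small sample, so the scaled figure carries a lot of noise.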

How can that be? My guess: one of the components of wRC is the league average wOBA. In 2010, when Ramos put up the gaudy Venezuelan numbers, the league average wOBA was .283. In 2011, that average dropped to .275. Perhaps Ramos’s numbers (and scaled numbers) are down because the whole league’s numbers are down. Perhaps Venezuelan league pitching improved as a whole. Either way, Nats fans can be comforted by the fact that, even after everything that’s happened to him, Wilson Ramos is the same ballplayer he’s always been.

What Nats fans should look forward to this spring, however, is an emerging Catcher Controversy. Flores did very well with the Navegantes de Magallanes–look at those offensive stats! If Flores can continue to build on his Venezuelan League successes while in the Grapefruit League this spring, we might find that it is Flores, not Ramos, who ends up as the Nats’ opening-day catcher.

Baseball Eve!

Posted on February 18, 2012 by ouij

The Boys Are Back in Town!

No real insights for you today on the day before Pitchers & Catchers report to Viera. Federal Baseball already has some early photographic evidence of baseball returning to Viera. Highlights include Jordan Zimmermann and Drew Storen rocking the quasi-official Beastmode T-Shirt introduced to the ’11 Nats by Ian Desmond and made famous by Michael Morse. But who’s that shaking hands with Tyler Clippard? The #tigerbeatbaseball girls want to know. (It’s not Ryan Tatusko, though. I checked that already.)

Two statistically-related things that I’ve been thinking about lately, though:

Lost in Translation

Given the number of major league players and prospects who play in the Latin American winter-ball leagues in Venezuela, the Dominican Republic, and Mexico, it’s remarkable to me how hard it is to get reliable statistical information out of those leagues. The leagues have their own stats pages, to be sure. For instance, the Venezuelan League’s stats pages are pretty comprehensive. But it’s not exactly easy to find the player you’re looking for. Moreover, calculating advanced statistics like wOBA and wRC is pretty much impossible. The worst has got to be wRC, because it depends on calculating a league average wOBA. To do that for the Venezuelan league, I’d have to key in all the data for all players into another spreadsheet and run the calculations from there. The calculating isn’t too bad, but the data entry will take more time than I’m willing to commit (it’s not like sabermetrics is my job, y’know–and if it were, I’d be pretty terrible at it).
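To show why the league average matters so much, here is a sketch of the commonly published wRC formula in Python. The wOBA scale (1.2), league runs per plate appearance (0.11), and the batter's .330 wOBA are placeholder assumptions for illustration, not Venezuelan League values; in practice each has to be derived from league-wide data, which is exactly the data-entry problem.

```python
def wrc(woba: float, lg_woba: float, woba_scale: float, lg_runs_per_pa: float, pa: int) -> float:
    """Weighted Runs Created: runs above league average per PA, plus the league rate, times PA."""
    runs_above_avg_per_pa = (woba - lg_woba) / woba_scale
    return (runs_above_avg_per_pa + lg_runs_per_pa) * pa


WOBA_SCALE = 1.2        # assumption for illustration
LG_RUNS_PER_PA = 0.11   # assumption for illustration (held fixed here; a real league's rate moves too)

# Same hypothetical batter (.330 wOBA, 200 PA) against two different league averages.
higher_offense_league = wrc(0.330, 0.283, WOBA_SCALE, LG_RUNS_PER_PA, 200)
lower_offense_league = wrc(0.330, 0.275, WOBA_SCALE, LG_RUNS_PER_PA, 200)
print(f"{higher_offense_league:.1f} vs {lower_offense_league:.1f}")
```

The same batting line is worth more runs created when the league around it hits less, so without the league-wide inputs you can't compute the stat at all.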

As an aside: reading statistical tables and box scores in Spanish reminded me that my Spanish isn’t as good as it ought to be. Baseball stats are cryptic enough in English, but they can be pretty opaque in Spanish. Glossaries do exist, but I’ve had to bring in an outside consultant for help with a few.

If you’re at all interested in Latin American baseball stats, PuraPelota has the most complete database I’ve been able to find, but they can be a bit slow on the update cycle. I haven’t been able to find anything nearly as complete or helpful for any of the Asian leagues (Japan, Korea, Taiwan). I can’t understand why that would be so–surely the Sabermetric revolution has spread all across the baseball world? Nothing makes you appreciate the excellent work that Baseball Reference and Fangraphs do quite like dealing with the sparse data available for foreign baseball leagues.

Eye in the Sky

I’ve already written about this post at Línea de Fair, but I can’t help but take a closer look at one of the author’s objections to UZR:

…UZR, the measure employed to determine whether a fielder has more range than his teammates, and whether, on the whole, he can prevent opponents from creating more runs. Joey Cora used to remind me how an infielder could be better depending on which pitcher was on the mound. This was due not only to the pitches, but also to the control the pitcher has over them. “What happens if a catcher calls for a sinker inside,” Cora asked. The shortstop moves a little, almost imperceptibly, towards the hole if the batter is right-handed. But if the pitcher leaves the ball outside, the roller could go up the middle of the infield. Result? A higher probability that the batted ball goes up the middle of the field and finds the shortstop further away from it–thus raising his UZR.

My initial reaction is that complaining that UZR may not describe one particular defensive alignment and situation is like complaining that the Ideal Gas Law won’t tell you exactly where to look for one particular carbon dioxide molecule in a tank full of compressed air.

Part of the problem, I think, is that UZR is the one baseball statistic in (quasi-) common use that is flat-out impossible to derive from other published statistics. As far as I can tell, the whole process depends on individual human beings watching game footage, noting where fielders are positioned, and noting where the fielder meets (or doesn’t meet) the ball.

Because I’m lazy, I figure that there must be a better way to do things–or at least one that isn’t so unbearably tedious. We already have fairly sophisticated software that can track the location of, say, baseballs and baseball gloves as they move across a camera’s field of view. It should be a fairly simple matter to fix a wide-angle camera (or several) across a baseball field, record the whole game, and only have human intervention whenever the ball strikes the bat. An observer might tap one button when he sees the impact of the ball on the bat, and then tap another when the ball comes to rest (either in the glove of the fielder, or out of play). The end result might look something like FlipFlopFlyball’s defensive positioning infographic.

The genius of computing, however, would allow us to track each defensive move as a vector, with an origin point at wherever the defender started when the ball was put in play, and an endpoint at wherever he was standing when the play was over. I’m not so great at mathematics, but I imagine the resulting graphical representations (and statistical inferences!) that could be made from those data would be extremely useful in evaluating the range of any individual defender. Heck, maybe it wouldn’t be too hard to explain–if I only had a brain!

[Something I didn’t notice when I first saw that video in high school: MC 900ft Jesus is wearing a 1926 Washington Senators cap!]
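The vector idea from the Eye in the Sky section can be sketched in a few lines. This is only an illustration under assumptions of my own: positions are (x, y) pairs in feet on some fixed field grid, and the function name and coordinates are hypothetical, not anything a real tracking system produces.

```python
import math


def defensive_move(start, end):
    """Describe one defensive play as a displacement vector.

    start: (x, y) where the fielder stood when the ball was put in play
    end:   (x, y) where he stood when the play was over
    Returns (dx, dy, distance, bearing_degrees).
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    distance = math.hypot(dx, dy)               # length of the vector
    bearing = math.degrees(math.atan2(dy, dx))  # direction, measured from the x-axis
    return dx, dy, distance, bearing


# Hypothetical play: fielder starts at (100, 140) and ends at (112, 156)
print(defensive_move((100, 140), (112, 156)))
```

Collect one of these per ball in play and the lengths and directions start to look like a picture of a fielder's range.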

How Good Does Bryce Harper Have to Be?

Posted on February 15, 2012 by ouij

Keen readers of this blog–both of you–will have noticed one glaring omission among all of my calculations. I have thus far decided not to include a certain 19-year-old catcher-turned-outfielder who last saw limited playing time at AA Harrisburg.

In a recent column, the Washington Post’s Jason Reid suggested that Bryce Harper needs to grow up. Given that this is the same Jason Reid whose journalistic insight into the Redskins’ quarterback situation early in the 2011 season gave Washington sports fans–and journalism as a whole–the biggest “Doh!” moment since the night Dewey beat Truman, I was moved to tweet:

The fact that @JReidPost raised doubts about @BHarper3407 making the team leads me to conclude Harper WILL make the #nats opening day team

Well, if Harper does make the Opening Day roster, how good does he have to be to do no harm to a squad already projected for 86 wins?

Let’s assume Harper is an everyday player. There’s no indication so far that he can play center field. The Nats don’t have anyone available with a positive UZR as a center fielder except Werth. So let’s put Harper in right field. Here’s the most dangerous assumption of them all: assume Harper is a totally average defender.

Assuming a Healthy LaRoche

Let’s also assume that Adam LaRoche is healthy and ready to be his usual self at first base. That rounds out the outfield as Morse, Werth, and Harper.

Someone needs to get bumped off the bench. Given that the Nats went out and got DeRosa and Ankiel, that leaves Roger “The Shark” Bernadina the odd man out, so we need to assume that The Shark doesn’t break camp with the Nats.

Assuming everybody’s an every-day type player, we’ll need to cut down DeRosa’s plate appearances, to reflect his status as a real bench player and not half of a platoon. Let’s give him 250 plate appearances. Same with Ankiel.

As constructed and run through my model, this Harper-less squad is good for 83 wins. Were he to join the Nats as the opening-day right fielder, Harper would need to have a wRC of 25–that is, “create” 25 runs over 162 games.

What does 25 wRC look like? It looks like an outfielder not much worse than Aaron Rowand of the Giants, who posted 27 wRC in 2011. Rowand batted .233/.274/.347 with 30 extra-base hits (including 4 home runs) in 2011. That’s a pretty low bar to clear.

But What If LaRoche Isn’t on the Team?

The situation becomes more complicated if LaRoche is not healthy. Morse has to move to first base. Werth slides to center, Harper moves into right. Left field sees a Bernadina/DeRosa platoon. Cameron and Ankiel come along for the ride as bench players. What does this look like now? Not too good, I’m afraid: 73 wins.

To do no harm to the team in this situation, Harper would need to be worth 90 wRC. What does a 90 wRC outfielder look like? Consider Matt Holliday of the Cardinals, who posted exactly 90 wRC in 2011. In 2011, Holliday batted .296/.388/.525 with 36 extra-base hits (including 22 home runs). That’s a much taller order.

To put the sheer magnitude of that task into perspective, consider this: in 147 plate appearances with AA Harrisburg, Bryce Harper posted a wRC of 18. Normalizing that to the 600 plate appearances one might expect to see out of an every-day player, that would have given Harper an expected wRC of 83.72. Harper would have to hit major-league pitching better than he hit AA pitching to even have a chance of doing no harm to the team in this situation.

Fangraphs’ RotoChamp projection sees Harper with 259 plate appearances in 2012, projecting a wRC of 36 from those plate appearances. Even if we normalize this to 600 plate appearances, that only gets us to 83.39 wRC–not quite good enough for our purposes.
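The normalization used in the last two paragraphs is plain rate scaling: divide wRC by plate appearances, then multiply by 600. A minimal sketch follows; the function name is made up for illustration. With the rounded inputs quoted above, the RotoChamp line comes out at about 83.40, while the Harrisburg line scales to roughly 73.5 rather than 83.72, so that figure presumably reflects inputs not shown here.

```python
def scale_to_pa(wrc, pa, target_pa=600):
    """Scale a wRC total to target_pa plate appearances at the same per-PA rate."""
    return wrc / pa * target_pa


rotochamp = scale_to_pa(36, 259)    # RotoChamp projection quoted above
harrisburg = scale_to_pa(18, 147)   # AA Harrisburg line quoted above

print(f"RotoChamp pace over 600 PA:  {rotochamp:.2f}")
print(f"Harrisburg pace over 600 PA: {harrisburg:.2f}")
```

Either way, both paces fall short of the 90 wRC needed in the no-LaRoche scenario.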

That’s how good Bryce Harper has to be. The real question is: how good is Bryce Harper? Only he can show us if he’s as good as he has to be. For the sake of Nats fans everywhere, I hope he shows us he’s much better than even that.

The Limits of Prescience

Posted on January 30, 2012 by ouij

A thread over at the Washington Nationals Fan Forums pushed back against some of my projections here and raised a few points that I neglected to address in my 2012 projections.

Margins of Error

Interesting projections but the missing piece would be an estimate of how much of a margin of error there would be for both the offensive and defensive estimates that would provide a range for the expected number of wins as opposed to a hard number.

This was a serious omission on my part. All projections have a certain degree of uncertainty built into them, and I really should have discussed the degree of uncertainty built into mine.

I took my method for calculating the projected runs allowed by pitching and defense from this site. The author tested this method against 7 years of complete season data from 2002 through 2008. As he writes:

I found the R^2 value. Not to oversimplify things too much, but this value basically shows what percentage of the variation can be accounted for by the model. The value ranges from 0 (worthless) to 1 (perfect). For my 210 data points, I had an R^2 value of about 0.78 (i.e. 78% of the variation).

Reading that loosely (strictly speaking, an R^2 of 0.78 is a measure of fit, not a margin of error), I’m treating my defense and pitching runs allowed projections as good for plus or minus 22%. That gives a lower bound of 482.84 runs allowed and an upper bound of 755.20 runs allowed.
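The bounds are simple to reproduce. In this sketch the central figure of 619.02 runs allowed is inferred as the midpoint of the two bounds quoted above; it is an assumption for illustration, not a number stated in this post, and the function name is hypothetical.

```python
def runs_allowed_band(projected_ra, margin=0.22):
    """Return (low, high) runs allowed for a symmetric percentage margin."""
    return projected_ra * (1 - margin), projected_ra * (1 + margin)


# 619.02 is inferred from the quoted bounds (their midpoint), not taken from the post
low, high = runs_allowed_band(619.02)
print(f"{low:.2f} to {high:.2f} runs allowed")
```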

If we assume that my offensive predictions are correct (a problem I’ll get to in a second), that means the 2012 Nats will win anywhere between 68 and 103 games.
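Turning a runs-allowed band into a win range is one pass through the Pythagorean Expectation. The sketch below assumes the classic exponent of 2 and a 162-game season. The 640 runs scored is a placeholder chosen for illustration, not the projection's actual offensive figure, which this post does not restate.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed (Bill James's formula)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


RUNS_SCORED = 640  # hypothetical placeholder, not a figure from the post

best_case = pythagorean_wins(RUNS_SCORED, 482.84)   # lower bound on runs allowed
worst_case = pythagorean_wins(RUNS_SCORED, 755.20)  # upper bound on runs allowed
print(round(worst_case), "to", round(best_case), "wins")
```

With that placeholder the output lands on the same 68-to-103 spread, which shows how much a 22% swing in runs allowed moves the win total.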

I know that’s an immense difference. I’m not sure how I could close that gap. UZR doesn’t account for pitcher or catcher defense, for instance. But even then, I think the method at least gets us in the ballpark.

The offense numbers are a lot more troublesome. I haven’t been able to do any real regression analysis to determine how good my model is–I simply haven’t had the time.

On the other hand our offense has way too many question marks to estimate the total number of runs scored with enough precision to come up with a meaningful value that can be used in a secondary projection as you did in calculating our win total.

Any type of future projection is likely to involve more than a little handwaving. Here, I’ve drawn an arbitrary line: all players included in this analysis are players on the Nats’ 25-man roster as of January 27, 2012, some 23 days before pitchers and catchers are due to report at Viera.

Individual Players and the Projections

Will Werth stay Werthless?

2011 Jayson Werth was astonishingly bad. I’m going to believe that his 2011 numbers are aberrations and not indicative of a “new normal.” I’m fairly confident that the 4-year average from 2008-2011 is a fair picture of what kind of player Werth is now–somewhere between his Philly days and the debacle of 2011.

Will Desmond, Ramos, and Espi improve or stagnate?

As far as Desmond and Espinosa go, I have no idea. I don’t think I have nearly enough data about them to make any predictions going forward. Ramos, however, gets a nice bump from more playing time and more PAs. His wRC/PA isn’t terrible, so that’s to be expected.

Will Morse fall back to Earth?

I’m going to go ahead and say No. As I said in Part 3, Morse’s modest offensive outputs in 2008-2010 might make you think that he’s going to crash down to Earth in 2012. But, remember, I’ve taken a four-year average of his wRC/PA from 2008 through 2011. Giving Morse 600 plate appearances in 2012 gives a projected wRC of 97.00: exactly the same as his breakout 2011 “beastmode” year. Indeed, even if we throw out Morse’s 2011 season, running the same calculation over data from 2008-2010 yields a projected wRC of 90.00: seven runs short of our prior projection and of the 2011 total, but still enough to make him almost as good as Ryan Zimmerman (projected for 90.69 wRC). All of this taken together seems like pretty persuasive evidence that “beastmode” has been lurking inside him the whole time, and only needed to see enough PAs.
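The calculation behind those Morse numbers is a trailing wRC/PA rate multiplied by projected plate appearances. Here is a rough sketch under two assumptions of my own: that the rate is pooled (total wRC over total PA) rather than an average of yearly rates, and that the season lines below are made-up placeholders, not Morse's actual numbers.

```python
def project_wrc(seasons, projected_pa=600):
    """Project wRC from a trailing wRC/PA rate.

    seasons: list of (wrc, pa) tuples, one per season in the window.
    Uses the pooled rate (total wRC / total PA); that choice is an assumption.
    """
    total_wrc = sum(w for w, _ in seasons)
    total_pa = sum(pa for _, pa in seasons)
    return total_wrc / total_pa * projected_pa


# Made-up season lines for a hypothetical hitter
sample = [(10, 100), (30, 200)]
print(round(project_wrc(sample), 2))
```

Dropping a season from the list is all it takes to rerun the projection without the breakout year.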

Will Zimm get hurt again? Will LaRoche bounce back?

My response: Dammit, Jim, I’m a baseball fan, not a doctor! I really have no good way of figuring out LaRoche’s prognosis post-surgery, nor can I really know anything about the state of Zimmerman’s joints and muscles. The only real response I have here is that the four-year interval I picked should be fair to both men in terms of their expected production.

Who plays centerfield?

Again, I had to draw an arbitrary line and go with who was in the organization as of the day I began compiling the statistics. That means that for now, we’re looking at a DeRosa/Bernadina platoon in center field. This might not be ideal, but I didn’t want to mix players who weren’t officially in the organization into these projections. Blown Save, Win, however, has attempted to address the center field question in a recent post, where he suggests that perhaps the short-term answer is Rick Ankiel. I’ll have to go back and study this, obviously.