Baseball Analytics: Another way to love the game

By Eric García McKinley

Think about baseball analytics, and then play a little word association. What comes to mind? Maybe words like “numbers” and “spreadsheet” pop up. Or maybe the term evokes some of the metrics associated with it: “WAR,” or Wins Above Replacement; “FIP,” or Fielding Independent Pitching; and however you choose to say “wRC+,” or Weighted Runs Created Plus.

Perhaps emotional states come to mind. “Rational” might be a common one. For the extra-descriptive types, “coolly detached” may sound about right. But what about “passion”?

Admittedly, that’s an odd fit, though not because of the popular canard that “numbers people don’t actually watch games.” Rather, passion and baseball analytics don’t seem to go together because a great deal of the movement is conservative. Instead of saying that a player is “on fire” or “sucks,” extremely good and bad performances are typically identified as “outliers,” and the expectation that those will “regress to the mean” is so commonplace that it hardly needs to be explicitly stated anymore.

But for me, passion and analytics go hand in hand. Sabermetrics reinvigorated my love of baseball because it allowed me to appreciate the game in new ways. It was like finding out for the first time that grandma’s tortillas can be eaten with something other than pinto beans. (Sorry, Grandma!)

New questions and fresh angles are at the core of baseball analytics. They are the flavorful guacamole that makes something already familiar and dear even better. Besides diced red onions, we’re throwing in habanero peppers and cilantro. The world of sabermetrics offers new ways for baseball to surprise. And that opens up channels for more original and thoughtful stories to be told.

Numbers to believe in

My affair with sabermetrics started in 2010 with 6-foot-5 right-hander Ubaldo Jiménez. It’s how I learned about the conservative side of advanced analytics. The Rockies were riding a wave, having gone to the World Series in 2007 and then returned to the playoffs in 2009. I thought that they had become contenders.

And for the first time in the team’s history, they had a bona fide ace in Jiménez. The reliable old-school statistics made him a rising star, especially in the first half of the season. Jiménez, who hails from Nagua, Dominican Republic, put up a 15-1 record with a 2.20 ERA before the All-Star break. He had a 0.93 ERA through his first 12 starts. It was amazing. Were we seeing the second coming of Pedro Martínez?

But I wanted more ways to show how incredible Jiménez’s start was and how historic his season could be. I sought out proof in advanced statistics. I thought that they would say the same thing about Jiménez’s awesomeness, with más sazón. I found my way to FanGraphs.com — a blog dedicated to all things sabermetrics — and perused unfamiliar stats that became an intimate part of my relationship with baseball from that point forward.

I don’t remember if I noticed at the time, but looking at Jiménez’s first-half stats from 2010 now, one number sticks out: Jiménez’s FIP. While his ERA was 2.20, his FIP was 3.09.

The basis of Fielding Independent Pitching is that pitchers have control over only three outcomes: walks, strikeouts and home runs. Everything else relies on the vagaries of an expansive field and defenders of inconsistent acumen. The Rockies that season had Todd Helton at first base, Troy Tulowitzki at shortstop, a very young Dexter Fowler in center field and after that, well, like I said, fielders of inconsistent acumen.

Thus, FIP is an estimate of what a pitcher’s ERA should be, measured only by those three true outcomes. Jiménez’s 2.20 ERA sat well below his 3.09 FIP. In other words, he was having a great season, but he was the beneficiary of some good luck, and he was a likely candidate to (gasp!) regress to the mean.
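For readers who want to see the arithmetic, here is a minimal sketch of FIP in Python. The weights (13, 3 and 2) follow the formula as it is commonly published by FanGraphs, which also counts hit batters alongside walks. The constant, which shifts the result onto the ERA scale, changes from season to season, so the 3.10 default below is an assumption. The stat line is hypothetical and is not Jiménez’s.

```python
def fip(home_runs, walks, strikeouts, innings, hit_by_pitch=0, constant=3.10):
    """Estimate Fielding Independent Pitching.

    Uses the commonly published form:
        (13*HR + 3*(BB + HBP) - 2*K) / IP + constant
    `constant` is an assumed placeholder; the real value varies by season.
    `innings` must be a true decimal (200 1/3 innings is 200.333, not 200.1).
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    weighted = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted / innings + constant


# Hypothetical stat line, for illustration only.
example_fip = fip(home_runs=10, walks=50, strikeouts=150, innings=200.0)
print(f"FIP: {example_fip:.2f}")

# A pitcher whose ERA sits well below this number has probably had help
# from his defense, his ballpark or plain luck.
example_era = 2.20
print(f"ERA minus FIP: {example_era - example_fip:+.2f}")
```

A negative gap between ERA and FIP is the kind of signal that hints at regression to come.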

Intuitively, I knew Jiménez wasn’t likely to become the first player since Denny McLain in 1968 to win 30 games. Intuitively, I knew that all the balls in play Jiménez allowed just seemed to find a fielder waiting and ready to make an out.

But FIP provided the language to articulate why and how. And that’s what happened. Jiménez produced a 4-7 record with a 3.80 ERA in the second half of 2010. His FIP was essentially the same as in the first half: 3.11. His ERA outperformed his FIP in the first half, but it lagged behind his FIP in the second. It was reasonable to expect Jiménez to have a second half worse than his first, but the surprise came in how he did it. He simply didn’t have great command. He walked 10.3 percent of the batters he faced that year, a rate known as BB%, while tying for the National League lead with 16 wild pitches.

There’s always more to say

Sabermetrics can also add more to stories we think we already know. We know that Pedro is one of the greatest pitchers ever, partly because he dominated during the Steroid Era. And we know that his best season came in 2000, when he went 18-6 and posted a major-league-best 1.74 ERA.

If we get down and dirty with hardcore metrics, Martínez also led the American League that season in a number of critical categories, including a 2.17 FIP, a 0.737 WHIP (walks and hits per inning pitched), 11.8 K/9 (strikeouts per nine innings), 8.88 K/BB (strikeout-to-walk ratio) and 0.7 HR/9 (home runs per nine innings). He was unhittable.
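These rate stats are simple ratios, and a short Python sketch shows how they fall out of a raw stat line. The inputs below (217 innings, 128 hits, 32 walks, 284 strikeouts, 17 home runs, 42 earned runs) are Pedro’s 2000 totals as I recall them from the public record, not figures quoted in this piece, so treat them as an assumption and check them against a reference site before relying on them.

```python
def rate_stats(innings, hits, walks, strikeouts, home_runs, earned_runs):
    """Return common pitching rate stats from raw season totals.

    `innings` must be a true decimal (217 1/3 innings is 217.333, not 217.1).
    """
    if innings <= 0 or walks <= 0:
        raise ValueError("innings and walks must be positive")
    return {
        "ERA": 9 * earned_runs / innings,   # earned runs per nine innings
        "WHIP": (walks + hits) / innings,   # walks plus hits per inning
        "K/9": 9 * strikeouts / innings,    # strikeouts per nine innings
        "K/BB": strikeouts / walks,         # strikeout-to-walk ratio
        "HR/9": 9 * home_runs / innings,    # home runs per nine innings
    }


# Assumed 2000 totals for Pedro Martínez (see the note above).
pedro_2000 = rate_stats(
    innings=217.0, hits=128, walks=32,
    strikeouts=284, home_runs=17, earned_runs=42,
)
for name, value in pedro_2000.items():
    print(f"{name}: {value:.3f}")
```

With those inputs, the unrounded K/BB works out to 8.875, and the other values round to the figures cited above.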

But how good was that season, compared to other great seasons? If we search for the lowest ERAs among qualified starters in the expansion era (1961 to present), Pedro’s 2000 isn’t in the top five. It goes like this:

  1. Bob Gibson, 1.12 (1968)
  2. Dwight Gooden, 1.53 (1985)
  3. Greg Maddux, 1.56 (1994)
  4. Luis Tiant, 1.60 (1968)
  5. Greg Maddux, 1.63 (1995)

Pedro’s 2000 season comes in 12th, which is still amazing, of course — but by this measure, not in the top 10 since 1961.

One of the most illuminating practices in baseball analytics is to adjust raw numbers in order to provide a better sense of how events in different contexts compare to one another. Those of us old enough can recall that 1968 was the “Year of the Pitcher,” and you can see that it sticks out in the list above. Pitching was so dominant in 1968 that Major League Baseball afterward lowered the mound to increase offense. In other words, Gibson’s impossibly small 1.12 ERA, while an incredible feat, got an assist from the low run-scoring environment.

Conversely, 2000 was an extremely hitter-friendly year. The Cubs’ Sammy Sosa led the major leagues with 50 home runs, the Angels’ Darin Erstad banged out 240 hits, Helton led all comers with 147 RBI and the Giants’ Barry Bonds scorched opposing pitchers with a 188 OPS+ (adjusted on-base plus slugging).

Pedro’s ERA was “worse” than those in the top five since 1961, but keeping runs off the board was harder for Martínez in 2000 than it was for Gibson in 1968. Before writing the story of the best pitching seasons since 1961, it makes sense to adjust each pitcher’s performance based on the run-scoring environment.

Baseball Reference adjusts with a statistic called ERA+. This metric takes a pitcher’s ERA and sets it against the league-average ERA, scaled so that league average always equals 100. It also adjusts for the league and the ballpark, because Fenway Park does not play the same as Kauffman Stadium.

For ERA+, a higher number is better, and a 120 ERA+ means a pitcher was 20 percent better than league average for the time period in question. Here are the top five ERA+ seasons since 1961:

  1. Pedro Martínez, 291 (2000)
  2. Greg Maddux, 271 (1994)
  3. Greg Maddux, 260 (1995)
  4. Bob Gibson, 258 (1968)
  5. Pedro Martínez, 243 (1999)
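The ERA+ idea described above can be sketched in a few lines of Python. This is a simplified version, assuming the basic form of 100 times league ERA divided by the pitcher’s ERA, with a single hypothetical `park_factor` multiplier standing in for the park and league adjustments. Baseball Reference’s actual calculation handles those adjustments in more detail, and the numbers in the example are made up.

```python
def era_plus(pitcher_era, league_era, park_factor=1.0):
    """Simplified ERA+: league ERA relative to the pitcher's ERA, scaled to 100.

    `park_factor` is a hypothetical stand-in for park and league adjustments:
    values above 1.0 mean a hitter-friendly home park, which raises the
    baseline the pitcher is compared against.
    """
    if pitcher_era <= 0:
        raise ValueError("pitcher_era must be positive")
    return 100 * (league_era * park_factor) / pitcher_era


# Made-up example: a 3.75 ERA in a league averaging 4.50.
example = era_plus(pitcher_era=3.75, league_era=4.50)
print(f"ERA+: {example:.0f}")

# The same ERA is worth more in a park that inflates scoring by 5 percent.
hitter_park = era_plus(pitcher_era=3.75, league_era=4.50, park_factor=1.05)
print(f"ERA+ in a hitter-friendly park: {hitter_park:.0f}")
```

Run in reverse under the same simplified formula, a 1.74 ERA paired with a 291 ERA+ implies a park-adjusted league ERA of roughly 5.06, which says a lot about how much offense Pedro was pitching against.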

From this view, not only was Pedro’s 2000 the best pitching season since 1961, it was the best by a lot. The use of sabermetrics sometimes challenges previous assumptions, as in the studies of “bad ball hitters.” But in this case, it simply adds flavor to the season Pedro had 17 years ago. It just might be the best summer a starting pitcher has ever enjoyed. Suffice it to say, he led the majors with an incredible 11.7 WAR. And remember, he was small and skinny.

It’s about the love of the game

About a year ago, Michael Wilbon wrote at The Undefeated that advanced analytics was incompatible with the way black athletes and fans relate to sports. Wilbon was absolutist in his claim.

“It’s not part of any discussion of any game for any reason, ever,” he said. “And that’s overwhelmingly true of all the black people we know (and older white people, too) regardless of occupation or station in life. No conversation is ever framed or dominated by numbers.”

Here’s an old-fashioned metric: Wilbon riled everybody up and got them talking. Which is probably why he suggested that the connection fans of color maintain with sports is emotional rather than statistical. It doesn’t take much imagination to see how this view might also be imposed on Latino fans.

I disagree because it’s not an either/or matter for me. Advanced analytics can fuel one’s baseball passion. It has for me. And it will continue to do so as long as there are curious people asking new questions or asking old questions in new ways. That’s how we get those fresh angles, and that’s how we add more sabor to the stories we’ve already told. Pass the habaneros and the cilantro, please.

Eric García McKinley is an ACLS Public Fellow and Research Analyst at Minnesota Public Radio. He holds a PhD in history from the University of Illinois, Urbana-Champaign. A diehard Rockies fan, he is an editor and writer for the Colorado Rockies blog Purple Row. 

Featured Image: Doug Pensinger / Getty Images Sport