Thursday, January 17, 2008

Jay-f/x 2007: Introduction

I'll be using Josh Kalk's Pitch-f/x data summaries, which are available for most pitchers. The set of pitchers studied includes all the "important" pitchers, where "important" is defined by me, plus Tomo Ohka. Pitch classifications are left at the Kalk defaults, with two exceptions:

  • Roy Halladay's slider was renamed as a curve. Almost everyone considers this pitch a curveball.
  • A.J. Burnett's sinker was changed to a plain fastball. He often works "fastball up, curveball down" so I don't think those are sinkers.
I have no issues with Kalk's classification methods, not by any means, but I felt I should make those changes.

How reliable is the data?

I can't find it now, but there was an article discussing how the recorded pitch speeds at Rogers Centre were artificially high. Also, we don't have every single pitch for all of these pitchers. Baseball-Reference records the number of pitches thrown by each pitcher in 2007, and the Kalk data provides a count of the pitches it captured. I've summarized these numbers below:

Pitcher     Kalk   B-Ref   % recorded
Marcum      1826    2543   72%
McGowan     1842    2702   68%
Janssen      691    1065   65%
Litsch      1126    1771   64%
Halladay    2097    3326   63%
Burnett     1551    2649   59%
Frasor       596    1034   58%
Towers       956    1667   57%
Tallet       619    1084   57%
Downs        505     912   55%
Accardo      587    1081   54%

TOTAL      12396   19834   62%

Marcum and McGowan are the best-covered pitchers, though we have the equivalent of two more analysed starts for Halladay. The five starters are represented well, and only one third of Janssen's pitches are missing. So if you want, you can pretty much ignore anything I say for Frasor through Accardo.
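
If you want to check the "% recorded" column yourself, here's a minimal sketch of the arithmetic in Python. The counts are copied from the coverage table; the `counts` dictionary and the `coverage` helper are just names I'm making up for illustration, not anything from Kalk's data or Baseball-Reference.

```python
# Pitch counts copied from the coverage table: (Kalk, B-Ref).
counts = {
    "Marcum": (1826, 2543),
    "McGowan": (1842, 2702),
    "Janssen": (691, 1065),
    "Litsch": (1126, 1771),
    "Halladay": (2097, 3326),
    "Burnett": (1551, 2649),
    "Frasor": (596, 1034),
    "Towers": (956, 1667),
    "Tallet": (619, 1084),
    "Downs": (505, 912),
    "Accardo": (587, 1081),
}


def coverage(kalk, bref):
    """Share of pitches present in the Kalk data, rounded to a whole percent."""
    return round(100 * kalk / bref)


# Per-pitcher coverage, in the same order as the table.
for name, (kalk, bref) in counts.items():
    print(f"{name:<10}{kalk:>6}{bref:>8}{coverage(kalk, bref):>6}%")

# Staff-wide totals: sum the raw counts first, then take the ratio.
total_kalk = sum(kalk for kalk, _ in counts.values())
total_bref = sum(bref for _, bref in counts.values())
print(f"{'TOTAL':<10}{total_kalk:>6}{total_bref:>8}{coverage(total_kalk, total_bref):>6}%")
```

Note that the total is the ratio of the summed counts, not the average of the individual percentages, so the pitchers with heavier workloads count for more.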

Other general warnings about the accuracy of the Pitch-f/x system as a whole apply.

What do they throw?

Glad you asked.

Pitch type and frequency (%) for Toronto Blue Jays pitchers
Pitcher Fastball Cutter Sinker Slider Change Curve
Accardo 72

6 22
Burnett 63

9 28
Downs 53

11 9 27
Frasor 70

25 6
Halladay 32 45

Janssen 33 36
Litsch 10 17
61 12
Marcum 31 26
14 21 8
McGowan 59

19 10 12
Ohka 46


Tallet 45 16
26 13
Towers 51 8
35 6

Or, graphically:

[Figure: pitch type and frequency (%) for Toronto Blue Jays pitchers]

(You'll note that Ohka wasn't included in the pitcher coverage table. I actually added him near the end of my analysis, just so we could have a good laugh. He doesn't even have 500 pitches recorded by Kalk.)

This seems like a decent introduction for now. Consider those tables for a while and let me know what you'd like to see next (or do it yourself!)


Rob said...

In the Show, everyone can hit heat: Frasor, Towers, and Ohka (three pitchers who didn't exactly punch out the other side last year) barely threw a curve or change.

Richard said...

Good stuff, Craig. The "Young Hero" really does throw an assortment of pitches. I go back and forth as to how good he is, but I'm somewhat optimistic.

Harry Pavlidis said...

Glad to see another PITCHf/x blogger.
Looking forward to your updates. Check out Cubs f/x - I'll add a link to your site, too.

Tybalt said...

Richard, that wasn't me, that was Rob.

The Young Hero's "slider"? That's actually his two-seam fastball.

I'll let that sink in for a sec.

Nice, eh? That's some serious break.

Rob said...

I'm not sure what Litsch throws, actually. That's why I put this graph out there first. Should I just rename his slider or combine the two-seam with his other fastballs?

Also, I missed the Frasor "slider" -- it should be a curve. (If I'm still right about Towers and Ohka, it may be a factor in their crappy performance.)

Rob said...

Wait, how can that be Litsch's two-seam if it's recorded as low as 66 mph?

Mike Fast said...

What Josh Kalk has as Litsch's slider is definitely not a two-seam fastball. However, it's not all sliders, either. It looks like there's both a curveball and slider in that grouping, and probably some of what really are cutters got lumped into the sliders on his graph, too.

It pays to be careful with Josh Kalk's pitch classifications if you're trying to do serious analysis. For a little fun they're good enough. In my experience the version of his algorithm that's on the pitcher cards gets about 70% of pitches classified correctly, so take that for what it's worth. The algorithm is particularly bad about sliders and cutters and differentiating different types of fastballs. It usually handles changeups pretty well but doesn't do so well with splitters.

Mike Fast said...

Also, Jonathan Hale has a good PITCHf/x breakdown of Jesse Litsch here:

Jonathan said...

Yah, there's no way you can really generalize about Jesse because his breaking balls were all over the place over the season. I figured he was throwing a cutter that's almost a slider, a true slider and a curveball. Then at the end of the year he morphed his slider and curveball into one pitch that was somewhere between the two.