Predicting the Quality Start

As you well know, fantasy formats have long been eschewing (and ridiculing) the use of the win as a category. It tends to still hold on rather stubbornly in standard Rotisserie 5×5 formats which are also widely panned (yet this author still clings to one of those teams annually). There are a number of logical swaps for the win, and one of them has historically been the “quality start,” which is what this post is all about.

The quality start has also been criticized as rather useless for describing whether a pitcher performed well or not, and yet if you don’t want to get into weighted metrics in your fantasy league, it’s still preferable to the sometimes arbitrary assignment of wins (and, perhaps more on point, the arbitrary failure to assign one).

Just so we’re operating with the same definition, most fantasy circles still use the John Lowe Philly Inquirer characterization of the quality start: a starting pitcher going at least six innings while giving up no more than three earned runs. We can punch holes in that over beers another day, but that’s our baseline for a quality start going forward.

What this post will seek to do is provide you with some basic raw data on quality starts from 2014. We’ll analyze correlations between a variety of variables and the quality start, and then try to develop a predictive model that might help you target pitchers most likely to register a goodly number of quality starts in 2015.

You might want to maximize that browser size to absorb a rather involved chart of data below. I somewhat arbitrarily chose 150 innings as my qualifying cutoff for starting pitchers in 2014. I say somewhat because in my head I really wanted at least 100 pitchers, I got 98, and 150 innings is nice and tidy, so, Yahtzee. There was far more data in the original chart, but I’ve distilled this down to include many usual suspects plus a couple which require a quick explanation.

“Pitches” is the total number of pitches; we need that to calculate the data point at the end of the chart, “PPI,” or pitches per inning. QS is our total raw number of quality starts, but obviously that can be impacted by the total number of starts one had during the season, so next to it we have QS%, the percentage of games started that were quality starts. No surprise to see Mr. Kershaw there. QS-W, mainly there for my curiosity, is quality starts minus wins, or more crudely put: who got screwed out of a win, and who managed to net more wins than actual quality starts (a negative number), of which there are few (Bud Norris, take a bow). Everything else should be self-explanatory.
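For anyone who wants to rebuild those derived columns, here is a minimal sketch. The function names are made up for illustration. One assumption is worth flagging: the PPI column in the chart appears to divide pitches by IP exactly as printed (2722 / 198.1 is about 13.74 for Kershaw), rather than converting 198.1 to 198 and one-third innings first, so the sketch does the same and offers the conversion separately.

```python
def derived_stats(gs, ip, pitches, wins, qs):
    """Return (QS%, QS-W, PPI) for one pitcher-season.

    ip is used exactly as printed (198.1 stays 198.1). That is an assumption
    made to match the PPI column in the chart, not a recommendation.
    """
    qs_pct = 100.0 * qs / gs      # share of starts that were quality starts
    qs_minus_w = qs - wins        # positive: more quality starts than wins
    ppi = pitches / ip            # pitches per inning
    return qs_pct, qs_minus_w, ppi


def true_innings(ip):
    """Convert box-score innings (198.1 = 198 and 1/3) to a real number."""
    whole = int(ip)
    outs = round((ip - whole) * 10)   # the digit after the point counts outs
    return whole + outs / 3.0


# Clayton Kershaw, 2014: 27 GS, 198.1 IP, 2722 pitches, 21 W, 24 QS
qs_pct, qs_minus_w, ppi = derived_stats(27, 198.1, 2722, 21, 24)
print(round(qs_pct), qs_minus_w, round(ppi, 2))   # prints: 89 3 13.74
```

Passing `true_innings(198.1)` instead of `198.1` gives a slightly lower PPI; the difference is small but it does move the second decimal.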

Player GS IP Pitches W TBF H R ER HR BB BB% SO K% K/9 BB/9 WHIP H/IP QS QS% QS-W PPI
Clayton Kershaw 27 198.1 2722 21 749 139 42 39 9 31 4% 239 32% 10.85 1.41 0.86 0.70 24 89% 3 13.74
Johnny Cueto 34 243.2 3659 20 961 169 69 61 22 65 7% 242 25% 8.94 2.4 0.96 0.69 29 85% 9 15.05
Jon Lester 32 219.2 3493 16 885 194 76 60 16 48 5% 220 25% 9.01 1.97 1.1 0.89 27 84% 11 15.94
Cole Hamels 30 204.2 3136 9 829 176 60 56 14 59 7% 198 24% 8.71 2.59 1.15 0.86 25 83% 16 15.36
Chris Sale 26 174 2753 12 685 129 48 42 13 39 6% 208 30% 10.76 2.02 0.97 0.74 21 81% 9 15.82
Felix Hernandez 34 236 3434 15 912 170 68 56 16 46 5% 248 27% 9.46 1.75 0.92 0.72 27 79% 12 14.55
Alex Wood 24 156.1 2420 8 625 132 50 45 14 39 6% 151 24% 8.69 2.25 1.09 0.85 19 79% 11 15.50
Sonny Gray 33 219 3295 14 899 187 84 75 15 74 8% 183 20% 7.52 3.04 1.19 0.85 26 79% 12 15.05
Adam Wainwright 32 227 3258 20 898 184 64 60 10 50 6% 179 20% 7.1 1.98 1.03 0.81 25 78% 5 14.35
Corey Kluber 34 235.2 3500 18 951 207 72 64 14 51 5% 269 28% 10.27 1.95 1.09 0.88 26 76% 8 14.88
Julio Teheran 33 221 3271 14 884 188 82 71 22 51 6% 186 21% 7.57 2.08 1.08 0.85 25 76% 11 14.80
Aaron Harang 33 204.1 3394 12 876 215 88 81 15 71 8% 161 18% 7.09 3.13 1.4 1.05 25 76% 13 16.63
Jordan Zimmermann 32 199.2 2924 14 800 185 67 59 13 29 4% 182 23% 8.2 1.31 1.07 0.93 24 75% 10 14.68
Yordano Ventura 30 181.1 2959 13 776 167 70 65 14 69 9% 156 20% 7.74 3.42 1.3 0.92 22 73% 9 16.34
Garrett Richards 26 168.2 2627 13 678 124 51 49 5 51 8% 164 24% 8.75 2.72 1.04 0.74 19 73% 6 15.62
Hyun-Jin Ryu 26 152 2443 14 631 152 60 57 8 29 5% 139 22% 8.23 1.72 1.19 1.00 19 73% 5 16.07
Lance Lynn 33 203.2 3450 15 866 185 72 62 13 72 8% 181 21% 8 3.18 1.26 0.91 24 73% 9 16.98
Dallas Keuchel 29 200 3020 12 808 187 71 65 11 48 6% 146 18% 6.57 2.16 1.18 0.94 21 72% 9 15.10
Doug Fister 25 164 2468 16 662 153 52 44 18 24 4% 98 15% 5.38 1.32 1.08 0.93 18 72% 2 15.05
Jake Arrieta 25 156.2 2416 10 614 114 46 44 5 41 7% 167 27% 9.59 2.36 0.99 0.73 18 72% 8 15.47
John Lackey 31 198 3078 14 833 206 94 84 24 47 6% 164 20% 7.45 2.14 1.28 1.04 22 71% 8 15.55
Tyson Ross 31 195.2 3119 13 811 165 75 61 13 72 9% 195 24% 8.97 3.31 1.21 0.85 22 71% 9 15.98
David Price 34 248.1 3730 15 1009 230 100 90 25 38 4% 271 27% 9.82 1.38 1.08 0.93 24 71% 9 15.03
Stephen Strasburg 34 215 3295 14 868 198 86 75 23 43 5% 242 28% 10.13 1.8 1.12 0.92 24 71% 10 15.33
James Shields 34 227 3632 14 939 224 95 81 23 44 5% 180 19% 7.14 1.74 1.18 0.99 24 71% 10 16.00
Jon Niese 30 187.2 2792 9 786 193 80 71 17 45 6% 138 18% 6.62 2.16 1.27 1.03 21 70% 12 14.91
Jeff Samardzija 33 219.2 3339 7 879 191 86 73 20 43 5% 202 23% 8.28 1.76 1.07 0.87 23 70% 16 15.23
Scott Feldman 29 180.1 2964 8 765 185 86 75 16 50 7% 107 14% 5.34 2.5 1.3 1.03 20 69% 12 16.46
Alfredo Simon 32 196.1 3014 15 818 181 80 75 22 56 7% 127 16% 5.82 2.57 1.21 0.92 22 69% 7 15.37
Wily Peralta 32 198.2 3192 15 838 198 88 78 23 61 7% 154 18% 6.98 2.76 1.3 1.00 22 69% 7 16.10
Rick Porcello 31 202.2 3019 15 828 208 88 77 18 39 5% 129 16% 5.73 1.73 1.22 1.03 21 68% 6 14.93
Tanner Roark 31 198.2 2999 17 798 178 64 63 16 39 5% 138 17% 6.25 1.77 1.09 0.90 21 68% 4 15.13
Kyle Lohse 31 198.1 3002 13 817 183 87 78 22 45 6% 141 17% 6.4 2.04 1.15 0.92 21 68% 8 15.15
R.A. Dickey 34 215.2 3513 14 914 191 101 89 26 74 8% 173 19% 7.22 3.09 1.23 0.89 23 68% 9 16.32
Max Scherzer 33 220.1 3638 18 904 196 80 77 18 63 7% 252 28% 10.29 2.57 1.18 0.89 22 67% 4 16.53
Hiroki Kuroda 32 199 3097 11 820 191 91 82 20 35 4% 146 18% 6.6 1.58 1.14 0.96 21 66% 10 15.56
Zack Greinke 32 202.1 3210 17 821 190 69 61 19 43 5% 207 25% 9.21 1.91 1.15 0.94 21 66% 4 15.88
Jose Quintana 32 200.1 3346 9 830 197 87 74 10 52 6% 178 21% 8 2.34 1.24 0.98 21 66% 12 16.72
Zack Wheeler 32 185.1 3308 11 794 167 84 73 14 79 10% 187 24% 9.08 3.84 1.33 0.90 21 66% 10 17.87
Jered Weaver 34 213.1 3352 18 888 193 87 85 27 65 7% 169 19% 7.13 2.74 1.21 0.91 22 65% 4 15.73
Bartolo Colon 31 202.1 3011 15 846 218 97 92 22 30 4% 151 18% 6.72 1.33 1.23 1.08 20 65% 5 14.90
Collin McHugh 25 154.2 2486 11 619 117 53 47 13 41 7% 157 25% 9.14 2.39 1.02 0.76 16 64% 5 16.12
Madison Bumgarner 33 217.1 3372 18 873 194 81 72 21 43 5% 219 25% 9.07 1.78 1.09 0.89 21 64% 3 15.53
Jarred Cosart 30 180.1 2947 13 766 173 80 74 9 73 10% 115 15% 5.74 3.64 1.36 0.96 19 63% 6 16.36
Matt Garza 27 163.1 2538 8 680 143 77 66 12 50 7% 126 19% 6.94 2.76 1.18 0.88 17 63% 9 15.56
Alex Cobb 27 166.1 2611 10 681 142 56 53 11 47 7% 149 22% 8.06 2.54 1.14 0.85 17 63% 7 15.72
Gio Gonzalez 27 158.2 2623 10 653 134 66 63 10 56 9% 162 25% 9.19 3.18 1.2 0.85 17 63% 7 16.58
Phil Hughes 32 209.2 3046 16 855 221 88 82 16 16 2% 186 22% 7.98 0.69 1.13 1.06 20 63% 4 14.56
Scott Kazmir 32 190.1 2983 15 777 171 81 75 16 50 6% 164 21% 7.75 2.36 1.16 0.90 20 63% 5 15.69
Jake Peavy 32 202.2 3225 7 852 196 91 84 23 63 7% 158 19% 7.02 2.8 1.28 0.97 20 63% 13 15.95
Chris Archer 32 194.2 3160 10 822 177 85 72 12 72 9% 173 21% 8 3.33 1.28 0.91 20 63% 10 16.27
Yovani Gallardo 32 192.1 3216 8 817 195 86 75 21 54 7% 146 18% 6.83 2.53 1.29 1.02 20 63% 12 16.74
John Danks 32 193.2 3298 11 855 205 106 102 25 74 9% 129 15% 5.99 3.44 1.44 1.06 20 63% 9 17.07
Chris Tillman 34 207.1 3411 13 871 189 83 77 21 66 8% 150 17% 6.51 2.86 1.23 0.91 21 62% 8 16.47
Miguel Gonzalez 26 157 2513 10 662 153 59 56 24 50 8% 110 17% 6.31 2.87 1.29 0.97 16 62% 6 16.01
Ervin Santana 31 196 2987 14 817 193 90 86 16 63 8% 179 22% 8.22 2.89 1.31 0.98 19 61% 5 15.24
Edinson Volquez 31 190.2 2949 13 802 165 75 65 17 71 9% 138 17% 6.51 3.35 1.24 0.87 19 61% 6 15.50
Hisashi Iwakuma 28 179 2542 15 709 167 70 70 20 21 3% 154 22% 7.74 1.06 1.05 0.93 17 61% 2 14.20
Mike Leake 33 214.1 3215 11 902 217 93 88 23 50 6% 164 18% 6.89 2.1 1.25 1.01 20 61% 9 15.02
Nathan Eovaldi 33 199.2 3198 6 854 223 107 97 14 43 5% 142 17% 6.4 1.94 1.33 1.12 20 61% 14 16.05
Jason Vargas 30 187 2611 11 790 197 82 77 19 41 5% 128 16% 6.16 1.97 1.27 1.05 18 60% 7 13.96
Henderson Alvarez 30 187 3003 12 772 198 65 55 14 33 4% 111 14% 5.34 1.59 1.24 1.06 18 60% 6 16.06
Mark Buehrle 32 202 3082 13 857 228 83 76 15 46 5% 119 14% 5.3 2.05 1.36 1.13 19 59% 6 15.26
Tom Koehler 32 191.1 2941 10 803 177 84 81 16 71 9% 153 19% 7.2 3.34 1.3 0.93 19 59% 9 15.39
Jason Hammel 29 173.1 2765 10 705 151 70 68 23 44 6% 154 22% 8 2.28 1.13 0.87 17 59% 7 15.97
Tim Hudson 31 189.1 2784 9 789 199 86 75 15 34 4% 120 15% 5.7 1.62 1.23 1.05 18 58% 9 14.72
Wade Miley 33 201.1 3217 8 866 207 103 97 23 75 9% 183 21% 8.18 3.35 1.4 1.03 19 58% 11 16.00
Ian Kennedy 33 201 3402 13 846 189 85 81 16 70 8% 207 24% 9.27 3.13 1.29 0.94 19 58% 6 16.93
Justin Verlander 32 206 3409 15 893 223 114 104 18 65 7% 159 18% 6.95 2.84 1.4 1.08 18 56% 3 16.55
Dan Haren 32 186 3096 13 776 183 101 83 27 36 5% 145 19% 7.02 1.74 1.18 0.98 18 56% 5 16.65
Hector Noesi 27 164.2 2587 8 694 166 88 81 27 54 8% 116 17% 6.34 2.95 1.34 1.01 15 56% 7 15.76
Chris Young 29 163 2696 12 682 143 70 67 26 60 9% 106 16% 5.85 3.31 1.25 0.88 16 55% 4 16.54
Jorge de la Rosa 32 184.1 3067 14 768 161 90 84 21 67 9% 139 18% 6.79 3.27 1.24 0.87 17 53% 3 16.66
Brandon McCarthy 32 200 3044 10 836 222 100 90 25 33 4% 175 21% 7.88 1.49 1.28 1.11 16 50% 6 15.22
Josh Collmenter 28 170.1 2595 10 683 155 73 67 17 35 5% 110 16% 5.81 1.85 1.12 0.91 14 50% 4 15.26
Jeremy Guthrie 32 202.2 3235 13 864 215 100 93 23 49 6% 124 14% 5.51 2.18 1.3 1.06 16 50% 3 16.00
Trevor Bauer 26 153 2591 5 663 151 76 71 16 60 9% 143 22% 8.41 3.53 1.38 0.99 13 50% 8 16.93
J.A. Happ 26 153 2600 11 647 154 75 70 21 46 7% 128 20% 7.53 2.71 1.31 1.01 13 50% 2 16.99
Kyle Gibson 31 179.1 2800 13 757 178 91 89 12 57 8% 107 14% 5.37 2.86 1.31 0.99 15 48% 2 15.63
Wei-Yin Chen 31 185.2 2977 16 772 193 77 73 23 35 5% 136 18% 6.59 1.7 1.23 1.04 15 48% -1 16.07
C.J. Wilson 31 175.2 3108 13 761 169 95 88 17 85 11% 151 20% 7.74 4.35 1.45 0.96 15 48% 2 17.74
Francisco Liriano 29 162.1 2714 7 691 130 68 61 13 81 12% 175 25% 9.7 4.49 1.3 0.80 14 48% 7 16.74
Ryan Vogelsong 32 184.2 3058 8 780 178 86 82 18 58 7% 151 19% 7.36 2.83 1.28 0.97 15 47% 7 16.60
Clay Buchholz 28 170.1 2741 8 737 182 108 101 17 54 7% 132 18% 6.97 2.85 1.39 1.07 13 46% 5 16.11
Charlie Morton 26 157.1 2504 6 666 143 76 65 9 57 9% 126 19% 7.21 3.26 1.27 0.91 12 46% 6 15.94
Shelby Miller 31 182 2841 10 759 158 78 76 22 72 9% 127 17% 6.28 3.56 1.26 0.87 14 45% 4 15.61
Jake Odorizzi 31 168 3028 11 719 156 79 77 20 59 8% 174 24% 9.32 3.16 1.28 0.93 14 45% 3 18.02
A.J. Burnett 34 213.2 3472 8 935 205 122 109 20 96 10% 190 20% 8 4.04 1.41 0.96 15 44% 7 16.29
Eric Stults 32 176 2833 8 763 197 93 84 26 45 6% 111 15% 5.68 2.3 1.38 1.12 14 44% 6 16.10
Travis Wood 31 173.2 3045 8 781 190 110 97 20 76 10% 146 19% 7.57 3.94 1.53 1.10 13 42% 5 17.58
Kyle Kendrick 32 199 3102 10 865 214 108 102 25 57 7% 121 14% 5.47 2.58 1.36 1.08 13 41% 3 15.59
Bud Norris 28 165.1 2746 15 687 149 68 67 20 52 8% 139 20% 7.57 2.83 1.22 0.90 11 39% -4 16.63
Roberto Hernandez 29 162.2 2701 8 713 154 84 75 19 72 10% 102 14% 5.64 3.98 1.39 0.95 11 38% 3 16.65
Drew Hutchison 32 184.2 3051 11 786 173 92 92 23 60 8% 184 23% 8.97 2.92 1.26 0.94 12 38% 1 16.56
Ricky Nolasco 27 159 2641 6 695 203 96 95 22 38 5% 115 17% 6.51 2.15 1.52 1.28 10 37% 4 16.61
Vidal Nuno 28 157.1 2497 2 655 148 82 75 23 43 7% 124 19% 7.09 2.46 1.21 0.94 10 36% 8 15.89
Roenis Elias 29 163.2 2661 10 693 151 77 70 16 64 9% 143 21% 7.86 3.52 1.31 0.93 8 28% -2 16.31
Colby Lewis 29 170.1 2802 10 762 211 107 98 25 48 6% 133 17% 7.03 2.54 1.52 1.24 8 28% -2 16.47

I ran correlations with the percentage of quality starts across many variables, including things like Pace, line drive rate, ground ball rate, and fly ball rate. Because, why not? It’s fun to see what relationships develop even if you can explain them away. In fact, ground ball rate and fly ball rate were both significant at the .05 level, but that relationship didn’t really translate when developing a model later.

What you have below are kind of the “no kidding” variables that would impact a quality start, highlighted by IP/GS, which, duh: six innings is baked into the definition, so quality starts and innings per start ought to trend together. Several of these variables also trend with one another, and if memory serves that is what’s called multicollinearity; my old instructor of statistics actually liked to see it in correlations just to make sure your variables pass the smell test. If I had more statistical chops, I’d tease this point out, but suffice to say that it doesn’t take anything away from the instructive nature of the correlations here, and it certainly does not impact the model below because I cast them aside. So there.

                                      QS%      H/IP     WHIP     IP/GS    PPI      K%       BB%
QS%                 Correlation       1        -.504**  -.659**  .744**   -.457**  .422**   -.329**
                    Sig. (2-tailed)   n/a      .000     .000     .000     .000     .000     .001
H/IP                Correlation       -.504**  1        .760**   -.347**  .205*    -.631**  -.165
                    Sig. (2-tailed)   .000     n/a      .000     .000     .043     .000     .104
WHIP                Correlation       -.659**  .760**   1        -.643**  .608**   -.583**  .515**
                    Sig. (2-tailed)   .000     .000     n/a      .000     .000     .000     .000
IP/GS               Correlation       .744**   -.347**  -.643**  1        -.669**  .404**   -.516**
                    Sig. (2-tailed)   .000     .000     .000     n/a      .000     .000     .000
Pitches Per Inning  Correlation       -.457**  .205*    .608**   -.669**  1        -.093    .646**
                    Sig. (2-tailed)   .000     .043     .000     .000     n/a      .360     .000
K%                  Correlation       .422**   -.631**  -.583**  .404**   -.093    1        -.056
                    Sig. (2-tailed)   .000     .000     .000     .000     .360     n/a      .587
BB%                 Correlation       -.329**  -.165    .515**   -.516**  .646**   -.056    1
                    Sig. (2-tailed)   .001     .104     .000     .000     .000     .587     n/a
N = 98 for every pair. PPI = Pitches Per Inning.
**Correlation is significant at the 0.01 level (2-tailed)
*Correlation is significant at the 0.05 level (2-tailed)
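If you want to reproduce numbers like these yourself, the sketch below shows one way to compute a correlation coefficient and the t statistic behind a two-tailed significance test. It assumes these are ordinary Pearson correlations, which the output above does not state outright. The six-pitcher sample is copied from the chart purely for illustration, so its r will not match the full 98-pitcher value. With 96 degrees of freedom, the two-tailed .05 cutoff is roughly |t| = 1.98.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Six rows from the chart (WHIP, QS%), for illustration only.
whip = [0.86, 0.96, 1.40, 1.52, 1.52, 1.31]
qs_pct = [89, 85, 76, 28, 37, 28]
r_sample = pearson_r(whip, qs_pct)

# Reported full-sample correlations, N = 98.
t_ppi_hip = t_statistic(0.205, 98)    # PPI vs H/IP, flagged * in the matrix
t_ppi_k = t_statistic(-0.093, 98)     # PPI vs K%, not flagged
print(round(r_sample, 3), round(t_ppi_hip, 2), round(t_ppi_k, 2))
```

The PPI vs H/IP statistic lands just past the cutoff while PPI vs K% falls well short, which lines up with the single asterisk on one and the .360 on the other.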

So in looking over this set of data, I had a variety of thoughts, but allow me to condense for clarity: I wanted a simple model, something that could actually be usable. I didn’t want a model controlling for buckets of variables, and fortunately, we don’t have to. Strikeouts and walks are both related to pitches per inning: walks very much so (.646), and strikeouts in a negative way (-.093), which makes sense (although, no, that one is not statistically significant, but it’s there). WHIP becomes particularly handy because, by definition, it covers walks and hits. So in an effort to paint with a broader brush, can we explain a decent degree of the variance in quality start percentage from just WHIP and average pitches per inning? Sort of.

Using basic multiple regression with WHIP and PPI as our independent variables and with quality start percentage as our dependent variable, we get the following:

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .663a   .439       .427                9.73458%
a. Predictors: (Constant), Pitches Per Inning, WHIP
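As a quick sanity check on that summary, the adjusted R-squared can be recomputed from the plain R-squared, the sample size (98 pitchers), and the number of predictors (two). The function name here is just for illustration; the formula is the standard adjustment.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


adj = adjusted_r_squared(0.439, n=98, k=2)
print(round(adj, 3))   # prints: 0.427
```

With only two predictors and 98 observations the penalty is tiny, which is why the two figures sit so close together.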

An adjusted R-squared of .427 ain’t bad, and as far as the model fit goes, again, this is where we get into what might pass the smell test. Frankly, if I simply wanted to drive up the R-squared, I’d use ERA as an independent variable, and that would yield something super impressive and perhaps there would be much rejoicing and we could all go get a beverage. But that’s like predicting cloud coverage in a rain storm, and it really doesn’t help us, because I can say, unequivocally, that if you want quality starts you should pick guys who ought to have a low ERA. But if we want to drill down a little, well, how about efficiency?

Coefficients
                       Unstandardized            Standardized
Model 1                B          Std. Error     Beta           t        Sig.
(Constant)             156.769    19.191                        8.169    .000
Pitches Per Inning     -1.400     1.506          -.090          -.929    .355
WHIP                   -59.783    9.582          -.604          -6.239   .000
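For the curious, here is a rough sketch of how a two-predictor least-squares fit like this can be done by hand, using the normal equations and plain Python. This is not the stats package that produced the table above, and `solve` and `fit_ols` are hypothetical helper names. To keep the example checkable, it generates QS% values from the published coefficients for six (PPI, WHIP) pairs taken from the chart and confirms the fit recovers them; feeding it all 98 real rows is what would reproduce the actual table.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(ppi, whip, y):
    """Least-squares fit of y = b0 + b1*ppi + b2*whip via the normal equations."""
    rows = [[1.0, p, w] for p, w in zip(ppi, whip)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)


# (PPI, WHIP) pairs from the chart; y is synthetic, built from the published fit.
ppi = [13.74, 15.05, 16.63, 16.47, 14.20, 18.02]
whip = [0.86, 0.96, 1.40, 1.52, 1.05, 1.28]
y = [156.769 - 1.4 * p - 59.783 * w for p, w in zip(ppi, whip)]

b0, b1, b2 = fit_ols(ppi, whip, y)
print(round(b0, 3), round(b1, 3), round(b2, 3))   # should land on the published values
```

The normal equations are fine for a small, tidy problem like this one; for anything bigger or more collinear, a library least-squares routine is the safer choice.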

Because both are negative impacts on quality start percentage, we’re left with a bit of a wacky pair of coefficients working backwards from a large constant towards the predicted quality start rate. In this particular model, WHIP is statistically significant, as is the constant, but pitches per inning is not (p = .355). My interpretation is that this may suggest some collinearity, since WHIP and PPI are themselves correlated at .608, but it doesn’t render the model junk given the “goodness of fit” we find in the model summary. Plowing forward, our model would then look like this: xQS% = 156.769 + (-1.4)*PPI + (-59.783)*WHIP.
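The equation is easy to wrap in a small function. The sketch below uses the published coefficients; the function and constant names are illustrative, and the two sample inputs are Kershaw’s and Hamels’s 2014 PPI and WHIP from the chart above.

```python
INTERCEPT = 156.769
B_PPI = -1.400
B_WHIP = -59.783


def expected_qs_pct(ppi, whip):
    """xQS% from the fitted equation; inputs are pitches per inning and WHIP."""
    return INTERCEPT + B_PPI * ppi + B_WHIP * whip


print(round(expected_qs_pct(13.74, 0.86), 1))   # Clayton Kershaw's 2014 inputs: 86.1
print(round(expected_qs_pct(15.36, 1.15), 1))   # Cole Hamels's 2014 inputs: 66.5
```

Note how much more WHIP moves the needle: a 0.10 change in WHIP is worth about six points of xQS%, while a full extra pitch per inning costs only 1.4.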

Therefore, if you wildly speculate that a pitcher will average 15 pitches per inning with a 1.15 WHIP, the model would predict a quality start rate of about 67%. Since those two variables explain a bit less than half of the variance (an R-squared of .439), you could conceivably target the arms projected to have low ERAs (or FIP, if you prefer) to account for some of the remaining variance, and you might be on to something. But for the fun of it, applying the model to last year’s starters, we can plug in xQS and then see what the model thinks of their actual quality start percentage:

Player GS IP Pitches WHIP QS QS% PPI xQS xQS-QS
Clayton Kershaw 27 198.1 2722 0.86 24 89% 13.74 86.1% -3%
Johnny Cueto 34 243.2 3659 0.96 29 85% 15.05 78.3% -7%
Jon Lester 32 219.2 3493 1.1 27 84% 15.94 68.7% -16%
Cole Hamels 30 204.2 3136 1.15 25 83% 15.36 66.5% -17%
Chris Sale 26 174 2753 0.97 21 81% 15.82 76.6% -4%
Felix Hernandez 34 236 3434 0.92 27 79% 14.55 81.4% 2%
Alex Wood 24 156.1 2420 1.09 19 79% 15.50 69.9% -9%
Sonny Gray 33 219 3295 1.19 26 79% 15.05 64.6% -14%
Adam Wainwright 32 227 3258 1.03 25 78% 14.35 75.1% -3%
Corey Kluber 34 235.2 3500 1.09 26 76% 14.88 70.8% -6%
Julio Teheran 33 221 3271 1.08 25 76% 14.80 71.5% -4%
Aaron Harang 33 204.1 3394 1.4 25 76% 16.63 49.8% -26%
Jordan Zimmermann 32 199.2 2924 1.07 24 75% 14.68 72.3% -3%
Yordano Ventura 30 181.1 2959 1.3 22 73% 16.34 56.2% -17%
Garrett Richards 26 168.2 2627 1.04 19 73% 15.62 72.7% 0%
Hyun-Jin Ryu 26 152 2443 1.19 19 73% 16.07 63.1% -10%
Lance Lynn 33 203.2 3450 1.26 24 73% 16.98 57.7% -15%
Dallas Keuchel 29 200 3020 1.18 21 72% 15.10 65.1% -7%
Doug Fister 25 164 2468 1.08 18 72% 15.05 71.1% -1%
Jake Arrieta 25 156.2 2416 0.99 18 72% 15.47 75.9% 4%
John Lackey 31 198 3078 1.28 22 71% 15.55 58.5% -12%
Tyson Ross 31 195.2 3119 1.21 22 71% 15.98 62.1% -9%
David Price 34 248.1 3730 1.08 24 71% 15.03 71.2% 1%
Stephen Strasburg 34 215 3295 1.12 24 71% 15.33 68.4% -2%
James Shields 34 227 3632 1.18 24 71% 16.00 63.8% -7%
Jon Niese 30 187.2 2792 1.27 21 70% 14.91 60.0% -10%
Jeff Samardzija 33 219.2 3339 1.07 23 70% 15.23 71.5% 2%
Scott Feldman 29 180.1 2964 1.3 20 69% 16.46 56.0% -13%
Alfredo Simon 32 196.1 3014 1.21 22 69% 15.37 62.9% -6%
Wily Peralta 32 198.2 3192 1.3 22 69% 16.10 56.5% -12%
Rick Porcello 31 202.2 3019 1.22 21 68% 14.93 62.9% -5%
Tanner Roark 31 198.2 2999 1.09 21 68% 15.13 70.4% 3%
Kyle Lohse 31 198.1 3002 1.15 21 68% 15.15 66.8% -1%
R.A. Dickey 34 215.2 3513 1.23 23 68% 16.32 60.4% -7%
Max Scherzer 33 220.1 3638 1.18 22 67% 16.53 63.1% -4%
Hiroki Kuroda 32 199 3097 1.14 21 66% 15.56 66.8% 1%
Zack Greinke 32 202.1 3210 1.15 21 66% 15.88 65.8% 0%
Jose Quintana 32 200.1 3346 1.24 21 66% 16.72 59.2% -6%
Zack Wheeler 32 185.1 3308 1.33 21 66% 17.87 52.2% -13%
Jered Weaver 34 213.1 3352 1.21 22 65% 15.73 62.4% -2%
Bartolo Colon 31 202.1 3011 1.23 20 65% 14.90 62.4% -2%
Collin McHugh 25 154.2 2486 1.02 16 64% 16.12 73.2% 9%
Madison Bumgarner 33 217.1 3372 1.09 21 64% 15.53 69.9% 6%
Jarred Cosart 30 180.1 2947 1.36 19 63% 16.36 52.6% -11%
Matt Garza 27 163.1 2538 1.18 17 63% 15.56 64.4% 1%
Alex Cobb 27 166.1 2611 1.14 17 63% 15.72 66.6% 4%
Gio Gonzalez 27 158.2 2623 1.2 17 63% 16.58 61.8% -1%
Phil Hughes 32 209.2 3046 1.13 20 63% 14.56 68.8% 6%
Scott Kazmir 32 190.1 2983 1.16 20 63% 15.69 65.5% 3%
Jake Peavy 32 202.2 3225 1.28 20 63% 15.95 57.9% -5%
Chris Archer 32 194.2 3160 1.28 20 63% 16.27 57.5% -5%
Yovani Gallardo 32 192.1 3216 1.29 20 63% 16.74 56.2% -6%
John Danks 32 193.2 3298 1.44 20 63% 17.07 46.8% -16%
Chris Tillman 34 207.1 3411 1.23 21 62% 16.47 60.2% -2%
Miguel Gonzalez 26 157 2513 1.29 16 62% 16.01 57.2% -4%
Ervin Santana 31 196 2987 1.31 19 61% 15.24 57.1% -4%
Edinson Volquez 31 190.2 2949 1.24 19 61% 15.50 60.9% 0%
Hisashi Iwakuma 28 179 2542 1.05 17 61% 14.20 74.1% 13%
Mike Leake 33 214.1 3215 1.25 20 61% 15.02 61.0% 0%
Nathan Eovaldi 33 199.2 3198 1.33 20 61% 16.05 54.8% -6%
Jason Vargas 30 187 2611 1.27 18 60% 13.96 61.3% 1%
Henderson Alvarez 30 187 3003 1.24 18 60% 16.06 60.2% 0%
Mark Buehrle 32 202 3082 1.36 19 59% 15.26 54.1% -5%
Tom Koehler 32 191.1 2941 1.3 19 59% 15.39 57.5% -2%
Jason Hammel 29 173.1 2765 1.13 17 59% 15.97 66.9% 8%
Tim Hudson 31 189.1 2784 1.23 18 58% 14.72 62.6% 5%
Wade Miley 33 201.1 3217 1.4 19 58% 16.00 50.7% -7%
Ian Kennedy 33 201 3402 1.29 19 58% 16.93 56.0% -2%
Justin Verlander 32 206 3409 1.4 18 56% 16.55 49.9% -6%
Dan Haren 32 186 3096 1.18 18 56% 16.65 62.9% 7%
Hector Noesi 27 164.2 2587 1.34 15 56% 15.76 54.6% -1%
Chris Young 29 163 2696 1.25 16 55% 16.54 58.9% 4%
Jorge de la Rosa 32 184.1 3067 1.24 17 53% 16.66 59.3% 6%
Brandon McCarthy 32 200 3044 1.28 16 50% 15.22 58.9% 9%
Josh Collmenter 28 170.1 2595 1.12 14 50% 15.26 68.5% 18%
Jeremy Guthrie 32 202.2 3235 1.3 16 50% 16.00 56.7% 7%
Trevor Bauer 26 153 2591 1.38 13 50% 16.93 50.6% 1%
J.A. Happ 26 153 2600 1.31 13 50% 16.99 54.7% 5%
Kyle Gibson 31 179.1 2800 1.31 15 48% 15.63 56.6% 8%
Wei-Yin Chen 31 185.2 2977 1.23 15 48% 16.07 60.7% 12%
C.J. Wilson 31 175.2 3108 1.45 15 48% 17.74 45.2% -3%
Francisco Liriano 29 162.1 2714 1.3 14 48% 16.74 55.6% 7%
Ryan Vogelsong 32 184.2 3058 1.28 15 47% 16.60 57.0% 10%
Clay Buchholz 28 170.1 2741 1.39 13 46% 16.11 51.1% 5%
Charlie Morton 26 157.1 2504 1.27 12 46% 15.94 58.5% 12%
Shelby Miller 31 182 2841 1.26 14 45% 15.61 59.6% 14%
Jake Odorizzi 31 168 3028 1.28 14 45% 18.02 55.0% 10%
A.J. Burnett 34 213.2 3472 1.41 15 44% 16.29 49.7% 6%
Eric Stults 32 176 2833 1.38 14 44% 16.10 51.7% 8%
Travis Wood 31 173.2 3045 1.53 13 42% 17.58 40.7% -1%
Kyle Kendrick 32 199 3102 1.36 13 41% 15.59 53.6% 13%
Bud Norris 28 165.1 2746 1.22 11 39% 16.63 60.5% 21%
Roberto Hernandez 29 162.2 2701 1.39 11 38% 16.65 50.4% 12%
Drew Hutchison 32 184.2 3051 1.26 12 38% 16.56 58.3% 21%
Ricky Nolasco 27 159 2641 1.52 10 37% 16.61 42.6% 6%
Vidal Nuno 28 157.1 2497 1.21 10 36% 15.89 62.2% 26%
Roenis Elias 29 163.2 2661 1.31 8 28% 16.31 55.6% 28%
Colby Lewis 29 170.1 2802 1.52 8 28% 16.47 42.8% 15%
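The last column is just the model’s expected rate minus the actual rate. Here is a small sketch that recomputes it for four pitchers copied from the table and sorts them by the gap. The row tuples and variable names are made up for illustration, and the actual rate is taken as QS divided by GS rather than the rounded QS% column.

```python
def expected_qs_pct(ppi, whip):
    """xQS% from the published equation."""
    return 156.769 - 1.400 * ppi - 59.783 * whip


# (name, GS, QS, WHIP, PPI) copied from the table above
rows = [
    ("Aaron Harang", 33, 25, 1.40, 16.63),
    ("Hisashi Iwakuma", 28, 17, 1.05, 14.20),
    ("Vidal Nuno", 28, 10, 1.21, 15.89),
    ("Roenis Elias", 29, 8, 1.31, 16.31),
]

gaps = []
for name, gs, qs, whip, ppi in rows:
    actual = 100.0 * qs / gs                      # actual QS%
    gap = expected_qs_pct(ppi, whip) - actual     # positive: model expects more
    gaps.append((name, round(gap)))

# Biggest positive gap first: the pitchers the model thinks were shortchanged.
gaps.sort(key=lambda item: item[1], reverse=True)
for name, gap in gaps:
    print(f"{name}: {gap:+d}")
```

Swap in projected WHIP and PPI for the coming season and the same loop becomes a crude target list.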

So it’s buying Clayton Kershaw (thank goodness) but not Aaron Harang. Further down the scale, it’s not surprising to find that pitchers who seemed particularly unlucky (like Nuno and Elias) in QS% are predicted to have higher figures, although not altogether inspiring percentages. Some exceptions are C.J. Wilson and Travis Wood, presumably nailed for their high WHIP and PPI. One name that jumped out here is Hisashi Iwakuma who registered a fairly respectable 61% QS but his xQS is over 74% which puts him near the top in overall xQS.

So if the question is can we predict quality starts, well that answer is no, not really. But can we come up with a simple framework using reasonable projections and historical data in order to get us about half way towards guessing who might rack up more quality starts? Yeah. We can.

(Note: a Twitter follower asked whether team defense might impact a quality start, and I thought that was a nice idea since defense might very well impact the number of pitches per inning a starter throws. But in running correlations against QS, WHIP and PPI, team defense is weakly correlated and not statistically significant. The correlation with QS was .093, with WHIP -.105, and with PPI -.074.)





Michael was born in Massachusetts and grew up in the Seattle area but had nothing to do with the Heathcliff Slocumb trade although Boston fans are welcome to thank him. You can find him on twitter at @michaelcbarr.

6 Responses to “Predicting the Quality Start”

  1. bmiltenberg says:

    Interesting stuff. Would the same equation work within steamer 2015 projections?

  2. Michael Barr says:

    That’s what I plan to do — plug in Steamer projections into the equation, then rank and sort by score and projected ERA.

  3. Thomas says:

    Where do you get Steamer projections for total pitches for a season?

  4. DBA455 says:

    I think there’s a much easier way to do this that’s at least as accurate. It would be clumsy to format the data in this text box, but if someone tells me how to submit an article to the community blog I’ll put it together.

    Thanks for the article Michael. I am no more than 8% robot.

  5. Ryan says:

    Michael, great read, looks like you put a lot of thought into this. When do you plan on inserting Steamer projections? I’d like the inside track on who is undervalued as far as QS are concerned for 2015 🙂

    Thanks!

  6. Chris Walker says:

    For those that want to plug in Steamer projections, you can use this to get pitches per inning:

    http://www.fangraphs.com/blogs/walks-strikeouts-and-pitch-counts/