BASEBALL AND NOT BASEBALL
This is just a quick follow-up to the last post. SO% (K%) does not impact the stranding of runners on third base much. This is probably because it is already factored into batting average.
In fact, using 2009 American League team data, a two-variable linear regression on batting average and strikeout rate gives an r2 of 0.6, while a regression on batting average alone gives 0.59.
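For readers who would rather check this kind of comparison outside Excel, here is a minimal sketch in Python with NumPy. The helper name `r_squared` and all of the numbers are made up for illustration; they are not the 2009 American League data, and `third_base_rate` is only a placeholder for the third-base column. It fits the regression with and without the strikeout column and compares the two r2 values.

```python
import numpy as np

def r_squared(X, y):
    """Fit y = X*m + b by ordinary least squares and return r2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single variable as one column
    # Append a column of ones so the intercept b is fitted too.
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Hypothetical values for illustration only (NOT the real team data).
ba = np.array([0.250, 0.262, 0.271, 0.283, 0.258, 0.267, 0.275])
so = np.array([0.210, 0.190, 0.175, 0.160, 0.200, 0.185, 0.180])
third_base_rate = np.array([0.52, 0.49, 0.47, 0.44, 0.50, 0.48, 0.46])

r2_ba = r_squared(ba, third_base_rate)
r2_both = r_squared(np.column_stack([ba, so]), third_base_rate)
print(f"BA only: {r2_ba:.3f}   BA + SO%: {r2_both:.3f}")
```

Adding a variable to a least-squares fit with an intercept can never lower r2, so the interesting question is how much it rises. A bump as small as 0.59 to 0.6 suggests the extra variable adds very little.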
What I learned is how to use the Excel LINEST function.
LINEST(array,array,TRUE,TRUE) should have the dependent variable (the known y values) as the first array and the candidate independent variables (the known x values, one column per variable) as the second array. The first TRUE allows the y-intercept (b) to be non-zero. The last TRUE tells Excel to compute r2 and other regression statistics.
Using any search engine you can find out where in the array returned by LINEST the information you want is. For instance, INDEX(LINEST(array,array,TRUE,TRUE),3,1) returns r2 for the regression. The coefficients of the variables (the mi and b in y = m1x1 + m2x2 + . . . + b) are in row 1 of the LINEST array. Note that they come back in reverse order: the coefficient of the last variable is first, and b is last. You can also force Excel to show the entire array: select a block of cells large enough to hold it, press F2 to edit the formula, then press Ctrl-Shift-Enter. (See http://office.microsoft.com/en-us/excel/HP052091551033.aspx.)
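To make the layout concrete, here is a rough Python sketch of the two pieces of LINEST output mentioned above: row 1 (the coefficients) and the r2 value from row 3, column 1. The function name `linest_like` is hypothetical, and it only imitates those two pieces, not the full statistics array. The inputs are toy numbers built from a known formula so the answer can be checked by eye.

```python
import numpy as np

def linest_like(y, X):
    """Return (row1, r2), where row1 mimics the first row of LINEST:
    slopes in reverse column order, followed by the intercept b."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    # Column of ones plays the role of the third argument being TRUE.
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    slopes, intercept = coef[:-1], coef[-1]
    resid = y - A @ coef
    r2 = 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())
    row1 = np.append(slopes[::-1], intercept)  # last variable's slope first
    return row1, r2

# Toy data built so that y = 2*x1 + 3*x2 + 1 exactly.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y = 2 * x1 + 3 * x2 + 1

row1, r2 = linest_like(y, np.column_stack([x1, x2]))
print(row1, r2)
```

Because the toy data follow the formula exactly, row1 should come out very close to 3, 2, 1 (the slope for x2, then the slope for x1, then b) and r2 very close to 1.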
Here is the data I used:
|TEAM||2009 BA||SO%||2009 3B, <2|