I won’t trust Burnett until he throws 3 quality starts in succession and uses an occasional changeup to give hitters another thing to think about.
Any further outings like his recent ones put him in jeopardy for the ALDS. Girardi can get away with a 3-man rotation until the ALCS.
I think conservatively what you can do is to look at several years of data, combine it with scouting reports and see where there is agreement and where there are discrepancies. In turn you can utilize the data over those larger periods of time to help inform an evaluation of the player falling into general categories: bad / below average / average / good / very good. Something like that.
But at this time I think that’s most of what you can do with the defensive numbers.
My problem with your thesis is that it would lead those who don’t spend much time thinking about defensive metrics to conclude that there is no point thinking about defensive metrics, or that what we have now represents no improvement over the days of “errors” or “total chances”.
That’s not the case. As Tom Tango says, improvement has been made. We just can’t validly say how much.
At this point AJ is like a box of chocolates… with most of the good ones missing.
=============
Maybe he ate all the bad ones and mostly good ones are left?
We’ll see tonight.
Anytime AJ gets in trouble on the mound I find myself reciting Lillo Brancato in the movie “A Bronx Tale”:
“Keep Your Head AJ”, “Don’t Lose It AJ”, “These People Will Hurt You AJ.”
————————————————————————–
there was a while there I was saying the same thing about joba
It’s nice to see a player finally getting a break after having a $750,000 signing bonus pulled after failing his physical when he was drafted. The Mets are going to try buying out RA Dickey’s one year of FA at $4 mil. The guy has worked hard to overcome a handicap.
They took out an off day in the ALCS, but the ALDS schedule should be the same as last yr. So the Yanks may be able to survive with only 3 starters. Hopefully Javy or AJ gets hot over the next few weeks and the yanks won’t have to tempt fate again.
If it’s only 3 starters needed then Hughes could go to the pen.
Esp if AJ heats up.
Javy could take Moseley’s place as long man, or Mitre’s.
He’s looked good there.
To be honest – most real statisticians and methodologists would be completely perplexed at the idea of utilizing data that has unclear validity, especially data of unclear validity whose flaws can’t be corrected because the sources of error are uncertain.
Regarding your other point I guess we just disagree. I see no point in championing one kind of data and overlooking its obvious flaws so that the other side doesn’t throw the proverbial baby out with the bath water. I just don’t understand that. That’s not what data is supposed to do. But in baseball it’s taken on that role.
This is a completely different way of thinking about the world. It’s not how statisticians who have expertise in the field think about the world.
The culture of sabermetric analysis in baseball is extremely different than in most other areas where quantitative methods are taken seriously. I don’t know what else to say.
The entire point of data is to have a clearer, more valid sense of what the truth is. If the data is of unclear validity due to methodology then it’s really of unclear import. Validity is everything.
Finally, I’d say that this whole “don’t throw the baby out with the bath water” argument is a real hindrance to advancements in the field because it limits internal critique. That’s very different than how scientific disciplines work, for example.
CB,
I have a very basic understanding of statistics…basically enough to understand journal articles and clinical research in my field.
The biggest problem I see with defensive stats is that much of the data is based off subjective information as opposed to the more objective offensive numbers…it’s either a hit or it’s not, there’s no opinion or gray area.
To me, that seems like a fundamental problem that’s nearly impossible to overcome.
I don’t think it limits internal critique at all. I believe it stimulates it. To my mind, saying measurement is impossible because we can’t accurately measure is the approach which stifles debate and critique.
Consider RC/27. Is that “valid”? It’s chock full of assumptions and plenty of “noise”. But to say it hasn’t improved our understanding of a hitter’s contribution, because of that, is a giant step backwards.
You can make a similar argument for “batting average”. It’s full of “noise”.
Defensive metrics are full of noise, possibly more full of noise than other measurements, it is hard to say. But that doesn’t mean we should throw them away.
“it’s either a hit or it’s not, there’s no opinion or gray area.”
blake-
Are you really sure of that? Have you noticed the way games are scored these days? A hit is often an enormously subjective conclusion. We just pretend it isn’t.
Wave,
Yea but it’s much more clear cut than what’s being done with defensive recordings. There are relatively few “judgement calls” with the offensive statistics….
“There are relatively few “judgement calls” with the offensive statistics….”
I suspect more than you think, especially when you add in umpires’ judgment calls as well. Most judgments about defensive plays are similarly not in question. The debate is over the amount and effect of those that are. We don’t know the amount, people are only beginning to speculate.
“The biggest problem I see with defensive stats is that much of the data is based off subjective information as opposed to the more objective offensive numbers…it’s either a hit or it’s not, there’s no opinion or gray area.”
That’s an issue but it’s really not the primary issue. People try to measure things that are amorphous all of the time. To do it in a valid fashion however you have to do it very systematically and very rigorously.
For example, industries across the world try to measure “customer satisfaction.” It’s analogous to defense in some ways in that it’s not immediately objective.
It’s clear why they want to measure customer satisfaction – it’s crucial to many businesses and billions of dollars are at stake. So you want to have as “objective” a sense of this as possible. But what the heck is “satisfaction?” How do you define it? How do you measure it? How do you scale it properly? How do you know the number you are recording validly reflects that customer’s true satisfaction?
It’s very easy to come up with a number. Very, very difficult to come up with data that actually and truly reflects that underlying issue of satisfaction. But people have done a lot of work on how to do this – mostly in psychology and sociology.
Rather than objective and subjective it’s about trying to take something that is relatively amorphous and systematize its meaning so that it can be measured in ways which produce data that is very explicit in what it is and is not capturing.
That’s just very difficult to do. And there’s not a lot of people who have real expertise doing this.
The whole issue of statistics gets overdone. Statisticians aren’t even the ones who typically drive this kind of work. They analyze data more than they develop measures. In baseball it’s very conflated between the two.
As I understand it…nearly every play recorded in some defensive metrics is subjective at least to some extent…as opposed to the few plays a year where a guy is given a hit or error on a play. There is some variation on error/hit calls but nearly everything else is pretty objective with offensive numbers.
I don’t think Burnett should pitch second in the playoffs. Pitchers should have to earn their spot in the rotation. I don’t think it’s fair to just give AJ the spot because of what he did last year. He has been the worst pitcher on the Yankees staff this year. He is 9-12 and has one of the best lineups in baseball behind him. Pettitte should pitch second. He has clearly earned it. I also don’t think that having two straight lefties is a bad thing, especially since the games will probably be at Yankee Stadium. I wouldn’t even pitch AJ at all, unless we were leading the series. I would go CC, Pettitte, Hughes, AJ (or CC on short rest depending on who is leading the series). You just can’t throw someone out there in the playoffs when there is a 50% chance that he won’t even give the team a chance to win. At least we know that when Hughes doesn’t have his best stuff, he can still grind it out and give the Yankees a chance to win the game. When AJ’s location is off, the Yankees might as well forfeit after the 1st inning.
CB,
Makes sense. You guys clearly know a lot more about this stuff than I do….it just seems to me that there are a lot of variables to try and account for in order to draw accurate conclusions….(who is recording the data, who was pitching, where the balls were hit that day etc…).
“You can make a similar argument for “batting average”. It’s full of “noise”.”
Noise isn’t in and of itself the issue. If performance is varying then the measure needs to reflect that variation. If it doesn’t it’s not valid.
Batting average has very high construct validity. That’s why the variation isn’t concerning.
Defense doesn’t have that kind of construct validity so it’s much more problematic.
As I said, there are textbooks on this subject. People devote their entire professional lives to this.
If people inside of baseball want to measure defense better they should get people who actually have expertise to do it correctly. From what I’ve been able to see about the proprietary methods used by BIS, etc. they don’t seem to have that expertise. I could be wrong but as they aren’t transparent it’s hard to know and the onus is ultimately on them (and believe me I’ve read everything I could find on this subject, including Tango’s stuff – the methods aren’t transparent and seem to lack basic work that needs to be done to ensure validity).
Perhaps teams are actually getting people with expertise to develop their own high quality defensive metrics that we can’t see because they are proprietary. But from what I’ve read at least the Mariners were making decisions supposedly in part on UZR. And I find that remarkable for a business to do so.
No, most defensive plays are easy. A two hopper to SS, a can of corn to LF. It’s the margins that are hard.
Granted, the margins of defensive plays are probably wider than the margins of offensive plays. At least, that’s what my common sense tells me, although I’ve never actually measured it to be sure.
But think how many times a batter is called out on an outside pitch or a low pitch – that’s a subjective judgment. Yet we draw conclusions based on the results of that all the time – it’s part of the most time-honored stat of all, “batting average”. Or a bang-bang play at first. Or a shot off of the third baseman’s glove being ruled a “hit”.
We just assume it all “evens out”. But does it? As far as I know, it hasn’t been measured.
“Batting average has very high construct validity. That’s why the variation isn’t concerning.”
Batting average measures what it measures, but people long over-valued it and over-interpreted it. We only discovered that because of the new stats that you are now attacking.
Defense is not the same as offense. It’s not and it’s not close. Offense in baseball lends itself very well to measurement because it’s broken down into very discrete events that are tied to clear outcomes. Discrete events with clear outcomes are just easier to measure in a more valid fashion. That’s why measurement in baseball is so much richer than in say basketball or even better yet ice hockey.
Defense just doesn’t break up into discrete events with clear outcomes like that as easily. There is always going to be more uncertainty in defensive measurements.
That’s partly why a player who produces a run offensively is more valuable statistically than a player who saves a run defensively. That defensive run saved will always have a larger error bar associated with it.
That’s part of why the Mariners were so awful this season. They didn’t seem to get that trade off in uncertainty they were making.
I agree with Vin Scully. Statistics are used the way a drunk uses a lamp post: for support, not illumination.
Any attempt to gain a clearer and more precise understanding of the important events in a game, in a season and a career is well worth the effort, IMO. And not just for fantasy baseball purposes.
A home run is visual and visceral. That’s the immediate impact for a casual fan. But to the players and managers and GMs it’s about how that HR happened. Is it an indication of the hitter’s prowess, or the pitcher’s failings? Is it a match-up issue, a righty on lefty instead of a righty on righty thing? Did the catcher call the wrong pitch? Was the wind blowing out or in? Was it early in the game when the pitcher’s arm was warmed up or late when he was tired? Was nobody on base and the pitcher said “here it is, hit it” or is this pitcher starting to give up big hits in high leverage situations? And I’m not even scratching the surface of analysis on the result of that one pitch. I’m on a tangent: Anyone who says baseball is a slow, boring game could learn a lot about the depth and layers of the game by checking into the statistical side. Defensive metrics might be flawed at present but I’m glad somebody is out there trying.
“Batting average measures what it measures, but people long over-valued it and over-interpreted it. We only discovered that because of the new stats that you are now attacking.”
Why do you use the word attack? That’s so curious.
I’m not attacking anything. I’m pointing out methodological shortcomings that concern me.
But somehow that’s an “attack.” That’s what I mean about this round the wagons mentality that quells internal critique and the opportunity to improve.
It’s just so strange. People in science and statistics don’t think this way at all. Not even close. They are ruthless about finding sources of error in methods.
And finally you keep mixing utility value (e.g. the interpretation of batting average) with its validity.
Measurement is about validity. Its interpretation is a separate issue.
But, the Red Sox were reported to have done much the same thing, and their team has proved surprisingly resilient to injury and setback.
Using one team’s experience to draw a general conclusion isn’t helpful.
The question is not, are defensive metrics helpful. The question is, how helpful are they?
Again, the existence of uncertainty doesn’t invalidate the attempt to measure. In your disillusionment with the things you used to believe, you are going too far in the opposite direction.
“It’s just so strange. People in science and statistics don’t think this way at all. Not even close. They are ruthless about finding sources of error in methods.”
And, you are over-interpreting my position. Nowhere have I said sources of error shouldn’t be ferreted out. As I understand your position, though, you are saying it is pointless to even attempt this because the model is so inherently flawed.
“Using one team’s experience to draw a general conclusion isn’t helpful.”
When did I do this?
I didn’t.
I used the Mariners as an example. That’s it.
You are reading things into what I’m saying like “attack” that I’m not saying.
And while your concern may be about how “helpful” a metric is I don’t understand that. My concern is about how valid they are. If they are of poor validity then they aren’t going to be particularly helpful (and I’m not even saying they are of poor validity because as far as I can tell that work hasn’t been done).
As of now they are to me of uncertain validity.
If you see that as an “attack” or taking an absolutist view then I’m really not sure what to say.
“if you don’t think you are attacking the notion of defensive metrics, you are fooling yourself.”
Then you have no idea how self-critical people who actually do this work are. If you think anything I’ve written is an attack you should attend a scientific conference.
“As I understand your position, though, you are saying it is pointless to even attempt this because the model is so inherently flawed.”
You don’t understand my position at all. I’ve never said it was “pointless.” Not even close.
Would you accept that a statistic with high construct validity could have a low “utility value?” That is, a stat may be “valid,” but not particularly helpful?
I know this is an obvious claim, but I think it gets to the root of the sabermetrical argument.
I think it was Ken Singleton who said what I’m going to paraphrase. When an outfielder makes an error, something’s already gone very wrong for the ball to be in the outfield. I suppose that might (humorously) support an argument that defensive events are a little tougher to measure cleanly.
While I am a proponent of many of the methods used to analyze offensive performance, defensive metrics are, a fact supported by almost all sabermetricians, still in the testing phase.
There can be no comparison between offensive measurements, which are the result of a large amount of data, and something like defensive metrics, where the sabermetricians not only have to create the tool to measure the data but also create the data itself.
The attempt at a quantitative measure of defense is very intriguing to me, but far from complete.
I have to check out, so you are going to get the last word. But, the fact that you can’t say exactly how valid the model is doesn’t mean the model is not helpful. It just means you can’t say how helpful.
But to say we don’t have a better understanding of defense now than we did before, just because we can’t say exactly how much better understanding we have, as I understand your position, is wrong.
And, if my understanding of your position is correct, then you are making an attack.
However, it is always a pleasure to debate with you.
“Would you accept that a statistic with high construct validity could have a low “utility value?” That is, a stat may be “valid,” but not particularly helpful?”
Absolutely. That happens all of the time. Because in general the more narrowly you define a construct the more valid it’s probably going to be. In turn though you reduce its scope.
A perfect example of this is FIP. As a statistic it has relatively high construct validity. It is clearly thought out and developed. It’s a very good statistic.
All that said the reason why it has those characteristics is because in its initial intent it was seeking to measure a very, very small aspect of the skill of pitching.
Over time however, this statistic has in many ways become less “helpful” because people routinely misinterpret it and routinely use it past what its initial scope was.
But I wouldn’t critique the validity of FIP because it’s now become in many ways misleading or “unhelpful.”
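For readers who have not seen it, the narrow scope described above is visible in the formula itself. What follows is a minimal sketch of the commonly published FIP formula, which uses only home runs, walks, hit batters, strikeouts and innings. The default constant of 3.10 is an assumption for illustration (the real constant is recalculated each season so that league FIP lines up with league ERA), and the stat line in the example is hypothetical.

```python
def fip(home_runs: int, walks: int, hit_by_pitch: int, strikeouts: int,
        innings: float, league_constant: float = 3.10) -> float:
    """Fielding Independent Pitching from the commonly published formula.

    `innings` must be true decimal innings (200 1/3 innings is 200.333...,
    not the box-score notation 200.1). `league_constant` is an assumed
    placeholder; the real value changes from season to season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    # Only the events a pitcher controls without his fielders are counted.
    weighted_events = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted_events / innings + league_constant


# Hypothetical season line, not any real pitcher's numbers.
example = fip(home_runs=20, walks=50, hit_by_pitch=5, strikeouts=180, innings=200.0)
print(f"FIP: {example:.3f}")
```

Nothing about balls in play, sequencing or actual runs allowed enters the calculation, which is why reading it as a full measure of pitching goes past what it was built to capture.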
Good for Javy!
Hopefully, we’ll see the new and improved Javy.
Repost:
Wave – You said a couple of threads ago: “For the fan, defensive metrics are very useful. ”
Could you tell me why? I am not being facetious here, I honestly would like to know why and how they are useful.
This isn’t too surprising. Part of it is due to the way Vazquez has pitched. But part of it is the way Mosley has been pitching as well.
Mosley has had a nice run but it seems like it’s coming to an end. For a guy with his stuff, he’s now walking a ton of hitters and is pitching on borrowed time. You get the sense that with a few starts teams are now ready for what he’s going to try to do.
Good move. Get Javy rolling for the playoffs.
I guess Nova sits when Andy gets back.
Unless AJ continues to bomb and Nova gets even better.
Too soon… I need to see more than 2 relief outings before I decide to put him back in the rotation… 2 long-relief appearances in blowouts is too small a sample for me…
CB – perhaps you could answer my question as well. Wave had mentioned that defensive metrics were useful for fans. I am not sure how or why I would think that. Perhaps you could give some examples from a fan point of view… or is it more for fantasy baseball?
Just make Vazquez think that it’s no outs in the 3rd inning and he’s coming in to relieve Moseley and tell him to give them 6 strong innings. Problem solved.
I’m going to this game so it better be the new Javy!
This could set up the death match between AJ and Javy for game 4 ALDS.
mick September 1st, 2010 at 4:12 pm
This could set up the death match between AJ and Javy for game 4 ALDS.
************
Doubt it. I am thinking it would be CC on short rest
Maybe these guys need some time off. Hughes could be next. After AJ, of course…
“I guess Nova sits when Andy gets back.
Unless AJ continues to bomb and Nova gets even better.”
Pettitte won’t be back until the middle of September and then Hughes might be pulled so I doubt Nova gets much sitting.
I think it is just as likely that Burnett goes to the pen when Pettitte comes back as it is that Nova does.
I thought Moseley would get another start but let’s face facts, we need to get Javy back on the right track. With Pettitte a question mark because of health, Hughes and AJ question marks because of inconsistency, the Yankees need to get Javy straightened out. All we need from him is about three starts in the postseason of about 5-6 innings and 3 to 4 runs against. Let’s hope he can at least provide that on the way to title #28.
Wow, Erica with the creative thinking, nice job!
If you pull Hughes in mid-September that could be an admission he is going to the pen for the ALDS.
Nova will get these starts until the end of the month. For now it’s for Pettitte. Towards the end it might be for Hughes, and a continued Burnett implosion could get him an additional spot.
He’ll keep pitching every 5 days.
He’ll have to be Fernando in ’81 to get a post-season start, however.
BryanHoch Girardi on moving Jeter (2-for-30) down in batting order: “I don’t really see it happening.”
I think it’s way too premature to accurately guess who is the next rotation domino to fall. A lot depends on Pettitte’s health and the way Nova, Vazquez and Burnett all pitch, to go along with how they want to handle keeping Hughes’ regular season innings number around 180.
There are so many scenarios that could play out between now and the next full turn through the rotation (what if Nova gets lit up Friday? what if Burnett and Vazquez both flop in their starts? etc. etc.) that I think we have to wait until next week to accurately guess who is next to go.
Another idea, if Pettitte gets back they could use all 6 guys currently in the rotation.
I’m hoping Burnett pulls it together, but even if he pitches well tonight (which is no gauge considering Oakland has no offense), how do you know 5 days from now he can be counted on?
If the bullpen stays tight, there might be no room in it for Phil.
Doesn’t look good for Marte or Aceves but we have other alternatives.
mick September 1st, 2010 at 4:14 pm
Wow, Erica with the creative thinking, nice job!
*************
Thanks.
But it really depends on how the off days fall. Usually isn’t it- Game 1, Game 2, Travel Day, Game 3, Off Day or Game 4.
Either way… I am thinking the Yankees would go with a short rotation at least in the first round and use CC in Game 4 if necessary
Bronx Born-
I’m not intentionally avoiding your question. But I fear it will start a tiresome debate.
What I suggest is to start here:
http://www.baseballprospectus......leid=11476
But be sure to read through the comments to the article as well, particularly those from “Tango Tiger”. The article is basically questioning the validity of defensive metrics from a statistical point of view; some of the comments raise the other side of the debate.
And then, everyone remains happy here.
Skipping Hughes a start doesn’t mean he’s in the bullpen. Plus, Nova might piggyback Pettitte’s first start or even Hughes’ starts. My point is Nova will continue to pitch as long as he’s effective.
Pettitte is throwing a pen right now. We should hear soon on how it went.
BronxBorn-
I’m really not sure what to say. For me they are enormously confusing to interpret. And that’s because they seem to be extremely sensitive to small changes in the way fieldable chances are categorized and require sample sizes that can’t be generated over one year’s time.
The problem is that if the metric requires more than one year’s data it’s simply not very useful or reliable. People want it both ways with defensive metrics. They want to say that they require 2-3 years of data to stabilize in terms of sample size but at the same time primarily use them to assess player performance for a specific year or changes in performance on a year-to-year basis. But there’s too much “noise” in the data to do that.
So I really don’t know. Designing valid metrics is a painstaking and laborious process that is very technical, very complicated and very difficult to do.
From what I’ve seen of defensive metrics they haven’t been designed with methods that would ensure they yield valid data or can be explicitly and transparently interpreted.
Defense is a big part of the game. Some in the “moneyball” generation of sabermetric analysis diminished its value because it couldn’t be quantified. Now that it can be quantified not enough attention is paid to the methods used to produce that data and whether or not they are valid.
There are literally textbooks written on the subject of how to design valid metrics. If one wants to hold baseball to a lower standard than that, that’s fine. But in turn you can’t know whether the numbers are valid which defeats the point of data. I personally don’t get that – I don’t see the value in data that one can’t be certain is valid or has error estimates that are very large because of sample size.
You can measure anything now. But that doesn’t mean it’s producing high quality, valid data. But this is being done in many industries – people want to measure things but don’t want to take the time and resources to do it properly.
And from what we’ve seen so far with defensive metrics, there is so much variation in the numbers it’s difficult to distinguish performance from “noise” to compare performance from one year to another.
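One way to put a rough number on the sample-size point above is the standard error of a simple proportion. The sketch below is only an illustration under loud assumptions: it treats every fieldable chance as an independent yes/no event of equal difficulty, and the 80% conversion rate and 400 chances per season are made-up figures, not taken from UZR or any real metric. Real defensive metrics also depend on how chances are categorized, which this does not model.

```python
import math


def proportion_standard_error(true_rate: float, chances: int) -> float:
    """Standard error of an observed success rate over independent trials."""
    if chances <= 0:
        raise ValueError("chances must be positive")
    if not 0.0 <= true_rate <= 1.0:
        raise ValueError("true_rate must be between 0 and 1")
    return math.sqrt(true_rate * (1.0 - true_rate) / chances)


# Hypothetical inputs, chosen only for illustration.
TRUE_RATE = 0.80           # assumed share of chances turned into outs
CHANCES_PER_SEASON = 400   # assumed number of fieldable chances in one year

for seasons in (1, 2, 3):
    se = proportion_standard_error(TRUE_RATE, CHANCES_PER_SEASON * seasons)
    print(f"{seasons} season(s): standard error = {se:.4f}")
```

Under these assumptions the one-season standard error is about two percentage points, so two fielders whose true rates differ by that much would be hard to tell apart from a single year of data. Tripling the sample only shrinks the error by a factor of the square root of three, which fits the complaint that year-to-year changes are hard to separate from noise.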
I am happy to see Javy get another chance to start.
I would expect Phil to be a starter in the post season.
Mo,Wood, Joba, Robo, Logan,Moseley, Gaudin, Mitre/Nova/Javy?
CC,AJ,Phil,Andy
ALDS?
“Im hoping Burnett pulls it together, but even if he pitches well tonight (which is no gauge considering oakland has no offense), how do you know 5 days from now he can be counted on?”
You don’t.
I understand Wave.. like I said, not being contentious. I am just trying to understand what to do with all those numbers and why I would want to.
No extra off day this year.
Thanks muchly CB. I appreciate your taking the time to answer.
kate,
I also expect Phil to be a starter in the post season. And really, we have enough long-guys in the pen as a contingency plan for Phil or any other starter who struggles. Things don’t always work out the way you plan or wish. Someone not named AJ might struggle, too.
CC,AJ,Phil,Andy
ALDS?
__
I’d say there is NO WAY Burnett starts Game 2 or any clinching game in the playoffs.
I’m going out on a limb here and say AJ throws a solid game tonight, something like 7 innings and 2 runs allowed
joeman, that is my prediction as well…he’s pitching for his spot, should focus.
Hughes has 25-30 innings left and 5 starts.
I don’t think they pull him from the rotation. He can go 6 IP in all of them, or go deep in some and then have a short last outing, or maybe he goes short in 1 due to ineffectiveness and still has innings left at the end of the year to use.
This works and all you have to do is push Hughes back a start after each day off and let everyone else go on rotation (or push them all back)
and for that matter, someone named AJ may pitch very well
the bullpen has been a real strength lately as some of the starters have had problems.
CC,AJ,Phil,Andy
===========
Not in that order.
Andy in 3rd, but Joe might put him 2nd.
But what if AJ gets hot, he could go 2nd again.
lol, kate.
That’s baseball. You just never know…or something like that.
# mick September 1st, 2010 at 4:29 pm
joeman, that is my prediction as well…he’s pitching for his spot, should focus.
—————————————————
don’t know if he’s doing that .. we all know as bad as he pitches in games he could turn that all around & get on a hot spell
you mean there’s no rhyme or reason for the way AJ pitches?
Someone not named AJ might struggle, too.
————————————————————————-
So true.
I think we’re guilty of just believing that CC throwing a gem is a given but nothing’s written in stone right?
I won’t trust Burnett until he throws 3 quality starts in succession and uses an occasional changeup to give hitters another thing to think about.
Any further outings like his recent ones put him in jeopardy for the ALDS. Girardi can get away with a 3-man rotation until the ALCS.
# mick September 1st, 2010 at 4:33 pm
you mean there’s no rhyme or reason for the way AJ pitches?
———————————————————-
seems that way
BronxBorn-
I think conservatively what you can do is to look at several years of data, combine it with scouting reports and see where there is agreement and where there are discrepancies. In turn you can utilize the data over those larger periods of time to help inform an evaluation of the player falling into general categories: bad / below average / average / good / very good. Something like that.
But at this time I think that’s most of what you can do with the defensive numbers.
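A minimal sketch of that conservative approach, under loud assumptions: the rating scale (runs above or below average), the category cutoffs, and the sample numbers are all invented for illustration and do not come from UZR or any other real metric.

```python
def pooled_rating(seasons):
    """Chance-weighted average of per-season ratings.

    `seasons` is a list of (rating, chances) pairs. The rating scale
    (runs above or below average) is a hypothetical stand-in.
    """
    total_chances = sum(chances for _, chances in seasons)
    weighted = sum(rating * chances for rating, chances in seasons)
    return weighted / total_chances

def category(rating):
    """Map a pooled rating to a coarse label. Cutoffs are invented."""
    if rating <= -10:
        return "bad"
    if rating < -3:
        return "below average"
    if rating <= 3:
        return "average"
    if rating < 10:
        return "good"
    return "very good"

def compare(seasons, scout_label):
    """Return the metric's label and whether it agrees with the scouting label."""
    label = category(pooled_rating(seasons))
    return label, label == scout_label

# Three made-up seasons that disagree with each other year to year.
three_years = [(12.0, 300), (-4.0, 250), (5.0, 350)]
label, agrees = compare(three_years, "good")
print(label, agrees)
```

The single-season numbers swing from clearly positive to negative, but only the pooled, coarse label is compared with the scouting report, which is about as much as the data is being asked to support here.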
mick September 1st, 2010 at 4:33 pm
you mean there’s no rhyme or reason for the way AJ pitches?
—
At this point AJ is like a box of chocolates… with most of the good ones missing.
Girardi can get away with a 3-man rotation until the ALCS.
=======================================
With CC on 3 days rest?
CB-
My problem with your thesis is that it would lead those who don’t spend much time thinking about defensive metrics to conclude that there is no point thinking about defensive metrics, or that what we have now represents no improvement over the days of “errors” or “total chances”.
That’s not the case. As Tom Tango says, improvement has been made. We just can’t validly say how much.
Anytime AJ gets in trouble on the mound I find myself reciting Lilo Brancato in the movie “A Bronx Tale”
“Keep Your Head AJ”, “Don’t Lose It AJ”, “These People Will Hurt You AJ.”
Yes. Something about randomness and noise.
Well, I’ve already carved “Cy Cy 2010!!” in Corinthian marble, so I hope you’re wrong.
At this point AJ is like a box of chocolates… with most of the good ones missing.
=============
Maybe he ate all the bad ones and mostly good ones are left?
We’ll see tonight.
7/17 at Oak…7 innings,5 hits and 2 runs….look for more of the same tonight
raymagnetic September 1st, 2010 at 4:37 pm
Anytime AJ gets in trouble on the mound I find myself reciting Lilo Brancato in the movie “A Bronx Tale”
“Keep Your Head AJ”, “Don’t Lose It AJ”, “These People Will Hurt You AJ.”
******************************
That’s great!
DeNiro vs Chaz…great movie.
# raymagnetic September 1st, 2010 at 4:37 pm
Anytime AJ gets in trouble on the mound I find myself reciting Lilo Brancato in the movie “A Bronx Tale”
“Keep Your Head AJ”, “Don’t Lose It AJ”, “These People Will Hurt You AJ.”
————————————————————————–
there was a while there I was saying the same thing about joba
Tom Tango?
pat, if you’re out there, your joke about signing Lin’s dad was very funny.
I didn’t want that to go unmentioned.
BrianCoz Arod hitting on field right now. First time since injury #yankees
It’s nice to see a player finally getting a break after having a $750,000 signing bonus pulled after failing his physical when he was drafted. The Mets are going to try buying out RA Dickey’s one year of FA at $4 mil. The guy has worked hard to overcome a handicap.
They took out an off day in the ALCS, but the ALDS schedule should be the same as last yr. So the Yanks may be able to survive with only 3 starters. Hopefully Javy or AJ gets hot over the next few weeks and the yanks won’t have to tempt fate again.
Thanks Nick.
I amuse myself. The rest of you are just along for the ride.
DeNiro and Pesci
BrianCoz Arod just hit one over the visiting bullpen into the bleachers. I’d say his swing looks fine
“I amuse myself. The rest of you are just along for the ride.”
pat, that’s brilliant.
If it’s only 3 starters needed then Hughes could go to the pen.
Esp if AJ heats up.
Javy could take Moseley’s place as long man or Mitre’s.
He’s looked good there.
Chaz Palminterri had a bigger role than Pesci, who is always good.
“We just can’t validly say how much.”
That’s a big problem.
To be honest – most real statisticians and methodologists would be completely perplexed at the idea of utilizing data that has unclear validity, especially data of unclear validity whose flaws can’t be corrected because the sources of error are uncertain.
Regarding your other point I guess we just disagree. I see no point in championing one kind of data and overlooking its obvious flaws so that the other side doesn’t throw the proverbial baby out with the bath water. I just don’t understand that. That’s not what data is supposed to do. But in baseball it’s taken on that role.
This is a completely different way of thinking about the world. It’s not how statisticians who have expertise in the field think about the world.
The culture of sabermetric analysis in baseball is extremely different than in most other areas where quantitative methods are taken seriously. I don’t know what else to say.
The entire point of data is to have a clearer, more valid sense of what the truth is. If the data is of unclear validity due to methodology then it’s really of unclear import. Validity is everything.
Finally, I’d say that this whole don’t throw the baby out with the bath water argument is a real hindrance to advancements in the field because it limits internal critique. That’s very different than how scientific disciplines work, for example.
“Chaz Palminterri had a bigger role than Pesci, who is always good.”
One would hope so, considering it’s about Palminterri’s life.
You guys know it’s autobiographical, right?
“C” IS Chaz.
Chaz is a big Yankee fan, calls in regularly to Mike.
CB,
I have a very basic understanding of statistics…basically enough to understand journal articles and clinical research in my field.
The biggest problem I see with defensive stats is that much of the data is based off subjective information as opposed to the more objective offensive numbers…it’s either a hit or it’s not, there’s no opinion or gray area.
To me, that seems like a fundamental problem that’s nearly impossible to overcome.
Legend has it that CC’s shadow once killed a dog.
CB-
I don’t think it limits internal critique at all. I believe it stimulates it. To my mind, saying measurement is impossible because we can’t accurately measure is the approach which stifles debate and critique.
Consider RC/27. Is that “valid”? It’s chock full of assumptions and plenty of “noise”. But to say it hasn’t improved our understanding of a hitter’s contribution, because of that, is a giant step backwards.
You can make a similar argument for “batting average”. It’s full of “noise”.
Defensive metrics are full of noise, possibly more full of noise than other measurements, it is hard to say. But that doesn’t mean we should throw them away.
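For a sense of how much “noise” sits in a batting average, here is a back-of-the-envelope sketch. It assumes every at-bat is an independent trial with the same hit probability, which is a simplification and not a claim about how real hitters work. The .300 average and 500 at-bats are just example inputs.

```python
import math

def batting_average_se(avg, at_bats):
    """Binomial standard error of a batting average over `at_bats` trials."""
    return math.sqrt(avg * (1 - avg) / at_bats)

avg, at_bats = 0.300, 500  # example inputs, not any real player's line
se = batting_average_se(avg, at_bats)

# Rough range of two standard errors on either side of the observed average.
low, high = avg - 2 * se, avg + 2 * se
print(f"standard error: {se:.4f}")
print(f"rough range:    {low:.3f} to {high:.3f}")
```

Under that assumption, a full season of at-bats still leaves a range several dozen points wide, so the argument is less about whether a stat has noise and more about whether anyone has measured how much.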
“it’s either a hit or it’s not, there’s no opinion or gray area.”
blake-
Are you really sure of that? Have you noticed the way games are scored these days? A hit is often an enormously subjective conclusion. We just pretend it isn’t.
Chaz was INDIGNANT at how the Yankees handled Joba.
Wave,
Yea but it’s much more clear cut than what’s being done with defensive recordings. There are relatively few “judgement calls” with the offensive statistics….
“There are relatively few “judgement calls” with the offensive statistics….”
I suspect more than you think, especially when you add in umpires’ judgment calls as well. Most judgments about defensive plays are similarly not in question. The debate is over the amount and effect of those that are. We don’t know the amount, people are only beginning to speculate.
“The biggest problem I see with defensive stats is that much of the data is based off subjective information as opposed to the more objective offensive numbers…it’s either a hit or it’s not, there’s no opinion or gray area.”
That’s an issue but it’s really not the primary issue. People try to measure things that are amorphous all of the time. To do it in a valid fashion, however, you have to do it very systematically and very rigorously.
For example, industries across the world try to measure “customer satisfaction.” It’s analogous to defense in some ways in that it’s not immediately objective.
It’s clear why they want to measure customer satisfaction – it’s crucial to many businesses and billions of dollars are at stake. So you want to have as “objective” a sense of this as possible. But what the heck is “satisfaction?” How do you define it? How do you measure it? How do you scale it properly? How do you know the number you are recording validly reflects that customer’s true satisfaction?
It’s very easy to come up with a number. Very, very difficult to come up with data that actually and truly reflects that underlying issue of satisfaction. But people have done a lot of work on how to do this – mostly in psychology and sociology.
Rather than objective and subjective it’s about trying to take something that is relatively amorphous and systematize its meaning so that it can be measured in ways which produce data that is very explicit in what it is and is not capturing.
That’s just very difficult to do. And there’s not a lot of people who have real expertise doing this.
The whole issue of statistics gets overdone. Statisticians aren’t even the ones who typically drive this kind of work. They analyze data more than they develop measures. In baseball it’s very conflated between the two.
As I understand it…nearly every play recorded in some defensive metrics is subjective at least to some extent…as opposed to the few plays a year where a guy is given a hit or error on a play. There is some variation on error/hit calls but nearly everything else is pretty objective with offensive numbers.
Note to Joe Girardi:
Please stop playing games with my head. That is all.
I don’t think Burnett should pitch second in the playoffs. Pitchers should have to earn their spot in the rotation. I don’t think it’s fair to just give AJ the spot because of what he did last year. He has been the worst pitcher on the Yankees staff this year. He is 9-12 and has one of the best lineups in baseball behind him. Pettitte should pitch second. He has clearly earned it. I also don’t think that having two straight lefties is a bad thing, especially since the games will probably be at Yankee Stadium. I wouldn’t even pitch AJ at all, unless we were leading the series. I would go CC, Pettitte, Hughes, AJ (or CC on short rest depending on who is leading the series). You just can’t throw someone out there in the playoffs when there is a 50% chance that he won’t even give the team a chance to win. At least we know that when Hughes doesn’t have his best stuff, he can still grind it out and give the Yankees a chance to win the game. When AJ’s location is off, the Yankees might as well forfeit after the 1st inning.
CB,
Makes sense. You guys clearly know a lot more about this stuff than I do….it just seems to me that there are a lot of variables to try and account for in order to draw accurate conclusions….(who is recording the data, who was pitching, where the balls were hit that day etc…).
“You can make a similar argument for “batting average”. It’s full of “noise”.”
Noise isn’t in and of itself the issue. If performance is varying then the measure needs to reflect that variation. If it doesn’t it’s not valid.
Batting average has very high construct validity. That’s why the variation isn’t concerning.
Defense doesn’t have that kind of construct validity so it’s much more problematic.
As I said. There are textbooks on this subject. People devote their entire professional lives to this.
If people inside of baseball want to measure defense better they should get people who actually have expertise to do it correctly. From what I’ve been able to see about the proprietary methods used by BIS, etc. they don’t seem to have that expertise. I could be wrong but as they aren’t transparent it’s hard to know and the onus is ultimately on them (and believe me I’ve read everything I could find on this subject, including Tango’s stuff – the methods aren’t transparent and seem to lack basic work that needs to be done to ensure validity).
Perhaps teams are actually getting people with expertise to develop their own high quality defensive metrics that we can’t see because they are proprietary. But from what I’ve read, at least the Mariners were making decisions supposedly based in part on UZR. And I find it remarkable for a business to do so.
blake-
No, most defensive plays are easy. A two hopper to SS, a can of corn to LF. It’s the margins that are hard.
Granted, the margins of defensive plays are probably wider than the margins of offensive plays. At least, that’s what my common sense tells me, although I’ve never actually measured it to be sure.
But think how many times a batter is called out on an outside pitch or a low pitch – that’s a subjective judgment. Yet we draw conclusions based on the results of that all the time – it’s part of the most time-honored stat of all, “batting average”. Or a bang-bang play at first. Or a shot off of the third baseman’s glove being ruled a “hit”.
We just assume it all “evens out”. But does it? As far as I know, it hasn’t been measured.
“Batting average has very high construct validity. That’s why the variation isn’t concerning.”
Batting average measures what it measures, but people long over-valued it and over-interpreted it. We only discovered that because of the new stats that you are now attacking.
Blake-
Defense is not the same as offense. It’s not and it’s not close. Offense in baseball lends itself very well to measurement because it’s broken down into very discrete events that are tied to clear outcomes. Discrete events with clear outcomes are just easier to measure in a more valid fashion. That’s why measurement in baseball is so much richer than in say basketball or even better yet ice hockey.
Defense just doesn’t break up into discrete events with clear outcomes like that as easily. There is always going to be more uncertainty in defensive measurements.
That’s partly why a player who produces a run offensively is more valuable statistically than a player who saves a run defensively. That defensive run saved will always have a larger error bar associated with it.
That’s part of why the Mariners were so awful this season. They didn’t seem to get that trade off in uncertainty they were making.
I agree with Vin Scully. Statistics are used the way a drunk uses a lamp post: for support, not illumination.
Any attempt to gain a clearer and more precise understanding of the important events in a game, in a season and a career is well worth the effort, IMO. And not just for fantasy baseball purposes.
A home run is visual and visceral. That’s the immediate impact for a casual fan. But to the players and managers and GMs it’s about how that HR happened. Is it an indication of the hitter’s prowess, or the pitcher’s failings? Is it a match-up issue, a righty on lefty instead of a righty on righty thing? Did the catcher call the wrong pitch? Was the wind blowing out or in? Was it early in the game when the pitcher’s arm was warmed up or late when he was tired? Was nobody on base and the pitcher said “here it is, hit it” or is this pitcher starting to give up big hits in high leverage situations? And I’m not even scratching the surface of analysis on the result of that one pitch. I’m on a tangent: Anyone who says baseball is a slow, boring game could learn a lot about the depth and layers of the game by checking into the statistical side. Defensive metrics might be flawed at present but I’m glad somebody is out there trying.
“Batting average measures what it measures, but people long over-valued it and over-interpreted it. We only discovered that because of the new stats that you are now attacking.”
Why do you use the word attack? That’s so curious.
I’m not attacking anything. I’m pointing out methodological shortcomings that concern me.
But somehow that’s an “attack.” That’s what I mean about this round the wagons mentality that quells internal critique and the opportunity to improve.
It’s just so strange. People in science and statistics don’t think this way at all. Not even close. They are ruthless about finding sources of error in methods.
And finally you keep mixing utility value (e.g. the interpretation of batting average) with its validity.
Measurement is about validity. Its interpretation is a separate issue.
But in baseball these two keep getting conflated.
CB-
But, the Red Sox were reported to have done much the same thing, and their team has proved surprisingly resilient to injury and setback.
Using one team’s experience to draw a general conclusion isn’t helpful.
The question is not, are defensive metrics helpful. The question is, how helpful are they?
Again, the existence of uncertainty doesn’t invalidate the attempt to measure. In your disillusionment with the things you used to believe, you are going too far in the opposite direction.
CB, if you don’t think you are attacking the notion of defensive metrics, you are fooling yourself.
“It’s just so strange. People in science and statistics don’t think this way at all. Not even close. They are ruthless about finding sources of error in methods.”
And, you are over-interpreting my position. Nowhere have I said sources of error shouldn’t be ferreted out. As I understand your position, though, you are saying it is pointless to even attempt this because the model is so inherently flawed.
“Using one team’s experience to draw a general conclusion isn’t helpful.”
When did I do this?
I didn’t.
I used the Mariners as an example. That’s it.
You are reading things into what I’m saying like “attack” that I’m not saying.
And while your concern may be about how “helpful” a metric is I don’t understand that. My concern is about how valid they are. If they are of poor validity then they aren’t going to be particularly helpful (and I’m not even saying they are of poor validity because as far as I can tell that work hasn’t been done).
As of now they are to me of uncertain validity.
If you see that as an “attack” or taking an absolutist view then I’m really not sure what to say.
“if you don’t think you are attacking the notion of defensive metrics, you are fooling yourself.”
Then you have no idea how self-critical people who actually do this work are. If you think anything I’ve written is an attack you should attend a scientific conference.
“As I understand your position, though, you are saying it is pointless to even attempt this because the model is so inherently flawed.”
You don’t understand my position at all. I’ve never said it was “pointless.” Not even close.
CB-
Would you accept that a statistic with high construct validity could have a low “utility value?” That is, a stat may be “valid,” but not particularly helpful?
I know this is an obvious claim, but I think it gets to the root of the sabermetrical argument.
I think it was Ken Singleton who said what I’m going to paraphrase. When an outfielder makes an error, something’s already gone very wrong for the ball to be in the outfield. I suppose that might (humorously) support an argument that defensive events are a little tougher to measure cleanly.
Re: Defensive Metrics
While I am a proponent of many of the methods to analyze offensive performance, defensive metrics are, as almost all sabermetricians acknowledge, still in the testing phase.
There can be no comparison between offensive measurements, the result of a large amount of data, and a metric like defense, where the sabermetricians not only have to create the tool to measure the data but also create the data itself.
The attempt at a quantitative measure of defense is very intriguing to me, but far from complete.
CB-
I have to check out, so you are going to get the last word. But, the fact that you can’t say exactly how valid the model is doesn’t mean the model is not helpful. It just means you can’t say how helpful.
But to say we don’t have a better understanding of defense now than we did before, just because we can’t say exactly how much better understanding we have, as I understand your position, is wrong.
And, if my understanding of your position is correct, then you are making an attack.
However, it is always a pleasure to debate with you.
“Would you accept that a statistic with high construct validity could have a low “utility value?” That is, a stat may be “valid,” but not particularly helpful?”
Absolutely. That happens all of the time. Because in general the more narrowly you define a construct the more valid it’s probably going to be. In turn though you reduce its scope.
A perfect example of this is FIP. As a statistic it has relatively high construct validity. It is clearly thought out and developed. It’s a very good statistic.
All that said, the reason why it has those characteristics is that in its initial intent it was seeking to measure a very, very small aspect of the skill of pitching.
Over time however, this statistic has in many ways become less “helpful” because people routinely misinterpret it and routinely use it past what its initial scope was.
But I wouldn’t critique the validity of FIP because it’s now become in many ways misleading or “unhelpful.”
“You don’t understand my position at all. I’ve never said it was “pointless.” Not even close.”
Ah, then we are two ships sailing past each other in a fog, because I don’t think you understand my position at all either. But I really need to go.