[Warning, Wall of Text ™]
First I apologise for the ‘e-peen’ comment. I blame tiredness and bad mood after losing. ;-)
On the topic - if you want to measure player skill realistically, you would ideally find metrics for all relevant attributes. For some (K/D/A, for example) this is easy. For others - let’s say leadership - it is probably not possible. If you had all the metrics, you would feed them into a regression model fitted on the existing data from played games, and end up with a good formula for player skill. The important part is that the model gets better the more actually relevant variables it includes; simplifying it makes it worse. It is also important to keep in mind the interdependence of the different variables.
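To make the regression idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the twelve games are made up, there are only two metrics (`kda` and `shotcalling`, both hypothetical and coded +1 for above average, -1 for below), and plain logistic regression stands in for whatever model a real analysis would pick.

```python
import math

# Made-up toy data, NOT real game statistics: (kda, shotcalling, won).
# Both metrics are hypothetical and coded +1 (above average) / -1 (below).
GAMES = (
    [(1, 1, 1)] * 3 + [(1, 1, 0)] * 1        # good at both: 3 wins, 1 loss
    + [(1, -1, 1)] * 1 + [(1, -1, 0)] * 1    # kills only: 1 win, 1 loss
    + [(-1, 1, 1)] * 1 + [(-1, 1, 0)] * 1    # shotcalling only: 1 win, 1 loss
    + [(-1, -1, 1)] * 1 + [(-1, -1, 0)] * 3  # neither: 1 win, 3 losses
)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, lr=0.5, epochs=2000):
    """Fit P(win) = sigmoid(bias + weights . x) by gradient ascent on the log-likelihood."""
    n_features = len(rows[0]) - 1
    bias, weights = 0.0, [0.0] * n_features
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * n_features
        for *x, won in rows:
            # Prediction error for this game under the current weights.
            err = won - sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        bias += lr * grad_b / len(rows)
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return bias, weights


bias, (w_kda, w_shotcalling) = fit_logistic(GAMES)
print(f"bias={bias:.3f} kda={w_kda:.3f} shotcalling={w_shotcalling:.3f}")
```

With these invented numbers both metrics end up with the same weight (about 0.55), simply because the toy data was built symmetrically. Real data would need far more games and far more variables.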
Of course, that does not sound feasible. But even a simple model with a handful of variables could already have enough descriptive power to be useful. However, I highly doubt that a single metric (K/D/A) is descriptive enough on its own. Hence the examples mentioned above. Another example - during laning, blue team is slightly worse overall than purple. Come ganking phase, I start to talk to my team, organise ganks etc. Since blue player A plays Soraka, he gets assists and some deaths. However, the blue Annie (who lost her lane badly, not dying but ending up 50 cs behind her opponent) racks up the kills thanks to repeated 5 vs. 3, 2, 1 fights. Confident in her abilities, she wanders off a few times, and her absence leads to tower losses. Because Soraka is convincing, Annie stops and blue team wins. If we only include Win/Loss and K/D/A, Annie as a bursty damage dealer most likely gets a higher bonus than Soraka - how would that make sense? (For other examples see my previous post).
My point - including variables beyond W/L only makes sense if we also go beyond K/D/A, which incidentally makes the whole model less than transparent from a player’s viewpoint.
Regarding a pure Win/Loss model, as implemented atm - winning is by definition derived from all possible metrics, since it is the other side of the model equation, so to speak, and the result of all other factors. Therefore, a model that includes W/L and adds only one more stat basically only makes sense if that one stat is by far the most important metric determining W/L, because adding it inflates its influence beyond that of other, possibly at least as important factors. As I see it, atm the only reason to elevate K/D/A like this without further supporting data is emotion, not fact. Getting a lot of kills makes you feel like you are the sole reason for winning, so its importance tends to be highly overrated. In reality, however, the really clever and capable leader might be more important (this being a team-based strategic game, not 1v1 Quake). Yes, K/D/A certainly is important, but not to the extent that it overshadows everything else. This emotional inflation of value is the reason I’d actually prefer it not to be displayed, since it detracts from team-oriented play.
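One way this inflation can show up, sketched on invented data (twelve made-up games, two hypothetical yes/no metrics, with high-K/D/A players also tending to be the better shotcallers): the apparent effect of K/D/A on win rate shrinks once the other factor is held fixed.

```python
# Made-up toy data, NOT real game statistics: (high_kda, good_shotcaller, won).
GAMES = (
    [(True, True, True)] * 3 + [(True, True, False)] * 1
    + [(True, False, True)] * 1 + [(True, False, False)] * 1
    + [(False, True, True)] * 1 + [(False, True, False)] * 1
    + [(False, False, True)] * 1 + [(False, False, False)] * 3
)


def win_rate(games):
    return sum(won for _, _, won in games) / len(games)


def kda_gap(games):
    """Win-rate difference between high-K/D/A and low-K/D/A players."""
    high = [g for g in games if g[0]]
    low = [g for g in games if not g[0]]
    return win_rate(high) - win_rate(low)


# Looking at K/D/A alone, the other factor is silently mixed in.
raw_gap = kda_gap(GAMES)

# Holding shotcalling fixed: compare only players at the same level.
gap_good_callers = kda_gap([g for g in GAMES if g[1]])
gap_poor_callers = kda_gap([g for g in GAMES if not g[1]])

print(f"K/D/A alone: {raw_gap:.2f}")
print(f"same shotcalling level: {gap_good_callers:.2f} / {gap_poor_callers:.2f}")
```

In this toy set, K/D/A on its own appears to be worth 0.33 in win rate, but only 0.25 once players are compared at the same shotcalling level. The difference is credit that K/D/A picks up for a factor that was left out.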
Regarding the Elo reality - if pressed, I would admit that I’d love to get my hands on a nice database of statistics for recent games (a few hundred thousand or so should do) and have a go at it. Or even better, someone with the time and modelling experience I lack does it and derives a good equation. Since this is not going to happen, W/L Elo seems the best alternative to me: the game result is by default the single most important metric, because it explains itself completely, which is not true for any other metric we can think of. And it can actually be measured very easily.
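For reference, a W/L-only rating needs nothing but the game result. The sketch below uses the textbook Elo update; the K-factor of 32, the 400-point scale and the idea of rating a player against the average rating of the enemy team are assumptions for illustration, not a description of how the game's matchmaking is actually implemented.

```python
def expected_score(rating, opponent_rating):
    """Textbook Elo win expectancy on the usual 400-point logistic scale."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def elo_update(rating, opponent_rating, won, k=32):
    """New rating after one game; only the result (won or lost) is used."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent_rating))


# Hypothetical example: a 1500 player against an enemy team averaging 1500.
print(elo_update(1500, 1500, won=True))   # 1516.0
print(elo_update(1500, 1500, won=False))  # 1484.0
```

Nothing about kills, deaths or assists enters the update. Whatever a player did to influence the result is only counted through the result itself.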
Regarding the “more to do with having fewer “average” games have extreme outliers who go 1/12/3 offset by players going 12/1/12” - sorry, that already rests on the assumption that K/D/A is the single metric determining a player’s impact on the game outcome, which I dispute, and I have provided examples that contradict this assumption. (It rests on this assumption simply because in any other case you skew your outcome through the inflated importance of K/D/A and the omission of other factors of similar importance, as stated above. By definition you would be moving away from “averaging”, not toward it).
Edited to add: Of course the importance of K/D/A can be adjusted with a constant factor, but that factor is either completely arbitrary or based on data. If it is based on data, the model should in any case involve additional variables, since again, modelling based only on K/D/A inflates its importance in players’ minds even more and skews your results.
TL;DR if you want to add K/D/A to Elo, at least do it right - add multiple metrics and do some mathematical modelling. Otherwise don’t bother, please. Singling out K/D/A seems to me to be based primarily on emotion, and would only serve to raise its emotion-based impact even more. Finally, that is likely to have a detrimental effect on teamplay and might even hurt the appropriateness of the Elo rating as a measure of overall player ability.