Well, that sort of algorithmic approach sounds good on paper, but it's not something I'm a fan of. The reason is context. You'll notice that when someone claims that unit x is overpowered, I want to know why they think so. Often it might be because the opposing ship mix is just really poor against that unit, or the player is using some clever tactic like the old munitions boosting snaking or kiting, or whatever else. With just raw percentages, I'd have no idea what to fix.
I already run something like what you are describing in the form of the strong/weak data simulation, and it strips out as much context as possible. So when I want a contextless raw set of performance data, that's what I use. But that only goes so far, which is why I rely on intuition and user reports of specific exploitative behaviors, often waiting for multiple confirmations. The system works VERY well, I think, but we are constantly adding new content and changing various mechanics, which means we are also constantly rebalancing. That's just a fact of an ever-growing game; it doesn't mean the system isn't working.