XAI has many flavours (including interpretability as well as explainability). Au fond, the idea is to shine a light into the black box: not just to say why an input produced an output, but potentially to show the workings and, in the process, quantify the uncertainty in the output (confidence). In using an AI that produces these explanations, the user can, in effect, gradually construct a model of what the AI is doing (and why, given the user knows the inputs too). Hence, in a sense, this is like debugging the AI, or indeed modelling the AI, i.e. reproducing the AI's model. In the end, the user will have reverse engineered the AI. This is an indirect, and possibly time-consuming, way of reproducing the model, effectively if not actually. Ironically, in some cases we may end up with a more accurate model, or a cheaper one, or both.
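To make the indirect route concrete, here is a minimal sketch (assuming scikit-learn, with a random forest standing in for the black box and a small decision tree as the learned "explanation"); the point is only that learning from the black box's outputs amounts to reproducing it:

```python
# Indirect route: query a black box and fit a surrogate to its answers.
# The names (black_box, surrogate) and the choice of models are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# The opaque model we only get to query (stand-in for the real black box).
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Reverse engineering: learn from the black box's *outputs*, not the true labels.
y_bb = black_box.predict(X)
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y_bb)

# Fidelity: how closely the surrogate reproduces the black box's behaviour.
print("fidelity:", accuracy_score(y_bb, surrogate.predict(X)))
print(export_text(surrogate))  # the "explanation" is itself a (simpler) model
```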
Of course, you may object that the model we learn is not actually the same as the thing inside the black box - the analogy of boxes and lights is, of course, nonsense. If we knew the actual machine learning model (linear regression, random forest, convolutional neural net, Bayesian inferencer, etc.) and the actual weights (model parameters and so on), then it wouldn't be a black box, and we'd be able to simply copy it. Various techniques can be used, even for quite complex machines, to relate the model parameters (e.g. CNN weights and clustering) to the features the model is able to detect or predict. This is the direct approach. In this approach we are also able, potentially, to simplify the actual model, removing components that serve no useful purpose ("junk DNA"?).
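A minimal sketch of the direct route, in the simplest possible setting (a sparse linear model rather than a CNN, again assuming scikit-learn): when the parameters are visible we can read off which features the model relies on and prune the rest:

```python
# Direct route: inspect parameters, relate them to features, drop the "junk DNA".
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 20 features, only 5 of which actually drive the target.
X, y = make_regression(n_samples=500, n_features=20, n_informative=5,
                       noise=1.0, random_state=0)

model = Lasso(alpha=0.5).fit(X, y)

# Relate parameters to features: non-zero coefficients are the ones the model uses.
useful = np.flatnonzero(model.coef_)
print("features the model actually relies on:", useful)

# Simplify: refit on just those features; the components that served no purpose are gone.
pruned = Lasso(alpha=0.5).fit(X[:, useful], y)
print("kept", len(useful), "of", X.shape[1], "features")
```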
Either way, any sufficiently advanced and thorough explanation of an AI is going to be a copy.
I wonder if the world of LLMs is resistant to XAI techniques partly (honestly) because very large models would be very expensive to re-model in these ways, but also partly because some of the proponents of GenAI technologies like to retain the mystery -- "it's magic", or perhaps less cynically "it's commercial in confidence".
However, if we want to depend on an AI technology for (say) safety-critical activities, I think it had better be fully explainable. And that means it will be transparent, actually open, and reversible (in the reverse-engineering sense).