InShort: Occlusion Analysis for Explaining DNNs | by Eugen Lindwurm | Jan, 2021
There are many methods for explaining deep neural networks (DNNs), each with its own advantages and disadvantages. In most cases, we are interested in local explanation methods, i.e. explanations of the network’s output for a particular input, because DNNs tend to be too complex to be explained globally (independent of an input).
Generally speaking, all local explanation methods have one common goal: to faithfully (i.e. accurately) represent the function f to be explained (e.g. a DNN), at least locally around the input they are asked to explain.
Of course, such explanations also have to be human-understandable to be useful. The simplest way to achieve this is to attribute an importance score to each input dimension, a.k.a. to create an attribution map. Attribution methods assign responsibility for the model output to each dimension of a given input.
In this short article, I will present one fundamental attribution technique: occlusion analysis. The basic concept is as simple as they come: For every input dimension of an input x, we evaluate the model with that dimension missing, and observe how the output changes.
In particular, if ||f(x) - f(x_without_i)|| is large, then dimension i must have been important, because removing it changes the output a lot.
If your dimensions are independent, then occlusion analysis is perfectly faithful, as you are exactly measuring the marginal effect of each dimension.
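As a minimal sketch of this procedure, assume a scalar-valued model and a zero baseline as the "missing" value. The function name `occlusion_scores` and the toy linear model are hypothetical and only serve as illustration:

```python
import numpy as np

def occlusion_scores(f, x, baseline=0.0):
    """Return ||f(x) - f(x with dimension i set to baseline)|| for every i."""
    x = np.asarray(x, dtype=float)
    reference = f(x)
    scores = np.zeros(x.shape)
    for i in range(x.size):
        occluded = x.copy()
        occluded.flat[i] = baseline  # "remove" dimension i
        diff = np.atleast_1d(reference - f(occluded))
        scores.flat[i] = np.linalg.norm(diff)
    return scores

# Toy additive model (an assumption): each dimension contributes independently.
w = np.array([2.0, -1.0, 0.0])
f = lambda x: float(w @ x)

x = np.array([1.0, 3.0, 5.0])
scores = occlusion_scores(f, x)  # [2., 3., 0.], i.e. |w_i * x_i|
```

For this additive toy model, each score equals the exact contribution of its dimension, which is the sense in which the analysis is faithful when dimensions do not interact.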
Unfortunately, for most data, such as images, the dimensions are not independent. Here, you would be advised to remove whole patches instead of individual pixels. The idea is that the information of a single pixel can usually be reconstructed from its neighbors. So if you have an image of a cat, removing one cat-pixel will rarely have a large effect on the output, whereas removing the patch covering an ear might lead to a noticeable drop in the model’s prediction for ‘cat’.
Another nice thing about occlusion analysis is that it is a post-hoc method. This means that it can be used to explain any (already trained) model. No retraining necessary. The model can even be a non-differentiable black-box. As long as you can feed in inputs and receive outputs, you can use occlusion analysis.
Another advantage occlusion analysis has over gradient-based explanation methods is that it can even deal with functions that are locally flat, with no or only very small gradient.
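To illustrate the point about flat functions, here is a small sketch with a hypothetical black-box `step_model` whose output is piecewise constant. A finite-difference gradient (used here as a stand-in for a gradient-based attribution) is zero everywhere, while occlusion with an assumed baseline of 0 still identifies the decisive dimension:

```python
def step_model(x):
    # Black-box classifier: outputs 1.0 once the summed evidence exceeds 1.0.
    return 1.0 if sum(x) > 1.0 else 0.0

def finite_difference_gradient(f, x, eps=1e-3):
    # Numerical stand-in for a gradient-based attribution.
    grads = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += eps
        grads.append((f(shifted) - f(x)) / eps)
    return grads

def occlusion(f, x, baseline=0.0):
    # Absolute output change when dimension i is replaced by the baseline.
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline
        scores.append(abs(f(x) - f(occluded)))
    return scores

x = [2.0, 0.1]
gradient_attr = finite_difference_gradient(step_model, x)  # [0.0, 0.0]
occlusion_attr = occlusion(step_model, x)                  # [1.0, 0.0]
```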
But what does it actually mean to remove a dimension? Our model always takes inputs of the same size, after all. Removing a dimension means setting it to a value that has “0 information”. It depends on the dataset what this value is. For image data, we usually use the average RGB value. For other data types, often setting the dimension to 0 works. We will see additional considerations later.
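A rough sketch of the average-RGB baseline for images might look as follows. The helper names `mean_color_baseline` and `occlude_patch`, the channels-last layout, and the random stand-in dataset are all assumptions made for the example:

```python
import numpy as np

def mean_color_baseline(images):
    """Per-channel mean over a batch of images with shape (N, H, W, C)."""
    return images.mean(axis=(0, 1, 2))

def occlude_patch(image, top, left, size, baseline):
    """Return a copy of `image` (H, W, C) with a size x size patch set to `baseline`."""
    occluded = image.copy()
    occluded[top:top + size, left:left + size, :] = baseline
    return occluded

rng = np.random.default_rng(0)
dataset = rng.random((16, 32, 32, 3))    # stand-in for a real image dataset
baseline = mean_color_baseline(dataset)  # one value per color channel
occluded = occlude_patch(dataset[0], top=8, left=8, size=8, baseline=baseline)
```

Whether the mean is taken over the whole dataset (as here) or over the single image being explained is a design choice; both are plausible candidates for a "0 information" value.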
As you have probably guessed, occlusion analysis comes with one big caveat: we have to evaluate the model for each of these perturbed inputs. If your input has many dimensions, e.g. an image of 256×256 pixels, then you have to run the model 256×256=65,536 (!) times to get the complete analysis. In most scenarios, this is prohibitively expensive, especially if you want to run the analysis on your whole dataset.
One way to mitigate the computational cost is to take multiple features and remove them together (e.g. 8×8 squares in your picture). This only makes sense for data types where some dimensions are so strongly interdependent that they semantically belong together.
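The saving can be made concrete with a small sketch of patch-wise occlusion on a 256×256 input. The function `patch_occlusion_map`, the non-overlapping patch grid, the zero baseline, and the toy model are assumptions chosen for illustration, not a reference implementation:

```python
import numpy as np

def patch_occlusion_map(f, image, patch=8, baseline=0.0):
    """Occlude non-overlapping patch x patch squares.

    Returns the attribution map and the number of model evaluations.
    """
    h, w = image.shape[:2]
    reference = f(image)
    heatmap = np.zeros((h, w))
    calls = 1  # the unperturbed reference evaluation
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = baseline
            score = abs(reference - f(occluded))
            heatmap[top:top + patch, left:left + patch] = score
            calls += 1
    return heatmap, calls

# Hypothetical "model": mean brightness of the top-left 16x16 corner.
toy_model = lambda img: float(img[:16, :16].mean())

image = np.ones((256, 256))
heatmap, calls = patch_occlusion_map(toy_model, image, patch=8)
# calls == 1025: one reference run plus 32 * 32 = 1024 occluded runs
```

With 8×8 patches, the 65,536 per-pixel evaluations shrink to 1,024 occluded evaluations, at the price of a coarser attribution map.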
There is another problem with occlusion analysis that is not talked about much: distribution shift (cf. Hooker et al.). If we think about it closely, the change in output that we observe in the analysis can have another reason besides information being removed: the perturbed input is no longer in the data distribution we trained the model on.
In machine learning, we generally assume that the model will be evaluated on data coming from the same distribution as the training samples. If that is not the case (e.g. because we removed pixels), then the model output can be arbitrarily wrong. While the effect of removing single pixels is usually negligible, removing whole patches is a bigger step away from the training data manifold and can thus have a bigger impact on the output.
This entanglement of the two reasons for our output to change is a fact we have to live with, but there are ways to mitigate the problem. The basic idea is to remove information while still staying close to the data manifold. This means using a more sophisticated information removal technique that still leaves the image looking like a natural image.
One approach is to just blur out the patch to be “removed”. It is not the most effective method, but it should at least remove fine-grained texture information and it is easy to implement.
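A minimal sketch of blur-based removal, assuming a 2-D grayscale image and a simple k×k mean filter with odd k (the helper names `box_blur` and `blur_patch` are hypothetical; a Gaussian filter from an image library would work just as well):

```python
import numpy as np

def box_blur(image, k=5):
    """Blur a 2-D image with a k x k mean filter (k odd, edges replicated)."""
    pad = k // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (k * k)

def blur_patch(image, top, left, size, k=5):
    """Return a copy of `image` where only the given patch is replaced by its blur."""
    blurred = box_blur(image, k)
    occluded = image.astype(float).copy()
    occluded[top:top + size, left:left + size] = blurred[top:top + size, left:left + size]
    return occluded

# Checkerboard texture: high-frequency detail that blurring should suppress.
yy, xx = np.indices((32, 32))
image = ((yy + xx) % 2).astype(float)
occluded = blur_patch(image, top=8, left=8, size=16)
```

In this toy case the blurred patch is almost uniform gray, so the texture is gone while the average brightness of the region is preserved.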
A better approach is to use an inpainting algorithm: just use another model to guess (i.e. inpaint) the content of the part that is missing. No information is really added because the inpainting only relies on the remaining pixels of the image, but the result still looks close to a normal image and hence is closer to the training data. You can use a sophisticated algorithm like the one by Yu et al., or rely on easily accessible libraries, like OpenCV.
The caveats of using an inpainting algorithm are 1) it makes the procedure computationally even more expensive, 2) you have to get it to run, and 3) if you don’t work with a standard benchmark dataset, you probably have to retrain it.
Due to its computational cost, occlusion analysis is not a tool for every occasion, but it has its uses. Especially if your data are small, or you just want something that is easy to implement and reliable (just be careful with the patch size), occlusion analysis can shine. A closely related and more sophisticated approach is Shapley values. Unfortunately, they are even costlier to compute. If you are working with a differentiable model, perhaps the next-best simple approaches are gradient-based explanation methods.
I hope you learned something useful!