I looked into SHAP.
Focusing on paper 1 in the bibliography, I explain what SHAP is, also drawing on references 6, 7, and 8. Since paper 1 keeps its definitions rather abstract, the concrete examples in reference 7 make it easier to follow.
What is it? The paper proposes SHAP (SHapley Additive exPlanations) values, a unified framework for interpreting model predictions.
What makes it better than previous research? There are many methods for defining feature importance, but the relationships among them were not well understood. The approach proposed in this paper yields a unique solution backed by theory, and the new class unifies six existing methods.
What is the key to the technique and methods? (1) A new class of additive feature attribution methods is identified. (2) This class is shown to have a unique solution with several desirable properties.
Let the model we want to explain be $f(\boldsymbol{x})$.
y = f(\boldsymbol{x})
Let the explanation model be $g(\boldsymbol{x}')$. The methods below do not compute the importance of features over the entire dataset; instead, for each input $\boldsymbol{x}$ they compute the importance of the features for the prediction $f(\boldsymbol{x})$ on that input. Such methods are called local methods; LIME, for example, is a local method.
Explanation models often use a mapping function that connects the original model's input $\boldsymbol{x}$ to a simplified input $\boldsymbol{x}'$ for the explanation model.
\boldsymbol{x} = h_{\boldsymbol{x}}(\boldsymbol{x}')
Because the mapping function carries the subscript $\boldsymbol{x}$, it is a function that depends on the input value. The simplification converts the original model's input $\boldsymbol{x}$ into a form $\boldsymbol{x}'$ from which we can interpret which parts of $\boldsymbol{x}$ contribute to the prediction. For example, if the input is a Bag of Words (BoW) vector and you want to know which words contribute to the prediction, all you need is the presence or absence of each word in $\boldsymbol{x}$, so the simplification is the operation of converting every non-zero count to 1. The simplified input $\boldsymbol{x}'$ can also be mapped back to the original input by $h_{\boldsymbol{x}}$, because $h_{\boldsymbol{x}}$ holds the information of the local input $\boldsymbol{x}$.
\mathrm{Simplification}: [1, 2, 1, 1, 2, 1, 1, 0, 0, 0] \rightarrow [1, 1, 1, 1, 1, 1, 1, 0, 0, 0] \\
h_{\boldsymbol{x}}: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0] \rightarrow [1, 2, 1, 1, 2, 1, 1, 0, 0, 0] = \boldsymbol{x} \odot \boldsymbol{x'}
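As a concrete sketch of the BoW example (the vectors follow the example above; the code itself is only an illustration, not from the paper):

```python
import numpy as np

# Original BoW input: word counts for a 10-word vocabulary.
x = np.array([1, 2, 1, 1, 2, 1, 1, 0, 0, 0])

# Simplification: keep only the presence/absence of each word (binary variables).
x_simplified = (x != 0).astype(int)   # -> [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

def h_x(z_prime, x=x):
    """Mapping function h_x: send a simplified (binary) input z' back to the
    original input space, using the locally stored input x.
    Words switched on in z' keep their original counts; the rest become 0."""
    return x * z_prime   # element-wise product x ⊙ z'

# Mapping the full simplified vector back recovers the original input.
print(h_x(x_simplified))                               # [1 2 1 1 2 1 1 0 0 0]
# Switching a word off shows what the original model would see without it.
print(h_x(np.array([0, 1, 1, 1, 1, 1, 1, 0, 0, 0])))   # [0 2 1 1 2 1 1 0 0 0]
```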
If the input is an image and you want to know which parts of the image contribute to the prediction, group adjacent pixels into patches and express the presence or absence of each patch as 1/0. When mapping back to the original input space with $h_{\boldsymbol{x}}$, the patches that are present (the 1 entries) are restored to the original pixels, while the parts outside those patches (the 0 entries) are filled with an appropriate background colour, which is discussed later. The blog in Reference 8 is very helpful on this point.
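A rough sketch of the same idea for images, assuming square patches on a regular grid and a fixed background value (both are illustrative choices; what "missing" should mean is discussed in the SHAP section below):

```python
import numpy as np

def h_x_image(z_prime, image, patch=8, background=128):
    """Map a binary patch mask z' back to image space: patches with z'_i = 1
    keep the original pixels, patches with z'_i = 0 are filled with a fixed
    background value."""
    h, w = image.shape[:2]
    out = np.full_like(image, background)
    n_cols = w // patch
    for idx, on in enumerate(z_prime):      # z' is ordered row-major over the patch grid
        if on:
            r, c = divmod(idx, n_cols)
            out[r*patch:(r+1)*patch, c*patch:(c+1)*patch] = \
                image[r*patch:(r+1)*patch, c*patch:(c+1)*patch]
    return out

# Toy usage: keep only the first three patches of a random 32x32 image.
img = np.random.default_rng(0).integers(0, 256, size=(32, 32), dtype=np.uint8)
mask = np.zeros((32 // 8) * (32 // 8), dtype=int)
mask[:3] = 1
masked_img = h_x_image(mask, img)   # this is what the original model f would see
```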
Local methods determine the weights of $g$ so that $g(\boldsymbol{z}') \simeq f(h_{\boldsymbol{x}}(\boldsymbol{z}'))$ whenever $\boldsymbol{z}' \simeq \boldsymbol{x}'$.
Paper 1 builds the explanation model using what it calls additive feature attribution methods. The simplification is restricted to a mapping into binary variables, so the explanation model is a linear function of binary variables.
g(z') = \phi_0 + \sum_{i=1}^M\phi_i z_i'
Here $\boldsymbol{z}' \in \{0,1\}^M$ is the simplified feature vector and $\phi_i \in \mathbb{R}$ is a coefficient representing the effect of feature $i$. In this model the effect of each feature is represented by $\phi_i$, and the effects of all features are added together to approximate the output of the original model.
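In code, the explanation model is nothing more than a linear function of the binary vector (a minimal sketch; `phi_0` and `phi` stand for whatever attribution values we end up with):

```python
import numpy as np

def g(z_prime, phi_0, phi):
    """Explanation model: g(z') = phi_0 + sum_i phi_i * z'_i."""
    return phi_0 + np.dot(phi, z_prime)

# With all features switched on, local accuracy (below) demands g(x') = f(x).
print(g(np.array([1, 1, 0]), phi_0=0.5, phi=np.array([0.2, -0.1, 0.3])))  # -> 0.6
```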
The coefficients $\phi_i$ can be determined uniquely by imposing the following three conditions: local accuracy, missingness, and consistency. The first, local accuracy, requires the explanation model to reproduce the original model's output at the input being explained:
\begin{aligned}
f(\boldsymbol{x}) = g(\boldsymbol{x}') &= \phi_0 + \sum_{i=1}^M\phi_i x_i' \\
\boldsymbol{x} &= h_\boldsymbol{x}(\boldsymbol{x}') \\
\end{aligned}
The second condition, missingness, requires that a feature absent from the simplified input receives no attribution:
x_i' = 0 \Rightarrow \phi_i = 0
The third condition is consistency. Abbreviate $f_\boldsymbol{x}(\boldsymbol{z}') = f(h_\boldsymbol{x}(\boldsymbol{z}'))$, and write $\boldsymbol{z}'\backslash i$ for $\boldsymbol{z}'$ with $z_i' = 0$. If, for any two models $f$ and $f'$,
f'_\boldsymbol{x}(\boldsymbol{z}') - f'_\boldsymbol{x}(\boldsymbol{z}'\backslash i) \geq f_\boldsymbol{x}(\boldsymbol{z}') - f_\boldsymbol{x}(\boldsymbol{z}'\backslash i) \ \ \mathrm{for \ all \ inputs} \ \boldsymbol{z}' \in \{0,1\}^M
holds, then $\phi_i(f', \boldsymbol{x}) \geq \phi_i(f, \boldsymbol{x})$ must also hold. In other words, if the model is changed so that the marginal contribution of feature $i$ increases or stays the same (regardless of the other features), then $\phi_i$ must also increase or stay the same.
It is known that imposing these three conditions determines the explanation model uniquely:
\phi_i(f, \boldsymbol{x}) = \sum_{\boldsymbol{z}'\subseteq \boldsymbol{x}'}\frac{|\boldsymbol{z}'|!(M-|\boldsymbol{z}'|-1)!}{M!}[f_\boldsymbol{x}(\boldsymbol{z}')-f_\boldsymbol{x}(\boldsymbol{z}'\backslash i)]
Here $|\boldsymbol{z}'|$ is the number of non-zero entries in $\boldsymbol{z}'$, and $\boldsymbol{z}' \subseteq \boldsymbol{x}'$ runs over all vectors whose non-zero entries are a subset of the non-zero entries of $\boldsymbol{x}'$. This is exactly the Shapley value from cooperative game theory:
\phi_i = \sum_{\mathcal{S}\subseteq \mathcal{N}\backslash \{i\}}\frac{|\mathcal{S}|!\,(|\mathcal{N}|-|\mathcal{S}|-1)!}{|\mathcal{N}|!}\left( v\left(\mathcal{S}\cup\{i\}\right) - v\left(\mathcal{S}\right) \right)
Here $\mathcal{N}$ is the set of all players, $\mathcal{S}$ is a coalition that does not contain player $i$, and $v$ is the value function that assigns a payoff to each coalition; $\phi_i$ is player $i$'s fair share of the total payoff.
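To make the formula concrete, here is a brute-force computation of the Shapley values over all subsets (a minimal sketch, not from the paper: `f_x` stands for the composed function $f(h_\boldsymbol{x}(\cdot))$ defined on binary vectors, and the enumeration is exponential in $M$, so this only works for small $M$):

```python
import itertools
import math
import numpy as np

def shapley_values(f_x, M):
    """Exact Shapley values for a set function f_x on binary vectors z' in {0,1}^M:
    phi_i = sum over subsets S not containing i of
            |S|!(M-|S|-1)!/M! * [f_x(S ∪ {i}) - f_x(S)]."""
    phi = np.zeros(M)
    for i in range(M):
        others = [j for j in range(M) if j != i]
        for size in range(M):
            for S in itertools.combinations(others, size):
                z = np.zeros(M, dtype=int)
                z[list(S)] = 1
                without_i = f_x(z)          # coalition S
                z[i] = 1
                with_i = f_x(z)             # coalition S ∪ {i}
                weight = (math.factorial(size) * math.factorial(M - size - 1)
                          / math.factorial(M))
                phi[i] += weight * (with_i - without_i)
    return phi

# Toy check: for an additive f_x, the Shapley value of feature i is its own term.
w = np.array([1.0, 2.0, -3.0])
phi = shapley_values(lambda z: float(np.dot(w, z)), M=3)
print(phi)   # ≈ [ 1.  2. -3.]
```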
From the above, the additive feature attribution method is unique for a given $h_\boldsymbol{x}$. Conversely, importance measures that are not based on Shapley values are considered undesirable, because they necessarily violate at least one of the three properties that feature importances should satisfy.
SHAP (SHapley Additive exPlanation) Values
SHAP values are proposed as the unified measure of feature importance. They are obtained by approximating $f_\boldsymbol{x}$ in the Shapley value equation with a conditional expectation of the original model $f$.
f_\boldsymbol{x}(\boldsymbol{z}') \equiv f(h_\boldsymbol{x}(\boldsymbol{z}')) \rightarrow E[f(\boldsymbol{z})|z_S]
Why does $f$ need to be replaced by a conditional expectation? Strictly speaking, the mapping $h_\boldsymbol{x}(\boldsymbol{z}') = z_S$ is an operation that removes the features not contained in the index set $S$. When mapping from the explanation model's input space back to the original feature space, there is no obviously right value for those removed features. For an image, for example, should everything outside the patches be replaced with 0, or with 255? In multiple regression a removed feature can be replaced with 0, because dropping the feature's term from the model is equivalent to setting that feature to 0; most models, however, cannot handle missing values this naturally. Therefore, instead of treating the features not in $S$ as missing, the mapping assigns them values at random according to the data distribution, so that the model can still accept the input. As a result, the model's prediction $f$ becomes an expectation conditioned on $z_S$.
Computing Shapley values exactly from the definition is difficult because it is computationally expensive (the sum $\sum_{\boldsymbol{z}'\subseteq \boldsymbol{x}'}$ has exponentially many terms). However, SHAP values can be approximated by combining the insights of the additive feature attribution methods. The approximations assume feature independence and, optionally, model linearity.
\begin{aligned}
f(h_\boldsymbol{x}(\boldsymbol{z}')) = E[f(\boldsymbol{z})|z_S] &= E_{\bar z_S|z_S}[f(z_S, \bar z_S)]\\
&\approx E_{\bar z_S}[f(z_S, \bar z_S)] \because \mathrm{independent \ features} \\
&\approx f(z_S, E[\bar z_S]) \because \mathrm{model \ is \ linear}
\end{aligned}
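As a sketch of what these two approximation steps mean in practice (the function and variable names are illustrative assumptions, and `X_background` stands for a sample of data used to represent the features outside $S$):

```python
import numpy as np

def f_x_expected(model_predict, x, S, X_background):
    """Approximate E[f(z) | z_S]: keep the features in S at their values in x and
    average the prediction over background samples for the features not in S
    (the 'independent features' step)."""
    Z = X_background.copy()
    Z[:, S] = x[S]                     # fix the features in S to their values in x
    return model_predict(Z).mean()     # average out the remaining features

def f_x_linear(model_predict, x, S, X_background):
    """Further approximation for (near-)linear models: replace the features not in S
    by their mean instead of averaging predictions (the 'model is linear' step)."""
    z = X_background.mean(axis=0)
    z[S] = x[S]
    return model_predict(z[None, :])[0]

# For a truly linear model the two coincide; for non-linear models the second
# is only a further approximation.
predict = lambda X: X @ np.array([1.0, 2.0, -1.0])
X_bg = np.random.default_rng(0).normal(size=(1000, 3))
x = np.array([0.5, -1.0, 2.0])
print(f_x_expected(predict, x, S=[0, 2], X_background=X_bg))
print(f_x_linear(predict, x, S=[0, 2], X_background=X_bg))
```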
The paper shows that six methods can be obtained as approximations of SHAP values, and this is the sense in which those methods are unified.
- Model-agnostic approximation methods
  - Shapley sampling values
  - Kernel SHAP (new method)
- Model-specific approximation methods
  - Linear SHAP
  - Low-Order SHAP
  - Max SHAP (new method)
  - Deep SHAP (new method)
Shapley sampling values
There is an original paper for this method, but I gave up on reading it because it is paywalled.
Kernel SHAP
Kernel SHAP is obtained as follows.
The SHAP kernel is defined by the following expression:
\pi_\boldsymbol{x}(\boldsymbol{z}') = \frac{M-1}{\binom{M}{|\boldsymbol{z}'|}\,|\boldsymbol{z}'|\,(M-|\boldsymbol{z}'|)}
The explanation model $g$ is then fitted by solving the optimization problem with the following weighted loss function:
L(f, g, \pi_\boldsymbol{x}) = \sum_{\boldsymbol{z}'\in Z}\left[ f(h_\boldsymbol{x}(\boldsymbol{z}')) - g(\boldsymbol{z}') \right]^2 \pi_\boldsymbol{x}(\boldsymbol{z}')
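Below is a minimal sketch of the Kernel SHAP idea under these definitions: enumerate the coalitions $\boldsymbol{z}'$, map them back with the background-mean version of $h_\boldsymbol{x}$ from above, and solve the weighted least-squares problem. It enumerates all coalitions instead of sampling, handles the two infinite-weight points only approximately, and all names are illustrative assumptions rather than the paper's or any library's implementation.

```python
import itertools
import math
import numpy as np

def kernel_shap(model_predict, x, X_background):
    """Fit g(z') = phi_0 + sum_i phi_i z'_i by weighted least squares with the
    Shapley kernel pi_x(z') = (M-1) / [C(M,|z'|) * |z'| * (M-|z'|)]."""
    M = len(x)
    background = X_background.mean(axis=0)

    rows, targets, weights = [], [], []
    for z in itertools.product([0, 1], repeat=M):
        z = np.array(z)
        k = z.sum()
        if k == 0 or k == M:
            continue                                 # infinite kernel weight, see below
        mapped = np.where(z == 1, x, background)     # h_x(z') with background means
        rows.append(z)
        targets.append(model_predict(mapped[None, :])[0])
        weights.append((M - 1) / (math.comb(M, k) * k * (M - k)))

    # The all-zeros and all-ones coalitions have infinite weight; they pin
    # phi_0 = f(background) and sum(phi) = f(x) - phi_0.  Here the second
    # constraint is imposed approximately as one heavily weighted extra row.
    phi_0 = model_predict(background[None, :])[0]
    f_x = model_predict(x[None, :])[0]

    Z = np.vstack([np.array(rows, dtype=float), np.ones(M)])
    y = np.append(np.array(targets) - phi_0, f_x - phi_0)
    W = np.diag(np.append(weights, 1e6))

    phi = np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ y)
    return phi_0, phi

# Sanity check against Linear SHAP (next section): for a linear model,
# Kernel SHAP should reproduce w_i * (x_i - E[x_i]) up to numerical error.
w = np.array([1.0, -2.0, 0.5, 3.0])
predict = lambda X: X @ w + 1.0
X_bg = np.random.default_rng(0).normal(size=(200, 4))
x = np.array([1.0, 1.0, -1.0, 0.5])
print(kernel_shap(predict, x, X_bg)[1])
print(w * (x - X_bg.mean(axis=0)))
```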
Linear SHAP
For a linear model $f(\boldsymbol{x}) = \sum_{j=1}^M w_j x_j + b$, if the features are independent, the SHAP values are as follows. The derivation is given in detail in the paper's appendix.
\begin{aligned}
\phi_0(f, \boldsymbol{x}) &= b \\
\phi_i(f, \boldsymbol{x}) &= w_i(x_i - E[x_i])
\end{aligned}
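A sketch in code (the linear model and `X_background`, which supplies $E[x_i]$, are illustrative assumptions):

```python
import numpy as np

def linear_shap(w, x, X_background):
    """phi_i = w_i * (x_i - E[x_i]) for a linear model with independent features."""
    return w * (x - X_background.mean(axis=0))

# Illustrative use with made-up numbers:
w = np.array([0.5, -2.0, 1.0])
X_bg = np.random.default_rng(0).normal(size=(1000, 3))
x = np.array([1.0, 0.0, 3.0])
print(linear_shap(w, x, X_bg))
```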
Low-Order SHAP
(I do not really understand this one.) Even in the case of linear regression, Kernel SHAP requires only $\mathcal{O}(2^M + M^3)$ computation, so it is more efficient to make the approximation that the features are independent (is that approximation what Low-Order SHAP refers to?).
Max SHAP
The paper says there is supplementary material, but I could not find it, so I do not know the details. The reviewers point out that it is wrong.
Deep SHAP
The claim is that SHAP values can be computed for Dense and Maxpool layers, so SHAP values for a whole network can be obtained by backpropagating them (is that really possible through the non-linear parts?).
It seems that decision-tree models can compute SHAP values efficiently. For LightGBM and XGBoost, you can easily output SHAP values with the library in Reference 4.
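Assuming Reference 4 is the `shap` Python library, a minimal usage sketch with XGBoost might look like this (the toy data and model settings are made up for illustration):

```python
import numpy as np
import shap
import xgboost

# Toy data and model; a LightGBM model works the same way.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X[:, 0] + 2 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

model = xgboost.XGBRegressor(n_estimators=100).fit(X, y)

# TreeExplainer computes SHAP values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # shape: (n_samples, n_features)

print(shap_values[0])             # per-feature contributions for the first sample
print(explainer.expected_value)   # phi_0: the average model prediction
```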