On the instrumental variable method

What is the instrumental variable method?

This is a method for estimating the average causal effect in the presence of confounders that cannot be adjusted for, or even observed, by using variables called instrumental variables instead of adjusting for the confounding.

It is especially useful for data analysis in the social sciences, where many unobserved confounders can be expected.

Below is a DAG (Directed Acyclic Graph) showing the causal relationships among the variables. Using this DAG, I will explain why the causal effect of A on Y can be estimated with an instrumental variable.

(image.png: DAG showing the relationships among Z, A, U, and Y)

- A: treatment variable (the measure being evaluated)
- Y: outcome variable
- Z: instrumental variable
- U: unobserved confounder

These four variables are connected by arrows. In a DAG, **the variable at the tail of an arrow causally affects the variable at its head**.

Now, suppose we want to estimate the effect of A (the treatment) on Y. The variable U, however, is correlated with both A and Y, so the correlation between A and Y cannot be interpreted as causal.

Here I introduce the IV method as a way to estimate this causal effect without adjusting for the confounding, even when U cannot be observed (cf. omitted variable bias).

So what exactly is an instrumental variable? An instrumental variable is a variable that

  1. is correlated with the treatment variable A, and
  2. **affects the outcome variable Y only through the treatment variable A** (the exclusion restriction).

As the DAG shows, Z is connected to Y only through A, so conditions 1 and 2 above are exactly what this DAG describes.

Now consider the estimator $\displaystyle \frac{Cov(Z, Y)}{Cov(Z, A)}$, which is an unbiased estimator of the average causal effect of A on Y.[^1]

Why is this estimator (hereafter, the IV estimator) unbiased? Here I give an intuitive explanation, following Yamaguchi (2019)[^2].

(image.png: the DAG annotated with effect sizes)

Assume the causal effect of Z on A is $\alpha$ and the causal effect of A on Y is $\beta$.

Then the denominator of $\displaystyle \frac{Cov(Z, Y)}{Cov(Z, A)}$ estimates $\alpha$, and the numerator estimates $\alpha \beta$ (each up to a common factor of $Var(Z)$, which cancels). Dividing the numerator by the denominator therefore recovers the $\beta$ we wanted to estimate.
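This intuition can be sanity-checked numerically: in a purely linear setting, $Cov(Z, A) = \alpha\,Var(Z)$ and $Cov(Z, Y) = \alpha\beta\,Var(Z)$, so their ratio recovers $\beta$. A minimal sketch (the coefficients $\alpha = 0.8$ and $\beta = 2.0$ here are made-up illustration values, not from the article's simulation):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
alpha, beta = 0.8, 2.0  # hypothetical causal effects Z -> A and A -> Y

Z = rng.normal(size=n)
U = rng.normal(size=n)                      # unobserved confounder
A = alpha * Z + U + rng.normal(size=n)      # U pushes A up ...
Y = beta * A - 3 * U + rng.normal(size=n)   # ... and pushes Y down

# Cov(Z, A) estimates alpha * Var(Z); Cov(Z, Y) estimates alpha * beta * Var(Z)
iv_estimate = np.cov(Z, Y)[0, 1] / np.cov(Z, A)[0, 1]
print(iv_estimate)  # should be close to beta = 2.0
```

Note that a naive regression of Y on A would be biased here, because U enters both equations; the ratio of covariances with Z is not.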

That, with the details stripped away, is the instrumental variable method.[^3]

In the following, I simulate a situation like the DAG above and demonstrate instrumental variable estimation.

Simulation using Python

- Generate the explanatory variable A, the instrumental variable Z, and the confounder U from a multivariate normal distribution $N(\mu, \Sigma)$
  - $\mu = [\mu_A = 0.5, \mu_U = 1.5, \mu_Z = 20.0]$ (in the column order A, U, Z used in the code below)
  - $\Sigma$: the covariance matrix, with $Cov(Z, A) = 0.4$, $Cov(A, U) = -0.7$, $Cov(Z, U) = 0.0$

```python
import numpy as np

# Covariance matrix: row/column 0 is A, 1 is U, 2 is Z
Sigma = np.eye(3)
Sigma[1, 0] = -0.7  # Cov(A, U)
Sigma[0, 1] = -0.7
Sigma[2, 0] = 0.4   # Cov(A, Z)
Sigma[0, 2] = 0.4
# Generate the explanatory variable A, instrumental variable Z,
# and confounder U from the multivariate normal N(mu, Sigma)
d = np.random.multivariate_normal([0.5, 1.5, 20.0], Sigma, size=10000)
```

```
>>> Sigma
array([[ 1. , -0.7,  0.4],
       [-0.7,  1. ,  0. ],
       [ 0.4,  0. ,  1. ]])
```

- Assume the outcome variable y is generated as $y = 2A + 6U + \epsilon$, where $\epsilon \sim N(0, 1)$

```python
# Error term
e = np.random.randn(len(d))
# Unpack the columns of d into the individual variables
A = d[:, 0]
u = d[:, 1]
Z = d[:, 2]
# True model of the outcome variable: y = 2A + 6u + e
y = A*2 + 6*u + e
# Check the correlations (equal to the covariances here, since all variances are 1)
print("Cov(A, u)=", np.corrcoef(A, u)[0, 1])
print("Cov(Z, u)=", np.corrcoef(Z, u)[0, 1])
print("Cov(A, Z)=", np.corrcoef(A, Z)[0, 1])
```

```
Cov(A, u)= -0.701004490456518
Cov(Z, u)= 0.0043542162380179215
Cov(A, Z)= 0.39744458663706667
```

$Cov(Z, U)$ is almost 0, so the relationships among the variables set by $\Sigma$ reproduce the DAG situation!

Now, what happens if we naively look at the relationship between A and y in a scatter plot?

```python
import seaborn as sns

g = sns.scatterplot(x=A, y=y)
g.set_ylabel("y"); g.set_xlabel("A")
```

(image.png: scatter plot of A against y)

There appears to be a negative correlation! But in the true model above, $y = 2A + 6U + \epsilon$, the causal effect of A on y is 2. The negative correlation arises because U affects both A and y. Indeed,

```python
# Remove U's contribution (6u) from y and color the points by u
g = sns.scatterplot(x=A, y=y - 6*u, hue=u)
g.set_ylabel("Outcome variable")
g.set_xlabel("Explanatory variable")
```

(image.png: scatter plot of A against y − 6u, colored by u)

Since A and U are negatively correlated, we can see that as A increases, U decreases.

```python
import statsmodels.api as sm
from statsmodels.regression.linear_model import OLS

# Multiple regression of y on A and the confounder u
results = OLS(endog=y, exog=np.c_[A, u]).fit()
# Simple regression of y on A alone (with an intercept), ignoring u
simple = OLS(endog=y, exog=sm.add_constant(A)).fit()
print("If we simply regress y on A:", simple.params[1])
print("If we include the confounder in a multiple regression:", results.params[0])
```

```
If we simply regress y on A: -2.1853899626462927
If we include the confounder in a multiple regression: 2.022653523550131
```

If the confounder U can be observed, the average causal effect can be estimated by running a multiple regression under this true model.

Trying the instrumental variable method

To compute $\displaystyle \frac{Cov(Z, Y)}{Cov(Z, A)}$, I define the following function and run the IV estimation.

```python
def IV(A, y, z):
    """IV estimator: Cov(z, y) / Cov(z, A)."""
    denom = np.cov(A, z)[1, 0]
    numer = np.cov(z, y)[1, 0]
    return numer / denom

print("Causal effect when estimated using instrumental variable method", IV(A, y, Z))
```

```
Causal effect when estimated using instrumental variable method 2.0858603407321765
```
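As an aside, the same number can be obtained by two-stage least squares (2SLS): first regress A on Z, then regress y on the fitted values of A. This is not part of the article's code; it is a minimal sketch of the equivalence, using only NumPy:

```python
import numpy as np

def tsls(A, y, z):
    """Two-stage least squares with a single instrument.

    Stage 1: regress A on z (with intercept) to get fitted values A_hat.
    Stage 2: regress y on A_hat (with intercept); the slope is the IV estimate.
    """
    Z1 = np.c_[np.ones_like(z), z]
    A_hat = Z1 @ np.linalg.lstsq(Z1, A, rcond=None)[0]   # stage 1 fit
    X = np.c_[np.ones_like(A_hat), A_hat]
    return np.linalg.lstsq(X, y, rcond=None)[0][1]       # stage 2 slope
```

With one instrument and intercepts in both stages, this slope is algebraically identical to $\frac{Cov(Z, y)}{Cov(Z, A)}$, so `tsls(A, y, Z)` should match `IV(A, y, Z)` up to floating-point error.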

I was able to estimate the average causal effect without being biased by U!

In closing

We saw that the average causal effect of A on Y can be estimated with the IV method even when the confounder U cannot be observed.

However, there are two points to be aware of.

  1. The exclusion restriction cannot be verified from the data! Unfortunately, **there is no way to confirm that Z really was generated under a DAG like the one above** (because U is unobserved). When using IV in actual research, you must argue for a DAG like the one above by drawing on domain knowledge, or by using variables that are effectively randomly assigned in natural-experiment settings.
  2. If the correlation between A and Z is weak, the variance of the estimator becomes large (the weak instruments problem).
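Point 2 can be checked numerically. A rough sketch (assumed setup mirrors the simulation above; the sample size `n` and repetition count `reps` are arbitrary choices):

```python
import numpy as np

def iv_spread(cov_az, n=2000, reps=200, seed=0):
    """Standard deviation of the IV estimate across `reps` simulated datasets."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(reps):
        # Same structure as the article's Sigma, but with Cov(A, Z) adjustable
        Sigma = np.array([[1.0,   -0.7, cov_az],
                          [-0.7,   1.0, 0.0],
                          [cov_az, 0.0, 1.0]])
        d = rng.multivariate_normal([0.5, 1.5, 20.0], Sigma, size=n)
        A, u, Z = d[:, 0], d[:, 1], d[:, 2]
        y = 2 * A + 6 * u + rng.standard_normal(n)
        estimates.append(np.cov(Z, y)[0, 1] / np.cov(Z, A)[0, 1])
    return np.std(estimates)

# The weaker the instrument, the noisier the IV estimate:
print(iv_spread(0.4), iv_spread(0.1))
```

The spread of the estimates grows sharply as $Cov(A, Z)$ shrinks, which is exactly the weak instruments problem.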

I varied only the value of $Cov(A, Z)$ in the setup above and checked how the estimate changes.

(image.png: IV estimates as $Cov(A, Z)$ varies)

When $Cov(A, Z)$ is too small, the IV estimate ends up far from the true value.

That concludes this brief introduction to the instrumental variable method and the accompanying explanation with simulated data.

This estimation method comes up constantly in econometrics textbooks, so it is well worth studying!

Related Documents

  1. Introduction to Randomized Controlled Trials for Finding Causal Relationships for Policy Evaluation
  2. Introduction to Partial Identification by Professor Tsunao Okumura
  3. Professor Naoya Sueishi Econometrics No need to analyze micro data

In addition, @mns_econ presented the IV method at TokyoR, so I will link that as well.

  1. @mns_econ https://speakerdeck.com/mns54/cao-zuo-bian-shu-fa-ru-men?slide=10

[^1]: Strictly speaking, this is not an **unbiased** estimator; that was a misunderstanding on my part. It is only a consistent estimator: $\frac{Cov(Z, Y)}{Cov(Z, A)} = \frac{Cov(Z, 2A + 6U + \epsilon)}{Cov(Z, A)} = 2 + 6\,\frac{Cov(Z, U)}{Cov(Z, A)}$, and the $\frac{Cov(Z, U)}{Cov(Z, A)}$ term converges to 0 asymptotically. I feel my understanding of unbiasedness versus consistency is still a little shaky, so I will review it again.
[^2]: The graph is taken from Professor Yamaguchi's RIETI discussion paper: https://www.rieti.go.jp/jp/publications/dp/19j003.pdf
[^3]: Strictly speaking, the estimator above also requires the assumption that the causal effect of A on Y is homogeneous across individuals.
