From the right-leaning Forbes magazine comes an admission of something that has long been known to be true: better stock markets are
positively correlated with Democratic Presidents.
Some people will armchair quarterback that correlation and causation are not the same thing. They are right. But a lack of negative correlation does mean a lack of negative causation. In simpler terms, overall, Democratic Presidents do not hurt the economy, and they might help it.
I thought I'd share a few comments on why correlation is not causation so that we have a sense of what questions we should be asking when we are confronted with correlations, yet are interested in causal impact. In the context of your comment, the treatment (by analogy with medical trials) is whether or not the President is a Democrat (call it D, with D=1 for a Democrat) and the response variable you chose is the cumulative return on the S&P500 (call it Y for simplicity). What we truly want to estimate is the expected difference between Y if D=Democrat and Y if D=Republican. Mathematically, we would write E(Y|D=1) - E(Y|D=0), with the "reference case" (D=0) being a Republican President. Then, it's just a problem of properly estimating those quantities.
One approach would be to express those expected values as functions and limit our attention to estimating parameters. For example, if I chose a linear function, I could express this as E(Y|D) = a + b*D with my problem now being to estimate the constant "a" and the slope "b". Obviously, the causal effect is "b" here: E(Y|D=1) - E(Y|D=0) = (a + b*1) - (a + b*0) = b. To apply this to data, we'll just say that we observe this relationship with noise -- meaning, unmodeled factors get lumped into a new variable, say, e. Hence, we get Y = a + b*D + e.
Implicit in what I just did is the idea that E(e|D) = 0 -- i.e., conditional on the treatment, other factors have a null impact on average. If this is correct, under very mild conditions, there exists an unbiased estimator of b (i.e., some function of observed values of Y and D that, on average, would give you exactly b). That's the ordinary least squares estimator and, in this specific context, your estimate of b is the sample covariance between Y and D divided by the sample variance of D (a rescaled version of the sample correlation, with the same sign) which, given that D is either 0 or 1 (Rep. or Dem.), is just the difference between the means of Y when D=1 and D=0... And therein lies the problem: generally, it is not credible to assume that other factors have a null impact conditional on just the treatment; more plausibly, E(e|D) =/= 0. Obviously, that idea generalizes to cases where D takes more than two values.
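To make the link between least squares and a difference in means concrete, here is a minimal Python sketch on simulated data. Everything in it is an assumption made for illustration: the values a = 1 and b = 2, the unit-variance noise, and the helper names ols_slope and diff_in_means. None of it comes from actual presidential terms or S&P500 returns.

```python
import random

def ols_slope(y, d):
    """Least-squares slope of y = a + b*d, i.e. cov(y, d) / var(d)."""
    n = len(y)
    mean_y = sum(y) / n
    mean_d = sum(d) / n
    cov = sum((yi - mean_y) * (di - mean_d) for yi, di in zip(y, d))
    var = sum((di - mean_d) ** 2 for di in d)
    return cov / var

def diff_in_means(y, d):
    """Mean of y where d == 1 minus mean of y where d == 0."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

rng = random.Random(0)  # fixed seed so the run is reproducible
d = [rng.randint(0, 1) for _ in range(2000)]
# Simulated outcome with E(e|D) = 0 by construction (noise is independent of d).
y = [1.0 + 2.0 * di + rng.gauss(0, 1) for di in d]

slope = ols_slope(y, d)
gap = diff_in_means(y, d)
print(f"OLS slope = {slope:.4f}, difference in means = {gap:.4f}")
```

The two printed numbers agree up to floating-point rounding, and they land near the assumed b only because the simulated noise was drawn independently of D.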
And, with regard to your comment, IT IS PERFECTLY POSSIBLE FOR THE CAUSAL IMPACT TO HAVE THE OPPOSITE SIGN OF THE SAMPLE CORRELATION. It depends on how large the bias in the sample correlation is, but it is a mathematical possibility. But you may argue that in some cases the implied bias would have to be too large to be plausible.
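A small simulation can illustrate the sign reversal. This is only a sketch under invented assumptions: a hypothetical binary factor z that makes D=1 more likely (probability 0.8 versus 0.2) and also raises Y by 5, together with an assumed true effect b = -1. Under those assumptions the population difference in means is -1 + 5*(0.8 - 0.2) = +2, so the naive comparison points the wrong way.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
TRUE_B = -1.0  # assumed causal effect of D on Y (negative by construction)

d, y = [], []
for _ in range(5000):
    # Hypothetical unobserved factor: present half of the time.
    z = 1 if rng.random() < 0.5 else 0
    # z makes the treatment more likely...
    di = 1 if rng.random() < (0.8 if z else 0.2) else 0
    d.append(di)
    # ...and also pushes the outcome up, so E(e|D) != 0 when z is left out.
    y.append(TRUE_B * di + 5.0 * z + rng.gauss(0, 1))

n_treated = sum(d)
mean_y1 = sum(yi for yi, di in zip(y, d) if di == 1) / n_treated
mean_y0 = sum(yi for yi, di in zip(y, d) if di == 0) / (len(d) - n_treated)
naive = mean_y1 - mean_y0
print(f"true b = {TRUE_B}, naive difference in means = {naive:.2f}")
```

How large the omitted factor has to be for this to happen is exactly the "implied bias" question: here it takes an effect of z five times the size of b.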
What are the questions you should ask then?
1. Well, maybe E(e|D) =/=0, but E(e|D, other variables) = 0. In that case, you can add variables to your regression and the OLS estimator will "clean" the sample correlation of the impact of those variables. Here: are there factors that influence the party of the elected president that also influence the cumulative return on the S&P500 over the tenure of said president?
2. It is possible that you cannot find all the variables you would need to do 1. In that case, you have to get fancy, but all those fancier methods rely on the idea that (1) you have to try to compare apples with apples and (2) you try to find a clever way to use "near apple" cases to help gain information about how the "apple" case works.
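Question 1 can be sketched in the same simulated setting, again with invented numbers (true b = -1, a hypothetical observed factor z). Instead of a full multiple regression, the sketch uses the partialling-out form of OLS (the Frisch-Waugh-Lovell result): regress Y and D on z separately, then regress the Y residuals on the D residuals. The resulting slope equals the coefficient on D in a regression of Y on D and z.

```python
import random

def slope(y, x):
    """Least-squares slope of y on x (with an intercept)."""
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    cov = sum((yi - my) * (xi - mx) for yi, xi in zip(y, x))
    return cov / sum((xi - mx) ** 2 for xi in x)

def residuals(v, z):
    """Residuals from the least-squares fit v = c0 + c1*z."""
    n = len(v)
    mv, mz = sum(v) / n, sum(z) / n
    c1 = slope(v, z)
    return [(vi - mv) - c1 * (zi - mz) for vi, zi in zip(v, z)]

rng = random.Random(2)  # fixed seed so the run is reproducible
TRUE_B = -1.0  # assumed causal effect of D on Y

z, d, y = [], [], []
for _ in range(5000):
    zi = 1 if rng.random() < 0.5 else 0
    di = 1 if rng.random() < (0.8 if zi else 0.2) else 0
    z.append(zi)
    d.append(di)
    y.append(TRUE_B * di + 5.0 * zi + rng.gauss(0, 1))

naive = slope(y, d)  # ignores z, so it absorbs z's effect on Y
controlled = slope(residuals(y, z), residuals(d, z))  # z partialled out
print(f"naive = {naive:.2f}, controlling for z = {controlled:.2f}")
```

This only works because z is observed and is the sole source of bias in the simulation; with real data, whether the added variables are enough is the hard part.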
Another issue here is that your measure of an economy that is doing well is the cumulative return on the S&P500. I am not entirely convinced it is an appropriate measure. Or even that the party holding the White House is the appropriate treatment given that seats in the House and Senate also matter a great deal, if only because this is where bills originate and are debated.