Is the average of betas from Y ~ X and X ~ Y valid?
I am interested in the relationship between two time series variables, $Y$ and $X$. The two variables are related to each other, and it's not clear from theory which one causes the other.

Given this, I have no good reason to prefer the linear regression $Y = \alpha + \beta X$ over $X = \kappa + \gamma Y$.

Clearly there is some relationship between $\beta$ and $\gamma$, though I recall enough statistics to understand that $\beta = 1/\gamma$ is not true. Or perhaps it's not even close? I'm a bit hazy.

The problem is to decide how much of $X$ one ought to hold against $Y$. I'm considering taking the average of $\beta$ and $1/\gamma$ and using that as the hedge ratio.

Is the average of $\beta$ and $1/\gamma$ a meaningful concept? What is the appropriate way to deal with the fact that the two variables are related to each other -- meaning that there really isn't an independent and a dependent variable?
regression regression-coefficients
The problem is not causality but rather errors of measurement (it is just that the dependent variable $Y$ is often the one with the larger measurement error, making $Y = a + bX + \text{error}$ the common expression). Do you have an idea of the size of the errors in the measurement of $X$ and $Y$?
– Martijn Weterings
yesterday
To determine causality you need a controlled experiment: an experiment where you are able to change some variable independently from the others (or a very particular situation where two populations can be considered/assumed equal except for one or more variables that are to be treated as "independent" variables).
– Martijn Weterings
yesterday
The exact values of $\beta$ and $\gamma$ can be found in this answer of mine to Effect of switching responses and explanatory variables..., and, as you suspect, $\beta$ is not the reciprocal of $\gamma$, and averaging $\beta$ and $1/\gamma$ is not the right way to go. A pictorial view of what $\beta$ and $\gamma$ are minimizing is given in Elvis's answer to the same question, and he introduces a "least rectangles" regression that you might want...
– Dilip Sarwate
yesterday
You are in the ideal scenario where the choice of technique has a direct, physically measurable impact; you can simply measure the out-of-sample hedging error for each estimate, and compare them. Also, typically optimal hedging is better handled by using a VECM model (see for example Gatarek & Johansen, 2014, Optimal hedging with the cointegrated vector autoregressive model), which does not require choosing to model Y as a function of X or vice-versa.
– Chris Haug
23 hours ago
You might want to look at the geometric mean $\sqrt{\dfrac{\beta}{\gamma}}$ as a possibility (if they are both negative you might take the negative square root). Then look at $\dfrac{s_y}{s_x}$, which should be very similar.
– Henry
21 hours ago
edited yesterday
asked yesterday
ricardo
1385
4 Answers
Converted from a comment...

The exact values of $\beta$ and $\gamma$ can be found in this answer of mine to Effect of switching responses and explanatory variables in simple linear regression, and, as you suspect, $\beta$ is not the reciprocal of $\gamma$, and averaging $\beta$ and $\gamma$ (or averaging $\beta$ and $1/\gamma$) is not the right way to go. A pictorial view of what $\beta$ and $\gamma$ are minimizing is given in Elvis's answer to the same question, where he introduces a "least rectangles" regression that might be what you are looking for. The comments following Elvis's answer should not be neglected; they relate this "least rectangles" regression to other, previously studied, techniques. In particular, note that moderator chl points out that this method is of interest when it is not clear which is the predictor variable and which is the response variable.
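As a small illustration (my own sketch with simulated data, not part of the answer above), the "least rectangles" slope works out to the geometric mean of $\beta$ and $1/\gamma$, signed by the correlation, which equals $s_y/s_x$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-ins for the two related series.
n = 50_000
x = rng.normal(0.0, 2.0, n)
y = 1.5 * x + rng.normal(0.0, 1.0, n)

cov_xy = np.cov(x, y, bias=True)[0, 1]
beta = cov_xy / np.var(x)    # OLS slope of Y ~ X
gamma = cov_xy / np.var(y)   # OLS slope of X ~ Y

# "Least rectangles" (geometric-mean) slope: sqrt(beta * (1/gamma)),
# with the sign of the correlation; it equals s_y / s_x.
r = np.corrcoef(x, y)[0, 1]
slope_lr = np.sign(r) * np.sqrt(beta / gamma)

print(beta, slope_lr, 1 / gamma)  # slope_lr lies between beta and 1/gamma
```

This makes concrete why averaging $\beta$ and $1/\gamma$ is ad hoc: the symmetric "least rectangles" estimate is their geometric, not arithmetic, mean.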
To see the connection between the two representations, take a bivariate Normal vector:
$$
\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma^2_1 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma^2_2 \end{pmatrix} \right)
$$
with conditionals
$$X_1 \mid X_2=x_2 \sim \mathcal{N}\left( \mu_1 + \rho \frac{\sigma_1}{\sigma_2}(x_2 - \mu_2),\,(1-\rho^2)\sigma^2_1 \right)$$
and
$$X_2 \mid X_1=x_1 \sim \mathcal{N}\left( \mu_2 + \rho \frac{\sigma_2}{\sigma_1}(x_1 - \mu_1),\,(1-\rho^2)\sigma^2_2 \right)$$
This means that
$$X_1=\underbrace{\left(\mu_1-\rho \frac{\sigma_1}{\sigma_2}\mu_2\right)}_{\alpha}+\underbrace{\rho \frac{\sigma_1}{\sigma_2}}_{\beta} X_2+\sqrt{1-\rho^2}\,\sigma_1\epsilon_1$$
and
$$X_2=\underbrace{\left(\mu_2-\rho \frac{\sigma_2}{\sigma_1}\mu_1\right)}_{\kappa}+\underbrace{\rho \frac{\sigma_2}{\sigma_1}}_{\gamma} X_1+\sqrt{1-\rho^2}\,\sigma_2\epsilon_2$$
which means (a) $\gamma$ is not $1/\beta$ and (b) the connection between the two regressions depends on the joint distribution of $(X_1,X_2)$.
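A quick simulation (my sketch, with illustrative parameter values) confirms these identities, and that $\beta\gamma = \rho^2 < 1$, so $\gamma \ne 1/\beta$ unless the correlation is perfect:

```python
import numpy as np

rng = np.random.default_rng(1)

# Parameters of the bivariate normal (chosen for illustration).
mu = [1.0, -2.0]
s1, s2, rho = 2.0, 0.5, 0.7
cov = [[s1**2, rho * s1 * s2],
       [rho * s1 * s2, s2**2]]

x1, x2 = rng.multivariate_normal(mu, cov, size=200_000).T

c12 = np.cov(x1, x2, bias=True)[0, 1]
beta = c12 / np.var(x2)    # slope of the regression X1 ~ X2
gamma = c12 / np.var(x1)   # slope of the regression X2 ~ X1

print(beta, rho * s1 / s2)   # beta  = rho * sigma1 / sigma2
print(gamma, rho * s2 / s1)  # gamma = rho * sigma2 / sigma1
print(beta * gamma, rho**2)  # beta * gamma = rho^2 < 1, so gamma != 1/beta
```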
How would I decide if the average of the two betas is a better measure of the hedge ratio than one or the other?
– ricardo
yesterday
I have no idea.
– Xi'an
yesterday
@ricardo By measuring the out-of-sample hedging error under each estimate, which is ultimately what you are trying to minimize.
– Chris Haug
23 hours ago
$\beta$ and $\gamma$

As Xi'an noted in his answer, $\beta$ and $\gamma$ relate to the conditional means $X \mid Y$ and $Y \mid X$ (which in turn derive from a single joint distribution), and these are not symmetric in the sense that $\beta = 1/\gamma$. Nor would this be the case if you 'knew' the true $\sigma$ and $\rho$ instead of using estimates. You have $$\beta = \rho_{XY} \frac{\sigma_Y}{\sigma_X}$$ and $$\gamma = \rho_{XY} \frac{\sigma_X}{\sigma_Y}$$

See also simple linear regression on Wikipedia for the computation of $\beta$ and $\gamma$.

It is this correlation term that disturbs the symmetry. If $\beta$ and $\gamma$ were simply the ratios of the standard deviations, $\sigma_Y/\sigma_X$ and $\sigma_X/\sigma_Y$, then they would indeed be each other's inverse. The $\rho_{XY}$ term can be seen as modifying this, as a sort of regression to the mean. With perfect correlation, $\rho_{XY} = 1$, you can fully predict $X$ based on $Y$ or vice versa. But with $\rho_{XY} < 1$ you cannot make those perfect predictions, and the conditional mean will be somewhat closer to the unconditional mean than a simple scaling by $\sigma_Y/\sigma_X$ or $\sigma_X/\sigma_Y$ would suggest.
Is a regression line the right method?
You may wonder whether these conditional probabilities and regression lines are what you need to determine your ratio of $X$ and $Y$. It is unclear to me how you would wish to use a regression line in the computation of an optimal ratio.

Below is an alternative way to compute the ratio. This method has symmetry (i.e. if you switch $X$ and $Y$ you will get the same ratio).
Alternative
Say the yields of bonds $X$ and $Y$ are distributed according to a multivariate normal distribution$^\dagger$ with correlation $\rho_{XY}$ and standard deviations $\sigma_X$ and $\sigma_Y$. Then the yield of a hedge that is a weighted sum of $X$ and $Y$ will be normally distributed:
$$H = \alpha X + (1-\alpha) Y \sim N(\mu_H,\sigma_H^2)$$
where $0 \leq \alpha \leq 1$ and
$$\begin{array}{rcl}
\mu_H &=& \alpha \mu_X+(1-\alpha) \mu_Y \\
\sigma_H^2 &=& \alpha^2 \sigma_X^2 + (1-\alpha)^2 \sigma_Y^2 + 2 \alpha (1-\alpha) \rho_{XY} \sigma_X \sigma_Y \\
&=& \alpha^2(\sigma_X^2+\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y) + \alpha (-2 \sigma_Y^2+2\rho_{XY}\sigma_X\sigma_Y) +\sigma_Y^2
\end{array}$$
The maximum of the mean $\mu_H$ will be at $\alpha = 0$ or $\alpha = 1$, or will not exist when $\mu_X=\mu_Y$.

The minimum of the variance $\sigma_H^2$ will be at $$\alpha = 1 - \frac{\sigma_X^2 -\rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2 +\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y} = \frac{\sigma_Y^2-\rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2+\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y}$$
The optimum will be somewhere between those two extremes and depends on how you wish to weigh losses against gains.

Note that there is now a symmetry between $\alpha$ and $1-\alpha$. It does not matter whether you use the hedge $H=\alpha_1 X+(1-\alpha_1)Y$ or the hedge $H=\alpha_2 Y + (1-\alpha_2) X$; you will get the same ratios, with $\alpha_1 = 1-\alpha_2$.
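The minimum-variance weight can be sanity-checked with a quick grid search (a sketch with made-up parameter values; `s_x`, `s_y`, `rho` are my assumptions, not from the answer):

```python
import numpy as np

# Assumed example parameters (not from the answer).
s_x, s_y, rho = 1.5, 1.0, 0.4

# Variance of H = alpha*X + (1 - alpha)*Y over a fine grid of alpha.
alphas = np.linspace(0.0, 1.0, 100_001)
var_h = (alphas**2 * s_x**2
         + (1 - alphas)**2 * s_y**2
         + 2 * alphas * (1 - alphas) * rho * s_x * s_y)

# Closed-form minimizer from the text.
alpha_star = (s_y**2 - rho * s_x * s_y) / (s_x**2 + s_y**2 - 2 * rho * s_x * s_y)

print(alphas[np.argmin(var_h)], alpha_star)  # agree to grid precision
```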
Minimal variance case and relation to principal components

In the minimal-variance case (here you actually do not need to assume a multivariate normal distribution) you get the following hedge ratio as the optimum: $$\frac{\alpha}{1-\alpha} = \frac{\mathrm{var}(Y) - \mathrm{cov}(X,Y)}{\mathrm{var}(X)-\mathrm{cov}(X,Y)}$$ which can be expressed in terms of the regression coefficients $\beta = \mathrm{cov}(X,Y)/\mathrm{var}(X)$ and $\gamma = \mathrm{cov}(X,Y)/\mathrm{var}(Y)$ as $$\frac{\alpha}{1-\alpha} = \frac{1/\gamma - 1}{1/\beta - 1}$$

In a situation with more than two variables/stocks/bonds you might generalize this to the last (smallest-eigenvalue) principal component.
Variants
Improvements of the model can be made by using distributions other than the multivariate normal. You could also incorporate time in a more sophisticated model to make better predictions of future values of the pair $X,Y$.

$^\dagger$ This is a simplification, but it suits the purpose of explaining how one can, and should, perform the analysis to find an optimal ratio without a regression line.
I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) of stocks, bonds, and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities of expressing hedging and stocks, but it shows the basic principle of how you can get away from the use of a regression line (go back to first principles and express the model for profit, which is at the core, instead of using regression lines whose relevance is not directly clear).
– Martijn Weterings
4 hours ago
I think I understand. The problem is that $1/\rho_{XY} \ne \rho_{XY}$; indeed, $\rho_{XY}$ often changes quite a bit when we take the inverse. Your alternative is close to the case I am thinking about, but I do want to check one thing: does this allow negative holdings? Adopting your terminology, I'd have a unit holding of bond X and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y... but it could be 0.2 units or 5 units, depending on the math.
– ricardo
4 hours ago
Long means that I make ~1% on a bond if the price increases by ~1%; short means that I lose ~1% on a bond if the price increases by ~1%. So the idea is that I am long one unit of one bond (so I benefit from an appreciation) and short some amount of the other bond (so I lose from an appreciation).
– ricardo
3 hours ago
"The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression how you decide about this. How do you define losses and gains and how much do you value them?
– Martijn Weterings
3 hours ago
Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
– Martijn Weterings
3 hours ago
Perhaps the approach of "Granger causality" might help. This would help you assess whether $X$ is a good predictor of $Y$ or whether $Y$ is a better predictor of $X$. In other words, it tells you whether $\beta$ or $\gamma$ is the one to take more seriously. Also, given that you are dealing with time series data, it tells you how much of the history of $X$ counts towards the prediction of $Y$ (or vice versa).
Wikipedia gives a simple explanation:
A time series X is said to Granger-cause Y if it can be shown, usually through a series of t-tests and F-tests on lagged values of X (and with lagged values of Y also included), that those X values provide statistically significant information about future values of Y.
What you do is the following:

- regress Y(t) on X(t-1) and Y(t-1)
- regress Y(t) on X(t-1), X(t-2), Y(t-1), Y(t-2)
- regress Y(t) on X(t-1), X(t-2), X(t-3), Y(t-1), Y(t-2), Y(t-3)

Continue for whatever history length might be reasonable. Check the significance of the F-statistics for each regression.

Then do the same in reverse (so, now regress X(t) on the past values of X and Y) and see which regressions have significant F-values.
A very straightforward example, with R code, is found here.
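In the same spirit, the procedure above can be sketched directly with ordinary least squares. This is a minimal illustration with simulated data; `granger_f_test` is a name introduced here for the sketch, not a library function (statsmodels provides a full-featured version):

```python
import numpy as np

def granger_f_test(y, x, p):
    """F-test of whether p lags of x add predictive power for y
    beyond y's own p lags (the Granger-causality regression)."""
    n = len(y)
    t = np.arange(p, n)
    # Unrestricted design: intercept, y lags 1..p, x lags 1..p.
    Xu = np.column_stack([np.ones(n - p)]
                         + [y[t - k] for k in range(1, p + 1)]
                         + [x[t - k] for k in range(1, p + 1)])
    # Restricted design: intercept and y lags only.
    Xr = Xu[:, :p + 1]
    yt = y[t]
    rss_u = np.sum((yt - Xu @ np.linalg.lstsq(Xu, yt, rcond=None)[0])**2)
    rss_r = np.sum((yt - Xr @ np.linalg.lstsq(Xr, yt, rcond=None)[0])**2)
    # F statistic with p restrictions and n - p - (2p + 1) residual df.
    df = len(yt) - Xu.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / df)

# Toy data: x leads y by one step, so x should Granger-cause y.
rng = np.random.default_rng(2)
x = rng.normal(size=2_000)
y = np.empty_like(x)
y[0] = 0.0
for i in range(1, len(x)):
    y[i] = 0.5 * y[i - 1] + 0.8 * x[i - 1] + 0.1 * rng.normal()

print(granger_f_test(y, x, p=2))  # large: lags of x help predict y
print(granger_f_test(x, y, p=2))  # small: lags of y do not help predict x
```

Comparing the F-statistic against the appropriate F critical value in each direction is exactly the decision described above.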
Granger causality has been critiqued for not actually establishing causality (in some cases). But it seems that your application is really about "predictive causality," which is exactly what the Granger-causality approach is meant for.
The point is that the approach will tell you whether X predicts Y or whether Y predicts X (so you no longer would be tempted to artificially--and incorrectly--compound the two regression coefficients) and it gives you a better prediction (as you will know how much history of X and Y you need to know to predict Y), which is useful for hedging purposes, right?
I have a strong theoretical reason to believe that neither is truly a cause, and that even if one became a cause it would not remain true over time. So I don't think that Granger causality is the answer in this case. I've upvoted the answer in any case, as it is useful -- esp. the R code.
– ricardo
12 hours ago
That is why I explicitly mention that "Granger causality has been critiqued for not actually establishing causality (in some cases)." It seems to me that your question is more about establishing "predictive causality," which is what Granger causality is meant for. In addition, Granger's approach uses the information in your time series data, which are a waste not to use if you have them. Of course, you can (should?) re-estimate the effects over time. I expect that the Granger effects are more stable than cross-sectional OLS (you can test this beforehand, using historical data). HTH
– Steve G. Jones
8 hours ago
answered 16 hours ago
Dilip Sarwate
To see the connection between both representations, take a bivariate Normal vector:
$$
begin{pmatrix}
X_1 \
X_2
end{pmatrix} sim mathcal{N} left( begin{pmatrix}
mu_1 \
mu_2
end{pmatrix} , begin{pmatrix}
sigma^2_1 & rho sigma_1 sigma_2 \
rho sigma_1 sigma_2 & sigma^2_2
end{pmatrix} right)
$$
with conditionals
$$X_1 mid X_2=x_2 sim mathcal{N} left( mu_1 + rho frac{sigma_1}{sigma_2}(x_2 - mu_2),(1-rho^2)sigma^2_1 right)$$
and
$$X_2 mid X_1=x_1 sim mathcal{N} left( mu_2 + rho frac{sigma_2}{sigma_1}(x_1 - mu_1),(1-rho^2)sigma^2_2 right)$$
This means that
$$X_1=underbrace{left(mu_1-rho frac{sigma_1}{sigma_2}mu_2right)}_alpha+underbrace{rho frac{sigma_1}{sigma_2}}_beta X_2+sqrt{1-rho^2}sigma_1epsilon_1$$
and
$$X_2=underbrace{left(mu_2-rho frac{sigma_2}{sigma_1}mu_1right)}_kappa+underbrace{rho frac{sigma_2}{sigma_1}}_gamma X_1+sqrt{1-rho^2}sigma_2epsilon_2$$
which means (a) $gamma$ is not $1/beta$ and (b) the connection between the two regressions depends on the joint distribution of $(X_1,X_2)$.
How would I decide if the average of the two betas is a better measure of the hedge ratio than one or the other?
– ricardo
yesterday
4
I have no idea.
– Xi'an
yesterday
@ricardo By measuring the out-of-sample hedging error under each estimate, which is ultimately what you are trying to minimize.
– Chris Haug
23 hours ago
edited 5 hours ago
Martijn Weterings
answered yesterday
Xi'an
$\beta$ and $\gamma$
As Xi'an noted in his answer, $\beta$ and $\gamma$ are tied to the conditional means $X \mid Y$ and $Y \mid X$ (which in turn derive from a single joint distribution), and they are not symmetric in the sense that $\beta = 1/\gamma$. This remains true even if you 'knew' the true $\sigma$ and $\rho$ instead of using estimates. You have $$\beta = \rho_{XY} \frac{\sigma_Y}{\sigma_X}$$ and $$\gamma = \rho_{XY} \frac{\sigma_X}{\sigma_Y}$$
See also simple linear regression on Wikipedia for the computation of $\beta$ and $\gamma$.
It is this correlation term that disturbs the symmetry. If $\beta$ and $\gamma$ were simply the ratios of standard deviations, $\sigma_Y/\sigma_X$ and $\sigma_X/\sigma_Y$, they would indeed be each other's inverse. The $\rho_{XY}$ term can be seen as a sort of regression to the mean: with perfect correlation, $\rho_{XY} = 1$, you can fully predict $X$ from $Y$ or vice versa; but with $\rho_{XY} < 1$ you cannot make those perfect predictions, and the conditional mean lies somewhat closer to the unconditional mean than a simple scaling by $\sigma_Y/\sigma_X$ or $\sigma_X/\sigma_Y$ would put it.
Is a regression line the right method?
You may wonder whether these conditional probabilities and regression lines are what you need to determine your ratio of $X$ and $Y$. It is unclear to me how you would use a regression line to compute an optimal ratio.
Below is an alternative way to compute the ratio. This method is symmetric (i.e. if you switch $X$ and $Y$ you will get the same ratio).
Alternative
Say the yields of bonds $X$ and $Y$ are distributed according to a multivariate normal distribution$^\dagger$ with correlation $\rho_{XY}$ and standard deviations $\sigma_X$ and $\sigma_Y$. Then the yield of a hedge that is a weighted sum of $X$ and $Y$ will be normally distributed:
$$H = \alpha X + (1-\alpha) Y \sim N(\mu_H,\sigma_H^2)$$
where $0 \leq \alpha \leq 1$ and
$$\begin{array}{rcl}
\mu_H &=& \alpha \mu_X + (1-\alpha)\mu_Y \\
\sigma_H^2 &=& \alpha^2 \sigma_X^2 + (1-\alpha)^2 \sigma_Y^2 + 2\alpha(1-\alpha)\rho_{XY}\sigma_X\sigma_Y \\
&=& \alpha^2(\sigma_X^2 + \sigma_Y^2 - 2\rho_{XY}\sigma_X\sigma_Y) + \alpha(-2\sigma_Y^2 + 2\rho_{XY}\sigma_X\sigma_Y) + \sigma_Y^2
\end{array}$$
The maximum of the mean $\mu_H$ will be at $\alpha = 0$ or $\alpha = 1$, or will not exist when $\mu_X = \mu_Y$.
The minimum of the variance $\sigma_H^2$ will be at
$$\alpha = 1 - \frac{\sigma_X^2 - \rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2 + \sigma_Y^2 - 2\rho_{XY}\sigma_X\sigma_Y} = \frac{\sigma_Y^2 - \rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2 + \sigma_Y^2 - 2\rho_{XY}\sigma_X\sigma_Y}$$
The optimum will be somewhere between these two extremes and depends on how you wish to weigh losses against gains.
Note that there is now a symmetry between $\alpha$ and $1-\alpha$. It does not matter whether you use the hedge $H = \alpha_1 X + (1-\alpha_1)Y$ or the hedge $H = \alpha_2 Y + (1-\alpha_2)X$; you will get the same ratio, with $\alpha_1 = 1-\alpha_2$.
Minimal-variance case and relation to principal components
In the minimal-variance case (here you do not actually need to assume a multivariate normal distribution), the optimal hedge ratio is $$\frac{\alpha}{1-\alpha} = \frac{\operatorname{var}(Y) - \operatorname{cov}(X,Y)}{\operatorname{var}(X) - \operatorname{cov}(X,Y)}$$ which can be expressed in terms of the regression coefficients $\beta = \operatorname{cov}(X,Y)/\operatorname{var}(X)$ and $\gamma = \operatorname{cov}(X,Y)/\operatorname{var}(Y)$ as $$\frac{\alpha}{1-\alpha} = \frac{\operatorname{var}(Y)}{\operatorname{var}(X)}\cdot\frac{1-\gamma}{1-\beta}$$
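To make the minimal-variance case concrete, here is a small sketch (with assumed example numbers, not from the answer) that minimizes $\sigma_H^2$ over $\alpha$ numerically and checks the result against the closed-form expressions above:

```python
import numpy as np

# Assumed example parameters for the two yields.
var_x, var_y, cov_xy = 1.0, 2.25, 0.9

def hedge_var(a):
    # Variance of H = a*X + (1-a)*Y.
    return a**2 * var_x + (1 - a)**2 * var_y + 2 * a * (1 - a) * cov_xy

# Closed-form minimizer: alpha = (var(Y) - cov) / (var(X) + var(Y) - 2 cov).
a_star = (var_y - cov_xy) / (var_x + var_y - 2 * cov_xy)

# Numerical check on a fine grid over [0, 1].
grid = np.linspace(0.0, 1.0, 100_001)
a_num = grid[np.argmin(hedge_var(grid))]

print(a_star, a_num)                 # the two minimizers should agree closely
print(a_star / (1 - a_star),         # hedge ratio alpha / (1 - alpha) ...
      (var_y - cov_xy) / (var_x - cov_xy))  # ... equals (var(Y)-cov)/(var(X)-cov)
```

Swapping the roles of `var_x` and `var_y` inverts the ratio, illustrating the symmetry the answer describes.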
In a situation with more than two variables/stocks/bonds, you might generalize this to the last (smallest-eigenvalue) principal component.
Variants
Improvements to the model can be made by using distributions other than the multivariate normal. You could also incorporate time in a more sophisticated model to make better predictions of future values of the pair $X,Y$.
$^\dagger$ This is a simplification, but it suits the purpose of explaining how one can, and should, perform the analysis to find an optimal ratio without a regression line.
I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
– Martijn Weterings
4 hours ago
I think I understand. The problem is that $1/\rho_{XY} \ne \rho_{XY}$; indeed, $\rho_{XY}$ often changes quite a bit when we take the inverse. Your alternative is close to the case I am thinking about, but I do want to check one thing: does this allow negative holdings? Adopting your terminology, I'd have a unit holding of bond X and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y ... but it could be 0.2 units or 5 units, depending on the math.
– ricardo
4 hours ago
Long means that I make ~1% on a bond if the price increases by ~1%; short means that I lose ~1% on a bond if the price increases by ~1%. So the idea is that I am long one unit of one bond (so I benefit from an appreciation) and short some amount of the other bond (so I lose from an appreciation).
– ricardo
3 hours ago
"The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression how you decide about this. How do you define losses and gains and how much do you value them?
– Martijn Weterings
3 hours ago
Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
– Martijn Weterings
3 hours ago
edited 1 hour ago
answered 6 hours ago
Martijn Weterings
1
I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
– Martijn Weterings
4 hours ago
I think i understand. The problem is that 1/ρ_{XY} ne p_{XY}$. indeed, $p_{XY}$ often changes quite and bit when we take the inverse. Your alternative is close to the case I am thinking about, but i do want to check one thing: does this allow non-negative holdings? Adopting your terminology, i'd have a unit holding of bond X, and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y ... but it could be 0.2 units or 5 units, depending on the math.
– ricardo
4 hours ago
long means that i make 1% on a bond if the price increases by ~1%; short means that i lose ~1% on a bond if the price increases by ~1%. So the idea is that i am long one unit of one bond (so i benefit from an appreciation) and am short some amount of the other bond (so i lose from an appreciation).
– ricardo
3 hours ago
"The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression how you decide about this. How do you define losses and gains and how much do you value them?
– Martijn Weterings
3 hours ago
Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
– Martijn Weterings
3 hours ago
|
show 4 more comments
1
I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
– Martijn Weterings
4 hours ago
I think i understand. The problem is that 1/ρ_{XY} ne p_{XY}$. indeed, $p_{XY}$ often changes quite and bit when we take the inverse. Your alternative is close to the case I am thinking about, but i do want to check one thing: does this allow non-negative holdings? Adopting your terminology, i'd have a unit holding of bond X, and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y ... but it could be 0.2 units or 5 units, depending on the math.
– ricardo
4 hours ago
long means that i make 1% on a bond if the price increases by ~1%; short means that i lose ~1% on a bond if the price increases by ~1%. So the idea is that i am long one unit of one bond (so i benefit from an appreciation) and am short some amount of the other bond (so i lose from an appreciation).
– ricardo
3 hours ago
"The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression how you decide about this. How do you define losses and gains and how much do you value them?
– Martijn Weterings
3 hours ago
Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
– Martijn Weterings
3 hours ago
1
1
I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
– Martijn Weterings
4 hours ago
I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
– Martijn Weterings
4 hours ago
I think i understand. The problem is that 1/ρ_{XY} ne p_{XY}$. indeed, $p_{XY}$ often changes quite and bit when we take the inverse. Your alternative is close to the case I am thinking about, but i do want to check one thing: does this allow non-negative holdings? Adopting your terminology, i'd have a unit holding of bond X, and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y ... but it could be 0.2 units or 5 units, depending on the math.
– ricardo
4 hours ago
Long means that I make ~1% on a bond if the price increases by ~1%; short means that I lose ~1% on a bond if the price increases by ~1%. So the idea is that I am long one unit of one bond (so I benefit from an appreciation) and short some amount of the other bond (so I lose from an appreciation).
– ricardo
3 hours ago
"The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression of how you decide this. How do you define losses and gains, and how much do you value them?
– Martijn Weterings
3 hours ago
Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
– Martijn Weterings
3 hours ago
Perhaps the approach of "Granger causality" might help. It would let you assess whether X is a good predictor of Y, or whether Y is a better predictor of X. In other words, it tells you whether beta or gamma is the thing to take more seriously. Also, since you are dealing with time series data, it tells you how much of the history of X counts towards the prediction of Y (or vice versa).
Wikipedia gives a simple explanation:
A time series X is said to Granger-cause Y if it can be shown, usually through a series of t-tests and F-tests on lagged values of X (and with lagged values of Y also included), that those X values provide statistically significant information about future values of Y.
What you do is the following:
- regress Y(t) on X(t-1) and Y(t-1)
- regress Y(t) on X(t-1), X(t-2), Y(t-1), Y(t-2)
- regress Y(t) on X(t-1), X(t-2), X(t-3), Y(t-1), Y(t-2), Y(t-3)
Continue for whatever history length might be reasonable. Check the significance of the F-statistics for each regression.
Then do the same in reverse (so now regress X(t) on the past values of X and Y) and see which regressions have significant F-values.
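The one-lag version of this procedure can be sketched in a few lines. The snippet below is my own illustration on simulated data (it is not the linked R example); `ols_rss` and `granger_f` are hypothetical helper names, and in practice you would use a packaged Granger-test routine from a statistics library:

```python
import random

def ols_rss(xcols, y):
    """OLS of y on an intercept plus the given columns, via the normal
    equations; returns the residual sum of squares."""
    cols = [[1.0] * len(y)] + xcols
    k = len(cols)
    A = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
         for i in range(k)]
    c = [sum(a * b for a, b in zip(cols[i], y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        c[p], c[piv] = c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for j in range(p, k):
                A[r][j] -= f * A[p][j]
            c[r] -= f * c[p]
    b = [0.0] * k
    for p in range(k - 1, -1, -1):
        b[p] = (c[p] - sum(A[p][j] * b[j] for j in range(p + 1, k))) / A[p][p]
    fitted = [sum(b[j] * cols[j][i] for j in range(k)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def granger_f(x, y):
    """F-statistic for 'lagged x adds predictive power for y', one lag."""
    yt, y1, x1 = y[1:], y[:-1], x[:-1]
    rss_r = ols_rss([y1], yt)        # restricted:   y(t) ~ y(t-1)
    rss_u = ols_rss([y1, x1], yt)    # unrestricted: y(t) ~ y(t-1) + x(t-1)
    n = len(yt)
    return (rss_r - rss_u) / (rss_u / (n - 3))

random.seed(0)
n = 500
x = [random.gauss(0, 1) for _ in range(n)]
# y(t) is driven by x(t-1), so x should Granger-cause y, not vice versa.
y = [0.0]
for t in range(1, n):
    y.append(0.8 * x[t - 1] + random.gauss(0, 0.5))

f_xy = granger_f(x, y)  # does lagged x help predict y?
f_yx = granger_f(y, x)  # does lagged y help predict x?
print(f_xy, f_yx)       # expect f_xy to dwarf f_yx
```

Comparing each F-statistic against its F-distribution critical value (here with 1 and n-3 degrees of freedom) gives the significance test described above.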
A very straightforward example, with R code, is found here.
Granger causality has been critiqued for not actually establishing causality (in some cases). But it seems that your application is really about "predictive causality," which is exactly what the Granger causality approach is meant for.
The point is that the approach will tell you whether X predicts Y or whether Y predicts X (so you would no longer be tempted to artificially -- and incorrectly -- compound the two regression coefficients), and it gives you a better prediction (as you will know how much history of X and Y you need to predict Y), which is useful for hedging purposes, right?
I have a strong theoretical reason to believe that neither is truly a cause, and that even if one became a cause it would not remain true over time. So I don't think that Granger causality is the answer in this case. I've upvoted the answer in any case, as it is useful -- esp. the R code.
– ricardo
12 hours ago
That is why I explicitly mention that "Granger causality has been critiqued for not actually establishing causality (in some cases)." It seems to me that your question is more about establishing "predictive causality," which is what Granger causality is meant for. In addition, Granger's approach uses the information in your time series data, which are a waste not to use if you have them. Of course, you can (should?) re-estimate the effects over time. I expect that the Granger effects are more stable than cross-sectional OLS (you can test this beforehand, using historical data). HTH
– Steve G. Jones
8 hours ago
answered yesterday
Steve G. Jones
The problem is not causality but instead the errors of measurement (it is just that often the dependent variable Y is the one with large measurement error, making "Y = a + B x + error" the common expression) Do you have an idea about the errors in the measurement of X and Y.
– Martijn Weterings
yesterday
To determine causality you need a controlled experiment. An experiment where you are able to change some variable independently from the others. (or a very unique situation where two populations can be considered/assumed equal except for one or more particular variables that are to be considered as "independent" variables)
– Martijn Weterings
yesterday
The exact values of $\beta$ and $\gamma$ can be found in this answer of mine to Effect of switching responses and explanatory variables..., and, as you suspect, $\beta$ is not the reciprocal of $\gamma$, and averaging $\beta$ and $1/\gamma$ is not the right way to go. A pictorial view of what $\beta$ and $\gamma$ are minimizing is given in Elvis's answer to the same question, and he introduces a "least rectangles" regression that you might want .....
– Dilip Sarwate
yesterday
You are in the ideal scenario where the choice of technique has a direct, physically measurable impact; you can simply measure the out-of-sample hedging error for each estimate, and compare them. Also, typically optimal hedging is better handled by using a VECM model (see for example Gatarek & Johansen, 2014, Optimal hedging with the cointegrated vector autoregressive model), which does not require choosing to model Y as a function of X or vice-versa.
– Chris Haug
23 hours ago
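To act on this suggestion, the candidate hedge ratios can be compared directly by their out-of-sample hedging error. Below is a minimal sketch on simulated returns (the data-generating numbers are arbitrary assumptions, and plain OLS is used rather than the VECM the comment recommends):

```python
import random

random.seed(2)
n = 600
# Simulated returns for two related bonds (arbitrary toy parameters).
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.7 * xi + random.gauss(0, 0.7) for xi in x]

train_x, train_y = x[:400], y[:400]
test_x, test_y = x[400:], y[400:]

def slope(u, v):
    """OLS slope of v ~ u."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return (sum((a - mu) * (b - mv) for a, b in zip(u, v))
            / sum((a - mu) ** 2 for a in u))

beta = slope(train_x, train_y)    # from Y ~ X
gamma = slope(train_y, train_x)   # from X ~ Y
candidates = {"beta": beta, "1/gamma": 1 / gamma,
              "geometric mean": (beta / gamma) ** 0.5}

def hedge_variance(h):
    """Out-of-sample variance of the hedged position y - h * x."""
    resid = [yi - h * xi for xi, yi in zip(test_x, test_y)]
    m = sum(resid) / len(resid)
    return sum((r - m) ** 2 for r in resid) / len(resid)

errors = {name: hedge_variance(h) for name, h in candidates.items()}
print(errors)  # expect beta to give the smallest hedged variance here
```

Since $\beta$ is the minimum-variance hedge for a position that is long one unit of Y, it comes out best on this simulated comparison; with real data and a drifting relationship, the out-of-sample comparison is the point of the exercise.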
You might want to look at the geometric mean $\sqrt{\dfrac{\beta}{\gamma}}$ as a possibility (if they are both negative you might take the negative square root). Then look at $\dfrac{s_y}{s_x}$, which should be very similar.
– Henry
21 hours ago
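This can be checked numerically. Since $\beta = r\,s_y/s_x$ and $\gamma = r\,s_x/s_y$, we have $\beta\gamma = r^2$, and hence the geometric mean of $\beta$ and $1/\gamma$ is $\sqrt{\beta/\gamma} = s_y/s_x$ exactly. A quick sketch on simulated data:

```python
import random

random.seed(1)
n = 1000
x = [random.gauss(0, 2) for _ in range(n)]
y = [1.5 * xi + random.gauss(0, 1) for xi in x]

mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

beta = sxy / sxx    # slope of Y ~ X
gamma = sxy / syy   # slope of X ~ Y
r2 = sxy ** 2 / (sxx * syy)

print(beta * gamma, r2)          # equal: beta * gamma = r^2, so beta != 1/gamma
print((beta / gamma) ** 0.5,     # geometric mean of beta and 1/gamma ...
      (syy / sxx) ** 0.5)        # ... equals s_y / s_x exactly
```

This also makes explicit why averaging $\beta$ and $1/\gamma$ mixes two quantities that differ by a factor of $r^2$: they only coincide when the correlation is perfect.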