What is the formula for projection onto spectraplex?












A spectraplex (a special case of a spectrahedron) is the set of all positive semi-definite matrices whose trace is equal to one. Formally, let
$$
S=\{\textbf{W} \in \mathbb{R}^{d \times d} \mid \textbf{W} \succeq 0,\ \text{Tr}(\textbf{W})=1\}.
$$



On page 7, Lemma 2 of On the Regret Minimization of Nonconvex Online Gradient Ascent for Online PCA, the paper says "it is well known that the projection of $\textbf{W}$ onto $S$ is given by"
$$
\Pi_S[\textbf{W}]=(\lambda_1(\textbf{W})-\lambda)w'w'^T+ \sum_{i=2}^{d}\max\{0,\lambda_i(\textbf{W})-\lambda\}y_iy_i^T
$$



where $(w',y_2,y_3,\cdots,y_d)$ are eigenvectors of $\textbf{W}$ corresponding to descending eigenvalues such that



$$
\lambda_1(\textbf{W})-\lambda+ \sum_{i=2}^{d}\max\{0,\lambda_i(\textbf{W})-\lambda\}=1
$$

for some $\lambda \geq 0$.
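
For a concrete sanity check of how I read the formula: with $\textbf{W}=\mathrm{diag}(2,\,0.5)$, so $\lambda_1(\textbf{W})=2$ and $\lambda_2(\textbf{W})=0.5$, the side condition becomes $(2-\lambda)+\max\{0,\,0.5-\lambda\}=1$, which is solved by $\lambda=1$, and the formula then gives $\Pi_S[\textbf{W}]=\mathrm{diag}(1,\,0)$.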



Where does the above come from?



Is there any reference that explains projections like this?







eigenvalues-eigenvectors convex-optimization positive-semidefinite projection
















asked Dec 27 '18 at 3:46









Saeed













  • I believe that you can reduce this problem to the problem of projecting onto the unit simplex in $R^{n}_{+}$ by using the orthogonality of the $y_{i}y_{i}^{T}$ with respect to the trace inner product. The equation for $\lambda$ is the same equation that appears in the commonly used algorithm for computing the projection on the unit simplex in $R^{n}_{+}$.
    – Brian Borchers
    Jan 5 at 16:15


















1 Answer
First, note that no such $\lambda$ exists if



$\sum_{i=1}^{d} \lambda_{i} < 1$.



For example, the method fails if $W=0$ or if $W=(1/(2d))I$: in the latter case every $\lambda_{i}=1/(2d)$, so the equation $(1/(2d)-\lambda)+\sum_{i=2}^{d}\max\{0,1/(2d)-\lambda\}=1$ would force $\lambda=-1/(2d)<0$. It appears that, in the context where this is presented in the paper that you cited, we can be sure that the sum of the $\lambda_{i}$ is large enough.



I believe that if you allow for negative values of $\lambda$ the method will actually produce correct results for matrices $W$ that have $\sum_{i=1}^{d} \lambda_{i}<1$.



In the following, we'll assume that



$\sum_{i=1}^{d} \lambda_{i} \geq 1$.



Since $W$ is symmetric, we can write it in terms of eigenvalues and normalized eigenvectors as



$W=\sum_{i=1}^{d} \lambda_{i}\, y^{(i)}{y^{(i)}}^{T}$



or



$W=\sum_{i=1}^{d} \lambda_{i} Y^{(i)}$



where



$Y^{(i)}=y^{(i)}{y^{(i)}}^{T}$.



We need to be specific about the matrix inner product and norm that we're projecting with respect to. Since $W$ lives in the space of symmetric matrices, the appropriate inner product is



$\langle A, B \rangle = \mbox{tr}(AB)$



and the norm is



$\| A \|_{F}=\sqrt{\langle A, A \rangle}$.



Note that with respect to this inner product, the $Y^{(i)}$ matrices are orthogonal because the $y^{(i)}$ eigenvectors are orthogonal: for $i \neq j$,



$\langle Y^{(i)}, Y^{(j)} \rangle=\mbox{tr}(y^{(i)}{y^{(i)}}^{T}y^{(j)}{y^{(j)}}^{T})$



$\langle Y^{(i)}, Y^{(j)} \rangle=\mbox{tr}(y^{(i)}\,0\,{y^{(j)}}^{T})=0$,



since ${y^{(i)}}^{T}y^{(j)}=0$.



Also,



$\| Y^{(i)} \|_{F}^{2}=\mbox{tr}(Y^{(i)}Y^{(i)})=\mbox{tr}(y^{(i)}{y^{(i)}}^{T} y^{(i)}{y^{(i)}}^{T})=\mbox{tr}(y^{(i)}{y^{(i)}}^{T})=\mbox{tr}({y^{(i)}}^{T}y^{(i)})=1$.
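
Here is a quick numerical illustration of this orthonormality (a NumPy sketch, not needed for the argument):

```python
import numpy as np

# For orthonormal vectors y^(i), the rank-one matrices Y^(i) = y^(i) y^(i)^T
# are orthonormal under the trace inner product <A, B> = tr(AB).
rng = np.random.default_rng(0)
d = 5
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # columns = orthonormal y^(i)
Y = [np.outer(Q[:, i], Q[:, i]) for i in range(d)]
G = np.array([[np.trace(Y[i] @ Y[j]) for j in range(d)] for i in range(d)])
print(np.allclose(G, np.eye(d)))                   # True: <Y^(i), Y^(j)> = delta_ij
```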



The desired projection is



$\Pi_{S}(W)=\arg\min_{Y} \| W-Y \|_{F}^{2}$



subject to the constraints



$\mbox{tr}(Y)=1$



$Y \succeq 0$.



Write $Y$ as a linear combination of the $Y^{(i)}$ matrices plus an additional component that is in the orthogonal complement of the span of the $Y^{(i)}$.



$Y=Y^{(0)}+\sum_{i=1}^{d} \alpha_{i} Y^{(i)}$



Now, $W-Y=\left(W-\sum_{i=1}^{d}\alpha_{i}Y^{(i)}\right)-Y^{(0)}$, where the first term lies in the span of the $Y^{(i)}$ and $Y^{(0)}$ is orthogonal to that span (and to $W$), so by the Pythagorean theorem,



$\| W-Y \|_{F}^{2}=\left\| W- \sum_{i=1}^{d} \alpha_{i} Y^{(i)} \right\|_{F}^{2}+\left\| Y^{(0)} \right\|_{F}^{2}$



$\| W-Y \|_{F}^{2}=\left\| \sum_{i=1}^{d} (\lambda_{i}-\alpha_{i}) Y^{(i)} \right\|_{F}^{2}+\left\| Y^{(0)} \right\|_{F}^{2}$.



Using the orthogonality of the $Y^{(i)}$ matrices, we get that



$\| W-Y \|_{F}^{2}=\sum_{i=1}^{d} (\lambda_{i}-\alpha_{i})^{2}+\left\| Y^{(0)} \right\|_{F}^{2}$.



Since



$\mbox{tr}(Y)=\mbox{tr}(Y^{(0)})+\sum_{i=1}^{d} \alpha_{i}$,



and



$\sum_{i=1}^{d} \alpha_{i} \leq \sum_{i=1}^{d} \lambda_{i}$,



using a non-zero value of $Y^{(0)}$ can only increase this objective. Thus it's optimal to set $Y^{(0)}=0$.



We've now reduced the problem to



$\min \sum_{i=1}^{d} (\lambda_{i}-\alpha_{i})^{2}$



subject to



$\sum_{i=1}^{d} \alpha_{i}=1$



$\alpha \geq 0$.
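
This is exactly the problem of projecting the eigenvalue vector $(\lambda_{1},\ldots,\lambda_{d})$ onto the unit simplex. A minimal sketch of the commonly used sorting-based routine for this subproblem (my own NumPy code and variable names, not from the paper; the threshold $\theta$ is the $\mu/2$ derived below):

```python
import numpy as np

def project_simplex(lam):
    """Project lam onto {a : a >= 0, sum(a) = 1}: sort, find the largest k whose
    top-k entries stay positive after the common shift, then clip at zero."""
    u = np.sort(lam)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0                    # cumulative sums minus the target 1
    k = np.arange(1, len(lam) + 1)
    rho = np.max(np.nonzero(u - css / k > 0)[0]) + 1   # number of positive entries
    theta = css[rho - 1] / rho                  # optimal shift (the mu/2 below)
    return np.maximum(lam - theta, 0.0)

print(project_simplex(np.array([0.9, 0.5, 0.3, -0.2])))  # nonnegative, sums to 1
```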



It's easy to see that the equation given in the paper results from applying the KKT conditions to this problem.



If we let $\mu$ be the Lagrange multiplier for the constraint $\sum_{i=1}^{d}\alpha_{i}=1$ and $\nu_{i}$ be the Lagrange multiplier for the constraint $\alpha_{i} \geq 0$, then the $i$th component of the KKT conditions depends on whether $\alpha_{i}>0$, in which case $\nu_{i}=0$, or $\alpha_{i}=0$, in which case $\nu_{i}$ is free to be nonzero.



In the case $\alpha_{i}>0$ and $\nu_{i}=0$, the stationarity condition is



$-2(\lambda_{i}-\alpha_{i})+\mu=0$



or



$\lambda_{i}-\alpha_{i}=\mu/2$.



In the case where $\alpha_{i}=0$ and $\nu_{i}$ is positive, the equation can be satisfied by adjusting $\nu_{i}$. Next, we use the constraint



$\sum_{\alpha_{i}>0} \alpha_{i}=1$



to get



$\sum_{\alpha_{i}>0} (\lambda_{i}-\mu/2)=1$.



Adding in the terms for $\alpha_{i}=0$, we get



$\sum_{i=1}^{d} \max(0,\lambda_{i}-\mu/2)=1$.



Once $\mu$ has been determined, let



$\alpha_{i}=\max(0,\lambda_{i}-\mu/2)$ for $i=1, 2, \ldots, d$.
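
Putting the pieces together, here is a minimal sketch of the whole procedure (my own NumPy code, with $\mu/2$ found by bisection; allowing it to be negative also covers the $\sum_{i}\lambda_{i}<1$ case mentioned at the start):

```python
import numpy as np

def spectraplex_projection(W, tol=1e-12):
    """Eigendecompose W, find the shift t = mu/2 solving sum_i max(0, lam_i - t) = 1
    by bisection, clip the eigenvalues, and reassemble Pi_S(W) = sum_i alpha_i y_i y_i^T."""
    lam, V = np.linalg.eigh(W)                       # W assumed symmetric
    lo, hi = lam.min() - 1.0 / len(lam), lam.max()   # bracket: g(lo) >= 1, g(hi) = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.maximum(lam - mid, 0.0).sum() > 1.0:
            lo = mid
        else:
            hi = mid
    alpha = np.maximum(lam - 0.5 * (lo + hi), 0.0)   # clipped eigenvalues
    return (V * alpha) @ V.T                         # V diag(alpha) V^T

# Sanity check: the result should have unit trace and be positive semidefinite.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)); W = (A + A.T) / 2
P = spectraplex_projection(W)
print(round(np.trace(P), 6), np.linalg.eigvalsh(P).min() >= -1e-9)
```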






answered Jan 6 at 17:05 (edited Jan 6 at 18:20)
– Brian Borchers



  • Wow, it is a very rigorous proof.
    – Sepide
    Jan 7 at 19:01












