2. Section-by-section close reading of the paper

2.1. Abstract

        ①The authors propose Multi-view Multi-graph Embedding (M2E), which learns embeddings by combining information from different views

2.2. Introduction

        ①The conceptual view of M2E:

2.3. Related work

        ①Introducing graph embedding methods

        ②Comparison with multi-view clustering and multi-view embedding methods

2.4. Preliminaries


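        ②Definition 1: introduces the partial symmetric tensor. (I don't think the authors explain it very clearly, though. They say "an M-th order tensor is a rank-one partial symmetric tensor if it is partial symmetric on modes 1 to M". My supplementary notes in Section 3.1 may help more.)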

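        ③Definition 2: the mode-m matricization of a tensor \mathcal{X}\in\mathbb{R}^{I_{1}\times\cdots\times I_{M}} is \mathbf{X}_{(m)}\in \mathbb{R}^{I_m\times J} with J=\prod_{q=1,q\neq m}^{M}I_q, where entry (i_1,\cdots,i_M) of \mathcal{X} maps to entry (i_m,j) of \mathbf{X}_{(m)} with

\begin{aligned}&j=1+\sum_{p=1,p\neq m}^{M}(i_{p}-1)J_{p},\quad\text{with}\\&J_{p}=\begin{cases}1,&\text{if } p=1 \text{ or } (p=2 \text{ and } m=1)\\\prod_{q=1,q\neq m}^{p-1}I_q,&\text{otherwise.}\end{cases}\end{aligned}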
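
To make the index formula in Definition 2 concrete, here is a minimal NumPy sketch. It uses 0-based indices, so the leading 1 and the (i_p - 1) offsets of the formula disappear. The function names `unfold` and `column_index` are my own and do not come from the paper.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` matricization (0-based mode index)."""
    # Bring the chosen mode to the front, then flatten the remaining modes in
    # Fortran order so that the earliest remaining index varies fastest.
    moved = np.moveaxis(tensor, mode, 0)
    return np.reshape(moved, (tensor.shape[mode], -1), order="F")

def column_index(idx, shape, mode):
    """0-based column j = sum over p != mode of i_p * J_p."""
    j, stride = 0, 1  # stride plays the role of J_p
    for p, (i_p, size) in enumerate(zip(idx, shape)):
        if p == mode:
            continue
        j += i_p * stride
        stride *= size  # J_{p+1} = J_p * I_p, skipping the unfolded mode
    return j

# Toy tensor: check that one entry lands where the formula says it should.
X = np.arange(24).reshape(2, 3, 4)
idx = (1, 2, 3)
X_1 = unfold(X, 1)  # shape (3, 8)
assert X_1[idx[1], column_index(idx, X.shape, 1)] == X[idx]
```

The Fortran-order reshape is what makes the lowest remaining mode run fastest along the columns, which is the convention the formula encodes.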

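        ④Definition 3: the CP factorization approximates \mathcal{X}\in\mathbb{R}^{I_{1}\times\cdots\times I_{M}} by a sum of R rank-one tensors:

\mathcal{X}\approx\sum_{r=1}^{R}\mathbf{x}_{r}^{(1)}\circ\cdots\circ\mathbf{x}_{r}^{(M)}\equiv[[\mathbf{X}^{(1)},\cdots,\mathbf{X}^{(M)}]]

where \mathbf{X}^{(m)}=[\mathbf{x}_{1}^{(m)},\cdots,\mathbf{x}_{R}^{(m)}]\in\mathbb{R}^{I_m\times R} are the factor matrices and \circ denotes the outer product. The factors are found by minimizing the estimation error:

\min_{\mathbf{X}^{(1)},\cdots,\mathbf{X}^{(M)}}\|\mathcal{X}-[[\mathbf{X}^{(1)},\cdots,\mathbf{X}^{(M)}]]\|_F^2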

Since this problem is non-convex, it is solved by alternating least squares (ALS), updating one factor matrix at a time with the others fixed:

\mathbf{X}^{(k)}\leftarrow\arg\min_{\mathbf{X}^{(k)}}\|\mathbf{X}_{(k)}-\mathbf{X}^{(k)}(\odot_{i\neq k}^M\mathbf{X}^{(i)})^\mathrm{T}\|_F^2

where \odot denotes the Khatri-Rao product and \odot_{i\neq k}^{M}\mathbf{X}^{(i)}=\mathbf{X}^{(M)}\odot\cdots\odot\mathbf{X}^{(k+1)}\odot\mathbf{X}^{(k-1)}\odot\cdots\odot\mathbf{X}^{(1)}
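
The alternating update above can be sketched in a few lines of NumPy. This is my own minimal illustration, not the authors' code: the helper names (`khatri_rao`, `unfold`, `cp_als`), the random initialization and the fixed iteration count are all assumptions, and there is no convergence check or normalization.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x R) and (J x R) -> (I*J x R)."""
    R = A.shape[1]
    # Row i*J + j holds A[i, :] * B[j, :], so B's row index varies fastest.
    return (A[:, None, :] * B[None, :, :]).reshape(-1, R)

def unfold(tensor, mode):
    """Mode-`mode` matricization, lowest remaining mode varying fastest."""
    moved = np.moveaxis(tensor, mode, 0)
    return np.reshape(moved, (tensor.shape[mode], -1), order="F")

def cp_als(X, rank, n_iter=50, seed=0):
    """Plain CP-ALS: each factor is refit by least squares in turn."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in X.shape]
    for _ in range(n_iter):
        for k in range(X.ndim):
            # Khatri-Rao product of all other factors, highest mode first.
            others = [factors[i] for i in reversed(range(X.ndim)) if i != k]
            kr = others[0]
            for Fm in others[1:]:
                kr = khatri_rao(kr, Fm)
            # Solve min || X_(k) - factors[k] @ kr.T ||_F for factors[k].
            factors[k] = np.linalg.lstsq(kr, unfold(X, k).T, rcond=None)[0].T
    return factors

# Usage on a toy tensor built from known rank-2 factors.
rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((n, 2)) for n in (4, 5, 6))
X = np.einsum("ir,jr,kr->ijk", A, B, C)
estimated = cp_als(X, rank=2)
```

Each inner step is an exact least-squares solve, so the fitting error cannot increase from one update to the next. ALS can still stall in poor local solutions, which is why real implementations add restarts or better initialization.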

2.5. Methodology

2.5.1. Problem definition

        ①There are N subjects and V views. For each view, every subject has a brain connectivity matrix \mathbf{W}\in\mathbb{R}^{M\times M} over M nodes

        ②For each view, the whole graph set is \mathcal{D}^{(v)}=\{\mathbf{W}_{1}^{(v)},\mathbf{W}_{2}^{(v)},\cdots,\mathbf{W}_{N}^{(v)}\}

        ③All the views: \mathcal{D} = \{\mathcal{D}^{(1)},\mathcal{D}^{(2)},\cdots,\mathcal{D}^{(V)}\}

        ④Goal: learn a common embedding \mathbf{F}^*\in\mathbb{R}^{N\times R} shared by all views, with one R-dimensional row per participant

2.5.2. M2E approach

        ①Concatenated third-order tensor: 

\mathcal{X}^{(v)}=[\mathbf{W}_1^{(v)},\mathbf{W}_2^{(v)},\cdots,\mathbf{W}_N^{(v)}]\in \mathbb{R}^{M\times M\times N},v \in [1 : V]
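
As a small illustration of this stacking step, the sketch below builds one view's tensor from toy connectivity matrices. The sizes and the random symmetric matrices are made up for the example (I am assuming undirected, hence symmetric, connectivity), and `random_connectivity` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 4  # toy numbers of subjects and nodes, not the paper's data

def random_connectivity(m):
    """Hypothetical stand-in for one subject's symmetric connectivity matrix."""
    A = rng.random((m, m))
    W = (A + A.T) / 2          # symmetrize the edge weights
    np.fill_diagonal(W, 0.0)   # no self-connections
    return W

graphs = [random_connectivity(M) for _ in range(N)]

# Stack the N graphs of one view along a third mode: shape (M, M, N).
X_v = np.stack(graphs, axis=2)

# Every frontal slice is symmetric, so the tensor is unchanged when the two
# node modes are swapped. This is the partial symmetry M2E relies on.
assert X_v.shape == (M, M, N)
assert np.allclose(X_v, X_v.transpose(1, 0, 2))
```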

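        ②Embedding function: each view is factorized with a partially symmetric CP model that shares one factor across the two node modes:

\min_{\mathbf{H}^{(v)},\mathbf{F}^{(v)}}||\mathcal{X}^{(v)}-[[\mathbf{H}^{(v)},\mathbf{H}^{(v)},\mathbf{F}^{(v)}]]||_{F}^{2}

where \mathbf{H}^{(v)}\in\mathbb{R}^{M\times R} is the node factor matrix and \mathbf{F}^{(v)}\in\mathbb{R}^{N\times R} is the view-specific subject embedding, both obtained by CP factorization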

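        ③Common embedding learning: the view-specific embeddings are pulled toward a shared \mathbf{F}^{*}:

\min_{\mathbf{F}^{*}}\sum_{v=1}^{V}\lambda_{(v)}||\mathbf{F}^{(v)}-\mathbf{F}^{*}||_{F}^{2}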

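        ④Combining them gives the M2E objective:

\min_{\mathbf{H}^{(v)},\mathbf{F}^{(v)},\mathbf{F}^{*}}\mathcal{O}=\sum_{v=1}^{V}||\mathcal{X}^{(v)}-[[\mathbf{H}^{(v)},\mathbf{H}^{(v)},\mathbf{F}^{(v)}]]||_{F}^{2}+\sum_{v=1}^{V}\lambda_{(v)}||\mathbf{F}^{(v)}-\mathbf{F}^{*}||_{F}^{2}

where the first term models the dependence among the multiple graphs within each view and the second models the dependence across views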

2.5.3. Optimization framework

        ①Parameters to estimate: \mathbf{H}^{(v)}\in\mathbb{R}^{M\times R}, \mathbf{F}^{(v)}\in\mathbb{R}^{N\times R}, and \mathbf{F}^{*}\in\mathbb{R}^{N\times R}. Because the objective is not jointly convex in these variables, there is no closed-form solution, so the authors adopt an iterative scheme based on the Alternating Direction Method of Multipliers (ADMM).

        ②Using a variable substitution technique, they introduce an auxiliary variable \mathbf{P}^{(v)} and, with \mathbf{F}^{(v)} and \mathbf{F}^{*} fixed, solve for \mathbf{H}^{(v)}:

\begin{aligned}&\min_{\mathbf{H}^{(v)},\mathbf{P}^{(v)}}||\mathcal{X}^{(v)}-[[\mathbf{H}^{(v)},\mathbf{P}^{(v)},\mathbf{F}^{(v)}]]||_{F}^{2}\\&\text{s.t. }\mathbf{H}^{(v)}= \mathbf{P}^{(v)}\end{aligned}

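The augmented Lagrangian function is:

\mathcal{L}(\mathbf{H}^{(v)},\mathbf{P}^{(v)})=||\mathcal{X}^{(v)}-[[\mathbf{H}^{(v)},\mathbf{P}^{(v)},\mathbf{F}^{(v)}]]||_{F}^{2}+\mathrm{tr}(\mathbf{U}^{(v)^{\mathrm{T}}}(\mathbf{H}^{(v)}-\mathbf{P}^{(v)}))+\frac{\mu}{2}||\mathbf{H}^{(v)}-\mathbf{P}^{(v)}||_{F}^{2}

where \mathbf{U}^{(v)}\in\mathbb{R}^{M\times R} denotes the Lagrange multipliers and \mu denotes the penalty parameter. With \mathbf{P}^{(v)} and \mathbf{U}^{(v)} fixed, the optimization problem for \mathbf{H}^{(v)} is:

\min_{\mathbf{H}^{(v)}}||\mathcal{X}^{(v)}-[[\mathbf{H}^{(v)},\mathbf{P}^{(v)},\mathbf{F}^{(v)}]]||_{F}^{2}+\mathrm{tr}(\mathbf{U}^{(v)^{\mathrm{T}}}(\mathbf{H}^{(v)}-\mathbf{P}^{(v)}))+\frac{\mu}{2}||\mathbf{H}^{(v)}-\mathbf{P}^{(v)}||_{F}^{2}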

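They unfold \mathcal{X}^{(v)} along mode 1 into \mathbf{X}_{(1)}^{(v)}\in\mathbb{R}^{M\times(MN)} and define \mathbf{D}^{(v)}=\mathbf{F}^{(v)}\odot\mathbf{P}^{(v)}\in\mathbb{R}^{(NM)\times R}, so that the reconstruction term becomes ||\mathbf{X}_{(1)}^{(v)}-\mathbf{H}^{(v)}\mathbf{D}^{(v)^{\mathrm{T}}}||_{F}^{2}. Dropping the terms that do not depend on \mathbf{H}^{(v)}, the minimization becomes:

\min_{\mathbf{H}^{(v)}}\mathrm{tr}(\mathbf{H}^{(v)}\mathbf{A}^{(v)}\mathbf{H}^{(v)^{\mathrm{T}}})-\mathrm{tr}(\mathbf{B}^{(v)}\mathbf{H}^{(v)^{\mathrm{T}}})

where \mathbf{A}^{(v)}=\mathbf{D}^{(v)^{\mathrm{T}}}\mathbf{D}^{(v)}+\frac{\mu}{2}\mathbf{I} and \mathbf{B}^{(v)}=2\mathbf{X}_{(1)}^{(v)}\mathbf{D}^{(v)}+\mu\mathbf{P}^{(v)}-\mathbf{U}^{(v)}. It is solved by a gradient step on \mathbf{H}^{(v)}:

\mathbf{H}^{(v)}\leftarrow\mathbf{H}^{(v)}-\frac{1}{L^{(v)}}(2\mathbf{H}^{(v)}\mathbf{A}^{(v)}-\mathbf{B}^{(v)})

where L^{(v)} denotes the Lipschitz coefficient of the gradient and equals the maximum eigenvalue of 2\mathbf{A}^{(v)}. The Gram matrix \mathbf{D}^{(v)^\mathrm{T}}\mathbf{D}^{(v)} is computed without forming \mathbf{D}^{(v)} explicitly, using a property of the Khatri-Rao product: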

\begin{aligned} \mathbf{D}^{(v)^{\mathrm{T}}}\mathbf{D}^{(v)}& =(\mathbf{F}^{(v)}\odot\mathbf{P}^{(v)})^{\mathrm{T}}(\mathbf{F}^{(v)}\odot\mathbf{P}^{(v)}) \\ &=(\mathbf{F}^{(v)^{\mathrm{T}}}\mathbf{F}^{(v)})*(\mathbf{P}^{(v)^{\mathrm{T}}}\mathbf{P}^{(v)}) \end{aligned}

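where \ast denotes the Hadamard product. The updating function of \mathbf{P}^{(v)}:

\mathbf{P}^{(v)}\leftarrow\mathbf{P}^{(v)}-\frac{1}{L^{(v)}}(2\mathbf{P}^{(v)}\mathbf{A}^{(v)}-\mathbf{B}^{(v)})

where, for this step, \mathbf{A}^{(v)}=\mathbf{E}^{(v)^{\mathrm{T}}}\mathbf{E}^{(v)}+\frac\mu2\mathbf{I}, \mathbf{B}^{(v)}=2\mathbf{X}_{(2)}^{(v)}\mathbf{E}^{(v)}+\mu\mathbf{H}^{(v)}+\mathbf{U}^{(v)}, \mathbf{E}^{(v)}=\mathbf{F}^{(v)}\odot\mathbf{H}^{(v)}\in\mathbb{R}^{(NM)\times R}, and L^{(v)} is the maximum eigenvalue of this 2\mathbf{A}^{(v)}. Lastly, update the multipliers \mathbf{U}^{(v)}:

\mathbf{U}^{(v)}\leftarrow\mathbf{U}^{(v)}+\mu(\mathbf{H}^{(v)}-\mathbf{P}^{(v)})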
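
To see how step ② fits together, here is a minimal NumPy sketch of one ADMM pass for a single view. It is my own reading, not the authors' implementation: I assume each primal update is a single gradient step of size 1/L^{(v)} on the quadratic tr(· A ·^T) - tr(B ·^T), which is consistent with the A, B and Lipschitz coefficient defined above, and that the multiplier update is the standard ADMM ascent step. All function names are hypothetical.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x R) and (J x R) -> (I*J x R)."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

def unfold(tensor, mode):
    """Mode-`mode` matricization, lowest remaining mode varying fastest."""
    moved = np.moveaxis(tensor, mode, 0)
    return np.reshape(moved, (tensor.shape[mode], -1), order="F")

def gram_kr(A, B):
    """(A kr B)^T (A kr B) computed as the Hadamard product of two Grams."""
    return (A.T @ A) * (B.T @ B)

def gradient_step(Z, A, B):
    """One step of size 1/L on tr(Z A Z^T) - tr(B Z^T), L = max eig of 2A."""
    L = np.linalg.eigvalsh(2 * A)[-1]  # eigenvalues come back in ascending order
    return Z - (2 * Z @ A - B) / L

def update_H(X, H, P, F, U, mu):
    D = khatri_rao(F, P)                                  # (N*M, R)
    A = gram_kr(F, P) + 0.5 * mu * np.eye(H.shape[1])
    B = 2 * unfold(X, 0) @ D + mu * P - U
    return gradient_step(H, A, B)

def update_P(X, H, P, F, U, mu):
    E = khatri_rao(F, H)                                  # (N*M, R)
    A = gram_kr(F, H) + 0.5 * mu * np.eye(P.shape[1])
    B = 2 * unfold(X, 1) @ E + mu * H + U
    return gradient_step(P, A, B)

def update_U(U, H, P, mu):
    # Assumed standard ADMM multiplier update for the constraint H = P.
    return U + mu * (H - P)

# One pass on toy data (sizes are arbitrary).
rng = np.random.default_rng(0)
M, N, R, mu = 4, 5, 3, 1.0
S = rng.random((M, M, N))
X = (S + S.transpose(1, 0, 2)) / 2          # partially symmetric toy tensor
H, P, U = (rng.standard_normal((M, R)) for _ in range(3))
F = rng.standard_normal((N, R))
H = update_H(X, H, P, F, U, mu)
P = update_P(X, H, P, F, U, mu)
U = update_U(U, H, P, mu)
```

Because the step size is the inverse Lipschitz constant of each quadratic's gradient, neither primal step can increase the augmented Lagrangian. `gram_kr` avoids multiplying the tall (NM) x R Khatri-Rao matrices when only their Gram matrix is needed.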

        ③Then they fix \mathbf{F}^{*} and \mathbf{H}^{(v)} and compute \mathbf{F}^{(v)} by minimizing:

\min_{\mathbf{F}^{(v)}} ||\mathbf{X}_{(3)}^{(v)}-\mathbf{F}^{(v)}\mathbf{J}^{(v)^{\mathrm{T}}}||_{F}^{2}+\lambda_{(v)}||\mathbf{F}^{(v)}-\mathbf{F}^{*}||_{F}^{2}

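where \mathbf{J}^{(v)}=\mathbf{P}^{(v)}\odot\mathbf{H}^{(v)}\in\mathbb{R}^{(MM)\times R}. The updating function of \mathbf{F}^{(v)}:

\mathbf{F}^{(v)}\leftarrow\mathbf{F}^{(v)}-\frac{1}{L^{(v)}}(2\mathbf{F}^{(v)}\mathbf{A}^{(v)}-\mathbf{B}^{(v)})

where \mathbf{A}^{(v)} = \mathbf{J}^{(v)^\mathrm{T}}\mathbf{J}^{(v)} + \lambda_{(v)}\mathbf{I}, \mathbf{B}^{(v)} = 2\mathbf{X}_{(3)}^{(v)}\mathbf{J}^{(v)} +2\lambda_{(v)}\mathbf{F}^*, and L^{(v)} is the maximum eigenvalue of 2\mathbf{A}^{(v)}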
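
The \mathbf{F}^{(v)} step can be sketched the same way. Again this is only my illustration: I assume a single gradient step of size 1/L^{(v)} with L^{(v)} the largest eigenvalue of 2\mathbf{A}^{(v)}, and I add the closed-form minimizer of the same quadratic purely as a cross-check that is not part of the notes above. `update_F`, `solve_F` and `objective_F` are hypothetical names.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x R) and (J x R) -> (I*J x R)."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

def unfold(tensor, mode):
    """Mode-`mode` matricization, lowest remaining mode varying fastest."""
    moved = np.moveaxis(tensor, mode, 0)
    return np.reshape(moved, (tensor.shape[mode], -1), order="F")

def quadratic_terms(X, H, P, F_star, lam):
    J = khatri_rao(P, H)                                   # (M*M, R)
    A = (P.T @ P) * (H.T @ H) + lam * np.eye(H.shape[1])   # J^T J + lam * I
    B = 2 * unfold(X, 2) @ J + 2 * lam * F_star
    return A, B

def update_F(X, H, P, F, F_star, lam):
    """One gradient step of size 1/L on tr(F A F^T) - tr(B F^T)."""
    A, B = quadratic_terms(X, H, P, F_star, lam)
    L = np.linalg.eigvalsh(2 * A)[-1]
    return F - (2 * F @ A - B) / L

def solve_F(X, H, P, F_star, lam):
    """Cross-check only: the exact minimizer satisfies 2 F A = B."""
    A, B = quadratic_terms(X, H, P, F_star, lam)
    return np.linalg.solve(2 * A, B.T).T

def objective_F(X, H, P, F, F_star, lam):
    J = khatri_rao(P, H)
    return np.sum((unfold(X, 2) - F @ J.T) ** 2) + lam * np.sum((F - F_star) ** 2)
```

Repeating `update_F` moves \mathbf{F}^{(v)} toward the point returned by `solve_F`. Since \lambda_{(v)}>0 makes \mathbf{A}^{(v)} positive definite, that point is the unique minimizer of this subproblem.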

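        ④Finally, they fix \mathbf{H}^{(v)} and \mathbf{F}^{(v)} and minimize {\mathcal{O}} over \mathbf{F}^{*}. Only the term \sum_{v}\lambda_{(v)}||\mathbf{F}^{(v)}-\mathbf{F}^{*}||_{F}^{2} depends on \mathbf{F}^{*}, and setting its gradient to zero gives a weighted average of the view-specific embeddings:

\mathbf{F}^{*}\leftarrow\frac{\sum_{v=1}^{V}\lambda_{(v)}\mathbf{F}^{(v)}}{\sum_{v=1}^{V}\lambda_{(v)}}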
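
A tiny NumPy sketch of this last step, assuming the only term involving \mathbf{F}^{*} is the weighted sum of squared distances to the view-specific embeddings, so that the minimizer is their weighted average. `update_common_embedding` is a hypothetical name.

```python
import numpy as np

def update_common_embedding(F_views, lams):
    """Minimizer of sum_v lam_v * ||F_v - F_star||_F^2 over F_star."""
    lams = np.asarray(lams, dtype=float)
    stacked = np.stack(F_views)                    # shape (V, N, R)
    # Weighted sum over the view axis, divided by the total weight.
    return np.tensordot(lams, stacked, axes=1) / lams.sum()

# Toy usage with two views.
rng = np.random.default_rng(0)
F_views = [rng.standard_normal((5, 3)) for _ in range(2)]
lams = [0.5, 2.0]
F_star = update_common_embedding(F_views, lams)

# At the minimizer the gradient sum_v lam_v * (F_star - F_v) vanishes.
residual = sum(l * (F_star - Fv) for l, Fv in zip(lams, F_views))
assert np.allclose(residual, 0.0)
```

A larger \lambda_{(v)} therefore pulls the common embedding closer to that view's embedding.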

        ⑤Overall time complexity: 


2.6. Experiments and evaluation

2.6.1. Data collection and preprocessing

(1)Human Immunodeficiency Virus Infection (HIV)

        ①Sample: 35 patients and 35 controls, randomly selected from the dataset to correct for its class imbalance

        ②Atlas: AAL 90

(2)Bipolar Disorder (BP)

        ①Sample: 52 BP and 45 controls

        ②Atlas: 82 self-generated regions

euthymia  n. a normal, stable mood state

2.6.2. Baselines and metrics

        ①Introducing compared models

        ②Grid search for hyper-parameters: \lambda _1,\lambda _2\in\{10^{-4},10^{-2},...,10^{4}\} and R from \{1,2,...,20\}

2.6.3. Clustering results

        ①Performance comparison table:

2.6.4. Parameter sensitivity analysis

        ①Ablation on \lambda:

        ②Ablation on R:

2.6.5. Factor analysis

        ①The activity intensity of the brain region and the embedded feature \mathbf{F }^{(v)}:

2.7. Conclusion

        They design a novel multi-view multi-graph embedding framework based on partially-symmetric tensor factorization

3. Supplementary notes

3.1. Partial symmetric tensors






4. Reference

Liu, Y. et al. (2018) 'Multi-View Multi-Graph Embedding for Brain Network Clustering Analysis', AAAI. doi: https://doi.org/10.48550/arXiv.1806.07703


