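Talk title: Debiased distributed PCA under high-dimensional spiked models
Speaker: Professor Weiming Li (李卫明), Shanghai University of Finance and Economics
Time: November 13, 2025, 3:00-4:00 PM
Venue: Tencent Meeting, ID 439 644 582
Campus contact: Xue Ding (丁雪), dingxue83@jlu.edu.cn
Abstract: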
We study principal component analysis (PCA) in distributed high-dimensional settings under spiked models. In such regimes, sample eigenvectors can deviate significantly from their population counterparts, introducing a persistent bias. Existing distributed PCA methods are sensitive to this bias, particularly when the number of machines is small, and their consistency typically relies on the number of machines tending to infinity. We propose a debiased distributed PCA algorithm that corrects the local bias before aggregation and incorporates a sparsity-detection step to adaptively handle sparse and non-sparse eigenvectors. Theoretically, we establish the consistency of our estimator under much weaker conditions than those in the existing literature. In particular, our approach does not require symmetric innovations and assumes only a finite sixth moment. Furthermore, our method generally achieves a smaller estimation error, especially when the number of machines is small. Empirically, extensive simulations and real data experiments demonstrate that our method consistently outperforms existing distributed PCA approaches. The advantage is especially prominent when the leading eigenvectors are sparse or the number of machines is limited. Our method and theoretical analysis also apply to the sample correlation matrix.
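Speaker biography:
Weiming Li (李卫明) is a professor and doctoral supervisor at the School of Statistics and Data Science, Shanghai University of Finance and Economics. Professor Li's main research interests are random matrix theory and high-dimensional statistical analysis. Professor Li has published multiple papers in leading international journals, including AOS, JRSSB, and JASA, and currently serves as an associate editor of CSDA.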