Department of Statistics and Data Science Seminar Series, No. 390

 

Time: April 11, 2023 (Thursday), 14:00-15:00

Host: Professor Zhang Xinsheng, Department of Statistics and Data Science, School of Management, Fudan University

Venue: Room 303, Starr Building (史带楼)

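Speaker: Professor Zou Changliang

              Dean of the Institute of Statistics and Vice Dean of the School of Statistics and Data Science, Nankai University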

Title: Diversified sample selection via predictive inference

Abstract: In the big data era, sub-data selection techniques are often adopted to extract a fraction of informative individuals from massive data. Existing subsampling algorithms focus mainly on obtaining a representative subset that achieves the best estimation accuracy under a given class of models. We consider here how to select, within a given budget, informative individuals that are characterized by their unobserved responses. We propose an optimal subsampling procedure that maximizes the diversity of the selected subsample while controlling the false selection rate (FSR), allowing us to explore as much reliable information as possible. Further, we extend the algorithm to sample selection in the online setting, where one encounters a possibly infinite sequence of individuals arriving over time with covariate information available.

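Biography: Professor Zou Changliang received his Ph.D. from Nankai University in 2008 and then joined its faculty. His work focuses on statistics, its intersection with data science, and practical applications. His research interests include statistical inference for high-dimensional data, large-scale data stream analysis, and change-point and outlier detection. He has published dozens of papers in statistics and industrial engineering journals, including Ann. Stat., Biometrika, J. Am. Stat. Assoc., Math. Program., Technometrics, and IISE Trans. He has led projects funded by the National Natural Science Foundation of China, including the Excellent Young Scientists Fund, the National Science Fund for Distinguished Young Scholars, a Key Program project, and a sub-project of a Major Program.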

 

 Department of Statistics and Data Science

2023-3-27