Public Lecture: Efficient Analysis of High-Dimensional Big Data via Diffusion Geometry Preservation and Matrix Decomposition

Moshe Salhov

April 14, 2016, 14:00
Schreiber Building, Room 209
Lecture for the general public

Abstract:

The Diffusion Maps (DM) framework is a kernel-based method for manifold learning and data analysis. It defines diffusion similarities by imposing a Markovian process on the given dataset.

Analysis by this process uncovers the intrinsic geometric structures in the data. Recently, it has been utilized for many modern data analysis applications.
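
The construction described above can be sketched in a few lines. This is a minimal illustration of the standard diffusion-maps recipe, not the speaker's implementation: the Gaussian kernel, the scale parameter `epsilon`, the diffusion time `t`, and the function name `diffusion_map` are all assumptions made for the example.

```python
import numpy as np

def diffusion_map(X, epsilon, n_components=2, t=1):
    """Return (sorted eigenvalues, diffusion coordinates) for the rows of X.

    Hypothetical helper: Gaussian kernel with bandwidth epsilon (assumed).
    """
    # Pairwise squared Euclidean distances, clipped at zero for stability.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian affinities and their row sums (degrees).
    K = np.exp(-d2 / epsilon)
    d = K.sum(axis=1)

    # P = D^{-1} K is the Markov transition matrix. It is not symmetric,
    # so decompose its symmetric conjugate A = D^{-1/2} K D^{-1/2} instead.
    A = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of A.
    psi = vecs / np.sqrt(d)[:, None]

    # Drop the trivial eigenpair (eigenvalue 1) and scale by lambda^t.
    coords = psi[:, 1:n_components + 1] * vals[1:n_components + 1] ** t
    return vals, coords
```

Euclidean distances between rows of `coords` then approximate diffusion distances at time `t`, which is the sense in which the embedding reflects the intrinsic geometry of the data.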

In this talk we describe several methodologies that extend and optimize the DM framework to provide efficient approximations and algorithms for analyzing high-dimensional big data. Furthermore, we introduce DM analysis of data patches (i.e., local data clusters or neighborhoods) instead of individual data points. The defined affinities incorporate information about the dominant tangential directions in these patches together with their geometric positions on the manifold. Finally, we propose an alternative to the non-parametric kernel-method approach of obtaining data representations via spectral decomposition of a big kernel operator or, in finite settings, a big kernel matrix. The presentation of our approach is based on the Measure-based Gaussian Correlation (MGC) diffusion kernel and on the measure-based DM embedding obtained by its decomposition. We will show that when the underlying measure is modeled by a Gaussian mixture model (GMM), an equivalent embedding, which preserves the diffusion geometry of the data, can be computed without decomposing the full kernel.
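
The idea of avoiding a full kernel decomposition can be illustrated with a small sketch. It is not the GMM-based construction from the talk. It assumes a discretized measure-based kernel of the form k(x, y) = sum_r g(x - r) g(y - r) q(r), with Gaussian g, a few hypothetical reference points `R`, and weights `q` standing in for the measure. Under that assumption the n x n kernel factors as K = A A^T with a thin n x m matrix A, so its leading eigenpairs follow from an SVD of A alone. The Markov normalization step is omitted for brevity.

```python
import numpy as np

def gaussian_cross(X, R, epsilon):
    # Gaussian affinities between data points X (n x d) and
    # reference points R (m x d); epsilon is an assumed scale.
    d2 = ((X[:, None, :] - R[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * epsilon))

def measure_kernel_factor(X, R, q, epsilon):
    # Thin factor A (n x m) such that K = A @ A.T, where
    # K[i, j] = sum_r g(x_i - r) * g(x_j - r) * q[r].
    return gaussian_cross(X, R, epsilon) * np.sqrt(q)[None, :]

def spectrum_from_factor(A, n_components):
    # Eigenpairs of K = A A^T from the SVD of A: the eigenvalues are
    # the squared singular values and the eigenvectors are the left
    # singular vectors. The n x n matrix K is never formed.
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return s[:n_components] ** 2, U[:, :n_components]
```

With n data points and m reference points, the SVD costs on the order of n·m² operations instead of the n³ needed to decompose the full kernel, which is the kind of saving that matters when n is large and m is small.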

This seminar presents part of the speaker's Ph.D. thesis of the same name, carried out under the supervision of Prof. Amir Averbuch.
