2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID MLSP-9.2
Paper Title A LARGE-DIMENSIONAL ANALYSIS OF SYMMETRIC SNE
Authors Charles Séjourné, Romain Couillet, Pierre Comon, GIPSA-Lab, University Grenoble Alpes, France
Session MLSP-9: Learning Theory for Neural Networks
Location Gather.Town
Session Time: Tuesday, 08 June, 16:30 - 17:15
Presentation Time: Tuesday, 08 June, 16:30 - 17:15
Presentation Poster
Topic Machine Learning for Signal Processing: [MLR-LEAR] Learning theory and algorithms
Abstract Stochastic Neighbour Embedding methods (SNE, t-SNE) aim at finding a faithful low-dimensional representation of a high-dimensional dataset. Despite their popularity, the behavior of these tools is not well understood, because their output is the solution to a non-convex optimization problem. This work provides first answers by leveraging a large-dimensional statistics approach, in which the number n and the dimension p of the data are of the same order of magnitude. We derive and study the canonical equation satisfied by the critical points of this non-convex optimization problem. The study notably reveals that, in a simple setup, the achievable SNE solutions correspond to a subset of those critical points. In particular, when the clusters composing the dataset are balanced in size, these solutions are symmetrical and admit closed-form expressions. As a major conclusion, the analysis rigorously proves a long-standing heuristic statement on the “proper normalization” of the symmetric SNE: out of two natural normalization choices, only the claimed proper one leads to non-trivial solutions.
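
For readers unfamiliar with the objective mentioned in the abstract, the following is a minimal sketch of the textbook symmetric SNE cost: Gaussian affinities normalized jointly over all pairs, compared through a Kullback-Leibler divergence. It is not taken from the paper. The function names, the single shared kernel scale, and the choice of scaling the high-dimensional distances by 2p are assumptions made for illustration only, and they are not claimed to match either of the two normalizations the authors compare.

```python
import numpy as np

def pairwise_sq_dists(Z):
    """Squared Euclidean distances between the rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    np.fill_diagonal(D, 0.0)
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off

def joint_affinities(Z, scale):
    """Gaussian affinities normalized over all pairs i != j (sum to 1)."""
    K = np.exp(-pairwise_sq_dists(Z) / scale)
    np.fill_diagonal(K, 0.0)  # a point is not its own neighbour
    return K / K.sum()

def symmetric_sne_cost(X, Y, scale_x):
    """KL(P || Q) between data affinities P and embedding affinities Q."""
    P = joint_affinities(X, scale_x)
    Q = joint_affinities(Y, 1.0)
    mask = P > 0  # skips the zero diagonal
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

# Illustrative usage on synthetic data (n = 20 points, p = 50 dimensions).
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, 2))  # an arbitrary, non-optimized 2-D embedding
cost = symmetric_sne_cost(X, Y, scale_x=2.0 * p)  # 2p scaling is an assumption
```

The cost is non-convex in Y, which is why the abstract speaks of critical points rather than a unique minimizer; a gradient-based solver started from different embeddings may settle at different ones.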