Learning Beyond Euclid: Curvature-Adaptive Generalization for Neural Networks on Manifolds

Krisanu Sarkar
Indian Institute of Technology Bombay
ArXiv Preprint 2025

Abstract

In this work, we develop new generalization bounds for neural networks trained on data supported on Riemannian manifolds. Existing generalization theories often rely on complexity measures derived from Euclidean geometry, which fail to account for the intrinsic structure of non-Euclidean spaces. Our analysis introduces a geometric refinement: we derive covering number bounds that explicitly incorporate manifold-specific properties such as sectional curvature, volume growth, and injectivity radius. These geometric corrections lead to sharper Rademacher complexity bounds for classes of Lipschitz neural networks defined on compact manifolds. We illustrate the tightness of our bounds in negatively curved spaces, where exponential volume growth leads to provably higher complexity, and in positively curved spaces, where curvature acts as a regularizing factor.

Theoretical Framework

We consider a compact, smooth Riemannian manifold \((\mathcal{M}, g, \mu)\) with sectional curvature bounded as \(\kappa_{\min} \leq \sec \leq \kappa_{\max}\). Our goal is to analyze the generalization properties of neural networks \(f: \mathcal{M} \to \mathbb{R}\) from a class \(\mathcal{F}\) of \(L\)-Lipschitz functions.
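To make the setting concrete, the following is a minimal numerical sketch, not part of the paper: we take the unit sphere \(S^2\) (constant curvature \(\kappa = 1\)) as the manifold and a coordinate function as a stand-in for \(f\), and estimate the geodesic Lipschitz constant that membership in \(\mathcal{F}\) constrains.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sphere(n, dim=3):
    """Uniform samples on the unit sphere S^{dim-1}."""
    v = rng.normal(size=(n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

# Geodesic (great-circle) distances between all pairs of points.
X = sample_sphere(500)
G = np.arccos(np.clip(X @ X.T, -1.0, 1.0))

# Empirical Lipschitz constant of the coordinate function f(x) = x_0
# with respect to geodesic distance: sup_{x != y} |f(x) - f(y)| / d_g(x, y).
f = X[:, 0]
mask = G > 1e-8
best = float(np.max(np.abs(f[:, None] - f[None, :])[mask] / G[mask]))
print(best)  # never exceeds 1, since |x_0 - y_0| <= ||x - y|| <= d_g(x, y)
```

The coordinate function is geodesically 1-Lipschitz on \(S^2\), so the empirical estimate stays at or below 1; replacing it with a trained network gives an analogous empirical check of the Lipschitz assumption.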

Manifold Covering Number

For \(\epsilon < \frac{1}{2}\text{inj}(\mathcal{M})\), the covering number of the manifold with respect to geodesic distance satisfies:

\[ \log N(\mathcal{M}, \epsilon, d_g) \leq \log \left( \frac{\text{Vol}(\mathcal{M})}{\inf_{x \in \mathcal{M}} \text{Vol}(B(x, \epsilon/2))} \right) + C\sqrt{|\kappa_{\max}|}\epsilon \]

The proof uses Bishop-Gromov volume comparison: the curvature bounds control the volume of small geodesic balls, which in turn determines how densely points must be placed to cover the manifold.
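The qualitative behavior of this bound can be illustrated numerically. The sketch below (an illustration under simplifying assumptions, not the paper's construction) takes the model manifold to be a geodesic \(R\)-ball in the two-dimensional space form of constant curvature \(\kappa\), lower-bounds the small-ball volume by the closed-form comparison volume, and omits the unspecified \(C\sqrt{|\kappa_{\max}|}\epsilon\) correction term.

```python
import numpy as np

def ball_area(r, kappa):
    """Area of a geodesic r-ball in the 2-D space form of constant
    curvature kappa (for kappa > 0, valid for r < pi / sqrt(kappa))."""
    if kappa > 0:
        s = np.sqrt(kappa)
        return 2 * np.pi * (1 - np.cos(s * r)) / kappa
    if kappa < 0:
        s = np.sqrt(-kappa)
        return 2 * np.pi * (np.cosh(s * r) - 1) / (-kappa)
    return np.pi * r ** 2

def log_covering_bound(R, eps, kappa):
    """Volume-ratio covering bound log N <= log(Vol(M) / Vol(B(eps/2)))
    for a model manifold of diameter scale R and constant curvature kappa;
    the curvature correction term from the text is omitted here."""
    return np.log(ball_area(R, kappa) / ball_area(eps / 2, kappa))

# Exponential volume growth in negative curvature inflates the bound:
for kappa in (1.0, 0.0, -1.0):
    print(kappa, round(log_covering_bound(3.0, 0.2, kappa), 3))
```

At fixed diameter and resolution, the hyperbolic ball has exponentially more volume to cover, so the log-covering bound is largest for \(\kappa < 0\) and smallest for \(\kappa > 0\), matching the abstract's qualitative claim.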

Curvature-Adaptive Generalization Bound

With probability at least \(1 - \delta\), for all \(f \in \mathcal{F}\), the generalization error is bounded by:

\[ |R(f) - \hat{R}_n(f)| \leq \mathcal{O} \left( \frac{L_\ell \sqrt{d \log(L\sqrt{n}) + \psi(\kappa_{\max}, L)}}{n^{1/d}} + B\sqrt{\frac{\log(1/\delta)}{n}} \right) \]

where the curvature penalty \(\psi(\kappa, L) = \sqrt{|\kappa|}/L\) captures the complexity increase due to negative curvature and vanishes in the flat case \(\kappa = 0\).
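To see how the two terms trade off, the bound can be evaluated numerically. The sketch below sets the hidden \(\mathcal{O}(\cdot)\) constant to 1 and picks illustrative values for \(L_\ell\), \(B\), and \(\delta\); it is a plotting aid, not the paper's exact constants.

```python
import numpy as np

def gen_bound(n, d, L, kappa_max, L_ell=1.0, B=1.0, delta=0.05):
    """Curvature-adaptive generalization bound from the text, with the
    O(.) constant set to 1 (illustrative values, not the paper's)."""
    psi = np.sqrt(abs(kappa_max)) / L                     # psi(kappa, L)
    main = L_ell * np.sqrt(d * np.log(L * np.sqrt(n)) + psi) / n ** (1.0 / d)
    conf = B * np.sqrt(np.log(1.0 / delta) / n)           # confidence term
    return main + conf

# More negative curvature inflates the bound at fixed n, d, and L:
for kappa in (0.0, -1.0, -4.0):
    print(kappa, round(gen_bound(n=10_000, d=2, L=2.0, kappa_max=kappa), 4))
```

Two properties are visible directly: the bound is monotone increasing in \(|\kappa|\) through \(\psi\), and the dominant term decays at the intrinsic-dimension rate \(n^{-1/d}\) rather than a rate governed by the ambient dimension.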


Experimental Validation

We empirically validate our curvature-dependent generalization bounds through synthetic manifold experiments and real-world data embeddings.

Figure 1: Generalization Gap vs. Sample Size. Our bound (red/blue dashed) more accurately predicts the decay rate of the empirical gap across different geometries (Spherical, Hyperbolic, Euclidean) compared to ambient Euclidean bounds.

Figure 2: Curvature Ablation. The generalization gap increases significantly as the manifold curvature becomes more negative, confirming the theoretical prediction of the \(\psi(\kappa, L)\) term.

Real-World Embedding Geometry

Dataset        Ambient \(D\)   Intrinsic \(d\)   Curvature \(\kappa\)   Improvement over Euclidean
Swiss Roll     3               2.1               0.2553                 91.2%
MNIST Digits   64              7.5               0.0146                 77.5%
Noisy Sphere   3               2.3               6.9138                 89.5%
Figure 3: Manifold Visualization. Visualization of the learned representations and their estimated intrinsic geometric properties.
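The "Intrinsic \(d\)" column of the table requires estimating intrinsic dimension from samples. One standard estimator (a Two-NN-style maximum-likelihood sketch; the paper does not specify which estimator it uses) fits the ratio of each point's first two nearest-neighbor distances:

```python
import numpy as np

rng = np.random.default_rng(0)

def twonn_dimension(X):
    """Two-NN-style intrinsic dimension estimate:
    d_hat = n / sum_i log(r2_i / r1_i), where r1_i and r2_i are the
    distances from point i to its first and second nearest neighbors."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    r = np.sort(D, axis=1)[:, :2]          # first and second NN distances
    mu = r[:, 1] / r[:, 0]
    return len(X) / np.sum(np.log(mu))

# A flat 2-D sheet embedded in ambient dimension 3 (a Swiss-roll-like
# setting: ambient D = 3, intrinsic d = 2):
t = rng.uniform(0.0, 1.0, size=(1000, 2))
X = np.stack([t[:, 0], t[:, 1], np.zeros(1000)], axis=1)
print(twonn_dimension(X))   # close to the intrinsic dimension 2
```

Fractional estimates such as the table's 2.1 for the Swiss Roll arise naturally from this kind of estimator on finite, noisy samples.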

Conclusion

This work bridges the gap between differential geometry and statistical learning theory. By deriving generalization guarantees that explicitly account for the Riemannian structure of the data domain, we provide a more principled understanding of how intrinsic geometry affects the learning capacity of neural networks. Our findings have significant implications for geometric deep learning, particularly in choosing appropriate latent space geometries for structured data.