


About Manifold-Constrained Hyper-Connections

This talk walks through two recent works that generalize the residual connection. Hyper-Connections replaces the fixed identity skip with learnable depth- and width-connections, maintaining multiple residual streams in parallel through a unified connection matrix. This formulation subsumes Pre-LN and Post-LN as special cases, addresses the seesaw trade-off between gradient stability and expressiveness, and delivers consistent gains on OLMo-1B/7B at essentially zero overhead. The follow-up, Manifold-Constrained Hyper-Connections (mHC), fixes a training-instability issue in the original formulation, where the unconstrained residual mapping can blow up across depth, by projecting it onto the doubly stochastic manifold via Sinkhorn-Knopp, yielding stable training and improved downstream performance at 27B scale. Paper: Xie et al., "mHC: Manifold-Constrained Hyper-Connections," arXiv 2025. Presenter: Woojin Kim
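
To make the stability argument concrete, here is a minimal sketch of Sinkhorn-Knopp normalization applied to a small residual mixing matrix. It is an illustration under stated assumptions, not the paper's implementation: the names `sinkhorn_knopp`, `matmul`, and `max_abs`, the exponential parameterization of the logits, the four-stream size, the iteration count, and the numeric values are all hypothetical. It compares the product of the same mixing matrix across 32 layers with and without the doubly stochastic constraint.

```python
import math


def sinkhorn_knopp(logits, n_iters=50):
    """Map a square matrix of real logits to an approximately doubly stochastic matrix."""
    n = len(logits)
    # Exponentiate so every entry is strictly positive; subtracting the max avoids overflow.
    peak = max(max(row) for row in logits)
    mat = [[math.exp(v - peak) for v in row] for row in logits]
    for _ in range(n_iters):
        # Normalize each row to sum to 1.
        mat = [[v / sum(row) for v in row] for row in mat]
        # Normalize each column to sum to 1.
        col_sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        mat = [[mat[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return mat


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def max_abs(mat):
    """Largest absolute entry of a matrix."""
    return max(abs(v) for row in mat for v in row)


# Hypothetical logits for mixing 4 residual streams (illustrative values only).
raw = [
    [1.0, -0.5, 0.3, 0.0],
    [0.2, 0.9, -1.0, 0.4],
    [-0.7, 0.1, 1.0, 0.6],
    [0.5, -0.2, 0.0, 0.8],
]

# Assumed unconstrained stand-in: identity plus the raw values, reused at every layer.
unconstrained = [[raw[i][j] + (1.0 if i == j else 0.0) for j in range(4)] for i in range(4)]
# Constrained variant: the same logits pushed toward the doubly stochastic set.
constrained = sinkhorn_knopp(raw)

# Compose the residual mapping across depth by repeated multiplication.
depth = 32
prod_u, prod_c = unconstrained, constrained
for _ in range(depth - 1):
    prod_u = matmul(prod_u, unconstrained)
    prod_c = matmul(prod_c, constrained)

row_sums = [sum(row) for row in constrained]
col_sums = [sum(constrained[i][j] for i in range(4)) for j in range(4)]
print("row sums:", row_sums)
print("col sums:", col_sums)
print("max |entry| after", depth, "layers, unconstrained:", max_abs(prod_u))
print("max |entry| after", depth, "layers, constrained:", max_abs(prod_c))
```

The sketch relies on one property: a product of doubly stochastic matrices is itself doubly stochastic, so every entry of the composed mapping stays in [0, 1] however many layers are stacked. The unconstrained product has no such bound. A real model would learn a separate matrix per layer and run the normalization on tensors; the single reused matrix here only keeps the example short.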




Last Updated: May 23, 2026
