Giulio<p><a href="https://mastodon.world/tags/Language" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Language</span></a> <a href="https://mastodon.world/tags/models" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>models</span></a>. For example, in <a href="https://mastodon.world/tags/NLP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NLP</span></a>, when <a href="https://mastodon.world/tags/training" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>training</span></a> recurrent <a href="https://mastodon.world/tags/neural" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>neural</span></a> <a href="https://mastodon.world/tags/networks" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>networks</span></a>, it is useful to constrain the transition <a href="https://mastodon.world/tags/matrix" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>matrix</span></a> to be <a href="https://mastodon.world/tags/unitary" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>unitary</span></a> (Arjovsky et al., 2015). The unitary matrix keeps the <a href="https://mastodon.world/tags/gradient" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>gradient</span></a> <a href="https://mastodon.world/tags/norm" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>norm</span></a> unchanged, and the network is able to learn long-range dependencies.
Unitary matrices form a <a href="https://mastodon.world/tags/smooth" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>smooth</span></a> <a href="https://mastodon.world/tags/Riemannian" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Riemannian</span></a> <a href="https://mastodon.world/tags/manifold" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>manifold</span></a>, and Riemannian <a href="https://mastodon.world/tags/optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>optimization</span></a> can be easily applied to them.<br><a href="https://arxiv.org/pdf/2005.02819" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/pdf/2005.02819</span><span class="invisible"></span></a></p><p>Geodesic Clustering in Deep Generative Models<br><a href="https://arxiv.org/abs/1809.04747" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/abs/1809.04747</span><span class="invisible"></span></a></p>
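<p>The two claims quoted above about unitary transition matrices (the gradient norm is preserved, and the constraint can be maintained during optimization) can be checked numerically. The following is a minimal NumPy sketch, not the method of either linked paper: <code>random_unitary</code> and <code>cayley_step</code> are hypothetical helper names, the "gradient" is random data, and the Cayley-transform update is just one common way to stay on the unitary manifold.</p>

```python
import numpy as np

def random_unitary(n, rng):
    # The Q factor of a QR decomposition of a complex Gaussian matrix is unitary.
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q

def cayley_step(w, euclid_grad, lr):
    # Build a skew-Hermitian direction A (A^H = -A) from the Euclidean gradient.
    a = euclid_grad @ w.conj().T - w @ euclid_grad.conj().T
    eye = np.eye(w.shape[0], dtype=w.dtype)
    # The Cayley transform of a skew-Hermitian matrix is unitary,
    # so the updated matrix stays on the manifold of unitary matrices.
    return np.linalg.solve(eye + (lr / 2) * a, (eye - (lr / 2) * a) @ w)

rng = np.random.default_rng(0)
n = 8
w = random_unitary(n, rng)

# Backpropagating through 100 time steps multiplies by W^H each step;
# with a unitary W the norm of the vector does not grow or shrink.
g = rng.standard_normal(n) + 1j * rng.standard_normal(n)
norm_before = np.linalg.norm(g)
for _ in range(100):
    g = w.conj().T @ g
norm_after = np.linalg.norm(g)

# One constrained update with a made-up gradient (illustrative data only).
fake_grad = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
w_next = cayley_step(w, fake_grad, lr=0.1)
unitarity_error = np.linalg.norm(w_next.conj().T @ w_next - np.eye(n))

print(norm_before, norm_after, unitarity_error)
```

<p>With an unconstrained matrix in place of <code>w</code>, the same loop would typically make the norm explode or vanish, which is the failure mode the unitary constraint is meant to avoid.</p>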