Publications of the Chair of Data Assimilation

Hyperbolic Embedding for Finding Syntax in BERT (short paper)

Authors: T. Auyespek, T. Mach, Zh. Assylbekov (2022)

Recent advances in natural language processing have improved our understanding of what kind of linguistic knowledge is encoded in modern word representations. For example, Hewitt and Manning (2019) developed methods for testing whether syntax trees can be extracted from a language model architecture: they project word vectors into a Euclidean subspace in such a way that the squared Euclidean distance between two projected vectors approximates the tree distance between the corresponding words in the syntax tree. This work proposes a method for assessing whether embedding word representations in hyperbolic space can better reflect the graph structure of syntax trees. We show that the tree distance between words in a syntax tree can be approximated well by the hyperbolic distance between the corresponding word vectors.
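To make the two quantities in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the Poincaré ball model of hyperbolic space, which the abstract does not specify, and it represents a syntax tree as a plain parent map. The function names `poincare_distance` and `tree_distance` are hypothetical and chosen only for illustration.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball.

    Assumption: the Poincare ball model is used; the abstract does not say
    which model of hyperbolic space the paper works with.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(
        1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    )


def tree_distance(parent, a, b):
    """Number of edges on the path between nodes a and b.

    `parent` maps every node to its parent; the root maps to None.
    """
    def path_to_root(node):
        path = []
        while node is not None:
            path.append(node)
            node = parent[node]
        return path

    # Steps from a up to each of its ancestors (a itself has 0 steps).
    steps_from_a = {node: i for i, node in enumerate(path_to_root(a))}
    # Walk up from b until the first common ancestor is reached.
    for steps_from_b, node in enumerate(path_to_root(b)):
        if node in steps_from_a:
            return steps_from_a[node] + steps_from_b
    raise ValueError("nodes are not in the same tree")


# Hypothetical dependency tree for "the dog chased cat" (illustration only).
example_parent = {"chased": None, "dog": "chased", "the": "dog", "cat": "chased"}
```

A probe in the spirit of the abstract would learn a mapping from word vectors into the ball such that `poincare_distance` between mapped vectors is close to `tree_distance` between the corresponding words, for example between "the" and "cat" in `example_parent`. The training procedure itself is not described in the abstract and is therefore left out here.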

CEUR Workshop Proceedings
AIxIA 2021 Discussion Papers
