Discussion about this post


The study of text embeddings outside of computer science is important but rare.

"Theoretical foundations and limits of word embeddings" by Alina Arseniev-Koehler

https://arxiv.org/pdf/2107.10413.pdf gives a detailed analysis of linguistic structuralism.

An extract from the introduction of her paper:

In this paper, I focus on three of its key premises. First, that language is a system comprised of various signs (e.g., words, suffixes, or idioms) where these signs are purely defined by their relationship to other signs in the system, rather than by any external reality. For example, a word is defined by its co-occurrence relationship to all other words – not from the intrinsic properties of the letters or sounds that comprise the word, from dictionary definitions, or by its reference to some external object (Saussure, 1983, p. 113). This suggests, for example, that if a misspelled word is used in a similar way as a correctly spelled word, both spelling variants will be understood in the same way. If, however, spelling variants are used in some systematically different way (e.g., British versus American spellings), the variants will evoke slightly different interpretations – even when all variants refer to the same physical object.
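
To make the spelling-variant point concrete, here is a minimal sketch of my own (not from the paper). It builds raw co-occurrence count vectors from a tiny invented corpus and compares them with cosine similarity. The corpus, the window size of 2, and the helper names `cooccurrence_vector` and `cosine` are all assumptions for illustration; real word embeddings are trained on far larger corpora with more elaborate models.

```python
from collections import Counter
from math import sqrt

# Invented toy corpus: "color" and "colour" are used in identical contexts.
corpus = [
    "the color of the sky is blue",
    "the colour of the sky is blue",
    "she likes the color of the car",
    "she likes the colour of the car",
    "the dog chased the ball",
]

def cooccurrence_vector(target, sentences, window=2):
    """Count the words appearing within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

color = cooccurrence_vector("color", corpus)
colour = cooccurrence_vector("colour", corpus)
dog = cooccurrence_vector("dog", corpus)

sim_variants = cosine(color, colour)   # same contexts, so the vectors match
sim_unrelated = cosine(color, dog)     # different contexts, lower similarity

print(sim_variants, sim_unrelated)
```

Because both spellings occur in exactly the same contexts here, their count vectors are identical and a purely distributional model cannot tell them apart. If one variant were used systematically differently, its vector would drift away from the other, which is the second half of the claim in the extract.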
