Computationally proving synonyms don't exist

At least kind of

Nov 19, 2023

Image: Galton board (Wikipedia). Shhh, I promise there’s no real statistics in this.

A quick Google search will tell you that a synonym is “a word or phrase that means exactly or nearly the same as another word or phrase in the same language”. If you pay attention to this phrasing, you’ll notice a little escape hatch: “nearly the same”. In fact, without this added wiggle room, “true” synonyms don’t exist at all.

For their example, Google uses the words shut and close. There’s not much arguing that asking someone to shut a door and asking them to close it are basically identical requests. But now picture what these two sentences look like: “she shut the door” and “she closed the door”. Synonyms, yes, but the connotations are completely different. Even for something like gray and grey, I’d argue that the two words are used in drastically different contexts (there’s usually an ocean separating them).

A short, little fake proof

So let’s say, hypothetically, that we have a perfect synonym. This means that for one exact meaning, there are two words that each get used 50% of the time to express it. If we had infinitely many people, then perhaps the words would keep this balance. But in any finite population, chance alone means that sooner or later Word A or Word B gets used more than 50% of the time. And if you think about it, the only two stable futures we could ever arrive at are using Word A 100% of the time or Word B 100% of the time. Any other split between the two is unstable: random fluctuations keep nudging it until it hits one of those two extremes.
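To make the hand-waving a bit more concrete, here is a minimal sketch of one way to simulate this. It assumes a simple resampling model that is not spelled out in the post: in every round, each of the N speakers picks their word by copying a randomly chosen speaker from the previous round. The names `simulate_drift` and `run_demo` are made up for illustration, and this is not necessarily the simulation behind the plot.

```python
import random


def simulate_drift(population_size, rng, max_rounds=1_000_000):
    """Run one neutral-drift simulation.

    Returns (final number of Word A speakers, rounds played).
    Assumed model: each round, every speaker copies the word of a
    random speaker from the previous round.
    """
    count_a = population_size // 2  # start at (roughly) a 50/50 split
    rounds = 0
    while 0 < count_a < population_size and rounds < max_rounds:
        share_a = count_a / population_size
        # Each speaker independently ends up with Word A with probability share_a.
        count_a = sum(rng.random() < share_a for _ in range(population_size))
        rounds += 1
    return count_a, rounds


def run_demo(population_size=50, runs=200, seed=0):
    """Repeat the simulation and summarise how the runs ended."""
    rng = random.Random(seed)
    results = [simulate_drift(population_size, rng) for _ in range(runs)]
    # A run is "fixed" once only one of the two words is still in use.
    fixed = sum(count in (0, population_size) for count, _ in results)
    a_wins = sum(count == population_size for count, _ in results)
    mean_rounds = sum(rounds for _, rounds in results) / runs
    return fixed, a_wins, mean_rounds


if __name__ == "__main__":
    runs = 200
    fixed, a_wins, mean_rounds = run_demo(population_size=50, runs=runs)
    print(f"{fixed}/{runs} runs ended with a single word")
    print(f"Word A won {a_wins} of them")
    print(f"average rounds until one word took over: {mean_rounds:.1f}")
```

Under this assumed model, 0% and 100% are the only states the loop cannot leave, which is exactly the “stable futures” idea above. Changing `population_size` should only change how long a run takes, not whether one word eventually takes over.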

For fun, I plotted the probability of using Word A¹ for a given “population size”, and you can see the pattern: one word always ends up dominating, no matter how large the population gets. (If anyone knows how to prove this mathematically, without just hand-waving and coding, I’d love to learn how!)
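One possible answer to the question in the parenthetical, sketched under an assumption the post doesn’t state: suppose that in every round each of the N speakers copies the word of a randomly chosen speaker from the previous round. The symbols X_t, N and k are my own notation, and a different update rule would need a different argument.

```latex
\begin{aligned}
&X_t \in \{0, 1, \dots, N\} \text{ is the number of speakers using Word A in round } t,
\qquad X_{t+1} \mid X_t = k \;\sim\; \mathrm{Binomial}\!\left(N, \tfrac{k}{N}\right). \\[4pt]
&\Pr\bigl(X_{t+1} \in \{0, N\} \mid X_t = k\bigr)
  = \left(1 - \tfrac{k}{N}\right)^{N} + \left(\tfrac{k}{N}\right)^{N}
  \;\ge\; 2 \cdot \left(\tfrac{1}{2}\right)^{N} = 2^{\,1-N}. \\[4pt]
&\Pr(\text{both words still in use after } t \text{ rounds})
  \;\le\; \left(1 - 2^{\,1-N}\right)^{t} \;\longrightarrow\; 0 \quad (t \to \infty). \\[4pt]
&\mathbb{E}[X_{t+1} \mid X_t = k] = N \cdot \tfrac{k}{N} = k
  \;\;\Longrightarrow\;\;
  \Pr(\text{Word A wins}) = \frac{X_0}{N}.
\end{aligned}
```

In words: from any split, there is always at least a tiny chance that everyone lands on the same word in a single round (the inequality is just convexity of x^N). A tiny chance repeated forever eventually comes up, so for any finite N one word takes over with probability 1. The last line uses the fact that the expected number of Word A speakers never changes, so a word’s chance of winning equals its starting share: 1/2 each for a 50/50 start. The bound is very loose, so it says nothing useful about how long this takes.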

Sure, but who cares?

After all, British people and Americans happily coexist while using both grey and gray. For me, it was really pretty to see how absorbing states, in the Markov chain sense, might help us reason about synonyms in a language. Even though our human affairs are often of the squishy and messy variety, they’re still sometimes amenable to more rigorous tools.²

1. This is slightly untrue, since I’m actually plotting the probability of whichever word wins out in the end, but I couldn’t find a clean way to say that in the main paragraph.

2. The original paper that inspired this was Nowak et al. (1999), “The evolutionary language game”.

Branching Beyond
