How to pronounce GIF

Possible spoiler: I don’t have an easy answer for you.

Photo by David Clode / Unsplash

I've been piecing together a post about AI development and why claims that it's "higher order abstraction" or "natural language programming" are… well, unconvincing.

This is not that post. But while drafting the other one, I realized I would inevitably go off on a tangent about this subject. To spare you (and me), I figured I could just make it its own post.

The debate

Technically, "GIF" is an acronym that names a specific image encoding format. It stands for Graphics Interchange Format.

💡
It's really tedious to say "hard g" and "soft g" all the time. It's very nerdy, but we can use the International Phonetic Alphabet (IPA) instead. The hard-g sound is represented by /ɡ/ and the soft-g by /dʒ/. The slashes around the symbols indicate that we have switched from ordinary spelling to a transcription of sounds.

Originally – and we're talking very originally here – "GIF" was pronounced with a soft g (/dʒ/) as in "giraffe." This was GIF team lead Steve Wilhite's preference. But the word "graphics" has a hard g (/ɡ/) sound – so a lot of folks use /ɡ/.

This debate has raged for years. I hoped we would eventually drop it as the image format became obsolete. Unfortunately, the term "GIF" has now entered popular use to mean a small, looping video typically used for memes or reaction images in chat. Many things people call "gifs" aren't actually GIF files – they are PNG, WebP, AVIF, or even very short MPEG-4 videos. "GIF" now has meaning beyond its technical purpose, so we will be debating pronunciation for the foreseeable future.
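If you want to see how loose the popular meaning has become, you can check what a so-called "gif" really is by looking at the first few bytes of the file. Here is a minimal sketch, assuming you only care about the formats named above. The function name `sniff_image_format` and the labels it returns are my own inventions for illustration, and the check is deliberately shallow (it does not validate the rest of the file).

```python
def sniff_image_format(header: bytes) -> str:
    """Guess a file's container format from its first 12 bytes."""
    # Real GIF files start with one of two ASCII signatures.
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # PNG (and animated PNG) files start with a fixed 8-byte signature.
    if header[:8] == b"\x89PNG\r\n\x1a\n":
        return "png"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    # AVIF and MP4 share the ISO base media layout, which normally puts
    # an "ftyp" box at offset 4, followed by a 4-byte brand.
    if header[4:8] == b"ftyp":
        if header[8:12] in (b"avif", b"avis"):
            return "avif"
        return "isobmff"  # assumption: most likely an MP4-style video
    return "unknown"
```

To try it on a real file, open the file in binary mode and pass in `f.read(12)`. Plenty of "gifs" saved from chat apps will come back as something other than "gif".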

What no one seems to ask is: why do we have two different pronunciations ("phonemes") for the letter "g" in the first place? Why use g for /dʒ/ and not j, which is almost always /dʒ/?

Tale of Two Gs

Language is not an engineered construct; words and sounds change over time and geography. The above justifications for different pronunciations are not necessarily rational decisions. I suspect they are rationalizations: explanations constructed after the fact for a choice we didn't necessarily make intentionally. This may even be the case where Wilhite is concerned; he may have felt that GIF should be pronounced with /dʒ/, then justified that to himself later.

If you go looking for words with the "gi" construction in English, you'll find that pronunciation seems random:

  • Giant
  • Girl
  • Giraffe
  • Ginger
  • Gift

The difference in pronunciation between "girl" and "giraffe" is particularly interesting because both begin with gir. There is a pattern, however.

English is an odd bird of a language because it's strongly influenced by two separate root languages. Old English – a language we would not recognize today, even written down – was a Germanic language.

💡
Some people think of the language of Shakespeare as being "old English," and they are correct in the sense that it is both English and old; but from the perspective of linguistics it is an early form of Modern English. Shakespeare would probably have struggled as much to read Old English as the rest of us.

When England was conquered by the Duke of Normandy (a.k.a. "William the Conqueror"), Norman French became the language of the educated elite and business. French is a Romance language, which means its root is Latin. Latin was also the language of the Church in England.

Under this two-pronged, heavy-handed cultural influence, somewhere between 70%-80% of Old English words fell out of use, often replaced by French equivalents.

This matters when we talk about the phonemes of g because the pronunciation of g before certain vowels drifted from /ɡ/ to /dʒ/ in the Romance languages but not the Germanic ones. "Girl" and "Gift" are Germanic holdovers; the soft-g words came to us through Romance languages.

What this means for "GIF" is this: both /ɡ/ and /dʒ/ feel natural to English speakers when followed by the letter "i", and it's something of a die-roll which one someone will settle on first.

But Wilhite said…

Few people care.

Most of us don't know where the words we use come from. I had to look up the etymology of gift, girl, giraffe, ginger, and giant myself, and those words came through so many transformations you can't put your finger on who the "inventor" was. I suspect most people who use the word "gif" today don't know (or care) it refers to a very specific way of encoding images, much less what Wilhite's preferences were. Language just does not work that way.

Both pronunciations are broadly acceptable, although apparently /ɡ/ is somewhat more common. Many words have different pronunciations based on cultural identity, geography, or even snobbery. Attempts to enforce a specific pronunciation are not rooted in fact but in a desire to dominate and control other people's language, an impulse we see (for example) in correcting Southern Americans' pronunciation of "nuclear" ("nook-you-ler") or the American pronunciation and spelling of "aluminum."

For my part, I tend to use the two interchangeably, adopting the pronunciation of the group I am with. This is called "accommodation," and it arises from a desire or need to fit in with one's peer group. My subconscious tendency to do this with many words (like "aunt") likely comes from my childhood growing up on military bases surrounded by people who spoke with many accents and dialects; it's a behavior that's not uncommon among military brats.

The "divergence" approach – intentionally maintaining a pronunciation separate from your peers – can be seen either as an attempt to maintain an identity that feels threatened or as an aggressive attempt to dominate (depending on the context).

It has nothing to do with which pronunciation is correct. It has a lot more to do with your own background and identity.

What this has to do with AI programming

We seem to have strayed pretty far from the idea of natural language programming. I mean, I told you it was a tangent. I'm working on a blog post that pulls together several threads, this being just one of them, which I think weave together into an argument that natural language programming is impossible and arguably not even desirable.

Until then, however, you can reflect on this:

Programming is an activity that requires extreme detail and relies on complete – almost mathematical – clarity. An argument about how to pronounce a single word involves two millennia of history, a dedicated academic discipline, and forays into sociology and philosophy.

What does that say about the suitability of natural language for this task?