"i" before "e" except after "c"

I before e is basically useless. Here's a list of e before i words I ran
across. (I know, some of the words are iffy, but I didn't feel like editing.)
DAD
absenteeism, ageism, albeit, Alexei, Alzheimer, Anaheim, atheist, beige,
being, Beijing, Beirut, Boeing, Budweiser, caffeine, codeine, counterfeit,
cuneiform, deign, Deirdre, edelweiss, eider, eight, Einstein, either,
Fahrenheit, feign, feint, feisty, foreign, forfeit, freight, geiger counter,
geisha, heifer, height, heinous, heir, heist, herein, Holstein, homogeneity,
Hygeia, inveigh, inveigle, kaleidoscope, Keith, Klein, Leicester,
leishmaniasis, leisure, leitmotif, Madeira, meiosis, monteith, neigh, Neil,
neither, nonpareil, nuclei, nucleic, obeisance, onomatopoeic, Oppenheimer,
plebeian, Pleiades, poltergeist, protein, queueing, reimburse, rein,
reindeer, reinstate, reinvent, reitbok, seismic, seize, sheik, Sheila,
skein, sleigh, sovereign, surfeit, surveillance, Taipei, their, theism,
veil, vein, weigh, weir, weird, wisenheimer, zein, zeitgeist

> I examined the "i" before "e" except after "c" rule (cei vs. cie).
> The database of truespel book 4 is used to analyze the number
> of words with tradstreengz "cie" and "cei".  The data involves
> the top 5k most popular words in English print.  The overall word
> instances for these 5k words total 15.4 million.  For example, the word
> "the" is most popular with 1.08 M instances out of 15.4 M total.
> There are only four words containing "cei" in the top 5k words.
> These are: received, receive, ceiling, and receiving in order.
> They add up to 3,077 instances in 15.4 M instances total.
> (Yet this is supposed to be the majority form.)
> There are 16 words with "cie" (which is opposite the rule).
> They add up to 17,351 instances.
> These words are (in order of popularity):
> society, science, species, ancient, scientific, societies,
> policies, scientists, sufficient, efficient, efficiency, sufficiently,
> conscience, sciences, agencies, scientist.
> The "i" before "e" except after "c" rule is busted.
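
As a quick check on the arithmetic in the quoted figures, the two totals
can be put on the same scale.  This is only a sketch: the three numbers
are copied from the quoted post, and the per_million helper is a made-up
name for illustration, not anything from the truespel database.

```python
# Figures copied from the quoted post; nothing here is re-measured.
TOTAL_INSTANCES = 15_400_000   # all instances of the top 5k words
CEI_INSTANCES = 3_077          # received, receive, ceiling, receiving
CIE_INSTANCES = 17_351         # the 16 "cie" words

def per_million(count, total=TOTAL_INSTANCES):
    """Scale a raw count to occurrences per million word instances."""
    return count / total * 1_000_000

# How many "cie" instances there are for each "cei" instance.
ratio = CIE_INSTANCES / CEI_INSTANCES

print(f"cei: {per_million(CEI_INSTANCES):.0f} per million")
print(f"cie: {per_million(CIE_INSTANCES):.0f} per million")
print(f"cie outnumbers cei by a factor of {ratio:.1f}")
```

Both strings are rare in absolute terms, so which words get counted as
"cie" matters a great deal to the conclusion.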

Have you set up something of a straw man here?  I've never perceived the
"rule" to be necessary for, or to apply to, words in which the "i" and
"e" fall in two different syllables (from your example: society, science,
scientific, societies, scientists, sciences, scientist), or cases where
"ies" is an inflection of a word ending in "y" (your policies and
agencies; also fancied).  If you take these examples out of your
statistics, how does the data look?

Further, when I mentally "sound out" a word to spell it, I deal with
"cient/cienc" as (for example) sufficient/Suh-fish-ee-ent, internally
breaking "ie" into two syllables.  I realize that this isn't standard
pronunciation (and I don't actually pronounce the words that way), but
it leads me to the conclusion that when the "c" in "cie" is soft ch/sh,
the rule doesn't apply, by analogy with the two-syllable exception
I mention above.

With these addenda to the "rule", the only exception left is "species"
-- not sufficient, I would think, to bust the rule.
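
The filtering described above can be sketched in a few lines.  The labels
are assigned by hand, following the reasoning in this reply; they are an
assumption about how each of the 16 quoted words should be grouped (in
particular, "conscience" is filed under the soft-c "cienc" group), not the
output of any syllable-splitting tool.  The label names and the
remaining_exceptions helper are made up for illustration.

```python
# The 16 "cie" words from the quoted list, in the quoted order, each
# tagged by hand with the reason (if any) it would be excused.
CIE_WORDS = {
    "society": "two_syllables",
    "science": "two_syllables",
    "species": "unexplained",
    "ancient": "soft_c",
    "scientific": "two_syllables",
    "societies": "two_syllables",
    "policies": "y_inflection",
    "scientists": "two_syllables",
    "sufficient": "soft_c",
    "efficient": "soft_c",
    "efficiency": "soft_c",
    "sufficiently": "soft_c",
    "conscience": "soft_c",
    "sciences": "two_syllables",
    "agencies": "y_inflection",
    "scientist": "two_syllables",
}

def remaining_exceptions(labels, excused):
    """Return the words whose label is not in the excused set."""
    return [word for word, label in labels.items() if label not in excused]

# First addendum: drop two-syllable "i-e" and "-ies" inflections of "-y".
after_first = remaining_exceptions(CIE_WORDS, {"two_syllables", "y_inflection"})

# Second addendum: also drop the soft-c "cient/cienc" words.
after_all = remaining_exceptions(
    CIE_WORDS, {"two_syllables", "y_inflection", "soft_c"}
)

print(after_first)
print(after_all)
```

This only counts word types.  The quoted post gives a combined total for
the 16 words rather than a count per word, so redoing the
instance-weighted figure would need the original database.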

I'm surprised that "receipt", "perceive", "conceive", "conceit",
"deceive" and "deceit" (and associated forms) aren't in your list.
"Transceiver" isn't as common, but is still common enough that your net
should be cast wide enough to include it.

If you expanded your sample size so that most of the words I list in the
paragraph immediately preceding are included, and remove as exceptions
those words in which "i" and "e" are parts of two separate syllables,
I'd bet that the rule applies (even if a sample space of this size ends
up including "regencies", "efficiencies", and "necromancies").

I learned the full form of the rule:

I before E
Except after C
Or when sounded as A
As in neighbor and weigh.

and the rule works for me.  And by works, I mean that when I'm stumped
on how to spell a word, the rule gives useful guidance in my experience.
It may be that I instinctively know how to spell your exception words,
and never apply the rule to them, but nevertheless, when I do apply it,
it is helpful.  Which is the standard by which I judge the rule's

