Thanks for the excellent article on Latin plurals.
One thing I don’t think is emphasized strongly enough is that “data” has become a mass noun.
Like other mass nouns, it agrees with singular verb conjugations, never takes the indefinite article “a” or an explicit number, and has no corresponding plural.
It makes sense that as the ability to process large amounts of data has increased, and as computers have imposed varying and often arbitrary units of data, the notion of data as something that is obviously countable as individual items has gone by the wayside.
You’re hiding in the bushes, watching a group of specimens of the species Homo sapiens through a pair of binoculars (which is of course the plural of binocular). You switch on your tape recorder and whisper into the microphone, “I am observing three…” what?
I’m not sure if you’re making a wildly obvious observation or a complaint here. I feel datum is a useless word, and statements like ‘the data are ready’ sound awful and awkward. It may be etymologically correct, but it’s infinitely simpler to refer to a ‘piece of data’ than a datum.
Awful or not, statements like that are used by scientists, academicians, scholars, and other Smart People all over the world. The usage may have long ago disappeared from the language of everyday people, but it will probably never die in these circles.
Around the parts I now inhabit, large numbers of people contract the phrase “[noun] needs to be [verb]” into “[noun] needs [verb]”, as in, “floor needs swept.” Amazingly enough, there are those of us who feel it important to use proper grammatical construction, and we eschew this particular usage, even though it has the advantage of being more concise in verbiage. In fact, it tends to make you sound like a rube, much as usage of “data” as a singular should.
One point frequently overlooked in these discussions is that while “datum” and “data” are the correct Latin forms, the Latin rules do not necessarily apply to the derivative English word. I think of data as an English collective noun with a Latin origin. I think of datum/datums as a different English noun derived from the same Latin source.
My somewhat simplistic impression of a rather complex subject is that words of foreign origin imported into English may retain the original grammatical construction as long as they are recognized as foreign, and are from a language familiar to the user (as Latin and Greek were for most educated English speakers until comparatively recently). Once the words become embedded into the vocabulary as “English” rather than foreign, the normal English grammar rules generally begin to replace the original foreign rules. YMMV
It’s a wildly obvious observation, except that I frequently run into people who reply to the use of “data” with a singular verb conjugation with the quip that “‘data’ is plural, not singular.” I conclude that it isn’t wildly obvious to them. Otherwise they would reply with “‘data’ is plural, not mass.” The fact that it isn’t wildly obvious to them leads me to believe that it needs to be emphasized more.
I work in the earth sciences where datum, datums and data are part of everyday usage. I see no sign that plural constructions when using “data” are waning.
I’m of two minds on this – it appears to be an evolving usage, with purists holding out for the etymological plural, while others feel that the data ought to be considered as a collective unit in most instances. I’m wondering if a straddle position, similar to British collective-noun usage, would be appropriate:
“The data consistently suggests…” (sc., “The evidence comprised of all the data consistently suggests…”)
but
“The data he exhibits are a strangely mixed assortment of statistics, some apparently accurately collected and others haphazardly thrown together.” (Sc., “Some of the data are accurate and some are not.”)
My PC’s motherboard reference (made in Taiwan) even contains the following sentence:
“If you remove that jumper, all your datas will be lost”
Is that considered a misspelling? Or does data as a mass noun have a plural form?