Fun with "data" (Well, "fun" if you are interested in word usage)

I was raised on the west coast of the US and I am used to using “data” as if it were singular: “That data is not correct” instead of “Those data are not correct.” To me, “data” always refers to a defined set of data points and is therefore a single thing, like a bag of sugar or a bucket of sand. You could say “Some of that data is correct and some is not correct” where “some” is a stand-in for the actual definition of the set. It feels very awkward to me to say “Some of those data are correct and some are not correct” although of course I understand what is meant if someone says it that way. In fact, that is what triggered this post: I heard a commentator on NPR say something like “His data were used in a lot of scientific papers.”

I understand that it is the plural of “datum” but the singular form is comparatively rarely used. Are there other words like this? And where do you stand on the use of “data” as if it were a singular thing vs a plural of things?

Data could be thought of as a collective noun. In English a collective noun takes a singular verb.

Nevertheless “some” takes a plural verb. You would say “Some of these examples are correct.” and I expect you think that sounds correct because the plural “examples” is next to the verb while you think your example sounds wrong because “data”, which you wish to be singular, is next to the verb. But in both cases “some” is the subject and requires a plural verb.

I disagree. You can certainly say “some of that sand is wet, and some is dry.” “Some” can be used both for counting things (marbles) and for measuring things (flour).

From what I can gather, a collective noun is equivalent to “group,” that is, like a flock of geese or a colony of ants or a swarm of bees. “Data” isn’t like that. It is a plural noun, but rather a distinct one as far as I can see.

Take a simple case: “a data point” vs. “a datum point”.

The first sounds right to me. The second sounds affected. Yet I know that shouldn’t be the case.

An out: a data point isn’t, hopefully, one number (for example). It’s two or more numbers (or strings or whatever). Two or more? That’s a plural! :wink:

If data is meant to be “a collected body of information”, there is nothing wrong with using it as a singular subject. Like saying “The outgoing mail is on my desk”. Different users of English base plural agreement either on the grammatical form of the subject or on the physical fact of what it refers to.

For example, in Canada, “government” is considered to be a plural noun. “The government are responsible for high taxes”. There is no difference between saying “Toronto have the puck in their own end” and “The Leafs have the puck in their own end”. (But never “the Leaves.”) “Toronto” and “The (Maple) Leafs” are interchangeable words for the same entity, requiring the same verb.

Americans, on the other hand, would say “Chicago is”, but “The Black Hawks are”. To argue that one is right and the other wrong is simply parochialism.

In fact the writing style guide at my relatively new place of employment (3 months) prefers the singular “data is.” Another battle lost.

Other words like this: “agenda.”

If you’re speaking Latin, “data” is plural, but that doesn’t have the slightest relevance if you’re speaking English.

I think a data point would just be a datum. That is, a single datum is a point in an array of data.

Interesting example. Would anyone anywhere say “the agenda are complete” or “the agenda still have some openings?” In common usage in all English speaking countries, I believe, “agenda” is used as a singular noun although technically it is the plural of “agendum,” a word that I have never seen or heard used but is in the dictionary. In crossword puzzles, a single point on an agenda is always “item.”

I disagree about the treatment of data being irrelevant in English, since it is used both ways; more often as a singular in the US and at least sometimes as a plural in the UK.

I forgot to copy in jtur88’s interesting offerings. “Government” is not the plural of anything, but it can be and is treated that way (also in the UK). Which leads to an interesting picture of a sort of dis-united government which sometimes thinks alike and sometimes is divided.

As for the hockey example, I think Americans saying “Chicago is” is just shorthand for “the Chicago team is,” while Canadians saying “Toronto are” is a shorthand for “the Toronto Maple Leafs are.” At least that’s how I intend to look at it, parochialism or not.

via xkcd

Data are one of the most popular Star Trek characters.

I tend to use data as a plural unless I’m talking about the concept “data.”

You can always coin examples that seem to violate that rule:

Movies was the most popular form of family entertainment.

Home-Economics is a required subject.

Stamps was how I learned history when I was a collector.

The news is not good.

The back forty was left fallow.

Data is an important research tool.

One viscus, two viscera.
One insigne, two insignia.
One graffito, two graffiti.

One criterion, two criteria
one bacterium, two bacteria
one forthemoney, two fortheshow…

Good.

The battle to establish Latin as the national language of Canada?

It does sound impressive.

I like it! :smiley:

Yes, but which way do you pronounce the word?

I can’t listen to an American say “duty” without giggling.

Duty calls.
Let’s all do our duty.
Donald Trump wants more duty on imported goods.

These are a different issue, IMO. Using the plural in place of the singular in error, e.g. criteria for criterion, is not the same as using a plural noun as if it were a singular noun.

I use the second one, which is apparently this: /ˈdeɪtə/

In discussing airplanes, the word “datum” is used to mean “a (more-or-less) arbitrarily-chosen reference point” from which various places in the aircraft are measured. It is used in computing the location of the center-of-gravity. Typical usage is something like:

The datum is at the front edge of the wings where they attach to the fuselage. [Could also be at the front of the engine or at the firewall between engine and pilot’s feet.]

Front seat is at -14.57" from the datum.
Rear seat is at +1.38" from the datum.
Baggage compartment is at +31.89" from the datum.

From these measurements, along with the weights of the passengers or baggage at those stations, the CoG can be computed. For all aircraft, but some more than others, this is all critically important.
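For anyone curious about the arithmetic, here is a rough sketch in Python. It assumes the usual weighted-average approach: multiply each weight by its distance from the datum (its "arm") to get a moment, then divide the total moment by the total weight. The arms are the ones quoted above; the weights are made up purely for illustration, and the sketch leaves out the empty aircraft's own weight and moment, which a real calculation would need to include.

```python
# Arms (inches from the datum) are the figures quoted in the post.
# Weights (pounds) are hypothetical, chosen only for illustration.
STATIONS = [
    ("front seat", -14.57, 170.0),
    ("rear seat", 1.38, 180.0),
    ("baggage compartment", 31.89, 20.0),
]


def center_of_gravity(stations):
    """Return the weighted-average arm: total moment / total weight."""
    total_weight = sum(weight for _, _, weight in stations)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    # Moment of each station = arm * weight; negative arms lie ahead of the datum.
    total_moment = sum(arm * weight for _, arm, weight in stations)
    return total_moment / total_weight


cog = center_of_gravity(STATIONS)
print(f"CoG of the load: {cog:+.2f} in from the datum")
```

The sign convention is why the datum matters: a negative result means the load's CoG sits ahead of the datum, a positive one behind it.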

This usage of the word seems a bit odd to me: It seems to refer to a location rather than to an item of information. It took me a while to understand this usage. Why don’t they just call it “the reference point”?

(The above example measures are for the Grob G-103 Twin III Acro sailplane.)