Conflicting Data
Although Latin is a dead language, it has a funny way of coming back to haunt us from beyond the grave. Complex declensions and deponent verbs aside, aspects of Latin grammar and vocabulary continue to affect the English language today. One such linguistic haunting is the question of whether the Latin-derived word data should be treated as a plural or a singular. So which is it, data is or data are?
The short answer is “It depends”. Several factors play a role in determining whether the use of the plural or singular is more appropriate. So let’s delve in, shall we?
A Tale of Datum and Data
The word data is derived from the Latin word datum meaning “a thing given”. Its first recorded use in English dates back to the mid-17th century. Strictly speaking, the singular form of data remains datum, with the meaning “a single value or piece of information”. Purists argue that since data is the plural form of datum, it should naturally be treated as such and used with are. While this argument appears reasonable enough at first glance, things are not necessarily that clear-cut. For one, once foreign words enter a language’s lexicon, their use tends to evolve over time. They don’t necessarily adhere to the rules of their language of origin. For example, nobody bats an eye when the Latin word formula is pluralized as formulas instead of formulae. Moreover, you are far more likely to hear piece of data than datum. Outside specialized fields, using datum can come off as overly formal.
It’s Semantics Really
So does this mean that data should never be treated as a plural? Well, not exactly. It depends on whether you treat data as a mass noun or as a plural noun (with datum as its singular form). Proponents of data is will most often argue that data is a mass noun, akin to its near-synonym information, which always takes a singular verb. For example, you could easily replace data with information in the following sentence:
The data is clear and convincing.
However, if you were to use data to mean “many individual pieces of information”, then the use of a plural verb would be more appropriate. Replace data with pieces of information in the following sentence to see the subtle difference:
The data were collected systematically.
This slight difference in meaning partly explains the conflicting uses of data. The choice of a plural or singular verb is often dictated by context. The forms datum and data are appear more frequently in technical and scientific works. This is natural enough since work in such fields often involves the collection and analysis of large numbers of individual pieces of information. As for data is, it is used far more often in everyday, non-specialized contexts.
The Experts Weigh In
So what do style guides have to say on the data is or data are issue? Once again, opinion is divided.
The UK newspaper The Guardian advocates the use of a singular verb, arguing that while data is technically a plural word, the use of datum is too rare to warrant treating data as its plural counterpart.
The American Psychological Association (APA) takes the line that data is the plural of datum and not a mass noun.
The New York Times is on the fence; it accepts both so long as usage is consistent within a text.
The Associated Press Stylebook also accepts both: data is where data is a mass noun, and data are where it is a plural noun.
So it would appear that there are as many approaches as there are style guides.
The Results Are In
So what conclusions can we draw? It seems safe to say that both data is and data are are acceptable. The argument that data is always a plural noun that requires a plural verb is not supported by evidence: the use of data as a mass noun with a singular verb is well-established in English. What should you do if you’re unsure of which one to use? It is always a good idea to consider context when deciding how to use data. Data are is often used in scientific or technical contexts, while data is is more common in everyday writing and speech. If you’re writing for an organization, always check what their stance on the issue is; many style guides mandate the use of one or the other. Also, remember that consistency is key. Don’t use data is and data are in the same text. If you choose data are, make sure to use datum as the singular form, because writing piece of data treats data as a mass noun.