There’s a bit of terminology I’ve found helpful: the phrase existential claims.

I use it as a slightly more formal way to talk about “records”: that is, instances of some category (to use the terminology of my mental model of data) that are stored in a system. (You might also call them “rows”, “tuples”, “data points”, “objects”, “observations”, “triples”, “statements”, “assertions”, “facts”, “samples”, or any number of other things.)

Let me unpack what I mean a bit:

Existential

By “existential”, I don't mean the sense of “pondering the meaninglessness of our existence on this planet”. I simply mean “relating to existence”. I'm saying that when we create data, we are making a logical assertion that’s akin to using an existential quantifier in logic, like:

∃p (Name(p) = “Taylor Swift” ∧ IsSinger(p) = True)

Which reads as: “There exists some p such that p’s name is ‘Taylor Swift’ and p is a singer.”

This is not something I invented, of course; see also Russell's definite descriptions.
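As a rough illustration of the quantifier above, here is a minimal Python sketch. The Person class, the people list, and the exists_singer_named function are hypothetical names invented for this example; they only show that checking an existential statement against stored records amounts to asking whether any record satisfies both conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical record type: one stored claim about a person."""
    name: str
    is_singer: bool


# Hypothetical stored records. Each one is a claim, not a verified fact.
people = [
    Person(name="Taylor Swift", is_singer=True),
    Person(name="Grover Cleveland", is_singer=False),
]


def exists_singer_named(records, name):
    """Evaluate: there exists p such that Name(p) = name and IsSinger(p) = True."""
    return any(p.name == name and p.is_singer for p in records)


print(exists_singer_named(people, "Taylor Swift"))      # True
print(exists_singer_named(people, "Grover Cleveland"))  # False
```

Note that the check only tells you whether such a claim has been stored, not whether the claim is true of the world.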

Claims

Consider a table that stores “Presidents Of The United States”. Now imagine it has a few records in it, like:

Presidents:

Name              Entered Office  Left Office
----------------- --------------- ------------
Grover Cleveland  1885            1897
Howard Taft       1909            1913
Taylor Swift      2019            NULL

The third record here is not true, of course; a very bizarre (and probably awesome) set of circumstances would have to transpire to make it so. But, right or wrong, there's no denying that the record is indeed there.

So: just because it's data doesn't mean it's true. What's true here is that someone (in this case, me) has created data saying that Taylor Swift is the President. Nothing in the world can prevent me from making a statement like that; I’d simply be wrong. (You can discuss the propriety of making false assertions all day long, but that doesn’t make them cease to exist.)
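To make the point concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table and column names (presidents, entered_office, left_office) are assumptions chosen to mirror the table above. The database accepts the false record as readily as the others, and a query returns it faithfully.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE presidents (name TEXT, entered_office INTEGER, left_office INTEGER)"
)

# The same three records as the table above; None is stored as NULL.
rows = [
    ("Grover Cleveland", 1885, 1897),
    ("Howard Taft", 1909, 1913),
    ("Taylor Swift", 2019, None),
]
conn.executemany("INSERT INTO presidents VALUES (?, ?, ?)", rows)

# The false claim exists in storage just like the others.
stored = conn.execute(
    "SELECT name, entered_office, left_office FROM presidents WHERE name = ?",
    ("Taylor Swift",),
).fetchall()
print(stored)  # [('Taylor Swift', 2019, None)]
```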

This is why I favor the word “claims” for stored data; using language like this decouples the data manifestation from the subject of the data. This might seem like a subtle shift, but notice that it gets us past a whole lot of philosophical baggage, like having to differentiate between truth, knowledge, and belief. We can all agree that a claim exists (it’s a pattern of bits in some substrate like memory or disk), without having to agree on the messy question of whether the claim is true in some factual world.

A claim must have a claimer. That claimer could be a human who caused this data to be entered, or a machine that translated some other signal in the world (like a sensor) into a claim, or even just software (say, a program that writes a record of the current time). Directing attention to this claimer also adverts to a few other useful things, like:

  • Who claimed it?
  • When did they make the claim?
  • Where did they store or transmit the claim?
  • Did they claim it would be true forever, or for some fixed duration?
  • Should I believe this claim?

You don’t have to record who the claimer was, or any of these other points, but they are matters of fact whether you record them or not.
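One way to picture the questions above is as optional metadata attached to a claim. The following is only a sketch: the Claim class and all of its field names are hypothetical, and the timestamp is a made-up value for illustration. Every metadata field defaults to None to reflect that you may choose not to record it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """Hypothetical shape for a stored claim plus optional facts about the claiming."""
    category: str                             # what kind of thing is claimed to exist
    content: dict                             # the claimed attributes
    claimer: Optional[str] = None             # who claimed it?
    claimed_at: Optional[datetime] = None     # when did they make the claim?
    stored_in: Optional[str] = None           # where was it stored or transmitted?
    valid_until: Optional[datetime] = None    # None means no stated end of validity

    def is_attributed(self) -> bool:
        """True if the claimer was recorded alongside the claim."""
        return self.claimer is not None


claim = Claim(
    category="Presidents",
    content={"name": "Taylor Swift", "entered_office": 2019, "left_office": None},
    claimer="the author",
    claimed_at=datetime(2019, 1, 1, tzinfo=timezone.utc),  # illustrative value only
    stored_in="presidents table",
)

# A claim with no recorded metadata is still a claim someone made.
anonymous = Claim(category="Presidents", content={"name": "Howard Taft"})

print(claim.is_attributed())      # True
print(anonymous.is_attributed())  # False
```

The last question in the list, whether to believe the claim, is deliberately not a field here: it is a judgment made by whoever reads the claim, not part of the claim itself.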

My contention is that all data follows this rule: if it's data, it's an existential claim made by someone about something (potentially also for some purpose, but let's leave that for later).

From that perspective, a category can be seen as “a kind of thing you can make existential claims about”. And similarly, a relationship is also a claim, of a slightly different type. Structure / concept modeling is really the act of saying what kinds of facts it’s useful to be able to claim, in some domain of applicability.

Is All Data Existential Claims?

Is it really appropriate to say this about all data, though? Is it possible to have a data record that isn’t performing this function, i.e. that makes no claim of existence of anything? Let's see.

For one thing, let's agree that an existential claim doesn’t have to be about the existence of something in the physical world. If I claim that Peter Pan is a fictional character, and put a record about him in a table of fictional characters, I've made an existential claim about an instance of the category of fictional characters. The nature of a category tells you what kind of existence is being claimed, and in this case, fictional character means “a person described in writing but who doesn't exist in the real world”.

We can even go further than that, though. Imagine a table containing a bunch of US presidents that don't exist:

False US Presidents

Name  
--------------------------
Taylor Swift
Janus Figglestein
President McPresidentface

Every time you think of another president that doesn't exist, you could add it to this table. (This sounds fun! Feel free to take a break from reading to do this.)

So are these fake presidents “existential claims”? Yes: each record makes the claim that this is a president that doesn't exist. Or, more precisely, it claims the existence of a name that could plausibly belong to a president, but doesn't. If someone named “President McPresidentface” did get elected president, the existential claim wouldn’t suddenly evaporate, but it would now be false. There are an infinite number of true claims you could make in this category, and a very small number of false ones.

One step further, we can even invent paradoxical categories, like “the category of all words that are not in this category”.

Words Not In This Table
------------------------

Any existential claim you make by putting a record in this table would be false by definition. But that's OK, you can still insert data with impunity; being wrong doesn't stop a record from being an existential claim!

Are there really no counterexamples?

The closest I have come is the following scenario: imagine I create a record claiming that London is the capital of France (which is false, but still a claim). Then imagine some cosmic radiation changes my data, stored in a computer, to read that Londom (with an M) is the capital. This is no longer my claim. Is it right to say that the cosmic radiation “made a claim”? Or did it destroy my claim by altering it, moving it from the realm of existential claims into just meaningless bits? I don't know.

I may yet think of some other kind of data that isn’t an existential claim, but I haven’t so far. I welcome suggestions!

Truth Does Matter, Tho

The philosophical ontology I’m describing is very permissive. Anyone who wants to claim the being of any "thing", whether it's a donut or a hole or an economy or the color red, is perfectly permitted to do so. (This is the pragmatic stance taken by Semantic Web technologies: “anyone can say anything about any subject”.)

Just to be clear, though, I'm not suggesting truth is irrelevant. Indeed, it's usually the most relevant characteristic of any data; bad data crashes planes. And indeed, there are lots of technological solutions (including encryption, non-repudiation, blockchain, etc.) whose entire aim is to increase the truthiness of data. One company offering such solutions, with a nice taxonomy, is:

https://www.factom.com/solutions/

So while any piece of data is an existential claim, these solutions give observers higher confidence about (a) the claim itself (who made it, and when) as well as (b) the facts being claimed (based on public witness, evidence capture, etc.).

Similarly, it's entirely common to create computer systems that refuse to store claims that violate some rules of truth (however they're defined). Constraints on data systems (including primary keys) amount to restrictions on what kinds of claims can exist (or coexist). Such a system won't allow me to claim that an account with a negative balance exists, because that's not allowed by its definition of the category “account”. Nor can I simultaneously claim that the contract value is $100 and $200.
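Both restrictions can be sketched with Python's built-in sqlite3 module. The schema below (tables account and contract, columns balance and value_usd) and the try_claim helper are hypothetical names made up for this example. A CHECK constraint restricts which claims can exist, and a primary key restricts which claims can coexist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CHECK constraint: the category "account" is defined to exclude negative balances.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
# Primary key: at most one value claim per contract id.
conn.execute(
    "CREATE TABLE contract (id INTEGER PRIMARY KEY, value_usd INTEGER NOT NULL)"
)


def try_claim(sql, params):
    """Attempt to store a claim; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_claim("INSERT INTO account (id, balance) VALUES (?, ?)", (1, 50)),
    try_claim("INSERT INTO account (id, balance) VALUES (?, ?)", (2, -10)),    # violates CHECK
    try_claim("INSERT INTO contract (id, value_usd) VALUES (?, ?)", (7, 100)),
    try_claim("INSERT INTO contract (id, value_usd) VALUES (?, ?)", (7, 200)),  # violates primary key
]
print(results)  # [True, False, True, False]
```

The rejected claims were still attempted; the system simply declined to let them exist in storage.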

Categories Are Also Claims

Just to muddy the water a bit more, recognize that categories are also existential claims. But the category they are claims about is definitional: they are of the form, “there exists a word or phrase W such that there is a useful relationship between this word and some set of shared characteristics”.

For instance, I’m claiming that there’s a category called “dog” and it means “a member of the genus Canis, the most widely abundant terrestrial carnivore, which was the first species to be domesticated,” etc. (I'm also usually implying there is some set of instances of this category, but that's not necessary.)

I could ALSO make a claim that a “dog” is a good word for “a rigid airship named after the German count who pioneered airship development at the beginning of the 20th century.” No one can stop me from making this claim; in fact, maybe I just did? I don’t have to be claiming the existence of any fact, but rather attempting to assign a meaning to a word.

What’s the difference between “dog = furry mammal” and “dog = rigid airship”? Consensus, plain and simple.

References

Marsili, Neri. “Truth and Assertion: Rules Vs Aims.” Analysis 78, no. 4 (2018): 638–48. https://doi.org/10.1093/analys/any008.

Russell, Bertrand. “Knowledge by Acquaintance and Knowledge by Description.” Proceedings of the Aristotelian Society 11 (1910): 108–28.