Towards a formal ontology of filling in forms


Boris Hennig

IFOMIS Saarbrücken


Someone has recently suggested the following definition of what a document is:

x is a document if and only if x is a (potentially permanent) record of information of type T, where information is of type T if and only if it is time-sensitive and instances of T are reliably used as constituents of complex social actions.

In the following, I will not leave much of this definition intact.

1. Some Minor Suggestions

(1) To begin with one of several minor points about the information that documents are supposed to record: I find it a bit odd to say that this information should be used as a constituent of an action. What might that mean? I may be said to use information when I give it to someone, when I receive it, or when I rely on it in doing something. In the first two cases, the information is the object and not a constituent of my action. I deliver or receive information in the sense in which I lend someone a tool or get a suntan. Tools and suntans are objects, but not constituents of actions. Further, I can deliver and receive information without understanding it; for instance, when I only receive it to give it to someone else. Only in the last case will the information enter the constitution of the action. When I pull the trigger, relying on the information that the gun is not loaded, then this knowledge alters the type of action that I perform: it might be classified as an accident and not as murder. In such cases, however, my knowledge does not enter the action as a constituent, but rather as a presupposition. Only actions can properly be said to be constituents of actions. I will come back to this point. Let me here simply suggest modifying the definition to the effect that documents record information that reliably plays a certain role in complex social actions.

(2) Further, that the recorded information plays its role in complex social actions seems to be a matter of fact, not of definition. So far I can see no reason why we should not allow for the (exceptional) case that the information recorded in a document figures in simple social actions. Anyway: how shall we draw that distinction between complex and simple actions? It will turn out later in this paper that the respective actions must have a kind of teleological structure. In this sense, they may perhaps be said to be complex. But for one thing, every action can be shown to be complex in that way.[1]  For another, that actions must be complex in the specified sense can be demonstrated and hence, it need not be put into the definition.

For similar reasons, we may omit the qualification "social". There may be reasons why we use documents only or primarily in social actions, but these reasons are clear by other means and do not belong in the definition of the kind of information that documents record. They should not even be entailed by the definition, since documents may be used in actions that are as nonsocial as any action can be. There are secret documents, for instance. These documents themselves may play an important role in some complex social actions involving secret services, licenses to kill etc., but this is not what the definition says. The information that a document records is defined as playing a role in social action, and this need not be the case. For instance, it might be the case that the recipe for a certain kind of chocolate is secret. This recipe is used in producing the chocolate, and producing chocolate is not any more social than any other action.

Regarding these two points, my intuition is that the prevalent complexity and sociality of the actions involved may well be consequences of some defining feature of the relevant kind of information, but they are not themselves defining features. It is true that the information recorded in documents must be relevant for action. But in order to distinguish documents from other things that may be said to be relevant for actions, we need not further specify the nature of the respective actions. We need to further specify the circumstances and the way in which the information recorded by documents is relevant.

(3) To advance to the next point. The information that documents record is said to be "of a type instances of which are reliably used as etc." Unless the generality of the type in question is clearly restricted, this leads to a severe leak in the definition. The reason is that presumably, any old pair of entities can be considered as two instances of some common type. The information that my birth certificate records, for instance, is of the general type "person-related information". The fact that I do not like Bob Marley is of the same type. Are then both pieces of information instances of the same type, such that the information that I do not like Bob Marley is of a type instances of which are reliably used in complex social actions?  This is surely not the case, and the definition should make that clear. It might do that by addressing type T as a low-level type.

(4) Why does the information have to be used in a reliable way?  This might serve to exclude two contrary cases. First, it need not be the case that information of the relevant kind actually or even always plays a role in some actions. There are entirely useless documents, at least on the level of instances. But a useless document is still a document — presumably by virtue of being of a low-level type instances of which are typically of some use. Second, it cannot be the case that the information is used only accidentally in that way. Reliability can be no accident. "Reliably" thus serves to situate the use of the relevant kind of information somewhere between "actually" and "possibly". Not every document actually plays a role in an action, and not everything that might perhaps play some such role is a document. Rather, documents normally and typically play some role in actions.

I am thereby gradually approaching one of the more general themes that I want to discuss: Documents should be defined in terms of their typical use. The next remark, which will introduce this discussion, concerns the mood or aspect of the "is". Something was said to be a document if and only if it is a record of certain information. Does that mean that every document actually and de facto is a record or rather that it is supposed to be a record (de jure)? The occurrence of "potentially" and "reliably" in the definition may indicate the latter. Not every document actually records the relevant kind of information. Some documents are invalid, but they may be documents nonetheless. This means that something is a document insofar as it is supposed to record the relevant kind of information. I think this deserves to be rendered more explicit. So far, the definition has undergone the following changes:

Something is a document if and only if it is supposed to permanently record information that belongs to some low-level type T, where information is of type T if and only if it is time-sensitive and instances of T play some role in some action.

2. Teleology

That a document is supposed to behave in a certain way does not mean that one always intends this behavior when using it. One may use a birth certificate for anything that can be done with a sheet of paper. Only some of these possible uses are official ones. What a document is supposed to do is much like what Elizabeth Anscombe has called a "point" in a language game. For instance, the point of an order is (roughly) to make someone do something. Nonetheless, it is perfectly possible to give orders without intending that they be executed (1957:3). Likewise, the point of a document need not coincide with its actually intended use.

It is already part of the definition that the information recorded by documents plays a certain role in an action. When I now stress that documents should be defined in terms of their point or function, then I am again emphasizing that documents are what they are by playing a certain role in an action. As was announced earlier, this might motivate a restriction to complex actions: the actions in question must be complex enough for there to be something that can play a role in them.

In order to make clear what kind of complexity is involved here, let me briefly address the topic of intentional agency. When we describe behavior as intentional or purposive, we describe what further is done in performing it. We can do this with animals and perhaps even plants. A cat is stalking a bird by slinking and crouching, a plant is extending towards the sun by growing. In such cases, both the purpose and the means for attaining it are processes. The cat is crouching and it is stalking; it is crouching in order to stalk. The plant is reaching out for the sun and growing; it is doing one in order to do the other.

This can also be expressed in the following way: whenever behavior is purposive, then we may give more or less enlarged descriptions of it. This has often been noticed, and has been dubbed the "accordion effect" (Feinberg 1970:134). Insofar as I turn the key in order to open the door, I am opening the door, and insofar as a cat is slinking along in order to catch a bird, it is stalking that bird. In such enlarged descriptions, we do not simply situate an action in its context, nor do we add arbitrary causal consequences to the description. Rather, we situate an action within a context or nexus of further actions (cf. Anscombe 1957:86). The consequences that we can add to the description must already be actions. For instance, that I am casting this shadow is a consequence of my action. But this alone does not entitle anyone to say that casting the shadow is one of my actions. The notion of a consequence that we need here is not independent of the notion of an action.

But in our present inquiry it does not matter whether we add consequences to the description of an action and thereby get further descriptions of this action, or conversely call all and only those entities consequences of an action that could also be included in an enlarged description of this very action. The point of interest is that purposive behavior displays a structure such that an enlarged description may embrace its consequences in the relevant sense of "consequence" and the narrower descriptions specify means to that end (cf. Austin 1979:201). I will call this a teleological structure.

We may segment an action (type) into parts according to this teleological structure, and will thereby be entitled to speak of one part being an end of or a means for another. The teleological or explanatory nexus by which actions divide into such parts is rendered explicit in practical reasoning. Actions can be divided into steps that display an order of reason, such that one stretch can be cited as the reason for another.[2] I claimed at the beginning that only actions can be constituents of actions. Now I have added the claim that the constituents of any single action are teleologically related to one another. An action A is a constituent of an action B insofar as A is done as part of and in order that B. B is then done by doing A, and A will be mentioned in an answer to the question how B is done.
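
The constituent relation just described can be illustrated with a minimal sketch. It is only one possible rendering, not part of the argument itself: the class Action, its fields and its method names are hypothetical, and the sketch assumes that the relation "done in order that" may be chained through intermediate ends, which the text does not assert.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Action:
    """A hypothetical model of an action type within a means-end nexus."""
    name: str
    end: Optional["Action"] = None                        # what this action is done in order to do
    means: List["Action"] = field(default_factory=list)   # actions by which this one is done

    def done_by(self, name: str) -> "Action":
        """Add a constituent: an action done as part of and in order that self."""
        step = Action(name, end=self)
        self.means.append(step)
        return step

    def is_constituent_of(self, other: "Action") -> bool:
        """True if self is done in order that `other`.

        Assumption: the relation is followed through intermediate ends.
        """
        current = self.end
        while current is not None:
            if current is other:
                return True
            current = current.end
        return False

    def enlarged_descriptions(self) -> List[str]:
        """Descriptions of self from the narrowest to the most enlarged."""
        names = [self.name]
        current = self.end
        while current is not None:
            names.append(current.name)
            current = current.end
        return names

    def how(self) -> List[str]:
        """Answer to the question how self is done: its direct constituents."""
        return [step.name for step in self.means]


# The example from the text: I turn the key in order to open the door.
open_door = Action("opening the door")
turn_key = open_door.done_by("turning the key")

print(turn_key.is_constituent_of(open_door))
print(turn_key.enlarged_descriptions())
print(open_door.how())
```

On this rendering, only actions can stand in the constituent relation, since only instances of Action can be added as means. Keys, documents and pieces of information would have to enter in some other way, for instance as things with functions.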

Actions are purposive by playing a role within further actions. Things other than actions, when they play a role in actions, are purposive in a different way: They have functions. When I use a key in order to open the door, then the key functions as a door opener. When I use it in order to open a bottle, it functions as a bottle opener. But opening doors is, and opening bottles is not, the function of a key. The reason is that opening bottles by using keys is not a standardized kind of action. It is not something that one, in general, does.

Like keys, documents may function in unorthodox ways. I may use my birth certificate as toilet paper. But this is not its function. In order to specify the function of a document, one has to specify its standard use, or the general kind of action in which it plays its role. Such general kinds of actions will be called practices.

We can now omit the reference to the low-level types in the second part of our definition. Functions attach to types, not to particular tokens. That is, something has a function only insofar as it is of a certain low-level type instances of which are typically used in a certain way. To say that something plays a role in a practice is yet another way of saying the same.

Documents record information that figures in a practice. Not every type of action is a practice. "Things people do in the morning" is a type of action, but it is not a practice. Practices are ways of acting that can be practiced: learned, introduced, invented, improved, prohibited and the like. In short, practices are kinds of actions that are reflected in their instances. This means that in order to instantiate a practice, one needs to know how one does it. There are no accidental or unconscious executions of practices.

The explanation of how one does it is also an explication of the teleological structure of the respective kind of action. One makes an omelet by breaking eggs, frying them and so on. In such an explanation, the action is divided into steps that are related to each other and to a common whole as means to ends. Practices are described and explained in teleological terms (Anscombe 1957:83).

Another way of making the same point is to say that practices are described in terms of conditions of success. The description of a practice is the description of its successful actualization. Particular actions are individuated in terms of conditions of satisfaction. These conditions need not be fulfilled in every case. They belong to individual actions only by belonging to the practice they instantiate. Defects, in contrast, can only be realized in particular ways. They cannot, as defects, belong to a practice. It is certainly possible not to intend to be successful. But being unsuccessful cannot as such be a practice. If it were, then one could successfully instantiate this practice. Defects are necessarily deviations from standard actualizations of practices. A defect occurs when a particular agent does not do what one does. This will become important now.

We have seen that the function of a document need not coincide with its intended use. For instance, it need not be actualized. My birth certificate has the function of certifying the place and time of my birth, the names of my parents and so on; but I need not use it for that purpose. Fake documents constitute the converse case: they are used for certifying something, but in some sense, this is not their function. For in the case of success, they allow for the same intended uses as real documents. But these uses will then not be constituents of the practice. The successful use of a fake document constitutes a defect of the practice within which it is used. This is simply what "fake" means in this context.

Now consider again the definition under consideration, modulo the changes recommended so far:

Something is a document if and only if it is supposed to permanently record information that is time-sensitive and has a function (= plays a role in some practice).

This definition does not yet enable us to draw a sufficiently clear distinction between fake and real documents. Fake documents are, in one sense at least, supposed to permanently record time-sensitive information. This information must further have all the features that the information recorded by a real document has. Two features may perhaps be cited in order to distinguish fake documents from real ones.

First, fake documents do not convey true information, and in that sense they may perhaps be said not to record information at all. But this move would involve a rather ad hoc redefinition of information. For is there no false information? "To record information" does not appear to be a success verb in that sense; we can be given false information, and we can record it. Conversely, why should a fake document not convey entirely true information? This alone would not turn it into a real document.

Secondly, one may admit that fake documents function in the place of real documents, but in a certain sense they do not have the same function. They do not have it because they are not, as such, parts of the respective practice. They only pretend to have such a function, and pretending to have this function is indeed their function. There is a practice of issuing and using fake documents, and insofar as fake documents figure in that practice, they have the function of misleading people. But this only means that fake documents do not have the same function as the real documents they pretend to be. The corresponding real documents will (usually) not have the function of misleading people. In a word, the functions that fake documents actually have cannot be their official functions, since a fake document only does its job as long as people do not know what its real function is (i.e. to mislead them). An official function can be defined as a function that is supposed to be known to all people who use the item that has it. Hence this will be a possible improved definition:

Something is a document insofar as its official function is to permanently record time-sensitive information that has a function.

But this is not yet the final version.

3. Bridging Informational Gaps

An important point that will still require some work is the following: why should the information be time-sensitive? This also relates to the word "permanently". Presumably, only time-sensitive information needs to be recorded in order to be permanently available. Time-insensitive information is permanently available anyway.

It may be interesting at this stage of our investigation to do some etymology. The word "document" is composed of docere, to teach, and the suffix –mentum as it also occurs in "monument". The suffix –mentum typically indicates that the item in question is a warning, example or instance. Thus taken very literally, a document is a kind of example, instance or warning that teaches us something. Documents convey and conserve information that might not be available otherwise. This seems to be a good reason for not recording time-insensitive information, even if it is relevant in the respective kind of complex social actions. But still, there are documents that do not record time-sensitive information, or only accidentally do so. Take, for instance, a document that certifies the correctness of a photocopy or translation. The information that it records is: "this copy / translation is a faithful one". But the correctness of a translation does not change over time (unless the translation changes). Hence, the information conveyed by the document is not time-sensitive. To be sure, there will be some time-sensitive information involved, as in a formulation like "I hereby certify that the copy / translation is correct". But it is precisely not important at what time this assertion is made. In this case, it is only important who makes it. This gets us right into the center: the general purpose of documents is not to preserve information over time, but more generally to bridge any kind of informational gap.

Documents connect us to something that we cannot experience ourselves. The correctness of a translation is not something that every person can be expected to experience herself. For those who cannot, we have documents testifying what they cannot experience themselves.

Hence, documents have the very same function as witnesses. The complex social actions in which they are used are acts of testifying, certifying and witnessing. They do not only record information; more precisely, they compensate for the lack of immediacy, directness and authenticity of information. They are issued in order to replace the experience itself. This leads to one further comment on the original definition, this time relating to the word "record". Like witnesses, documents do not merely record information, but they convey it in an authoritative fashion. They are entitled to convey information in a legal context, but also obliged to tell the truth.

In order to bridge informational gaps, documents must be separable from their origin and from the context in which they acquired their content. If the facts could speak for themselves, we would need no documents. This however entails that we cannot verify documents in exactly those contexts in which we need to rely on them. As Plato would perhaps put it: documents are orphaned and separated at birth from the assistance of their father. The point in the Phaedrus, where Plato suggests this formulation, is mainly that a practical skill like philosophy cannot be learned from books (275c). Philosophy is the art of using information, not of possessing or compiling it. Written text is worth nothing without a culture of interpreting it (278c–e).

This point is then taken up by Augustine. In the absence of the author, says Augustine, we need a technique of interpretation. More specifically, we must follow the principle of charity, much later made popular by Donald Davidson. In order to understand a written document in the absence of the circumstances that would render it superfluous, we need to assume that it tells us the truth (Confessions 11,3). This again means that documents do more than just record information. They are authoritative, such that we shall believe in their correctness in the absence of any other evidence.

This will be my semi-final version of the definition of document, then:

Something is a document insofar as its official function is to compensate for the impossibility of immediately acquiring information that has a function (= plays a role in a practice).
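
For readers who prefer a symbolic rendering, the semi-final definition might be sketched as follows. This is a tentative formalization under stated assumptions and not a settled part of the proposal. The predicate names (Doc, Info, Practice, RoleIn, OffFun) and the term Comp(i), standing for "compensating for the impossibility of immediately acquiring i", are introduced here for illustration only, and the biconditional is stronger than the "insofar as" of the prose definition.

```latex
\[
  \mathrm{Doc}(x) \;\leftrightarrow\;
  \exists i \, \exists p \,
  \bigl[\, \mathrm{Info}(i) \wedge \mathrm{Practice}(p)
        \wedge \mathrm{RoleIn}(i, p)
        \wedge \mathrm{OffFun}\bigl(x, \mathrm{Comp}(i)\bigr) \,\bigr]
\]
```

Here RoleIn(i, p) renders "i has a function", in line with the gloss "plays a role in a practice", and OffFun(x, f) is meant to hold only if f is a function of x that is supposed to be known to all who use x.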

One final question, "to go", as it were: what is the distinction between a witness and a document?  Does a witness, or perhaps her brain, not also record information of the relevant kind?  Is her brain, then, a document?  Or is it, perhaps, rather a monument?[3]


Anscombe, E. (1957). Intention. London: Basil Blackwell.

Austin, J. L. (1979). Philosophical Papers (3. ed.). Oxford: Clarendon Press.

Feinberg, J. (1970). Doing and Deserving: Essays in the Theory of Responsibility. Princeton: Princeton University Press.

Rödl, S. (2002). Practice and the unity of action. In G. Meggle (Ed.), Social Facts & Collective Intentionality. Frankfurt am Main: Dr. Hänsel-Hohenhausen AG.

Thompson, M. Naive action theory. Typescript. Forthcoming as part of Life and Action, Harvard University Press.

[1] Every movement can be divided into further movements that are means for carrying out the original movement (I move my hand this bit in order to move it that bit). Since every action involves movement, every action can be divided into parts that are means for the whole.

[2] See Michael Thompson's Naive Action Theory and Rödl 2002.

[3] This paper was written under the auspices of the Wolfgang Paul Program of the Alexander von Humboldt Foundation and the project "Forms of Life" sponsored by the Volkswagen Foundation.
