THE LOGIC OF CAUSATION
Phase One: Macroanalysis
J. S. Mill’s Methods: A Critical Analysis.
New version. The present essay was originally written, or at least published, in 1999; but I decided to rewrite almost all of it in March 2005, when I found the time to engage in more detailed hermeneutics. Following an analysis that could be characterized as almost Talmudic (though not as mere ‘pilpul’), my conclusions about Mill’s methods are considerably more severe.
Below, I list John Stuart Mill’s five “Methods of Experimental Inquiry”; then I try to expose and evaluate them. It should be noted that though my approach is at times critical, my main intent is to clarify; I am more interested in Mill’s achievements, than in his apparent mistakes. (All symbols used below are mine – introduced to facilitate and clarify discussion.)
Mill’s terminology is a bit obscure, but can be interpreted with some effort.
In the paradigm (the first method), he seems to be looking out at the world, or a specific domain of it, and observing something (say, X) occurring in some things or events, in scattered places and times, and not occurring in others; and also observing some second thing (say, Y) occurring in some things or events, in scattered places and times, and not occurring in others; and he wonders at how two such events can be causally related.
In the first three methods, Mill verbally differentiates the two things under study by naming one “the phenomenon” (X, for us) and the other “the circumstance” (Y, for us), suggesting that in his mind’s eye the former is the effect and the latter its cause, although note well in his conclusions he rightly (usually) considers the two items interchangeable, so that either might be the cause or effect of the other. In the fourth method, Y is viewed as a “part” of the “phenomenon” X. In the last method, Mill refers to both items with the same word, viz. “phenomenon”. Whatever the words used for X and Y, it is clear that Mill has no intent to prejudice the conclusion. These terms are intended very broadly to mean any thing or event, i.e. (since he is considering experimental inquiry) any object of perception. (I prefer the very neutral – purely logical – term “item” for this.)
Now, these items (X, Y, or their negations) are found scattered in the world, or some segment thereof, in various things or events, in scattered places and times – this is what Mill means by “instances”. Wherever X, Y, or their negations occur, that is one of the “instances” or cases under consideration. Thus, the instances might be instances of a kind of thing (e.g. humans or water), and X, Y, or their negations, might be predicates (in a broad sense, including any attribute or movement or situation or quantitative property or relation or whatever) of that subject.
The goal of Mill’s present study, as its name implies, is methodological: he seeks to correlate phenomena, i.e. to identify how we can establish one thing to be a cause or effect of another, to whatever extent. Given certain facts about X and Y, what conclusions can be drawn as to their causal relation? Note that the causal relation investigated is evidently causation (of whatever mode), rather than volition (although once volitions have taken place, their results become causatives).
He has in mind experimental inquiry – but in fact, his arguments could equally be applied to passive observations. His are (ideally, at least) universal inductive principles, which effectively define various causative relations, as well as offer practical guidance for their discovery.
As we shall see, Mill apparently makes numerous mistakes; and overall, his treatment of causation is not as systematic and exhaustive as it should have been. For all that, his doctrine is instructive, as are the discussions it stimulates.
Let X be the phenomenon, and A, B… be instances in which it occurs, and C, D… be instances in which it does not occur; and let Y be a circumstance the former instances (A, B…) have in common exclusively, and the latter instances (C, D…) lack in common exclusively. Then, according to Mill:
Instances A, B… have X and have Y (exclusively); and
Instances C, D… lack X and lack Y (exclusively);
Therefore: Y is the effect, or the cause (or an indispensable part of the cause), of X.
This may be considered as an inductive argument, with two compound premises and a disjunctive conclusion (i.e. a set of three possible conclusions). I have here put in brackets and in italics those parts of the premises and the conclusion that I consider mistaken, for reasons I shall presently discuss.
A simplified and corrected version of Mill’s statement would look as follows:
If a phenomenon (X) is invariably accompanied by another (Y), and its absence (not-X) is invariably accompanied by the other’s absence (not-Y) –– we may infer that X is the cause of Y, or Y is the cause of X, in the sense of complete and necessary causation.
This simple statement is an apt description of the strongest causative relation possible between two items X and Y or between their negations. It corresponds to what David Hume earlier called “constant conjunction”, between two phenomena and between their negations. That this is for Mill the essence of the method under consideration is evident in the name he gave it: “agreement and difference”. This name also shows his awareness that causation has both a positive and a negative aspect.
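As a rough illustration, the check described by the simplified statement can be sketched in Python. The sketch assumes each observed instance is recorded as a pair of booleans (has_x, has_y); the function name and the sample data are hypothetical. A passing check only reports that the known instances are compatible with the generalization, not that the generalization is proven.

```python
from typing import Iterable, Tuple

Instance = Tuple[bool, bool]  # (has_x, has_y) for one observed instance


def constantly_conjoined(instances: Iterable[Instance]) -> bool:
    """Return True if, in the known instances, X always goes with Y and
    not-X always goes with not-Y, and both patterns are actually observed."""
    seen = set(instances)
    # Any instance of X without Y, or of Y without X, breaks the pattern.
    if (True, False) in seen or (False, True) in seen:
        return False
    # Contingency: X (with Y) and not-X (with not-Y) must each be observed.
    return (True, True) in seen and (False, False) in seen


# Hypothetical observations: instances A, B have X and Y; C, D lack both.
observed = [(True, True), (True, True), (False, False), (False, False)]
print(constantly_conjoined(observed))

# One instance of X without Y is enough to withdraw the strong conclusion.
print(constantly_conjoined(observed + [(True, False)]))
```

Note that the function treats X and Y symmetrically, so it cannot say which item is the cause and which the effect.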
Had Mill contented himself with such a simple statement, I would have congratulated him for providing scientists with an excellent research tool. I do not therefore quite know why Mill chose to complicate the matter by adding an extraneous condition in each premise and proposing an inaccurate alternative conclusion. Before we consider these problems, however, let me further analyze the intent of Mill’s main statement.
The terms “agreement” and “difference” in the title of this method refer respectively to having in common or lacking in common some feature (namely the “circumstance” Y). The expression “joint” method here is due to these terms recurring separately in the next two methods.
When Mill refers to “two or more instances” in each premise, he must in fact be referring to the two or more instances, i.e. all (the known) instances. Clearly, this must be the case, otherwise it would be conceivable that we encountered instances of X without Y, or instances of not-X without not-Y; and if that were the case, the proposed strong conclusion would not be valid. Mill ought to have added the definite article “the” to avoid all misunderstanding.
Mill’s use of the expression “two or more” is due to his trying to say several (too many) things at once. First, that one instance is hardly sufficient to establish causation; there must be repetition of the conjunctions. Second, the number of repetitions is indefinite, because we are here (except when dealing with finite sets) concerned with open-ended induction. We can never know all the instances directly, but can only arrive at general premises through generalization from all known cases to all cases period. The conclusion is only as valid as those generalizations.
The form “If X, then Y, and if not X, then not Y” (= “Y is the effect of X”) and its contraposite “if Y, then X, and if not Y, then not X” (= “Y is the cause of X”) are both generalizations from the forms “X and Y are universally conjoined, and not-X and not-Y are universally conjoined”. If, upon further inquiry, the latter generalities turn out to be inaccurate, the inferences drawn from them must also be attenuated.
Mill should have specified all that explicitly (I do not know if he did so somewhere else). But there is little doubt in my mind that he tacitly intended it. He might also have pointed out that the “two sets of instances” involved (here symbolized as A, B… and C, D…) once generalized, together exhaustively cover the whole world.
Another implicit detail worth highlighting is that “X is contingent and Y is contingent”. This is inferable from the observation and mention of occurrences of X and of not-X, and likewise regarding Y and not-Y. Note also that although Mill speaks of “the phenomenon” or “the circumstance” – the predicates X and Y are general terms, and not one-time happenstances, since each occurs in “two or more instances”.
Finally, looking at Mill’s conclusion, we may add that his uncertainty as to which of the two items X and Y causes the other (at least in our main conclusion) is justified. Since the relationships described in the premises are symmetrical with regard to X and Y (apart from purely verbal differences), the conclusion cannot differentiate between them. At this level, then, the words “cause” and “effect” have no formal difference; some other condition (such as time’s arrow or the degree of abstraction) must be specified before we can identify a direction of causation.
Now, let us turn to criticism of Mill’s formula.
Mill’s first inexplicable complication is his requirement that the “circumstance in common” (viz. Y or not-Y) in the premises be exclusive. In the first premise, he says Y is the “only one”; and in the second, there is “nothing save” not-Y. Moreover, these circumstances must “alone” differentiate the two sets of instances, for the conclusion to follow.
Mill apparently fears that some third item, say Z, might come into play and affect the projected strong relation between X and Y. However, this fear is formally unjustified. Let us consider the extreme case where three items X, Y, Z are constantly conjoined, and their negations not-X, not-Y, not-Z likewise always occur in tandem. In such a situation, all the following propositions (and their respective contraposites) are true:
If X then Y, and if not X then not Y.
If X then Z, and if not X then not Z.
If Y then Z, and if not Y then not Z.
The truth of the latter two propositions does not impinge upon the truth of the first one. The causative relation between X and Y remains the same, even if some third factor like Z comes into play. The same can be argued if only one of these extra propositions is true. In such situations, we would simply conclude that there are parallel causations, or again causative chains.
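The point can be checked mechanically. The following Python sketch takes the extreme case just described, where only two kinds of instances exist (all three items present, or all three absent), and verifies each pairwise relation separately. The helper name `iff_holds` and the encoding of instances as dictionaries are illustrative assumptions.

```python
from itertools import combinations

# The extreme case: X, Y, Z always occur together, and so do their negations.
instances = [
    {"X": True, "Y": True, "Z": True},
    {"X": False, "Y": False, "Z": False},
]


def iff_holds(p: str, q: str, cases) -> bool:
    """True if 'if p then q, and if not p then not q' holds in every case."""
    return all(case[p] == case[q] for case in cases)


# Evaluate the three pairwise relations: (X, Y), (X, Z), (Y, Z).
results = {(p, q): iff_holds(p, q, instances) for p, q in combinations("XYZ", 2)}
for (p, q), ok in results.items():
    print(f"If {p} then {q}, and if not {p} then not {q}: {ok}")
```

Each relation is evaluated from the two items it mentions and nothing else, which is the sense in which the presence of Z leaves the relation between X and Y untouched.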
Mill apparently failed to develop these concepts, and inserted an extraneous requirement of exclusivity in a vague attempt to insure against possible third-factor interference. In truth, the relation between any two variables X and Y can be determined without reference to any other variables.
If – as indeed does occur – the two variables under consideration are affected by others, to the extent that their relation is weaker than here concluded, we will soon notice the fact by observing that X is not always with Y, and/or that not-X is not always with not-Y. But in such case, the stated premise(s) about constant conjunction will simply not be true! In other words, in such case, Mill’s conception of the premise(s) would be self-contradictory.
Perhaps, someone might interject, Mill was here trying to account for the scientific methodology of “keeping all other things equal”? No – because: this refers to a situation where there are two or more partial causes to an effect, and to establish each of the partial causes as such, we have to consider each one in turn without the other – and in such case, complete causation could not be a putative conclusion for any of the partial causes.
The second inexplicable complication in Mill’s formula is his reference in the conclusion to a third alternative, viz. that Y might be “an indispensable part of the cause” of X. This clause is interesting, first of all, because it indicates that when Mill initially states that Y might be “the effect, or the cause” of X, he has in mind complete causation (as distinct from the partial causation in the third alternative).
With regard to this third alternative, let us first notice that Mill does not mention that X might equally be “an indispensable part of the cause of” Y, even though he has granted that X and Y are interchangeable in the first two alternatives. Why this asymmetry? I suspect it is not intended to convey some radical insight, but merely reflects Mill’s terminology and the gradual development of his formula.
He started by referring to Y as a “circumstance”, suggesting that he viewed it as the precondition or cause of X, “the phenomenon” under investigation. Then, it probably occurred to him that he could not formally distinguish between X and Y, as to which is the cause and which is the effect – so he added the possibility that Y might be the effect of X. Then, he got to thinking Y could be a partial (necessary) cause of X, so he added that in; but he simply forgot to recover symmetry and suggest the reverse to be possible.
Now, the big issue: the phrase “an indispensable part of the cause” clearly refers to partial necessary causation. Given that X and Y are indeed constantly conjoined and that their negations are constantly conjoined, no conclusion is formally permissible other than complete necessary causation. It follows that it was an error for Mill to insert this additional disjunct in his conclusion.
Note parenthetically, Mill does not anywhere give us a clue as to how partial necessary causation might be distinguished from complete necessary causation. Supposing such an alternative conclusion had been correct, he would have been obliged to detail a practical methodology for resolving the issue.
I suspect that Mill resorted to the said third alternative conclusion due to his lingering doubt concerning some possible third factor (which we above labeled Z) weakening the relation between X and Y. Apparently, Mill considered that Z might diminish the degree of causation of X by Y from complete to partial; i.e. he viewed Z as a complementary partial cause imbedded with Y in some larger cause.
This explanation is appealing, because it suggests a correlation between the said complications in premises and conclusion. However, as already shown, Z might equally well be a parallel or concatenated complete cause – so we must still fault Mill for imprecision and confusion. In any case, logically, Mill could not have his cake and eat it too. If in the premises he has firmly excluded circumstances besides Y, there is no reason for him to make allowance in his conclusion for an eventual complement Z!
Another objection we could raise here is: if Mill considered the possibility here of partial necessary causation, why not equally that of complete contingent causation, or for that matter, the possibility of partial contingent causation? If he felt (perhaps because of their inductive basis) his premises were shaky, then why did he not foresee all possible modifications of the main conclusion (complete necessary causation)?
The answer to the latter question(s) is simply that although Mill conceived of partial causation, he apparently never grasped the inverse concept of contingent causation. This will become evident as we continue our analysis of his methods, and find no mention anywhere of that weak alternative to necessary causation. Mill’s omission suggests that, in his mind, only “indispensable” things could be causatives (although if asked the question he might well have denied it).
Another deficiency in Mill’s viewpoint is his failure to consider that in some cases, though X and Y and their negations exhibit perfect regularities of conjunction as described in the premises, we (i.e. people in general) do not conclude that Y causes X or X causes Y, but conclude that “X and Y are both effects of some third thing”. This alternative conclusion is admittedly inexplicable formally, just as the distinction between cause and effect is difficult to pinpoint. But there may in practice be indices that encourage the former, just as there are indices for the latter. Granting this, it would have been more appropriate for Mill to use that clause as his third alternative.
To sum up: what is manifest from all our above analysis is that Mill had an unclear idea of causation, mixing its paradigm up with its possible variations. He failed to first clearly distinguish and separately consider all the determinations of causation (both generically and specifically). Consequently, when he faced the inductive issue – the issue of how in practice to identify causation – his confusion was compounded by the need to consider the fact of generalization and the possibility of particularization.
Let X be the phenomenon, and A, B… be instances in which it occurs; and let Y be the only circumstance they have in common. Then, according to Mill:
Instances A, B… have X and have Y (exclusively);
Therefore: Y is the cause, or the effect, of X.
This may be considered as an inductive argument, with a compound premise and a disjunctive conclusion (i.e. a set of two possible conclusions). In view of the name given to this method, the conclusion may be taken to refer to the positive aspect of causation, i.e. complete causation. I have here put in brackets and in italics the ‘exclusive’ demand of the premise, which I consider mistaken for reasons to be presently given.
The essence of this argument is generalization, from the constant conjunction of two items, X and Y, wherever and whenever they are observed to occur (the instances A, B…), to all existing or possible instances. The conclusion from such universal repetition is either that “if Y, then X” (whence, Y completely causes X) or that “if X, then Y” (whence, X completely causes Y).
Such generalization is logically possible, note well, provided that the “two or more instances” (A, B…) are all the encountered instances of X and of Y. Mill obviously intended that, but he should have made it clear – e.g. by saying the two or more – to preempt his formula being construed as allowing for unspecified instances in which X occurs without Y or Y occurs without X.
Mill should have mentioned this to show his awareness of the formalities involved, notably that the form “if X, then Y” means “X is impossible without Y” (and similarly, “if Y, then X” means “Y is impossible without X”). The most significant aspect (for a causative conclusion) of the constant conjunction of the two items is the implied denial of possible conjunction between one item and the negation of the other.
We could offer a generous reading of Mill’s statement to cover this issue. We could suppose that Mill confused circumstances other than Y with circumstances contrary to Y, and suggest that the clause “only one circumstance (Y) in common” is intended to mean that there are no instances with X accompanied by some negation of Y. Likewise, the exclusive word “alone” could be taken to refer to X rather than Y, meaning that the two or more instances involving X are the only ones among “all the instances” to have Y, implying that there are no instances without X that have Y. However, I do not seriously think Mill intended this interpretation.
Another tacit proviso for drawing our conclusion is that each of the items X and Y be contingent. Strictly speaking, a conditional proposition like “if X, then Y” or “if Y, then X” can be taken to imply causation only if we know that “X is possible, but unnecessary” and “Y is possible, but unnecessary”. In Mill’s statement, here (unlike in the joint method), the occurrence of X and Y is implied in the premise, but their non-occurrence is not mentioned. This omission is noteworthy, suggesting that Mill was not fully aware of these requirements for validity.
It should be said, too, that once the contingency of the theses is granted, a hypothetical proposition could be contraposited. That is, “if X, then Y” would imply “if not Y, then not X”; similarly, “if Y, then X” would imply “if not X, then not Y”. Thus, although the intent of Mill’s formula (judging by its title) was an inference of complete causation, strictly speaking his formula allows for one of necessary causation. That is, the valid conclusion from his premise is a disjunction of four possible conclusions.
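The contraposition step can be checked by brute force over the four combinations of presence and absence. The Python sketch below uses a truth-functional reading of “if, then”, which is a simplifying assumption: it does not model the contingency provisos discussed above, nor the stronger sense in which “if X, then Y” means “X is impossible without Y”. The helper name `implies` is hypothetical.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth-functional 'if p, then q': false only when p holds without q."""
    return (not p) or q


# All four combinations of X and Y being present or absent.
rows = list(product([True, False], repeat=2))

# 'If X, then Y' agrees with 'if not Y, then not X' in every row ...
x_to_y_contraposes = all(implies(x, y) == implies(not y, not x) for x, y in rows)
# ... and 'if Y, then X' agrees with 'if not X, then not Y' in every row.
y_to_x_contraposes = all(implies(y, x) == implies(not x, not y) for x, y in rows)

print(x_to_y_contraposes, y_to_x_contraposes)
```

On this reading, each of the two hypotheticals carries its contraposite with it, which is why a premise stated only in positive terms also licenses a conclusion phrased in terms of the negations.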
Thus, Mill’s formula leaves us uncertain, not only as to which item is the cause and which is the effect (as he admits), but also as to whether we are dealing with complete or necessary causation (which he fails to notice). One thing is sure, however: the conclusion is a strong determination. This is tacitly suggested by Mill in his use of the definite article “the” in “the cause” or in “the effect”. If he had had in mind weak determination (i.e. partial or contingent causation), he would have probably written “a cause” and “an effect”.
This brings us to Mill’s requirement that the instances where the phenomenon (X) occurs have “only one” circumstance (Y) in common, which he repeats when he says that the latter is “alone” that in which the instances agree. Why such exclusiveness? We have seen a similar, mystifying concern in Mill’s joint method. In the present case, again, Mill seems worried that there may be circumstances other than Y that will weaken the causative relation between Y and X; i.e. he is trying to preempt any possibility of partial (or contingent) causation.
In his mind’s eye, apparently, if some other circumstance (say, Z) was also (like Y) constantly conjoined with the phenomenon (X), a doubt would arise as to which of the two circumstances, Y or Z, caused X. But this is formally unjustified: the possible truth of “if Y, then X” would not be affected by the eventual truth of any other proposition like “if Z, then X”; if X, Y and Z are compatible, as our premise confirms, the two hypotheticals are quite compatible. Mill here again has apparently not considered the possibility of parallel causations or causative chains.
We might add that Mill’s attempt to limit the number of accompanying circumstances to just ‘one’ is ontologically open to doubt. Are there anywhere in the world two or more things (instances in which X occurs) having literally only ‘one’ circumstance (Y) in common? I very much doubt it! If there is such a set of things, it must be very exceptional. Most things have many (innumerable) common factors. There are always large predicates like existence, location in space and time, size, shape, etc. to consider, for a start.
Usually, when we say something so exclusive, we do not really mean it. For example, saying “the only similarity between these two individuals is their wealth” – we do not really mean to imply that the individuals do not both have a spinal cord, a heart, a brain, etc. Such misleading language is not accurate in scientific statements; at least, we should think twice before ever using it or taking it literally.
Let X be the phenomenon, and A be an instance in which it occurs and B be an instance in which it does not occur; and let Y be the only circumstance they do not have in common. Then, according to Mill:
Instance A has X and has Y; and
Instance B lacks X and lacks Y; and
(Instances A and B, have every other circumstance in common;)
Therefore: Y is the effect, or the cause (or an indispensable part of the cause), of X.
This was intended as an inductive argument, with two compound premises and a disjunctive conclusion (i.e. a set of three possible conclusions). As we shall demonstrate below, this argument is a rather gauche depiction of necessary causation. I have here put in brackets and in italics those parts of the premises (here treated as a third premise) and the conclusion that I consider mistaken, for reasons I shall presently discuss.
It should first be noted that Mill’s formulation does not make clear whether the presence of X is accompanied by the presence or absence of Y, and inversely what the absence of X is accompanied by. I have assumed symmetry, i.e. presence with presence, and absence with absence, in order that the conclusion be expressed wholly in positive terms. It is not a very important issue, but still a puzzling imprecision on Mill’s part.
Next, let us notice that Mill’s formula mentions only one instance (A) of X’s occurrence (presumably with Y) and only one instance (B) of X’s (and Y’s) non-occurrence – without this time in any way suggesting plurality, let alone universality. Mill’s wording as it stands does not exclude the possibility of some third instance where X occurs with not-Y, and of some fourth instance where not-X occurs with Y. In such cases, how would Mill dare claim a causative relation?
This is very intriguing: I find it hard to suppose that Mill considers that causation can be induced from single instances. One may from single occurrences deny that some causation is applicable, but one could in nowise affirm it. In order for the premises to allow the conclusion he proposes, we would have to replace “an instance” with “all (known) instances” in at least one of the premises. Causation is about patterns of conjunction, not about coincidences. Mere occasional agreement or difference does not establish a pattern.
One wonders what Mill possibly had in mind! (I suspect he had eaten or drunk too much the day he wrote this.)
Perhaps Mill considered the constancy of surrounding circumstances as the requisite pattern, somehow. Why does he at all refer to the two instances (A and B) having “every circumstance in common” save one? This is a redundancy: the very uniformity of surrounding circumstances makes them irrelevant. In any case, uniformity in only two instances is hardly significant.
I presume, here again, he imagined that if the surrounding circumstances had not been uniform, they would have somehow impinged on the causative relation between X and Y. For this reason, he insists on their distinctive uniformity whether X or not-X is the case. He is apparently not aware of the possibility of parallel causations or of causative chains.
In any case, there are always innumerable surrounding circumstances, behaving in quite random fashion, that are totally unconnected with the phenomena at hand; non-uniformity is not proof of causation. And moreover, the circumstances that are here uniform (in the instances A, B) might behave more erratically in other instances.
Is the exceptive (“save one”) clause in Mill’s formula, i.e. the contrasting behavior of Y, his main focus, perhaps? The given fact that one circumstance (Y) differs from all other circumstances in that it is uncommon, i.e. present in one instance (say, A) but absent in the other (say, B), just makes Y stand out from the rest; it does not signify a causative relation to X. This is all the more true when, as here, only a couple of instances are under consideration.
But finally, it occurs to me that there is one way we can at least in part redeem Mill’s statement. That is by supposing that, when he here referred to “an instance” he subconsciously had in mind “a kind of instance”! In that case, A and B are each a set of instances, corresponding respectively to the occurrences of X (with Y) and those of not-X (with not-Y). From these (experimentally) encountered instances, we may by generalization assume the same regularities hold universally.
Granting this supposition, and ignoring the extraneous mention of uniform surrounding conditions and insistence that Y be the only non-uniform circumstance, a causative relation between X and Y can indeed be inferred. However, in such case the premises and conclusion of this method would seem identical to those of the joint method! This is obviously not Mill’s intention.
Considering the title of the ‘method of difference’, we can safely suppose that it refers to something found in part in the ‘joint method of agreement and difference’ and not found in the ‘method of agreement’. Mill was apparently struggling to split necessary complete causation (the ‘joint method’) into its two components, complete causation (agreement) and necessary causation (difference). He managed to formulate the former, positive aspect readily enough, but had considerable trouble putting his finger on the latter, negative aspect.
A further confirmation of our supposition is to be found by comparison of the conclusions of the three methods. Note first that whereas the method of agreement concludes that Y is “the cause (or effect) of” X, the other two methods conclude in reverse order that Y is “the effect, or the cause… of” X. Moreover, the joint method and the method of difference, distinctively from the method of agreement, propose as an alternative conclusion that Y might be “an indispensable part of the cause” of X.
This latter possibility obviously refers to partial necessary causation, as earlier pointed out. “Indispensable” means that one cannot do without it, it is a sine qua non, a necessity; and “part of the cause” means a fraction of the sufficient cause. All this suggests that, in Mill’s mind, the causation found by the method of agreement is essentially positive and whole, whereas that found in the other two ways may be negative and fractional.
But since, as already said, the joint method and the method of difference cannot be identical, the latter must be assumed to focus on necessary causation only. We should, by combination of the methods of agreement and difference, arrive at the same result as with the joint method. So, our task is to isolate the ‘difference’ component (necessary causation) from the ‘agreement’ component (complete causation).
Mill might have achieved this by proposing some sort of negative ‘mirror image’ of his formula for the method of agreement, one about “two or more instances (A, B) in which the phenomenon (X) does not occur” having “the absence of one circumstance (Y) in common”. Some such more analogous statement could be constructed for the method of difference, but I will not even try, because of all the difficulties in the earlier statements already discussed.
Moreover, if we attempt such a reconstruction, we soon realize the title “method of difference” to be a misnomer, in view of the use of the term “agree” within Mill’s formula for the method of agreement. His method of difference is really just another application of the method of agreement, except that we focus in it on the absences, instead of presences, of the items (X, Y) concerned. “Difference” (i.e. disagreement) can only really be claimed in the joint method, where we switch from presence to absence or vice-versa. In this perspective, the titles ‘method of agreement of positives’ and ‘method of agreement of negatives’ might be more appropriate.
Whatever the name used for it, and the language used to formulate it, it is evident for reasons of symmetry that the method of difference aims at the negative aspect of causation, i.e. necessary causation. It follows that the premise(s) must be such that by generalization we can ideally conclude that “if not-X, then not-Y” or “if not-Y, then not-X”. This would in practice be based on observed constant conjunction between not-X and not-Y. The matter is that simple!
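The proposed division of labor can be sketched in Python: one check for the agreement of positives, one for the agreement of negatives, and the joint method as their combination. The function names follow the alternative titles suggested above, but the code itself is only an illustration under assumptions: instances are encoded as hypothetical (has_x, has_y) pairs, only the direction from X to Y is checked (the reverse would be handled symmetrically), and contingency is not verified.

```python
from typing import List, Tuple

Instance = Tuple[bool, bool]  # (has_x, has_y)


def agreement_of_positives(instances: List[Instance]) -> bool:
    """Every known instance of X has Y (supports 'if X, then Y')."""
    return all(has_y for has_x, has_y in instances if has_x)


def agreement_of_negatives(instances: List[Instance]) -> bool:
    """Every known instance of not-X has not-Y (supports 'if not-X, then not-Y')."""
    return all(not has_y for has_x, has_y in instances if not has_x)


def joint(instances: List[Instance]) -> bool:
    """Both patterns together: the premises of the joint method."""
    return agreement_of_positives(instances) and agreement_of_negatives(instances)


# Hypothetical data: X always comes with Y, but Y once occurs without X.
sample = [(True, True), (False, False), (False, True)]
print(agreement_of_positives(sample))
print(agreement_of_negatives(sample))
print(joint(sample))
```

In the sample, the positive pattern survives while the negative one fails, so the two components really are separable. Because `all` is true of an empty selection, either check passes trivially when no relevant instances were observed, which is one more reason the contingency of the items must be established separately.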
Mill realizes this at some level, but goes quite astray in his attempt to put it in words. His statement of the method of difference is incredibly garbled. He not only repeats some of the mistakes he made in formulating the preceding two methods, but also makes many more.
Before leaving this topic, it should be added that the said constant conjunction of negations only formally implies causation after generalization if the terms concerned are known contingent, i.e. if X is possible and Y is possible. Moreover, given such contingency, the inferred conditional propositions can be contraposited to “if Y, then X” and “if X, then Y”; so that strictly speaking, the conclusion formally allows for complete causation as well as necessary causation (whether of X by Y, or of Y by X).
Observed constant conjunction of negations does not, however, formally allow as alternative conclusion partial necessary causation – or for that matter, complete contingent causation or partial contingent causation. Mill’s proposition that Y may be “an indispensable part of the cause” of X is artificial and erroneous. Needless to say, reversing its direction would also be erroneous, as would inverting the polarities of the terms. Anyway, as already pointed out, Mill apparently completely misses out on the possibility of contingent causation.
I have already discussed the issue of partial causation with regard to the joint method, and will not repeat my comments here. These are commendable attempts by Mill to insert it in his analyses, but his approach so far is unequal to the task. He makes arbitrary claims in his conclusions, which are incompatible with his premises; and even supposing consistency, he provides no means to decide between his alternative conclusions. He does, however, offer some more precise means for identifying partial causes in his next method, that of ‘residues’.
Here, Mill is attempting to deal with partial causation. He is saying:
Suppose: D is a part of F; and E is the rest of F (i.e. D + E = F).
And suppose: A causes D (i.e. presumably, If A, then D, etc.)
It follows that: B causes E (i.e. presumably, If B, then E, etc.)
Note that a tacit assumption, here (suggested by the reference in the conclusion to “remaining” antecedents), which we can readily grant, is that A and B together (as C, say) cause F (the compound of D and E), i.e. that:
(A + B) = C; and C causes F (i.e. presumably, If C, then F, etc.)
Note also that I presume that the kind of causation by A of D, and by B of E, intended by Mill, is complete causation, i.e. a relation including positive implication by the cause of the effect (i.e. if the cause, then the effect), plus strictly speaking a negation of the inverse implication (i.e. if not the cause, not-then not the effect).
The causations mentioned and tacit in Mill’s statement are considered as already established, as he admits by saying “as is known by previous inductions”. The means of induction used is not specified; he presumably intends one of the other four ‘methods’ (probably the second). His formula is only intended to infer a causation from within other, given causations. This is a purely deductive argument.
Moreover, Mill appeals to the relation between whole and parts without really defining it. We could briefly express that relation by saying that D and E together imply and are implied by F. But to fully clarify this relation, we ought to mention that D without E or E without D, as well as not-D + not-E, amount to not-F. Similarly, with regard to A + B versus C.
Mill’s process of “subduction” is thus essentially based on the following reasoning:
If A+B (= C), then D+E (= F) – call this the major premise.
But: If A, then D – call this the minor premise.
Therefore, If B, then E – the putative conclusion.
This argument is, I hasten to add, formally invalid, although a common error of inference! This can be seen by splitting the major premise into the two hypotheticals:
If A + B, then D
If A + B, then E
Clearly, the minor premise “if A, then D” overrides the first proposition, “if A + B, then D”, which has the same consequent, showing the component “B” of the antecedent to be extraneous. However, the second proposition, “if A + B, then E”, whose consequent is different, is unaffected by the minor premise; i.e. its antecedent remains compound. We can, if we wish, “nest” this eduction, putting our result in the form “if A, then if B, then E”. But this inference still leaves “A” conditional.
Whence, Mill’s putative conclusion “if B, then E” is pretentious. The only way we could draw it would be to confirm “A” to be categorically true. It does not suffice to mention the element “A” conditionally, as in “if A, then D”. Thus, Mill’s present account of partial causation is not strictly correct.
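The invalidity claimed above can be checked mechanically. The following Python sketch is only an approximation: it reads “if… then” as material implication, which is a simplifying assumption of mine and not necessarily the stricter conditional intended in this essay; the function names are likewise my own. It enumerates all truth-value assignments to A, B, D, E and collects those in which both premises hold while the putative conclusion fails.

```python
from itertools import product


def implies(p, q):
    """Material implication, used here as a simplified 'if p, then q'."""
    return (not p) or q


def counterexamples(extra_premise=lambda a, b, d, e: True):
    """Assignments (A, B, D, E) where the premises hold but 'if B, then E' fails."""
    found = []
    for a, b, d, e in product([True, False], repeat=4):
        major = implies(a and b, d and e)   # If A+B, then D+E
        minor = implies(a, d)               # If A, then D
        conclusion = implies(b, e)          # If B, then E
        if major and minor and extra_premise(a, b, d, e) and not conclusion:
            found.append((a, b, d, e))
    return found


# Without asserting A, counterexamples exist: the argument is invalid.
print(counterexamples())
# With A affirmed categorically, none remain.
print(counterexamples(lambda a, b, d, e: a))  # []
```

Every counterexample found has A false, B true and E false, which agrees with the remark that the conclusion could only be drawn if “A” were confirmed categorically.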
Partial causation can readily be defined and in practice identified, but the appropriate formula for it is a bit more complicated than Mill suggests. It requires a more radical understanding and more systematic treatment of causation. There is no need to go into it here, since I treat it in detail in my main text on the subject.
The notion of a “residue” (or remainder or leftover) is a mathematical one, rooted in the relation of whole and part: if you have a basket with three fruits and you remove one, you still have two left. A similar idea can be used in causation – but only to say:
If one of the partial causes is found to be present, then we can anticipate that as soon as the remaining partial causes are also found to be present, all their collective effects will follow on their heels.
Mill’s ‘method of residues’ subconsciously appeals to this obvious truth. But he confuses the issue, when he considers that things (like D) that the present phenomenon (A) causes by itself (i.e. things it alone suffices to bring about) can be counted as among the collective effects (like E) of all the causal phenomena under consideration (A and B). This is his essential error. Here again (as with his previous attempts to infer partial causations), his premises and conclusion are not consistent with each other.
Note finally that Mill’s language is positive, suggesting that he had in mind specifically partial causation. Here again, as in the preceding methods, he does not apparently consider the other form of weak causation, that involving negative theses, viz. contingent causation.
Moreover, even supposing that Mill had successfully identified partial causatives, he does not here specify that such causes might be necessary or contingent. Perhaps, having spoken (although out of place) about necessary partial causation in the joint method and the method of difference (mentioning “an indispensable part of the cause”), he might be supposed here to be focusing on contingent partial causation. But this would be reading into Mill’s treatment something he has given no sign he has awareness of.
One more point worth adding, concerning the appeal to “residues” in reasoning about causes. There is indeed a method that can be so named, one commonly used by scientists and ordinary thinkers. This method was known to Francis Bacon already, long before Mill. It consists simply of disjunctive apodosis – i.e. the gradual elimination of alternative hypotheses. Such reasoning has the form:
Either P or Q or R or… is the cause of S;
these (P, Q, R,…) are all the conceivable causes of S.
The cause of S is not …; and it is not R; and it is not Q.
Therefore, the cause of S must be P (i.e. the only remaining alternative).
For example, Sherlock Holmes might say: “the culprit is either Jack or Jill; it can’t be Jack, since he has an alibi; therefore, it has to be Jill.”
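The form of this argument can be sketched in a few lines of Python. The function name is an assumption of mine, and the sketch takes for granted the crucial premise that the list of candidates is exhaustive, something no program can verify for us.

```python
def eliminate(candidates, ruled_out):
    """Disjunctive apodosis: return the sole remaining candidate, if there is one.

    Assumes 'candidates' lists ALL conceivable causes; if that premise is
    false, the conclusion does not follow.
    """
    remaining = [c for c in candidates if c not in ruled_out]
    if len(remaining) == 1:
        return remaining[0]
    return None  # zero or several candidates remain: no conclusion yet


print(eliminate(["Jack", "Jill"], {"Jack"}))  # Jill
print(eliminate(["P", "Q", "R"], {"R"}))      # None
```

Note that the inference yields a result only when exactly one alternative survives; so long as two or more remain, the elimination must continue.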
Let X be “whatever phenomenon”, and Y be “another phenomenon”; let X1, X2, X3… be variants of X, and Y1, Y2, Y3… be corresponding variants of Y. Then:
· Whenever Y varies from Y1 to Y2, X varies from X1 to X2;
· Whenever Y varies from Y2 to Y3, X varies from X2 to X3;
· therefore, X is “either a cause or an effect of” Y, “or is connected with it through some fact of causation”.
Notice Mill’s use of “whenever”: he is correctly referring to unvarying relations, not mere random coincidences. That is, we may suppose he was implying that if the variations of the two phenomena (the kinds of events we labeled X and Y) are not concomitant, they may be assumed independent of each other.
Mill does not explicitly tell us what degree of causation may be inferred – whether complete and necessary, or only the one or the other, or neither. He is seemingly open to all possibilities, since he vaguely mentions that “some fact of causation” may in some cases be the best conclusion we can draw. Granting this phrase refers to the weaker determinations, we may suppose that when he refers to “a cause or an effect” he means a stronger determination. However, since he here uses the indefinite article “a”, instead of his usual definite article “the”, this supposition is debatable. In sum, Mill concludes some sort of causation to be inferable, but is vague as to which sort and when.
Whereas the first three methods concern changes from presence to absence or vice versa, in concomitant variations every incremental change in the measure or degree of the cause is accompanied by a corresponding incremental change in the measure or degree of the effect, and/or vice versa. In some cases, the correspondences between two phenomena are in this way very regular; but in other cases, additional phenomena have to be taken into consideration to clarify the more complex relationship involved.
In any case, the fact of concomitant variation may be considered an ontological derivative of that of causation, dealing with quantitative instead of merely qualitative relationships between two or more phenomena. That is, in Mill’s terms, this fifth method is a corollary, or frequent further development, of the preceding four.
In the case of ‘agreement’ (interpreted as complete causation), we would expect changes in the cause to be invariably followed by concomitant variations in the effect. In the case of ‘difference’ (interpreted as necessary causation), we would expect changes in the effect indicative of predictable concomitant variations in the cause. In the ‘joint’ case (i.e. the strongest possible causative relation), both these directions of inference would be applicable.
Note that these alternatives are not made clear in Mill’s formula, where X’s variations follow Y’s variations, yet X is concluded to be “either a cause or an effect” of Y. Given regular variation of X with Y, the more probable conclusion would be that Y is a complete cause of X; although a second possible conclusion would be that X is a necessary cause of Y. Mill presumably does not mention the reverse case, where Y varies with X, simply because he considers that in such a case we would just place each term in the other’s position in his formula. Fair enough, but then he should at least have mentioned in his formula that in some cases variations are concomitant in one direction only, and in others in both directions!
In the last case (‘residues’ – interpreted as partial and/or contingent causation), we would have to use a more elaborate technique: to identify and monitor all the factors involved, and observe how and how much each varies with whatever changes occur, or are experimentally produced, in the other factors. This is generally achieved using the cunning method of “keeping all other things equal” while investigating just two factors at a time, until all the factors have successively been played off against each other and we obtain a full picture of their multilateral quantitative relationship.
In my view, Mill should have mentioned all that explicitly in his formula. It is reasonable to assume he knew it, since the method was oft used in scientific experiments in his day. Why didn’t he, then? Let us go on, anyway, and analyze these matters a bit more.
We can theoretically express concomitant variations by means of series of causal propositions, either through statements mentioning changes (as above initially done) or more radically through statements mentioning states, like:
If A=A1, then B=B1;
if A=A2, then B=B2; etc.
However, often in practice, these innumerable, point-by-point correlations between various quantities are plotted on a graph and then summarized in a mathematical equation. For example, if B is directly proportional to A, we would write (where k is some constant):
B = kA.
Actually, we have to be careful in this matter, because such a mathematical equation implies/presupposes fully convertible relations. Thus, the following would also have to be true:
If A=A1, then B=B1; if B=B1, then A=A1; where B1=kA1.
If A=A2, then B=B2; if B=B2, then A=A2; where B2=kA2.
This does not have to imply the causation involved to be reversible, only that A be a complete and necessary cause of B. Thus, while in common language we can readily express concomitant variation between a merely complete cause and its effect (or conceivably between a merely necessary cause and its effect) – in the language of mathematical equations, necessary as well as complete causation is implied (although, I believe, modern mathematics can readily overcome this difficulty).
Note well that if we just say “If A=A1, then B=B1”, it does not exclude that for another value of A (say, A8), B may have the same value (B1) again. Such reiterations of value will translate mathematically into more complex formulas than mere proportionality.
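The distinction just drawn, between one-directional and fully convertible correlations, can be illustrated with a small Python sketch. The function names and the sample figures are assumptions of mine, chosen only for illustration. The first function checks that each value of A is paired with a single value of B; the second checks that the same also holds in the reverse direction.

```python
def is_functional(pairs):
    """True if each first value is always paired with the same second value."""
    seen = {}
    for a, b in pairs:
        # setdefault records b on first sight of a, else returns the earlier value.
        if seen.setdefault(a, b) != b:
            return False
    return True


def is_convertible(pairs):
    """True if the correlation holds in both directions (A to B and B to A)."""
    return is_functional(pairs) and is_functional([(b, a) for a, b in pairs])


proportional = [(1, 2), (2, 4), (3, 6)]        # B = 2A
reiterating = [(-2, 4), (-1, 1), (1, 1), (2, 4)]  # B = A squared

print(is_functional(proportional), is_convertible(proportional))  # True True
print(is_functional(reiterating), is_convertible(reiterating))    # True False
```

In the second sample, the value of B can be inferred from that of A but not vice versa, since the same value of B recurs for different values of A.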
More complex relationships may, but do not in all cases, signify partial and/or contingent causation, involving more than two items (at least two causes and one effect). Thus, note well, Mill’s statement of this method need not be limited to two variables; he presumably had this in mind when he wrote the alternative conclusion “or is connected with it through some fact of causation”.
Note, finally: the idea of comparing variations between two or more variables was proposed long before Mill, by Francis Bacon.
John Stuart Mill (1806-73) was an English philosopher, a highly educated man whose interests ranged very widely, including all aspects of logic. He published the work in which he presents the above ‘methods of experimental inquiry’, A System of Logic, when he was 37. He sought a pragmatic, empiricist, inductive approach to knowledge; an updated logic, but one that would “supplement and not supersede” Aristotle’s.
Mill’s five methods have generally been well received, and I acknowledge them as having been an inspiration to me. However, as the above analysis shows, though his intentions were laudable, his performance was often woefully inadequate. I take no pleasure in saying this; but I am somewhat consoled by the knowledge that others have before me also sharply criticized him.
If these methods had been developed before the dawn of modern science – say before the publication of Isaac Newton’s Principia (1687) – I would have congratulated their author for having provided researchers with potentially valuable cognitive tools. But Mill’s work is dated 1843 – almost the mid-19th Century!
At that late date in modern science and philosophy, one could no longer discover these research tools, but one could at least give an ex post facto exposé and validation of them. Mill’s effort in that direction was, in the last analysis, surprisingly confused, considering his broad knowledge of science and philosophy till his day.
As we have seen, Mill’s methods could just as well be characterized as techniques ‘for identifying causation’, because that is the form of their conclusions; and also, because experimental data is not essential to them, i.e. they can be applied as well to passive observations. His method of residues, unlike the others, is deductive rather than inductive. Whether this list of methods, without regard to its internal imperfections, constitutes an exhaustive summary of actual scientific techniques is open to debate.
What is clear, anyhow, is that Mill did not fully understand the relations of causation. Flaws are evident in his treatment of each of his five methods. Briefly put:
- In the ‘joint method’, he seemingly tries and succeeds in defining or identifying the paradigm of causation, complete necessary causation. However, his understanding is put in doubt by his mention of irrelevant conditions (exclusiveness of the circumstances) in the premises, and his drawing of an alternative conclusion (“an indispensable part of the cause”) logically contrary to the given premises.
- In the ‘agreement method’, he seemingly tries and succeeds in defining or identifying complete causation. However, his understanding is put in doubt by his mention of irrelevant conditions (exclusiveness) in the premises, and his failure to specify the unnecessity of the theses (as needed to infer causation from constant conjunction).
- In the ‘difference method’, he seemingly tries but quite fails to define or identify necessary causation. His understanding is put in doubt by his appeal to single instances (instead of kinds), his mention of extraneous conditions (the uniformity of surrounding circumstances), and his drawing of an alternative conclusion (“an indispensable part of the cause”) logically contrary to the seemingly intended premises.
- In the ‘residues method’, he seemingly tries to deduce a partial causation from two complete causations. His understanding is here again put in doubt, by his proposing conflicting premises (the same thing cannot be both a complete and a partial cause of a given phenomenon), and his suggesting an excessive conclusion (i.e. more than the givens allow).
- In the ‘concomitant variations method’, he seemingly tries and vaguely succeeds in defining or identifying the quantitative aspect of causation. This is logically his soundest method, but he fails to mention and distinguish the various degrees of causation that may be involved.
Mill obviously had difficulty with the concept of plurality of causes; i.e. distinguishing between parallelism and composition of causes. The inclusion of redundancies concerning surrounding circumstances in some of his statements indicates that he did not have an entirely accurate picture of causation. His resort to seemingly last minute inserts at the tail end of certain conclusions leads to the same suspicion. Moreover, in none of the five methods does he so much as hint he has heard of contingent causation.
Mill’s first four methods may be taken to essentially refer to the causative forms mn, m, n, and p, respectively. The first and third methods mention the specific determination np, but give us no clue as to how such causation might be established, i.e. concluded rather than mn or n, respectively. Since he apparently ignores the generic determination q, he misses the specific determinations mq and pq. His treatment is thus neither symmetrical nor exhaustive.
I should also point out that Mill does not clearly distinguish between generic and specific determinations. I assume he does not intend the generic determinations that he separately as well as jointly affirms (namely: m, n) as absolute lone determinations; but the issue is not to my knowledge explicitly raised by him so we cannot be sure what he imagined.
As we have seen, Mill’s formulations are open to further criticism. His language is often ambiguous and its intent difficult to fathom. He did not always manage to capture in words what he was trying to say. His logic is in places dubious, if not downright self-contradictory. He may propose mutually incoherent premises and/or conclusions that contradict explicit premises.
The main reason for the weaknesses in Mill’s treatment is perhaps his attempt to deal with definition and induction simultaneously. He would have been more successful if he had, more systematically, first defined the various forms of causation (ratio essendi) and then investigated how their contents may be induced (ratio cognoscendi). Perhaps due to his association with the Utilitarian school of philosophy, he was ideologically inclined towards a rather heuristic approach, eschewing a more theoretical treatment of causation.
The logician’s main task is to describe and validate forms of reasoning. While Mill took some pains to describe causal arguments, he made little effort to validate them. At times, his treatment seems like a sham – not out of malice, but due to negligence. He does not seem to intentionally lie (as some do); but one gets the impression he has not really done his best to do a good job, and he does not expect anyone to notice or care.
The logician’s role is also to provide methodological aids for scientists, students, and indeed thinking people in general. Whether Mill’s contributions to causal logic ever actually affected anyone’s investigation of nature in a positive or negative way is hard to say. Nevertheless, some of his thoughts on the subject were misleading, and the fact should be made public.
This is all very disappointing, considering J. S. Mill’s status in British intellectual history. How could a man of his social standing and educational caliber have made such mistakes, and moreover gotten away with them, one wonders.
After all, causation and its varieties were pretty well known to the ancients; this is even evident in commonly used Latin legal terms, like causa sufficiens or sine qua non. And Mill was very well read in ancient thought; he was brought up with it by his father, James.
Major British philosophers had already discussed causality at considerable length. John Locke (1632-1704), in An Essay Concerning Human Understanding (1690), put forward a theory of induction based on regularities of sequence between phenomena. David Hume (1711-1776), for all his avowed skepticism in An Enquiry Concerning Human Understanding (1748), had clearly expounded constant conjunction. Mill’s views about causation were frankly influenced by Hume’s.
Most shocking is the realization that Mill’s logical treatise (1843) was published 223 years after the founding father of British Empiricism, Francis Bacon (1561-1626), published his Novum Organum (1620). Mill was aware of Bacon’s work, too, since he (rightly) criticized Bacon’s view of causation as simplistic in various respects. But he manifestly failed to notice and learn the important lessons taught by this unsung (or insufficiently sung) hero of the modern scientific method; namely, Bacon’s programme of adduction and matricial analysis (to use my terminology).
It suffices to quote the Encyclopaedia Britannica (2004) description of Bacon’s “new method” to make this failure of Mill’s clear:
The crucial point, Bacon realized, is that induction must work by elimination not, as it does in common life and the defective scientific tradition, by simple enumeration. Thus he stressed “the greater force of the negative instance”—the fact that while “all A are B” is only very weakly confirmed by “this A is B,” it is shown conclusively to be false by “this A is not B.” He devised tables, or formal devices for the presentation of singular pieces of evidence, in order to facilitate the rapid discovery of false generalizations. What survives this eliminative screening, Bacon assumes, may be taken to be true.
Bacon presents tables of presence, of absence, and of degree. Tables of presence contain a collection of cases in which one specified property is found. They are then compared to each other to see what other properties are always present. Any property not present in just one case in such a collection cannot be a necessary condition of the property being investigated. Second, there are tables of absence, which list cases that are as alike as possible to the cases in the tables of presence except for the property under investigation. Any property that is found in the second case cannot be a sufficient condition of the original property. Finally, in tables of degree proportionate variations of two properties are compared to see if the proportion is maintained.
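The eliminative use of the first two tables, as described in the passage just quoted, can be sketched in Python. This is only my own illustrative reading of that description: the function names are assumptions, each case is represented as a set of property names, and the sample data is invented.

```python
def surviving_necessary(presence_cases, candidates):
    """Table of presence: a candidate missing from any case where the
    investigated property occurs cannot be a necessary condition of it."""
    return {p for p in candidates if all(p in case for case in presence_cases)}


def surviving_sufficient(absence_cases, candidates):
    """Table of absence: a candidate found in any case where the
    investigated property is lacking cannot be a sufficient condition of it."""
    return {p for p in candidates if all(p not in case for case in absence_cases)}


# Invented data: other properties noted in cases with and without the
# property under investigation.
presence = [{"light", "motion"}, {"motion"}]
absence = [{"light"}]
candidates = {"light", "motion"}

print(surviving_necessary(presence, candidates))   # {'motion'}
print(surviving_sufficient(absence, candidates))   # {'motion'}
```

What survives both screenings is, in Bacon’s approach, only provisionally retained; a later case may still eliminate it.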
 Literally: ‘standing around’ – suggesting something found to accompany the object in some way, a condition or situation.
 Even a distinguishable individual may be treated as a “kind”: a man, say Aristotle, is in this sense the sum of all moments of his life; i.e. Aristotle is the class of Aristotle today, the same yesterday, etc.
 This is at least true in the logical mode of conditioning, where hypothetical propositions may be true in cases where one or both of the theses are necessary. Note that in “if X, then Y”, X and Y are both implied possible anyway, and all we need to add is that Y is unnecessary, for the unnecessity of X then formally follows; similarly, “if Y, then X” only requires addition that X is unnecessary, to infer causation. In the natural or extensional modes, the issue does not arise, because unless both antecedent and consequent are contingent, we would not be formally allowed to construct a conditional proposition let alone infer causation. This may be highlighted using categoricals: “All X are Y” is equivalent to “No X is not-Y”; similarly, for “All not-X are not-Y” and “No not-X is Y”.
 This would be a typical case of the fallacy, known already to Aristotle, post hoc ergo propter hoc. An example of it would be racist “reasoning”.
 Note in passing that “every” implies general knowledge – which is empirically impossible without generalization (except with regard to finite sets). We can never in practice be sure to have identified all existing circumstances; and though we may assume we have done so, as a working hypothesis, we have to remain vigilant and continue to look for still unidentified factors that might also be relevant.
 We could reexamine the whole argument, based on the opposite assumption, that necessary causation is intended throughout this argument. But I anticipate the overall result would be the same, for the underlying process of subduction is the basic issue at stake. It is just as erroneous if the elements we focus on are of negative polarity, i.e. not-A, not-B, etc.