Creative Processes in Scientific Discovery

Lorenzo Magnani

Introduction

Philosophers of science in the twentieth century have distinguished between the logic of discovery and the logic of justification. Most have concluded that there is no logic of discovery and, moreover, that a rational model of discovery is impossible: scientific discovery is irrational, and there is no reasoning to hypotheses.

At present, computational models of scientific discovery and theory formation play a prominent role in illuminating transformations of rational conceptual systems. I propose a high-level dynamic design able to account for the epistemological features and constraints of these conceptual transformations. Hence, I propose a new abstraction paradigm that aims to unify the different perspectives and to provide some design insight for future ones.

My aim is to emphasize the significance of abduction in order to illustrate the problem-solving process and to propose a unified epistemological model of scientific discovery. This model first describes four meanings of the word abduction (creative, selective, to the best explanation, visual) in order to clarify their significance in epistemology, psychological experimental research and artificial intelligence (AI). In my opinion, the controversial status of abduction, a very important topic concerning creative reasoning, is related to a confusion between the epistemological and cognitive levels, and to a lack of explanation as to why people sometimes deviate from normative epistemological principles. Moreover, among the different abductive roles played by various kinds of conceptual transformations in developing scientific creative reasoning, such as radical ontological change, conceptual combination, analogical and visual thinking, anomaly resolution, and thought experiment, I will consider how the use of visual mental imagery in thinking may be relevant to scientific hypothesis generation.

The epistemology of abduction

Let’s consider the following interesting passage, from an article by Simon from 1965, published in the British Journal for the Philosophy of Science (Simon, 1965) and dealing with the logic of normative theories: «The problem-solving process is not a process of ‘deducing’ one set of imperatives (the performance programme) from another set (the goals). Instead, it is a process of selective trial and error, using heuristic rules derived from previous experience, that is sometimes successful in discovering means that are more or less efficacious in attaining some end. It is legitimate to regard the imperatives embodying the means as ‘derived’ in some sense from the imperatives embodying the ends; but the process of derivation is not a deductive process, it is one of discovery. If we want a name for it, we can appropriately use the name coined by Peirce and revived recently by Norwood Hanson [1958]: it is a retroductive process. The nature of this process - which has been sketched roughly here - is the main subject of the theory of problem solving in both its positive and normative versions» (Simon, 1977, p. 151). Simon states that discovering means that are more or less efficacious in attaining their objective is a retroductive process. He goes on to show that it is easy to obtain one set of imperatives from another set by processes of discovery or retroduction, and that the relation between the initial set and the derived set is not a relation of logical implication.

The word «retroduction» used by Simon is the Hansonian neopositivistic one replacing the Peircian classical word abduction: they have the same epistemological and philosophical meaning. I completely agree with Simon: abduction is the main subject of the theory of problem solving and developments in the fields of cognitive science and artificial intelligence have strengthened this conviction.

As Fetzer has recently stressed, from a philosophical point of view the main modes of argumentation for reasoning from sentential premises to sentential conclusions are expressed by these three general attributes: deductive (demonstrative, non-ampliative, additive), inductive (non-demonstrative, ampliative, non-additive), fallacious (neither, irrelevant, ambiguous). Abduction, which expresses likelihood in reasoning, is a typical form of fallacious inference: «[it] is a matter of utilizing the principle of maximum likelihood in order to formalize a pattern of reasoning known as ‘inference to the best explanation’» (Fetzer, 1990, p. 103). A hundred years ago, Peirce (1955) was also studying and debating these three main types of inference.

The type of inference called abduction was studied by Aristotelian syllogistics, as a form of ἀπαγωγή (apagōgē), and later on by mediaeval reworkers of syllogism. In the last century abduction was once again studied closely, by Peirce (Peirce, 1931-1958). Peirce interpreted abduction essentially as a creative process of generating a new hypothesis. Abduction and induction, viewed together as processes of production and generalization of new hypotheses, are sometimes called reduction, that is, ἀπαγωγή. As Lukasiewicz (1970, p. 7) makes clear, «Reasoning which starts from reasons and looks for consequences is called deduction; that which starts from consequences and looks for reasons is called reduction».

To illustrate from the field of medical knowledge (Magnani, 1992), the discovery of a new disease and the definition of the manifestations it causes can be considered as the result of the creative abductive inference previously described. Therefore, creative abduction deals with the whole field of the growth of scientific knowledge. However, this is irrelevant in medical diagnosis, where instead the task is to select, from an encyclopedia of pre-stored diagnostic entities, diseases, and pathophysiologic states, the ones that can be made to account for the patient’s condition. On the other hand, diagnostic reasoning also involves abductive steps, but its creativity is much weaker: it usually requires the selection of a diagnostic hypothesis from a set of pre-enumerated hypotheses provided by established medical knowledge. Thus, this type of abduction can be called selective abduction (Magnani, 1988). Selective abduction implies uncertainty and corresponds to the heuristic classification problem-solving model proposed by Clancey (1985); it deals with a kind of rediscovery, instead of a genuine discovery.

Induction in its widest sense is an ampliative process of the generalization of knowledge. Peirce distinguished three types of induction and the first was further divided into three sub-types. A common feature of all kinds of induction is the ability to compare individual statements: using induction it is possible to synthesize individual statements into general laws (types I and II), but it is also possible to confirm or discount hypotheses (type III). Clearly we are referring here to the latter type of induction, that in my model is used as the process of reducing the uncertainty of established hypotheses by comparing their consequences with observed facts.

Deduction is an inference that refers to a logical implication. Deduction may be distinguished from abduction and induction on the grounds that only in deduction is the truth of inference guaranteed by the truth of the premises on which it is based.

Thus, selective abduction is the making of a preliminary guess that introduces a set of plausible hypotheses, followed by deduction to explore their consequences, and by induction to test them with available data: 1) to increase the likelihood of a hypothesis by noting evidence explained by that one, rather than by competing hypotheses; or 2) to refute all but one.

If during this first cycle new information emerges, hypotheses not previously considered can be suggested and a new cycle takes place: in this case the nonmonotonic character of abductive reasoning is clear.
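The abduction-deduction-induction cycle just described can be given a minimal computational reading. The sketch below is purely illustrative: the `Hypothesis` structure and the three functions are my hypothetical simplifications, not part of any implemented system. Selective abduction proposes pre-stored hypotheses that explain part of the observations, deduction unfolds their consequences, and induction retains the hypotheses whose consequences are borne out by the data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains: frozenset      # observations this hypothesis accounts for
    consequences: frozenset  # testable predictions it entails

def abduce(observations, knowledge):
    # Selective abduction: introduce every pre-stored hypothesis that
    # explains at least part of the observations.
    return [h for h in knowledge if h.explains & observations]

def deduce(h):
    # Deduction: unfold the consequences of a hypothesis.
    return h.consequences

def induce(candidates, data):
    # Induction (type III): retain hypotheses whose consequences
    # are consistent with the available data.
    return [h for h in candidates if deduce(h) <= data]

def cycle(observations, knowledge, data):
    # One abduction-deduction-induction cycle; if new information
    # emerges, the caller repeats the cycle (hence the nonmonotonic
    # character of abductive reasoning).
    return induce(abduce(observations, knowledge), data)
```

On this toy reading, re-running `cycle` with enlarged `observations` or `data` may suggest hypotheses not previously considered, which is exactly the nonmonotonic behaviour noted above.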

There are two main epistemological meanings of the word abduction: 1) abduction that only generates plausible hypotheses (selective or creative) - and this is the meaning of abduction accepted in my epistemological model - and 2) abduction considered as inference to the best explanation, which also evaluates hypotheses. In the latter sense the classical meaning of abduction as inference to the best explanation (for instance in medicine, to the best diagnosis) is described in my epistemological model by the complete abduction-deduction-induction cycle. All we can expect of my «selective» abduction is that it tends to produce hypotheses that have some chance of turning out to be the best explanation. Selective abduction will always produce hypotheses that give at least a partial explanation and therefore have a small amount of initial plausibility. In this respect abduction is more efficacious than the blind generation of hypotheses. Visual abduction, a special form of abduction, occurs when hypotheses are automatically derived from a stored series of previous similar experiences. In this case there is no uncertainty (see below).

To achieve the best explanation, it is necessary to have a set of criteria for evaluating the competing explanatory hypotheses reached by creative or selective abduction. Evaluation has a multi-dimensional character. Consilience (Thagard, 1988) can measure how much a hypothesis explains, so it can be used to determine whether one hypothesis explains more of the evidence (for instance, empirical or patient data) than another: thus, it deals with a form of corroboration. In this way a hypothesis is considered more consilient than another if it explains more «important» (as opposed to «trivial») data than the others do. In inferring the best explanation, the aim is not the sheer amount of data explained, but its relative significance. The assessment of relative importance presupposes that an inquirer has a rich background knowledge about the kinds of criteria that concern the data. Simplicity too can be highly relevant when discriminating between competing explanatory hypotheses; it deals with the problem of the level of conceptual complexity of hypotheses when their consiliences are equal. Explanatory criteria are needed because the rejection of a hypothesis requires demonstrating that a competing hypothesis provides a better explanation. Clearly, in some cases conclusions are reached according to rational criteria such as consilience or simplicity. Nevertheless, in reasoning to the best explanation, motivational, ethical or pragmatic criteria cannot be discounted. Indeed the context suggests that they are unavoidable: this is especially true in medical reasoning (for instance, in therapy planning), but scientists who must discriminate between competing scientific hypotheses or competing scientific theories are sometimes also subject to motivational biases in their inferences to the best explanation.
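These evaluation criteria lend themselves to a simple computational sketch. In the hypothetical fragment below, each datum carries an importance weight, consilience sums the weights of the explained data, and simplicity (a lower complexity count) breaks ties between equally consilient hypotheses; all names and numbers are illustrative assumptions of mine, not a published scoring scheme.

```python
def consilience(hypothesis, importance):
    # How much a hypothesis explains, weighted by the relative
    # significance of each datum («important» vs. «trivial»).
    return sum(w for datum, w in importance.items()
               if datum in hypothesis["explains"])

def best_explanation(hypotheses, importance):
    # Rank primarily by consilience; when consiliences are equal,
    # prefer the hypothesis with lower conceptual complexity.
    return max(hypotheses,
               key=lambda h: (consilience(h, importance), -h["complexity"]))
```

Note that the ranking compares relative significance, not the sheer number of explained data: a hypothesis explaining one heavily weighted anomaly outranks one explaining several trivial observations.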

Scientific Theory Change

My epistemological model should be considered as an illustration of scientific theory change: in this case selective abduction is replaced by creative abduction and there is a set of competing theories instead of diagnostic hypotheses. Furthermore, the language of background scientific knowledge is to be regarded as open: in the case of competing theories, as they are studied by the epistemology of theory change, we cannot - contrary to Popper’s (1970) viewpoint - reject a theory merely because it fails occasionally. If it is simpler and explains more significant data than its competitors, such a theory can be accepted as the best explanation.

Nevertheless, if we consider the epistemological model as an illustration of medical diagnostic reasoning, the modus tollens is very efficacious because of the fixedness of language that expresses the background medical knowledge: a hypothesis that fails can nearly always be rejected immediately.

When Buchanan illustrates the old epistemological method of induction by elimination - and its computational meaning, as a model of the «heuristic search» - , first advanced by Bacon and Hooke and developed later on by J. Stuart Mill, he is referring implicitly to my epistemological framework in terms of abduction, deduction and induction, as illustrative of medical diagnostic reasoning: «The method of systematic exploration is [...] very like the old method of induction by elimination. Solutions to problems can be found and proved correct, in this view, by enumerating possible solutions and refuting all but one. Obviously the method is used frequently in contemporary science and medicine, and is as powerful as the generator of possibilities. According to Laudan, however, the method of proof by eliminative induction, advanced by Bacon and Hooke, was dropped after Condillac, Newton, and LeSage argued successfully that it is impossible to enumerate exhaustively all the hypotheses that could conceivably explain a set of events. The force of the refutation lies in the open-endedness of the language of science. Within a fixed language the method reduces to modus tollens [...]. The computational method known as heuristic search is in some sense a revival of those old ideas of induction by elimination, but with machine methods of generation and search substituted for exhaustive enumeration. Instead of enumerating all sentences in the language of science and trying each one in turn, a computer program can use heuristics enabling it to discard large classes of hypotheses and search only a small number of remaining possibilities» (Buchanan, 1985, pp. 97-98).
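Buchanan's contrast between exhaustive enumeration and heuristic search can be sketched as follows. The fragment is a hypothetical illustration: the class names, the pruning heuristic and the refutation test stand in for the domain knowledge a real program would need. Heuristics discard whole classes of hypotheses at once, and modus tollens then eliminates the remaining candidates one by one.

```python
def eliminative_search(hypothesis_classes, prune_class, refuted):
    # Heuristic search as a revival of induction by elimination:
    # instead of enumerating every hypothesis in the language, a
    # heuristic discards large classes outright, and refutation
    # (modus tollens) eliminates the surviving candidates.
    survivors = []
    for cls, members in hypothesis_classes.items():
        if prune_class(cls):          # discard the whole class
            continue
        survivors.extend(h for h in members if not refuted(h))
    return survivors
```

In a fixed language, as Buchanan notes, the method reduces to modus tollens over a small residue of possibilities rather than over all sentences of the language of science.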

Visual hypotheses

We should remember, as Peirce noted, that abduction plays a role even in relatively simple visual phenomena. «Visual abduction» (Magnani et al., 1994), a special form of abduction, occurs when hypotheses are instantly derived from a stored series of previous similar experiences. In this case there is no uncertainty. It covers a mental procedure that tapers into a non-inferential one, and falls into the category called «perception» (Anderson, 1987, pp. 38-44). Philosophically, perception is viewed by Peirce as a fast and uncontrolled knowledge-production procedure. Perception, in fact, is a vehicle for the instantaneous retrieval of knowledge that was previously structured in our mind through inferential processes. By perception, knowledge constructions are so instantly reorganized that they become habitual and diffuse and do not need any further testing. Many visual stimuli are ambiguous, yet people are adept at imposing order on them: «We readily form such hypotheses as that an obscurely seen face belongs to a friend of ours, because we can thereby explain what has been observed» (Thagard, 1988, p. 53). This kind of image-based hypothesis formation can be considered as a form of visual abduction.

My general objective is to consider how the use of visual mental imagery in thinking may be relevant to hypothesis generation. There has been little research into the possibility of visual and imagery representations of hypotheses, despite abundant reports (e.g. by Einstein and Faraday) that imaging is crucial to scientific discovery (Holton, 1972; Miller, 1989; Nersessian, 1994; Tweney, 1989). Some hypotheses naturally take a pictorial form: the hypothesis that the earth has a molten core might be better represented by a picture that shows solid material surrounding the core.

I will now discuss, from a computational perspective, a visual abductive problem solving strategy. To provide manageable bounds to my very general objective, i.e. to analyze the role of visual hypothesis generation, which is so crucial to scientific discovery, I have initially limited myself to the subtask of illustrating some structurally similar examples from the field of common sense reasoning, where it is very easy to find many cases dealing with what we have just called visual abduction.

Although there is considerable agreement concerning the existence of a high-level visual and spatial medium of thought as a mechanism relevant to abductive (selective and creative) hypothesis generation (Kosslyn, 1980; Kosslyn, 1983; Kosslyn & Koenig, 1992; Tye, 1991), the underlying cognitive processes involved are still not well understood.

Let us consider the following preliminary cognitive case: many visual stimuli are ambiguous, yet people are adept at imposing order on them. As stated above, this is the case when we readily form hypotheses such as that an obscurely seen face belongs to a friend of ours, because we can explain what has been observed (Peirce’s visual abduction is related to this example).

More generally, we may face an initial, possibly observed, image in which we recognize a problem to solve. For example, given a visual or imagery datum, we may have to explain: 1) why an object is in a particular position; 2) the absence of an object; or 3) how an object can achieve a given task by moving itself and/or interacting with the remaining objects in the scene/image. How can visual reasoning perform this explanation? To answer this question it is necessary to show how visual abduction may be relevant to hypothesis generation, that is, how an image-based explanation is able to solve the problem given in the initial image.

Faced with the initial image, in which we have previously recognized a problem to solve, as stated above, we have to work out an imagery hypothesis that can explain the problem-data. Thus, the formed image acquires a hypothetical status in the inferential abductive process at hand.

The generation of a new imagery hypothesis can be considered the result of the creative abductive inference previously described; in this respect we can consider how the imagery representations of new hypotheses lead to scientific discovery. The selection of an imagery hypothesis from a set of pre-enumerated imagery hypotheses, stored in long-term memory, also involves abductive steps, but its creativity is much weaker: this type of visual abduction can be called selective.
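Selective visual abduction, understood as the retrieval of a pre-enumerated imagery hypothesis from long-term memory, can be sketched as a simple matching procedure. Representing images as feature sets, as below, is of course a drastic, purely illustrative simplification of my own, not a claim about how imagery is actually encoded.

```python
def select_imagery_hypothesis(scene, memory):
    # Selective visual abduction: choose, among imagery hypotheses
    # pre-stored in long-term memory, the one whose features best
    # overlap the observed (possibly ambiguous) scene.
    score, best = max(((len(img["features"] & scene), img) for img in memory),
                      key=lambda pair: pair[0])
    return best if score > 0 else None   # no overlap: nothing is retrieved
```

The obscurely seen face of the earlier example fits this scheme: a partial, ambiguous stimulus still overlaps a stored image of a friend enough to retrieve it as the explanatory hypothesis, while a stimulus matching nothing stored yields no hypothesis at all.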

All we can expect of visual abduction is that it tends to produce imagery hypotheses that have some chance of turning out to be the best explanation. Visual abduction will always produce hypotheses that give at least a partial explanation, and therefore have a small amount of initial plausibility. In this respect abduction is more effective than the blind generation of hypotheses.

How can we computationally perform the generation of imagery hypotheses which are able to explain problem-data? This complex task can be achieved in an environment supplied with suitable levels of expressivity and adequate inferential systems.

I am developing a computational system (VASt) (described in Magnani et al., 1994), dedicated to showing how an image-based explanation is able to solve the problem given in an initial image. The system’s design clearly reflects the observation that the existence of a high-level spatial medium of thought is a mechanism relevant to abductive hypothesis generation. At present it focuses on spatial reasoning tasks in common sense reasoning, but I am extending this system to the field of scientific discovery.

Given a visual or imagery datum, the system is able to perform an image-based explanation of how an object can achieve a given task by moving itself and/or interacting with the remaining objects in the scene/image. From a functional point of view the system is extendable in several ways - each related to different capabilities: given a visual or imagery datum, it would be able to provide an image-based explanation of: 1) why an object is in a particular position in a scene; 2) the absence of an object in a scene.

We shall clarify the first case with the following example, which is based on common sense reasoning: we see a broken horizontal glass on the floor, near the table. On the floor there are also some leaves and we see that the window is open. If we retrieve from long-term memory another visual (imagery) description still containing the glass (intact), the table, and the window, and we recognize this new representation as being a slightly different version of the previous one, we have to explain the presence of the leaves and broken glass in the initial image. They constitute an anomaly that needs to be solved (explained). If we can link the leaves to the presence, say, of wind, we are in turn transported to a new imagery explanatory hypothesis.
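The anomaly-detection step of this example can be sketched as a comparison between the observed scene and the slightly different version retrieved from long-term memory. The feature names and causal links below are hypothetical illustrations, not the actual representations used by VASt.

```python
def detect_anomalies(current, remembered):
    # Compare the observed scene with the remembered version: objects
    # that appeared or disappeared are the anomalies to be explained.
    return current - remembered, remembered - current

def explain_anomalies(anomalies, causal_links):
    # Link each anomaly to a stored explanatory hypothesis, if one
    # exists; unlinked anomalies remain unexplained (reasoning stops).
    return {a: causal_links[a] for a in anomalies if a in causal_links}
```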

The second case deals with the capacity to justify the absence of a given object in a scene. Let us consider the following example: one of our friends is accustomed to travelling the same route every day. The road passes near a little bridge, under which ducks can usually be seen swimming. On a particularly cold day our friend does not see the ducks. He asks himself where the ducks could be, but, since he has never seen the ducks in any other setting, he is able to detect the anomaly yet unable to explain it. The imagery explanatory reasoning cannot proceed and therefore stops. On the contrary, if our friend had previously seen the ducks, say, under the roof of a farmhouse, once he has detected the absence of the ducks he can retrieve from long-term memory the image of the ducks sleeping under the roof. The imagery explanatory hypothesis is immediately achieved.

The third case deals with the well-known monkey-banana problem. In a room there is a banana, a box, and a monkey. The monkey cannot reach the banana because it hangs from the ceiling, but the monkey can push the box to a point below the banana, climb on top of it and so reach the banana. Every visual representation of the effect of a sequence of actions the monkey can perform may be considered as the generation of a hypothesis. Such a hypothesis, if successful, is viewed as the one that gives a solution to the problem.
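The monkey-banana problem also admits a standard computational reading as search over a toy state space, in which every generated successor state plays the role of a candidate visual hypothesis and the successful one solves the problem. The positions and actions below are illustrative assumptions of mine, not a description of how VASt represents the scene.

```python
from collections import deque

# Toy state: (monkey_position, box_position, monkey_on_box).
# The banana is assumed to hang from the ceiling at "center".

def successors(state):
    monkey, box, on_box = state
    moves = []
    if not on_box:
        for pos in ("door", "window", "center"):
            if pos != monkey:                       # walk anywhere
                moves.append(("walk", (pos, box, False)))
        if monkey == box:
            for pos in ("door", "window", "center"):
                if pos != box:                      # push the box along
                    moves.append(("push", (pos, pos, False)))
            moves.append(("climb", (monkey, box, True)))
    elif monkey == "center":                        # on the box, under the banana
        moves.append(("grasp", None))
    return moves

def plan(start):
    # Breadth-first search: generate candidate "visual hypotheses"
    # (successor states) until one of them solves the problem.
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, path = frontier.popleft()
        for action, nxt in successors(state):
            if action == "grasp":
                return path + ["grasp"]
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [action]))
    return None
```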

The details of the computational system (VASt) able to perform these tasks - open to the cases dealing with the problem of visual thinking in scientific discovery - are given in Magnani et al. (1994).

Conclusion

It is well known that epistemology is not alone in investigating reasoning. Reasoning is also a major subject of investigation in AI and cognitive psychology. Epistemological theories of reasoning, when implemented in a computer, become AI programs. The theories and the programs are, quite literally, two different ways of expressing the same thing. After all, theories of reasoning are about rules for reasoning and these are rules telling us to do certain things in certain circumstances. Writing a program allows us to state such rules precisely.

Some philosophers might insist that there is little, if any, connection between epistemology and cognitive psychology. The basis for such a claim would be that epistemology is normative while psychology is descriptive. That is, psychology is concerned with how scientists do reason, whereas epistemology is concerned with how scientists ought to reason. One of the central dogmas of philosophy is that you cannot derive an ought from an is. Nevertheless, this kind of ought might be called a «procedural ought». The apparent normativity of epistemology is just a reflection of the fact that epistemology is concerned with rules for how to do something. There is no reason for thinking that you cannot derive epistemological oughts from psychological is's. It would be very unreasonable to design a computational model of scientific discovery and reasoning without taking into account how scientists do reason, what scientists know, and what data scientists can acquire.

Nevertheless, the general goal is not the full simulation of scientists, but the making of discoveries about the world, using methods that extend human cognitive capacities. The goal is to build prosthetic scientists: just as telescopes are designed to extend the sensory capacity of humans, computational models of scientific discovery and reasoning are designed to extend their cognitive capacity.

References

Anderson, D. R. (1987). Creativity and the philosophy of Charles Sanders Peirce. Oxford: Clarendon Press.

Buchanan, B.G. (1985). Steps toward mechanizing discovery. In: K.F. Schaffner (Ed.), Logic of discovery and diagnosis in medicine (pp. 94-114). Berkeley and Los Angeles, CA: University of California Press.

Clancey, W.J. (1985). Heuristic classification. Artificial Intelligence, 27, 289-350.

Fetzer, J.H. (1990). Artificial intelligence: Its scope and limits. Dordrecht: Kluwer Academic Publishers.

Glasgow, J. I. & Papadias, D. (1992). Computational imagery. Cognitive Science, 16(3), 355-394.

Hanson, N.R. (1958). Patterns of discovery. An inquiry into the conceptual foundations of science. Cambridge: Cambridge University Press.

Holton, G. (1972). On trying to understand scientific genius. American Scholar, 41, 95-110.

Kosslyn, S. M. (1980). Image and mind. Cambridge, MA: Harvard University Press.

Kosslyn, S. M. (1983). Ghosts in the mind’s machine: Creating and using images in the brain. New York, NY: W.W. Norton.

Kosslyn, S. M. & Koenig, O. (1992). Wet mind. The new cognitive neuroscience. New York, NY: Free Press.

Lukasiewicz, J. (1970). Creative elements in science [1912]. In J. Lukasiewicz, Selected works (pp. 12-44). Amsterdam: North Holland.

Magnani, L. (1988). Epistémologie de l’invention scientifique. Communication & Cognition, 21, 273-291.

Magnani, L. (1992). Abductive reasoning: philosophical and educational perspectives in medicine. In V. L. Patel & D. Evans (Eds.), Advanced models of cognition in medical training and practice (pp. 21-41). Berlin: Springer.

Magnani, L., Civita, S. & Previde Massara, G. (1994). Visual cognition and cognitive modeling. In V. Cantoni (ed.), Human and machine vision: Analogies and divergences (pp. 229-243). New York, NY: Plenum Press.

Miller, A. I. (1989). Imagery and intuition in creative scientific thinking: Albert Einstein’s invention of the special theory of relativity. In D.B. Wallace & H.E. Gruber (Eds.), Creative people at work. Twelve cognitive case studies (pp. 171-187). Oxford: Oxford University Press.

Nersessian, N. J. (1994). Opening the black box: Cognitive science and history of science. GIT-COGSCI 94/23, July. Cognitive Science Report Series, Atlanta, Georgia Institute of Technology.

Peirce, C.S. (1931-1958). Collected papers, 8 vols. Edited by C. Hartshorne, P. Weiss & A. Burks. Cambridge, MA: Harvard University Press.

Peirce, C.S. (1955). Abduction and induction. In J. Buchler (Ed.), Philosophical writings of Peirce (pp. 150-156). New York, NY: Dover.

Popper, K. (1970). The logic of scientific discovery. London: Hutchinson.

Simon, H.A. (1965). The logic of rational decision. British Journal for the Philosophy of Science, 16, 169-186. Reprinted in H.A. Simon (1977) (pp. 137-153).

Simon, H.A. (1966). Thinking by computers. In R. Colodny (Ed.), Mind and cosmos (pp. 2-21). Pittsburgh, PA: Pittsburgh University Press. Reprinted in H.A. Simon (1977) (pp. 268-285).

Simon, H.A. (1977). Models of discovery and other topics in the methods of science. Dordrecht: Reidel.

Simon, H.A. (1985). Artificial-intelligence approaches to problem solving and clinical diagnosis. In K.F. Schaffner (Ed.), Logic of discovery and diagnosis in medicine (pp. 72-93). Berkeley and Los Angeles, CA: University of California Press.

Thagard, P. (1988). Computational philosophy of science. Cambridge, MA: The MIT Press.

Tweney, R. D. (1989). Fields of enterprise: On Michael Faraday’s thought. In D.B. Wallace & H.E. Gruber (Eds.), Creative people at work. Twelve cognitive case studies (pp. 91-106). Oxford: Oxford University Press.

Tye, M. (1991). The imagery debate. Cambridge, MA: The MIT Press.