Symbolism and connectionism paradigms

Lorenzo Magnani

Computational Philosophy Laboratory

Department of Philosophy

University of Pavia

I-27100 Pavia, Italy

lorenzo@philos.unipv.it

Human and Machine Perception: Emergence, Attention and Creativity

Pavia, September 14-17, 1998

1. Evaluating computational-representational approaches

In his recent book, Paul Thagard (1996) summarizes the central hypothesis of cognitive science: "Thinking can best be understood in terms of representational structures in the mind and computational procedures that operate on those structures. Although there is much disagreement about the nature of the representations and computations that constitute thinking, the central hypothesis is general enough to encompass the current range of thinking in cognitive science, including connectionist theories" (p. 10). This approach is called the Computational-Representational Understanding of Mind. Representation and computation are thus viewed as the two major fields of research, and Logic, Rules, Concepts (frames, semantic networks, and so on), Analogies, Images, and Connections (artificial neural networks) are considered the most important ways of representing and computing.

Thagard analyzes all these approaches (logic, rules, concepts, analogies, images, and connections) in terms of five criteria (p. 15):

1. Representational power

2. Computational power

a. Problem solving

i. Planning

ii. Decision

iii. Explanation

b. Learning

c. Language

3. Psychological plausibility

4. Neurological plausibility

5. Practical applicability

a. Education

b. Design

c. Intelligent systems

I propose to adopt this framework because it is very useful for correctly posing the problem of the symbolism/connectionism dichotomy from a theoretical point of view.

The first criterion, representational power, concerns the amount of information a particular kind of representation can express; different kinds of representation vary greatly in this respect. Moreover, a computational approach to mental representation surely has to account for three important kinds of high-level reasoning: problem solving (planning, decision making, and explanation), learning, and language. We can evaluate the computational power of a particular representation in terms of how well it supports these kinds of reasoning. The ability to learn from experience is an important part of intelligent reasoning, and the capability to explain the use of language is a desirable aspect of a general cognitive theory.

The psychological plausibility of a theory of mental representation concerns its ability to account for the quantitative and qualitative results of psychological experiments on human mental capabilities. A theory of mental representation should also be consistent with the results of neuroscientific experiments.

Finally, practical applicability concerns effective results in the areas of education (for instance, how to teach better), design (for instance, new computer interfaces), and intelligent systems (expert and knowledge-based systems - rule-based, case-based, connectionist - in many fields of artificial intelligence).

2. Cognitive tasks and representations

The cognitive accomplishments of the approaches in terms of logic, concepts, rules, analogies, and imagery are well known in cognitive science. We have very large computational systems that employ the resources of logic (Genesereth & Nilsson, 1987), the most ancient way of representing knowledge, sometimes with the help of probability theory (Pearl, 1988). Concepts (in terms of frames, scripts, and semantic networks) (Minsky, 1975, Schank & Abelson, 1977) and rules (as in heuristic search and rule-based chunking) (Newell, Shaw & Simon, 1958, Newell & Simon, 1972, Newell, 1990, Anderson, 1983, 1993) constitute the background of many other systems able to model cognitive aspects of language and problem solving. Analogical thinking (also called case-based reasoning) (Holyoak & Thagard, 1995) and imagery (Kosslyn, 1994, Kosslyn & Koenig, 1992, Glasgow & Papadias, 1992) are the subject of new and interesting research that involves psychological experiments, computational programs, and neurological examinations.

Hence, concepts, analogies, and images are very useful for providing organized clusters of representations; rules and propositions in logic are useful because of their verbal expressiveness; and imagery and connectionism are useful because of their sensory richness. What about connectionism in particular? Let us devote some time to illustrating the details of the connectionist approach.

Consider the two well-known classes of neural networks, local and distributed. In the first class, the neuronlike structures are given an identifiable interpretation in terms of concepts or propositions; in the second, distributed representations in networks learn to represent concepts or propositions in complex, intricate ways that distribute meaning over clusters of neuronlike structures. Both local and distributed representations can be used to perform so-called parallel constraint satisfaction.
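To make this mechanism concrete, here is a minimal sketch, in Python, of parallel constraint satisfaction over a local network. The update rule (decay plus bounded net input) follows the general scheme used in this family of models, but the function name, parameters, and all values below are illustrative assumptions, not any published implementation.

    # A minimal sketch of parallel constraint satisfaction over a local
    # network (names, weights, and parameters are illustrative).  Each
    # unit stands for one proposition; excitatory links carry positive
    # weights, inhibitory links negative ones.

    def settle(units, links, clamped=(), steps=200, decay=0.05):
        """Iteratively update activations until the network settles.

        units   -- dict: unit name -> initial activation in [-1, 1]
        links   -- dict: (unit, unit) -> symmetric weight
        clamped -- units held fixed at their initial activation
        """
        # Build a symmetric neighbour table from the link dictionary.
        neigh = {u: [] for u in units}
        for (a, b), w in links.items():
            neigh[a].append((b, w))
            neigh[b].append((a, w))
        act = dict(units)
        for _ in range(steps):
            new = {}
            for u, a_u in act.items():
                if u in clamped:
                    new[u] = a_u
                    continue
                net = sum(w * act[v] for v, w in neigh[u])
                # Bounded update: positive net input pushes the unit
                # toward +1, negative toward -1; decay pulls it to 0.
                if net > 0:
                    val = a_u * (1 - decay) + net * (1 - a_u)
                else:
                    val = a_u * (1 - decay) + net * (a_u + 1)
                new[u] = max(-1.0, min(1.0, val))
            act = new
        return act

After repeated synchronous updates, units encoding mutually coherent propositions excite each other into high activation, while incompatible ones suppress each other: the settled state is the network's "interpretation".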

It is well known that the first models of parallel constraint satisfaction were developed for computer vision (Marr & Poggio, 1976, Feldman, 1981, McClelland & Rumelhart, 1981); it is also well known that links between units in a distributed network are adequate to represent simple associations but lack the representational power to capture more complicated kinds of rules, such as those containing universal quantifiers and complex logical relations. Nevertheless, neural networks offer more flexibility in grasping a broad range of sensory experience; for instance, they allow us to discriminate many more tastes and aromas than we can name in words (Churchland, 1995).

Constructing a plan is naturally understood in terms of rules and analogies, but deciding among plans is easier to represent using parallel constraint satisfaction (Thagard & Millgram, 1995, Mannes & Kintsch, 1991): facilitation relations can be interpreted as positive internal constraints, and incompatibility relations - holding when two actions or goals cannot be performed or satisfied together - as negative internal constraints, as the sketch below illustrates. We should also remember that connectionist networks can implement simple types of rule-based systems (Touretzky & Hinton, 1988) as well as the "mental leaps" of analogical reasoning (Nelson, Thagard & Hardy, 1994, Holyoak & Thagard, 1995).
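As a purely illustrative example (the plans, goals, and weights are invented, not taken from Thagard and Millgram's model), a decision between two plans can be encoded for the settle() sketch given above:

    # Hypothetical plan decision: facilitation = positive weights,
    # incompatibility = negative weight; goal units are clamped on.
    units = {"goal_trip": 1.0, "goal_save_money": 1.0,
             "plan_fly": 0.01, "plan_drive": 0.01}
    links = {("plan_fly", "goal_trip"): 0.5,       # flying facilitates the trip
             ("plan_drive", "goal_trip"): 0.4,     # driving does too, a bit less
             ("plan_drive", "goal_save_money"): 0.5,
             ("plan_fly", "plan_drive"): -0.6}     # the plans are incompatible
    result = settle(units, links, clamped={"goal_trip", "goal_save_money"})
    # plan_drive should settle at the higher activation, since it
    # coheres with both clamped goals while plan_fly satisfies only one.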

In its turn, explanation can be seen as the activation of prototypes represented by distributed networks (Churchland, 1989); in this perspective, inference to the best explanation can be viewed as the activation of the most suitable prototype. For example, the theory of explanatory coherence (Thagard, 1989, 1992), implemented in a local connectionist network, activates the hypotheses (or propositions) that best explain - that is, have the greatest explanatory coherence with - a given cluster of evidence.
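In the same spirit, a toy explanatory-coherence network can be run with the settle() sketch above. The hypotheses and weights here are invented for illustration, and Thagard's actual ECHO program involves further principles (analogy, competition, simplicity) not modeled in this fragment:

    # Toy ECHO-style network: H1 explains both pieces of evidence,
    # H2 explains only one and contradicts H1; evidence is clamped.
    units = {"E1": 1.0, "E2": 1.0, "H1": 0.01, "H2": 0.01}
    links = {("H1", "E1"): 0.3, ("H1", "E2"): 0.3,  # explanation links
             ("H2", "E1"): 0.3,
             ("H1", "H2"): -0.4}                    # contradiction link
    print(settle(units, links, clamped={"E1", "E2"}))
    # H1 should end up with the higher activation: it is the "best
    # explanation" in the sense of explanatory coherence.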

The main ways to perform learning tasks in a connectionist network are adding new units and changing, by training, the weights on the links between units (Rumelhart & McClelland, 1986); the so-called feedforward networks trained by backpropagation have had many successful applications.
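To make the learning mechanism concrete, here is a minimal sketch of a feedforward network trained by backpropagation on the XOR function; the architecture, learning rate, and number of epochs are arbitrary illustrative choices (NumPy is assumed):

    # A tiny feedforward network (2 inputs, 4 hidden units, 1 output)
    # trained by backpropagation to compute XOR.
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)   # input -> hidden
    W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)   # hidden -> output

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for epoch in range(5000):
        # Forward pass.
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: propagate error derivatives layer by layer
        # and adjust the weights on the links (gradient descent).
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h
        b1 -= lr * d_h.sum(axis=0)

    print(np.round(out, 2))   # should approximate [[0], [1], [1], [0]]

Each weight change is driven by the derivative of the output error, propagated backward through the network: this is the kind of weight adjustment the text refers to.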

Finally, McClelland and Rumelhart have also shown that word recognition can be treated as a case of parallel constraint satisfaction, and many other connectionist tools have implemented further cognitive linguistic tasks: disambiguation, speech perception, and children's learning of verbs (including the errors produced in that learning).

We can conclude that the psychological plausibility of connectionist models is very high: from the phenomena of word perception (McClelland & Rumelhart, 1981) to discourse comprehension (Rumelhart & McClelland, 1982, Kintsch, 1988), from analogical mapping (Holyoak & Thagard, 1995) to visual word recognition (Seidenberg & McClelland, 1989). Connectionist weight adjustment is a very powerful mechanism for learning language, even if we do not yet have a unified connectionist theory of language production and comprehension. Compared with the approaches in terms of logic, rules, and concepts, connectionism has an obvious neurological plausibility, although most connectionist models are a very rough approximation of the behavior of real (human) neurons.

At present, cognitive science does not possess a unified theory able to explain the full range of psychological phenomena, in the way genetic theory does in biology or quantum theory in physics: all the approaches illustrated above give rise to different, autonomous points of view on the mind. We can say that the different approaches stress different representational and inferential aspects of mind, have different degrees of psychological and neurological plausibility, and have many original applications in education, design, and expert and knowledge-based systems.

The consequence is that an eclectic attitude is imperative, not only from the epistemological point of view but also in the activity of the knowledge engineer. Some researchers insistently point to the need for a hybrid approach, an integration of symbolic and neural representation and computation, in areas such as linguistic theory (Prince & Smolensky, 1997), natural language processing (Dyer, 1991), and expert systems (Medsker, 1994), where symbolic approaches have been prevalent in the past. The use of neural techniques is usually suggested by the fact that neural networks can reduce well-known shortcomings of symbolic problem solvers: for example, brittleness caused by incomplete data, lack of improvement in performance with experience, and time-consuming knowledge acquisition; moreover, they can carry out the sensory processing needed for data acquisition and exploit sensory information in the training phase to provide pictorial, non-symbolic forms of explanation (Medsker, 1994, Burattini et al., 1998).
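One common hybrid pattern can be illustrated with a minimal sketch, assuming an invented monitoring scenario (this is not any of the cited systems): a neural component classifies noisy sensory input, while a symbolic rule layer acts on its discrete, inspectable output.

    # Hypothetical hybrid architecture: neural front end, symbolic back end.
    def neural_classifier(features):
        # Stand-in for a trained network: maps noisy sensor readings
        # to a label with a confidence score.
        score = 0.8 * features["temperature"] + 0.2 * features["vibration"]
        return ("overheating", score) if score > 0.5 else ("normal", 1 - score)

    RULES = [  # the symbolic layer: explicit, inspectable decision rules
        (lambda lbl, c: lbl == "overheating" and c > 0.9, "shut down machine"),
        (lambda lbl, c: lbl == "overheating",             "notify operator"),
        (lambda lbl, c: True,                             "keep monitoring"),
    ]

    def decide(features):
        label, conf = neural_classifier(features)
        for condition, action in RULES:
            if condition(label, conf):
                return action

    print(decide({"temperature": 0.99, "vibration": 0.8}))  # -> shut down machine

The neural part degrades gracefully with noisy or incomplete data, while the rule layer keeps the final decision transparent: this is the division of labor the hybrid literature advocates.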

References

Anderson, J. R., 1990, Cognitive psychology and its implications, 3rd ed., Freeman, New York.

Anderson, J. R., 1993, Rules of the mind, Erlbaum, Hillsdale, NJ.

Burattini, E., De Gregorio, M. & Tamburrini, G., 1998, "Hybrid expert systems: an approach combining neural computation and rule-based reasoning", in C. T. Leondes (ed.), Expert systems techniques and applications, Gordon & Breach, London.

Churchland, P. M., 1989, A neurocomputational perspective, The MIT Press, Cambridge, MA.

Churchland, P. M., 1995, The engine of reason, the seat of the soul, The MIT Press, Cambridge, MA.

Dyer, M. G., 1991, "Symbolic neuroengineering for natural language processing: a multilevel approach", in J. A. Barnden & J. B. Pollack (eds.), Advances in Connectionist and Neural Computation Theory, vol. 1: High Level Connectionist Models, Ablex, Norwood, NJ, pp. 32-86.

Feldman, J., 1981, "A connectionist model of visual memory", in G. E. Hinton and J. A. Anderson (eds.), Parallel models of associative memory, Erlbaum, Hillsdale, NJ, pp. 49-81.

Genesereth, M. & Nilsson, N., 1987, Logical foundations of artificial intelligence, Morgan Kaufmann, Los Altos, CA.

Glasgow, J. I. & Papadias, D., 1992, "Computational imagery", Cognitive Science, 16, 355-394.

Holyoak, K. & Thagard, P., 1995, Mental leaps: Analogy in creative thought, The MIT Press, Cambridge, MA.

Kintsch, W., 1988, "The role of knowledge in discourse comprehension: A construction-integration model", Psychological Review, 95, 163-182.

Kosslyn, S. M., 1994, Image and brain: The resolution of the imagery debate, The MIT Press, Cambridge, MA.

Kosslyn, S. M. & Koenig, O., 1992, Wet Mind: The New Cognitive Neuroscience, Free Press, New York.

Mannes, S. M. & Kintsch, W., 1991, "Routine computing tasks: Planning as understanding", Cognitive Science, 15, 305-342.

Marr, D. & Poggio, T., 1976, "Cooperative computation of stereo disparity", Science, 194, 283-287.

McClelland, J. L. & Rumelhart, D. E., 1981, "An interactive activation model of context effects in letter perception. Part 1, An account of basic findings", Psychological Review, 88, 375-407.

Medsker, L. R., 1994, Hybrid neural networks and expert systems, Kluwer, Dordrecht.

Minsky, M., 1975, "A framework for representing knowledge", in P. Winston (ed.), The psychology of computer vision, McGraw-Hill, New York, pp. 211-277.

Nelson, G., Thagard, P. & Hardy, S., 1994, "Integrating analogies with rules and explanations", in K. Holyoak and J. Barnden (eds.), Advances in Connectionist and Neural Computation Theory, vol. 2: Analogical Connections, Ablex, Norwood, NJ, pp. 181-205.

Newell, A., 1990, Unified theories of cognition, Harvard University Press, Cambridge, MA.

Newell, A., Shaw, J. C. & Simon, H., 1958, "Elements of a theory of human problem solving", Psychological Review, 65, 151-166.

Newell, A. & Simon, H. A., 1972, Human problem solving, Prentice-Hall, Englewood Cliffs, NJ.

Pearl, J., 1988, Probabilistic reasoning in intelligent systems, Morgan Kaufmann, San Mateo, CA.

Prince, A. & Smolensky, P., 1997, "Optimality: from neural networks to universal grammar", Science, 275, 1604-1610.

Rumelhart, D. E. & McClelland, J. L., 1982, "An interactive activation model of context effects in letter perception. Part 2, The contextual enhancement effect and some tests and extensions of the model", Psychological Review, 89, 60-94.

Rumelhart, D. E., McClelland, J. L. & PDP Research Group, 1986, Parallel distributed processing: Explorations in the microstructure of cognition, 2 vols., The MIT Press, Cambridge, MA.

Schank, R. C. & Abelson, R. P., 1977, Scripts, plans, goals, and understanding: An inquiry into human knowledge structures, Erlbaum, Hillsdale, NJ.

Seidenberg, M. S. & McClelland, J. L., 1989, "A distributed, developmental model of word recognition and naming", Psychological Review, 96, 523-568.

Thagard, P., 1989, "Explanatory coherence", Behavioral and Brain Sciences, 12, 435-467.

Thagard, P., 1992, Conceptual revolutions, Princeton University Press, Princeton.

Thagard, P., 1996, Mind: Introduction to Cognitive Science, The MIT Press, Cambridge, MA.

Thagard, P. & Millgram, E., 1995, "Inference to the best plan: A coherence theory of decision", in A. Ram and D. B. Leake (eds.), Goal-driven learning, The MIT Press, Cambridge, MA.

Touretzky, D. & Hinton, G., 1988, "A distributed production system", Cognitive Science, 12, 423-466.