"No assertion is ever known with certainty...
but that does not stop us making assertions." Carneades, 214-129 BCE
"The facts were always fuzzy or vague or inexact... Science treated the gray or fuzzy facts as if they were the black-white facts of math. Yet no one had put forth a single fact about the world that was 100% true or 100% false." Bart Kosko, Fuzzy Thinking, 1994, Preface
Logic, to most people, means two-state thinking: the idea that an outcome can only be true or false, 1 or 0, right or wrong. This form of logic dates back to ancient Greece and is perfectly adequate to answer simple questions in single dimensions, for example, if A is 1 and B is 0, what is A AND B ? It can be extended, as is done in Boolean algebra, to more complex questions, as long as all the parts can be described using the same restricted alphabet of two symbols. Such logic is a deductive way of understanding consequences, and a highly valuable intellectual technique.
But this sort of logic is inadequate when we need to reason about variables that have more than two values, or in cases where multiple incompatible variables are involved. Yet we still need to make decisions in these cases, so how can we proceed ? Bivalent, or two state, logic is just a sub-set of a more powerful type of logic known as fuzzy logic, so here we will introduce this subject and show how it can be used to evaluate choices both in the multi-valued scenarios of one variable and more importantly in the multi-dimensional scenarios common to complex systems incorporating multiple variables.
The logic employed in computers is of the two valued kind, which we can represent by voltages, 0 and 1. In real computer logic chips the output changes suddenly when the input exceeds a threshold value, so we can say that all inputs between 0V and 0.5V will give one output and those between 0.5V and 1V will give the other. If we plot input voltage against output we get a step function. Many other less sharp functions are possible, including a linear function that goes smoothly from 0 to 1 in a straight line.
Generalising, we can say that the step function is the most nonlinear extreme of a continuum between straight line linearity and nonlinearity. In between are many functions that have intermediate levels of nonlinearity (e.g. the sigmoid function used in neural networks). Science most often deals with linear functions, so we can regard scientific formulae as implementations of a sort of linear logic at one extreme, with traditional logic as a form of nonlinear maths at the other. Thus maths and logic are equivalent, just alternative ways of speaking about the same system - with infinite accuracy (maths) or just one bit accuracy (logic). The breakthrough in fuzzy logic is to recognise that there are other alternative representations in between these extremes.
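The continuum between the step and the straight line can be sketched in a few lines of Python. This is only an illustration: the function names, the 0.5 threshold taken from the voltage example, and the sigmoid's steepness parameter are assumptions chosen here, not part of any particular chip or network.

```python
import math

def step(v, threshold=0.5):
    """Bivalent logic: the output jumps from 0 to 1 once the input exceeds the threshold."""
    return 1.0 if v > threshold else 0.0

def linear(v):
    """Straight line response: the output follows the input, clipped to the range 0 to 1."""
    return min(1.0, max(0.0, v))

def sigmoid(v, steepness=10.0, midpoint=0.5):
    """Intermediate case: a smooth S-curve (steepness and midpoint are assumed values)."""
    return 1.0 / (1.0 + math.exp(-steepness * (v - midpoint)))

# Tabulate the three responses over the same inputs
for v in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"in={v:.2f}  step={step(v):.0f}  linear={linear(v):.2f}  sigmoid={sigmoid(v):.3f}")
```

Raising the steepness pushes the sigmoid towards the step function, while lowering it flattens the curve towards a straight line around the midpoint.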
We see then that the number of different values possessed by the logic can vary from just two, {0, 1}, up to infinitely many, the whole interval [0, 1]. This wider ranging view can be called multivalued or multivalent logic, which in the limit, using all the real number values, is called fuzzy logic. The range of values available in any logic forms what is called a mathematical set. Therefore fuzzy and multivalued logic deal not with single binary numbers but with sets of numbers. These relate to both inputs and outputs, so the most general form maps input sets to output sets.
To see how this works, consider the question, "which of three possible outputs (heating, cooling, off) should an air conditioner take, given temperatures in three rooms ?" The extremes (all inputs lower than wanted or all higher) are easy, as is the case for all within limits, but how do we treat the case where there is one of each ? Many incompatible combinations are thus possible and we need a function that can weight the relevant inputs and not treat them as just binary values.
Fuzzy sets have membership properties defined between 0 and 1. This means that if we take an attribute, say 'red', we can express the colour of any particular apple as a position in this fuzzy set. We may say for example that it is 30% red and thus has a fuzzy truth value (FTV), fuzzy unit (FIT) or membership value of 0.3. How the FTV relates to actual values depends upon our desired mapping from the real world to the normalised range 0 to 1, and this is arbitrary. Note that if we ask how green it is, we may have a quite different value for the same apple (maybe 0.7, if only red and green are allowed). This means that questions like 'is it a Red or a Green apple ?' are meaningless: it is both ! Thus fuzzy logic destroys one of the bastions of ancient logic, Aristotle's Law of the Excluded Middle - not only can we have statements that are both A and NOT A, but almost no real world cases actually conform to that either/or realm !
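As a minimal sketch of such a mapping, the Python below normalises a raw measurement onto the range 0 to 1. The linear form, the function name and the idea of measuring "percentage of skin that is red" are all assumptions made for illustration; as noted above, the choice of mapping is arbitrary.

```python
def membership(value, low, high):
    """Linearly map a raw value onto 0..1, clipping anything outside low..high."""
    if high == low:
        raise ValueError("low and high must differ")
    ftv = (value - low) / (high - low)
    return min(1.0, max(0.0, ftv))

# Hypothetical measurement: 30% of the apple's skin is red
red = membership(30.0, 0.0, 100.0)
green = 1.0 - red                    # valid only if red and green are the sole options

# The apple is red AND not-red to a non-zero degree
red_and_not_red = min(red, 1.0 - red)
print(f"red={red:.1f}  green={green:.1f}  red AND NOT red={red_and_not_red:.1f}")
```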
We can of course also have independent values, e.g. size, giving maybe 0.8 if it is a "fairly large" apple. Taking all the possible attributes we will have several fuzzy sets and several FTVs for each object. We then have the problem of how to combine these fuzzy functions, and this is a problem midway between arithmetic and logic, but we cannot just add FTVs since adding two 0.8s say gives 1.6 - a value outside logical reality. Nor can we just use normal logic which cannot give values other than 1 or 0. Thus we need a separate form of logic designed for fuzzy systems.
Fuzzy logic is reasoning with fuzzy sets. Operations on fuzzy sets are similar to those of standard logic but are differently defined. Let us assume two FTVs to illustrate, A(0.4) and B(0.7).
Union (the joined boundaries of the values):
A OR B = Maximum of the FTVs, here 0.7
(note that this reduces to traditional logic for allowed
membership values of only 0 and 1)
Intersection (the commonality between the values):
A AND B = Minimum of the FTVs, here 0.4
(again reducing to bivalent logic in the extremes)
Negation (the opposite of the value):
NOT A = 1 - FTV, here 0.6
(once more this is simply an extension of normal logic)
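The three operations just listed can be written directly in Python. This is a minimal sketch using the max/min/complement definitions given above; the function names are chosen here for illustration. The loop at the end checks the claim that they reduce to ordinary logic when only 0 and 1 are allowed.

```python
def fuzzy_or(a, b):
    """Union: the maximum of the two FTVs."""
    return max(a, b)

def fuzzy_and(a, b):
    """Intersection: the minimum of the two FTVs."""
    return min(a, b)

def fuzzy_not(a):
    """Negation: one minus the FTV."""
    return 1.0 - a

A, B = 0.4, 0.7
print(f"A OR B  = {fuzzy_or(A, B):.1f}")   # 0.7
print(f"A AND B = {fuzzy_and(A, B):.1f}")  # 0.4
print(f"NOT A   = {fuzzy_not(A):.1f}")     # 0.6

# With memberships restricted to 0 and 1 we recover bivalent logic
for a in (0, 1):
    for b in (0, 1):
        assert fuzzy_or(a, b) == (a or b)
        assert fuzzy_and(a, b) == (a and b)
    assert fuzzy_not(a) == (not a)
```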
The ratio (A AND NOT A) / (A OR NOT A) gives the Fuzzy Entropy, always zero in classical logic but rising to 1 in the maximum fuzziness case where FTV=0.5. This can be generalised to multiple variables.
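The entropy ratio is easy to compute. The sketch below follows the definition just given for a single FTV. The second function is an assumption about how the generalisation to several variables might look (summing the overlaps and underlaps before dividing) and should be read as one possible form rather than the only one.

```python
def fuzzy_entropy(ftv):
    """(A AND NOT A) / (A OR NOT A) for a single fuzzy truth value."""
    overlap = min(ftv, 1.0 - ftv)    # A AND NOT A
    underlap = max(ftv, 1.0 - ftv)   # A OR NOT A, never below 0.5
    return overlap / underlap

def fuzzy_entropy_of_set(ftvs):
    """Assumed multi-variable form: summed overlap divided by summed underlap."""
    if not ftvs:
        raise ValueError("need at least one fuzzy truth value")
    overlap = sum(min(v, 1.0 - v) for v in ftvs)
    underlap = sum(max(v, 1.0 - v) for v in ftvs)
    return overlap / underlap

for v in (0.0, 0.4, 0.5, 1.0):
    print(f"FTV={v:.1f}  entropy={fuzzy_entropy(v):.3f}")
```

For A(0.4) the ratio is 0.4 / 0.6, about 0.667, between the classical value of zero and the maximum of one.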
Note that combining fuzzy truth values is not the same as combining probabilities. If we say that the probability of tall is 0.7 and the probability of strong is 0.6, then the probability of being strong AND tall (assuming the two are independent) would be 0.7 x 0.6 = 0.42, which differs from the fuzzy AND of the same two values, the minimum, 0.6.
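The contrast can be checked numerically. In this small sketch the product rule assumes the two properties are statistically independent, which is an assumption added here; the variable names are illustrative only.

```python
tall, strong = 0.7, 0.6

# Probabilistic AND: product rule, valid only if the two events are independent
joint_probability = tall * strong

# Fuzzy AND: the minimum of the two truth values
fuzzy_conjunction = min(tall, strong)

print(f"probability of tall AND strong = {joint_probability:.2f}")  # 0.42
print(f"fuzzy truth of tall AND strong = {fuzzy_conjunction:.2f}")  # 0.60
```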
Like the values considered so far, probabilities range between 0 and 1. Here too we have extremes. At one end we have deterministic systems, which we can regard as having only one possible value, so the probability is always 1. These are equivalent to those scientific laws which are always true and are stable, predictable systems. At the other extreme is total freedom with infinite, stochastic, options so that the probability of any one state is effectively zero; these are unstable, indeterminate systems. Probability is thus a measure of certainty, and moves towards one with increasing information.
With probability we still take the view that an option either happens or does not, so each event has a fuzzy set membership of either one or zero. Fuzziness extends the concepts of probability to encompass partial events, each of which can now occur with a strength between 0 and 1. For example, we now allow for the probability of a half (eaten) apple (FTV 0.5). Fuzziness is thus a measure of completeness and has no relation to uncertainty. This means that a snapshot of a system (collapse of uncertainty) is no longer a vector of just binary 'truth' values, all either present or absent, but a vector of real values (FTVs - how much present). We can regard this as similar to the difference between single dimensional values and multidimensional ones.
Fuzzy thinking allows us to see more clearly what set membership really means. A 'part' is its relative existence within the 'whole', its fuzzy set membership or completeness. Where the part equals the whole the FTV is 1, and where there is no presence in the whole the FTV is 0. But what about the presence of the whole in the part ? In classical logic this is meaningless, but this is actually probability ! If we consider the whole as the available state space, then if the whole is the same as the part the probability is 1, if the whole is infinite around the part then the probability is 0. Thus the whole in relation to the part is the probability (called the 'Subsethood Theorem').
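One way to make the part/whole relation concrete is a subsethood measure: the size of the overlap between two sets divided by the size of the first. The sketch below is an assumed reading of that idea, using the sum of memberships as the "size" of a set; the function name and the five-trial example are invented for illustration.

```python
def subsethood(a, b):
    """Degree to which fuzzy set a is contained in fuzzy set b (0 to 1)."""
    size_a = sum(a)
    if size_a == 0:
        return 1.0  # convention assumed here: the empty set is inside everything
    overlap = sum(min(x, y) for x, y in zip(a, b))
    return overlap / size_a

whole = [1, 1, 1, 1, 1]   # five trials, all counted
part = [1, 0, 1, 1, 0]    # the three trials that succeeded

print(subsethood(part, whole))   # part in whole: 1.0
print(subsethood(whole, part))   # whole in part: 0.6, the relative frequency 3/5
```

Read this way, the containment of the whole in the part gives the familiar relative frequency of success.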
But the range of values for probability and fuzzy truth are the same ! Thus we can say that the two are just alternative perspectives of the same thing, orthogonal views of reality, one concentrating on bottom up (fuzzy occupancy) and one on top down (probabilistic existence). This holistic view of fuzziness ties in nicely with the complexity perspective of downwards causation (the whole constraining the part) and upwards causation (the part forming the whole). More generally, both fuzzy logic and probability form part of Generalized Information Theory (GIT), which also contains other formalisms such as possibility theory and random sets.
If we have just one variable then decisions are easy, we just pick the option with the best value, but how do we deal with multiple variables where we need to compromise or trade-off the values ? In classical logic we can pick the option whose worst is the least bad (Maximin) or we could pick the option whose best is the highest (Maximax). In fuzzy logic we rate each variable as a fuzzy truth value, giving 1 to the best option, 0 to the worst and proportionate in between (we could alternatively rate them with respect to a theoretical or practical minimum and maximum for the variable in question). A motoring example:
Real Values | Consumption (mpg) | Max Speed (mph) | Acceleration (s) |
Car A       | 30                | 120             | 9                |
Car B       | 40                | 110             | 11               |
Car C       | 45                | 100             | 12               |
Classical logic would set the best at 1 and rest (not-best) at 0, i.e.:
Logic Values | Consumption | Max Speed | Acceleration |
Car A        | 0           | 1         | 1            |
Car B        | 0           | 0         | 0            |
Car C        | 1           | 0         | 0            |
Then Maximin would choose all of them (all are 0 minimum) and Maximax either A or C (both have 1) - not much use as a method of choice !
Fuzzifying these values instead (where for Acceleration here low is good, so the minimum gets the maximum marks) we get:
Fuzzy Values | Consumption | Max Speed | Acceleration |
Car A        | 0           | 1         | 1            |
Car B        | 0.67        | 0.5       | 0.33         |
Car C        | 1           | 0         | 0            |
Here Maximin would choose B (0.33 minimum satisfaction) and Maximax A (two 1s). Fuzzy reasoning thus provides a compromise choice, depending on your preference for least-worst versus most-best.
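The motoring example can be reproduced with a short script. This is a sketch under stated assumptions: the dictionary layout and function name are invented here, higher is taken as better for consumption (mpg) and speed, and the Maximax tie between A and C is broken by comparing each car's scores from best to worst, which is one reasonable way to express "two 1s beats one".

```python
cars = {
    "A": {"consumption": 30, "max_speed": 120, "acceleration": 9},
    "B": {"consumption": 40, "max_speed": 110, "acceleration": 11},
    "C": {"consumption": 45, "max_speed": 100, "acceleration": 12},
}
# For acceleration (seconds) low is good, so its scale is inverted
higher_is_better = {"consumption": True, "max_speed": True, "acceleration": False}

def fuzzify(cars, higher_is_better):
    """Rate each variable from 0 (worst option) to 1 (best option), in proportion."""
    fuzzy = {name: {} for name in cars}
    for variable, high_good in higher_is_better.items():
        values = [spec[variable] for spec in cars.values()]
        low, high = min(values), max(values)
        if high == low:
            raise ValueError(f"all options are equal on {variable}")
        for name, spec in cars.items():
            score = (spec[variable] - low) / (high - low)
            fuzzy[name][variable] = score if high_good else 1.0 - score
    return fuzzy

fuzzy = fuzzify(cars, higher_is_better)

# Maximin: the option whose worst score is highest
maximin = max(fuzzy, key=lambda name: min(fuzzy[name].values()))
# Maximax: the option whose best score is highest (ties broken best-to-worst)
maximax = max(fuzzy, key=lambda name: sorted(fuzzy[name].values(), reverse=True))

print("Maximin choice:", maximin)  # B
print("Maximax choice:", maximax)  # A
```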
The need to perform mathematical operations on fuzzy sets requires us to define transformation functions or 'hedges' to operate on FTVs. These transform, say, Peter is 'old' into Peter is 'very old' (e.g. if timesteps need adding) changing the FTV from, say, 0.7 to 0.9. The way that this is done is very context dependent, and relates to how we define fuzzy membership for the sets in question, but generally can involve squaring or square-rooting the relevant membership values. This allows us to process natural language statements (e.g. tall, old) by coding as FTVs, performing transforms and then decoding back to natural language.
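A minimal sketch of squaring and square-rooting hedges follows. It assumes one commonly quoted convention, in which 'very' squares the membership value and 'somewhat' takes its square root; as stressed above, the right transform depends on context, so these two functions are illustrative rather than definitive.

```python
import math

def very(ftv):
    """Concentrating hedge: squaring makes the set harder to belong to."""
    return ftv ** 2

def somewhat(ftv):
    """Dilating hedge: the square root makes the set easier to belong to."""
    return math.sqrt(ftv)

old = 0.7  # Peter's membership of 'old'
print(f"old={old:.2f}  very old={very(old):.2f}  somewhat old={somewhat(old):.2f}")
```

Under this convention the same person belongs less to 'very old' than to 'old', because the stricter set is harder to satisfy. Values of exactly 0 or 1 are left unchanged, so the hedges still agree with bivalent logic at the extremes.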
Operators are functions that act in various ways on sets of fuzzy values. They include common things like averages and maximisers but also specialist fuzzy functions (e.g. Ordered Weighted Averaging and Monotonic Identity Commutative Aggregation operators - whew, we won't go into these here !) used for multiobjective fuzzy models. Note that fuzzy systems are nonlinear systems and thus standard linear mathematical techniques are often inadequate and need to be extended, as were the basic logical operators.
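For readers who want a taste of one such operator, here is a brief sketch of ordered weighted averaging (OWA): the values are sorted from largest to smallest and then combined with a fixed set of weights. The function name and the example weights are assumptions made for illustration, and the scores are Car B's fuzzy values from the motoring example.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to the values sorted high to low."""
    if len(values) != len(weights):
        raise ValueError("need one weight per value")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))

car_b = [2 / 3, 0.5, 1 / 3]

print(f"{owa(car_b, [1, 0, 0]):.2f}")              # all weight on the best: the maximum
print(f"{owa(car_b, [0, 0, 1]):.2f}")              # all weight on the worst: the minimum
print(f"{owa(car_b, [1 / 3, 1 / 3, 1 / 3]):.2f}")  # equal weights: the plain average
```

Moving weight between the first and last positions slides the result between the fuzzy OR (maximum) and the fuzzy AND (minimum), giving a tunable compromise between most-best and least-worst.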
Looking at our verbal concepts, we can see that words themselves are fuzzy values. Take 'house', this covers all sorts of buildings (e.g. how much is a castle or a tent a 'house' ?). Adding adjectives just reduces the size of the fuzzy set (e.g. 'old house' is a sub-set of the fuzzy set 'house', 'very old house' is a smaller set still) - these are intersections of overlapping sets, since 'old' also applies to many other concepts. Numbers too are fuzzy, for example when we say that the temperature is zero degrees we mean between say -0.5 and +0.5, it is an approximate or fuzzy usage.
Thought is playing with these fuzzy sets. When we speak we combine these fuzzy concepts in a way that makes sense in the context of our conversation. When we recognise someone we combine the fuzzy sets corresponding to their features to obtain a fuzzy choice between possible acquaintances. These sets are not disjoint but overlap; they form not a tree of higher and lower detail but a web of fuzzy occupancy around a centre average (the 'prototype') - much like the 'probability' clouds used to depict an electron position in quantum mechanics.
Extending these ideas to the artificial realm, we can view these clouds or patches as overlapping attempts to follow a function curve occupying many dimensions. For any input to our system multiple patches or fuzzy sets will activate, giving multiple parallel outputs. These can then be combined to give a decision, a weighted average of the activations - intuition. This FAM (Fuzzy Associative Memory) technique is widely used in fuzzy technology as a better alternative to Decision Trees.
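The patch idea can be sketched for the air conditioner question posed earlier. Everything specific here is an assumption made for illustration: the triangular patch shapes, the 3 degree spacing, the rule outputs (+1 heat, 0 off, -1 cool) and the simple averaging across rooms.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 at or beyond left/right, rising to 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical patches over temperature error (room minus wanted, in degrees)
# Each entry is: label, (left, peak, right), output drive
PATCHES = [
    ("too cold",    (-6.0, -3.0, 0.0), +1.0),  # heat
    ("about right", (-3.0,  0.0, 3.0),  0.0),  # off
    ("too hot",     ( 0.0,  3.0, 6.0), -1.0),  # cool
]

def fam_output(error):
    """Weighted average of the outputs of every patch the input activates."""
    error = max(-3.0, min(3.0, error))  # saturate beyond the outer peaks
    activations = [(triangle(error, *shape), out) for _, shape, out in PATCHES]
    total = sum(strength for strength, _ in activations)
    return sum(strength * out for strength, out in activations) / total

def house_drive(room_errors):
    """Combine several rooms by averaging their individual drives (an assumption)."""
    return sum(fam_output(e) for e in room_errors) / len(room_errors)

print(fam_output(1.5))                 # half 'about right', half 'too hot': -0.5
print(house_drive([-3.0, 0.0, 3.0]))   # one room of each kind cancels out: 0.0
```

An input of 1.5 degrees activates two patches at once, and the output is a blend of "off" and "cool" rather than a forced choice between them.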
We can view the size of the patch as relating to the accuracy of our knowledge. Information that is certain will activate only with very specific inputs, more general concepts will operate on much of our world. This range of matching criteria echoes the ideas behind the schema theorem used in Genetic Algorithms and Classifier Systems.
Taking things one stage further, we can allow our fuzzy set memberships to evolve. This relates to learning the rules by which we make our decisions. This is difficult to do in human terms, for whilst we can provide evidence after-the-fact we are unable to query our neural connections to find out what factors actually contributed to our intuitive decisions and by how much.
Using variants of Neural Networks this problem too can be solved. Fuzzy Adaptive Systems use NNs to identify clusters in the input data, and this is what we mean by our fuzzy sets. It is possible to extract the definitions from these systems which allows us to build artificial control and decision systems using fuzzy methods (a form of automatic expert system).
The bivalent 1/0 realm makes control decisions an all or nothing affair, and this mode of working is highly unstable, as we see at traffic lights. Traditionally 'Go' or 'Stop' must be permanent states, either/or, and cannot depend upon the fuzzy value of how much traffic there is. Thus they have fixed on/off times, and this causes jams and empty lanes to coexist. Applying fuzzy logic, however, enables us to classify the traffic density from the two directions as fuzzy values, so that we can decide relative times for Go and Stop to best balance traffic flow. Such applications are widespread and show that fuzzy thinking is not just equivalent to probabilities, as some assume, but is a way of dealing logically with fuzzy actualities (e.g. lane occupancy 'completeness') - real world values.
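A toy version of the traffic light idea is sketched below. It assumes each direction's density is already expressed as a fuzzy value (for instance lane occupancy between 0 and 1) and simply shares the cycle in proportion. The function name, the 60 second cycle and the 10 second minimum green are invented for illustration; a real controller would be considerably more elaborate.

```python
def green_split(density_ns, density_ew, cycle_s=60.0, min_green_s=10.0):
    """Share one signal cycle between two directions by their fuzzy traffic density."""
    for density in (density_ns, density_ew):
        if not 0.0 <= density <= 1.0:
            raise ValueError("densities must be fuzzy values between 0 and 1")
    flexible = cycle_s - 2.0 * min_green_s   # time left after both minimum greens
    total = density_ns + density_ew
    share_ns = 0.5 if total == 0 else density_ns / total
    green_ns = min_green_s + flexible * share_ns
    return green_ns, cycle_s - green_ns

# Busy north-south road (0.8 occupied) crossing a quiet east-west road (0.2)
ns, ew = green_split(0.8, 0.2)
print(f"north-south green: {ns:.0f}s, east-west green: {ew:.0f}s")  # 42s and 18s
```

With equal densities the cycle is split evenly, which is what a fixed timer does all the time; the fuzzy version departs from it only when the traffic does.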
In fuzzy control systems we make extensive use of the concept of cybernetic feedback, common to many other systems fields. Fuzzy systems being inherently nonlinear however can deal with those situations hard to formulate in traditional linear mathematical terms, and this includes complex nonlinear machines and systems with multiple interrelated variables.
Although we have concentrated here on explaining fuzzy logic, we need to put this into perspective. Fuzzy logic is just one of a number of non-standard logics that currently exist and which are used to cope with situations excluded from classical bivalent logic. One of the main classes of these logics (which includes fuzzy logic) is known as paraconsistent logic, and this allows for various types of contradiction. Fuzzy logic, which we have outlined here, retains a form of balance: the 'true' and 'false' axes always add to 1, i.e. if a statement becomes more 'true' then it becomes less 'false' in proportion. But we can relax this rule, and allow for separate values on both these axes, e.g. a value could be 100% true and 100% false at the same time, i.e. the very definition of a paradox. Going even further, we can realise that our fuzzy values are themselves uncertain, so we can add a third 'indeterminacy' axis to reflect this - a mode more akin to quantum ideas. The implications of these wider logics have yet to be widely studied, but by merging a number of them we can perhaps derive a common form of decision logic that can apply to multi-level complex systems, values and dynamical science alike.
The fuzzification of our ideas is not a new thing; our brains have been doing it for millions of years. What is new is the discarding of the dualist true/false dogma of traditional philosophy, which has created a world strewn with artificially forced boundaries, whether they be logical, scientific, religious or political - a justification for conflict that has been enthusiastically embraced by shallow thinkers everywhere. That is the main social benefit of this approach and why it is important for the complexity sciences, which also reject such dualisms.
Many alternative formulations and approaches to this subject are around (we have only scratched the surface) but these relate more to the mathematical aspects and how these are used than to the general idea. In complexity terms what is important is the idea that multiple vague values can successfully be considered at one time, in both informal and formal ways. This allows us to improve our notion of 'truth' in such a way as to provide a fitter context in which to explore science and life.