Declarative vs. Imperative

IMPERATIVE programming is programming where you "tell the computer what to do" using imperative statements.  You directly write an algorithm that the compiler or interpreter translates more or less straightforwardly to machine code.

All common general-purpose languages are imperative languages.  This includes Lisp and Scheme, and object-oriented languages such as Smalltalk, Eiffel and C++.

DECLARATIVE programming is programming by telling the computer WHAT, not HOW.  In a declarative programming language, you tell the computer facts, and ask it questions.  It figures out the answers to the questions, based on the facts you told it.

There are no purely declarative programming languages that are general-purpose (i.e., suitable for writing a large variety of typical application programs).  Pure declarative programming languages are a holy grail that is unlikely ever to be achieved, unless artificial intelligence results in computer programs that are smarter than computer programmers, and that's a long, long way off.  (At that point, instead of having a compiler, you'll just have a very smart AI program that you ask questions.)

Until then, sometimes you'll have to tell the computer HOW to solve a problem, as well as WHAT the problem is.  Or you'll at least have to give it hints.

Likewise, imperative languages aren't purely imperative, either.  They have little declarative aspects that simplify the programmer's task.  She can declare certain simple and tedious things, and rely on the compiler to figure out how to do them.

---
Arithmetic Expressions as Slightly-Declarative Programming

Even in imperative languages, we often throw in simple declarative features, as long as they don't cost much.

For example, when you write an arithmetic expression in languages like FORTRAN or Pascal (or, most of the time, C or C++), you do get to write something that looks like an arithmetic expression, i.e., you DECLARE that the value should be equal to the value of the expression.  It so happens that there's a mindless procedural rule for doing this kind of arithmetic, as we've seen---evaluate the subexpressions, and then use those results to compute the values of the expressions they occur in.

On the other hand, real compilers often do NOT do that---they take the arithmetic expression as a general statement of what is to be computed, and come up with an equivalent procedure that produces the same results, but which may be faster.  So if you write

   foo := (a * 2) + (b * 2);

the compiler may convert that to the equivalent

   foo := (a + b) * 2;

or whatever equivalent expression it thinks will be fastest.

---
Type Declarations as Slightly-Declarative Programming

Another aspect of normal programming languages that's declarative is the use of strong or static typing.  When we DECLARE (get it?) the type of a variable, we're making a declarative statement that the value of that variable will always be of that type.

Based on those declarations of the types of values at particular points in the code, the compiler uses a built-in TYPE-CHECKING ALGORITHM to INFER whether the types of variables and expressions are consistent.  If so, it draws the conclusion that the actual types of the variables will never violate their declarations at run time.

In essence, when you submit a program to the compiler, you're asking a question: "is it true that none of the variables in this program will ever take on the wrong type of value at run time?"  The type checking algorithm attempts to construct a simple "proof" that the answer is YES.  If it can't prove it, the compilation fails.
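To make that concrete, here's a minimal Python sketch of a type checker for arithmetic and boolean expressions.  The nested-tuple representation and all the names are made up for illustration; real compilers work on much richer program representations.

   # Expressions: ("lit", value), ("var", name), ("+", e1, e2), ("*", e1, e2),
   # or ("<", e1, e2).  'decls' maps variable names to declared types.

   def check(expr, decls):
       """Infer the type of expr, or raise TypeError if it's inconsistent."""
       tag = expr[0]
       if tag == "lit":
           return "bool" if isinstance(expr[1], bool) else "int"
       elif tag == "var":
           return decls[expr[1]]          # trust the declaration
       elif tag in ("+", "*"):
           left, right = check(expr[1], decls), check(expr[2], decls)
           if left == right == "int":
               return "int"
           raise TypeError("arithmetic on non-integers")
       elif tag == "<":
           if check(expr[1], decls) == check(expr[2], decls) == "int":
               return "bool"
           raise TypeError("comparison of non-integers")

   # (a * 2) + b checks out if a and b are declared as integers:
   print(check(("+", ("*", ("var", "a"), ("lit", 2)), ("var", "b")),
               {"a": "int", "b": "int"}))   # => int

If any rule fails, the "proof" fails and the checker rejects the program, just as described above.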
---
Class definitions as Slightly-Declarative Programming

Another aspect of normal programming that's declarative is the construction of classes.  When you define a class, the compiler must infer things from what you explicitly say.  It must infer a layout of the class, perhaps based on the inheritance relationships you've declared and the structure of the superclasses.  It must infer which methods are applicable to the class, based on the superclasses and the methods they inherit.  At each point where you send a message (or call a generic/virtual function) the language implementation must infer what the appropriate method is.  (Some of this inference happens at compile time, and some of it happens at run time.  The compiler computes a partial answer, and the runtime code infers the details.)

---
SERIOUSLY DECLARATIVE Programming

The declarative aspects mentioned above are so commonplace that we take them for granted, and we still call common languages imperative.  The programmer still has to say HOW to solve the real problems, even if the compiler can infer little details.

Declarative Math

One kind of serious, but simple declarative programming is a system that solves simple math problems.  For example, we might declare that one variable equals the sum of two others:

   A = B + C

Then we might declare the values of two of the variables, and ask the system (compiler or interpreter) to infer the value of the third:

   B = 2
   C = 15
   A = ?

The system will recognize that the value of A depends on B and C and compute the value of A:

   Answer: A = 17

We can use the expression A = B + C to infer other things than the value of A: if we know A and B, we can infer the value of C.

   A = B + C
   A = 2
   B = 3
   C = ?

   Answer: C = -1

The expression A = B + C declares a relationship between the values of A, B, and C, not a procedure for computing the value of any of them.  Based on that relationship, the system can figure out how to compute the value of any of the variables based on the values of the other two.

Equality is Not Assignment

Notice that in this framework, assignment statements don't make sense.  If you try to use an equality statement as an assignment statement, e.g., "X = X + 1", you're doing something that doesn't make logical sense.  For what value of X is X equal to X + 1?  None.  So it's not true that "X = X + 1".

If you tell a declarative system something that's not true, all bets are off---it may infer lots of false things from one false statement.  In many systems, a false statement may lead the system to infer weird and surprising (and false) statements about seemingly unrelated things.

Typically, in declarative programming, we're making statements about a static state of the world, and the system infers other statements about the SAME state of the world.  It does not transform the state of the world, it simply "elaborates", filling in missing detail that's implied by what was already said.

More Advanced Mathematics, and the Limits of Declarative Programming

There are advanced declarative programming systems that can solve much more advanced math problems than the simple example we've described here.  There are some that can solve large systems of equations, do certain kinds of calculus, etc.  These systems are limited, however---there is no system that can solve all math problems, because there is no algorithm to do that.  (It's not even possible, much less likely.)

Part of the art of designing declarative systems is to come up with a well-defined class of problems that a system will address, and ensure that it is capable of solving problems of that type (at least most of the time).  Another important part of designing such systems is to make it clear to the users what the system CANNOT do, and when it's up to the programmer to solve parts of the problem procedurally.
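Here's a minimal Python sketch of this kind of declarative math, for the single relationship A = B + C.  (This is a toy, of course; real systems handle whole systems of equations.)

   def solve_sum(a=None, b=None, c=None):
       """Given any two of A, B, C satisfying A = B + C, infer the third."""
       if a is None:
           return ("A", b + c)       # A = B + C
       elif b is None:
           return ("B", a - c)       # B = A - C
       elif c is None:
           return ("C", a - b)       # C = A - B

   print(solve_sum(b=2, c=15))       # => ('A', 17)
   print(solve_sum(a=2, b=3))        # => ('C', -1)

Notice that the "program" is just the relationship; which addition or subtraction actually gets executed depends on which question you ask.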
--------------------------------------------------------------------
Theorem Provers

The closest thing to pure declarative programming is AUTOMATIC THEOREM PROVING.  A theorem prover provides a language in which the user can make statements in a LOGIC LANGUAGE, and it has an INFERENCE ENGINE, which is an algorithm that infers things from what it is told.

A theorem prover stores representations of statements in a DATABASE.  (We don't mean a general-purpose database system here---we're using the general sense of "database" which just means a data structure that you use to store data, especially declarative data.)

The inference engine operates on the database, looking up statements, and using facts in combination to infer more statements.  These new statements may be added to the database, so that further statements can be inferred from them.  The items in the database are the ones that the system views as "TRUE".

---
Propositional Logic

A simple logic language for a simple theorem prover is PROPOSITIONAL LOGIC.  (This is a weak logic that you probably wouldn't use for complicated problems, but it's a special case of more interesting logic languages.)

In propositional logic, we have some kind of symbol that we use to represent facts.  If we implement a theorem prover in Scheme (or Lisp) these might be Scheme (or Lisp) symbols.  We just need some unique symbol to represent a particular statement.

Facts and Rules

In propositional logic, we will use individual symbols to represent simple facts.  So, for example, the statement that it's humid might be represented by the symbol HUMID, and the statement that it's raining might be represented by the symbol RAINING; we might represent the fact that the humidity is 100% by the symbol HUMID100.

We will also use RULES that say how one fact might be inferred from another fact or set of facts.  We can represent the fact that if it's raining, the humidity is 100% by a statement of IMPLICATION; we'll use a right-arrow symbol (->) to denote that one fact implies another:

   RAINING -> HUMID100

Given the rule RAINING -> HUMID100, and the fact RAINING, our theorem prover is allowed to infer that the humidity is 100%.  It can add the symbol HUMID100 to its database.

Similarly, we can represent the fact that if the humidity is 100%, it's definitely HUMID:

   HUMID100 -> HUMID

We can also represent NEGATION (the fact that something is NOT true) by having a logical operator NOT.  We can represent the fact that if the humidity is not 100%, it's not raining:

   (NOT HUMID100) -> (NOT RAINING)

Notice that we have to assign some interpretation to HUMID.  How high must the humidity be before we consider it humid?  We might decide that 50% humidity or higher is HUMID.  We don't necessarily have to tell the theorem prover that, though, as long as all of our rules are consistent with that.  (It's difficult to represent precise arithmetic with simple propositional logic.  For that, we'll want to use a more powerful logic.)

Implication is not causality

A -> B does NOT mean that A CAUSES B, or that B is true BECAUSE A is true.  It just means that "in all possible states of affairs, B is true if A is true---even if there's no direct or interesting connection or causal relationship between those facts".  (Of course, the rule itself could be wrong, but to do theorem proving we have to assume that our statements are right, or we're lost from the start.)

For example, if we know that Bill Clinton jogs only on Tuesdays, we might write

   BILL_IS_JOGGING -> TUESDAY

and that doesn't mean that it's Tuesday BECAUSE Bill is jogging.
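If we implement this in Python instead of Scheme, the database might just be a set of facts plus a list of rules.  (This representation is just one made-up choice among many possible ones.)

   # A fact is a string like "RAINING"; "NOT RAINING" is its negation.
   # A rule (premise -> conclusion) is a pair of such strings.
   facts = {"RAINING"}
   rules = [("RAINING", "HUMID100"),
            ("HUMID100", "HUMID")]

   # One inference step: a premise is known, so add its conclusion.
   for premise, conclusion in rules:
       if premise in facts:
           facts.add(conclusion)

   print(facts)   # => {'RAINING', 'HUMID100'}; HUMID needs another pass

A later section turns this single pass into a full forward-chaining loop.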
---
Exploiting Implications: Modus Ponens and Modus Tollens

The inference engine of our theorem prover will encode some very basic inference rules; these are different from the rules we give it to work with---they are so basic that they are encoded in the algorithm used by the inference engine itself, and they're what lets the inference engine do something with the rules we give it.

The basic inference rule that we used to infer things in the examples above is called Modus Ponens.  It says that if the thing to the left of the arrow is true, then the thing to the right of the arrow is true as well.

There's another basic inference rule for implications, called Modus Tollens, which allows us to infer that if the thing to the right of the arrow is false, then the thing to the left is false as well.  This is simple reasoning by contradiction: we know that if the statement to the left is true, then the statement to the right is true, but that can't be the case because we also know that the thing to the right is false.  Ergo, the thing to the left is false as well.

If we can count on our inference engine to use modus tollens, and we say that RAINING -> HUMID100, we don't have to also say that (NOT HUMID100) -> (NOT RAINING).  Modus tollens lets us infer both kinds of things from the first rule.

This illustrates an important principle---what we have to explicitly say depends on the inference strategy of our inference engine.  It also illustrates an important relationship between statements, called SUBSUMPTION.  From RAINING -> HUMID100, we can infer everything we can infer from (NOT HUMID100) -> (NOT RAINING), plus more things.  Thus if our inference strategy is powerful enough, we don't have to record the latter fact to be able to use it.  Modus Tollens will allow us to infer the same things from just the first statement.

---
A simple FORWARD-CHAINING inference strategy

A simple design for a theorem prover is to have a simple list of statements and an inference engine that just goes through this list and tries to match things up so that it can draw inferences.  Our inference engine can just go through the list and take each rule, and for each rule see if there's another statement that it can use to infer something new.

   FOR each statement
      IF the statement is a rule THEN
         FOR each statement
            IF the statement enables an inference via
               modus ponens or modus tollens THEN
               IF the inferred fact is new THEN
                  add the fact to the database

We thus go through the list finding things we can infer, and adding them to the database.  We can put this pair of nested loops inside another loop, and just do this as long as we keep inferring new things.

This is called FORWARD CHAINING, because we go forward from things we already know to infer new things.
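Here's a runnable Python sketch of such a forward chainer, using the same made-up representation as before, with "NOT X" strings for negations:

   def negate(stmt):
       return stmt[4:] if stmt.startswith("NOT ") else "NOT " + stmt

   def forward_chain(facts, rules):
       """Apply modus ponens and modus tollens until nothing new appears."""
       facts = set(facts)
       changed = True
       while changed:
           changed = False
           for premise, conclusion in rules:
               inferred = None
               if premise in facts:                  # modus ponens
                   inferred = conclusion
               elif negate(conclusion) in facts:     # modus tollens
                   inferred = negate(premise)
               if inferred is not None and inferred not in facts:
                   facts.add(inferred)
                   changed = True
       return facts

   rules = [("RAINING", "HUMID100"), ("HUMID100", "HUMID")]
   print(forward_chain({"RAINING"}, rules))
   # => {'RAINING', 'HUMID100', 'HUMID'}
   print(forward_chain({"NOT HUMID"}, rules))
   # => {'NOT HUMID', 'NOT HUMID100', 'NOT RAINING'}

The second query shows subsumption in action: the tollens direction was never written down as a separate rule, but the engine infers it anyway.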
This implementation is slow for two reasons.

One reason is that keeping simple lists means that we spend a lot of time going through the lists looking for matches.  The pair of nested loops that looks for matches gets very expensive (n squared) if there are a lot of statements in the database.  The performance of the algorithm can be improved dramatically if we use INDEXING to make searches faster.  For example, we might keep some hash tables that hold lists of statements categorized by what's on their left-hand sides.  Then we don't have to use a linear search to find matches.  (A sketch of this follows below.)

But that's not enough to make simple forward chaining practical.  A more basic reason that it's inefficient is that mindless forward chaining is likely to infer lots of things that don't have anything to do with solving a particular problem.  Undirected forward chaining will eventually infer all of the statements that can be inferred, but it may take a very, very long time before the fact you're interested in pops out.  If you want to get a useful answer in your lifetime, you'll usually have to use a more directed strategy.

Forward-chaining inference engines therefore usually use HEURISTICS to decide which of the many inferences are worth making.  The heuristics (rules of thumb) are supposed to help it order its "search" for answers, so that it works on the things that are most likely to be fruitful first.

---
COMPLETENESS and EFFICIENCY

This illustrates a basic tension in theorem proving.  Some inference strategies are COMPLETE, in that they are able to infer anything that can be inferred from the information they're given.  Unfortunately, COMPLETE strategies are INEFFICIENT, because they're likely to produce a lot of useless stuff before the answer you want pops out.

There are two basic strategies for coping with this:

   1. Use heuristics to guide the complete strategy, to increase the
      probability of getting the answer you want within the time
      you're willing to wait.

   2. Use an incomplete strategy, but ensure that the information you
      put in the database is encoded in such a way that all of the
      useful inferences will get made anyway.

Both of these strategies are fine arts, and underscore an important fact about theorem proving: just because something is logically correct doesn't mean it's useful.  Either way, you have to have a problem-solving strategy that tells you how to code the inference strategy you need, or how to encode the information operated on by a simpler strategy to get a similar effect.

There's also a third, common strategy used by many Prolog programmers:

   3. Encode the information in a way that is not logically correct,
      but be careful to ensure that the inferences that are drawn
      don't actually affect the correctness of the answer.

No matter what you do, you can't guarantee that what the inference system will do is both correct and efficient without knowing what the inference engine will do with the information you encode.  That's why "logic programming" IS programming---you're encoding a problem solution strategy one way or the other: by controlling the inference engine's algorithm, or by encoding the data with careful regard to how the inference engine will react.
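Here's the indexing idea as a small Python sketch (again with made-up names): instead of scanning every statement on every pass, we keep a hash table mapping each proposition to the rules that mention it on the left-hand side.

   from collections import defaultdict

   rules = [("RAINING", "HUMID100"),
            ("HUMID100", "HUMID"),
            ("TUESDAY", "NOT MONDAY")]

   # Index rules by their left-hand side.
   by_premise = defaultdict(list)
   for premise, conclusion in rules:
       by_premise[premise].append((premise, conclusion))

   # When a fact is added, only the rules it can possibly trigger are
   # examined: a hash lookup instead of a linear scan of the database.
   def rules_triggered_by(fact):
       return by_premise[fact]

   print(rules_triggered_by("RAINING"))   # => [('RAINING', 'HUMID100')]

(To exploit modus tollens the same way, you'd keep a second table indexed by right-hand sides.)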
---
BACKWARD CHAINING

An alternative to FORWARD CHAINING is BACKWARD CHAINING.  Rather than starting from what you know and trying to infer something useful, you start from something it would be useful to know and work backwards, trying to see if you can infer it from what you already know.

Here's an example.  Suppose we have the following statements in our database:

   RAINING -> HUMID100
   RAINING -> STREETS_ARE_WET
   HUMID
   TUESDAY -> (NOT MONDAY)
   RAINING
   THURSDAY

Now we ask our backward chaining inference engine if it's HUMID100.  The inference engine looks through the database to see if there's a rule it can use to infer that it's HUMID100.  It finds the rule RAINING -> HUMID100 and sees if it can find something that proves what's on the left-hand side, i.e., RAINING.  It finds the fact RAINING in the database, which satisfies the rule without any further rule applications.  It then draws the inference that HUMID100 is true, and says YES.

This backward search for supporting evidence can continue if the desired fact is not in the database, but a rule is present that can confirm what's on the left-hand side.  The inference engine may go through many chains of rules to find a sequence of rules that leads from known facts to the desired fact.

The advantage of backward chaining is that it's DIRECTED---it only searches through rules that are (at least superficially) relevant to answering the question you pose.  (Notice, however, that the search for a chain of rules may still be slow---if the database holds many rules about the statement in question, the inference engine may backtrack through a lot of seemingly relevant rules that turn out to be dead ends because they're not enabled by any facts, before getting to the right chain of rules that does lead back to known facts.)
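A backward chainer for this propositional setting is only a few lines of Python.  (A sketch: facts are represented as rules with no premise, and we ignore possible cycles among rules.)

   rules = [("RAINING", "HUMID100"),
            ("RAINING", "STREETS_ARE_WET"),
            (None, "HUMID"),                 # plain facts have no premise
            ("TUESDAY", "NOT MONDAY"),
            (None, "RAINING"),
            (None, "THURSDAY")]

   def prove(goal):
       """Search backward from the goal toward known facts."""
       for premise, conclusion in rules:
           if conclusion == goal:
               if premise is None or prove(premise):
                   return True      # a fact, or a provable premise
       return False                 # no rule chain leads to the goal

   print(prove("HUMID100"))         # => True  (via RAINING)
   print(prove("MONDAY"))           # => False (nothing concludes MONDAY)

The search only ever touches rules whose conclusions match the current goal or subgoal, which is exactly the "directedness" described above.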
---
RESOLUTION-based theorem proving

Recall that logical implication can be used to infer that the left-hand statement is false if the right-hand statement is false, as well as being used to infer that the right one is true if the left one is true.  Given this interpretation of A -> B (logical implication), it turns out that A -> B is exactly equivalent to B OR (NOT A).  Think about that for a while if it's not obvious.

This means that we can convert rules into logical disjunctions (OR statements).  We can use this as part of a strategy for converting every statement into an equivalent disjunction.  Suppose we have these two statements

   A -> B
   (NOT A) -> C

We can convert them to the form

   B OR (NOT A)
   C OR (NOT (NOT A))

The second one can be trivially simplified so that we have

   B OR (NOT A)
   C OR A

(This simplification will be performed automatically by a theorem prover.)

There is a basic inference rule that says that if you have two disjunctions, and one contains a statement and the other contains the negation of that statement, then you can infer a new disjunction that combines the remaining components of each disjunction into a single disjunction.  In this case, we can combine these two disjunctions to get

   B OR C

This is a very important technique called RESOLUTION.  It turns out that if we convert all of our implications into disjunctions, then this inference rule gives the effect of both modus tollens and modus ponens---anything that can be inferred using modus ponens and modus tollens on implications can also be inferred using resolution.

Forward chaining using only RESOLUTION is a complete inference strategy.  As long as we convert all of our statements to disjunctions (and we can), then anything we can infer using other sound inference techniques can also be inferred using resolution alone.

RESOLUTION-BASED theorem provers are based on this idea.  All of the statements in the database are represented as DISJUNCTIVE CLAUSES (OR statements) of propositions or the negations of propositions, and all the inference engine does is to combine disjunctions using resolution.

To make this work, we need to be able to convert all kinds of statements involving implications and AND into disjunctions.  There are some simple rules for this.  When we convert our initial database to a set of disjunctive clauses, it may be much harder for humans to understand than if we used implications.  We not only convert the implications into disjunctions, but we have to convert ANDs into ORs somehow as well.

Consider a clause like

   A OR (B AND C)

There's another conversion which lets us split this into two clauses.  Notice that if A OR (B AND C) is true, then if we know A is false, we know that B is true and we also know that C is true.  We can split this "and" up into two separate clauses which give the same effect:

   A OR B
   A OR C

(Since all clauses in the database are supposed to be true, there's implicitly an AND between them.)

By using these simplification rules, we can convert complex expressions in terms of AND and -> into disjunctions.  For example,

   D -> (A AND B AND C)

first gets turned into

   (NOT D) OR (A AND B AND C)

and then gets turned into

   (NOT D) OR A
   (NOT D) OR B
   (NOT D) OR C

This has the advantage of presenting a simple representation that a simple algorithm can work with, but it has the disadvantage that the resulting clauses are hard for humans to understand.  It's very difficult at times to figure out what's going on with a resolution theorem prover, and what the individual inference steps mean.

---

Like any other complete inference strategy, resolution can result in the generation of a lot of useless statements clogging up the database.  Resolution-based theorem provers typically have several control features to help the programmer tailor the inference engine's search strategy so that it's likely to work for a particular problem or kind of problem.  For example, the inference engine is likely to favor resolutions involving short clauses rather than long ones, because short disjunctions are more powerful statements than long ones.  (E.g., "it's raining or it's Tuesday" is a weaker statement than "it's raining".)

These heuristics are weak, however, in that they're superficial; they don't "know" anything about the meanings of the clauses being computed over.  There is no way to get around the fact that the programmer will often have to take advantage of deep knowledge about the domain and "tell the system what to do", one way or another.

One common way of doing this is for the programmer to propose lemmas, to guide the theorem prover's search.  Rather than expecting the theorem prover to work all the way from the facts to the conclusion of a large proof, the programmer may use the theorem prover as an automatic assistant; the programmer will design an overall proof, but rely on the theorem prover to automatically fill in many of the details.

Going even further in that direction, theorem provers are sometimes used as PROOF CHECKERS; a human may construct a complex proof, and the theorem prover is used to check to see that all of the steps are sound.
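Here's a minimal Python sketch of the resolution step itself, with clauses represented as sets of literals (a positive literal is a plain string; a negative one is prefixed with "NOT "):

   def negate(lit):
       return lit[4:] if lit.startswith("NOT ") else "NOT " + lit

   def resolve(clause1, clause2):
       """Return all clauses obtainable by resolving the two inputs."""
       results = []
       for lit in clause1:
           if negate(lit) in clause2:
               # Cancel lit against its negation; keep everything else.
               rest = (clause1 - {lit}) | (clause2 - {negate(lit)})
               results.append(frozenset(rest))
       return results

   # The example from the text: (B OR (NOT A)) and (C OR A) yield (B OR C).
   print(resolve(frozenset({"B", "NOT A"}), frozenset({"C", "A"})))
   # => [frozenset({'B', 'C'})]

An empty frozenset coming out of resolve() would be the "zero-length clause" that signals a contradiction, as described below.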
---

Resolution theorem provers are based on PROOF BY CONTRADICTION.  (We're talking about the high-level strategy here, not a single local inference step as in the case of modus tollens.)  Rather than trying to prove that something is true, you try to prove that its negation results in a contradiction.

This turns out to be a convenient way to detect that the system has done the necessary steps, because detecting a simple contradiction is easy---at some point, you end up with two clauses that cancel each other out completely (because one contains just the negations of the terms in the other), and yield a zero-length clause.  If this ever happens, you know that the negation of the thing you want to prove is inconsistent with the other things in the database, therefore it must be true.  For example, we might end up with a clause that contains only P, and another clause that contains only (NOT P).  At that point we know there's an inconsistency somewhere.

There's a catch here, of course---when we detect an inconsistency, it's not always obvious where the inconsistency came from.  Real theorem provers do some bookkeeping to keep track of which inference steps led to each clause in the database, so that they can show the whole chain of steps that led from the initial database state to a contradiction, and leave out all of the other steps, which ended up producing clauses that weren't involved in generating the contradiction.  There's still the problem that what the theorem prover did may be hard to understand---there may be a sequence of resolutions you can follow, but it's unclear what the clauses mean because they're just weird disjunctions.

----------------------------------------------------------------------
Prolog

Prolog is a "logic programming" language, which is based on a simple kind of theorem prover.  Prolog restricts the language that you write rules in to be implications whose right-hand sides contain preconditions ANDed together, e.g.,

   A <- B AND C AND D

tells Prolog that if B and C and D are all true, then A is true.  Actually, not only do we reverse the direction of the implication arrow, but we also write it differently, as :-, and we use a comma to mean AND, so we say

   A :- B, C, D

There are two reasons for this particular restricted format:

   1. It's pretty easy for humans to understand, and

   2. Rules like this can be converted to HORN clauses, which are a
      special case of disjunctive clauses, and can be operated on by a
      special restricted case of the resolution algorithm.

---
HORN CLAUSES and Prolog

There's a special kind of disjunctive clause called a HORN CLAUSE, which consists of any number of negated terms and exactly one non-negated term.  By convention, the non-negated term is written first, and is called the HEAD.

Consider the rule

   A <- B, C, D

We can convert any Prolog-style implication statement into a Horn clause automatically.  First we rewrite it as A OR (NOT (B AND C AND D)), then we must rewrite (NOT (B AND C AND D)) to get rid of the ANDs.  There's a simple rule for this (De Morgan's law): the negation of a conjunction is equivalent to a disjunction where the parts are negated.  In this case, if we know it's NOT true that B and C and D are all true, then that's the same thing as knowing that at least one of B, C, and D is false.  That is, (NOT B) OR (NOT C) OR (NOT D); we plug this back in to get

   A OR (NOT B) OR (NOT C) OR (NOT D)

Now it's in the format we want for Prolog's restricted version of resolution: a disjunction with exactly one non-negated term.
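This conversion is mechanical, as a small Python sketch shows.  (The (head, [preconditions]) rule format is made up, as are the "NOT " string literals, matching the earlier sketches.)

   def horn_clause(head, preconditions):
       """Convert head :- p1, ..., pn into a Horn clause (a set of literals)."""
       # A :- B, C, D  ==>  A OR (NOT B) OR (NOT C) OR (NOT D)
       return frozenset({head} | {"NOT " + p for p in preconditions})

   print(horn_clause("A", ["B", "C", "D"]))
   # => frozenset({'A', 'NOT B', 'NOT C', 'NOT D'})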
---
Prolog's Inference Strategy

While Prolog uses resolution, in a restricted way, you don't have to worry about that when you program in Prolog.  You can think of it as being Modus Ponens applied to the rules in the Prolog program.

Prolog uses a very simple backward-chaining search strategy, specifically a depth-first search for rules, which is ordered by the order you write the rules.

When you state a goal clause, Prolog attempts to use the first relevant rule it finds to prove that clause.  To do so, it tries to find proofs of the preconditions for those rules, in order.  To prove the precondition of a rule, it similarly tries to find an applicable rule, prove its preconditions, and so on.  This results in a depth-first search of all of the chains of rule applications that could lead to proving the clause in question.

At each attempt to satisfy a precondition of a rule, that term becomes a SUBGOAL, i.e., a goal to be satisfied as part of the overall goal of proving what you asked.  Whenever Prolog fails to prove a subgoal, it FAILS, and BACKS UP to find another way of proving the higher-level subgoal.  This depth-first pattern is therefore called BACKTRACKING.

---
Problems with Prolog

Prolog has three main problems, and two of them arise because of this simple depth-first backtracking strategy:

   1. Backward chaining may be inefficient, because it may backtrack
      through many subgoals and rule applications without being able
      to find a path backward to the facts.  There can be an
      exponential explosion due to the number of rules that are
      applicable at each step.

   2. Backtracking is not a complete proof strategy.  There are many
      simple inferences that this simple search will miss, and that
      means Prolog can fail to prove things that are logically implied
      by its rules and facts.

(The third problem, discussed below under NEGATION-AS-FAILURE, is that Prolog's strategy is also unsound.)

The first problem is dealt with by structuring the rules carefully, and using certain non-logical operations (in particular, the CUT, which can be embedded in a clause to tell Prolog not to backtrack through it).  This means that Prolog programmers must be very careful how they structure their rule sets.  They order their rules carefully to make sure that the right rules are used first, and so that CUTs can be used to prune the "search tree" of the normal backtracking.

The second problem is also dealt with by using non-logical operators.  Here's an example of something Prolog can't infer using its usual backtracking alone.  Suppose we have the following two rules:

   A -> B
   (NOT A) -> B

and we ask if B is true.  Logically, it's obvious that B is true, because if A is true, then B is true, AND if A is false, then B is true in that case, too.  There's no possible state of affairs where B is not true.

What Prolog will do is try to prove B using the first rule, and fail, because there's no rule that allows it to infer A.  Then it will try to use the second rule, which will fail because there's no rule that allows it to infer NOT A.  Then it will stop, having failed to prove B.

In contrast, this is a trivial one for normal resolution.  The equivalent disjunctive clauses are:

   B OR (NOT A)
   B OR (NOT (NOT A))

where the second simplifies, so that we have

   B OR (NOT A)
   B OR A

One resolution step will cancel the (NOT A) and the A out, and combine the remaining terms to get (B OR B), which simplifies immediately to B.

---

Because of Prolog's limitation, there is a facility for forward chaining, which programmers must use explicitly if they want forward chaining to occur.  This facility is called ASSERT, which is an operation that can be embedded in a rule to add something to the database, e.g.,

   foo :- bar, bleen, assert(quux)

If Prolog attempts to apply this rule while it's backward chaining, it will first try to prove bar, then bleen, and if it gets that far, it reaches the assert(quux).  At that point, quux is added to the database.
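Here's a Python sketch of this depth-first, rule-ordered backward chaining.  (Propositional only; real Prolog also has variables and unification.  The program representation, a list of (head, [preconditions]) clauses in source order, is made up for illustration.)

   program = [("foo",   ["bar", "bleen"]),
              ("bar",   []),                # a fact: no preconditions
              ("bleen", ["quux"]),          # this clause will fail...
              ("bleen", [])]                # ...and this one will succeed

   def solve(goal, depth=0):
       """Try each matching clause in order; recursively prove its body."""
       for head, body in program:
           if head == goal:
               print("  " * depth + "trying", head, ":-", body or "true")
               if all(solve(subgoal, depth + 1) for subgoal in body):
                   return True
               # this clause FAILED; back up and try the next one
       return False

   print(solve("foo"))   # => True (bleen fails via quux, then succeeds
                         #          via its second clause)

Running it prints the trace of subgoals, including the failed attempt on quux and the backtracking to the second bleen clause.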
Prolog programmers become very adept at using non-logical operations like this to get around the incompleteness of backward chaining.  They must understand both the data (rules) and the search algorithm, and ensure that they use assertions like this to force Prolog to draw inferences it otherwise would not draw.

This is rather tacky, because ideally there would be a SEPARATION OF CONCERNS in logic programming:

   * the database would hold declarative information only, and

   * the search strategy would be controlled separately.

That way, verifying that a logic program is correct could be separated into two steps:

   * ensuring that the declarative information is correct, i.e.,
     factual, and

   * ensuring that the proof strategy is efficient enough to make the
     system usable.

---
NEGATION-AS-FAILURE and THE CLOSED-WORLD ASSUMPTION

Not only is Prolog's inference strategy incomplete, it's also unsound.  There are times when Prolog will draw inferences that are not logically implied by the rules and facts in the database.

This occurs because Prolog assumes that if it can't prove something is true, then it must be false.  If you ask whether something is true, and Prolog can't prove it using backtracking (and whatever assertions you've put in), it will tell you it's false.  Prolog doesn't know how to say "I don't know".

This embodies the "closed-world" assumption: Prolog assumes that its database contains all of the information relevant to any question that you ask it.  The closed-world assumption makes sense in some domains.  If you can guarantee that all of the relevant information is in the database, it doesn't, in itself, cause a problem.  But that means that you can't count on Prolog to give you the right answer if the information in the database is incomplete, as is true in many real-world situations.

Actually, though, Prolog makes an even stronger assumption, because of its incomplete inference strategy.  Not only does it assume that all of the relevant information is available in the database, it assumes that it's all encoded in such a way that any question you ask it can be answered correctly using its incomplete search strategy.

---

You might think that the unsoundness of Prolog's inference strategy would make it unusable, but Prolog programmers become adept at organizing their rules in a way that works with Prolog's inference strategy.  They also become adept at writing procedural code within the framework of depth-first backtracking; this allows them to easily mix real logic programming with more conventional algorithmic programming.  Because of this, it is fairly easy to write many programs in Prolog and know that they will execute reasonably efficiently (within, say, an order of magnitude of the speed of a C program).  In contrast, purer theorem provers are notoriously difficult to make efficient.

Prolog is by far the most popular logic programming language, partly because of its pragmatic nature, allowing a mix of programming styles.

---
VARIABLES, UNIFICATION, DATA STRUCTURES

(See book chapter on Prolog.)
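Negation-as-failure is easy to see in a little Python sketch (a toy again; the program and its facts are invented):

   program = [("penguin", []),
              ("flies", ["bird", "NOT penguin"]),   # NOT = "not provable"
              ("bird", [])]

   def solve(goal):
       if goal.startswith("NOT "):
           return not solve(goal[4:])     # negation-as-failure
       return any(all(solve(s) for s in body)
                  for head, body in program if head == goal)

   print(solve("flies"))                  # => False: penguin is provable

   # Remove the penguin fact, and "flies" becomes True, not because we
   # proved the penguin isn't one, but because we CAN'T prove it is.
   program.pop(0)
   print(solve("flies"))                  # => True

If the database is missing relevant information (the world isn't really closed), that second answer may simply be wrong.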
---
TIME in LOGIC PROGRAMMING

Earlier we said that in logic programming, we generally don't CHANGE (mutate) the state of our representation, in the sense we often do in conventional programming languages.  Instead, as we infer new facts, we ADD those to our representation to flesh it out.  Usually, when solving a logic problem, there's no state involved.

Sometimes, though, we want to reason about time.  For example, a robot might have a theorem prover that proves that it's possible to accomplish a task, and the proof itself will contain a plan for accomplishing the task.

Usually, rather than relying on side effects (like Prolog's assert and retract), we represent states explicitly as sets of facts or data structures.  For example, we may represent time explicitly using numbers that indicate steps of a process.  The fact

   ON_CHAIR(Box,21)

might indicate that the Box is on the chair at time 21.

------------------------------------------------------------------------
CONSTRAINT PROGRAMMING

Like logic programming, constraint programming is a kind of declarative programming.  A constraint programming system allows you to partially specify the state of a program, and have the system automatically flesh out things that you didn't explicitly declare.

The first (recognizable) constraint programming system was SKETCHPAD, an interactive graphical program created by Ivan Sutherland.  SKETCHPAD let you graphically specify geometric shapes and constraints between them, and then interactively change things.  (For example, you might draw a triangle, and declare that two sides should always be of equal length.)  In response to a change (e.g., moving a point of a triangle), SKETCHPAD would update other aspects of the state of the system, in order to preserve the specified constraints.  (E.g., if you stretched one side of the triangle whose two sides must be the same length, SKETCHPAD would stretch the other one to match.)

The basic idea of a constraint programming system is that you declare some constraints and the system figures out how to keep them satisfied.  In most constraint programming systems, you can use side effects (i.e., change something), and the system must respond.  In effect, this poses a new problem to the system, which must solve it.  At each step, the system's task is to solve a problem: how to transform the current state into an acceptable new state that incorporates the changes made by the user.

If the user can specify arbitrary constraints, then there is no algorithm that can ensure that they are all satisfied.  It's even less likely that the system will automatically find a satisfactory arrangement within a reasonable period of time.  Real constraint programming systems (like logic programming systems) therefore trade off generality for efficiency, or require the programmer to carefully arrange the constraints so that the problems are tractable.

One common way of helping keep constraints tractable is to LIMIT THE DOMAIN over which you can specify constraints.  You might, for example, only be able to express constraints about 2-dimensional arrangements of geometric shapes using quadratic (and simpler) formulas.  (Another example would be constraints among voltages and resistances in resistive networks.)

Another way of making things tractable is by requiring that the user PRIORITIZE constraints, and be willing to accept violations of low-priority constraints.  This allows the system to satisfy the high-priority constraints first, and then see if it can satisfy the low-priority ones.  Without this kind of prioritization, the system might repeatedly attempt to satisfy a constraint, only to find that it causes a chain of changes which eventually violates another constraint.
A third way of making life easier for the constraint satisfier is to let the programmer supply PROCEDURES for satisfying constraints.  That is, besides specifying the constraint (what), the programmer may supply procedures for re-satisfying the constraint when things change.  This lets the programmer explicitly program strategies that the underlying constraint satisfaction engine is too dumb to find by itself.

General constraint-satisfaction programming languages let you combine these strategies.  In effect, they let you come up with your own constraint-satisfaction language for the application domain you care about, specifying your own constraints, priorities, and procedures for re-satisfying constraints.  Then you can use the result as a domain-specific constraint programming system.  The programmer who tailors the general-purpose system may be a different person than the "users" who use the resulting domain-specific system.

SPREADSHEETS

The most popular form of constraint satisfaction language is the SPREADSHEET.  A spreadsheet is a simple constraint language, presented to the user via a graphical interface.

The user sees a two-dimensional array of CELLS, organized into rows and columns.  The user can load data into cells, either by typing them directly in at the graphical user interface, or by loading them in from files.  The user can also specify constraints between the values in the cells, for example telling the system that the value in a given cell should always be the sum of the values in two other cells, or in a whole column of cells.

Spreadsheets are acceptably efficient because they limit the kinds of constraints you can specify (e.g., you typically can't have cyclic dependencies), and because you generally don't use a spreadsheet for operating on huge volumes of data.  (You might load tens or hundreds or even a few thousand data points into a spreadsheet, but probably not a million.)

Spreadsheets are limited, but within their limitations they can be extremely useful and easy to use.  They are often used for financial calculations and for exploratory data analysis.  It's easy to program a formula into a spreadsheet, and then fiddle with one of the inputs to the formula and watch the changes propagate to the outputs.  This kind of "what if" analysis makes it easier to see the impact of things like budget changes, or different ways of normalizing experimental data.

Spreadsheets are the most popular new form of programming language in the last couple of decades, precisely because they are NOT general-purpose programming languages.  Their limitations and their user interface make them easy for relatively naive users to grasp and start using, without having to "learn to program" in the usual sense.
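Here's a tiny Python sketch of the spreadsheet idea: cells whose formulas are automatically recomputed when an input changes.  (No GUI, no dependency analysis or cycle detection; all the names are invented.)

   cells = {"jan": 100, "feb": 120}           # input cells hold plain values
   formulas = {"total":   lambda c: c["jan"] + c["feb"],
               "average": lambda c: c["total"] / 2}

   def recalculate():
       # Formula cells are reevaluated in a fixed order here; a real
       # spreadsheet would order them by their dependencies.
       for name, formula in formulas.items():
           cells[name] = formula(cells)

   recalculate()
   print(cells["total"], cells["average"])    # => 220 110.0

   cells["jan"] = 200                          # "what if" January doubles?
   recalculate()
   print(cells["total"], cells["average"])    # => 320 160.0

The user supplies only the relationships between cells (the WHAT); deciding when and in what order to recompute them is the system's job (the HOW).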