AI, expert systems and knowledge representation

August 2021 | Carla Melia, Lucia Campomaggiore, Ludovico Dellavecchia

What is an expert system

An expert (or knowledge-based) system is software capable of artificially reproducing the performance of one or more “expert” persons in a given field of activity, and finds its application in the branch ofartificial intelligence.

One of the special features of an expert system is that it is capable of autonomously implementing inference procedures, an inductive or deductive process that enables it to reach a conclusion following the analysis of a set of facts or circumstances appropriate for solving particularly complex problems.

Functions of an expert system

A typical expert system has two main functions: that of drawing conclusions and then making or suggesting choices (WHAT) and that of explaining how, by what reasoning it arrived at those conclusions (CONTROL).

An expert system consists of:

a knowledge base‍
a working memory‍
an inferential engine‍
a user interface

The knowledge base contains the deductive rules and procedural dictates that the system uses in its operation. The inferential engine is needed to direct the program tointerpret, classify and apply the knowledge base and its rules for each individual aspect or scenario of the specific subject field. The working memory, or short-term memory, contains the data and conclusions reached by the system. Finally, the user interface allowsinteraction between the human subject and the program that is to provide answers to his or her problems.

This information is quite general and extremely flexible. The program is not a set of immutable instructions that represent the solution to the problem, but an environment in which to represent, use, and modify a knowledge base.

Characteristic of expert systems is to imitate the human expert not only in performance but also in the way they perform inferences. Most of them make use of so-called“surface reasoning” or fuzzy logic, based on the use of a large number of strategies or rules of thumb, called heuristics, that directly link known facts with those to be inferred, without a true understanding of the type of link that exists.

The stratagem of heuristics takes over if the components that preside over inference procedures, fail to achieve the rigor inherent in analgorithm, since in highly complicated situations it would be too expensive to analyze every possibility.

It is in the very nature of heuristics that they cannot be proven correct as this would sacrifice highly probable but still fallible results, not always giving the best outcome in practice; however, they enable experts to make decisions when “stronger” criteria are not available.

As mentioned earlier, expert systems can rely on a knowledge base to come up with their conclusions.The knowledge is stored in the system’s long-term memory and is organized and represented in the form of rules, for example defined as “if-then” structures, which include a premise or condition (if) and a conclusion or action (then), and describe the resolution of a given problem.

A tree-based expert system, on the other hand, starts with a set of data and some inferences and then creates a classification tree through which the new data will be analyzed leading to the inference represented by the target node.

We will now focus on rule-based expert systems. An introduction to mathematical logic is therefore in order.

Expert systems and mathematical logic

The purpose of logic is to formalize the mechanisms of reasoning. We will make a brief mention of propositions, that is, expressions representing statements that in the context of this article can only be either TRUE or FALSE. As obvious as this may seem, there are in truth areas in which each proposition can be given a degree of truth other than 0 and 1 and between them (i.e., using fuzzy logic).

Each elementary proposition is associated with a propositional variable. For example:

P1: If it is hot and humid, then it will rain
P2: If it is humid and it is summer, then it is hot
P3: Now it is humid
P4: Now it’s summer

If I call A = IT IS HOT, B = IT IS HUMID, C = IT IS SUMMER, D = IT WILL RAIN.

The representation in formulas becomes

F1: A∧B →D,
F2: B∧C→A
F3: B
F4: C

One may want to show that from these 4 facts it logically follows P5: It will rain, or F5: D. This can be taken care of by an expert system.

As we see aformulais composed of atomicformulas(or atoms) (i.e. A, B, …) andlogical connectives.

The most commonly used logical connectives are:

¬(NOT: negation)
∨(OR: disjunction)
∧(AND: conjunction)
→(IMPLIES or IF…THEN…: implication)
↔(IF AND ONLY IF: bi-impllication)

A formula is well-formed(FBF) if and only if it can be obtained by applying the following rules:

an atom is an FBF
if F is an FBF, then (¬F) is an FBF
If F and G are FBFs, then (F∨G), (F∧G), (F→G), (F↔G) are FBFs

Simplifying in this discussion, an FBF is a formula to which a truth value, true or false, can be assigned.

The rules seen express the SYNTHASIS (structural constraints) of the formulas of the propositional calculus. By establishing an ordering between connectives, some parentheses can be eliminated. The order that will be adopted is as follows:

¬
∧, ∨
→, ↔

Thus, The formula ((((A∨B)∧C)→D)↔(((¬D)→A)∨E) can be rewritten (by removing the outer parentheses) as ((A∨B)∧C→D)↔(¬D→A)∨E.

The SEMANTIQUE of propositional logic requires the introduction of truth values. The set of truth values (which we will denote by B) includes TRUE and FALSE, represented by B= {T, F}.

An I-interpretation consists of a mapping between the set of formulas and B (that is, specifying, for each formula, whether it is true or false).

How can the truth value of a formula be derived?

For example, if we ask whether (P ∧Q) ∨R →(P ↔ R) ∧ Q we can pose

α= (P ∧Q) ∨R

β= (P ↔ R) ∧Q

and then ask whether α →β by consulting the following classicaltruth table(intuitively, each row of a truth table corresponds to a different possible situation (interpretation)).

Some formulas are true in all interpretations. They are calledvalid formulas or tautologies. Other formulas are false in all interpretations. They are calledinconsistent formulas or contradictions.

Since each formula is finite and thus contains a finite number of atomic formulas, it is always possible to determine whether it is valid, inconsistent or neither. Propositional logic is therefore decidable.

Some valid formulas are counterintuitive (and are calledparadoxes). In them the connective → appears.

An example is ¬P→(P →Q) which is always true.

Two formulas F and G areequivalent(written F ⇔G) if and only if they have the same truth value in all interpretations.

Modus ponens (MP) is a simple and valid rule of inference, which states in words, “If p implies q is a true proposition, and the premise p is also true, then the consequence q is true,” i.e. (p ∧(p →q)) →q.

It is reiterated that implication has nothing to do with causality. In fact, propositional logic deals only with “combinations” and not with “cause-and-effect” meanings.

We will not go into the logic of predicates as, while it is a more powerful and flexible tool, it is also much more complicated and not suitable for the purposes of this article.

In order to deal mechanically, fbf must be placed in clause form. The steps to follow are.

Elimination of implications
Reducing the range of negation signs
Standardization of variables
Elimination of existential quantifiers: each existentially quantified variable is replaced with a Skolem function
Conversion to preneeded form
Matrix transformation to conjunctive normal form
Elimination of universal quantifiers
Elimination of conjunction marks

Example of application

A number of operations, illustrated by means of the following example, are applied successively:
( ∀x){P(x) →{( ∀y){P(y) →P(f(x,y))} ∧ ¬ ( ∀y){Q(x,y) →P(y)}}

1) Elimination of implications
i.e. A → B is replaced by:

¬A ∨ B Therefore: ( ∀x){P(x) →{( ∀y){P(y) →P(f(x,y))} ∧ ¬ ( ∀y){Q(x,y) →P(y)}}

becomes:

( ∀x){ ¬P(x) ∨{( ∀y){ ¬P(y) ∨P(f(x,y))} ∧ ¬ ( ∀y){ ¬Q(x,y) ∨P(y)}}}

2) Reducing the range of negation signs
Negation applied to a single predicative letter.In practice:

Artificial Intelligence – Problem Solving 2 101

¬(A ∧ B) is replaced by ¬A ∨ ¬ B

¬(A ∨ B) ” ¬A ∧ ¬ B

¬¬ A ” A

¬ ( ∀x)A ” ( ∃x){ ¬A }

¬ ( ∃x) A ” ( ∀x){ ¬A }

Therefore:

(∀x){¬P(x)∨{(∀y){¬P(y)∨P(f(x,y))}∧ ¬(∀y){¬Q(x,y)∨P(y)}}}

becomes:

(∀x){¬P(x)∨{(∀y){¬P(y)∨P(f(x,y))}∧ (∃y){¬{¬Q(x,y)∨P(y)}}}}

And then:

(∀x){¬P(x)∨{(∀y){¬P(y)∨P(f(x,y))}∧ (∃y){Q(x,y)∧¬P(y)}}}

3) Standardization of variables
The quantified variables are renamed so that each quantifier has a unique apparent variable.

In practice:

( ∀x){P(x) → ( ∃x)Q(x)} becomes: ( ∀x){P(x) → ( ∃y)Q(y)}

Therefore:

(∀x){¬P(x)∨{(∀y){¬P(y)∨P(f(x,y))}∧ (∃y){Q(x,y)∧¬P(y)}}}

becomes:

(∀x){¬P(x)∨{(∀y){¬P(y)∨P(f(x,y))}∧ (∃w){Q(x,w)∧¬P(w)}}}

4) Elimination of existential quantifiers
Every existentially quantified variable is replaced with a Skolem function.

For example, suppose that the fbf

( ∀ y ∃x)P(x,y)

can be interpreted as, “for all y there exists an x such that x is greater than y.”

NB: x may depend on y!

Then we look for a function g(y) (called Skolem’s function) that sends every value of y into the x that “exists.” So the fbf becomes

( ∀y)P(g(y),y)

Observe again.

– ∃z is deleted in:

{( ∀w)Q(w)} → ( ∀x){( ∀y){( ∃z){P(x,y,z) → ( ∀u)R(x,y,u,z)}}}

Obtaining:

{( ∀w)Q(w)} → ( ∀x){( ∀y){P(x,y,g(x,y)) → ( ∀u)R(x,y,u,g(x,y))}}

If the existential quantifier is not in the range of a universal quantifier, Skolem’s function has zero arguments.

Example:

( ∃x)P(x) is replaced by P( a ) where a is a constant that we know “exists.”

The example we are following becomes from so:

( ∀x){ ¬P(x) ∨{( ∀y){ ¬P(y) ∨P(f(x,y))} ∧ ( ∃w){Q(x,w)∧¬P(w)}}}

to so:

( ∀x){ ¬P(x) ∨{( ∀y){ ¬P(y) ∨P(f(x,y))} ∧ {Q(x,g(x))∧¬P(g(x))}}}

Where g(x) is a Skolem function.

5) Conversion to pre-connected form
All universal quantifiers (which are all different) are moved to the beginning of the fbf (pre-connected form).

Example from:

(∀x){¬P(x)∨{(∀y){¬P(y)∨P(f(x,y))}∧ {Q(x,g(x))∧¬P(g(x))}}}

becomes:

(∀x∀y){¬P(x)∨{{¬P(y)∨P(f(x,y))}∧ {Q(x,g(x))∧¬P(g(x))}}}

Where:

(∀x∀y) is called the prefix and {¬P(x)∨{{¬P(y)∨P(f(x,y))}∧{Q(x,g(x))∧¬P(g(x))}} is called the Intelligence matrix

6) Matrix transformation to conjunctive normal form
‍The matrix is written as a conjunction of a finite number of disjunctions of predicates and/or negations of predicates (conjunctive normal form). (Simply put, AND of OR, i.e., products of sums, i.e., maxterm).

Examples of conjunctive normal form:

{P(x) ∨Q(x,y)} ∧{P(w)∨¬R(y)} ∧Q(x,y) P(x) ∨ Q(x,y) P(x) ∧ Q(x,y) ¬R(y)

In practice, the relationship is repeatedly applied:

A ∨ (B ∧ C) ≡ { A ∨ B } ∧ { A ∨ C}

The example

(∀x∀y){¬P(x)∨{{¬P(y)∨P(f(x,y))}∧ {Q(x,g(x))∧¬P(g(x))}}}

becomes:

(∀x∀y){{¬P(x)∨ ¬P(y)∨P(f(x,y))}∧ {¬P(x)∨Q(x,g(x))}∧ {¬P(x)∨¬P(g(x))}}

7) Elimination of universal quantifiers
There remains the only matrix in which, because the variables are linked, they are all universally quantified.

8) Elimination of conjunction signs
The conjunction signs (i.e., ∧; example: A ∧ B) are eliminated resulting in two fbf (in the example, A and B). Repeatedly applying this replacement yields a finite list of fbfs, each of which is a disjunction ( ∨) of atomic formulas and/or negations of atomic formulas.

Horn's clause

A Horn clauseis a disjunction of literals in which at most one of the literals is positive. Example: ¬L∨¬B and ¬L∨¬B∨C are Horn clauses, ¬L∨B∨C is not. They are important because they can be written as implications whose premise is a conjunction of positive literals and whose conclusion is a single positive literal. For example, ¬L∨¬B∨C is equivalent to (L∧B)→C.

There are many ways to solve logic problems; we will use Modus Ponens as an example.

Resolution is an iterative process that, at each step, compares (resolves) two parent clauses and allows a new clause to be inferred. For resolution, two clauses containing the same atomic formula, once affirmed and once negated, are considered. The resolvent is obtained by combining all the atomic formulas of the parent clauses except those it deletes. If you produce the empty clause, you have a contradiction. The attempt will be to find a contradiction, if it exists. If it does not exist, the procedure may never end.

Resolution in propositional logic using Modus Ponens

The process is as follows:

1) Transform all propositions in F into clause form.
2) Negate S and transform the result into clause form. Add it to theset of clauses obtained in step 1).
3) Repeat until a contradiction is found or you can no longer go on with the following steps:
4) Select two clauses. Call them parent clauses.
5) Solve them together. The resulting clause, called the resolvent, will be the disjunction of all the atomic formul as of both parent clauses with the following exception: if there are pairs of atomic formulas L and ¬L, such that one of the parent clauses contains L and the other contains ¬L, then delete both L and ¬L from the resolvent.
7)If the resolvent is the empty clause, then a contradiction has been found. If not, add it to the set of clauses available to the procedure.

An example

Let the following axioms be given:

Axioms:
p
(p ∧ q) → r
(s ∨ t) → q
t

Converted to clause form:
p ‍
¬p ∨ ¬q ∨ r
¬s ∨ q
¬t ∨ q
t

One would like to demonstrate r.

After transforming the axioms into clause form, you introduce ¬r (already in clause form) into the list.
Then 2-by-2 clauses are selected and resolved (it pays to choose clauses that contain the same atomic form once affirmed and once denied).

Note: Proposition 2 is true if ¬p, ¬q, r are true
In the first step we assume that ¬r is true, along with preposition 2.
This can only happen if either
¬p or ¬q is true.
That is what the solver states!

Conclusion:
to prove r, one proves that ¬r creates contradiction. We then enter ¬r in the list of clause forms and search by resolution whether this contradiction exists (refutation method).

Other types of reasoning

We focused for simplicity on an example of a mechanism performed by an SE using classical logic but the types of reasoning are many, for example, we could use:

Nonmonotonic logic, which allows for deleting and adding utterances to the database. In this logic, belief in one utterance can be based on non-belief in some other utterance (reasoning by default).

Probabilistic reasoning, which makes it possible to represent probable but uncertain inferences.

Fuzzy logic (fuzzy logic), which allows continuous (nonbinary) or fuzzy properties of objects to be represented.

The concept of belief spaces, which makes it possible to represent patterns of sets of beliefs embedded in each other.

We conclude the article by giving some insight into the problem of knowledge representation, which is fundamental in this area.

Knowledge representation

Knowledge representation is a branch of artificial intelligence that studies the way human reasoning occurs, and is concerned with defining symbolisms or languages that allow knowledge to be formalized in order to make it understandable to machines, so that they can make automatic reasoning (inferring the information present) and thus extract new knowledge.

Through the chosen language, a series of assertions about the world will be made, which together will go to form a knowledge base (KB). It is also important that the language chosen to make the assertions is also able to operate on the KB to extract new knowledge and to add new knowledge.

Knowledge representation methods schematically are divided into implicit and explicit. In general, it is necessary to represent:

collection of objects

collection of attributes (properties of objects)

Sets of relationships (between objects).

The choice of representation influences the search effort, as it can allow for recognition of simplifying concepts (symmetries, similarities, etc.) and training macro-operators (it may be useful to use variables in describing states).

A related problem, called thecontour problemor frame problem, is to represent a world in which there are things that change and things that do not change. To get around the problem, one can, for example, keep track of only the changes (the starting state is fully described) or change the initial state with “invertible” operators so that one can go back by “undoing” the steps taken.

Characteristics of a good representation system

Representative adequacy: being able to represent all types of knowledge related to a domain.

Inferential adequacy: being able to manipulate representational structures so as to infer new knowledge

Inferential efficiency: being able to incorporate extra information to be used as a guide toward goals in inference mechanisms (focus of attention/heuristics)

Acquisition efficiency: ability to acquire new knowledge easily.

It is necessary for the system to possess and be able to manipulate a large amount of world knowledge(semantics and pragmatics) and to know the syntax and possess thevocabulary of the language. It is desirable but complicated to keep the two levels separate (but some syntactic interpretations are not possible if one does not know the context).

Classification of techniques

It appears necessary to make a distinction between declarative and procedural methods.

Declarative methods

Knowledge is represented as a static collection of facts, flanked by a small set of general procedures for manipulation (example: predicate logic).

Advantages of declarative methods

Each fact should be stored only once, regardless of the number of ways it can be used.
It is easy to add new facts to the system without changing either other facts or procedures.

Procedural methods

Knowledge is represented as procedures for its use.

Advantages of procedural methods

It is easy to represent knowledge about how to do things;

It is easy to represent knowledge that does not easily fit into many simple declarative schemes, such as default and probabilistic reasoning;

It is easy to represent heuristic knowledge about how to do things efficiently.

Most working systems use a combination of the two methods.

Patterns for describing knowledge structures

The description of knowledge structures is done by means of schemata, that is, an active organization of past relations, or past experiences, which must assume are operating in every adaptive organic response. Without going into detail we list below some types of schemas.

Frame

Frames are often used to describe a collection of attributes that a given object, for example, a chair, typically possesses.

Script

Scripts (scripts) are used to describe common sequences of events, such as what happens when you go to a restaurant.

Stereotypes

Stereotypes are used to describe sets of characteristics that are often present in people simultaneously.

Rule-based models

Rule models are used to describe common features shared within a set of rules in a production system.

Conclusions

In this article, we have seen some basic concepts of expert systems, which are extremely powerful though little known in the field of AI. We have also intuitively introduced concepts peculiar to mathematical logic and the problem of knowledge representation.