## 2.1 RELATIONAL DATABASESThe fundamental building blocks of databases are entities and relations. In our examples here, we refer to entities and relations using strings of letters, digits, and a few non-alphanumeric characters (e.g. "_"). For reasons described below, we prohibit strings beginning with upper case letters; all other combinations are acceptable. Examples include The set of all entities that can be used in a database is called the The set of all relations in a database is called the The A database In the In the database literature, it is common to present the extension of a relation as a 2-dimensional table. The number of columns in the table corresponds to the arity of the relation and the number of rows corresponds to the cardinality of the extension. For example, the extension of the
The
The
This tabular representation for relations makes clear that, for a finite domain, there is an upper bound on the number of possible extensions for an n-tuples. Every extension of an n-ary relation is a subset of these b tuples. Therefore, the extension of an ^{n}n-ary relation must be one of at most 2 possible sets.^{bn}## 2.2 SENTENTIAL DATABASESWhile the relational data model is an appealing way to think about data, the More precisely, given a database, we define a Consider the data shown in Figure 2.1. We can encode this in sentential representation as shown below. There is one sentence for each row in the original table. Each sentence here has the same relation. The relation is included so that we can combine these sentences with sentences representing other relations.
The sentences corresponding to the data in Figure 2.2 are shown below.
Finally, we have the sentences for the data in Figure 2.3. Here, each sentence has 3 arguments, since the relation
Since there are no column headings in such sentences in this presentation, as there are in tables, the order of arguments is important. Given the intended meaning of the The A
Datalog programs are built up from three disjoint classes of components, viz. An A A
q(a,Y) :- p(b,Y) & ~r(Y,d)
A An expression in Datalog is said to be A rule in a Datalog program is The A negation in a Datalog program is said to be In this book, we concentrate exclusively on Datalog programs that are both safe and stratified. While it is possible to extend the results here to other programs, such extensions are beyond the scope of this work. The An An An interpretation In general, a Datalog program can have more than one model, which means that there can be more than one way to satisfy the rules in the program. In order to eliminate ambiguity, we adopt the minimal model approach to Datalog program semantics, i.e. we define the meaning of a safe and stratified Datalog program to be its minimal model. A model ## 2.4 OPEN DATALOG PROGRAMSDatalog programs as just defined are Formally, an The The Given an open Datalog program Finally, we define the meaning of an open Datalog program to be a function that maps each input model for the program into the corresponding output model. As an example, consider the simple open Datalog program shown below. The universe of discourse is {
r(X,Y) :- p(X,Y) & ~q(Y)
In this case, there are 2
## 2.5 DATABASE QUERIESA The value of a query Query The problem of determining whether a Datalog program The problem becomes decidable if The expansion of a relation ## 2.6 DATABASE CONSTRAINTSIn our development thus far, we have assumed that the extension of an Database management systems can use such constraints in a variety of ways. They can be used to optimize the processing of queries. They can also be used to check that updates do not lead to unacceptable extensions. In many database texts, constraints are written in direct form - by writing rules that say, in effect, that if certain things are true in an extension, then other things must also be true. The In what follows, we use a slightly less direct approach - we encode limitations by writing rules that say when a database is One merit of this approach is that we can use Datalog to write such rules. We simply invent a new 0-ary relation, here called This approach works particularly well for consistency constraints like the one stating that a course cannot be its own prerequisite. Using this technique, we can write the constraint on teaches as shown below. There is an error if an entity is in the first column of the And we can even write the
In fact, using this approach, we can define constraints of all sorts. In order for the direct approach to work, it is necessary to introduce explicit ## 2.7 PARTIAL DATALOG PROGRAMSThe rules in Datalog programs provide complete definitions of view relations in terms of base relations. In data integration, complete definitions are not always possible. On the other hand, we may know something about the extension of the view relation and it is desirable to capture this partial information. In order to express what we do know about views, we can extend Datalog in various ways. In what follows, we consider two particular extensions - the introduction of Consider, for example, the relation of parenthood and the relation of grandparenthood. While it is possible to define grandparenthood in terms of parenthood, it is not possible to define parenthood in terms of grandparenthood. On the other hand, if we were given an instance of the grandparenthood relation, we could say Consider the kinship relations ## 2.8 FUNCTIONAL DATALOG PROGRAMSThe main difference between Functional Datalog and Basic Datalog is the availability of functional terms to refer to entities in the domain of a database. A In Functional Datalog, the An Note that, unlike Basic Datalog programs, Functional Datalog programs are not guaranteed to terminate since they can generate terms with arbitrarily nested functional terms. As an example, consider the following logic program.
The result in this case contains infinitely many terms. If the extension of the Fortunately, in many cases, termination can be assured, e.g. when the programs are not recursive or when all recursions are bounded in one way or another. ## 2.9 DISJUNCTIVE DATALOG PROGRAMSA p is a literal._{i}
Every variable in the head of a rule must also occur in the body of the rule. A A ## 2.10 ENHANCED DATALOG PROGRAMSOn occasion, it is useful to write programs utilizing well-known relationships, such as arithmetic functions (such as If the bodies of rules are allowed to contain these predefined relations, there are syntactic restrictions. For example, if rules contain literals with comparison operators, then every variable that occurs in such literals must appear in at least once in a positive literal in the body. Even with this restriction, the presence of predefined relations can significantly affect the computational properties of the resulting programs. In the following chapters, we talk about these effects as we use various enhancements. |