DeAL TutorialBuilt-In Aggregates
Built-In Aggregates (.deal | .fac)
Given the following facts:
father(joe, peter). father(joe, mary). father(jim, lucy). father(jim, arnold). father(jim, jim).
Then to list the number of children for each father we can use a Built-In count as follows:
query no_of_children(X, No). no_of_children(X, count<Y>) <- father(X, Y).
This will return:
no_of_children(joe, 2) no_of_children(jim, 3)
Thus, a rule with a Built-In aggregate applies the aggregate (count in the example) to every true body, grouped by the non-aggregate arguments in the head (X in the example).
DeAL supports SQL's five basic aggregates: sum, count, max, min, avg. The count distinct equivalent is also supported.
As discussed later, DeAL supports user-defined aggregates.
Consider another example:
salary(john, 50000, 'Accounting'). salary(james, 60000, 'Accounting'). salary(becky, 100000, 'IT'). salary(laura, 70000, 'Accounting'). salary(sam, 60000, 'IT').
To find the average salary of employees by department we use the following rule:
query avg_salary(Dept, Sal). avg_salary(Dept, avg<S>) <- salary(_, S, Dept).
This will return:
avg_salary('Accounting', 60000) avg_salary('IT', 80000)
Name | Semantics | SQL Equivalent |
---|---|---|
sum | Compute the sum of items in column | SUM |
avg | Compute the average of items in column | AVG |
count | Count the items in column | COUNT |
max | Find the maximum value in column | MAX |
min | Find the minimum value in column | MIN |
countd | Count the number of distinct items in column | COUNT DISTINCT |
Multiple Aggregates in the Head
Multiple aggregates are allowed in the same head.
p(K1,K2,...,Km, aggr1<A1>, aggr2<A2>, ..., aggrN<An>) <- Rule Body.
The following conventions apply:
- The arguments in the head, that is, K1, ...,Km and A1,...,An must all appear in the body of the rule.
- aggr1,...,aggrN can either be builtin aggregates or user-defined aggregates.
- Each aggr1<A1>, aggr2<A2>, ..., aggrN<An> is grouped by K1,K2,...,Km where m denote a non-negative integer (thus an empty group-by list is also allowed).
- The cartesian product of the results of aggr1, ...,aggrN will be returned for each new value X. Thus, if any of the N aggregates fail for a given X no value is returned at that point.
Duplicates
The DeAL System uses the following policy for duplicate elimination:
- Duplication in base relations is controlled by the keys of the relations
- Duplicates are eliminated from the tuples returned as query answers
- Duplicates are eliminated during the computation of recursive predicates
- For a head aggregate such as p(X, count<Y>) <- ... the first argument X is a unique key for the values produced by this rule. Similar rules apply to all other Built-In aggregates.