Databases - Formalisation and SQL Queries
5.3 Grouping in SQL
Before starting with that chapter we have to expand our tables:
Grouping in one Table
In a first example we follow the question about the number of animals per genus.
When we try to solve this problem with our current knowledge we have to determine first, which genuses are available in zoo. Afterwards we have to check for all genuses how many animals exist. As you see this is a lot of handwork and not satisfactory.
We want to get a result with one query and therefore we use GROUP BY.
The idea is to specify that calculations are performed per attribute to be specified.
SELECT Animal.Genus, COUNT(*) Animalnumber
FROM Animal
GROUP BY Animal.Genus
with the result of
Genus | Animal number |
Bear | 2 |
Rabbit | 4 |
Sheep | 3 |
To evaluate the request, you can imagine the following steps: First, the table is sorted according to the grouping attribute. This clearly corresponds to the result of the enquiry.
SELECT *
FROM Animal
ORDER BY Animal.Genus
with the result of
GNr | TName | Genus | Group |
1 | Laber | Bear | Bear-Group |
1 | Sabber | Bear | |
2 | Klopfer | Rabbit | Rabbit-Group |
3 | Bunny | Rabbit | |
2 | Runny | Rabbit | |
2 | Hunny | Rabbit | |
2 | Harald | Sheep | Sheep-Group |
3 | Walter | Sheep | |
3 | Dörthe | Sheep |
In the following step, the grouping takes place, as indicated in the result. The known aggregate functions can then be used for each of these groups.
Another example is the request for the number of hares per enclosure number. First, you create a query in which all the hares are entered. In this case, they are also sorted according to the appropriate grouping attribute.
SELECT *
FROM Animal
WHERE Animal.Genus='Rabbit'
ORDER BY Animal.GNr
Check the result on your own.
Now this request is embedded into a GROUP BY request without the SELECT and ORDER BY lines. The grouping attribute and the required aggregate function are entered in the resulting SELECT line.
SELECT Animal.GNr, COUNT(*) Rabbitnumber
FROM Animal
WHERE Animal.Genus='Rabbit'
GROUP BY Animal.GNr
The result of the query is:
GNr | Rabbitnumber |
2 | 3 |
3 | 1 |
Note:
Groups for which no elements exist do not appear.
Only attributes that are also in the GROUP BY line may appear in the SELECT line.
A final example is a query for the number of each genus per compound number. Therefore, we need two attributes in the GROUP BY line.
The resulting query is
SELECT Animal.GNr, Animal.Genus, COUNT(*) Animalnumber
FROM Animal
GROUP BY Animal.GNr, Animal.Genus
Finally, it is also possible to group the NULL values. Try it!
Usage of HAVING
HAVING adds another form of the WHERE condition to the GROUP BY line.
However, only attributes that are also mentioned in the GROUP BY line can exist in HAVING. In addition, the HAVING line may also contain aggregate functions that refer to the respective evaluation of the group.
We want to output the compound number in which at least three animals live.
SELECT Animal.GNR
FROM Animal
GROUP BY Animal.GNr
HAVING COUNT(*)>=3
The result is 2 and 3.
During the evaluation, the grouping presented in the previous chapter is carried out first and then all groups are selected for the result preparation, which then also fulfill the HAVING condition.
Note:
The aggregate functions used in the HAVINNG condition and the SELECT line do not have to match. Furthermore, attributes of GROUP BY Zeiel do not have to be used for output in the SELECT line.
In a next example we have a request for the number of animals per genus together with the genus, whereby the number of animals must not exceed 3 and no sheeps should be considered.
SELECT Animal.Genus, COUNT(*) Animalnumber
FROM Animal
WHERE Animal.Genus<>'Sheep'
GROUP BY Animal.Genus
HAVING COUNT(*)<=3
The result is Genus Bear and Animalnumber 3.
In the last example we are looking for a query for Compounds, in which at least three different genus live.
SELECT Animal.GNr
FROM Animal
GROUP BY Animal.GNr
HAVING COUNT(DISTINCT Animal.Genus)>=3
with no result.
Grouping over several Tables
Let's start with an example:
Request for the area consumed by the animal in each compound
Step 1: Request containing all information.
SELECT Compound.Gname, Animal.Genus, Species .MinArea
FROM Compound, Animal, Species
WHERE Compound.GNr=Animal.GNr
AND Animal.Genus=Species .Genus
ORDER BY Compound.Gname
Step 2: Transfer sort criterion to GROUP BY and perform the necessary calculation in SELECT
SELECT Compound.Gname, SUM(Species.MinArea) Needed
FROM Compound, Animal, Species
WHERE Compound.GNr=Animal.GNr
AND Animal.Genus=Species .Genus
GROUP BY Compound.Gname
ORDER BY Compound.Gname
Check the result on your own.
Note:
In the SELECT line, HAVING condition and ORDER BY line, only attributes that also occur in the GROUP BY line may be named.
In a second example we have a request for the land use of a genus, whereby at least three animals of a genus must live in the zoo.
SELECT Animal.Genus, SUM(Species.MinArea) Needed
FROM Animal, Species
WHERE Animal.Genus=Species.Genus
GROUP BY Animal.Genus
HAVING COUNT(*)>=3
ORDER BY SUM(Species.MinArea)
with the result
Genus | Needed |
Rabbit | 8 |
Sheep | 15 |
SQL Query Creation Guide
Order |
Key Word |
Description |
6 |
SELECT |
Attribute, Aggregate functions |
1 |
FROM |
List of tables whose cross product is considered |
2 |
WHERE |
Conditions for selecting lines of the cross product |
3 |
GROUP BY |
List of attributes according to which grouping takes place |
4 |
HAVING |
Boolean condition with attributes from the GROUP BY line or aggregate functions for selecting groups |
5 |
ORDER BY |
Attributes or aggregation functions for GROUP BY for sorting sequence during output |