Lösungsvorschlag Data Modeling And Databases FS12

This is only a suggested solution! Feel free to correct any mistakes or to add alternative solutions!

(Course of FS 2012 was held by Gustavo Alonso & Timothy Roscoe)

Question 1: SQL

1.

The attribute Students is redundant. Because the number of students a researcher has graduated is equal to the number of entries this researcher has in the Advisor entry (assuming the closed-world assumption).

This can become a problem: If we, for example, add another tuple with 'Donald Kossmann' as advisor, we also need to update a second tuple (Students attribute of Name = 'Donald Kossmann'). If we don't update this second entry, we will have data inconsistency.

2.

Remove the attribute Students. We can get this value with the query:

```SELECT COUNT(*) AS Students
FROM Researchers
```

3.

```CREATE TRIGGER studentsNumber
BEFORE INSERT ON Researchers
FOR EACH ROW
WHEN(
(SELECT COUNT(*) FROM Researchers WHERE Advisor = NEW.Advisor) NOT IN (SELECT Students FROM Researchers WHERE Name = NEW.Advisor)
BEGIN
SIGNAL SQLSTATE '-1' SET Message.TEXT = 'Error!';
END
```

or

```WITH Aggr AS
FROM Researchers
SELECT r.Name, r.Students, a.cnt
FROM Researchers r, Aggr a
AND r.Students <> a.cnt;
```

or

```SELECT (SELECT COUNT(*) FROM Researchers AS r WHERE students != (SELECT COUNT(*) FROM Researchers WHERE Advisor = r.Name))=0
```

of

```select (select count(*) from Researchers rr where (select ((select r.Students from Researchers r where `Name` = rr.`Name`)) = (select count(*) from Researchers where Advisor = rr.`Name`))=0)=0
```

4.

```WITH Aggr AS
(SELECT University AS Uni, COUNT(*) AS cnt
FROM Researchers
GROUP BY University)
SELECT DISTINCT r.University, a.cnt
FROM Researchers r, Aggr a
WHERE r.University = a.Uni
AND a.cnt = (SELECT MAX(cnt) FROM Aggr);
```

or

```SELECT University
FROM Researchers
GROUP BY University
HAVING COUNT(*) IN (SELECT MAX(c)
FROM (SELECT COUNT(*) AS c
FROM Researchers
GROUP BY University) AS c)
```

5.

// really?

```SELECT DISTINCT r.Advisor
FROM Researchers r
WHERE NOT EXISTS (SELECT Name
FROM Researchers s
```

Another solution without using subqueries:

```SELECT R.Advisor
FROM Researchers R
HAVING SUM(R.Students) = 0

```

or

```SELECT r1.Name AS Name
FROM Researchers r1, Researchers r2
HAVING SUM(r2.Students) = 0
```

Question 2: ER modeling

1.

See solution of exercise 6, exercise sheet 1 (FS15): DMDB Übungen FS15 Lösungsvorschlag Serie 1

2.

```CREATE TABLE Person(
name VARCHAR,
e-mail VARCHAR,
PRIMARY KEY (name)
);

CREATE TABLE Friendship(
person1 VARCHAR,
person2 VARCHAR,
PRIMARY KEY (person1,person2),
FOREIGN KEY person1 REFERENCES Person(name),
FOREIGN KEY person2 REFERENCES Person(name),
CONSTRAINT no_double_entry CHECK person1 < person2
);

CREATE TABLE Enmity(
person VARCHAR,
enemy VARCHAR,
PRIMARY KEY (person,enemy),
FOREIGN KEY person REFERENCES Person(name),
FOREIGN KEY enemy REFERENCES Person(name),
);

CREATE TABLE Picture(
name VARCHAR,
owner VARCHAR,
coordinates INT,
camera VARCHAR,
PRIMARY KEY (name),
FOREIGN KEY owner REFERENCES Person(name)
);

CREATE TABLE Access(
person VARCHAR,
picture VARCHAR,
PRIMARY KEY (person,picture),
FOREIGN KEY person REFERENCES Person(name),
FOREIGN KEY picture REFERENCES Picture(name),
);

CREATE TABLE Appears(
person VARCHAR,
picture VARCHAR,
PRIMARY KEY (person,picture),
FOREIGN KEY person REFERENCES Person(name),
FOREIGN KEY picture REFERENCES Picture(name),
CONSTRAINT access_to_appearing CHECK (EXISTS (SELECT * FROM Appears app, Access acc WHERE app.person = acc.person AND app.picture = acc.picture))
);
```

3.

```SELECT *
FROM Person p
WHERE NOT EXISTS (SELECT enemy FROM Enmity e WHERE p.name = e.enemy);
```

4.

```SELECT pic.*
FROM Person p, Picture pic, Enmity e, Access a
WHERE e.person = p.name AND pic.owner = p.name AND a.pic = pic.name AND a.person = e.enemy;
```

It even works without using the Person relation:

```SELECT pic.*
FROM Picture pic, Enmity e, Access a
WHERE e.person = pic.owner  AND
a.pic    = pic.name   AND
a.person = e.enemy
```

Question 3: Synchronization

A network split may occur, in which a subset of sites loses communication with all other sites, but not with each other. Imagine everybody voted COMMIT and there had been no PRE-COMMITs received on one side of the split, but there had been at least one PRE-COMMIT received on the other side. Then both partitions will continue with recovery nodes that respectively commit or abort the transaction, and when the network merges the system will have an inconsistent state.

Question 4: Normal Forms

1.

Consider the following relation

```R(A,B,C,D)
```

with the following functional dependencies:

```(1) A,B->C,D
(2) C->D
```

The only possible key of R is A,B. We see that R is in 2 NF, because every non-key attribute depends on the whole key. However, because of (2), this relation isn't in 3NF (D is no attribute of a key).

2.

Consider the following relation

```R(A,B,C,D,E)
```

with the following functional dependencies:

```(1) A,B->C,D
(2) B->E
```

The only possible key of R is A,B. R is not in 2NF, because the non-key attribute E only depends on B. We could transform this table into 2NF if we split the relation R into R1(A,B,C,D) and R2(B,E). Then R2 is trivially in 2NF and also R1 is in 2NF because C and D depend both on the whole key A,B.

//An easier Relation that is neither in 2NF:

R(A,B,C) with f.d.: B->C Only key candidate: AB Non prime atrribute C depends on proper subset of key candidate thus R is not in 2NF. Into 2NF with R1(A,B),R2(B,C). Trivially lossless/rd preserving. In 2-NF since only f.d. has key on left side.

3.

Consider the following relation

```R(Student, Class, Sport)
```

(Suppose your classes have nothing to do with the sports you do)

With the following functional dependencies:

```(1)Student ->-> Class
(2)Student ->-> Sport
```

This relation is in BCNF ({Student,Class,Sport} is the key). However, R isn't in 4NF, because the FD's (1) and (2) aren't trivial (i.e. b=R-a doesn't hold for a->->b) and Student isn't a superkey. We can transform it into 4NF by splitting R into R1(Student, Class) and R2(Student, Sport). Now both FD's are trivial in R1,R2 and our relation is in 4NF.

Question 5: Query Implementation

I think there is a missing line in the query: GROUP BY A.Name

Join: Nested loop join

Group by: Sort on the Name attribute, then aggregate on the sorted ranges (price)

Aggregation: Simply sum up all the tuples (using for loop)

Projection: Read out name and sum from all the tuples.

```main {
joined_tables = join(Books, Authors, Books.Author = Author.Id);
list_of_groups = group_by(joined_tables, Name);
for each element in list_of_groups
sum(element, Price)
Let aggregated be the resulting table
result = project(aggregated, Name, Price)
return result
}
```
```join(table1, table2, predicate) {
for each t in table1
for each u in table2
if predicate(t,u)
append t concatenated with u to result table
return result
}
```
```group_by(table, attribute) {
for each t in table
append t to the list that collects the same t.attribute value
return the list of tables
}
```
```sum(table, attribute) {
for each t in table
result = result + t.attribute
return first tuple from table with result as attribute value
}
```
```project(table, attributes...) {

for each t in table
store the attributes given as argument in the result table
return result
}
```