# Lösungsvorschlag Operating Systems And Computer Networks FS16

## 1 Processes

### a)

| Event | New State |
|---|---|
| preemption | Runnable |
| I/O operation | Blocked |
| process terminates | Terminated |

### b)

• User-level threads are cheaper to create and destroy.
• Context switches are faster for user-level threads.
• They allow each process to have its own customized scheduling algorithm.

### c)

A freshly forked process will execute the same code as the process that forked it, with the difference that the call to fork() returned 0.

### d)

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == -1) {
        // Error handling
        exit(-1);
    } else if (pid == 0) {
        printf("hello\n");
        execlp("/bin/ls", "ls", NULL);
        exit(1); // only reached if exec fails
    } else {
        wait(NULL);
        printf("goodbye\n");
        exit(0);
    }
}
```


## 2 Scheduling

### a)

Let f be a function that takes two int arguments, locks the first one, then the second one, then unlocks both. Executing f(a, b) and f(b, a) in two concurrent processes p1 and p2 can cause a deadlock: for example, p1 locks a and is then preempted by p2, which locks b; now each waits for the lock the other holds. However, mutual exclusion is always satisfied.

### b)

The OS needs 5 ms to switch to a new process. For a scheduling overhead of less than 10%, the time slice ${\displaystyle t}$ must satisfy ${\displaystyle {\frac {5~ms}{5~ms+t}}\leq 0.1}$, thus the minimum time slice is ${\displaystyle t=45~ms}$.

### c)

```
Time 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
F1   X X X X X X X
F2   X
T3   X X X X X X X
T4   X X X X
```

### d)

Shortest-Job First (SJF) minimizes waiting time.

```
Time  1  2  3  4  5  6  7  8  9 10 11 12
T1    X
T2             X  X  X  X
T3                         X  X  X  X  X
T4       X  X
```

### e)

Let ${\displaystyle n}$ be the number of jobs and ${\displaystyle t}$ the time slice. We assume the job can produce output as soon as it is run. In the worst case, the job's time slice is used up when input arrives, and it has to wait for ${\displaystyle n-1}$ other jobs to complete their time slice. An upper bound is thus ${\displaystyle t\cdot (n-1)}$.

### f)

Ageing: tasks that have been waiting a long time have their priority gradually increased. Eventually, any starving task ends up with the highest priority. The priority is reset once the task's quantum has been used up.

### g)

• SJF: ${\displaystyle {\mathcal {O}}(\log n)}$, the OS can use binary search on the queue to find the right place to insert the new task.
• RR: ${\displaystyle {\mathcal {O}}(1)}$, the new task can be run after the last process of the current round.

## 3 Memory Management

### a)

```
Accesses 1 2 3 4 5 4 1 4 3 2 1 2 1 1 3 4 5 4
P1       1 1 1 1 1 1 5
P2       2 2 2 5 2 2
P3       3 3 3 3 3
P4       4 4 4 4
```

This series of accesses causes 7 page faults.

### b)

https://spcl.inf.ethz.ch/Teaching/2017-osnet/lectures/osnet_6_1s.pdf (last slide: "page-fault frequency scheme")

Or estimate the working set using reference bits and a timer (https://spcl.inf.ethz.ch/Teaching/2017-osnet/lectures/osnet_6_1s.pdf, page 91 ff.)

### c)

Since the page size is ${\displaystyle 4~KiB=2^{12}~B}$, the VPN is 20 bits wide.

• minimum: The upper 10 bits (VPN1) are the same for all addresses, and the lower 10 bits (VPN2) take 123 distinct values (0 to 122). We need ${\displaystyle 1\cdot 32~bits=4~B}$ for the page directory and ${\displaystyle 123\cdot 32~bits=492~B}$ for the page table, which makes ${\displaystyle 496~B}$ in total.
• maximum: There are 123 different values of VPN1, and we need ${\displaystyle 123\cdot 32~bits=492~B}$ for both the page directory and the page tables, which amounts to ${\displaystyle 984~B}$ in total.

Alternative solution: The above solution assumes that entries can be stored individually, without keeping a whole page table in memory. We now assume that as soon as a table has one entry, the whole table must be in memory.

Size of one table: we have ${\displaystyle 2^{10}}$ entries of 32 bits each, so a table occupies 4 KiB.

minimum: In the best case, as stated above, all 123 addresses map to the same entry of the page directory. We then need the page directory and only one second-level page table, a total of 8 KiB.

maximum: In the worst case, all 123 addresses map to different entries of the page directory, so we need one second-level page table for each. 123 page tables plus 1 page directory makes 124 tables of 4 KiB each, a total of 496 KiB of memory used.

### d)

Page Table Base Register: 0xAAAA0000

Page directory:

| Index | Address | Entry |
|---|---|---|
| 0 | 0xAAAA0000 | 0xCCCC0000 |
| 1 | 0xAAAA0004 | 0xDDDD0000 |
| 2 | 0xAAAA0008 | 0xBBBB0000 |
| 3 | 0xAAAA000C | 0xEEEE0000 |
| ... | ... | ... |
| 1023 | 0xAAAA0FFC | 0xABAB0000 |

Page tables:

| Index | Address | Entry |
|---|---|---|
| 0 | 0xBBBB0000 | 0xCAFFE000 |
| 1 | 0xBBBB0004 | 0xCAFFE100 |
| 2 | 0xBBBB0008 | 0xCAFFE200 |
| 3 | 0xBBBB000C | 0x87654321 |
| ... | ... | ... |
| 0 | 0xEEEE0000 | 0xBEEF2000 |
| ... | ... | ... |
| 0 | 0xABAB0000 | 0x12345000 |
| ... | ... | ... |

Method:

0x00803000 (VPN1 = 0x2, VPN2 = 0x3, VPO = 0) maps to 0x87654321 (PFN = 0x87654, PFO = 0x321)

Entry 2 of the page directory is 0xBBBB0000, so the entry at 0xBBBB0000 + 0x3*0x4 = 0xBBBB000C must be 0x87654321.

0xFFC00678 (VPN1 = 0x3FF, VPN2 = 0, VPO = 0x678) maps to 0x12345678 (PFN = 0x12345, PFO = 0x678)

Entry 1023 of the page directory is 0xABAB0000, so the entry at 0xABAB0000 + 0x0*0x4 = 0xABAB0000 must be 0x12345000.

0x00C00345 (VPN1 = 0x3, VPN2 = 0, VPO = 0x345) maps to 0xBEEF2345 (PFN = 0xBEEF2, PFO = 0x345)

Entry 3 of the page directory is 0xEEEE0000, so the entry at 0xEEEE0000 + 0x0*0x4 = 0xEEEE0000 must be 0xBEEF2000.

### e)

Since a page contains ${\displaystyle 4~KiB=2^{12}~B}$, the lower 12 bits of the virtual address are used as the VPO, and the upper 20 bits as the VPN. In total there are ${\displaystyle 2^{20}}$ possible VPNs, distributed over ${\displaystyle 256}$ hash buckets, which gives ${\displaystyle {\frac {2^{20}}{256}}=2^{12}=4096}$ entries per hash bucket in the worst case (because the table is balanced). Thus we need at most 4098 memory accesses to read from a virtual address: 4096 accesses to search the bucket, one extra access because a bucket spans more than one page (${\displaystyle 2^{12}\cdot 32~bit=2^{14}~B>2^{12}~B}$), and one access to the physical frame.

## 4 Filesystems/IO

### a)

When blocking, a process changes to the "blocked" state, whereas with busy waiting, it loops, constantly querying for changes.

While a process is blocked, the OS can schedule another process and make progress. Because of the high context-switch overhead, blocking is preferred when no quick response to the I/O is needed and/or the I/O takes long, whereas busy waiting is more suitable for short I/O operations and short response times.

### b)

```c
struct inode {
    // ...
    void*   direct_block[4];   // 4 direct data-block pointers
    void**  indirect_block[4]; // 4 single-indirect blocks
    void*** double_indirect[4];// 4 double-indirect blocks
};

void* get_data_block(struct inode* in, int byte_offset) {
    int i = byte_offset / 4096; // index of the data block
    if (i < 4) {
        return in->direct_block[i];
    }
    i -= 4;
    if (i < 4 * 1024) { // there are 4096/4 = 1024 pointers per indirect block
        void** ind = in->indirect_block[i / 1024];
        return ind[i % 1024];
    }
    i -= 4 * 1024;
    // a double-indirect block references 1024 * 1024 data blocks
    void*** d_ind = in->double_indirect[i / (1024 * 1024)];
    void** ind = d_ind[(i % (1024 * 1024)) / 1024];
    return ind[i % 1024];
}
```


See Discussion

### c)

Row-wise:

| File | Write | Execute |
|---|---|---|
| F0 | A, B | B |
| F1 | C, D | B |
| F2 | C | C |
| F3 | A, B, C | A |

Advantage: scales to a large number of files; easy to change rights quickly.

Disadvantage: does not scale well to a large number of principals.

Column-wise:

| Principal | Write | Execute |
|---|---|---|
| A | F0, F3 | F3 |
| B | F0, F3 | F0, F1 |
| C | F1, F2, F3 | F2 |
| D | F1 | - |

Advantage: the access-control resource is charged to the principal; scales to a large number of principals.

Disadvantage: hard to change rights (revocation).

### d)

Receive-side scaling: each flow is mapped to a core (ideally balanced across the cores). This increases parallelism, and always mapping a flow to the same core also helps with locality.

### e)

• Provides reliable storage: (d) RAID
• Allows to match different speeds of devices: (c) Buffering
• Improves access latency: (b) Caching
• Holds output for a device: (a) Spooling
• Allows to do I/O concurrently with computation: (e) DMA
• Allows to remap DMA device address spaces: (f) IOMMU

## 5 Knowledge Bank

1. b
2. b,d
3. b
4. a,c
5. a,c,d (since when decrementing the TTL the checksum changes)
6. b,c
7. a,c [c is questionable: see Discussion ]
8. b,c
9. b,c
10. b,d

## 6 The Basics

### a)

1. It takes ${\displaystyle {\frac {7.5\cdot 10^{6}~bits}{1.5\cdot 10^{6}~bits/s}}=5~s}$ to transfer the message to the first switch, and thus ${\displaystyle 15~s}$ to transfer the message to the destination.
2. ${\displaystyle {\frac {1500~bits}{1.5\cdot 10^{6}~bits/s}}=1~ms}$
3. ${\displaystyle 3~ms+4999\cdot 1~ms=5.002~s}$, roughly one third of the original time, because we get a 'pipelining effect' (more parallelism).
4. When using message segmentation, we need to handle out-of-sequence packets (e.g. within the IP header). Also, there is more computation overhead for segmenting and reassembling at the switches.

### b)

i. ${\displaystyle RTT=2\cdot \left({\frac {390\cdot 10^{6}~m}{3\cdot 10^{8}~m/s}}\right)=2\cdot 1.3~s=2.6~s}$

ii. ${\displaystyle BD=100\cdot 10^{6}~bits/s~\cdot 1.3~s=130~Mbits}$

iii. Pitfall: 25 Megabyte (MB) are ${\displaystyle 25\cdot 8=200}$ Megabit (Mb).

${\displaystyle 1.3~s+{\frac {25\cdot 8\cdot 10^{6}~bits}{100\cdot 10^{6}~bits/s}}+1.3~s=4.6~s}$ (one propagation delay for the request)

## 7 Layer 2

### a)

1. ${\displaystyle l_{24},l_{35},l_{56}}$

2.

| | S1 | S2 | S3 | S4 | S5 | S6 |
|---|---|---|---|---|---|---|
| A ${\displaystyle \rightarrow }$ E | A at ${\displaystyle L_{a}}$ | A at ${\displaystyle l_{12}}$ | A at ${\displaystyle l_{23}}$ | A at ${\displaystyle l_{14}}$ | A at ${\displaystyle l_{45}}$ | A at ${\displaystyle l_{36}}$ |
| E ${\displaystyle \rightarrow }$ A | E at ${\displaystyle l_{14}}$ | | | E at ${\displaystyle l_{45}}$ | E at ${\displaystyle L_{e}}$ | |
| C ${\displaystyle \rightarrow }$ A | C at ${\displaystyle l_{12}}$ | C at ${\displaystyle l_{23}}$ | C at ${\displaystyle L_{c}}$ | | | |

### b)

1.

Hidden terminals: C, F

Exposed terminals: E

2.

i. The hidden terminal sees the CTS of the recipient, but not the RTS of the sender. Thus, it knows it cannot send.

ii. The terminal sees the RTS of the sender, but not the CTS of the receiver. Thus, it knows it can send.

### c)

1. Error Detection can tell us whether an incoming packet has been corrupted, while Error Correction additionally allows correcting some errors in the packet.
2. Forwarding tables could not be aggregated, causing scalability problems. However, using only one fixed address could be useful for mobility, as changing IP addresses can break established connections.

## 8 Layer 3

### a)

1.

| Prefix | Output port | Next-hop IP |
|---|---|---|
| 1.2.2.0/27 | 1 | 1.2.5.2 |
| 1.2.3.0/26 | 1 | 1.2.5.4 |
| 1.2.5.0/29 | 1 | - |
| 1.2.1.28/30 | 3 | - |
| 1.2.1.8/30 | 2 | - |

2.

| Prefix | Output port | Next-hop IP |
|---|---|---|
| 1.2.2.0/27 | 2 | 1.2.5.2 |
| 1.2.1.0/27 | 2 | 1.2.5.1 |
| 1.2.5.0/29 | 2 | - |
| 1.2.3.44/30 | 1 | - |
| 1.2.3.8/30 | 3 | - |

Alternative Solution (See Discussion):

| Prefix | Output port | Next-hop IP |
|---|---|---|
| 1.2.2.0/27 | 2 | 1.2.5.2 |
| 1.2.1.0/26 | 2 | 1.2.5.1 |
| 1.2.5.0/24 | 2 | - |
| 1.2.3.44/30 | 1 | - |
| 1.2.3.8/30 | 3 | - |

3. The entry 1.2.3.44/30 would be replaced by 1.2.3.32/27.

### b)

1. C uses the route {CD}, since it is the only one it knows.
2. A announces the path {AD} to C, which then uses the route {CAD}.
3. In the previous round, B announced the path {BD} to A, which will now use the route {ABD} and announce it to C. C will then use the route {CD}, because of its policy.
4. In the previous round, B announced the path {BCD} to A, A used the route {AD} and announced it to C. C now uses the route {CAD}.
5. C will alternately use the paths {CD} and {CAD}, and a similar oscillation phenomenon will happen to A and B. This can cause problems, since the route is not stable.

## 9 Layer 4

### a)

Signals:

| | Packet Loss | Packet Delay | Router Indication |
|---|---|---|---|
| Advantage | easier to detect | early reaction to congestion | early reaction to congestion |
| Disadvantage | signals congestion too late | needs appropriate tuning to infer congestion | requires router support |

### b)

1. [1, 6], [7, 11], and [23, 26]. During slow start the congestion window doubles every RTT. Slow start happens at the beginning of the connection in order to determine the threshold. It also happens after a lost packet when the loss is detected by a timeout.

2. [11, 15], [16, 22], and [26, 32] ([Discussion]). The goal of congestion control is to drive the sending rate close to the capacity of the path, i.e., close to congestion, but ideally without triggering it. Thus, when the rate is close to the estimate of the capacity, it increases smoothly in order to stay close to the limit as long as possible.

3. (a) At 16 a packet loss is detected by 3 duplicate ACKs, and the packet is sent again. We know this because the congestion window is halved rather than reset to 1. Because fast recovery and fast retransmission work hand in hand, we know that a fast retransmission occurred.

(b) Fast retransmission sends one packet if it receives 3 duplicate ACKs for the packet. This is an optimization to send the lost packet before a timeout occurs. In fast recovery, the source halves the congestion window instead of restarting at 1 and continues with additive increase.