# Advanced Systems Lab Exercises From the Book

## Recommended Exercises from the Book

R. Jain: The Art of Computer Systems Performance Analysis, Wiley.

### 1.1

The measured performance of two database systems on two different work-loads is shown in Table 1.6. Compare the performance of the two systems and show that
a. System A is better.
b. System B is better.
TABLE 1.6 Throughput in Queries per Second

A 10 30
B 30 10

Solution:
a.

A 33% 300% 166.5%
B 100% 100% 100%

b.

A 100% 100% 100%
B 33% 300% 166.5%

--Kermit 14:29, 22. Jan. 2011 (CET)

### 12.1

A distributed system has three file servers, which are chosen independently and with equal probabilities whenever a new file is created. The servers are named A, B, and C. Determine the probabilities of the following events:
a. Server A is selected
b. Server A or B is selected
c. Servers A and B are selected
d. Server A is not selected
e. Server A is selected twice in a row
f. Server selection sequence ABCABCABC is observed (in nine successive file creations)

Solution:
a. P(A) = 0.33
b. P(A∨B) = 0.66
c. P(A∧B) = 0 (because they are never selected at the same time)
d. P(¬A) = P(B∨C) = 0.66
e. P(AA) = 0.33 * 0.33 = 0.1089
f. P(ABCABCABC) = 0.33⁹ = 4.6411e-05
--Kermit 17:45, 22. Jan. 2011 (CET)

### 12.7

The execution times of queries on a database is normally distributed with a mean of 5 seconds and a standard deviation of 1 second. Determine the following:
a. What is the probability of the execution time being more than 8 seconds?
b. What is the probability of the execution time being less than 6 seconds?
c. What percentage of responses will take between 4 and 7 seconds?
d. What is the 95-percentile execution time?

Solution:
Wikipedia: standard deviation
a. ${\displaystyle CDF(-3\sigma )=1-CDF(3\sigma )=1-{\frac {1}{2}}\cdot (1+erf({\frac {3}{\sqrt {2}}}))=0.00135=0.135\%}$
b. ${\displaystyle CDF(1\sigma )={\frac {1}{2}}\cdot (1+erf({\frac {1}{\sqrt {2}}}))=0.84=84\%}$
c. ${\displaystyle CDF(2\sigma )-CDF(-1\sigma )={\frac {1}{2}}\cdot (1+erf({\frac {2}{\sqrt {2}}}))-{\frac {1}{2}}\cdot (1+erf({\frac {-1}{\sqrt {2}}}))=0.819=81.9\%}$

d. ${\displaystyle erf({\frac {x}{\sqrt {2}}})=0.95\rightarrow x=1.96}$
95% of the execution times are between 3.04s and 6.96s. --Kermit 11:18, 23. Jan. 2011 (CET)

### 13.2

Answer the following for the data of Exercise 12.11:
a. What is the 10-percentile and 90-percentile from the sample?
b. What is the mean number of disk I/O’s per program?
c. What is the 90% confidence interval for the mean?
d. What fraction of programs make less than or equal to 25 I/O’s and what is the 90% confidence interval for the fraction?
e. What is the one-sided 90% confidence interval for the mean?

Data: {23, 33, 14, 15, 42, 28, 33, 45, 23, 34, 39, 21, 36, 23, 34, 36, 25, 9, 11, 19, 35, 24, 31, 29, 16, 23, 34, 24, 38, 15, 13, 35, 28}

Sorted Data: {9, 11, 13, 14, 15, 15, 16, 19, 21, 23, 23, 23, 23, 24, 24, 25, 28, 28, 29, 31, 33, 33, 34, 34, 34, 35, 35, 36, 36, 38, 39, 42, 45}

Solution:
a. round( (n-1)*(p/100) + 1 )th element from sorted sequence.
10-percentile > 4th element > 14
90-percentile > 30th element > 38
b. have fun doing this manually... wolfram alpha says 26.91
c. the formula to calculate CI: ${\displaystyle x\pm t\cdot {\frac {s}{\sqrt {n}}}}$
looking up t in students t-distribution, value is for N=30 (for 2-sided CI it is one to the right in the book (wikipedia is more clear on that))
${\displaystyle CI=26.91\pm 1.697\cdot {\frac {9.495}{\sqrt {33}}}=(24.11;29.71)}$
the values are slightly off from the master solution

d. p:=16/33=0.485
${\displaystyle \alpha =0.1,s=9.49,n=33}$
CI=${\displaystyle p\pm z_{1-{\frac {\alpha }{2}}}\cdot {\frac {s}{\sqrt {n}}}=(0.34,0.63)}$

e. see c. for e formula:
CI=26.91-1.31*(9.495/sqrt(33))=(24.7;26.91)
or
CI=26.91+1.31*(9.495/sqrt(33))=(26.91;29.08)
again slightly off.
--Kermit 12:15, 23. Jan. 2011 (CET)

### 14.2

For the disk I/O and CPU data in Example 14.1, find a linear formula to predict the number of disk I/O’s given the CPU time. Answer the following questions about this regression:
a. Which parameters are significant?
b. What percentage of variation is explained by the regression?
c. What is the expected number of disk I/O’s for a program with a CPU time of 40 milliseconds?
d. What bounds would you put on your answer in c if you wanted to take less than 10% chance of error on a single program to be measured tomorrow?
e. Repeat d for the case that you want to take a less than 10% chance of error on the mean of a large number of programs to be measured tomorrow?

Data: {(14, 2), (16, 5), (27, 7), (42, 9), (39, 10), (50, 13), (83, 20)}

Solution:
${\displaystyle b1={\frac {\Sigma {xy}-n{\bar {x}}{\bar {y}}}{\Sigma {x^{2}}-n({\bar {x}})^{2}}}={\frac {3375-7\cdot 38.71\cdot 9.43}{13855-7\cdot (38.71)^{2}}}=0.24}$
${\displaystyle b0={\bar {y}}-b_{1}{\bar {x}}=9.43-0.24\cdot 38.71=-0.008}$
${\displaystyle {\hat {y}}=b_{0}+b_{1}x}$
a. only b1 is significant. (as we have an obvious (0,0) origin, only the slope is relevant for a linear model of the data - this is confirmed by the very small b0.)
b. SSR indicates the variation that was explained by the regression.
${\displaystyle SST=\sum _{i=1}^{n}(y_{i}-{\bar {y}})^{2}}$
${\displaystyle SSE=\sum _{i=1}^{n}e_{i}^{2}=\sum _{i=1}^{n}(y_{i}-b_{0}-b_{1}x_{i})^{2}}$
${\displaystyle {\frac {SSR}{SST}}={\frac {SST-SSE}{SST}}={\frac {205.71-5.87}{205.71}}=0.97}$, i.e., 97% of the variation is explained by the regression.
c. ${\displaystyle {\frac {40-b_{0}}{b_{1}}}=161}$
d. ${\displaystyle CI={\bar {x}}\pm t\cdot {\frac {s}{\sqrt {n}}}=161\pm 1.943\cdot {\frac {\sqrt {SST}}{\sqrt {7}}}=(150.5,171.5)}$

 Dem Autor des Lösungsvorschlags stellt sich eine Unklarheit betreffend der Prüfungsfrage oder deren Lösung: How to compute s? Hilf mit, die Qualität dieser Seite zu verbessern, indem du eine eindeutige Lösung einfügst oder auf der Diskussionsseite dieses Themas deine Meinung äusserst.

e. *beep*
--Kermit 15:24, 23. Jan. 2011 (CET)

### 14.3

The memory size of the seven programs mentioned in the disk I/O and CPU time data were also measured. The memory size (in kilobytes) and CPU time (in milliseconds) pairs observed are {((70, 2), (75, 5), (144, 7), (190, 9), (210, 10), (235, 13), (400, 20)}. Analyze the data using a simple regression model to predict CPU time as a function of the memory size.

Solution:
${\displaystyle b1={\frac {\Sigma {xy}-n{\bar {x}}{\bar {y}}}{\Sigma {x^{2}}-n({\bar {x}})^{2}}}={\frac {16388-7\cdot 189.1\cdot 9.429}{326686-7*(189.1)^{2}}}=0.0512}$
${\displaystyle b0={\bar {y}}-b_{1}{\bar {x}}=9.429-0.0512\cdot 189.1=-0.2529}$
${\displaystyle {\hat {y}}=b_{0}+b_{1}x=-0.2529+0.0512\cdot x}$
--Kermit 15:24, 23. Jan. 2011 (CET)

### 16.1

The performance of a system being designed depends upon the following three factors:
1. CPU type: 68000, 8086, 80286
2. Operating system type: CPM, MS-DOS, UNIX
3. Disk drive type: A, B, C
How many experiments are required to analyze the performance if
a. There is significant interaction among factors.
b. There is no interaction among factors.
c. The interactions are small compared to main effects.

Solution:
a. 3³ = 27
b. simple design: ${\displaystyle 1+\sum _{i=1}^{k}(n_{i}-1)=7}$
c. 9 = 3²(Fractional factorial design due to little interaction)
--Kermit 15:35, 23. Jan. 2011 (CET)
--Flwidmer 14:44, 25. Jan. 2011 (CET)

### 17.1

Analyze the 2³ design shown in Table 17.10
a.Quantify main effects and all interactions.
b. Quantify percentages of variation explained.
c. Sort the variables in the order of decreasing importance.

Solution:
a.b.

I A B C AB AC BC ABC y
1 1 1 1 1 1 1 1 100
1 1 1 -1 1 -1 -1 -1 15
1 1 -1 1 -1 1 -1 -1 40
1 1 -1 -1 -1 -1 1 1 30
1 -1 1 1 -1 -1 1 -1 120
1 -1 1 -1 -1 1 -1 1 10
1 -1 -1 1 1 -1 -1 1 20
1 -1 -1 -1 1 1 1 -1 50
385 -15 105 175 -15 15 215 -65 Total
48.125 -1.875 13.125 21.875 -1.875 1.875 26.875 8.125 Total/8
11596.9 28.125 1378.125 3828.125 28.125 28.125 5778.125 528.125 Variation
100 0.24 11.88 33.0 0.24 0.24 49.82 4.55 Variation %

c. BC, C, B, ABC, A, AB, AC
--Kermit 11:54, 24. Jan. 2011 (CET)

### 18.1

Table 18.11 lists measured CPU times for two processors on two workloads. Each experiment was repeated three times. Analyze the design.

I (41.16, 39.02, 42.56) (63.17, 59.25, 64.23)
J (51.50, 52.50, 50.50) (48.08, 48.98, 47.10)

Solution:

W P WP y Mean y Error
1 1 1 1 (41.16, 39.02, 42.56) 40.9133 6.35707
1 1 -1 -1 (63.17, 59.25, 64.23) 62.2167 14.98746
1 -1 1 -1 (51.50, 52.50, 50.50) 51.5 2
1 -1 -1 1 (48.08, 48.98, 47.10) 48.0533 1.76827
202.8633 3.7566 -18.0366 -24.93 Total
50.71 0.94 -4.51 -6.23 Total/4
745.82 10.58 243.99 466.13 Variation 25.1128
100 1.42 32.71 62.499  % Variation 3.367

SSE = (40.9133-41.16)² + (40.9133-39.02)² + (40.9133-42.56)² + (62.2167-63.17)² + (62.2167-59.25)² + (62.2167-64.23)² + (51.5-51.5)² + (51.5-52.5)² + (51.5-50.5)² + (48.0533-48.08)² + (48.0533-48.98)² + (48.0533-47.1)²
= 6.35707 + 14.987 + 2 + 1.76827 = 25.1128

### 30.3

Which queueing system would provide better performance: an M/M/3/300/100 system or an M/M/3/100/100 system?

Solution:
Both will provide the same performance. Increasing buffers beyond the population size has no effect.
--Kermit 14:53, 24. Jan. 2011 (CET)

### 30.4

During a 1-hour observation interval, the name server of a distributed system received 10,800 requests. The mean response time of these requests was observed to be one-third of a second.
a. What is the mean number of queries in the server?
c. Would the mean number of queries be different if the service time was not exponentially distributed?

Solution:
a. E[n] = λ * E[w] = 10800/3600 * 1/3 = 1
b. Job flow balance. (That all jobs that are sent into the system get a response.)
c. Doesn't have an influence.
--Kermit 15:47, 24. Jan. 2011 (CET)

### 31.1

Consider a single-server system with discouraged arrivals in which the arrival rate is only λ/(n + 1) when there are n jobs in the system. The interarrival times, as well as the service time, are independent and identically distributed with an exponential distribution. Using a birth-death process model for this system, develop expressions for the following:
a. State probability pn of n jobs in the system
b. State probability ρ0 of the system being idle
c. Mean number of jobs in the system, E[n]
d. Effective arrival rate λ’
e. Mean response time E[r]

Solution:
a. ${\displaystyle \lambda _{n}={\frac {\lambda _{0}}{n+1}}}$
${\displaystyle \Rightarrow p_{n}={\frac {\lambda _{0}\cdot \lambda _{1}\cdot ...\cdot \lambda _{n-1}}{\mu _{1}\cdot \mu _{2}\cdot \mu _{n}}}\cdot p_{0}={\frac {\lambda _{0}^{n}}{\mu ^{n}\cdot n!}}\cdot p_{0}={\frac {\rho ^{n}}{n!}}\cdot p_{0}}$
With ${\displaystyle \rho ={\frac {\lambda _{0}}{\mu }}}$
b. As we know: ${\displaystyle \sum _{i=0}^{\infty }p_{i}=1}$ and with the previous formula for ${\displaystyle p_{i}}$ we see that this corresponds to a poisson distribution, and therefore ${\displaystyle p_{0}=e^{-\rho }}$
c. As already figured out, the probabilities correspond to a poisson distribution with parameter ${\displaystyle \rho }$, hence ${\displaystyle E[n]=\rho ={\frac {\lambda _{0}}{\mu }}}$
d. We know that ${\displaystyle \lambda '=\sum _{i=0}^{\infty }\lambda _{i}\cdot p_{i}}$ thus we can solve:
${\displaystyle \Rightarrow \sum _{i=0}^{\infty }\lambda _{i}\cdot p_{i}=\sum _{i=0}^{\infty }{\frac {\lambda _{0}}{i+1}}\cdot {\frac {\rho ^{i}}{i!}}\cdot e^{-\rho }=\lambda _{0}\cdot e^{-\rho }\sum _{i=0}^{\infty }{\frac {\rho ^{i}}{(i+1)!}}}$
Using the formula: ${\displaystyle \sum _{i=0}^{\infty }{\frac {\rho ^{i}}{(i+1)!}}={\frac {e^{\rho }-1}{\rho }}}$ we get:
${\displaystyle \Rightarrow \lambda '=\lambda _{0}\cdot {\frac {1-e^{-\rho }}{\rho }}}$
e. Little's Law: ${\displaystyle E[n]=\lambda '\cdot E[R]\Rightarrow E[R]={\frac {E[n]}{\lambda '}}}$ we directly get:
${\displaystyle {\frac {\rho ^{2}}{\lambda _{0}\cdot (1-e^{-\rho })}}}$
-- Davids 15:52, 25. Jan. 2012 (CET)

### 31.2

Consider an M/M/∞ system with an infinite number of servers. In such a system, all arriving jobs begin receiving service immediately. The service rate is nµ when there are n jobs in the system. Using a birth-death process model for this system, draw a state transition diagram for this system and develop expressions for the following:
a. State probability pn of n jobs in the system
b. State probability p0 of the system being idle
c. Mean number of jobs in the system, E[n]
d. Variance of number of jobs in the system, Var[n]
e. Mean response time E[r]

Solution:
a. same as 31.1.a
b. same as 31.1.b, ${\displaystyle e^{-\rho }}$
c. same as 31.1.c, ${\displaystyle E[n]=\rho }$
d. Knowing it is a poisson distribution (compare with 31.1), we know that ${\displaystyle Var[n]=\rho }$
e. E[r] = E[n]/λ = λ/μ/λ = 1/μ
Note the difference to 31.1: Here the effective arrival rate is the same as the normal arrival rate!
-- Davids 16:06, 25. Jan. 2012 (CET)

### 31.3

The average response time on a database system is 3 seconds. During a 1-minute observation interval, the idle time on the system was measured to be 10 seconds. Using an M/M/1 model for the system, determine the following:
a. System utilization
b. Average service time per query
c. Number of queries completed during the observation interval
d. Average number of jobs in the system
e. Probability of number of jobs in the system being greater than 10
f. 90-percentile response time
g. 90-percentile waiting time

Solution:
a. ρ=5/6
b. using E[r] = (1/μ) / (1-ρ)
E[s] = 1 /μ = E[r]*(1-ρ) = 0.5s
c. x = 50/0.5 = 100
d. E[n] = E[nq]+ρ = E[w]*λ+ρ=(E[r]-E[s])*(ρ*μ)+ρ=(3-0.5)*(5/6*2)+5/6 = 5
e. P[n>=10] = p10 = rho^11 = (5/6)^11 = 0.1346
f. q-percentile of E[r] = E[r] * ln(100/(100-q)) => 90-percentile = E[r]*ln(10) = 6.9s
g. E[wq] = E[r]-E[s] = 2.5s. q-percentile of E[wq] = (E[wq]/rho) * ln(100rho/(100-q)) = 6.36
--Mbruggmann 18:07, 24. Jan. 2011 (CET)

### 31.4

A storage system consists of three disk drives sharing a common queue. The average time to service an I/O request is 50 milliseconds. The I/O requests arrive to the storage system at the rate of 30 requests per second. Using an M/M/3 model for this system, determine the following:
a. Average disk drive utilization
b. Probability of the system being idle, p0
c. Probability of queueing,
d. Average number of jobs in the system, E[n]
e. Average number of jobs waiting in the queue, E[nq]
f. Mean response time E[r]
g. Variance of the response time
h. 90-percentile of the waiting time
Solution:
a. μ (per disk) = 1/0.05, λ (per disk) = 30/3 = 10 => ρ = λ/μ = 0.5
b. pain-in-the-ass formel (aus textbook box 31.2) p0=0.21
c. ( ((m*ρ)^m) / (m!*(1-ρ)) ) * p0 = 0.236
d./e./f./g./h. > richtige formel aus box 31.2 auswerten
--Mbruggmann 11:24, 25. Jan. 2011 (CET)

### 33.1

During a 10-second observation period, 400 packets were serviced by a gateway whose CPU can service 200 pps. What was the utilization of the gateway CPU?

Solution:
Throughput = X = 400/10 = 40
Servicetime = S = 1/200
U = X*S = 40/200 = 20%
--Kermit 15:22, 25. Jan. 2011 (CET)

### 33.2

The throughput of a timesharing system was observed to be five jobs per second over a 10-minute observation period. If the average number of jobs in the system was four during this period, what was the average response time?
Solution:

R = Number of Jobs / Throughput = Q/X = 4/5
--Kermit 15:42, 25. Jan. 2011 (CET)

### 33.3

During a 10-second observation period, 40 requests were serviced by a file server. Each request requires two disk accesses. The average service time at the disk was 30 milliseconds. What was the average disk utilization during this period?

Solution:
X = 40/10 = 4
Vdisk = 2
Xdisk = X*Vdisk = 8
Sdisk = 0.03
Udisk = Xdisk * Sdisk = 8 * 0.03 = 0.24 = 24%
--Kermit 15:42, 25. Jan. 2011 (CET)

### 33.5

For a timesharing system with two disks (user and system), the probabilities for jobs completing the service at the CPU were found to be 0.80 to disk A, 0.16 to disk B, and 0.04 to the terminals. The user think time was measured to be 5 seconds, the disk service times were 30 and 25 milliseconds, and the average service time per visit to the CPU was 40 milliseconds. Using the queueing network model shown in Figure 32.8, answer the following for this system:
a. For each job, what are the visit ratios for CPU, disk A, and disk B?
b. For each device, what is the total service demand?
c. If disk A utilization is 60%, what is the utilization of the CPU and disk B?
d. If the utilization of disk B is 10%, what is the average response time when there are 20 users on the system?

Solution:
a. V_CPU=1/0.04=25, V_A=0.8/0.04=20, V_B=0.16/0.04=4 (Divide visit probability by exit probability)
b. D_CPU=25*40ms=1, D_A=20*30ms=0.6, D_B=4*25ms=0.1
c. U_CPU=1, U_B=0.1
d. N=20, X=U_B/D_B = 1 Job/s, R = N/X - Z = 20/1 - 5 = 15s

## Errata

Several solutions in the book contain errors. Here's a complete list of errors as PDF