Solution Manual For Structured Computer Organization, 6th Edition
Find textbook answers quickly with Solution Manual For Structured Computer Organization, 6th Edition, a detailed solutions manual designed to make studying easier.
Zoe Jordan
Contributor
4.4
58
5 months ago
Preview (13 of 42 Pages)
100%
Purchase to unlock
Loading document content...
Preview Mode
Sign in to access the full document!
STRUCTURED COMPUTER ORGANIZATION SIXTH EDITION PROBLEM SOLUTIONS ANDREW S. TANENBAUM Vrije Universiteit Amsterdam, The Netherlands PROBLEM SOLUTIONS FOR CHAPTER 1 3 SOLUTIONS TO CHAPTER 1 PROBLEMS 1. a. A translator converts programs in one language to another. b. An interpreter carries out a program instruction by instruction. c. A virtual machine is a conceptual machine, one that does not exist. 2. It is possible, but there are problems. One difficulty is the large amount of code produced. Since one ISA instruction does the work of many microin- structions, the resulting program will be much bigger. Another problem is that the compiler will have to deal with a more primitive output language, hence it, itself, will become more complex. Also, on many machines, the micropro- gram is in ROM. Making it user-changeable would require putting it in RAM, which is much slower than ROM. On the positive side, the resulting program might well be much faster, since the overhead of one level of interpretation would be eliminated. 3. During the detailed design of a new computer, the device and digital logic lev- els of the new machine may well be simulated on an old machine, which puts them around level 5 or 6. 4. Each level of interpretation slows down the machine by a factor of n / m . Thus, the execution times for levels 2, 3, and 4 are kn / m , kn 2 / m 2 , and kn 3 / m 3 , re- spectively. 5. Each additional level of interpretation costs something in time. If it is not needed, it should be avoided. 6. You lose a factor of n at each level, so instruction execution times at levels 2, 3, and 4 are kn , kn 2 , and kn 3 , respectively. 7. Hardware and software are functionally equivalent. Any function done by one can, in principle, be done by the other. They are not equivalent in the sense that to make the machine really run, the bottom level must be hardware, not software. They also differ in performance. 8. Not at all. If you wanted to change the program the difference engine ran, you had to throw the whole computer out and build a new one. A modern com- puter does not have to be replaced because you want to change the program. It can read many programs from many CD-ROMs. 9. A typical example is a program that computes the inner product of two arrays, A and B . The first two instructions might fetch A [0] and B [0], respectively. At the end of the iteration, these instructions could be incremented to point to A [1] and B [1], respectively. Before indexing and indirect addressing were in- vented, this was done. 4 PROBLEM SOLUTIONS FOR CHAPTER 1 10. Raw cycle time is not the only factor. The number of bytes fetched per cycle is also a major factor, this increasing with the larger models. Memory speed and wait states play a role, as does the presence of caching. A better I/O archi- tecture causes fewer cycles to be stolen, and so on. 11. The design of Fig. 1-5 does I/O one character at a time by explicit program command. That of of Fig. 1-6 can use DMA to have the controller do all the work, relieving the CPU of the burden, and thus making it potentially better. 12. Each person consumes 730 tags per nonleap year. Multiply by 300 million and you get 219 billion tags a year. At a penny a tag, they cost $2.19 billion dol- lars a year. With GDP exceeding $10 trillion, the tags add up to 0.02% of GDP, not a huge obstacle. 13. The following appliances are normally controlled these days by embedded systems: alarm-clock radios, microwave ovens, television sets, cordless tele- phones, washing machines, sewing machines, and burglar alarms. 14. According to Moore’s law, next year the same chip will have 1.6 times as many transistors. This means that the area of each transistor will be 1/1.6 or 0.625 times the size of this year’s transistors. Since the area goes like the square of the diameter, the diameter of next year’s transistors must be 0.079 microns. 15. In three years the power has doubled twice so the execution time is down by a factor of four. Thus, a 4-hour run will take 1 hour then. In 6 years we gain an- other factor of four in power, so the run time then decreases to 15 min. If we start the simulation now, it will complete in 5 years, but if we start the simula- tion in 3 years it will run four times faster and thus complete in 1.25 years. If we add the 3-year delay in starting plus the 1.25 year execution time, the run will complete in 4.25 years, which is faster than starting the run now, which would take 5 years from the starting date to complete. 16. The current numbers obviously depend on when the calculation is being made, but for example, assume the instruction execution rate is 1 billion instruc- tions/sec, the memory is 1 GB, and the cost is $500. The speed gain is 2000. The 7090 had 1,179,648 bits of memory, whereas 1 GB is 8,589,934,592 bits giving a memory gain of 7281. Thus, the speed-size product is 14,562,000 times better. And with a price improvement of 6000, the total gain is 87.372 million times better. Now let’s scale the aircraft using the same numbers. If it could fly 2000 faster, it would go 1.18 million km/hr, which means it could cir- cle the earth in 2 min. A gain of 7281 in capacity implies about 1.3 million passengers. And our plane would cost about $667. 17. After mainframes, we had personal computers. A current development is mov- ing computation to the cloud, which is basically centralized computing all over again. PROBLEM SOLUTIONS FOR CHAPTER 2 5 SOLUTIONS TO CHAPTER 2 PROBLEMS 1. The data path cycle is 20 nsec. The maximum number of data path cycles/sec is thus 50 million. The best the machine could do is thus 50 MIPS. 2. The program counter must be incremented to point to the next instruction. If this step were omitted, the computer would execute the initial instruction for- ever. 3. You cannot say anything for sure. If computer 1 has a five-stage pipeline, it can issue up to 500 million instructions/second. If computer 2 is not pipelined, it cannot do any better than 200 million instructions/sec. Thus with- out more information, you cannot say which is faster. 4. On-chip memory does not affect the first three principles. Having only LOAD s and STORE s touch memory is no longer required. There is no particular rea- son not to have a memory-to-memory architecture if memory references are as fast as register references. Likewise, the need for many registers becomes less in this environment. 5. The monastery resembles Fig. 2-7, with one master and many slaves. 6. The access time for registers is a few nanoseconds. For optical disk it is a few hundred milliseconds. The ratio here is about 10 8 . 7. Sixty-four 6-bit numbers exist, so 4 trits are needed. In general, the number of trits, k , needed to hold n bits is the smallest value of k such that 3 k ≥ 2 n . 8. A pixel requires 6 + 6 + 6 = 18 bits, so a single visual frame is 1. 8 × 10 7 bits. With 10 frames a second, the gross data rate is 180 Mbps. Unfortunately, the brain’s processing rate is many orders of magnitude less than this. As an experiment, try watching the random noise on a color television for a few min- utes when no station is broadcasting and see if you can memorize the color bit pattern in the noise. 9. With 44,000 samples per second of 16 bits each, we have a data rate of 704 kbps. 10. There are 2 bits per nucleotide, so the information capacity of the human genome is about 6 gigabits. Dividing this number by 30,000, we get about 200,000 bits per gene. Just think of a gene as a 25-KB ROM. This estimate is an upper bound, because many of the nucleotides are used for purposes other than coding genes. 6 PROBLEM SOLUTIONS FOR CHAPTER 2 11. For efficiency with binary computers, it is best to have the number of cells be a power of 2. Since 1,073,741,824 is 2 30 , it is reasonable, whereas 1,000,000,000 is not. 12. From 0 to 9 the codes are: 0000000, 1101001, 0101010, 1000011, 1001100, 0100101, 1100110, 0001111, 1110000, and 0011001. 13. Just add a parity bit: 00000, 00011, 00101, 00110, 01001, 01010, 01100, 01111, 10001, and 10010. 14. If the total length is 2 n − 1 bits, there are n check bits. Consequently, the per- centage of wasted bits is n /(2 n − 1) × 100%. Numerically for n from 3 to 10 we get: 42.9%, 26.7%, 16.1%, 9.5%, 5.5%, 3.1%, 1.8%, and 1.0%. 15. Each 8-bit character is put into a 12-bit codeword where positions 1, 2, 4, and 8 are check positions. After calculating the check digits and grouping each consecutive 4 bits into a hex digit, the 5 character ASCII string is encoded in hex as: C85 DD1 DF2 5F4 4D8. 16. Each 8-bit ASCII character is encoded into three hex digits. The first set of hex digits: 0D3, has an error in bit 12 (as indicated by the fact that bit 4 and bit 8 have the wrong parity). The next set, DD3 has bit 11 wrong; the set 0F2 has bit 7 wrong; the set 5C1 has bit 9 wrong; the set 1C5 has bit 1 wrong; the last set CE3 does not contain any errors. After the bit positions are corrected and the data extracted from the code words and looked up in the ASCII table, the encoded characters are: babies. 17. With 4096 bits/sector and 1024 sectors/track, each track holds 4,194,304 bits. At 7200 RPM, each rotation takes 1/120 sec. In 1 sec it can read 120 tracks for a rate of 503,316,480 bits/sec or 62,914,560 bytes/sec. 18. At 160 Mbytes/sec and 4 bytes/word, the disk transfer rate is 40 million words/sec. Of the 200 million bus cycles/sec, the disk takes 1/5 of them. Thus the CPU will be slowed down by 20 percent. 19. Logically it does not matter, but the performance is better if you allocate from the outside in. One rotation of the outermost track takes as long as one rota- tion of the innermost track (because hard disks rotate with constant angular velocity), but there are more sectors on the outermost track, so the transfer rate is higher. It is smarter to use the high-performance sectors first. Maybe the disk will never fill up and you will never have to use the lowest-performance sectors. 20. A cylinder can be read in four rotations. During the fifth rotation, a seek is done to the next cylinder. Because the track-to-track seek time is less than the rotation time, the program must wait until sector zero comes around again. Therefore, it takes five full rotations to read a cylinder and be positioned to start reading the next one. Reading the first 9999 cylinders thus takes 49,995 PROBLEM SOLUTIONS FOR CHAPTER 2 7 rotations. Reading the last cylinder requires four rotations, because no final seek is needed. The 49,999 rotations at 10 msec/rotation take 499.99 sec. If the sectors are skewed, however, it may be possible to avoid the fifth rotation per track. 21. RAID level 2 can recover not only from crashed drives but also from undetect- ed transient errors. If one drive delivers a single bad bit, RAID level 2 will correct this, but RAID level 3 will not. 22. In mode 2, the data streams at 175,200 bytes/sec. In a 80-min time span, the number of seconds is 5920, so the size of a 80-min mode-2 CD-ROM is 840,960,000 bytes or 802 MB. Of course, in mode 2 there is no error cor- rection, which is fine for music but not for data. In mode 1, only 2048/2336 of the bits are available for data, reducing the payload to 737,280,000 bytes or 703 MB. 23. The mode does not matter, since the laser has to pulse for preamble bits, data bits, ECC bits, and all the overhead bits as well. The gross data rate at 1x is 75 sectors/sec, each sector consisting of 98 × 588 = 57, 624 bits. Thus 4,321,800 bits/sec fly by the head at 1x. At 10x, this is 43,218,000 bits/sec. Thus each pulse must last no more than 23.14 nsec (actually slightly less, since there is a blank interval between pulses). 24. Each frame contains 345,600 pixels or 1,036,800 bytes of information. At 30 fps, the rate per second is 31,104,000 bytes/sec. In 133 minutes this amounts to 2. 482 × 10 11 bytes. The disk capacity is 3. 5 × 2 30 which is about 3. 758 × 10 9 bytes. We are off by a factor of 66, so the compression has to be 66x. 25. The read time is the size divided by the speed: 25 × 2 30 /4. 5 × 10 6 or about 5965 sec. This is almost an hour and 39 minutes. 26. The usual way to handle this would be to have a bank of 256 24-bit mapping registers in the hardware. Whenever a byte was fetched from the video RAM, the 8-bit number would be used as an index into the mapping registers. The register selected would deliver the 24 bits to drive the display (typically 8 bits each for the red, green, and blue electron guns). Thus indeed 2 24 colors are available, but at any instant, only 256 are available. Changing colors means reloading the mapping registers. 27. For 32 intensities, we need 5 bits (since 2 5 = 32). For each pixel we need 5 × 5 = 25 bits. This gives 25 × 10 8 bits/frame. A temporal resolution of 10 msec means 100 frames/sec so the bit rate is 2500 × 10 8 bps. This is the same as 312. 5 × 10 8 bytes/sec. Using 1 GB = 10 9 bytes, this is 312.5 GB/sec. 8 PROBLEM SOLUTIONS FOR CHAPTER 2 28. The display must paint 1920 × 1080 × 75 pixels/sec. This is a total of 155.52 megapixels. Thus the pixel time is 6.43 nsec. 29. A page has 4000 characters. Each character uses 25% of 4 mm 2 , or 1 mm 2 . Thus a page has 4000 mm 2 of toner. With a thickness of 25 microns (0.025 mm), the volume of toner on a page is 100 mm 3 . The capacity of the toner cartridge is 400 cm 3 or 400,000 mm 3 . A cartridge is good for 4000 pages. 30. Each interval can transmit 6 bits, so the data rate is 6 n bps. 31. A 12-MHz cable with QAM-64 has a data rate of 72 Mbps. With nf com- puters sharing the bandwidth, each user gets 72/ nf Mbps. Thus the cable user gets better service if 72/ nf > 2. An alternative way to write this is nf < 36. In other words, if the 72-Mbps bandwidth is being shared by 36 active users, it is the same as 2-Mbps ADSL; with fewer users, cable wins; with more users, ADSL wins. 32. Each uncompressed image file is 18 million bytes. After 5x compression, it is 3.6 million bytes. To write this in 2 sec requires a data rate of 1.8 MB/sec. 33. The uncompressed image is 144 million bytes. The compressed image is 28.8 million bytes. The number of images stored is thus 8 × 2 30 /28. 8 million or 298 images. 34. A typical computer-science textbook has about a million characters, so it needs about 1 MB. Ten thousand books require 10 10 bytes. A CD-ROM holds 700 MB, so you need 15 CD-ROMs. A dual-layer, single-sided DVD holds 4.7 GB, so the whole library fits on three DVDs. PROBLEM SOLUTIONS FOR CHAPTER 3 9 SOLUTIONS TO CHAPTER 3 PROBLEMS 1. To a limited extent digital circuits are immune to noise. If a digital value is represented by, say +5 volts for 1 and 0 volts for 0, then a noise spike of 1 volt has no effect. But a noise spike of 3 volts could turn a 0 into a 1. 2. The order is hamburger or hot dog and french fries which can be parsed as hamburger or (hot dog and french fries) or as (hamburger or hot dog) and french fries The first parse leads to alternatives a and d. The second parse leads to alterna- tives e and d. Thus, a, d, and e are all possible. Choice h is ruled out on edu- cational grounds: the cook is too stupid to get the joke. 3. He should point to one of the roads and ask: "If I were to ask the other gang if this is the road to Disneyland, what would they say?" If the answer is no, he should take the road; if the answer is yes, he should not take it. The validity of this solution can easily be seen by trying all four combinations of right/wrong road and liars/truth-tellers. 4. The truth table required is as follows: P Q P AND Q P AND NOT Q (P AND Q) OR (P AND NOT Q) 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 1 1 1 0 1 5. With three variables, the truth table has eight rows, so a function can be de- scribed by an 8-bit number. Thus, 256 functions exist. With n variables, the truth table has k = 2 n rows and there are 2 k functions. 6. With three variables, the truth table has eight rows, so a function can be de- scribed by an 8-bit number. Thus, 256 functions exist. With four variables the table has 16 rows, so a function can be described by a 16-bit number. Thus 65,536 functions exist. 7. Call the two variables A and B . Connect the inputs to the first NAND gate to A and B . Take the output and feed it into both inputs of the second NAND gate and presto, AND . 10 PROBLEM SOLUTIONS FOR CHAPTER 3 8. Inputs D 1 , D 2 , D 4 , and D 7 are all connected to ground The other four inputs are tied to V cc . 9. Input line D 0 supplies the output for truth table rows 0000 and 0001. Input line D 1 supplies the output for truth table rows 0010 and 0011, and so on. For each case, the function values for the two rows can be 00, 01, 10, and 11. If they are 00, just wire the input to ground; if they are 11, just wire it to V cc . If they are 01, notice that they are just the same as the fourth input variable, D , so wire it to D . If they are 10, wire it to D . The truth table values for the ex- ample, from 0000 to 1111, are 0111001110100001, so the circuit is as follows. D 0 V cc D 1 D 2 D 3 D 4 D 5 D 6 D 7 A B C P D 10. The encoder looks like this. Note that line 0 is not used. Low-order bit Input 0 1 High-order bit 2 3 PROBLEM SOLUTIONS FOR CHAPTER 3 11 11. The demultiplexer looks like this. B AB AB AB Input A B Selected by A 12. It is a half adder with C as the sum and D as the carry. 13. The chip needs four pins for the first operand, four pins for the second oper- and, four pins for the result, one pin for carry in, and one pin for carry out (to make it cascadable), plus power and ground, for a total of 16 pins. 14. The carry into stage i can be written as C i = P i − 1 + S i − 1 C i − 1 , where P i − 1 is the product term A i − 1 B i − 1 and S i − 1 is the sum term A i − 1 + B i − 1 . This result follows directly from the fact that a carry is generated from a stage if both op- erands are 1, or if one operand and the carry in are both 1. For example, C 0 = 0 C 1 = P 0 + S 0 C 0 = P 0 C 2 = P 1 + S 1 C 1 = P 1 + P 0 S 1 C 3 = P 2 + S 2 C 2 = P 2 + P 1 S 2 + P 0 S 1 S 2 C 4 = P 3 + S 3 C 3 = P 3 + P 2 S 3 + P 1 S 2 S 3 + P 0 S 1 S 2 S 3 As soon as the inputs, A and B , are available, all the P and S terms can be gen- erated simultaneously in one gate delay time. Then the various AND terms such as P 0 S 1 S 2 can be generated in a second gate delay. Finally, all the carries can be produced in a third gate delay. Thus, all the carries are available after three gate delays, no matter how many stages the adder has. The price paid for this speedup is a considerable number of additional gates, of course. 15. The timing of the circuit can be found by writing a 0 on each of the input lines and then tracing them through the circuit, adding 2 at each gate. The A input takes 4 nsec to become available, so the output of the logic unit takes 8 nsec and the final output for a Boolean operation takes 10 nsec. The decode lines that drive the logic unit each have two gate delays, so the enable lines are set 12 PROBLEM SOLUTIONS FOR CHAPTER 3 in plenty of time. The adder also takes 6 nsec to produce its contribution to the output gate after it has A . Worst case through the whole circuit is 12 nsec. 16. The ALU is already capable of subtraction. Note that in 2’s complement, B − A = B + ( − A ). To get − A , we can use A + 1. Thus, to subtract A from B we need to add B , A and then increment the sum. The circuit can already do this by asserting INVA and INC and then adding the two inputs. 17. A basic cycle is 11 nsec, including propagation. Sixteen cycles take 176 nsec. However, the last propagation is not needed, so the answer is available 175 nsec after the start. 18. One way is to negate ENB , forcing B to 0, then choosing function code 01 to select B as the ALU function. The 1’s complement of 0 is − 1 in 2’s comple- ment (all 1 bits). As long as INC is negated, the control lines for A do not mat- ter. A second way is to negate and invert A , putting all 1s on the A input. Then negate ENB to force B to 0. With A equal to − 1 and B equal to 0, we can either add or OR them together. 19. The NAND latch is wired the same way as the NOR latch. Normally, both in- puts should be 1 to achieve consistency between input and output. 20. Use the same circuit, but replace the AND gate in the pulse generator by a NOR gate. The only time both inputs will be low is just after the falling edge. 21. The design uses two AND gates for chip enable logic plus two AND gates per word select line plus one AND gate per data bit. For a 256 × 8 memory this comes to 2 + 512 + 2048 = 2562 AND gates. The circuit also uses one OR gate for each bit in the word; hence eight of them would be needed. 22. It will not fly. Each of the four flip-flops needs three pins, for D , Q , and Q . In addition, the chip needs clock, power, and ground. Unless you plan on putting out the world’s first 15-pin chip, you are going to have to use a 16-pin pack- age, thus wasting a pin. Thus, the design is not very efficient. Since an extra pin is available, you might as well do something with it to make the chip more flexible. For example, you might use the sixteenth pin to reset all the FFs. 23. The pins can be multiplexed in time. For example, with n /2 pins, half the bits are presented in one cycle, and the other half in a succeeding cycle. Many dy- namic RAMs work this way. Even more extreme is to feed the address serially into the chip a bit at a time using a single pin. 24. A 32-bit-wide data bus means that 32 chips must be used in parallel, each chip providing 1 bit. Thus, the smallest memory consists of 32 chips, which is 32 megabits or 4 Mbytes. PROBLEM SOLUTIONS FOR CHAPTER 3 13 25. With a 20-nsec clock period, MREQ might be asserted as late as 13 nsec into T 1 . The data are required 2 nsec before the high-to-low transition in T 3 , which oc- curs 10 nsec after the start of the cycle. From the midpoint of T 1 to the mid- point of T 3 is 40 nsec. Since the memory cannot start until 3 nsec after the transition in the first cycle and has to be done 2 nsec before the transition in the third cycle, in the worst case the memory has only 35 nsec to respond. 26. The memory would now have 20 − 3 − 4 = 13 nsec to respond after MREQ and RD are asserted. There is not a lot of margin left, but if all the memory chips can always respond in 10 nsec, the system will still work. 27. In principle it could be negative, but it has to be asserted after the address is stable because on a write, the memory must not start changing the bytes ad- dressed until the address is valid. On reads it does not matter. 28. In regular mode, it takes three cycles per transfer. In block mode, it takes one cycle per transfer once the transfer has started. Therefore block transfers have almost three times as much bandwidth. The bus width does not matter, so it is still three times more, even for 32-bit transfers. 29. The CPU cannot assert MSYN until it has provided an address so T A 1 < T MSYN 1 , T MREQ 1 < T MSYN 1 , and T RD 1 < T MSYN 1 . The data cannot be asserted before MSYN , so T MSYN 1 < T D 1 and T D 1 < T SSYN 1 . Once SSYN has been asserted, the master can clean up, so T SSYN 1 < T MSYN 2 , T SSYN 1 < T RD 2 , T SSYN 1 < T MREQ 2 , and T SSYN 1 < T A 2 . Finally, T MSYN 2 < T D 2 and T MSYN 2 < T SSYN 2 . 30. In a multicore system, the cores can share caches and primary memory easily. Having a shared memory multiprocessor makes programming many applica- tions easier. There is often also a high-bandwidth interconnection between the cores for signaling, etc.
Study Now!
XY-Copilot AI
Unlimited Access
Secure Payment
Instant Access
24/7 Support
Document Chat
Document Details
Subject
Information Technology