exsphl.txt 3.8 KB

EX (SP),HL
==========

This instruction does 5 memory transactions:
 - 1 byte instruction fetch
 - 2 bytes data reads
 - 2 bytes data writes

The goal of this test is to find out the order of these 5 transactions.

Obviously the instruction fetch (F) is the first transaction.
There is one read from location SP+0, I'll call this R0, and one read from
location SP+1 (R1). There is a write to SP+0 (W0) and a write to SP+1 (W1).
R0 must come before W0 and R1 must come before W1 (otherwise the write
would destroy the value that still has to be read). This leaves the
following possible orders (please check that I didn't miss any):
  a) F R0 W0 R1 W1
  b) F R0 R1 W0 W1
  c) F R0 R1 W1 W0
  d) F R1 W1 R0 W0
  e) F R1 R0 W1 W0
  f) F R1 R0 W0 W1
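
To check that the list above is complete, the orders can also be
enumerated mechanically. This is only a small sketch in Python (not part
of the original test); the function name valid_orders is made up for this
example. It tries every permutation of the four data transactions and
keeps those where each read comes before the write to the same address.

```python
from itertools import permutations


def valid_orders():
    """Return all orders of R0, R1, W0, W1 (after the fetch F) in which
    every read precedes the write to the same location."""
    orders = []
    for perm in permutations(["R0", "R1", "W0", "W1"]):
        r0_before_w0 = perm.index("R0") < perm.index("W0")
        r1_before_w1 = perm.index("R1") < perm.index("W1")
        if r0_before_w0 and r1_before_w1:
            orders.append(("F",) + perm)
    return orders


if __name__ == "__main__":
    for order in valid_orders():
        print(" ".join(order))
```

This prints six orders, the same six as listed in a) to f), so none was
missed.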


* Z80

In the document z80cpu_um.pdf you can find the number of cycles per M-cycle.
For the EX (SP),HL instruction it lists:
   5 M-cycles, 19 T-states, 4+3+4+3+5
Note: this doesn't include the extra wait-state introduced on MSX.

This can be explained like this (note: I'm guessing here, but it seems
to fit the number of T-states well):
   M1: fetch/decode opcode                  4     cycles
   M2: tmpL = read(SP)                      3     cycles
   M3: ++SP ; tmpH = read(SP)               1+3   cycles
   M4: write(SP, H)                         3     cycles
   M5: --SP ; write(SP, L) ; HL = tmp       1+3+1 cycles
 - fetching an opcode (or prefix) takes 4 cycles in all Z80 instructions
 - reading/writing memory takes 3 cycles on Z80

This corresponds to order c) [F R0 R1 W1 W0].
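
The guessed M-cycle breakdown can be written down as a small table and
checked against the documented total. This is just a sketch in Python; the
names M_CYCLES, total_t_states and transaction_order are invented for the
example, and the split of each M-cycle into internal and memory cycles is
the same guess as in the text above, not something taken from the manual.

```python
# (M-cycle, memory transaction, T-states). The T-state split inside each
# M-cycle (e.g. 1+3) is the guess from the text, not documented behaviour.
M_CYCLES = [
    ("M1", "F", 4),           # fetch/decode opcode
    ("M2", "R0", 3),          # tmpL = read(SP)
    ("M3", "R1", 1 + 3),      # ++SP ; tmpH = read(SP)
    ("M4", "W1", 3),          # write(SP, H)
    ("M5", "W0", 1 + 3 + 1),  # --SP ; write(SP, L) ; HL = tmp
]


def total_t_states(cycles=M_CYCLES):
    """Sum the T-states over all M-cycles."""
    return sum(t_states for _, _, t_states in cycles)


def transaction_order(cycles=M_CYCLES):
    """Return the memory transactions in the order they occur."""
    return [transaction for _, transaction, _ in cycles]


if __name__ == "__main__":
    print(total_t_states())                # 19
    print(" ".join(transaction_order()))   # F R0 R1 W1 W0
```

The total matches the documented 19 T-states, and the transaction order is
the one of possibility c).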


* R800

On R800, the instruction takes 7 cycles if SP+0 and SP+1 are in the same
256-byte page, 9 cycles if not (see r800test.txt for more details on this).
So it seems there are two extra page-breaks in the latter case. Orders a)
and d) need only 1 extra page-break (of course it's not impossible that
there still are 2 page-breaks, or one page-break and some other extra cost).
In any case the timing alone doesn't give us enough information.
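
The number of extra page-breaks per order can be counted with a small
model. This is a sketch in Python under explicit assumptions that are NOT
taken from measurements: a page-break happens whenever the high address
byte of an access differs from that of the previous access (reads and
writes treated alike), each page-break costs 1 cycle, and the opcode is
fetched from a page other than the stack page. All names in the code are
invented for the example; r800test.txt has the real details.

```python
ORDERS = {
    "a": ["R0", "W0", "R1", "W1"],
    "b": ["R0", "R1", "W0", "W1"],
    "c": ["R0", "R1", "W1", "W0"],
    "d": ["R1", "W1", "R0", "W0"],
    "e": ["R1", "R0", "W1", "W0"],
    "f": ["R1", "R0", "W0", "W1"],
}


def page_breaks(order, sp, code_page=0xC0):
    """Count accesses whose 256-byte page differs from the previous access.
    ASSUMPTION: this is a simplified page-break model, see the text."""
    breaks = 0
    previous_page = code_page          # page of the opcode fetch
    for transaction in order:
        offset = int(transaction[1])   # "R0" -> 0, "W1" -> 1
        page = ((sp + offset) & 0xFFFF) >> 8
        if page != previous_page:
            breaks += 1
        previous_page = page
    return breaks


def extra_page_breaks(order):
    """Page-breaks added when SP+0 and SP+1 are in different pages."""
    same_page = page_breaks(order, sp=0x8000)
    cross_page = page_breaks(order, sp=0x80FF)
    return cross_page - same_page


def predicted_cycles(order, base_cycles=7):
    """ASSUMPTION: 7 cycles in the same-page case, 1 cycle per page-break."""
    return base_cycles + extra_page_breaks(order)


if __name__ == "__main__":
    for name, order in ORDERS.items():
        print(name, extra_page_breaks(order), predicted_cycles(order))
```

Under these assumptions a) and d) get 1 extra page-break, c) and f) get 2,
and b) and e) get 3. So the measured 9 cycles fit c) and f), but as said,
timing alone is not proof.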

We need another test to be sure about the order. We'll do this by setting
the stack pointer to a memory-mapped IO region.

In the MSXTurboR, in slot 3-3, there's a ROM mapper. Writes to the region
0x6C00-0x6FFF will switch the ROM page visible in region 0x6000-0x7FFF
(the value written selects the page). The content of the ROM is like this:
   page   0x6C80   0x6C81
     0      F6       50
     1      C3       3E
     2      08       07

Now execute the following program, with slot 3-3 selected in page 1
(0x4000-0x7FFF).
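-------------------------------
        org     #C000

        di
        ld      (save),sp       ; remember the real stack pointer
        ld      sp,#6C80        ; point SP at the mapper register
        xor     a
        ld      (#6C80),a       ; select ROM page 0
        ld      hl,#0201        ; H=2, L=1
        ex      (sp),hl         ; instruction under test
        ld      de,(#6C80)      ; ROM content after both writes
        ld      sp,(save)       ; restore the stack pointer
        ret

save    dw      0
-------------------------------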

After running this program the registers HL and DE contain these values:
   HL = 50F6
   DE = 3EC3

Register DE contains the ROM content after both writes (W0, W1) are done.
The value of DE comes from ROM page 1, so the write of L=1 (=W0) must come
after the write of H=2 (=W1).

Both register H and L contain the content of the initial ROM page
(page 0). This means both reads are executed before either of the writes.
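
The reasoning above can be replayed for all six candidate orders. This is
a sketch in Python, not an emulator: it only models the two ROM bytes at
0x6C80/0x6C81 of pages 0..2 as listed earlier, and it assumes that the
value written to the mapper directly selects the ROM page. The names ROM,
ORDERS, run and matching_orders are invented for the example.

```python
# ROM bytes at (0x6C80, 0x6C81) for each page, from the table in the text.
ROM = {0: (0xF6, 0x50), 1: (0xC3, 0x3E), 2: (0x08, 0x07)}

ORDERS = {
    "a": ["R0", "W0", "R1", "W1"],
    "b": ["R0", "R1", "W0", "W1"],
    "c": ["R0", "R1", "W1", "W0"],
    "d": ["R1", "W1", "R0", "W0"],
    "e": ["R1", "R0", "W1", "W0"],
    "f": ["R1", "R0", "W0", "W1"],
}


def run(order, hl=0x0201):
    """Replay EX (SP),HL with SP=0x6C80 and return (HL, DE) afterwards."""
    page = 0                          # selected by "xor a / ld (#6C80),a"
    old_h, old_l = hl >> 8, hl & 0xFF
    tmp = {}
    for transaction in order:
        offset = int(transaction[1])  # 0 -> SP+0, 1 -> SP+1
        if transaction[0] == "R":
            tmp[offset] = ROM[page][offset]
        else:
            # ASSUMPTION: the written byte is the new ROM page number.
            page = old_l if offset == 0 else old_h
    new_hl = (tmp[1] << 8) | tmp[0]
    de = (ROM[page][1] << 8) | ROM[page][0]   # ld de,(#6C80)
    return new_hl, de


def matching_orders(expected=(0x50F6, 0x3EC3)):
    """Names of the orders that reproduce the measured HL and DE."""
    return [name for name, order in ORDERS.items() if run(order) == expected]


if __name__ == "__main__":
    for name, order in ORDERS.items():
        hl, de = run(order)
        print(f"{name}) HL={hl:04X} DE={de:04X}")
    print(matching_orders())
```

Only orders c) and e) give HL=50F6 and DE=3EC3 in this model. For example
order a) would give HL=3EF6 and DE=0708.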

This leaves two possibilities:
                          same page   different page
   c) F R0 R1 W1 W0       [FRrww]     [FRRwW]
   e) F R1 R0 W1 W0       [FRrww]     [FRRWW]

I've also indicated the page-breaks (upper-case letters) when SP+0 and SP+1
are in the same or in a different page (again see r800test.txt for
details). Possibility e) would require 10 cycles; possibility c) matches
the measurement of 9 cycles.


* conclusion

Both Z80 and R800 use the same order [F R0 R1 W1 W0]. The result on R800
is based on measurements; for Z80 it's based on extrapolating from the
T-state documentation. It would be nice to actually measure it on Z80 as
well.

The order [F R0 W0 R1 W1] would have been faster on R800: it would only
require 8 cycles when SP+0 and SP+1 are in different pages. However, that
would be incompatible with Z80. Also, most of the time it doesn't matter
because the word at the top of the stack rarely crosses a page boundary
(and never does when the stack is two-byte aligned).
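
The last remark is easy to verify by brute force. A minimal sketch in
Python (the function name crosses_page is made up for the example): the
word at SP crosses a 256-byte page boundary only when the low byte of SP
is 0xFF, which is an odd address.

```python
def crosses_page(sp):
    """True if SP+0 and SP+1 lie in different 256-byte pages."""
    return (sp >> 8) != (((sp + 1) & 0xFFFF) >> 8)


if __name__ == "__main__":
    crossing = [sp for sp in range(0x10000) if crosses_page(sp)]
    # Every crossing address has low byte 0xFF, so none of them is even.
    print(all(sp & 0xFF == 0xFF for sp in crossing))
    print(any(crosses_page(sp) for sp in range(0, 0x10000, 2)))
```

The first line printed is True and the second is False: with an even SP the
slower cross-page case never occurs.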