Discussion:
[Simh] DMA access to the IO page
Bob Supnik
2018-09-06 02:13:47 UTC
Apparently, the GT40 does this. So... problems.

1. The original simulator I wrote didn't support DMA to IO space. Code
for this was added in V4, but the code conformed to the internal
simulator convention that all addresses are 22b wide. This is certainly
not the case for Unibus DMA devices; those addresses are 18b wide. So
the V4 code to test for the IO page needs to be bus-type sensitive.

I'm assuming the GT40 either generates 18b addresses via Address
Extension bits or does the 16b -> 18b conversion itself, including the
IO page test. The "Unibus" does NOT do the IO page recognition/sign
extension to 18b. In all Unibus systems, that's done in the CPU.

On Qbus systems, IO page references are distinguished by the assertion
of BBS7, and only address bits <12:0> matter. Either the CPU or a DMA
device can assert BBS7, but I don't think the standard Qbus chips ever
assert BBS7, so I don't think standard Qbus DMA devices can access the
IO page. I could be wrong on this.

2. More critically, while all IO space addresses are accessible from the
CPU, not all are accessible from DMA. In particular, internal CPU
registers are not, at least on the systems I'm familiar with. (And I
think Unibus map registers aren't either.) CPUs, in general, didn't need
to monitor DMA activity, except for systems with internal caches, like
the 11/70. I know for a fact that the F11 and J11 simply ignored DMA
activity. The cache, if any, was external to the CPU, and it was the
responsibility of the cache controller to deal with DMA activity.

At the moment, the PDP11 simulator makes no distinction between IO page
addresses that are CPU-internal vs bus-external. Without this, DMA
devices can do truly evil things, like overwrite the PSW or memory
management registers, that they couldn't do on a real system. So a data
structure needs to be added to distinguish internal from external IO
space addresses, code needs to be added to distinguish internal DIB
entries from external, and call flags added to the IO page read/write
routines to distinguish CPU access from DMA access.
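
A minimal C sketch of the two changes described above: an I/O page test that depends on bus width, and a per-entry flag plus an access-type argument to keep DMA away from CPU-internal registers. Every name here (`is_iopage`, `iopage_write`, `ioreg_t`, `access_t`, `io_table`) is hypothetical and not taken from the SIMH sources. Returning false to model an NXM on a fenced-off address is an assumption as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none are taken from the SIMH sources. */
#define IOPAGE_SIZE 020000u                 /* the I/O page is 8 KB */

typedef enum { ACC_CPU, ACC_DMA } access_t;

typedef struct {
    uint32_t offset;    /* byte offset within the I/O page */
    uint32_t len;       /* length in bytes */
    bool     internal;  /* true: CPU-internal, not visible on the bus */
    uint16_t value;     /* stand-in for the register contents */
} ioreg_t;

static ioreg_t io_table[] = {
    { 017776, 2, true,  0 },    /* PSW (777776 on an 18b bus) */
    { 017560, 8, false, 0 },    /* console terminal (777560-777566) */
};

/* True if addr lies in the top 8 KB of a bus_bits-wide address space. */
static bool is_iopage(uint32_t addr, unsigned bus_bits)
{
    uint32_t limit = 1u << bus_bits;
    return addr < limit && addr >= limit - IOPAGE_SIZE;
}

/* Write one word. Returns false to signal NXM (assumed behavior). */
static bool iopage_write(uint32_t addr, uint16_t data,
                         access_t who, unsigned bus_bits)
{
    if (!is_iopage(addr, bus_bits))
        return false;
    uint32_t off = addr & (IOPAGE_SIZE - 1);    /* address bits <12:0> */
    for (size_t i = 0; i < sizeof io_table / sizeof io_table[0]; i++) {
        ioreg_t *r = &io_table[i];
        if (off >= r->offset && off < r->offset + r->len) {
            if (r->internal && who == ACC_DMA)
                return false;                   /* fenced off from DMA */
            r->value = data;
            return true;
        }
    }
    return false;
}
```

With this sketch, the 18b address 777560 is recognized as an I/O page reference only when the caller passes a bus width of 18. Tested against a 22b width it falls outside the I/O page, which is the mismatch described in point 1.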

/Bob
Lars Brinkhoff
2018-09-06 06:32:11 UTC
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
Yes, the boot ROM includes instructions for the VT11 display processor.

Admittedly, I'm quite clueless when it comes to details about the PDP-11
and Unibus. The vt_fetch function in pdp11_vt.c does go through the I/O
map, but apparently that by itself isn't enough to access the ROM
region. The GT40 is built on a 11/05 which doesn't have an I/O map,
right?

I worked around this by adding a direct call to iopageR. It's a
temporary solution for running the GT40 ROM code, and as such it works.
I apologize if this makes anyone cringe. I'll be happy to throw it out
when a real solution appears.

. . .
Post by Bob Supnik
CPUs, in general, didn't need to monitor DMA activity, except for
systems with internal caches, like the 11/70.
I'll take this opportunity to also mention that I tested something
labelled "PDP-11/60, 70 Console/Diagnostic ROM". It checks bit 0 in the
177752 register, which is related to the cache. Doing "set hitmiss 1"
in SIMH makes the ROM happy, but I'm sure that's only an approximation
of the real hardware function. Secondly, it retrieves an address from
704 and jumps to that address plus two. I don't know what's
supposed to be there.
Angelo Papenhoff
2018-09-06 06:56:11 UTC
Post by Lars Brinkhoff
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
Yes, the boot ROM includes instructions for the VT11 display processor.
Admittedly, I'm quite clueless when it comes to details about the PDP-11
and Unibus. The vt_fetch function in pdp11_vt.c does go through the I/O
map, but apparently that by itself isn't enough to access the ROM
region. The GT40 is built on a 11/05 which doesn't have an I/O map,
right?
Having written a PDP-11/05 emulator recently I can explain how things
work there.
As Bob said, internal registers are not visible on the Unibus,
the CPU handles the internal addresses itself and "simulates" a Unibus
transaction.
Everything else it puts on the Unibus, the top page of the 16 bit
address space is mapped to the top page of the 18 bit address space,
the rest is identity mapped. This happens, as Bob said, on the CPU.
The internal registers are:
the PSW, the 16 scratchpad registers (8 GPRs + some temporaries
used by the μcode. words addressed at even and odd locations!!),
the switches, the line clock, the serial console (KL11 look-alike).
On top of that the internal "KL11" and(?) the line clock could be disabled
so you could use a real Unibus peripheral instead.
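
The address mapping described above can be sketched in C as follows. The function name `map16to18` is hypothetical, and the sketch assumes a CPU without an MMU.

```c
#include <stdint.h>

/* Hypothetical helper: widen a 16b CPU address to an 18b Unibus address
   on a machine without an MMU. */
static uint32_t map16to18(uint16_t va)
{
    if (va >= 0160000)                      /* top 8 KB of the 16b space */
        return (uint32_t)va | 0600000;      /* set address bits 17 and 16 */
    return va;                              /* everything else: identity */
}
```

For example, 177776 becomes 777776, while 157776 is passed through unchanged.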

aap
Don North
2018-09-06 08:31:48 UTC
Post by Lars Brinkhoff
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
Yes, the boot ROM includes instructions for the VT11 display processor.
Admittedly, I'm quite clueless when it comes to details about the PDP-11
and Unibus. The vt_fetch function in pdp11_vt.c does go through the I/O
map, but apparently that by itself isn't enough to access the ROM
region. The GT40 is built on a 11/05 which doesn't have an I/O map,
right?
I worked around this by adding a direct call to iopageR. It's a
temporary solution for running the GT40 ROM code, and as such it works.
I apologize if this makes anyone cringe. I'll be happy to throw it out
when a real solution appears.
. . .
Post by Bob Supnik
CPUs, in general, didn't need to monitor DMA activity, except for
systems with internal caches, like the 11/70.
I'll take this opportunity to also mention that I tested something
labelled "PDP-11/60, 70 Console/Diagnostic ROM". It checks bit 0 in the
177752 register, which is related to the cache. Doing "set hitmiss 1"
in SIMH makes the ROM happy, but I'm sure that's only an approximation
of the real hardware function. Secondly, it retrieves an address from
704 and jumps to that address plus two. I don't know what's
supposed to be there.
Locations 700-704 are used to save registers r0,r1,r4 when running the
diagnostics, when called from a device boot prom. Normally when booting via an
M9312 device boot prom about the first thing the device boot prom does is jump
into the CPU console prom to run the simple go/nogo CPU diagnostics. Prior to
running the actual diagnostics the code saves r0,r1,r4 of the caller in
locations 700-704. Register r4 contains the return address-2 where the console
prom diagnostics should return to if the diagnostics pass. So when the
diagnostics complete it does a 'mov @#704,r4; jmp 2(r4)' to return to the
device boot prom to then execute the actual boot code.
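
A small C model of that save-and-return convention, for illustration only. The names `mem`, `diag_save` and `diag_return_target` are hypothetical, and the sample value used for r4 in the note below is made up rather than taken from any listing.

```c
#include <stdint.h>

/* 32K words of memory, indexed by (byte address / 2). */
static uint16_t mem[0100000];

/* Save the caller's r0, r1 and r4 at locations 700, 702 and 704. */
static void diag_save(uint16_t r0, uint16_t r1, uint16_t r4)
{
    mem[0700 / 2] = r0;
    mem[0702 / 2] = r1;
    mem[0704 / 2] = r4;
}

/* Address reached by 'mov @#704,r4; jmp 2(r4)': the saved r4 plus 2. */
static uint16_t diag_return_target(void)
{
    return (uint16_t)(mem[0704 / 2] + 2);
}
```

If the caller entered with r4 = 173000 (an arbitrary example), the diagnostics return to 173002.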

This is all detailed in the M9312 console prom listings and user guide on
bitsavers.

Don North
Johnny Billquist
2018-09-06 21:51:18 UTC
Post by Lars Brinkhoff
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
Yes, the boot ROM includes instructions for the VT11 display processor.
Admittedly, I'm quite clueless when it comes to details about the PDP-11
and Unibus. The vt_fetch function in pdp11_vt.c does go through the I/O
map, but apparently that by itself isn't enough to access the ROM
region. The GT40 is built on a 11/05 which doesn't have an I/O map,
right?
I worked around this by adding a direct call to iopageR. It's a
temporary solution for running the GT40 ROM code, and as such it works.
I apologize if this makes anyone cringe. I'll be happy to throw it out
when a real solution appears.
It seems like a rather unfortunate design quirk in simh.
A real Unibus doesn't distinguish between the I/O page and anything else.
In fact, from the Unibus point of view, there is no "I/O page". It's a
flat 18-bit address space where everything works the same.

Memory is a bit special in that it is always just a slave on the Unibus,
and never a master. And memory can usually appear anywhere.
Many other devices can be both masters and slaves, and are usually
addressed in the high 8 Kbytes, but there is nothing that says they have to be.
Finally, the CPU is always a master, and never a slave. So no other devices
can access the CPU.

The I/O page concept, as such, is a construction in the CPU. It's just
so that the high 8K of the 64K address space on a PDP-11 without MMU
gets converted to the high 8K of the Unibus address space when
referenced. And of course, if the PDP-11 has an MMU, then you'll have
to make sure you set up such a mapping yourself.

So, of course DMA can happen to the high 8K of the Unibus. No different
than any other address.
Post by Lars Brinkhoff
Post by Bob Supnik
CPUs, in general, didn't need to monitor DMA activity, except for
systems with internal caches, like the 11/70.
I'll take this opportunity to also mention that I tested something
labelled "PDP-11/60, 70 Console/Diagnostic ROM". It checks bit 0 in the
177752 register, which is related to the cache. Doing "set hitmiss 1"
in SIMH makes the ROM happy, but I'm sure that's only an approximation
of the real hardware function. Secondly, it retrieves an address from
704 and jumps to that address plus two. I don't know what's
supposed to be there.
The 11/60 and 11/70 diagnostics ROM tests that the cache is working
by accessing some memory twice and checking that cache hits are
registered.
And yes, 177752 is the cache hit/miss register, which holds the hit/miss
state of the last 6 memory accesses.
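
A rough C model of such a 6-bit hit/miss history, to show why a repeated access makes the ROM's bit 0 test pass. The names `hitmiss` and `record_access` are hypothetical. The shift direction (newest access in bit 0, older ones moving toward bit 5) is an assumption here, chosen to match the ROM checking bit 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 6-bit hit/miss history. Assumption: the newest access
   lands in bit 0 and older results shift toward bit 5. */
static uint16_t hitmiss;

/* Record the outcome of one memory access (true = cache hit). */
static void record_access(bool hit)
{
    hitmiss = (uint16_t)(((hitmiss << 1) | (hit ? 1u : 0u)) & 077);
}
```

Under that assumption, a miss followed by a hit on the same location leaves bit 0 set, which is the condition the diagnostic looks for.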

As for the start address, the diagnostics ROM is supposed to be jumped
to from the device specific bootstrap. The first thing the diagnostics
do is preserve registers R0,R1,R4 in 700,702,704.
R4 is supposed to hold the address from where the diagnostics were
called, and so at the end of the diagnostics, the code returns from
where it was called, but without making use of the stack.

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
Paul Koning
2018-09-06 13:16:21 UTC
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
FWIW, I've done this too, in college on our 11/20. That was for a primitive CRT display (an X/Y scope attached to an AD11 (?) D/A converter) which had its data fed to it by DMA from the RC11 system disk. It was set to write to the D/A data register, with bus address increment inhibit set. Crude, but it worked.

paul
Timothe Litt
2018-09-06 18:32:43 UTC
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
The KDP (KMC/DUP) and KDZ (KMC/DZ) also do this.   Once the KMC11 was
available, this became a fairly popular method to off-load character
processing & turn character interrupts into DMA.  When I did the KDP
emulation, Mark & I negotiated a private API rather than emulate NPR to
the DUP.  See pdp11_kmc.c.  (In the hardware, the KMC ucode runs the dup
(or dz) with interrupts disabled & polls the DUP/DZ CSRs often enough to
catch individual characters.  It then DMAs validated messages into
memory.  The polling would have been pretty expensive to emulate.)

I can't speak to whether any Qbus device does DMA to the IO page -
off-hand, I can't think of a DEC device that would have done that. 

It's certainly true that if CPU internal registers were accessible to
DMA writes, bad things could happen.  However, it may not be necessary
to fence them off in the emulator.  In the hardware, I'd expect such
a device to get a NXM (bus timeout).  Or maybe the write is ignored.  So
unless a device exists that expects that behavior, which seems doubtful,
the issue can probably be ignored.  Of course, having said that,
there'll probably be some diagnostic that tests NPR timeouts that way :-( 

(Even more evil would be that some early (unibus) 11's had the GPRs in
I/O space -- I think an implementation artifact, but it was handy to be
able to read them directly off the switches for debugging.)
_______________________________________________
Simh mailing list
http://mailman.trailing-edge.com/mailman/listinfo/simh
Mark Pizzolato
2018-09-06 19:04:19 UTC
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
The KDP (KMC/DUP) and KDZ (KMC/DZ) also do this.   Once the KMC11
was available, this became a fairly popular method to off-load character
processing & turn character interrupts into DMA.  When I did the KDP
emulation, Mark & I negotiated a private API rather than emulate NPR
to the DUP.  See pdp11_kmc.c.  (In the hardware, the KMC ucode runs
the dup (or dz) with interrupts disabled & polls the DUP/DZ CSRs often
enough to catch individual characters.  It then DMAs validated messages
into memory.  The polling would have been pretty expensive to emulate.)
Along the way, while doing this, you had added the ability to access the
I/O page via DMA.
It's certainly true that if CPU internal registers were accessible to DMA
writes, bad things could happen.  However, it may not be necessary to
fence them off in the emulator.  In the hardware, I'd expect such
a device to get a NXM (bus timeout).  Or maybe the write is ignored. 
So unless a device exists that expects that behavior, which seems
doubtful, the issue can probably be ignored.  Of course, having said that,
there'll probably be some diagnostic that tests NPR timeouts that way :-( 
As it turns out, fencing them off wasn't hard at all and is now in the
master branch along with bus size bounds checking to more robustly
identify I/O page references.
Post by Bob Supnik
1. The original simulator I wrote didn't support DMA to IO space.
Code for this was added in V4, but the code conformed to the
internal simulator convention that all addresses are 22b wide.
This is certainly not the case for Unibus DMA devices; those
addresses are 18b wide. So the V4 code to test for the IO page
needs to be bus-type sensitive.
I'm assuming the GT40 either generates 18b addresses via Address
Extension bits or does the 16b -> 18b conversion itself, including
the IO page test. The "Unibus" does NOT do the IO page
recognition/sign extension to 18b. In all Unibus systems, that's
done in the CPU.
That is the core problem that raised this discussion and illuminated
the potential problems with the initial simulator DMA access to the
I/O page.

The case we were seeing only had a 16-bit address presented for a
DMA memory reference. The lack of high bits 16 and 17 was either
an implementation problem in the VT11 simulation or a bug in the
program we're running which never programmed the high bus
address bits. I'm leaving it to Lars to explore which of these is the
cause. The code that is running was not captured directly from ROMs
but was typed in from a listing in a manual. That precise code may
not have ever actually run from a ROM in the I/O page...

- Mark
Timothe Litt
2018-09-06 21:43:49 UTC
Post by Mark Pizzolato
Post by Timothe Litt
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
The KDP (KMC/DUP) and KDZ (KMC/DZ) also do this.   Once the KMC11
was available, this became a fairly popular method to off-load character
processing & turn character interrupts into DMA.  When I did the KDP
emulation, Mark & I negotiated a private API rather than emulate NPR
to the DUP.  See pdp11_kmc.c.  (In the hardware, the KMC ucode runs
the dup (or dz) with interrupts disabled & polls the DUP/DZ CSRs often
enough to catch individual characters.  It then DMAs validated messages
into memory.  The polling would have been pretty expensive to emulate.)
Along the way, while doing this, you had added the ability to access the
I/O page via DMA.
Yes.  It's a hybrid.

The private interface handles DDCMP message transport, avoiding the
character overhead.  You do the framing, CRC & deliver (or accept)
messages.  The KMC does the DMA. 

However, the DUP CSRs are accessed by DMA for resetting/configuring the
DUP, modem control, loopback, and also to detect (via NXM) cases where
the OS tries to activate a non-existent DUP.  These are infrequent, and
didn't merit a private interface.
Post by Mark Pizzolato
Post by Bob Supnik
I'm assuming the GT40 either generates 18b addresses via Address
Extension bits or does the 16b -> 18b conversion itself, including
the IO page test. The "Unibus" does NOT do the IO page
recognition/sign extension to 18b. In all Unibus systems, that's
done in the CPU.
That is the core problem that raised this discussion and illuminated
the potential problems with the initial simulator DMA access to the
I/O page.
The case we were seeing only had a 16-bit address presented for a
DMA memory reference. The lack of high bits 16 and 17 was either
an implementation problem in the VT11 simulation or a bug in the
program we're running which never programmed the high bus
address bits. I'm leaving it to Lars to explore which of these is the
cause. The code that is running was not captured directly from ROMs
but was typed in from a listing in a manual. That precise code may
not have ever actually run from a ROM in the I/O page...
Probably the former.  The 11/05 doesn't have an MMU and uses 16 bit
addresses.  The bus master (CPU or device) must set bits 16 & 17 for any
access to the I/O page.  This is required for any Unibus peripheral to
work. 

This isn't a function of where the code resides.  There are two things
going on.  The CPU has to fetch code from I/O space - this isn't DMA,
but it does require the CPU to set the upper PA bits.  The display
processor does DMA to get graphics data - this is DMA.  If the ROM code
tells the display processor to fetch from ROM, that's DMA to I/O space. 
In that case, the display processor is responsible for setting the upper
PA bits.
Paul Koning
2018-09-06 19:06:34 UTC
Post by Bob Supnik
Apparently, the GT40 does this. So... problems.
The KDP (KMC/DUP) and KDZ (KMC/DZ) also do this. Once the KMC11 was available, this became a fairly popular method to off-load character processing & turn character interrupts into DMA. When I did the KDP emulation, Mark & I negotiated a private API rather than emulate NPR to the DUP. See pdp11_kmc.c. (In the hardware, the KMC ucode runs the dup (or dz) with interrupts disabled & polls the DUP/DZ CSRs often enough to catch individual characters. It then DMAs validated messages into memory. The polling would have been pretty expensive to emulate.)
I can't speak to whether any Qbus device does DMA to the IO page - off-hand, I can't think of a DEC device that would have done that.
It's certainly true that if CPU internal registers were accessible to DMA writes, bad things could happen. However, it may not be necessary to fence them off in the emulator. In the hardware, I'd expect such
a device to get a NXM (bus timeout). Or maybe the write is ignored. So unless a device exists that expects that behavior, which seems doubtful, the issue can probably be ignored. Of course, having said that, there'll probably be some diagnostic that tests NPR timeouts that way :-(
(Even more evil would be that some early (unibus) 11's had the GPRs in I/O space -- I think an implementation artifact, but it was handy to be able to read them directly off the switches for debugging.)
GPRs in the I/O space is, for the most part, something visible only to console switches, not even the program in the CPU let alone I/O devices. But there is one famous example, the 11/05, which can execute code from the GPRs. And in that case, the PC increments by 1 rather than 2 because the GPR addresses are 1 apart rather than 2. This allows very short "is this CPU working" tests on an 11/05 without any functioning memory.

For SIMH, it would be sufficient to make non-CPU I/O space addresses visible to DMA. But as you said, if, say, the MMU addresses were to answer to DMA requests, that would probably be ok because existing programs are not going to do such a thing.

paul
Tim Shoppa
2018-09-07 00:13:50 UTC
Bob, a completely supported configuration for Q-bus RT-11 was 30K words (up
to 170000) RAM on a MSV11D on a 11/23 or 11/03.

This started with, I think, V3B of RT-11 and continued through RT-11 5.7.

Similar systems could be configured with third party memory and would work
on Unibus 11/24's as well.

I never knew of any issues with DMA to the high memory addresses using DEC
DMA peripherals (16 bit or 18 bit address bus) or clones (almost all 18 bit
address bus).

The Falcon documentation shows how to enable memory even further up and a
patch in the RT-11 release notes shows how to configure a monitor to use
memory up higher than 30K.

I was told RSX-11S had support for 30K as well but I'm not as well versed
in RSX-11S configuration.

Tim