[Simh] CMP R3,(R3)+

Discussion:

Lars Brinkhoff

2018-07-29 10:42:19 UTC

Hello,

I have a very small debugger for the GT40 called URUG, or micro RUG. It
has two troublesome instructions: CMP R3,(R3)+ and equivalent with R4.
I suppose SIMH will run it fine even though there's a hazard?

The PALX assembler complains about this, so I'm considering changing the
code. As far as I can see, the instructions are used to add 2 to a
register. It's shorter than an ADD R3,#2, which is important because
there's not a lot of memory on this machine.

Would there be any possible downside to using TST (R3)+ instead?

The whole file is here:
https://github.com/PDP-10/its-vault/blob/master/files/sysen2/urug.27

Timothe Litt

2018-07-29 20:42:05 UTC

Post by Lars Brinkhoff
Hello,
I have a very small debugger for the GT40 called URUG, or micro RUG. It
has two troublesome instructions: CMP R3,(R3)+ and equivalent with R4.
I suppose SIMH will run it fine even though there's a hazard?
The PALX assembler complains about this, so I'm considering changing the
code. As far as I can see, the instructions are used to add 2 to a
register. It's shorter than an ADD R3,#2, which is important because
there's not a lot of memory on this machine.
Would there be any possible downside to using TST (R3)+ instead?
https://github.com/PDP-10/its-vault/blob/master/files/sysen2/urug.27

I think the 11/20 had a bug and compared C(r3)+2 to @R3; the original
intent was that it would compare C(r3) to @R3, then increment R3. I
don't recall if it was fixed in later machines.

In any case, the CMP's purpose is to set the condition codes - e.g. it
does a subtract to set the condition codes, comparing the register
contents with memory. Aside from the autoincrement, it has no other
side effects.

Your code doesn't use the condition codes, so there's no difference
between your cmp and a TST (R3)+.

In either case, the instruction is making a memory reference. So R3
must not point at NXM (or many I/O devices, which have read side-effects).

As long as R3 points to a valid location in core memory, TST should be fine.
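
The difference described above can be sketched in a few lines of Python. This is only a rough model, not SIMH code: the function name, the `increment_first` switch and the sample memory are made up for illustration, and which PDP-11 models fall into which case is not encoded here.

```python
MASK = 0o177777  # 16-bit word mask

def cmp_r3_autoinc(r3, memory, increment_first):
    """Model CMP R3,(R3)+ and return (source, destination, new_r3, z_flag).

    increment_first=True models the variant where R3 has already been
    bumped by 2 when it is read as the source operand (hypothetical switch).
    """
    dst = memory[r3]                 # (R3)+ reads the word at the old R3
    new_r3 = (r3 + 2) & MASK         # autoincrement by 2 for a word operand
    src = new_r3 if increment_first else r3
    z = ((src - dst) & MASK) == 0    # CMP computes src - dst; Z set if equal
    return src, dst, new_r3, z

# Hypothetical memory: the word at address 0o1000 holds its own address.
memory = {0o1000: 0o1000}

early = cmp_r3_autoinc(0o1000, memory, increment_first=True)
late = cmp_r3_autoinc(0o1000, memory, increment_first=False)
```

Both variants leave R3 advanced by 2, which is all the URUG code relies on; only the comparison result (and hence the condition codes) differs.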

Lars Brinkhoff

2018-07-30 11:17:36 UTC

Your code doesn't use the condition codes, so there's no difference between
your cmp and a TST (R3)+.

Thank you!

OPR %R,(R)+ is listed as case 1. The difference between cases is
whether R is incremented before using it as the source operand.

Yes, that's why PALX is complaining.

Both instructions affect all 4 flags, so no code can already depend on
preserving any.

Thanks!

I did this change, and alerted the original author. :-)

Paul Koning

2018-07-30 13:30:22 UTC

Not exactly a hazard, but a documented incompatibility among implementations. In fact, there exists code that does something very similar to tell an 11/20 from others:

CMP PC,#.+4 ;SEE WHEN PC GETS INCREMENTED
BEQ O.ST01 ;IS IS A 11/20, ...

Post by Lars Brinkhoff
The PALX assembler complains about this, so I'm considering changing the
code. As far as I can see, the instructions are used to add 2 to a
register. It's shorter than an ADD R3,#2, which is important because
there's not a lot of memory on this machine.
Would there be any possible downside to using TST (R3)+ instead?

Yes, that is the standard way to do this. I have never seen the code you quoted before and I can't imagine any reason for doing that.

Either option of course only works if R3 contains a valid memory address, and it must be even. A short way to increment by 2 that doesn't depend on R3 being even would be CMPB (R3)+,(R3)+.

It's fairly common to see the TST, not just because it's shorter, but also because it has a well known effect on the C condition code (it clears it). For example, a common pattern when C is used to indicate success/fail in a subroutine:

TST (PC)+ ; Indicate success
fail: SEC
MOV (SP)+,R1 ; ...
RTS PC

You might also see code that pops a no longer needed value from the stack, either clearing or setting C or leaving it alone. To clear, you'd see TST (SP)+. To set, COM (SP)+. To leave it untouched, INC (SP)+. (More obscure is NEG, which sets C if the operand is non-zero and clears it if it is zero.)
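
As a rough illustration of the C-bit effects listed above, here is a small Python model. The function name and argument layout are invented for this sketch; it covers only the C flag of the four single-operand instructions mentioned, not the full condition-code behaviour.

```python
def carry_after(op, operand, carry_in):
    """Return the C flag after a single-operand PDP-11 word instruction.

    op is one of "TST", "COM", "INC", "NEG"; operand is the 16-bit value
    the instruction reads (for example the word popped via (SP)+).
    """
    operand &= 0o177777
    if op == "TST":
        return 0                       # TST always clears C
    if op == "COM":
        return 1                       # COM always sets C
    if op == "INC":
        return carry_in                # INC leaves C untouched
    if op == "NEG":
        return 1 if operand != 0 else 0  # NEG: C clear only for a zero result
    raise ValueError(f"unsupported instruction: {op}")
```

For example, `carry_after("INC", 0o123, 1)` returns the incoming carry unchanged, which is why INC (SP)+ is the choice when C must survive the pop.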

paul

Timothe Litt

2018-07-30 13:51:17 UTC

Post by Paul Koning
Yes, that is the standard way to do this. I have never seen the code you quoted before and I can't imagine any reason for doing that.

A memory address test's verification pass. Check that memory contains
address of self. Of course, you need a

   bne fail
following the compare :-)

Post by Paul Koning
Either option of course only works if R3 contains a valid memory address, and it must be even.

I should have noted that "valid memory address" includes "even" for
words. But if the code provided works on any 11 (obviously, not the
11/20), that constraint is met.

Post by Paul Koning
A short way to increment by 2 that doesn't depend on R3 being even would be CMPB (R3)+,(R3)+.
TST (PC)+ ; Indicate success
fail: SEC
MOV (SP)+,R1 ; ...
RTS PC
You might also see code that pops a no longer needed value from the stack, either clearing or setting C or leaving it alone. To clear, you'd see TST (SP)+. To set, COM (SP)+. To leave it untouched, INC (SP)+. (More obscure is NEG, which sets C if the operand is non-zero and clears it if it is zero.)

The C bit was a very common way of returning success/failure from
subroutines and system services. In his case, however, the condition
codes were ignored in all paths from the instruction. It was just a
very odd way of adding 2.

Those constructs bring back memories... particularly of debugging such
clever code that didn't have the corresponding comment. I often worked
on several machines with slightly different ideas of condition codes;
switching took some effort. Clever coding is fine - as long as you
document it.

BLISS got pretty good at being clever - but never at commenting its
assembler code. Some of its contortions caused CPU architects to pause
before agreeing that the code should work. On a few occasions, SHOULD
and DID diverged...

Paul Koning

2018-07-30 14:06:32 UTC

Post by Paul Koning
Yes, that is the standard way to do this. I have never seen the code you quoted before and I can't imagine any reason for doing that.

A memory address test's verification pass. Check that memory contains address of self. Of course, you need a
bne fail
following the compare :-)

Oh, ok. Yes, for a memory test that makes sense, but now the incompatibility matters and you have to split the instruction into two.

...
Those constructs bring back memories... particularly of debugging such clever code that didn't have the corresponding comment. I often worked on several machines with slightly different ideas of condition codes; switching took some effort. Clever coding is fine - as long as you document it.
BLISS got pretty good at being clever - but never at commenting its assembler code. Some of its contortions caused CPU architects to pause before agreeing that the code should work. On a few occasions, SHOULD and DID diverged...

One example I remember that puzzled me the first time I saw it is CMP (SP)+, (PC)+ which is "pop a word from the stack and skip the next (one-word) instruction".

Then there is the classic "one word to write all of memory" -- 014747 dropped into the last word of memory and executed. For extra credit, there is a one-word program that *clears* all of memory.
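
For anyone who wants to see why that works: 014747 decodes as MOV -(PC),-(PC), so the source operand is the instruction itself and the destination is the word just below it, where execution then continues. A minimal Python sketch follows, assuming a tiny flat memory with no I/O page or traps; the function name and the stop condition at address 0 are choices made for this model only.

```python
INSTR = 0o014747  # MOV -(PC),-(PC)

def run_self_copier(memory_words):
    """Simulate 014747 placed in the last word of a small memory.

    memory is a list of 16-bit words; word i lives at byte address 2*i.
    The model stops when the destination would fall below address 0.
    """
    memory = [0] * memory_words
    pc = 2 * (memory_words - 1)      # byte address of the last word
    memory[pc // 2] = INSTR
    steps = 0
    while True:
        word = memory[pc // 2]       # instruction fetch
        pc += 2                      # PC now points past the instruction
        if word != INSTR:
            break                    # not expected in this sketch
        pc -= 2                      # source -(PC): back onto the instruction
        src = memory[pc // 2]
        pc -= 2                      # destination -(PC): the word below
        if pc < 0:
            break                    # real hardware would wrap or trap here
        memory[pc // 2] = src        # copy the instruction downward
        steps += 1                   # next fetch comes from the new copy
    return memory, steps

memory, steps = run_self_copier(8)
```

After the run every word of the modelled memory holds 014747.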

paul

Johnny Billquist

2018-07-30 15:45:48 UTC

Post by Paul Koning
Yes, that is the standard way to do this. I have never seen the code you quoted before and I can't imagine any reason for doing that.

A memory address test's verification pass. Check that memory contains
address of self. Of course, you need a
bne fail
following the compare :-)

Not to mention that it will succeed or fail depending on which PDP-11
model you run the code on? :-)

Just as Paul, I have never seen anyone actually do CMP R3,(R3)+. Most
assemblers will give a warning just because the actual values used will
vary depending on model.

Post by Paul Koning
Either option of course only works if R3 contains a valid memory address, and it must be even.

I should have noted that "valid memory address" includes "even" for
words. But if the code provided works on any 11 (obviously, not the
11/20), that constraint is met.

I thought about that one for a moment as well, but since the result of
the compare itself was obviously irrelevant, using a TST instead of a
CMP does not make the code any more error prone than before.
The CMP would have barfed on an illegal or odd address already. Using a
TST will cause the same effects, except that condition codes will be
different at the end. In all other aspects they produce the same result.

Johnny

--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol

Timothe Litt

2018-07-30 18:22:28 UTC

Post by Johnny Billquist

Post by Timothe Litt

Yes, that is the standard way to do this. I have never seen the
code you quoted before and I can't imagine any reason for doing that.

A memory address test's verification pass. Check that memory
contains address of self. Of course, you need a
    bne fail
following the compare :-)

Not to mention that it will succeed or fail depending on which PDP-11
model you run the code on? :-)

Oddly enough, we did have quality control, and it usually worked.

Diagnostics are often CPU-specific. This was fixed after the 11/20.
I might have done this - it represents a 40% savings in instructions for
the loop:

    ;11/20 safe             Many other 11s          Some other 11s
    10$: mov r3, r0         10$: cmp r3,(r3)+       10$: cmp r3,(r3)+
         cmp r0, (r3)+           bne fail                bne fail
         bne fail                sob r1, 10$             dec r1
         dec r1                                          bne 10$
         bne 10$

20-40% reduction in instructions of an inner loop at boot time is worth
a runtime check for the 11/20 - if I cared (e.g. a BIST is probably in
processor-specific ROM, so no need to check).

Rhialto

2018-07-29 20:59:56 UTC

In the "PDP-11 Architecture Handbook 1983-84" (EB-23657-18) there is a
convenient appendix listing the differences between various
implementations of the PDP-11.

OPR %R,(R)+ is listed as case 1. The difference between cases is whether
R is incremented before using it as the source operand. Apart from that
no issues are listed, so if you don't care about that difference there is
no problem.

Post by Lars Brinkhoff
Would there be any possible downside to using TST (R3)+ instead?

Both instructions affect all 4 flags, so no code can already depend on
preserving any.

-Olaf.

--
___ Olaf 'Rhialto' Seibert -- Wayland: Those who don't understand X
\X/ rhialto/at/falu.nl -- are condemned to reinvent it. Poorly.

John Dundas

2018-07-30 18:41:50 UTC

Post by Lars Brinkhoff
I have a very small debugger for the GT40 called URUG, or micro RUG. It
has two troublesome instructions: CMP R3,(R3)+ and equivalent with R4.
I suppose SIMH will run it fine even though there's a hazard?
The PALX assembler complains about this, so I'm considering changing the
code. As far as I can see, the instructions are used to add 2 to a
register. It's shorter than an ADD R3,#2, which is important because
there's not a lot of memory on this machine.

The syntax should be:

ADD #2,R3

in PAL/PALX/MACRO-11. But this generates two words rather than one word
as used by the CMP instruction.
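
To make the word counts concrete, here is a small Python sketch that builds the instruction words from the standard PDP-11 field layout (4-bit opcode plus two 6-bit operand fields for double-operand instructions, 10-bit opcode plus one operand field for single-operand ones). The helper names are invented for this example and it is in no way a replacement for PALX or MACRO-11.

```python
def operand(mode, reg):
    """Pack a 3-bit addressing mode and 3-bit register into a 6-bit field."""
    return (mode << 3) | reg

def double_op(opcode, src, dst, extra=()):
    """Double-operand instruction: opcode, source field, destination field."""
    return [(opcode << 12) | (src << 6) | dst, *extra]

def single_op(opcode, dst):
    """Single-operand instruction: 10-bit opcode plus destination field."""
    return [(opcode << 6) | dst]

R3 = operand(0, 3)          # register direct: R3
R3_AUTOINC = operand(2, 3)  # autoincrement: (R3)+
IMMEDIATE = operand(2, 7)   # #n is (PC)+ with the literal in the next word

cmp_words = double_op(0o02, R3, R3_AUTOINC)            # CMP R3,(R3)+
tst_words = single_op(0o0057, R3_AUTOINC)              # TST (R3)+
add_words = double_op(0o06, IMMEDIATE, R3, extra=[2])  # ADD #2,R3
inc_words = single_op(0o0052, R3) * 2                  # INC R3 twice

for name, words in [("CMP R3,(R3)+", cmp_words), ("TST (R3)+", tst_words),
                    ("ADD #2,R3", add_words), ("INC R3 x2", inc_words)]:
    print(name, [format(w, "06o") for w in words], len(words), "word(s)")
```

CMP and TST each fit in one word, while ADD #2,R3 and the pair of INCs take two.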

Post by Lars Brinkhoff
Would there be any possible downside to using TST (R3)+ instead?

A number of possibilities have already been covered. Also consider:

BIT (R3)+,(R3)+

(different result in the condition codes) and

INC R3
INC R3

Note this is two words but may be faster as it is two instruction
fetches and no gratuitous memory references.

John