Discussion:
[Simh] MicroVAX 3900 simulator fails BIST sometimes?
Robert Armstrong
2018-06-23 18:15:53 UTC
I've found that on occasion the MicroVAX 3900 simh will fail the built-in
self test. Just to be clear, this is the self test that's in the
MicroVAX-III EPROM image that's failing; it's not VMS.

legato-hecnet.sim-73> boot cpu

KA655X-B V5.3, VMB 2.7
Performing normal system tests.
40..39..38..37..36..35..34..33..32..31..

?53 2 0A FF 00 0000

P1=00000002 P2=00000028 P3=00002712 P4=00D40077 P5=00000001
P6=FFFFFFFF P7=00000000 P8=00000000 P9=00000000 P10=20051CE0
r0=00000007 r1=20140110 r2=6954A9F0 r3=0002FF6A r4=6954A9FA
r5=2004E8F9 r6=00018678 r7=000186C8 r8=0001ADB2 ERF=80000000
30..29..28..27..26..25..
24..23..22..21..20..19..18..17..16..15..14..13..12..11..10..09..
08..07..06..05..04..03..
Normal operation not possible.

Does anybody else have this problem? It seems to be random and not easily
reproducible. It happens once every dozen restarts or so, but when it does
happen I can just restart simh and it'll work fine the next time.

It's a bit annoying because it makes the auto-restart of my emulated VMS
system fail.

Bob
Mark Pizzolato
2018-06-23 18:55:55 UTC
This problem is not unique to you. I have not been able to reproduce the problem with enough debug information to track down the details of the cause. The problem report relates to the boot ROM's internal clock/timer diagnostic. In all of my test cases, with debug information being gathered, Heisenberg effects mean that the problem is never seen. :(

You could sidestep the problem if you:

1) Don't enable auto boot
AND
2) Put 'EXPECT ">>>" SEND "BOOT DUA0\r"' in your configuration file, before your existing BOOT command.

This will work since, although the "Normal operation not possible" message has been issued, the error is not actually severe enough to inhibit system operation. It just inhibits autoboot.
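
To make the logic of that rule concrete, here is a minimal Python sketch of the decision it encodes: once the console prompt shows up, send the boot command, whether or not the self test complained first. This is only an illustration, not simh code; the helper name react_to_console and its return format are made up for the example, and only the prompt, the boot command and the failure message come from this thread.

```python
PROMPT = ">>>"                    # console prompt quoted in the workaround
BOOT_COMMAND = "BOOT DUA0\r"      # command quoted in the workaround
FAILURE_TEXT = "Normal operation not possible."


def react_to_console(transcript: str) -> dict:
    """Hypothetical helper: decide what to do with the console output so far."""
    # The failure message is only recorded; it does not block the boot.
    self_test_failed = FAILURE_TEXT in transcript
    # Boot manually as soon as the console is sitting at its prompt.
    at_prompt = transcript.rstrip().endswith(PROMPT)
    return {
        "self_test_failed": self_test_failed,
        "send": BOOT_COMMAND if at_prompt else None,
    }
```

The point of the design is that the failure message and the prompt are handled independently, which matches the observation that the error only inhibits autoboot.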

I haven't given up hope of eventually capturing the detail of the cause and thus fixing it. Once I give up hope, I'll revert to the original simh boot ROM strategy that had this clock/timer test disabled.


- Mark

Robert Armstrong
2018-06-23 19:40:42 UTC
Post by Mark Pizzolato
Heisenberg effects mean that the problem is never seen
Yep, the uncertainty principle applies to software as well...

I'm just curious - do we (well, I don't, but somebody) have source
listings for the KA655 EPROM? Supposedly one of the other hex numbers in
the "?53 2 0A FF 00 0000" message identifies the exact subtest that's
failing. It might be interesting to know that.
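
For anyone who wants to collect and compare these failure reports, a small Python sketch like the following could pull the numbers out of the console text. It deliberately assigns no meaning to any field, since which number identifies the subtest is exactly the open question; the function name parse_selftest_dump and the returned layout are just assumptions for the example.

```python
import re


def parse_selftest_dump(text: str) -> dict:
    """Hypothetical helper: split a self-test failure dump into raw fields."""
    header = None
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # The "?53 2 0A FF 00 0000" line is kept as uninterpreted tokens.
        if line.startswith("?") and header is None:
            header = line[1:].split()
        # NAME=XXXXXXXX pairs such as P1=..., r0=..., ERF=...
        for name, value in re.findall(r"\b([A-Za-z]+\d*)=([0-9A-Fa-f]{8})\b", line):
            values[name] = int(value, 16)
    return {"header": header, "values": values}
```

Feeding it the dump above yields the six header tokens plus twenty name/value pairs, which makes it easy to diff one failure against another.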

Thanks,
Bob
Mark Pizzolato
2018-06-23 21:11:46 UTC
The listings we have are for a version that is close to, but not exactly,
the one the ROM was built from. Even with exact listings, though, they
don't actually contain a mapping of the test numbers (and subtest numbers)
to specific tests. The assignment of tests is magically built by small
macros that briefly define PSECTs that each test contributes a little
content towards. There must be some nice tool that can decode the resulting
collection of test numbers and map them back to particular tests. Maybe a
special interpretation of data in the link map might help get there, but
how it is done isn't particularly obvious.

ROM debugging is a very tedious process. You poke around in the
ROM binary and disassemble a section. Then you locate particular instruction
sequences in the listing. Or you do the opposite (start from the listing and
look for instruction patterns in the ROM). You then put simh breakpoints in
some ROM locations that you want to stop at. The presence of breakpoints
adds to Heisenberg variances since simh breakpoints are implemented by
checking instruction fetch addresses rather than modifying the target code to
insert actual breakpoint instructions (which are architecturally meaningful
within the VAX simulation). For most things Heisenberg doesn't come into
play, but timing calibration stuff is the exception.
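
As a rough illustration of the fetch-address approach, here is a toy Python sketch. It is not simh's code and the three-instruction machine is invented for the example (it has nothing to do with the VAX); it only shows the general idea that the breakpoint test is extra host-side work at instruction fetch, while the target program itself is never modified.

```python
def run(program, breakpoints, max_steps=1000):
    """Toy interpreter: stop when the next fetch address is a breakpoint."""
    pc = 0
    acc = 0
    checks = 0
    for _ in range(max_steps):
        checks += 1                  # host-side check done at every fetch
        if pc in breakpoints:        # compare the fetch address, patch nothing
            return {"stopped_at": pc, "acc": acc, "checks": checks}
        op, arg = program[pc]
        if op == "ADD":
            acc += arg
            pc += 1
        elif op == "JMP":
            pc = arg
        elif op == "HALT":
            return {"stopped_at": None, "acc": acc, "checks": checks}
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit reached")
```

In this toy the program list is identical before and after a run with breakpoints set, but the host does a little more work per instruction, which is the kind of thing that can shift timing-sensitive code.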

Anyway, this problem is benign enough, and the lack of solid debug data
has it sitting there unsolved.

- Mark
Alan Frisbie
2018-06-23 19:57:05 UTC
So I'm not alone! I have seen the same thing, but since it
wasn't causing me any problems I didn't bother reporting it.

Alan Frisbie
Zane Healy
2018-06-24 05:32:30 UTC
Sounds like I’ve been lucky. My main SIMH VAX uses a fairly current MicroVAX 3900 build, mainly for virtual tape support; the others use the prebuilt 3.8 build available in Ubuntu and Raspbian.

Zane
_______________________________________________
Simh mailing list
http://mailman.trailing-edge.com/mailman/listinfo/simh