Discussion:
[Simh] DEC Alpha Emulation
Zane Healy
2018-02-02 04:55:37 UTC
Permalink
I’m in the process of slowly updating my DEC Emulation website, which has been largely stagnant since 2007. The first page I’m spending some significant time on is the DEC Alpha page, as I’m curious to see what the current options are. In reviewing things, I notice that Alpha emulation was worked on at some point for SIMH (probably around 2006). What is the current state of that work? I had a quick google, and the most popular hit seems to be mirrored copies of my site.

http://www.avanthar.com/healyzh/decemulation/Alpha.html

Thanks,
Zane
Vorländer, Martin
2018-02-02 09:16:06 UTC
Permalink
Zane,

just had a look at your Alpha page, and noticed that you missed an emulator:
FreeAXP / Avanti from Migration Specialties, http://www.migrationspecialties.com/
This is where the ES40 emulator went, AFAIK.

cu,
Martin

_______________________________________________
Simh mailing list
***@trailing-edge.com
http://mailman.trailing-edge.com/mailman/listinfo/simh
Zane Healy
2018-02-02 17:10:00 UTC
Permalink
Thanks, I thought I had that one listed. It’s there now, and the FPGA page is closer to being updated.

I’ll have to decide which I’m brave enough to tackle next, the VAX, PDP-11, or PDP-10 page. The PDP-10 one will be the most challenging one to update.

Zane
Angelo Papenhoff
2018-02-03 11:43:26 UTC
Permalink
Post by Zane Healy
Thanks, I thought I had that one listed. It’s there now, and the FPGA page is closer to being updated.
I’ll have to decide which I’m brave enough to tackle next, the VAX, PDP-11, or PDP-10 page. The PDP-10 one will be the most challenging one to update.
Why is the page for the 10 so hard?
Lars is quite active in collecting and documenting things, check out
http://gunkies.org/wiki/PDP-10

aap
Zane Healy
2018-02-03 20:38:38 UTC
Permalink
Post by Angelo Papenhoff
Why is the page for the 10 so hard?
Lars is quite active in collecting and documenting things, check out
http://gunkies.org/wiki/PDP-10
aap
The page for the PDP-10 was the best of the pages I have, and is the original page for the site. As a result it needs a lot of updating. It looks like even the bitsavers links need fixing.

I just took a look at Lars's page; it will definitely help, and I need to add it to my list of links. I also see that he has updated the link to my page. :-)

Zane
Hittner, David T [US] (MS)
2018-02-02 18:01:03 UTC
Permalink
Well, Migration Specialties is where the developer of ES40 went, anyway. :-)

I'm working on the SIMH Alpha simulator. It is quite a complex beast, where the firmware required to power up the system to the chevrons (>>>) is an operating system of small Linux complexity even before you can try to boot OpenVMS, TRU64 or WinNT. I think the simulation currently runs some 56 billion instructions before encountering a missing hardware simulation, and it's nowhere near the chevrons - and hasn't even built the OS interface block or the FW-level device tables yet.

If anyone would like to help with the Alpha development, I wouldn't say no. It takes a long time to figure out what the RISC firmware is trying to accomplish. :-)
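
To make the "runs until it hits a missing hardware simulation" pattern concrete, here is a toy sketch in Python. It is not SIMH code and models nothing about the Alpha: `Bus`, `MissingDevice`, `run`, and the address ranges are hypothetical names chosen for illustration. Each trace entry stands in for one instruction that touches a bus address, and the loop counts how many complete before an address with no device model is reached.

```python
from dataclasses import dataclass, field


class MissingDevice(Exception):
    """Raised when the simulated firmware touches an address with no device model."""


@dataclass
class Bus:
    # Maps (start, end) address ranges to handler callables (hypothetical layout).
    devices: dict = field(default_factory=dict)

    def read(self, addr):
        for (lo, hi), handler in self.devices.items():
            if lo <= addr < hi:
                return handler(addr - lo)
        raise MissingDevice(f"no device model at {addr:#x}")


def run(bus, trace):
    """Replay bus addresses, one per 'instruction', until a gap is hit.

    Returns (instructions_completed, reason), where reason is None if the
    whole trace ran.
    """
    executed = 0
    for addr in trace:
        try:
            bus.read(addr)
        except MissingDevice as exc:
            return executed, str(exc)
        executed += 1
    return executed, None
```

The useful property is that the stop reason names the exact address that lacked a model, which is the starting point for working out which piece of hardware the firmware expected to find there.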

Dave

Zane Healy
2018-02-02 19:49:43 UTC
Permalink
Thanks for the information Dave! I’m glad to hear it’s still being worked on. I can well imagine what a pain the firmware is, especially if you’re trying to support both the SRM and ARC firmware. I wish I could help. From my standpoint, getting SIMH to boot OpenVMS would be a big deal.

Are you working to support a specific platform? I’m curious since not every system supports SRM or ARC, and many of the PCI boards were only supported by Windows NT.

Would you like me to put a note on the webpage for anyone interested in helping to contact you via the SIMH mailing list?

Zane
Hittner, David T [US] (MS)
2018-02-02 21:06:06 UTC
Permalink
The Alpha target is the Digital Personal Workstation (PWS) 500au [codename: Miata], which qualifies as a single user workstation from a minimum licensing perspective.

Bob Supnik had one in his possession, and gave it to me to do comparisons. The Miata has both ARC and SRM firmware "full-flash", but the initial goal is to get SRM firmware to run OpenVMS and TRU64 using the serial console. Planned PCI boards are: DE500 ethernet, KZPAA(narrow) and KZP??(wide) SCSI disk/tape controllers, and the Digital PCI-to-PCI bridge which are all supported by every OS. If someone writes a VGA controller in the future, it should be one supported by every OS, like the fairly basic S3 TRIO32/64+.

Windows NT support on ARC firmware could come later, but would require the VGA card emulation, unless you wanted to run NT "headless" over the serial port, which is allowed by the NT specification. ARC firmware is much harder due to having an x86 register emulator, and an INT 10 BIOS emulator, as well as running everything in 32-bit mode.

Yes, I would appreciate it if you would put something on your web page to have interested coders contact me. I could use the help.

Dave

Clem cole
2018-02-03 01:44:33 UTC
Permalink
Be careful. The comment that many of the PCI boards were NT-only, while true, is somewhat misleading. For instance, both VMS and Tru64 (and FreeBSD, for that matter) will all boot and run fine with an Adaptec 1542 controller (the 500au in my office had only that controller in it).

The SPD never spoke of it because neither TruCluster nor VMScluster could properly do failover on them, due to issues in the Adaptec microcode which they would not fix because the mass market never cared. But the default controller for NT/Alpha was just that controller.

I mention this because, while the QLogic controller is the official one, for getting SIMH running it is going to be a PITA, and the Adaptec might be a better first start.

Send me email offline if you want more info btw.

Sent from my PDP-7 Running UNIX V0 expect things to be almost but not quite.
Zane Healy
2018-02-04 00:28:32 UTC
Permalink
Now that you mention it, I was mainly thinking about the video cards, which aren’t fully applicable here. Though Network was another area besides disk where I remember it being a bit of a challenge. I wish I’d known that an Adaptec 1542 was an option, finding a card I could afford was a nightmare about 20 years ago. The Adaptec cards were “dime a dozen” around here back then. I had found a nearly new PWS 433a in a used computer shop, that didn’t know what it was worth, and with some effort got VMS running on it. It was an impressive system, even after one of the card slots went bad, it was rock solid.

It looks like MAME may already have "Adaptec AHA-1542{,C,CF} SCSI Controller" and "PCI Number Nine 9FX Vision 330 2.03.10 (S3 Trio64)" support.

The S3 would be a good option for VGA emulation, and I suspect that the Elsa Gloria Synergy would be also (though I suspect it would take a lot more effort). The Elsa Gloria supports a higher screen resolution, and IIRC, more colours.

I’m reminded of how I loved the Matrox Millennium II 4MB cards for PCs back then, as they were supported by basically every x86 OS (including OPENSTEP and BeOS).

Zane
Clem Cole
2018-02-05 17:01:24 UTC
Permalink
Post by Zane Healy
I had found a nearly new PWS 433a in a used computer shop, that didn’t
know what it was worth
Likely to have been one of the NT/Alphas in the wild. It was a dirty
little secret in ZKO and MRO that they worked fine with Tru64 and OpenVMS.
Purely a cost issue. It was cheaper and faster to get the NT/Alpha HW,
so they became popular internal development systems, with support for the
Adaptec chipset (whose number I now forget). But marketing never accepted
it because of the failover issue for clusters.

I never understood that. My argument was that nobody was going to put a
$1M cluster at risk with a $100 PCI card. We could have just stated in
the SPD that the Adaptec chip set was supported on single (small) systems
such as Workstations, DS10, DS20 etc... But I lost that war.
Timothe Litt
2018-02-05 18:05:37 UTC
Permalink
Post by Clem Cole
But marketing never accepted because of the failover issue for
clusters.
I never understood that. My argument was that nobody was going to
*knowingly* put a $1M cluster at risk with a $100 PCI card. We
could have just stated in the SPD that the Adaptec chip set was
supported on single (small) systems such as Workstations, DS10, DS20
etc... But I lost that war.
The *word* you left out was probably the issue. It is trivially easy to
add a workstation to a cluster, and neither VMS nor Tru64 verify that
hardware meets requirements when a node joins a cluster.  So it's not
easy to dismiss the scenario that someone buys a workstation that is not
intended for cluster use; then circumstances change and it turns up in
your cluster.  And it "just works" for a long time, until you hit the
corner case.  In your $M enterprise, stuff gets passed around and
information gets lost as ownership changes at the periphery.  (The way
things moved about on the ZK engineering clusters  is typical.  Despite
attempts to control, people needed to do their jobs & configuration
limits were ignored/fudged.)  *We just didn't make adding a node to a
cluster difficult and mysterious enough.*  Plus, profit is usually a
percentage of user cost.  More cost => more profit.  (Assuming you make
the sale.) 

So product management's conservatism is understandable, given the risk
that the SPD won't be re-read when the function of a node changes, and
the resulting data corruption being laid at DEC's feet.  Engineers
aren't known for reading the instructions - and IT people who are
under-staffed and under pressure less so.  SPDs are even less appealing
- they tend to be read at initial purchase - and subsequently only when
the finger pointing starts.  And that's after customer services has
spent a lot of time and money diagnosing the problem.

These days, we have gates with names like "network admission control";
they won't allow a VPN or Wireless client to connect to a network unless
software is up-to-date.  Something along those lines that also included
hardware and firmware would be a useful addition to clusters - assuming
you could do everything quickly enough to prevent cluster transition
times from becoming unacceptable.  It's non-trivial; the nasty cluster
cases have to do with multi-ported hardware, so you need to check
firmware revisions & bus configurations on all ports for compatibility. 
With all the permutations of the controllers being on stand-alone
systems, cluster nodes not yet joined, joined cluster nodes, and
redundant controllers on the same node.  And interconnects: CI, NI, MC,
DSSI, SCSI.   And hot swap, which can upgrade or downgrade a controller
on the fly.
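
As a rough sketch of why such a gate is non-trivial, the check below treats admission as a pairwise question: every controller port that would share a bus after the join has to be compatible with every other port on that bus, not just qualified on its own. Everything here is an assumption for illustration: `Port`, `COMPATIBLE`, `admission_problems`, `may_join`, and the model and firmware strings are hypothetical, and neither VMS nor Tru64 is claimed to have worked this way.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Port:
    node: str       # node the controller sits in
    model: str      # controller model (hypothetical names)
    firmware: str   # firmware revision string


# Hypothetical compatibility table: unordered pairs of (model, firmware)
# assumed to coexist safely on one shared bus. A one-element set means
# "the same model and revision on both ends".
COMPATIBLE = {
    frozenset({("ctrl-a", "1.2")}),
    frozenset({("ctrl-a", "1.2"), ("ctrl-a", "1.3")}),
}


def admission_problems(ports_on_bus):
    """Return every pair of ports on the bus that is not known to be compatible."""
    problems = []
    for a, b in combinations(ports_on_bus, 2):
        pair = frozenset({(a.model, a.firmware), (b.model, b.firmware)})
        if pair not in COMPATIBLE:
            problems.append((a, b))
    return problems


def may_join(existing_ports, joining_ports):
    """Admit the joining node only if no incompatible pair would result."""
    return not admission_problems(list(existing_ports) + list(joining_ports))
```

Even in this toy form the table grows with the square of the number of distinct (model, firmware) combinations, and it would have to be repeated per interconnect and re-evaluated on every hot swap, which is the cost being weighed against cluster transition time.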

So, the counter-argument becomes "how much engineering should be
invested in allowing a customer to save $100 on the cost of a PCI
card?"  And the easy answer is one of "none" and "it's not a priority". 
Ship only cluster capable hardware, and "problem solved".  Not all
engineering problems are best solved with engineering solutions.  But
I'll grant that the engineering would be a lot more fun :-)

An imperfect analogy would be selling cars without windshield wipers to
people who promise that they never drive in the rain.  It's in the
nature of things that someday the rain will come.  Or the car will be
passed on.  Of course, missing wipers are a lot more obvious than what
kind and revision of a PCI card is buried in a cardcage :-)

A better analogy is an exercise left to the reader.
Clem Cole
2018-02-05 18:36:31 UTC
Permalink
Post by Timothe Litt
The *word* you left out was probably the issue.
Fair enough ..
Post by Timothe Litt
*We just didn't make adding a node to a cluster difficult and mysterious
enough.*
Right ;-)
Post by Timothe Litt
So product management's conservatism is understandable, given the risk
that the SPD won't be re-read when the function of a node changes, and the
resulting data corruption being laid at DEC's feet.
Point taken, but DEC used the SPD as its primary defense for exactly this
type of problem. It was the 'legal' definition of what was and was not
allowed. But as you point out, that behavior does not always make for
happy customers or sr managers.
Post by Timothe Litt
So, the counter-argument becomes "how much engineering should be invested
in allowing a customer to save $100 on the cost of a PCI card?" And the
easy answer is one of "none" and "it's not a priority". Ship only cluster
capable hardware, and "problem solved". Not all engineering problems are
best solved with engineering solutions. But I'll grant that the
engineering would be a lot more fun :-)
I hear you and you are 100% correct -- that is exactly the way it would
have been (was) handled. The truth is that at least Tru64 (I think it was
Fred Knight - Mr. SCSI) had code that detected when your SCSI bus was being
shared. It would have been easy to add a side lookup to check the
controller being used and, if it was not in the official table, produce a boot
message saying -- "*shared bus with unsupported SCSI controller, please
remove sharing or replace controller and reboot.*"

But I could never get marketing to accept that.

But I agree, the fact is that somebody would have tried to do it. My point
was that if we detected it (which was not that hard), then we could
have at least said something. And in practice if you still ignored it and
it was in all those system logs, it would have been pretty easy to say to
the end customer, *we told you not to do that*.
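
A minimal sketch of that side lookup, assuming the shared-bus detection already exists and hands over a yes/no answer. The table contents, the controller names, and the `check_controller` function are hypothetical; only the wording of the message comes from the text above. This is not the Tru64 code.

```python
import logging

log = logging.getLogger("scsi_check")

# Hypothetical "official table" of controllers qualified for shared-bus use.
SUPPORTED_SHARED = {"qualified-hba-1", "qualified-hba-2"}

WARNING = ("shared bus with unsupported SCSI controller, please remove "
           "sharing or replace controller and reboot.")


def check_controller(name, bus_is_shared):
    """Return the warning text if this combination is unsupported, else None.

    The warning is also logged, so it ends up in the system logs even if
    nobody reads the console at boot.
    """
    if bus_is_shared and name not in SUPPORTED_SHARED:
        log.warning("%s: %s", name, WARNING)
        return WARNING
    return None
```

Note that the check stays silent for an unlisted controller on an unshared bus, which matches the proposal of supporting such cards on single systems while flagging them only when sharing is detected.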

Then again, to your point - I know of a case where a VPN manufacturer has
told its customer not to configure their VPNs the way the customer does, but
said customer's IT dept insists on doing it differently anyway - much to
the distress of the internal development engineers (@ the customer). In
this case, it's a switch: the customer is being more conservative (and
silly) than the VPN manufacturer has told it to be, but breaking a lot of
things in the process that should work. The local engineers b*tch because
things break that should not -- and they find out it is a local issue caused by
their own IT dept. But it is a case where
the manufacturer's recommendations are not being considered, and things would
work as expected (and tested by the manufacturer) if the SW was configured
as designed.

Clem



Timothe Litt
2018-02-05 20:10:12 UTC
Permalink
Post by Clem Cole
Point taken, but DEC used the SPD as its primary defense for exactly
this type of problem. It was the 'legal' definition of what was and
was not allowed. But as you point out, that behavior does not always
make for happy customers or sr managers.
I started in the field, and consulted with the corporate flying squads. 
The SPDs' value as legal definition was of more interest to lawyers &
junior product managers than to those at the sharp end of the spear. 
Happiness, even at expense above and beyond legal technicalities, brought
more business than sticking to the letter of the law.  Unhappiness was
very, very expensive.  I have stories that run both ways...
Post by Clem Cole
The truth is that at least Tru64 (I think it was Fred Knight - Mr.
SCSI) had code that detected when your SCSI bus was being shared.
It would have been easy to add a side lookup to check the
controller being used and, if it was not in the official table,
produce a boot message saying -- "/shared bus with unsupported
SCSI controller, please remove sharing or replace controller and
reboot."/
But I could never get marketing to accept that.
I wish it were that simple.  In this case, Marketing's intuition covered
some technical challenges.  I had many a talk with Fred when I was in
the Tru64 group.  That 'table' would have to deal not only with
controller types, but with compatibility of firmware versions for every
device on the bus.  And the permutations of what worked (and didn't)
weren't static.  The sys_check maintainer made some efforts, as did the
SPEAR folks in CSSE.  But everything was a moving target. 

The trivial case of "don't ever use this controller in a cluster" isn't
all that hard to blacklist.  Of course, when the foobar-plus comes out
with a different device ID, but the same bug, you have to blacklist it
too.  Before any customer finds one at "American Used Computers" (Kenmore
Square, before e-bay :-)  And don't forget that to find another
controller on the bus, you have to enumerate the bus.  This can have
side-effects with "bad" controllers.  The bugs weren't all limited to
fail-over.  IIRC tagging and command queuing had issues; at least one
controller created parity errors (and some undetected ones).
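
To make the blacklist problem concrete, here is a minimal sketch in Python. It is purely illustrative: the device IDs and the function name are made up, and nothing here reflects actual Tru64 driver code. It only shows why a lookup keyed on device ID misses a re-badged part that carries the same bug.

```python
# Hypothetical blacklist keyed on controller device ID.
# The IDs are invented for illustration; they are not from any real DEC table.
BLACKLISTED_CONTROLLER_IDS = {"foobar"}


def controller_is_blacklisted(device_id: str) -> bool:
    """Return True if this controller's device ID is on the blacklist."""
    return device_id in BLACKLISTED_CONTROLLER_IDS


# The original part is caught...
print(controller_is_blacklisted("foobar"))
# ...but the "plus" variant, with the same bug and a new device ID, is not,
# until someone remembers to add it to the table.
print(controller_is_blacklisted("foobar-plus"))
```

The table is only as good as its last update, which is the maintenance burden described above.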

But maintaining a useful whitelist - with all the churn in the SCSI
space - would be a nightmare.  Disks have firmware & HW revs. 
Controllers too.  Blocking all 3rd party disks (despite the frequent
firmware issues) isn't viable.  Don't forget CD/DVD, tape, and even
ethernet.  Even getting customers to install patches was hard (patch
quality and interactions were among my issues); patching to keep up with
hardware/firmware revs wasn't going to fly.  And you need this
information before you have a file system; preferably in the boot
driver.  So no, not a config file.  Maybe SRM console environment
variables...  Even in the relatively controlled environment that DEC was
able to impose, SCSI should have been called CHAOSnet - except that name
was taken.
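
A rough sketch of why the whitelist is the harder half, again in Python and again under loud assumptions: the device kinds, model names, and firmware revisions are hypothetical, and a real check would have to run in the boot driver rather than in a script. The point is only that the key is (kind, model, firmware) for every device on the bus, so one firmware bump on one disk makes the whole bus unqualified.

```python
from typing import List, NamedTuple


class BusDevice(NamedTuple):
    kind: str      # e.g. "controller", "disk", "tape"
    model: str     # hypothetical model name
    firmware: str  # hypothetical firmware revision


# Hypothetical set of qualified (kind, model, firmware) combinations.
QUALIFIED = {
    ("controller", "good-ctrl", "1.2"),
    ("disk", "disk-a", "A01"),
}


def unqualified_devices(bus: List[BusDevice]) -> List[BusDevice]:
    """Return every device on the bus that is not in the qualified set."""
    return [dev for dev in bus if tuple(dev) not in QUALIFIED]


bus = [
    BusDevice("controller", "good-ctrl", "1.2"),
    BusDevice("disk", "disk-a", "A01"),
    BusDevice("disk", "disk-a", "A02"),  # same disk model, newer firmware rev
]

# Only the disk with the unlisted firmware revision is reported.
print(unqualified_devices(bus))
```

Every new hardware or firmware revision means a new entry in `QUALIFIED`, shipped to customers before they plug the device in.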

Worse, once you produce one error message in a problem space (e.g.
invalid HW config), suddenly NOT producing an error for every other case
that doesn't work becomes a bug.
Post by Clem Cole
My point was that if we detected it (which was not that hard),
then we could have at least said something. And in practice, if you
still ignored it and it was in all those system logs, it would have
been pretty easy to say to the end customer, /we told you not to do that/.
By the time it's in a system log, it's too late.  The logging disk is
probably on the SCSI bus.

"I told you so" - not a happy strategy.

For the simple case of only two machines sharing a bus: what do you mean
by "at boot time"?  The first machine powers up, and is "alone" with a
"good" controller.  Two weeks later, the owner of the second machine
(with a "bad" one) returns from vacation and turns his on.  His dog
brought him a magazine article on clusters, so why not jump in?  It
might, maybe, manage to boot to the point of noticing the first one
without polluting its transfers.  Note that at this point, the first
machine is undoubtedly doing disk writes; packet corruption is not as
"harmless" as when you have a ROFS.  And the second machine has to touch
the first's controller to query its versions.  And to find it, it
enumerates the entire bus.  Meantime, does the first machine repeat the
boot-time check? How does it notice?

As I said, when something's wrong, logging to disk with an invalid
hardware configuration isn't going to fly.  Above the hardware level,
you're not in the cluster (yet), so how are you going to get the disk
bitmaps (and locks)?  And write to a ROFS?  Normally, these are queued
in memory (and retrieved for syslog by dmesg).  But with this
misconfiguration, the last thing you want to do is join the cluster &
remount the logging disk R/W.  So you can't log to disk.  You might want
to try to send to a network syslog - but that means you've gotten a LOT
further into kernel initialization; you have a file system, network
configuration, know where to send it, etc.  Besides the fact that your
network chip may be on the same SCSI bus, you've done a whole lot more
I/O to get this far.  With this kind of error, you want to make the test
and panic very, very early in initialization to minimize collateral
damage. 

There are many more cases to cover.  This is one of the simpler.

It's really not that simple to verify hardware configurations, once you
dig in to the problem space.  Fred's test was undoubtedly useful for
logging & cluster initialization - with supported controllers.  It might
have been a good reminder for engineering experiments.  I'd need to be
convinced that it could solve the issue that you wanted to address. 
"For every problem, there is a solution that is simple, obvious, and ...
wrong".

You're correct that some simple check at driver initialization that
stuck with console logging could probably be 80-90% effective.  But
getting the rest right, while an interesting engineering project, would
be a P.roject.  Sunshine with a slight chance of data corruption just
wasn't the DEC way :-)

As I said, a lot of fun for the engineers, but hard to justify in order
to save a few customers $100.
Clem Cole
2018-02-05 20:53:44 UTC
Permalink
I hear you. My point was it only failed the corner cases of failover, so I
think it would have made it to logs. And that should have been good
enough. Not perfect, but in practice it would have worked and it would
have simplified things immensely.

And not having the Adaptec support was in fact a real problem: it added cost
and really did not add value. My last act, I was trying to build the $1K
Alpha at the time (which I did prototype until Jessie killed it, but that's
another story). Folks said the cheapest Alpha was $5K -- well, there was a
reason. When we took a $799 [end user Radio Shack priced] Compaq K7 based
system and spliced a $200 EV6 into it and got Tru64 working (with an Adaptec
chipset on the motherboard BTW), it worked and people wanted it!!! [I
still have the motherboard at home and the EV6 from it is on my desk at
Intel].

Look, I tend to be a practical engineer. I always felt that DEC's building
things 100% fool proof, where everything had to be perfect, was really what
killed the golden goose - not Palmer et al. Not being able to understand what
needed to be "you bet the company/farm/your life" and what could be "good
enough for now and move to the next problem". I always thought that was
one of the things Roger Gourd taught me -- how to differentiate between the
two. I think DEC had that when the PDP-8 and PDP-11 were done and in the
early development of the Vax (hey, I programmed Vax serial #1 -- VMS V1.0
was buggy as could be). But with Vax becoming supreme, DEC lost its
way/believed its own hype.

Alpha and Tru64 are great examples of the problem. BTW: I loved Alpha,
bled for it, etc. Tru64 was the best UNIX implementation I have ever
used, and I am proud to be a developer of TruClusters. But it took 3 extra
years to get Tru64 out the door because it had to be perfect (and nobody
got fired for it either).

I never understood that. Every subsystem that needed to be rewritten (TTY
handler, memory, bulk I/O) did need to be done over from the original code
from OSF for the 386 and PMAX. But I always felt DEC could have shipped
OSF/1 on the Alpha pretty much as is, and started to get revenue and move
the installed base. Then, subsystem by subsystem, replaced them with
something better. I also argued with Supnik BTW (whom I adore and think
the world of): the lack of 32 bits certainly made the engineering support
easier, but again it cost us. We also basically paid the ISVs to fix
their code so it would run on 64 bit SPARC (Judy Ward's errors are still
the best I know for cleaning up 32 bit-isms. If I have old code I'm about
to port, I often run it through my Alpha with Judy's compiler to tell me what
is going to be troublesome).

Yup 32 bit support would have been messy and we would have had to have 4
versions of the libraries just like MIPS, SPARC et al. It would have been
a little ugly and not 'perfect' ... but it would have worked and been
faster to market. And ISVs would have started to get some revenue. By the
time we were 'done' - it was too little too late and folks had already
started to look for an alternative - and guess what, Winders on a 386 was
'good enough.'

As I said to Jessie et al, SW is not written on $1M computers - it's written
on the cheapest thing that gets the job done. Then moved upstream if it is
valuable.

By the time the Sr Managers took the 'cut a deal with Microsoft and get
their SW' strategy, the death spiral was well underway. And they
misunderstood, they were never going to get 'big bucks' for 'commodity sw.'
[Intel suffered that also with Itanium].

BTW: I'm watching Intel make more of the same mistakes.... sigh. As I
say to folks here, I have those tee shirts, I know how this movie ends.


Clem

BTW: the best people in Intel IT are the ***@Intel folks. They listen
and understand I can help them. So they try to help me. It's a good
arrangement. When they send me a note, I do try to help. I'm one of the
few folks in engineering that has 'update access' to their tools library
(biggest issue is it's all written in VB -- seriously, for Macs -- don't ask).
But when I find something, I do try to help. In return, I ran into an
issue last Tues with my keyboard and called them that PM. I missed the
Fedex time for Folsom, but on Thursday a new Mac was on my doorstep.
Only reason I have not switched is because I had to travel Thursday for
work - so I took them both. Got my job done and will switch systems
completely tonight (I hope).

Anyway -- back to work....
Clem Cole
2018-02-05 20:59:38 UTC
Permalink
My apologies to the list -- I did a reply all - I had intended that last
message to be sent only to Tim.
Clem
Clem Cole
2018-02-05 16:52:38 UTC
Permalink
dyslexia sucks, sorry... s/could properly/couldn't properly/
Post by Clem cole
Be careful. The comment that many of the PCI boards were NT-only, while
true, is really somewhat different. For instance, both VMS and Tru64
(and FreeBSD for that matter) will all boot and run fine with an Adaptec
1542 controller (the 500au in my office only had the same in it).
The SPD never spoke of it because TruCluster and VMScluster couldn't
properly do failover on them due to issues in the Adaptec microcode, which
they would not fix because the mass market never cared. But the default
controller for NT/Alpha was just that controller.
I mention this because while the QLogic controller is the official one, for
getting simh running it is going to be a PITA and the Adaptec might be a
better first start.
Send me email offline if you want more info btw.
Sent from my PDP-7 Running UNIX V0 expect things to be almost but not quite.
Rhialto
2018-02-06 22:53:59 UTC
Permalink
Post by Hittner, David T [US] (MS)
The Alpha target is the Digital Personal Workstation (PWS) 500au
[codename: Miata], which qualifies as a single user workstation from a
minimum licensing perspective.
I have a PWS 433au which I have installed NetBSD on. I have IDE disks
instead of SCSI disks though. Thanks to the great Miata chipset it has
wonderful DMA bugs, as I recall. I think that in the end the disk is
driven with PIO, so slow is it.
I think that I put in a generic IDE PCI card for a second disk, but I
can't really remember...

The PWS survives the AMD computer that replaced it in active use.

-Olaf.
--
___ Olaf 'Rhialto' Seibert -- Wayland: Those who don't understand X
\X/ rhialto/at/falu.nl -- are condemned to reinvent it. Poorly.