VirtualizationAdmin.com

Making registry changes to MaxMpxCt and MaxWorkItems when you have Windows 95 machines on your network

Rick Mack explains why you need to read the fine print and tread lightly when changing the MaxMpxCt and MaxWorkItems registry values on a network that still has Windows 95 machines.

Hi,
I had a bit of fun today finding out how to subtly break a network. I thought the whole thing might bear repeating just in case someone else is stupid enough to make the same mistakes.

First, a bit of history. Our client has 4 TSE/MetaFrame systems (Compaq 5500, quad CPU, 1 GB RAM) that run a bunch of banking applications talking to a back-end AS/400. Most of the thin clients are either NCD Xploras or Thinstars (approx 200), with a few PC clients (about 30). The systems have been performing very well with up to 70 users per server, but in the last 2 weeks there have been periods where the servers slowed nearly to a crawl for no obvious reason. As the systems slowed down, CPU utilization actually went down, and there was relatively little paging activity (1.8 GB page file) with about 350-400 MB of RAM still free.

After a bit of investigation, I hit a couple of MS technotes (Q191370 and Q232476) which seemed to describe exactly what I was seeing. In essence, the TSE/MetaFrame systems have an SMB I/O request queue of finite size which, when full, slows server network I/O to a crawl. The default queue size (defined at the file/print server end as MaxMpxCt) is 50, which is frankly inadequate for busy TSE systems. The fix was fairly easy: increase the MaxMpxCt and MaxWorkItems counts on the file/print server. Following the example in Q232476, I added a MaxMpxCt value of 1024 (decimal) and a MaxWorkItems value of 4096 (decimal). It worked brilliantly, with no evidence of server slowdowns today unless CPU utilization was really high.

The only trouble was that a bunch of older Windows 95 machines were hanging when users tried to log on to the domain, while Windows NT workstations, some newer Win 95 machines and the TSE/MetaFrame systems were fine. When we checked the Win 95 TCP/IP setup, quite a few of them didn't appear to have gotten their DHCP IP addresses. Those that did would just hang later on. We'd installed a new switch into the network the night before, so it seemed the obvious culprit. Extensive troubleshooting proved there was nothing wrong with the switch, and the DHCP server was fine (we also checked the DHCP database with jetpack).

I speculated that we could have a network timing problem since the new switch was certainly speeding things up, so we upgraded the TCP/IP stack on some of the Windows 95 machines, made sure they were running the latest NIC drivers, and even tried changing a few of the NICs, with limited results. Win 95 worked fine as long as you didn't log into the network. You could use an ICA client on the Win 95 systems to connect to the application servers, and ping etc. worked fine. So maybe there was a selective problem with the NetBIOS side of things, or the license services were screwy on the PDC? It was obvious that as soon as you tried to log onto the network, Explorer stopped responding and that was it. But rebooting the PDC and BDC and promoting the BDC had no effect, and things were starting to get a bit uncomfortable.

You see, most people had NCs, except for the managers, who had PCs. The penny dropped when someone accidentally mistyped a login user name and got an "invalid username" message. The domain controllers were working! The login was actually hanging when the system was trying to mount the user's home drive on the F/P server. But the only thing I'd changed was the MaxMpxCt and MaxWorkItems values as per the M$ example. When I went back and re-read the technotes, I spotted the mention of Win 95/98 possibly hanging if you changed MaxMpxCt, along with a reference to Q232890. There were a couple of warnings that changing MaxMpxCt could affect Windows 95 machines but, to be honest, I had completely ignored them, because Microsoft surely wouldn't give you example values that would blow things up, especially since the ceiling on MaxMpxCt was 64K.

Technote Q232476 describes the problem. Win 95/98 allocates an 8-bit integer to the SMB I/O request queue size, and any MaxMpxCt value larger than hex FF simply screws up the redirector when it tries to figure out the queue size. The fix was to delete the MaxMpxCt value so that it defaulted back to 50. Since I'd had good reasons to add and extend this value, this wasn't really well received. A bit more research came up with another technote that recommended a MaxMpxCt of 510 (dec, 0x1fe) for Windows 95, but this was obviously at variance with Q232476, so I decided to play it safe and use a MaxMpxCt of 254 (0xfe) and a MaxWorkItems of 1016. This is still about 5 times the default, so it will still be an improvement for the TSE systems, and those Windows 95 machines can continue their squalid insignificant existence without bothering me.
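The arithmetic behind those values can be checked with a minimal Python sketch. It only illustrates the 8-bit reading described above; the helper name `fits_in_byte` is made up for this example and does not come from any technote.

```python
# Sanity check of the values discussed above, assuming the limit really is
# an unsigned 8-bit field (0x00..0xFF) on the Windows 95/98 side.

DEFAULT_MAX_MPX_CT = 50  # default quoted in the text


def fits_in_byte(value):
    """True if value can be stored in an unsigned 8-bit integer."""
    return 0 <= value <= 0xFF


for candidate in (1024, 510, 254):
    print(candidate, hex(candidate), "fits in 8 bits:", fits_in_byte(candidate))

max_mpx_ct = 254
max_work_items = 1016
print("MaxWorkItems / MaxMpxCt =", max_work_items / max_mpx_ct)
print("increase over default   =", round(max_mpx_ct / DEFAULT_MAX_MPX_CT, 2))
```

Under that reading, 1024 and 510 both overflow the field while 254 does not, and 1016 is exactly four times 254.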

So what did I learn? Let's see: I'm going to read the fine print; when M$ warns you that something might go wrong, it really can; and what a brilliant way to get rid of Windows 95 on a network.

Jen Madaffer (jmadaffer@republictech.com) came up with this great additional information that you should know!
Subject: New discovery on MaxMpxCt and MaxWorkItems

I had a lot of fun with these counters today. I wanted to share a very interesting tidbit that my colleague, Dean Partridge, figured out about these counters.

Thank you Rick for taking the time to post the whole story! Your insight is what led Dean to this discovery.

As noted below, it is necessary to increase MaxMpxCt and MaxWorkItems on the local file/print server for TSs to run properly. However, you can easily get into trouble if you set these numbers wrong, because it can break your Windows 95 users, i.e. they will no longer be able to log in to the domain. Back when Rick posted this, I already had the reghack in place and just considered myself lucky for not having my 95 machines break when I implemented the fix. The curious part was that I had set MaxMpxCt to 5000! How is it that 5000 worked for me but 1024 was too large for Rick's client? I figured it had to do with the individual environments being different. I was wrong.

Yesterday, in an effort to improve my TS performance (they slow down then crash periodically) I increased my MaxMpxCt to 8192 and the MaxWorkItems to 32768. This morning, any 95 user anywhere in the corporation that mapped a drive to that file server was unable to login. So I put it back to 5000 and 15000 and rebooted. All 95 machines were fine again.

Above Rick says "Win 95/98 allocates an 8 bit integer to the SMB I/O request queue size and any MaxMpxCt value larger than hex FF simply screws up the redirector". That's not exactly true. We found that 95 just accepts the LAST TWO digits of the hex value that is passed to it (that is, the low-order byte). Check it out:

  • 50 in hex is 32
  • 1024 in hex is 400
  • 5000 in hex is 1388
  • 8192 in hex is 2000

That is why 5000 worked for me but 8192 didn't! When I set it to 8192, the 95 machine was told it was only allowed to have "00" connections to the server! We tested this in our test lab... I set MaxMpxCt as high as 65000 and it still worked! 65000 is FDE8 in hex.
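The "last two hex digits" observation amounts to masking the value with 0xFF. Here is a minimal Python sketch of that reading, applied to the values mentioned in this thread; it models the hypothesis only, not the actual Windows 95 redirector code.

```python
# Low-order byte of each MaxMpxCt value mentioned in the thread, assuming
# (as observed in the lab test above) that Windows 95 keeps only the last
# two hex digits of the value it is given.

TESTED = [50, 254, 1024, 5000, 8192, 65000]


def low_byte(value):
    """Return the least significant 8 bits (last two hex digits) of value."""
    return value & 0xFF


for v in TESTED:
    print(f"{v:>6} = 0x{v:04X} -> low byte 0x{low_byte(v):02X} ({low_byte(v)})")
```

On this model 1024 and 8192 both leave a low byte of 0x00, while 5000 leaves 0x88 (136) and 65000 leaves 0xE8 (232), which matches which settings hung the 95 machines and which did not.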

DISCLAIMER: this proved true in my lab but I make no claims as to whether it would work anywhere else for anyone else!

It is interesting that MS's articles SUGGEST values that end up with a hex 00 on the 95 machine.

To sum up, when setting MaxMpxCt I will always be sure to look at the hex number it creates and make sure the last two digits are large enough to service a Windows 95 machine, then make MaxWorkItems four times the MaxMpxCt value.
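That rule of thumb could be sketched as a small Python helper. The function name `plan_values` and the `min_low_byte` threshold of 50 (the server default) are assumptions for illustration; the thread does not say how large the low byte has to be. The sketch also does not check any upper limit on MaxWorkItems, since none is stated here.

```python
# Hypothetical helper for the rule above: reject a MaxMpxCt whose low byte
# is too small for Windows 95 clients, then derive MaxWorkItems as 4x.

MAX_MPX_CT_CEILING = 0xFFFF  # the "64K" ceiling mentioned earlier
WORK_ITEMS_FACTOR = 4        # rule of thumb from the summary above


def plan_values(max_mpx_ct, min_low_byte=50):
    """Return (MaxMpxCt, MaxWorkItems) or raise ValueError.

    min_low_byte is an assumed threshold, not a documented one.
    """
    if not 1 <= max_mpx_ct <= MAX_MPX_CT_CEILING:
        raise ValueError(f"MaxMpxCt {max_mpx_ct} is outside 1..{MAX_MPX_CT_CEILING}")
    low = max_mpx_ct & 0xFF  # what a Windows 95 client would see, on this model
    if low < min_low_byte:
        raise ValueError(
            f"MaxMpxCt {max_mpx_ct} (0x{max_mpx_ct:X}) has low byte {low}, "
            f"below the assumed minimum of {min_low_byte}"
        )
    return max_mpx_ct, WORK_ITEMS_FACTOR * max_mpx_ct


if __name__ == "__main__":
    for candidate in (254, 5000, 8192):
        try:
            print(candidate, "->", plan_values(candidate))
        except ValueError as err:
            print(candidate, "rejected:", err)
```

With these assumptions, 254 and 5000 are accepted and 8192 is rejected because its low byte is zero.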

Jen

