[Solved! See the update at the bottom of the post.]
Over the years, I’ve spent more hours than I care to think about delving into the innards of network connections. Sometimes, the solution to slow network throughput is as simple as swapping a cable or updating a driver. But sometimes the problem is more baffling.
It’s so baffling, in fact, that I’m posting this here in hopes that a networking expert (maybe even someone from HP, Intel, or Microsoft) will be able to explain exactly what’s happening.
Over the weekend, I picked up a new HP Pavilion Elite m9600t with a Core i7-920 processor. I wiped away the messy Windows Vista installation and replaced it with a clean copy of Windows 7 Ultimate. After a few updates, everything appeared to be working fine, until I tried to download a few large files from a server on my local network and discovered that the onboard Intel 82567V-2 Gigabit Ethernet adapter was delivering truly abysmal speeds.
Copying files from the new PC to any other network location was impressively fast. Here’s what the file transfer dialog box looked like for a file copy to the Public folder on my Windows Home Server box:
That’s truly impressive throughput, with that 4.36GB file (a recorded TV program) copying in under 80 seconds.
But when I reversed the operation and tried to copy that same file to the local PC, the throughput dropped by more than 97%, to roughly 2 MB/sec. I tried different files and folders on different PCs, with similarly depressing results. In some cases transfer speeds were slower than I get on an Internet connection. Yikes!
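For readers who want to check the arithmetic, here is a minimal sketch that turns the figures quoted above into rates. It assumes 1 GB means 1024 MB, and the function names are purely illustrative. Because the copy finished in under 80 seconds, both results are lower bounds rather than exact values.

```python
def throughput_mb_per_s(size_gb: float, seconds: float) -> float:
    """Average rate in MB/s, assuming 1 GB = 1024 MB."""
    return size_gb * 1024 / seconds


def percent_drop(fast_mb_s: float, slow_mb_s: float) -> float:
    """How much slower slow_mb_s is than fast_mb_s, as a percentage."""
    return (1 - slow_mb_s / fast_mb_s) * 100


# Figures from the post: 4.36 GB sent in under 80 seconds, roughly 2 MB/s received.
send_floor = throughput_mb_per_s(4.36, 80)   # lower bound on the send rate
drop_floor = percent_drop(send_floor, 2.0)   # lower bound on the slowdown

print(f"send rate: at least {send_floor:.1f} MB/s")
print(f"slowdown: at least {drop_floor:.1f}%")
```

With exactly 80 seconds, the send rate works out to about 55.8 MB/sec and the slowdown to about 96.4%. A copy that finished faster than that pushes both numbers higher.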
This gave me an opportunity to try most of the obvious (and some not-so-obvious) troubleshooting solutions. I’ll write about the details of that process later, but suffice it to say that upgrading to the most recent drivers, forcing the link speed into Full Duplex Gigabit mode, tweaking Windows TCP auto-tuning settings, enabling jumbo frames, and removing or disabling various Windows networking services did no good whatsoever.
Eventually, I zeroed in on some esoteric settings for the Ethernet adapter, available from the properties dialog box in Device Manager.
Through trial and error, I found that adjusting three settings “unblocked” the connection and allowed receive speeds to zoom to the levels I was seeing in the other direction:
- Adaptive Inter-Frame Spacing: This setting is disabled by default; enabling it, according to the help text, “compensates for excessive Ethernet packet collisions by dynamically controlling back-to-back timing.”
- Flow Control: The default setting is Rx and Tx Enabled, which means that the adapter responds to and generates flow control frames that tell the other end of the connection to wait. I set it to Tx Only.
- Interrupt Moderation Rate: This setting “moderates or delays the generation of interrupts … to optimize network throughput and network utilization.” Given that this system has a kick-ass i7 with four cores and eight processing threads, I figured I could spare some CPU cycles, so I changed this setting from the default (Adaptive) to Off.
With these settings in place, receive speeds shot up dramatically, to rates that were exactly what I expected from a Gigabit Ethernet connection on a system with fast disks and controllers on either end.
But when the system resumed from sleep or restarted after being shut down, performance was back at those depressingly low levels again, which led to another round of troubleshooting. The changes I had made to the adapter still appeared to be in place when I checked its properties in Device Manager, but it was behaving as though the default settings were in force. After more dead ends and more experimentation, I discovered a remarkable fix: If I restore the default performance settings using the Advanced Adapter Settings dialog box (clicking OK to reset the adapter) and then manually change the adapter settings back to my tweaked setup, performance returns to the speedy levels I expect.
This is completely reproducible. I’m assuming that somehow, when the network adapter wakes up after sleeping or a shutdown, it is loading its default performance settings rather than the ones I saved previously. As a workaround, I can do this Advanced Settings fandango every time the machine restarts or resumes from sleep, but that is going to get very old, very fast. I’m also considering disabling the onboard network adapter and installing a separate, non-Intel adapter in my one remaining PCI-Express slot. That’s $25 I’d rather not spend, but it’s the logical solution if I can’t find and fix the real cause.
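A repeatable measurement makes this kind of before-and-after comparison easier to trust. The sketch below is not the method used in this post, where the numbers came from the Windows file transfer dialog box. It is one way to time a copy in each direction using only the Python standard library. The function names and the folder layout are assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path


def timed_copy(src: Path, dst: Path) -> float:
    """Copy src to dst and return the average throughput in MB/s."""
    size_mb = src.stat().st_size / (1024 * 1024)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    return size_mb / elapsed


def compare_directions(local: Path, remote: Path, name: str) -> dict:
    """Send a file to the remote folder, then pull it back under a new name."""
    sent = timed_copy(local / name, remote / name)
    received = timed_copy(remote / name, local / f"received-{name}")
    return {"send_mb_s": sent, "receive_mb_s": received}
```

To use it, point `local` at a folder on the PC and `remote` at a network share (both placeholders here), and pick a file large enough that caching does not dominate the result. Running it once after a fresh boot and again after resuming from sleep would show whether the receive rate really falls back.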
So, what about it, networking experts? Have you ever seen anything like this? I’ll send an autographed copy of Windows 7 Inside Out to the first person who comes up with a successful solution (or at least a detailed explanation of why this is happening).
Update: Thanks to commenter BFT for insisting that I look more carefully at the network switch. When I tested connectivity using a straight-through Ethernet cable to connect two PCs directly, I was unable to replicate the throughput problems. That suggests that the problem is somewhere in the networking hardware itself. Switching to a different cable and using a different port on the switch solved the problem completely. The system now resumes from sleep with full network speeds. In addition, I restored the default settings to the network adapter and found that throughput increased by about 10%.
BFT, use the contact form in the sidebar to send me your contact information so I can get your signed copy of the book to you!
Another update: In response to some questions via Twitter and in the comments, here’s my theory of what happened. I never swapped cables as part of the troubleshooting. Intel’s network adapter control panel has a cable test that told me this cable was good. I assumed (incorrectly) that the fact I could get decent transfer speeds in both directions with the right settings was evidence there was no problem with the cable.
My theory is that the defective cable was causing the switch to get an improper signal at power-on, so the switch was defaulting to slow Ethernet mode and not auto-sensing the Gigabit Ethernet connection. Adjusting the software settings and forcing the adapter to reset also forced the switch to reset.
Bottom line, I think the culprit was mostly the cable, which in turn was causing the switch to behave incorrectly.
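For reference when thinking about link modes, here is a small sketch of the raw ceilings for the three common Ethernet speeds. It assumes 8 bits per byte and decimal megabytes, and it ignores protocol overhead, so real transfers top out somewhat lower. The helper names are illustrative only.

```python
# Nominal Ethernet link speeds in megabits per second.
LINK_SPEEDS_MBPS = (10, 100, 1000)


def ceiling_mb_per_s(link_mbps: int) -> float:
    """Raw ceiling in MB/s (8 bits per byte), before any protocol overhead."""
    return link_mbps / 8


def slowest_link_that_fits(observed_mb_s: float):
    """Return the slowest nominal link speed that could carry the observed rate, or None."""
    for mbps in LINK_SPEEDS_MBPS:
        if observed_mb_s <= ceiling_mb_per_s(mbps):
            return mbps
    return None


for mbps in LINK_SPEEDS_MBPS:
    print(f"{mbps:>4} Mbit/s link: at most {ceiling_mb_per_s(mbps):.2f} MB/s")
```

The ceilings come out to 1.25, 12.5, and 125 MB/sec respectively. This says nothing on its own about what the switch actually negotiated. It only shows which modes could carry a given rate.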
And one more PS: This is yet another example of a problem that appeared to be Windows-related but eventually was traced to the simplest of hardware connections. For previous examples, see here and here. This is why I am always reluctant to point a finger at any hardware or software maker until I have all the facts.