Scenario / Questions
I have two machines connected with 10Gbit Ethernet. Let one of them be the NFS server and the other the NFS client.
Testing network speed over TCP with iperf shows ~9.8 Gbit/s throughput in both directions, so the network is OK.
Testing NFS server’s disk performance:
dd if=/dev/zero of=/mnt/test/rnd2 count=1000000
Result is ~150 MBytes/s, so disk works fine for writing.
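A note on the dd test above: with no bs= argument dd writes 512-byte blocks, and without a sync flag part of the data may still be in the page cache when dd reports its rate. The following is a minimal sketch of a variant that uses 1 MiB blocks and flushes before reporting. It assumes GNU dd; the helper name write_test, the file name and the block count are illustrative and not from the original post.

```shell
# Hypothetical helper: write COUNT blocks of 1 MiB to FILE and print dd's summary line.
# conv=fdatasync (GNU dd) makes dd flush the file data to disk before it reports the rate.
write_test() {
    file=$1
    count=$2
    dd if=/dev/zero of="$file" bs=1M count="$count" conv=fdatasync 2>&1 | tail -n 1
}

# Example (run on the server): write 1 GiB into the exported directory.
# write_test /mnt/test/ddtest 1024
```

Running the same helper against the NFS mount on the client gives a write figure that can be compared directly with the local one.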
The client mounts this share to its local /mnt/test with the following options:
node02:~ # mount | grep nfs
192.168.1.101:/mnt/test on /mnt/test type nfs4 (rw,relatime,sync,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.102,local_lock=none,addr=192.168.1.101)
If I try to download a large file (~5 GB) on the client machine from the NFS share, I get ~130-140 MBytes/s, which is close to the server’s local disk performance, so it’s satisfactory.
But when I try to upload a large file to the NFS share, the upload starts at ~1.5 MBytes/s, slowly increases to 18-20 MBytes/s and stops increasing.
Sometimes the share “hangs” for a couple of minutes before the upload actually starts, i.e. traffic between the hosts drops close to zero, and if I execute ls /mnt/test, it does not return for a minute or two. Then the ls command returns and the upload starts at its initial 1.5Mbit/s speed.
When the upload speed reaches its maximum (18-20 MBytes/s), I run iptraf-ng and it shows ~190 Mbit/s traffic on the network interface, so the network is not a bottleneck here, and neither is the server’s HDD.
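As a quick sanity check of these figures, the arithmetic can be sketched in shell. The payload rate of 18-20 MBytes/s corresponds to 144-160 Mbit/s, which is in the same range as the ~190 Mbit/s reported by iptraf-ng, and either number is a small fraction of a 10 Gbit/s link. The helper names below are illustrative only.

```shell
# Convert a rate in MBytes/s to Mbit/s (payload only, no protocol overhead).
mbytes_to_mbit() {
    echo $(( $1 * 8 ))
}

# Percentage of a link used by a given rate; both arguments are in Mbit/s.
link_percent() {
    awk -v rate="$1" -v link="$2" 'BEGIN { printf "%.1f\n", rate * 100 / link }'
}

mbytes_to_mbit 20        # upper end of the observed upload rate
link_percent 190 10000   # iptraf-ng reading against a 10 Gbit/s link
```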
What I tried:
Set up an NFS server on a third host which was connected only with a 100Mbit Ethernet NIC. Results are similar: download shows good performance and nearly full 100Mbit network utilization, while upload does not go faster than a few hundred kilobytes per second, leaving network utilization very low (2.5 Mbit/s according to
I tried to tune some NFS parameters:
rsize and wsize are maximal in my examples, so I tried to decrease them in several steps down to 8192
I tried to switch client and server machines (set up NFS server on former client and vice versa). Moreover, there are six more servers with the same configuration, so I tried to mount them to each other in different variations. Same result.
I tried several network configurations: MTU=9000, MTU=9000 with 802.3ad link aggregation, and link aggregation with MTU=1500.
node01:~ # cat /etc/sysctl.conf
net.core.wmem_max=16777216
net.core.rmem_max=16777216
net.ipv4.tcp_rmem= 10240 873800 16777216
net.ipv4.tcp_wmem= 10240 873800 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
net.core.netdev_max_backlog = 5000
Mount from localhost:
node01:~ # cat /etc/exports
/mnt/test *(rw,no_root_squash,insecure,sync,no_subtree_check)
node01:~ # mount -t nfs -o sync localhost:/mnt/test /mnt/testmount/
And here I get the same result: download from /mnt/testmount/ is fast, upload to /mnt/testmount/ is very slow, not faster than 22 MBytes/s, and there is a small delay before the transfer actually starts. Does this mean that the network stack works flawlessly and the problem is in NFS?
All of this did not help, results didn’t differ significantly from the default configuration.
echo 3 > /proc/sys/vm/drop_caches was executed before all tests.
MTU of all NICs on all 3 hosts is 1500, no non-standard network tuning performed. Ethernet switch is Dell MXL 10/40GbE.
OS is CentOS 7.
node01:/mnt/test # uname -a Linux node01 3.10.0-123.20.1.el7.x86_64 #1 SMP Thu Jan 29 18:05:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
What settings am I missing? How can I make NFS write quickly and without hangs?
Find below all possible solutions or suggestions for the above questions.
You use the sync option in your export statement. This means that the server confirms write operations only after they are actually written to disk. Given that you have a spinning disk (i.e. no SSD), this requires on average at least half a revolution of the disk per write operation, which is the cause of the slowdown.
With the async setting, the server acknowledges a write operation to the client as soon as it is processed, before it is written to disk. This is somewhat less reliable, e.g. after a power failure the client may have received an ack for an operation that did not happen. However, it delivers a huge increase in write performance.
(edit) I just saw that you already tested the options async vs sync. However, I am almost sure that this is the cause of your performance degradation. I once saw exactly the same symptoms with an identical setup. Maybe test it again. Did you set the async option in the export statement on the server AND in the mount operation on the client at the same time?
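A minimal sketch of that change, assuming the export line and addresses from the question. The helper name to_async is illustrative; it only prints the rewritten line so the result can be reviewed before /etc/exports is edited. The commented commands need root on the server and the client respectively.

```shell
# Hypothetical helper: print an exports file with the "sync" option replaced by "async".
# Only a whole option is matched, so "no_subtree_check" and an existing "async" stay untouched.
to_async() {
    sed -E 's/([(,])sync([,)])/\1async\2/g' "$1"
}

# Review the result first:
# to_async /etc/exports
#
# After editing /etc/exports on the server, re-export all directories:
# exportfs -ra
#
# On the client, remount without the sync mount option:
# umount /mnt/test
# mount -t nfs -o async 192.168.1.101:/mnt/test /mnt/test
```

Changing both sides matters because the client mount in the question also carries the sync option, which makes the client wait for each write regardless of how the server exports the directory.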
It can be a problem related to packet size and latency. Try the following:
- enable jumbo frames (MTU >= 9000 bytes) on both machines
- use UDP or, alternatively, manually increase TCP window size on both machines
Then report back your results.
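The jumbo-frame step can be sketched as follows. The interface name eth0 is a placeholder, the server address is taken from the question, and the helper name ping_payload is illustrative. The switch ports between the machines have to accept the larger frames as well, which the ping check at the end is meant to confirm.

```shell
# Largest ICMP payload that fits in one frame:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
ping_payload() {
    echo $(( $1 - 28 ))
}

ping_payload 9000   # prints 8972

# On both machines (needs root; eth0 is a placeholder interface name):
# ip link set dev eth0 mtu 9000
#
# Verify that jumbo frames pass end to end without fragmentation:
# ping -M do -c 3 -s "$(ping_payload 9000)" 192.168.1.101
```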
Configuring the Linux I/O scheduler on systems with hardware RAID and changing the default from [cfq] to [noop] gives I/O improvements.
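A minimal sketch of checking and switching the scheduler through sysfs. The device name sda is a placeholder and the helper name active_scheduler is illustrative. The kernel marks the active scheduler with square brackets, as in "noop deadline [cfq]".

```shell
# Print the active scheduler from a sysfs scheduler file.
# The active one is the name shown in square brackets.
active_scheduler() {
    sed -E 's/.*\[([^]]+)\].*/\1/' "$1"
}

# Check and switch (needs root; sda is a placeholder device name):
# active_scheduler /sys/block/sda/queue/scheduler
# echo noop > /sys/block/sda/queue/scheduler
```

A change made this way does not survive a reboot.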
Use the nfsstat command to calculate the percentage of reads and writes. Set the RAID controller cache ratio to match.
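The percentage calculation can be sketched like this, assuming the read and write operation counts have been copied by hand from the server statistics printed by nfsstat -s. The helper name rw_ratio and the counts in the example are illustrative.

```shell
# Hypothetical helper: given read and write operation counts, print their shares in percent.
rw_ratio() {
    awk -v r="$1" -v w="$2" 'BEGIN {
        total = r + w
        if (total == 0) { print "no operations"; exit }
        printf "read=%.0f%% write=%.0f%%\n", r * 100 / total, w * 100 / total
    }'
}

# Take the read and write counters from the server statistics, then:
# nfsstat -s
# rw_ratio 300 700
```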
For heavy workloads you will need to increase the number of NFS server threads.
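A sketch of inspecting and raising the thread count on the server. The value 16 is an example and not a recommendation, and the RPCNFSDCOUNT setting in /etc/sysconfig/nfs is an assumption about how CentOS 7 persists it. The helper name nfsd_threads is illustrative.

```shell
# Show the number of running nfsd threads.
# The default path exists on the server once the nfsd module is loaded.
nfsd_threads() {
    cat "${1:-/proc/fs/nfsd/threads}"
}

# Raise the count at runtime (needs root; 16 is an example value):
# rpc.nfsd 16
#
# To persist it on CentOS 7 (assumed location), set this in /etc/sysconfig/nfs
# and restart the NFS server:
# RPCNFSDCOUNT=16
```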
Configure the NFS threads to write to disk without delay using the no_wdelay export option.
Tell the Linux kernel to flush as quickly as possible so that writes are kept as small as possible. In the Linux kernel, dirty pages writeback frequency can be controlled by two parameters.
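The text does not name the two parameters. Assuming they are vm.dirty_background_ratio and vm.dirty_ratio, a sketch of reading and lowering them follows. The helper name show_dirty is illustrative and the values in the commented commands are examples only.

```shell
# Print the current writeback thresholds.
# The directory argument defaults to /proc/sys/vm.
show_dirty() {
    dir=${1:-/proc/sys/vm}
    for name in dirty_background_ratio dirty_ratio; do
        printf '%s=%s\n' "vm.$name" "$(cat "$dir/$name")"
    done
}

# Lower the thresholds at runtime (needs root; the values are examples only):
# sysctl -w vm.dirty_background_ratio=5
# sysctl -w vm.dirty_ratio=10
```

Lower values make the kernel start writeback earlier, so each flush is smaller.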
For faster disk writes, use the filesystem data=journal option and prevent updates to file access times, which in themselves result in additional data written to the disk. This mode is the fastest when data needs to be read from and written to disk at the same time, where it outperforms all other modes.
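A sketch of what such a mount entry could look like, assuming the exported filesystem is ext4. The data=journal option belongs to ext3/ext4, while a default CentOS 7 install uses XFS, which has no such option. The device name /dev/sdX1 is a placeholder and the helper name fstab_line is illustrative.

```shell
# Hypothetical helper: print an fstab entry with noatime and data=journal
# for an ext4 filesystem on DEVICE mounted at MOUNTPOINT.
fstab_line() {
    printf '%s %s ext4 defaults,noatime,data=journal 0 2\n' "$1" "$2"
}

fstab_line /dev/sdX1 /mnt/test
```

The printed line is meant to be reviewed and then added to /etc/fstab by hand.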
Source: NFS poor write performance