this post was submitted on 22 Nov 2023 to Homelab
I've been going back and forth with this issue for some time, but honestly I have no idea if the vCenter telemetry is something to rely on. I'm seeing rather high storage latency on my VMs even though most of them are idle: only vCenter and a virtual firewall generate some IOPS, 5 VMs are shut down, and the other 3 are Linux machines that idle 99% of the time, yet they can still spike to 100ms per IO. Today I decided to migrate a VM's storage to another server, only to find that higher disk utilization reduces the latency on the host. How does that make any sense? I'm using a P420 in RAID 10 with 4x4TB 7k SAS HDDs.

Host latency:

https://preview.redd.it/cqvmy550ty1c1.png?width=986&format=png&auto=webp&s=f5823391eb6cd82cb9612b44aa2768087bf619e1

top 8 comments
[email protected] · 11 months ago

If it's constantly reading the same data, that data is served from cache, which is significantly faster than reading from the actual drive. Because the graph shows an average latency and cache hits are very fast, a burst of cached reads pulls the average down.
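
To see how much a handful of cache hits can drag the average down, here is a quick back-of-envelope (the 0.1ms cache hit and 20ms spindle read figures are illustrative assumptions, not measurements from this host):

    # hypothetical mix: 90% of reads served from controller cache (~0.1 ms),
    # 10% going all the way to the 7k spindles (~20 ms)
    echo "0.9*0.1 + 0.1*20" | bc -l    # prints 2.09 -> "average" latency in ms

So the graph can show ~2ms even while every read that actually reaches a spindle costs 20ms.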

[email protected] · 11 months ago

I had the same issue, only using a 930-8i w/ 2M cache. Honestly, performance sucked on all of my VMs. I reinstalled the server with Rocky Linux 8.8 and KVM using the same array, and performance was acceptable (the array was configured as an LVM volume). I then added an NVMe drive as an LVM cache and performance was much better (good enough for my homelab). Too bad, since I really prefer VMware.
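
For anyone wanting to reproduce that setup: attaching an NVMe device as a cache to an existing logical volume looks roughly like this (a minimal sketch; the names vg0, data, and cache0 and the device path /dev/nvme0n1 are placeholders, not taken from the comment, and this uses the newer --cachevol syntax of LVM2 2.03+; older releases use a --cachepool instead):

    # add the NVMe device to the existing volume group
    pvcreate /dev/nvme0n1
    vgextend vg0 /dev/nvme0n1

    # carve out a cache LV on the NVMe and attach it to the slow LV
    lvcreate -n cache0 -L 100G vg0 /dev/nvme0n1
    lvconvert --type cache --cachevol vg0/cache0 vg0/data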

[email protected] · 11 months ago

15-30ms latency seems reasonable for that hardware (P420 in RAID 10 with 4x4TB 7k SAS HDDs).

Basically your single SAS disk can do ~150 IOPS random, so in the worst case of 4k random reads you will get ~600KB/s per disk.

For the migration, if the access is sequential (depending on filesystem layout), performance can be very different: up to the maximum streaming rate of roughly 100MB/s for large sequential reads.

Then 2x for your RAID config.
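
Putting that back-of-envelope math in one place (150 IOPS per spindle is the usual rule of thumb for 7k drives, not a measurement of this array):

    # worst-case 4k random reads from ONE 7k spindle, in KB/s
    echo $(( 150 * 4 ))     # 600
    # random write IOPS for a 4-disk RAID 10: two stripes, each write hits both mirrors
    echo $(( 150 * 2 ))     # 300
    # average service time per IO in ms, before any queueing
    echo $(( 1000 / 150 ))  # ~6, so 15-30ms just means a few IOs queued per spindle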

[email protected] · 11 months ago

That is the thing: my total IOPS are less than 150, and with 4 disks in RAID 10 I believe I should get ~300 write IOPS, since mirroring halves the write capacity. If you look at the graph, the red line is transfer in kbps and the blue one is latency; why does the latency drop when the disk is highly utilised? I will post esxtop output when I'm back from work.

[email protected] · 11 months ago

    GID     VMNAME  VDEVNAME  NVDISK  CMDS/s  READS/s  WRITES/s  MBREAD/s  MBWRTN/s  LAT/rd  LAT/wr
    12953   dns     -              1    0.00     0.00      0.00      0.00      0.00    0.000   0.000
    16904   fw      -              2    5.84     0.00      5.84      0.00      0.02    0.000  18.408
    20481   vcsa    -             13   16.58     0.00     16.58      0.00      0.08    0.000  37.582
    130847          -              2    0.00     0.00      0.00      0.00      0.00    0.000   0.000
    626694  deb     -              2   12.06     0.00     12.06      0.00      0.46    0.000   6.586

As you can see, there isn't much IOPS per VM. The vcsa VM latency I captured floats between 20 and 100ms, while deb has similar IOPS but much lower latency.
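
If it helps, the same counters can be captured over time instead of eyeballed live, since esxtop has a batch mode (the interval and sample count below are arbitrary choices):

    # 2-second samples, 30 iterations, all counters to CSV for later analysis
    esxtop -b -d 2 -n 30 > esxtop-capture.csv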

[email protected] · 11 months ago

I think what you want to do is go into your db VM and run dd, fio, or bonnie++ with a test size at least 2x the VM's RAM, and see what the steady-state disk performance is.
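
For example, something like this (a sketch, not a tuned benchmark; the 8G size assumes a VM with 4G of RAM, and /var/tmp/fio-test is a placeholder path that should sit on the virtual disk you want to test):

    # steady-state 4k random read test; --direct=1 and the 2x-RAM file size
    # keep the guest page cache from flattering the numbers
    fio --name=randread --filename=/var/tmp/fio-test \
        --rw=randread --bs=4k --ioengine=libaio --direct=1 \
        --size=8G --iodepth=16 --runtime=120 --time_based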

[email protected] · 11 months ago

I don't have a db VM; I think you are referring to deb, which is short for Debian.

[email protected] · 11 months ago

I have an R710 with 8 WD Black 500GB disks in a RAID 5, no vCenter. From a cold boot of any flavor of VM there is severe disk latency.

The only way to get past it in my case is a large disk operation; an xfsdump of / solves the latency.
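
For reference, such a throwaway dump looks roughly like this (assuming / is XFS and xfsdump is installed; the session/media labels are arbitrary and the output is discarded, it just forces one big sequential read of the filesystem):

    # level-0 dump of / to stdout, thrown away
    xfsdump -l 0 -L warmup -M warmup - / > /dev/null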