Hi!
You have an SX6036T FDR10 switch.
Your switch's ports support up to FDR10 (40Gb/s, 64/66-bit encoding).
If you want FDR14 (56Gb/s) speed, you must change your switch to an SX6036F or SX6036G.
Best Regards,
Jae-Hoon Choi
In principle it is possible to achieve optimum bandwidth over "aggregated" Mellanox CX-2 adapters, though I'm not sure you can get as much as 40Gb/s on your old platform. You'll need to ensure the following preconditions in your system:
1. In the case of a Windows platform:
- line up the proper OS, driver & firmware that support the CX-2 VPI adapter, which is: Windows 8.1, WinOF driver v4.80 & firmware v2.9.120 (Windows 10 is not supported)
- "team" the CX-2 adapters in Windows (if teaming is supported in the Windows client edition)
2. The same goes for Linux: you need to line up the proper OS, driver & firmware that support the CX-2 VPI adapter (a Linux bonding sketch is below).
These can be found in the release notes of each OS (see link below):
http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers
In closing, Mellanox CX-2 adapters are already EOL (end-of-life) in terms of sales & Mellanox support, so I suggest you look at the newer ConnectX-3/4/5 adapters, which will provide the required throughput.
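For the Linux bonding/"teaming" part, here is a minimal sketch using NetworkManager (assuming the two CX-2 ports run in Ethernet mode and show up as ens1f0/ens1f1; the interface names, IP address, bonding mode and the matching switch-side LACP configuration are all assumptions, not details from your setup):

nmcli con add type bond con-name bond0 ifname bond0 bond.options "mode=802.3ad,miimon=100"   # LACP bond
nmcli con add type bond-slave con-name bond0-p1 ifname ens1f0 master bond0                   # first CX-2 port
nmcli con add type bond-slave con-name bond0-p2 ifname ens1f1 master bond0                   # second CX-2 port
nmcli con mod bond0 ipv4.addresses 10.0.1.10/24 ipv4.method manual                           # address the bond
nmcli con up bond0

Keep in mind that with 802.3ad a single TCP stream still hashes onto one physical port, so the aggregated bandwidth only shows up with multiple parallel streams.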
Per the error output you've presented, I can suggest an nvme connect command option that I see is missing there: "--nr-io-queues".
This option specifies the number of I/O queues to allocate.
Have you tried this option?
For example: # nvme connect --transport=rdma --nr-io-queues=36 --trsvcid=4420 --traddr=10.0.1.14 --nqn=test-nvm
Otherwise you will hit the default, which is num_online_cpus (the number of controller I/O queues that will be established), and this may explain the error you got:
"nvme_fabrics: unknown parameter or missing value 'hostid=a61ecf3f-2925-49a7-9304-cea147f61ae' in ctrl creation request"
You can read more about this in the article "Add nr_io_queues parameter to connect command": [PATCH v2] nvme-cli/fabrics: Add nr_io_queues parameter to connect command. The message itself is printed by the fabrics option parser's default case:
default:
+		pr_warn("unknown parameter or missing value '%s' in ctrl creation request\n",
+			p);
+		ret = -EINVAL;
+		goto out;
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
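If the connect goes through after that, it can be verified with standard nvme-cli commands (a generic sketch, nothing specific to your setup):

nvme list                     # the remote namespace should show up as /dev/nvmeXnY
nvme list-subsys              # shows the rdma transport, target address and NQN
dmesg | grep -i nvme | tail   # the fabrics/rdma modules log controller and queue setup here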
Hope this helps
Does anyone know how to configure PFC on an ESXi 6.5 host for ConnectX-4 cards?
I followed the following guide to configure Linux; however, mlnx_qos does not seem to exist for ESXi, nor does the python script in the guide:
In U-Boot, try running the command:
run flash_self_safe
This will bring the 4036 back to the primary kernel, where you can recover using "update software".
Unfortunately, iSER 1.0.0.1 isn't beta-level.
It's alpha-level quality.
Only the 1st port can connect to an iSER target.
And the software iSCSI initiator must be added to the ESXi 6.5U1 host.
If you don't do that, the iSER initiator can't connect to the iSER target.
Finally, whenever the host reboots, the iSER initiator disappears.
You must execute the esxcli rdma iser add command on the host and then rescan the HBA adapter every time.
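In other words, after every boot you have to repeat something like this (a sketch of the workaround described above):

esxcli rdma iser add                        # re-create the iSER logical adapter(s)
esxcli storage core adapter rescan --all    # rescan the HBAs so the iSER paths come back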
We are not beta testers.
We are not QC employees of Mellanox or VMware, either.
If iSER is going to be a dominant RDMA storage protocol, an iSER enable option must be added to the VMware software iSCSI initiator,
not a funky CLI command.
Enabling the PFC options and adding the iSER initiator to ESXi has made Host Profile creation fail every time since the IB iSER beta driver 1.8.3.0.
Various bugs exist in the iSER initiator and the PFC options in these drivers.
Don't use Mellanox HCAs - especially the iSER initiator alpha driver 1.0.0.1 - in a VMware production environment!
Best Regards,
Jae-Hoon Choi
How did you bind the iSCSI adapter for this to work? Is it associated with the same vmnic as the iSER HBA? Did you need to associate a VMkernel adapter with the iSCSI initiator at all? If so, did you assign an IP address to the iSCSI VMkernel? I just don't want to give it a true iSCSI path to the target if I don't have to.
Just add a VMware iSCSI adapter.
The vmknic must be bound to the iSER initiator, not the VMware iSCSI initiator.
An IP address must also be set on the vmknic.
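As a rough sketch (the adapter and vmknic names are placeholders; I'm assuming the iSER HBA enumerates as vmhba64 and the RDMA-capable VMkernel port is vmk1):

esxcfg-vmknic -l                                    # list vmknics and check the IP on the RDMA one
esxcli iscsi networkportal add -A vmhba64 -n vmk1   # bind the vmknic to the iSER adapter
esxcli iscsi adapter discovery sendtarget add -A vmhba64 -a 10.0.1.14:3260   # target address is a placeholder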
This test was completed with a StarWind vSAN iSER target and RoCE v1 with Global Pause (access ports on an Arista switch).
This iSER driver 1.0.0.1 is for Global Pause (RoCE v1).
Mellanox says this stuff is for ESXi 6.5, but the QS (quick start) shows an old ESXi 6.0 C# client screenshot, and every port on the Ethernet switch must be set to Global Pause mode.
This manual is pretty useless... :(
PFC on ESXi needs some configuration via the pfcrx and pfctx module parameters. But if you set the default priority 3 (pfcrx and pfctx = 0x08), you will hit a Host Profile creation error ("general system error"). Mellanox has never provided a fix, going back to driver 1.8.3.0.
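For reference, the configuration in question is just a pair of module parameters on the ConnectX-4/5 native driver (a sketch; 0x08 enables priority 3, and a reboot is needed for the change to take effect):

esxcli system module parameters set -m nmlx5_core -p "pfctx=0x08 pfcrx=0x08"   # PFC on priority 3
esxcli system module parameters list -m nmlx5_core | grep pfc                  # verify the values
reboot

Setting these is exactly the point at which the Host Profile "general system error" shows up.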
BR,
Jae-Hoon Choi
Can you provide any guidance for ESXi PFC setup? I followed this on my Linux target, but all of the commands seem not to apply to ESXi: HowTo Configure PFC on ConnectX-4
I'm sorry!
iSER initiator 1.0.0.1 only works with Global Pause.
It doesn't work with PFC in a VMware environment.
If you mean that you're just using a PFC-based RDMA network, you must have an enterprise-class Ethernet switch that supports PFC.
I have 2 Mellanox SX6036G gateway switches.
What's your switch model?
BR,
Jae-Hoon Choi
P.S.
RDMA needs a switched fabric, like FCoE.
If you're using a direct connection between CX-4s, you can't do it!
I do not have a switch; they are working back to back, though. There are also a few examples on Mellanox's site of direct-connected servers as part of their demos. You can configure Global Pause and PFC in firmware, outside of the OS. Not sure why this would not work back to back if both adapters are sending PFC/Global Pause info?
Once I enabled the iSCSI software adapter I was able to connect to my SCST target. It is working now, though not really performing any better. I was able to get better performance with iSCSI by configuring 4 VMkernel adapters on my 1 vmnic and then configuring the round-robin policy to 1 IOPS. It was the same adapter, but it seemed to trick ESXi into dedicating more hardware resources/scheduler time to the adapter. At the moment the iSER connection is on par with iSCSI before I applied the round-robin policy. It doesn't look like I can configure a round-robin policy for iSER adapters, but I am still looking into that.
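For anyone wanting to try the iSCSI round-robin tweak mentioned above, the usual way is via the standard NMP esxcli commands (a sketch; the naa identifier is a placeholder for your LUN):

esxcli storage nmp device set --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR                            # switch the path policy to round robin
esxcli storage nmp psp roundrobin deviceconfig set --device naa.xxxxxxxxxxxxxxxx --type iops --iops 1   # switch paths after every I/O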
Yes! CX-4 has advanced, unique logic in firmware compared to CX-3, but that logic must work together with the switch's firmware and configuration.
RoCE RDMA is a kernel-bypass protocol.
If any of your HCAs goes down, you get a kernel dump, such as an ESXi PSOD or a Linux kernel dump.
If there is no switch between host and storage, what controls congestion, etc.?
The link you mentioned in your previous message, HowTo Configure PFC on ConnectX-4, includes the basic system requirements.
BR,
Jae-Hoon Choi
Here is a result from a VM; it has 2 CPUs, 8GB of RAM, and runs 2016. It has a vmxnet3 adapter and a VMware paravirtual iSCSI hard drive.
These results are slightly better than what I was able to achieve with iSCSI, so there is some improvement. I understand a switch is recommended, but it clearly does work without one. I am monitoring for any packet loss/data issues, but as explained before it's point to point. The target is 3 NVMe PCIe SSDs in what amounts to a RAID 0 in a ZFS pool/zvol, btw.
That's within the ZFS memory cache (ARC) range.
You must increase the test sample size to 100GB or above.
Best practice is to use Iometer with multi-user configurations... :)
BR,
Jae-Hoon Choi
Understood. I am testing the interconnect, though, so where the data comes from is a bit irrelevant. Just saying that iSER is performing somewhat better than iSCSI, so it is worth going through this hassle for those sitting on the sidelines wondering =). I was never able to get above 8800 or so with iSCSI. The non-sequential results are very close; not a huge difference.
I've encountered the same problem with the latest CentOS 7 inbox RPMs. Back to MOFED it seems.
RDMA vs. TCP/IP = a latency war...
You must check latency in your tests with another tool, using multi-user configurations.
I'm also testing this.
As the client count increases, there is a huge difference in latency at the same throughput.
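fio is one option for this kind of multi-client latency test (a sketch, not the tool or parameters used here; the target path, job count and 100G size are placeholders, the size chosen to get past the ZFS ARC mentioned earlier):

fio --name=lat-test --filename=/dev/sdX --rw=randread --bs=4k \
    --ioengine=libaio --direct=1 --size=100G --numjobs=8 --iodepth=16 \
    --runtime=60 --time_based --group_reporting

The clat percentiles in the output are the interesting part when comparing iSER and plain iSCSI at the same throughput.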
BR,
Jae-Hoon Choi
Hi Grant,
Please see the following user manual: Mellanox ConnectX-4/ConnectX-5 NATIVE ESXi Driver for VMware vSphere 6.5.
Please see section 3.1.4, Priority Flow Control (PFC):
Thanks,
Karen.
Karen, Thank you for the reply.
Can you tell me what Mellanox's recommended configuration is if there is just one traffic flow? My ConnectX-4 cards are directly connected between two servers and are used only to connect the iSER initiator to the iSER target. There is no other traffic on these NICs. I have seen references to creating a VLAN and assigning it to one of the priority queues, yet none of this applies to my scenario. Should I just run Global Pause?
Hi, it looks like the latest Debian 8 (Jessie) sources will compile and install on Debian 9 (Stretch); however, do you have a timeline for official support?
Thanks