Hi Yonatan,
Whatever logs are there, they are dmesg outputs.
Is there any RDMA-based storage software available for Server 2008 R2?
Thanks Talat, your suggestion to connect using the IP was correct. I was trying with the wrong NIC IP; using the IB IP worked.
Hi,
you can find them at this link: http://www.mellanox.com/page/firmware_table_dell?mtag=oem_firmware_download . On that page choose Adapters and you will find the latest GA firmware.
As far as I can see on that page, this is the FW that you need.
Thanks,
Talat
Perfect, thank you!
Do you know if it's possible to get an older version? (just in case...)
Sure.
In addition, you can save the current FW image that is installed on the device.
See the mstflint or flint help page (ri command):
ri <out-file> : Read the fw image on the flash.
for example
mstflint -d <lspci-device> ri image.bin
or
flint -d <mst-device> ri image.bin
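If you ever need to go back to the saved image, burning it back should look roughly like this (the file name is just the one saved above; please double-check the flint/mstflint documentation for your device before burning):
mstflint -d <lspci-device> -i image.bin burn
or
flint -d <mst-device> -i image.bin burn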
Hi Nikolay,
could you please send the operating system, FW and driver version of the device?
FW :
flint -d <mst-device> q
Driver:
ofed_info -s
or
modinfo mlx4_core
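For the operating system, standard commands such as the following are enough (these are generic Linux commands, nothing Mellanox-specific):
uname -a
cat /etc/os-release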
Please submit the registration form again. A new submission will override all previous ones.
To submit a new registration: http://omniqrcode.com/q/hackatop
Thought I would add my question here as well as I'm having the same issues:
I just updated to MLNX_OFED_LINUX-3.4-1.0.0.0 today (was using 3.3) and I can't get ibdump to work on my ConnectX-4 Lx card.
Are the ConnectX-4 cards going to be supported by ibdump? Any pointers/suggestions would be appreciated!
Tcpdump doesn't seem to correctly decode RoCE traffic - well, it doesn't decode it like this post shows: https://community.mellanox.com/docs/DOC-2416
This is what I see when I do tcpdump:
root@ceb-ubu14:~# tcpdump -i eth2 -vv > ~/rdma_traffic.txt
10:51:15.571909 24:8a:07:11:4f:91 (oui Unknown) > 24:8a:07:0d:d6:d0 (oui Unknown), ethertype Unknown (0x8915), length 334:
0x0000: 6000 0000 0118 1b01 0000 0000 0000 0000 `...............
0x0010: 0000 ffff 0c0c 0c02 0000 0000 0000 0000 ................
0x0020: 0000 ffff 0c0c 0c01 6440 ffff 0000 0001 ........d@......
0x0030: 0000 056e 8001 0000 0000 0001 0107 0203 ...n............
0x0040: 0000 0000 0000 0004 34ed 641e 0010 0000 ........4.d.....
0x0050: 0000 0000 1e64 ed34 0000 0000 0000 0000 .....d.4........
0x0060: 0106 4853 248a 0703 0011 4f91 0000 0000 ..HS$.....O.....
0x0070: 0000 0000 0002 7900 0000 0000 0000 00a0 ......y.........
0x0080: 79e8 c5a7 ffff 37f0 0000 0000 0000 0000 y.....7.........
0x0090: 0000 0000 0000 ffff 0c0c 0c02 0000 0000 ................
0x00a0: 0000 0000 0000 ffff 0c0c 0c01 0000 0003 ................
0x00b0: 0001 0898 0000 0000 0000 0000 0000 0000 ................
0x00c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x00d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x00e0: 0040 c1ef 0000 0000 0000 0000 0000 0000 .@..............
0x00f0: 0c0c 0c02 0000 0000 0000 0000 0000 0000 ................
0x0100: 0c0c 0c01 0000 0000 0000 0000 0000 0000 ................
0x0110: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0120: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0130: 0000 0000 0000 0000 0000 0000 640c f1db ............d...
10:51:15.580920 24:8a:07:0d:d6:d0 (oui Unknown) > 24:8a:07:11:4f:91 (oui Unknown), ethertype Unknown (0x8915), length 334:
...
My Setup:
root@ceb-ubu14:~# uname -a
Linux ceb-ubu14 4.4.0-36-generic #55~14.04.1-Ubuntu SMP Fri Aug 12 11:49:30 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@ceb-ubu14:~# ibdev2netdev
mlx5_0 port 1 ==> eth2 (Up)
root@ceb-ubu14:~# ifconfig eth2
eth2 Link encap:Ethernet HWaddr 24:8a:07:0d:d6:d0
inet addr:12.12.12.1 Bcast:12.12.12.255 Mask:255.255.255.0
inet6 addr: fe80::268a:7ff:fe0d:d6d0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:10 errors:0 dropped:0 overruns:0 frame:0
TX packets:71 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1237 (1.2 KB) TX bytes:10660 (10.6 KB)
root@ceb-ubu14:~# ethtool -i eth2
driver: mlx5_core
version: 3.4-1.0.0 (25 Sep 2016)
firmware-version: 14.16.1006
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
root@ceb-ubu14:~# ethtool --show-priv-flags eth2
Private flags for eth2:
hw_lro : on
sniffer : on
dcbx_handle_by_fw : off
qos_with_dcbx_by_fw: off
rx_cqe_moder : off
root@ceb-ubu14:~# ibdump -d mlx5_0
Initiating resources ...
searching for IB devices in host
Port active_mtu=1024
MR was registered with addr=0x16a7000, lkey=0xbf82, rkey=0xbf82, flags=0x1
------------------------------------------------
Device : "mlx5_0"
Physical port : 1
Link layer : Ethernet
Dump file : sniffer.pcap
Sniffer WQEs (max burst size) : 4096
------------------------------------------------
Failed to set port sniffer1: command interface bad param
Thanks,
Curt
Hi Curt, the last time I observed it working was on the combination of CentOS-7.2 + MOFED-3.3-1.0.4.0.
It indeed worked as advertised: set the sniffer flag using the patched ethtool (on my system it is installed at /opt/mellanox/ethtool/sbin/ethtool), and then tcpdump -i <interface> -w <file> does the job. The latest stable Wireshark does not decode the traffic well for me, but the unstable build does OK (I browse the captured files on Windows using Wireshark 2.1.1).
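For reference, the rough sequence on my side looked like this (interface and file names are just examples, adjust for your setup):
/opt/mellanox/ethtool/sbin/ethtool --set-priv-flags eth2 sniffer on
tcpdump -i eth2 -w roce_capture.pcap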
Regards,
Philip
Hi,
I am trying to use a ConnectX-4 EN card with the Ubuntu 14.04 inbox driver,
but it does not seem to be supported.
I already know that ConnectX-3 cards are supported by the Ubuntu inbox driver.
Has anyone tried this,
or does anyone know which Ubuntu version supports the ConnectX-4 card with the inbox driver?
Thanks.
Regards,
Taiyoung.
How much performance do you lose by running NVMe devices over Ethernet or InfiniBand compared to running them natively?
We did the test using ConnectX-3 adapters on Ethernet and InfiniBand.
Here are the results:
http://www.zeta.systems/blog/2016/10/06/NVMe-Native-vs-NVMe-Over-Fabrics/
Thanks for the response, Philip!
I can get the dumps, they just don't look too pretty and tcpdump doesn't seem to understand the RoCE traffic. I was hoping to just get timestamps on each RDMA transfer to get more latency insight. I might grab the unstable build of Wireshark, like you suggested, to see if that can give me what I'm looking for.
Thanks again,
Curt
Is there [going to be] an OFED 2.4-1.0.4 release for EL7.2? I'm not currently in a position to upgrade to one of the 3-series and I've got a Knights Landing to play with.
Hi,
I was hoping that someone would be able to help me.
I have 15 HP DL380 gen9 servers running Windows Server 2012 R2 each with dual port HP Connect-X3 VPI cards connected to a Mellanox SX6036 switch.
The line speed is correctly displayed as 32Gbps (QDR), but we are not getting anywhere near that performance. Real-world RDMA speeds seem to max out at 25Gbps (which is OK but could be better), yet the maximum speed we seem to get with IPoIB is 1Gbps. Below is an ntttcp test:
c:\temp\NTttcp-v5.31\x64>ntttcp.exe -r -m 8,*,10.167.255.111 -rb 2M -a 16 -t 30
Copyright Version 5.31
Network activity progressing...
Thread Time(s) Throughput(KB/s) Avg B / Compl
====== ======= ================ =============
0 30.062 1896.880 65536.000
1 30.046 1414.365 63434.262
2 30.124 1459.567 64410.918
3 30.093 2473.399 65536.000
4 30.062 1847.914 65536.000
5 30.062 2452.531 65365.777
6 30.047 2726.406 63310.495
7 30.047 2890.476 61291.891
##### Totals: #####
Bytes(MEG) realtime(s) Avg Frame Size Throughput(MB/s)
================ =========== ============== ================
503.877392 30.069 3966.649 16.757
Throughput(Buffers/s) Cycles/Byte Buffers
===================== =========== =============
268.118 395.755 8062.038
DPCs(count/s) Pkts(num/DPC) Intr(count/s) Pkts(num/intr)
============= ============= =============== ==============
81.845 54.124 5762.646 0.769
Packets Sent Packets Received Retransmits Errors Avg. CPU %
============ ================ =========== ====== ==========
7198 133199 2 6 15.137
We have a similar environment that differs only slightly: 9 HP DL380 servers with Connect-X3 cards connected to an IS5022 switch. That environment performs as expected:
c:\temp\NTttcp-v5.31\x64>ntttcp.exe -s -m 8,*,192.168.84.10 -l 128k -a 2 -t 30
Copyright Version 5.31
Network activity progressing...
Thread Time(s) Throughput(KB/s) Avg B / Compl
====== ======= ================ =============
0 30.000 469060.267 131072.000
1 30.000 359970.133 131072.000
2 30.000 446084.267 131072.000
3 30.000 437909.333 131072.000
4 30.000 348608.000 131072.000
5 30.000 387993.600 131072.000
6 30.000 348654.933 131072.000
7 30.000 357444.267 131072.000
##### Totals: #####
Bytes(MEG) realtime(s) Avg Frame Size Throughput(MB/s)
================ =========== ============== ================
92452.875000 30.000 4037.074 3081.762
Throughput(Buffers/s) Cycles/Byte Buffers
===================== =========== =============
24654.100 0.614 739623.000
DPCs(count/s) Pkts(num/DPC) Intr(count/s) Pkts(num/intr)
============= ============= =============== ==============
51664.567 1.719 72494.233 1.225
Packets Sent Packets Received Retransmits Errors Avg. CPU %
============ ================ =========== ====== ==========
24013397 2663789 15 6 6.892
The only major difference between the two environments is the switch. I'm pretty sure the SX6036 is configured correctly, but something must be wrong if we are getting a throughput of 16MB/s compared with 3081MB/s!
Any help on this issue would be much appreciated. I can provide switch config and more details if required.
Thanks,
Zak
I really need some expertise here:
I have two Windows 10 machines with two MHQH19B-XTR 40Gbit adapters and a QSFP cable in between. The subnet manager is OpenSM.
The connection should deliver about 32Gbit/s over the LAN. In reality I only get about 5Gbit/s, so clearly something is very wrong.
C:\Program Files\Mellanox\MLNX_VPI\IB\Tools>iblinkinfo
CA: E8400:
0x0002c903004cdfb1 2 1[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 1 1[ ] "IP35" ( )
CA: IP35:
0x0002c903004ef325 1 1[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==> 2 1[ ] "E8400" ( )
I tested my IPoIB with a program called lanbench and nd_read_bw:
nd_read_bw -a -n 100 -C 169.254.195.189
#qp #bytes #iterations MR [Mmps] Gb/s CPU Util.
0 512 100 0.843 3.45 0.00
0 1024 100 0.629 5.15 0.00
0 2048 100 0.313 5.13 0.00
0 4096 100 0.165 5.39 0.00
0 8192 100 0.083 5.44 0.00
0 16384 100 0.042 5.47 0.00
0 32768 100 0.021 5.47 100.00
...it stays at 5.47 after that, with CPU utilization at 100%.
The processor is an Intel Core i7-4790K, so it should not be at 100%. According to Task Manager, only one core is actively used.
Firmware, Drivers, Windows 10 are up to date.
My goal is to get the fastest possible file sharing between two Windows 10 machines.
What could be the problem here and how do I fix it?
Combining IB fabrics is quite common, and sharing storage is often the reason. There are enough detailed considerations that you should consult with your local Mellanox sales engineer.
However, there are some general points that may be helpful.
Although not suggested in the original question, it is tempting to simply connect a few cables from the lower (L1) switches on Cluster A to the L1 switches on Cluster B, but this is not recommended. It may seem to work, but it can have unexpected side effects. For example, when using Up/Down routing, if the spine switches of Cluster A are declared as the Root switches, the L1 switch of Cluster B that connects to Cluster A is treated as being 'below' the L1 switches in Cluster A. The spine switches of Cluster B become even 'lower', and the remaining L1 switches of Cluster B are lower still because they are farthest from the roots. Effectively, one L1 switch from Cluster B (the one connected to Cluster A) is closer to Cluster A. If the storage is in Cluster A, the storage latency from that portion of Cluster B will be two switch hops less than from the rest of Cluster B.
Now assume that for some reason, e.g. resiliency, two L1 switches on Cluster B are connected to two switches on Cluster A. Call the Cluster B L1 switches 'X' and 'Y'. If Cluster A is the root for Up/Down routing, switches X and Y will be closer to the root switches than the other L1 switches in Cluster B. Now, if a node on switch X talks to a node on switch Y, the traffic will be routed through the spine of Cluster A, via Cluster A L1 switches, not through the spine of Cluster B. The latency between switches X and Y is now higher, by two switch hops, than the latency between the other L1 switches in Cluster B. For compute nodes this is probably not a good outcome.
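To make the 'declared as the Root switches' part concrete: with OpenSM, the Up/Down roots are normally listed explicitly in a GUID file, roughly like the following (the file path and GUID are placeholders, and the exact option names should be checked against your OpenSM version):
opensm --routing_engine updn --root_guid_file /etc/opensm/root_guids.txt
where root_guids.txt contains the node GUIDs of the Cluster A spine switches, one per line (e.g. 0x0002c902004a1234).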
Another point: connecting nodes (e.g. fileservers) to spine switches is generally discouraged. Such nodes are often the reason that combining two IB fabrics requires a more detailed analysis, and the analysis depends on the traffic patterns to and from those nodes. To address the original question more specifically: cross-connecting between the spines of two clusters is possible but not common, and it may not be possible with nodes connected directly to any spine. There are two approaches that are more common. One approach is to connect the spine(s) of one cluster to one or more L1 switches of the other cluster, i.e. connect Cluster B as a sub-tree of Cluster A and form a 4-tier fabric. Another approach is to connect the spines of Cluster B as though they were L1 switches of Cluster A. In both approaches, the key is to do this effectively with only a handful of cables.
Let us know if you prefer to take this offline with a local Mellanox resource.
Hi Bruce,
I have the same problem with an SX6018 switch from eBay, because it is an EMC-branded unit. I do have a current MLNX-OS image for that switch, but I don't know any way to bring it up. So if you have an idea, I would be able to share my image.
Full of hope,
Herbie
I own an IS5022 8-port 40Gb/s switch. Can it use a QSFP fiber optic cable, or does it have to be copper? Just want to make sure before I buy a fiber optic cable.
Thank you!
Yes