Channel: Mellanox Interconnect Community: Message List

Re: krping (4.7 kernel version) crashing with mlx5_core (CX415A), it is passing with mlx4_core(CX354A)


Hi Yonatan,

 

Whatever logs are there are dmesg outputs.


Windows RDMA storage software


Is there any RDMA-based storage software available for Windows Server 2008 R2?

Re: ib_write_bw error connecting to server with rdma_cm option


Thanks Talat, your suggestion to connect by IP was correct. I was trying with the wrong NIC IP; using the IB interface's IP worked.
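
For reference, the working invocation looked roughly like this (a sketch; -R selects rdma_cm in the perftest tools, and <server-ib-ip> stands for the IP configured on the server's IB interface):

On the server: ib_write_bw -R

On the client: ib_write_bw -R <server-ib-ip>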

Re: Where can I find the firmware for ConnectX-3 MT27500 v2.34.5000 for DEL1100001019


Perfect, thank you!

 

Do you know if it's possible to get an older version? (Just in case... )

Re: Where can I find the firmware for ConnectX-3 MT27500 v2.34.5000 for DEL1100001019


Sure.

In addition, you can save the current FW image that is installed on the device.

See the mstflint or flint help page (the ri command):

ri   <out-file>                            : Read the fw image on the flash.

 

For example:

mstflint -d <lspci-device> ri image.bin

or

flint -d <mst-device> ri image.bin
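
If you later need to restore a saved image, it can be burned back with the burn command (a sketch along the same lines; double-check the target device before burning):

mstflint -d <lspci-device> -i image.bin burn

or

flint -d <mst-device> -i image.bin burn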

Re: Dfms high rate is not supported and lots of others


Hi Nikolay,

Could you please send the operating system, FW, and driver version of the device?

FW:

flint -d <mst-device> q

 

Driver:

ofed_info -s

or

modinfo mlx4_core
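
For the operating system details, standard commands such as the following are usually enough (assuming a typical Linux distribution):

cat /etc/os-release

uname -r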

Re: How Can I change my group members?


Re: OFED/IBDUMP for ConnectX-4


Thought I would add my question here as well, as I'm having the same issue:

I just updated to MLNX_OFED_LINUX-3.4-1.0.0.0 today (was using 3.3) and I can't get ibdump to work on my ConnectX-4 Lx card. 

Are the ConnectX-4 cards going to be supported by ibdump?   Any pointers/suggestions would be appreciated!

 

Tcpdump doesn't seem to correctly decode RoCE traffic - well, it doesn't decode it like this post shows: https://community.mellanox.com/docs/DOC-2416

 

This is what I see when I do tcpdump:

root@ceb-ubu14:~# tcpdump -i eth2 -vv > ~/rdma_traffic.txt

 

10:51:15.571909 24:8a:07:11:4f:91 (oui Unknown) > 24:8a:07:0d:d6:d0 (oui Unknown), ethertype Unknown (0x8915), length 334:

        0x0000:  6000 0000 0118 1b01 0000 0000 0000 0000  `...............

        0x0010:  0000 ffff 0c0c 0c02 0000 0000 0000 0000  ................

        0x0020:  0000 ffff 0c0c 0c01 6440 ffff 0000 0001  ........d@......

        0x0030:  0000 056e 8001 0000 0000 0001 0107 0203  ...n............

        0x0040:  0000 0000 0000 0004 34ed 641e 0010 0000  ........4.d.....

        0x0050:  0000 0000 1e64 ed34 0000 0000 0000 0000  .....d.4........

        0x0060:  0106 4853 248a 0703 0011 4f91 0000 0000  ..HS$.....O.....

        0x0070:  0000 0000 0002 7900 0000 0000 0000 00a0  ......y.........

        0x0080:  79e8 c5a7 ffff 37f0 0000 0000 0000 0000  y.....7.........

        0x0090:  0000 0000 0000 ffff 0c0c 0c02 0000 0000  ................

        0x00a0:  0000 0000 0000 ffff 0c0c 0c01 0000 0003  ................

        0x00b0:  0001 0898 0000 0000 0000 0000 0000 0000  ................

        0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................

        0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................

        0x00e0:  0040 c1ef 0000 0000 0000 0000 0000 0000  .@..............

        0x00f0:  0c0c 0c02 0000 0000 0000 0000 0000 0000  ................

        0x0100:  0c0c 0c01 0000 0000 0000 0000 0000 0000  ................

        0x0110:  0000 0000 0000 0000 0000 0000 0000 0000  ................

        0x0120:  0000 0000 0000 0000 0000 0000 0000 0000  ................

        0x0130:  0000 0000 0000 0000 0000 0000 640c f1db  ............d...

10:51:15.580920 24:8a:07:0d:d6:d0 (oui Unknown) > 24:8a:07:11:4f:91 (oui Unknown), ethertype Unknown (0x8915), length 334:

...

 

My Setup:

root@ceb-ubu14:~# uname -a

Linux ceb-ubu14 4.4.0-36-generic #55~14.04.1-Ubuntu SMP Fri Aug 12 11:49:30 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

 

root@ceb-ubu14:~# ibdev2netdev

mlx5_0 port 1 ==> eth2 (Up)

 

root@ceb-ubu14:~# ifconfig eth2

eth2      Link encap:Ethernet  HWaddr 24:8a:07:0d:d6:d0

          inet addr:12.12.12.1  Bcast:12.12.12.255  Mask:255.255.255.0

          inet6 addr: fe80::268a:7ff:fe0d:d6d0/64 Scope:Link

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:10 errors:0 dropped:0 overruns:0 frame:0

          TX packets:71 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:1237 (1.2 KB)  TX bytes:10660 (10.6 KB)

         

root@ceb-ubu14:~# ethtool -i eth2

driver: mlx5_core

version: 3.4-1.0.0 (25 Sep 2016)

firmware-version: 14.16.1006

bus-info: 0000:05:00.0

supports-statistics: yes

supports-test: yes

supports-eeprom-access: no

supports-register-dump: no

supports-priv-flags: yes

 

root@ceb-ubu14:~# ethtool --show-priv-flags eth2

Private flags for eth2:

hw_lro             : on

sniffer            : on

dcbx_handle_by_fw  : off

qos_with_dcbx_by_fw: off

rx_cqe_moder       : off

 

root@ceb-ubu14:~# ibdump -d mlx5_0

Initiating resources ...

searching for IB devices in host

Port active_mtu=1024

MR was registered with addr=0x16a7000, lkey=0xbf82, rkey=0xbf82, flags=0x1

------------------------------------------------

Device                         : "mlx5_0"

Physical port                  : 1

Link layer                     : Ethernet

Dump file                      : sniffer.pcap

Sniffer WQEs (max burst size)  : 4096

------------------------------------------------

Failed to set port sniffer1: command interface bad param

 

 

Thanks,

Curt

Re: OFED/IBDUMP for ConnectX-4


Hi Curt, the last time I observed it working was on a combination of CentOS 7.2 + MOFED 3.3-1.0.4.0.

 

It indeed worked as advertised: set the sniffer flag using the patched ethtool (on my system it is installed at /opt/mellanox/ethtool/sbin/ethtool), and tcpdump -i <interface> -w <file> does the job. The latest stable Wireshark does not decode the traffic well for me, but the unstable build does (I browse captured files on Windows using Wireshark 2.1.1).
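
In case it helps, the sequence on my side was roughly the following (a sketch; eth2 is just the example interface name from your output, and the capture file name is arbitrary):

/opt/mellanox/ethtool/sbin/ethtool --set-priv-flags eth2 sniffer on

tcpdump -i eth2 -w roce_capture.pcap

The resulting pcap is then opened in the Wireshark development build for decoding.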

 

Regards,

 

   Philip

Does ubuntu 14.04 inbox-driver support connectx-4 ?


Hi,

I am trying to use a ConnectX-4 EN card with the Ubuntu 14.04 inbox driver,

but it does not seem to be supported.

 

I already know that ConnectX-3 cards are supported by the Ubuntu inbox driver.

 

Has anyone tried this,

or does anyone know which version of Ubuntu supports ConnectX-4 cards with the inbox driver?

 

Thanks.

 

Regards

 

Taiyoung.

NVMe native vs NVMe over fabrics on ConnectX-3

Re: OFED/IBDUMP for ConnectX-4


Thanks for the response, Philip!

 

I can get the dumps, they just don't look too pretty and tcpdump doesn't seem to understand the RoCE traffic.  I was hoping to just get timestamps on each RDMA transfer to get more latency insight.  I might grab the unstable build of Wireshark, like you suggested, to see if that can give me what I'm looking for.


Thanks again,

Curt

OFED 2.4-1.0.4 for EL7.2?


Is there [going to be] an OFED 2.4-1.0.4 release for EL7.2? I'm not currently in a position to upgrade to one of the 3-series and I've got a Knights Landing to play with.

SX6036 IPoIB speed issue


Hi,

 

I was hoping that someone would be able to help me.

I have 15 HP DL380 Gen9 servers running Windows Server 2012 R2, each with a dual-port HP ConnectX-3 VPI card, connected to a Mellanox SX6036 switch.

The line speed is correctly displayed as 32 Gbps (QDR), but we are not getting anywhere near that performance. Real-world RDMA speeds seem to max out at 25 Gbps (which is OK but could be better), but the maximum speed we seem to get with IPoIB is 1 Gbps. Below is an ntttcp test:

 

c:\temp\NTttcp-v5.31\x64>ntttcp.exe -r -m 8,*,10.167.255.111  -rb 2M -a 16 -t 30

 

Copyright Version 5.31

Network activity progressing...

 

 

Thread  Time(s) Throughput(KB/s) Avg B / Compl

======  ======= ================ =============

     0   30.062         1896.880     65536.000

     1   30.046         1414.365     63434.262

     2   30.124         1459.567     64410.918

     3   30.093         2473.399     65536.000

     4   30.062         1847.914     65536.000

     5   30.062         2452.531     65365.777

     6   30.047         2726.406     63310.495

     7   30.047         2890.476     61291.891

 

 

#####  Totals:  #####

 

 

Bytes(MEG)    realtime(s) Avg Frame Size Throughput(MB/s)

================ =========== ============== ================

      503.877392 30.069 3966.649           16.757

 

 

Throughput(Buffers/s) Cycles/Byte       Buffers

===================== =========== =============

268.118     395.755      8062.038

 

 

DPCs(count/s) Pkts(num/DPC)   Intr(count/s) Pkts(num/intr)

============= ============= =============== ==============

81.845 54.124 5762.646          0.769

 

 

Packets Sent Packets Received Retransmits Errors Avg. CPU %

============ ================ =========== ====== ==========

7198 133199 2      6     15.137

 

 

We have a similar environment that is slightly different: 9 HP DL380 servers with ConnectX-3 cards connected to an IS5022 switch. This environment performs as expected:

 

c:\temp\NTttcp-v5.31\x64>ntttcp.exe -s -m 8,*,192.168.84.10 -l 128k -a 2 -t 30

Copyright Version 5.31

Network activity progressing...

 

 

Thread  Time(s) Throughput(KB/s) Avg B / Compl

======  ======= ================ =============

     0 30.000       469060.267 131072.000

     1 30.000       359970.133 131072.000

     2 30.000       446084.267 131072.000

     3 30.000       437909.333 131072.000

     4 30.000       348608.000 131072.000

     5 30.000       387993.600 131072.000

     6 30.000       348654.933    131072.000

     7 30.000       357444.267 131072.000

 

 

#####  Totals:  #####

 

 

   Bytes(MEG)    realtime(s) Avg Frame Size Throughput(MB/s)

================ =========== ============== ================

92452.875000 30.000 4037.074         3081.762

 

 

Throughput(Buffers/s) Cycles/Byte       Buffers

===================== =========== =============

24654.100       0.614 739623.000

 

 

DPCs(count/s) Pkts(num/DPC)   Intr(count/s) Pkts(num/intr)

============= ============= =============== ==============

51664.567 1.719 72494.233          1.225

 

 

Packets Sent Packets Received Retransmits Errors Avg. CPU %

============ ================ =========== ====== ==========

24013397 2663789 15      6      6.892

 

 

The only major difference between the two environments is the switch. I'm pretty sure the SX6036 is configured correctly, but there must be something wrong if we are getting a throughput of 16 MB/s compared with 3081 MB/s!

 

Any help on this issue would be much appreciated. I can provide switch config and more details if required.

 

Thanks,

Zak


ConnectX-2 IPoIB Performance Help


I really need some expertise here:

 

I have two Windows 10 machines with two MHQH19B-XTR 40 Gbit adapters and a QSFP cable between them. The subnet manager is OpenSM.

 

The connection should deliver about 32 Gbit/s over the LAN. In reality I only get 5 Gbit/s of performance, so clearly something is very wrong.

C:\Program Files\Mellanox\MLNX_VPI\IB\Tools>iblinkinfo

CA: E8400:

      0x0002c903004cdfb1      2    1[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       1    1[  ] "IP35" ( )

CA: IP35:

      0x0002c903004ef325      1    1[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       2    1[  ] "E8400" ( )

 

I tested my IPoIB performance with a program called lanbench and with nd_read_bw:

nd_read_bw -a -n 100 -C 169.254.195.189

#qp #bytes #iterations    MR [Mmps]     Gb/s     CPU Util.

0   512       100          0.843        3.45     0.00

0   1024      100          0.629        5.15     0.00

0   2048      100          0.313        5.13     0.00

0   4096      100          0.165        5.39     0.00

0   8192      100          0.083        5.44     0.00

0   16384     100          0.042        5.47     0.00

0   32768     100          0.021        5.47     100.00

...it stays at 5.47 after that, with CPU utilization at 100%.

The processor is an Intel Core i7-4790K, so it should not be at 100%. According to Task Manager, only one core is actively used.

Firmware, Drivers, Windows 10 are up to date.
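
In case it is relevant, the receive-side scaling settings of the adapter can be inspected with PowerShell (a sketch; "Ethernet 4" is a placeholder for the IPoIB adapter name on my machines):

# Run in an elevated PowerShell prompt; the adapter name below is a placeholder

Get-NetAdapterRss -Name "Ethernet 4"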

 

My goal is to get the fastest possible file sharing between two Windows 10 machines.

What could be the problem here and how do I fix it?

Re: How to connect two (small) fat tree networks


Combining IB fabrics is quite common, and sharing storage is often the reason.  There are enough detailed considerations that you should consult with your local Mellanox sales engineer. 

However, there are some general points that may be helpful.

 

Although not suggested in the original question, it is tempting to simply connect a few cables from the lower (L1) switches on Cluster A to the L1 switches on Cluster B, but this is not recommended. It may seem to work, but it can have unexpected side effects. For example, when using Up/Down routing, if the spine switches of Cluster A are declared as the root switches, the L1 switch of Cluster B that connects to Cluster A is treated as being 'below' the L1 switches in Cluster A. The spine switches of Cluster B become even 'lower', and the remaining L1 switches of Cluster B are lower still because they are farthest from the roots. Effectively, one L1 switch from Cluster B (the one connected to Cluster A) is closer to Cluster A. If the storage is in Cluster A, the storage latency from one portion of Cluster B will be two switch hops less than from the rest of Cluster B.

Now assume that for some reason, e.g. resiliency, two L1 switches on Cluster B are connected to two switches on Cluster A. Call the Cluster B L1 switches 'X' and 'Y'. If Cluster A is the root for Up/Down routing, switches X and Y will be closer to the root switches than the other L1 switches in Cluster B. Now, if a node on switch X talks to a node on switch Y, the traffic will be routed through the spine of Cluster A, via Cluster A's L1 switches, not through the spine of Cluster B. The latency between switches X and Y is now higher, by two switch hops, than the latency between other L1 switches in Cluster B. For compute nodes this is probably not a good outcome.

 

Another point: connecting nodes, e.g. fileservers, to spine switches is generally discouraged. Such nodes are often the reason that combining two IB fabrics requires a more detailed analysis, and the analysis depends on the traffic patterns to and from those nodes. To address the original question more specifically: cross-connecting between the spines of two clusters is possible but not common, and it may not be possible with nodes connected directly to any spine. There are two approaches that are more common. One approach is to connect the spine(s) of one cluster to one or more L1 switches of the other cluster, i.e. connect Cluster B as a sub-tree of Cluster A and form a 4-tier fabric. Another approach is to connect the spines of Cluster B as though they were L1 switches of Cluster A. In both approaches, the key is to do this effectively with only a handful of cables.
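
As a side note, since much of the above hinges on which switches are declared as roots for Up/Down routing: with OpenSM as the subnet manager this is typically declared via a root GUID file, roughly as follows (a sketch; the GUID values are placeholders):

# In opensm.conf (or via the equivalent command-line options)

routing_engine updn

root_guid_file /etc/opensm/root_guids.conf

# /etc/opensm/root_guids.conf: one spine-switch node GUID per line (placeholder values)

0x0002c902004a0001

0x0002c902004a0002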

 

Let us know if you prefer to take this offline with a local Mellanox resource.

 

 

Re: SX6018 ?? Please Help


Hi Bruce,

I have the same problems with an SX6018 switch from eBay, because it is an EMC-branded unit. I do have a current image of MLNX-OS for that switch, but I don't know of any way to bring it up. So if you have an idea, I would be able to share my image.

Full of hope,

Herbie

Fiber optic cable?


I own an IS5022 8-port 40 Gb/s switch. Can it use a QSFP fiber optic cable, or does it have to be copper? I just want to make sure before I buy a fiber optic cable.

 

Thank you!

Re: Fiber optic cable?
