Channel: Mellanox Interconnect Community: Message List
Viewing all 6227 articles

Re: Frequent TCP retransmissions with ConnectX-3 EN


> FC = Flow control, assuming it is enabled on the NIC (ethtool -a ethX) it is needed also to be enabled on  every switch port on your Arista.

 

Got it.  It is enabled on the NIC:

 

$ ethtool -a p6p1

Pause parameters for p6p1:

Autonegotiate: off

RX: on

TX: on

 

But not enabled on the switch:

 

# show interfaces ethernet 14 flowcontrol

Port       Send FlowControl  Receive FlowControl  RxPause       TxPause     

           admin    oper     admin    oper                                  

---------  -------- -------- -------- --------    ------------- -------------

Et14       off      off      off      off         0             0           

 

Do the two devices need to agree on flow control before the NIC will try to pause?  ethtool -S shows all the "rx_pause" counters are set to 0 (zero).
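A quick way to re-check the pause counters after changing the switch configuration is to filter the `ethtool -S` output for non-zero pause statistics. This is only a sketch: the function name `nonzero_pause_counters` is made up here, and the exact counter names (for example `rx_pause` or `tx_pause`) depend on the driver version, so matching on the substring "pause" is an assumption.

```shell
# Print every "pause" counter with a non-zero value from `ethtool -S` style
# output read on stdin. Assumes the usual "name: value" line layout.
nonzero_pause_counters() {
    awk -F: '
        tolower($1) ~ /pause/ && ($2 + 0) != 0 {
            gsub(/[ \t]/, "", $1)   # strip indentation from the counter name
            gsub(/[ \t]/, "", $2)   # strip padding from the value
            print $1 "=" $2
        }'
}

# Example (interface name taken from the post above):
#   ethtool -S p6p1 | nonzero_pause_counters
```

No output means every pause counter is still zero.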

 

> Assuming you're using OFED 2.4 - Can you copy paste dmesg output when the low lever driver loads (mlx4_core) ?

 

So I have a variety of driver versions that seem to have this issue.  I plan to get everything to 2.4 shortly.  Here is the dmesg output from one server with 2.4:

 

$ dmesg | grep mlx

[   17.650462] mlx4_core: Mellanox ConnectX core driver v2.4-1.0.4 (Apr 24 2015)

[   17.737351] mlx4_core: Initializing 0000:02:00.0

[   23.820087] mlx4_core 0000:02:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s

[   23.908150] mlx4_core 0000:02:00.0: PCIe link width is x4, device supports x4

[   26.069942] mlx4_core 0000:02:00.0: irq 86 for MSI/MSI-X

[   26.069947] mlx4_core 0000:02:00.0: irq 87 for MSI/MSI-X

[   26.069951] mlx4_core 0000:02:00.0: irq 88 for MSI/MSI-X

[   26.069955] mlx4_core 0000:02:00.0: irq 89 for MSI/MSI-X

[   26.069958] mlx4_core 0000:02:00.0: irq 90 for MSI/MSI-X

[   26.069962] mlx4_core 0000:02:00.0: irq 91 for MSI/MSI-X

[   26.069966] mlx4_core 0000:02:00.0: irq 92 for MSI/MSI-X

[   26.069970] mlx4_core 0000:02:00.0: irq 93 for MSI/MSI-X

[   26.069974] mlx4_core 0000:02:00.0: irq 94 for MSI/MSI-X

[   26.069978] mlx4_core 0000:02:00.0: irq 95 for MSI/MSI-X

[   26.069982] mlx4_core 0000:02:00.0: irq 96 for MSI/MSI-X

[   26.069985] mlx4_core 0000:02:00.0: irq 97 for MSI/MSI-X

[   26.069989] mlx4_core 0000:02:00.0: irq 98 for MSI/MSI-X

[   26.114201] mlx4_en: Mellanox ConnectX HCA Ethernet driver v2.4-1.0.4 (Apr 24 2015)

[   26.205113] mlx4_en 0000:02:00.0: registered PHC clock

[   26.296333] mlx4_en 0000:02:00.0: Activating port:1

[   26.402739] mlx4_en: eth0: Using 96 TX rings

[   26.494296] mlx4_en: eth0: Using 8 RX rings

[   26.586020] mlx4_en: eth0: Initializing port

[   26.679114] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.4-1.0.4 (Apr 24 2015)

[   26.775863] mlx4_core 0000:02:00.0: mlx4_ib_add: allocated counter index 1 for port 1

[   28.825562] mlx4_en: p6p1:   frag:0 - size:1522 prefix:0 stride:1536

[   29.545220] mlx4_core 0000:02:00.0: mlx4_ib: Port 1 logical link is up

[   29.545232] mlx4_en: p6p1: Link Up

 

Also:

 

$ ethtool --show-priv-flags p6p1

Private flags for p6p1:

pm_qos_request_low_latency    : off

mlx4_rss_xor_hash_function    : off

mlx4_flow_steering_ethernet_l2: on

mlx4_flow_steering_ipv4       : on

mlx4_flow_steering_tcp        : on

mlx4_flow_steering_udp        : on

qcn_disable_32_14_4_e         : off

blueflame                     : on

 

> and you can also try optimizing the flow steering mode - just create a mlnx.conf file under /etc/modprobe.d/ and put the following line :

>

>   options mlx4_core log_num_mgm_entry_size=-7

 

Seems like for most use cases, this would be a good thing to do.  Any reason it isn't the default?  I'll give it a try.  Thanks again for your help.  -- Bud
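For reference, the suggestion quoted above amounts to something like the following. Treat it as a sketch rather than a tested procedure: the `CONF_DIR` variable and the `write_mlnx_conf` helper are introduced here only so the snippet can be tried against a scratch directory, and reading the value back from `/sys/module/mlx4_core/parameters/` assumes the driver exposes that parameter through sysfs. The option only takes effect the next time mlx4_core is loaded.

```shell
# Write the flow-steering option suggested above into a modprobe config file.
# CONF_DIR is a hypothetical override for testing; the real location is /etc/modprobe.d.
CONF_DIR="${CONF_DIR:-/etc/modprobe.d}"
write_mlnx_conf() {
    printf 'options mlx4_core log_num_mgm_entry_size=-7\n' > "$CONF_DIR/mlnx.conf"
}

# After reloading the driver (or rebooting), the active value can be checked,
# assuming the parameter is visible in sysfs:
#   cat /sys/module/mlx4_core/parameters/log_num_mgm_entry_size
```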


Problems with ConnectX2(rev:b0)


I have a Mellanox ConnectX-2 Ethernet adapter and I am having some issues receiving packets on it. I am sending simple layer 2 VLAN-tagged packets. I can see that the packets were sent on the wire to the Ethernet card, but nothing is received on the port. Where should I start debugging? I looked at all the software tools, but have no clue so far. Here are the details of my card:

 

 

06:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)

  Subsystem: Mellanox Technologies Device 0018

08:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)

  Subsystem: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s]

21:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)

  Subsystem: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s]

 

 

Any help would be appreciated.

 

Thanks

Srini

 

Re: Frequent TCP retransmissions with ConnectX-3 EN


Erez, in looking at the VMA GitHub repo

 

Commits · Mellanox/libvma · GitHub

 

it seems like there has been some work on ACK timers and retransmits between VMA 6.8.3 (which ships with OFED 2.4) and VMA 6.8.4.  Should I consider installing a newer version of VMA on top of my OFED?

 

-- Bud

FreeBSD 10.1 ib0 does not bring link up at boot


I have a strange issue. I've set IB mode on FreeBSD driver 2.1.6 (same behavior on 2.1.5) and set an IP address for ib0. The issue I'm facing is that after a reboot, I need to run ifconfig ib0 down && ifconfig ib0 up to actually get link on ib0. I also must do this after "mlx4_core0: mlx4_ib: Port 1 logical link is up" appears.

 

See here:

#dmesg
...
mlx4_core0: <mlx4_core> mem 0xfb200000-0xfb2fffff,0xf9000000-0xf97fffff irq 40 at device 0.0 on pci4
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
ugen1.2: <vendor 0x8087> at usbus1
uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus1
ugen0.2: <vendor 0x8087> at usbus0
uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus0
uhub3: 6 ports with 6 removable, self powered
uhub2: 8 ports with 8 removable, self powered
ugen0.3: <Winbond Electronics Corp> at usbus0
ukbd0: <Winbond Electronics Corp Hermon USB hidmouse Device, class 0/0, rev 1.10/0.01, addr 3> on usbus0
kbd0 at ukbd0
mlxen0: Ethernet address: e4:1d:2d:0d:bf:60
mlx4_en: mlx4_core0: Port 1: Using 12 TX rings
mlx4_en: mlx4_core0: Port 1: Using 8 RX rings
mlx4_en: mlxen0: Using 12 TX rings
mlx4_en: mlxen0: Using 8 RX rings
mlx4_en: mlxen0: Initializing port
mlx4_core0: mlx4_ib_add: allocated counter index 1 for port 1
SMP: AP CPU #1 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #9 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #8 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #11 Launched!
SMP: AP CPU #10 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
Timecounter "TSC" frequency 2000046460 Hz quality 1000
Trying to mount root from zfs:zroot/ROOT/default []...
ib0: Attached to mlx4_0 port 1
ums0: <Winbond Electronics Corp Hermon USB hidmouse Device, class 0/0, rev 1.10/0.01, addr 3> on usbus0
ums0: 3 buttons and [Z] coordinates ID=0
igb0: link state changed to UP
lagg0: link state changed to UP
vlan200: link state changed to UP
igb1: link state changed to UP
igb2: link state changed to UP
igb3: link state changed to UP
mlx4_core0: mlx4_ib: Port 1 logical link is up



root@fat:~ # ifconfig ib0
ib0: flags=8043<UP,BROADCAST,RUNNING,MULTICAST> metric 0 mtu 4092        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>        lladdr 0.0.0.48.fe.80.0.0.0.0.0.0.e4.1d.2d.3.0.d.bf.61        inet 10.105.0.230 netmask 0xffffff00 broadcast 10.105.0.255        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
root@fat:~ # ping 10.105.0.225
PING 10.105.0.225 (10.105.0.225): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
^C
--- 10.105.0.225 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@fat:~ # ifconfig ib0 down && ifconfig ib0 up
root@fat:~ # ping 10.105.0.225
PING 10.105.0.225 (10.105.0.225): 56 data bytes
64 bytes from 10.105.0.225: icmp_seq=0 ttl=64 time=0.208 ms
^C
--- 10.105.0.225 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.208/0.208/0.208/0.000 ms
root@fat:~ # ifconfig ib0
ib0: flags=8043<UP,BROADCAST,RUNNING,MULTICAST> metric 0 mtu 4092        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>        lladdr 0.0.0.48.fe.80.0.0.0.0.0.0.e4.1d.2d.3.0.d.bf.61        inet 10.105.0.230 netmask 0xffffff00 broadcast 10.105.0.255        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

What I have in config:

/boot/loader.conf
mlx4_core="YES"
ibcore_load="YES"
mlxen_load="YES"
mlx4ib_load="YES"
ipoib_load="YES"


/etc/sysctl.conf
sys.device.mlx4_core0.mlx4_port1=ib

/etc/rc.conf
ifconfig_ib0="inet 10.105.0.230 netmask 255.255.255.0 up mtu 4092"

No DHCP lease on boot


Using a ConnectX-3 with a 10Gb Ethernet port, I installed Debian 8 (jessie). Booting is fine until I install the Mellanox OFED stack (3.0-1.0.1), at which point I no longer get a DHCP lease at boot.  Doing an 'ifdown' followed by an 'ifup' on the interface fixes the issue.

Re: MLNX_OFED install failed at Ubuntu 14.04.2


Just an FYI, MLNX_OFED_LINUX-3.0-1.0.1-ubuntu14.04-x86_64 seems to work fine with Ubuntu 14.04.2.  -- Bud

I have windows 2012R2 OS on Dell R710.When i connect mellanox nics and cables through a Dell switch with 40G ports, It shows network cable unplugged.


I have Windows 2012 R2 on a Dell R710. When I connect Mellanox NICs and cables through a Dell switch with 40G ports, it shows the network cable as unplugged.

flexboot versions?


Hello all,

 

I am trying to install flexboot on two HT26428's. The most recent version of flexboot I can install for my FW version is 3.4.124, but there seem to be 3 different versions (VPI, ethernet, and infiniband). I am using them as infiniband ports, but I was just wondering if I need just the IB package or the IB package and the VPI package?

 

Thanks,

Trevor


Re: I have windows 2012R2 OS on Dell R710.When i connect mellanox nics and cables through a Dell switch with 40G ports, It shows network cable unplugged.


If you're running things in Infiniband mode, is there a subnet manager running on the network?

 

This kind of sounds like what happens when you run without a subnet manager.  If you're not sure, try running the OpenSM service and see if the network cables suddenly start showing up as connected.

Re: I have windows 2012R2 OS on Dell R710.When i connect mellanox nics and cables through a Dell switch with 40G ports, It shows network cable unplugged.


He is saying Dell switch, I don't think that it is InfiniBand.

 

A few questions:

Did you install the WinOF driver?

Could you make sure that the port is configured as Ethernet (if this is the case)?

Try replacing the cable or the switch port to make sure that there is no problem with the HW.

 

Thanks,

Ophir.

FDR with unmanaged switch


I have been trying to set up a new cluster using an 18-port MSX6015 switch and 9 systems with ConnectX-3 cards/built-in IB:

08:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

and 8X

02:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

All cables are new FDR cables.

They are running a newish (3.17.3) kernel, although I have 4.0.3 ready to go when I can reboot them.

They are running Debian with the current OFED, and even opensm 3.3.18.

All firmware has been brought up to the most recent posted one.

 

I have set

# Force PortInfo:LinkSpeedExtEnabled on ports

# If 0, don't modify PortInfo:LinkSpeedExtEnabled on port

# Otherwise, use value for PortInfo:LinkSpeedExtEnabled on port

# Values are (MgtWG RefID #4722)

#    1: 14.0625 Gbps

#    2: 25.78125 Gbps

#    3: 14.0625 Gbps or 25.78125 Gbps

#    30: Disable extended link speeds

#    Default 31: set to PortInfo:LinkSpeedExtSupported

#force_link_speed_ext 31

force_link_speed_ext 1

(have also tried 31)

 

# FDR10 on ports on devices that support FDR10

# Values are:

#    0: don't use fdr10 (no MLNX ExtendedPortInfo MADs)

#    Default 1: enable fdr10 when supported

#    2: disable fdr10 when supported

fdr10 1

 

in the opensm.conf

 

 

I still get a connection of 40 and iblinkinfo says:

4   10[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       3    1[  ] "MT25408 ConnectX Mellanox Technologies" ( Could be FDR10)

or

CA: MT25408 ConnectX Mellanox Technologies:

      0x002590fffff7b3c5     33    1[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       4   13[  ] "SwitchX -  Mellanox Technologies" ( Could be FDR10)

for all live ports.

Does anyone have any ideas of what I could be doing wrong?

Thanks.

 

 

 

Ok I have an update -

I found another posting on how to modify the firmware to be able to run back-to-back 56Gig IB. I did this with all the cards and the switch. I set both fdr10 and 56gig on in the init files and rebuilt the firmware. After reboots of everything, one of our models is running 25 minutes faster than before (was 3. hrs). However, ibportstate and ibstat still report that everything is running at 40. Also, iblinkinfo reports that it all "could be 56Gig".

Any ideas? Is it working and the reporting is just bad?
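One way to compare what each host reports is to pull the per-port rate out of `ibstat`. A minimal sketch, assuming the usual `ibstat` layout with a `Port N:` line followed by an indented `Rate:` line; the function name `port_rates` is made up for illustration.

```shell
# Print "port <n> rate <value>" for each port found in `ibstat` output on stdin.
port_rates() {
    awk '
        /^[ \t]*Port [0-9]+:/ { port = $2; sub(/:$/, "", port) }   # remember the current port number
        /^[ \t]*Rate:/        { print "port " port " rate " $2 }
    '
}

# Example:
#   ibstat | port_rates
```

A 4X FDR link would normally be expected to show a rate of 56 here, so a value of 40 on every host matches the ibstat and ibportstate readings described above.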

 

Message was edited by: David Warren 6/25/2015

Re: I have windows 2012R2 OS on Dell R710.When i connect mellanox nics and cables through a Dell switch with 40G ports, It shows network cable unplugged.


First, I would make sure you are using supported adapters, cables, and SFPs for both the Mellanox cards and the Dell switch. There are known issues when mixing certain types of SFPs and cables, and with certain kinds of Twinax.

 

If all the hardware is supported, please make sure the WinOF driver is installed and that the Mellanox port is configured to Ethernet (for VPI adapters).

 

I would then try directly connecting the NICs to each other to bypass the Dell switch. Try running some IO using direct connect to verify your cards and cables are working. If this works, the problem is either a compatibility issue between the SFPs/cables and the Dell switch, or the switch is configured incorrectly.

Mellanox OFED 3.0 failed to build in sles12


Build warnings/errors on sles12 occurred during %install.

Are these problems known? Is a patch already available?

mlnx-ofa_kernel 3.0 built fine on sles11sp3.

 

1)

extracting debug info from /usr/src/packages/BUILDROOT/mlnx-ofa_kernel-3.0-OFED.3.0.0.3.7.1.g3c1d583sgi712r1.sles12.x86_64/usr/src/ofa_kernel/default/compat/mlx_compat.ko

*** WARNING: identical binaries are copied, not linked:

        /usr/src/ofa_kernel/default/compat/mlx_compat.ko

   and  /lib/modules/3.12.39-47-default/updates/compat/mlx_compat.ko

 

2)

calling /usr/lib/rpm/brp-suse.d/brp-55-boot-scripts

E: File `openibd' without LSB header found in /usr/src/packages/BUILDROOT/mlnx-ofa_kernel-3.0-OFED.3.0.0.3.7.1.g3c1d583sgi712r1.sles12.x86_64/etc/init.d/

ERROR: found one or more broken init or boot scripts, please fix them.

       For more information about LSB headers please read the manual

       page of of insserv by executing the command `man 8 insserv'.

       If you don't understand this, mailto=werner@suse.de

 

3)

RPM build errors:

    bogus date in %changelog: Wed Sep 8 2008 Vladimir Sokolovsky <vlad@mellanox.co.il>

IBM PowerKVM: Configuring SR-IOV VFs on Different ConnectX-3 Ports


Hi

 

 

 

On an IBM PowerKVM server with a ConnectX-3 Pro card [FW: 2.33.5106], I tried loading mlx4_core as below:

 

modprobe mlx4_core port_type_array=2,2 num_vfs=8,0,0 probe_vf=8,0,0  log_num_mgm_entry_size=-1

 

This should ideally have created 8 VFs, probed on the 1st port only. But I am seeing the following in the ip link output:

 

ip link show enP5p1s0

62: enP5p1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000

    link/ether f4:52:14:3b:22:10 brd ff:ff:ff:ff:ff:ff

    vf 0 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 1 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 2 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 3 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 4 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 5 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 6 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 7 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

ip link show enP5p1s0d1

63: enP5p1s0d1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000

    link/ether f4:52:14:3b:22:11 brd ff:ff:ff:ff:ff:ff

    vf 0 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 1 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 2 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 3 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 4 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 5 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 6 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

    vf 7 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto

 

Here enP5p1s0d1 and enP5p1s0 are the interface names of the dual-port Mellanox ConnectX-3 Pro.

I have read the instructions at HowTo Configure SR-IOV VFs on Different ConnectX-3 Ports

I am seeing VFs on both ports, and none of them seems to be a dummy. As I understand from HowTo Configure SR-IOV VFs on Different ConnectX-3 Ports, dummy VF ports are assigned a MAC and VLAN of 0xff(s).

 

I could open a bug through the official channel if you think it's a genuine issue.
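To compare the two ports at a glance, the `vf` lines in the `ip link show` output can be counted per interface. A minimal sketch, with the helper name `count_vfs` invented for illustration; it only counts the lines the kernel prints and says nothing about whether a given VF is a placeholder.

```shell
# Count the "vf N ..." lines in `ip link show <dev>` output read on stdin.
count_vfs() {
    awk '/^[ \t]*vf [0-9]+ / { n++ } END { print n + 0 }'
}

# Example (interface names taken from the post above):
#   for dev in enP5p1s0 enP5p1s0d1; do
#       printf '%s: %s VFs\n' "$dev" "$(ip link show "$dev" | count_vfs)"
#   done
```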

Re: Connectx-3EN and Connectx-pro: SRIOV : eswitch related query


I did get a resolution. The ports of the ConnectX-3 Pro cards were not connected to the switch or cross-connected. By connecting the ports to an external switch I was able to get the intra-host communication among guests to work. Although I don't see a prerequisite that cites this requirement, the exercise I performed confirmed it.


VXLAN through linux bridge.How to attach veth0 to VM?

Compatibility of Mellanox cables with Cisco Catalyst.


Hi all! Has anyone run into the issue in the subject line? The problem is this: I plug an MC2209124-005 into a Cisco Catalyst 2960X (stacked)

1 52    WS-C2960X-48FPD-L  15.0(2)EX5            C2960X-UNIVERSALK9-M

2 52    WS-C2960X-48FPD-L  15.0(2)EX5            C2960X-UNIVERSALK9-M

and it does not work, although the vendor is visible on the Cisco and the registers are visible.

I plug a Cisco SFP-H10GB-CU3M into the same port (not an exact equivalent, it is 2 meters shorter, but it is what I have on hand) and everything works.

I plug the same MC2209124-005 into a Cisco Catalyst 3850 (not stacked yet)  1 56    WS-C3850-48T       03.06.00E         cat3k_caa-universalk9

and everything works.

Clearly the issue is in IOS; however, I have 2 questions:

1. What should I do:

1.1. replace it with a Cisco cable?

1.2. enter some completely undocumented Cisco commands?

1.3. reflash the Mellanox cable for Cisco?

2. Could this be because the switches are in a stack? To explain: I am thinking of buying a second Cisco Catalyst 3850 for a stack; what if the effect repeats?

Re: Compatibility of Mellanox cables with Cisco Catalyst.

Does Mellanox OFED-1.5.3.-3.0.0 support "MT26428 [ConnectX VPI PCIe 2.0 5 GT/IB QDR]" adapter?


Please reply asap.

 

 

Thanks,

Rupesh

Re: Compatibility of Mellanox cables with Cisco Catalyst.


Hi! Sorry, my English is bad. I connected the Cisco switches (models 2960X and 3850, see my first message) to a Mellanox network card (Mellanox ConnectX®-3 Pro EN network interface card, 10GbE, dual port SFP+, PCIe3.0 x8 8GT/s, tall bracket, RoHS R6 - PN MCX312A-XCBT).

Also, sorry for my mistake: PN MS3309124-005
