Channel: Mellanox Interconnect Community: Message List
Viewing all 6227 articles

Re: Connect two adapters without switch?

Hi,

If I understand correctly, you are trying to set up two servers with an InfiniBand connection between them. If this is not the case, you don't need OpenSM.

You can start by installing the driver package provided by Mellanox: http://www.mellanox.com/page/software_overview_ib

The package contains the driver and the OpenSM package.
Once you have the driver running, you need to start the OpenSM service on only one of the servers. Just start the service with the default parameters; that should get you going.

 

Along with the above driver, you will find the release notes - check that the OS you have is supported by this driver. Then, check out the user guide for specific commands explaining how to start the driver and the OpenSM process.

 

Hope it helps..


Re: Configuring inter-switch links between SX6036 QSFP ports

Hi Greg,

 

The InfiniBand architecture differs from Ethernet in that the protocol uses all the ISL connections together. In short, the Subnet Manager (SM) calculates and provisions static routes from every end point (HCA port) to every other end point. It typically does this using a set of rules that depend on the topology, and it configures the routes so that the network cannot form a loop. The configuration also aims to spread the routes as evenly as possible across all the available links to maximize bandwidth utilization within the fabric.

If a link goes down during the life of the network, the SM can identify the event (it usually receives a trap) and recalculates the routes using the remaining links. This is all done automatically; there is nothing to configure.

Lastly, the notion of "LACP" does not exist in InfiniBand; it is simply not needed.
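
As a toy illustration of the idea described above (it is not OpenSM's actual routing algorithm), the sketch below spreads destination LIDs round-robin over the available inter-switch links and recomputes the assignment when a link is removed. The link names and the `assign_routes` helper are hypothetical.

```python
from collections import Counter


def assign_routes(dest_lids, links):
    """Spread destinations round-robin over the links that are currently up."""
    if not links:
        raise ValueError("no inter-switch link is up")
    ordered = sorted(dest_lids)
    return {lid: links[i % len(links)] for i, lid in enumerate(ordered)}


# Hypothetical fabric: 8 destination LIDs, 4 parallel ISLs between two switches.
lids = list(range(1, 9))
isls = ["isl-1", "isl-2", "isl-3", "isl-4"]

routes = assign_routes(lids, isls)
load_before = Counter(routes.values())

# Simulate a link-down event: the routes are recomputed over the remaining links.
isls_after = [link for link in isls if link != "isl-2"]
routes_after = assign_routes(lids, isls_after)
load_after = Counter(routes_after.values())

print(dict(load_before))
print(dict(load_after))
```

In this sketch every destination still has a route after the failure, only over fewer links; a real SM applies topology-aware rules rather than a plain round-robin.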

 

I hope this helps. You can read more about the InfiniBand architecture online; there are many resources at different levels.

 

Cheers!

ConnectX3 with MC2609130-001 cable (40GbE to 4x10GbE QSFP to 4xSFP+) in VMWARE only "port 1" is working (10GB)

I have a ConnectX-3 card in ESXi 6.5 connected to an Alcatel-Lucent 6900 switch with the Mellanox MC2609130-001 passive copper hybrid cable (Ethernet 40GbE to 4x10GbE, QSFP to 4xSFP+). Only port 1 comes into service, at 10 Gb/s. If I use a 40GbE to 4x10GbE cable to connect one Alcatel 40GbE port to four Alcatel 10GbE ports, it works; I had to set the 40GbE port on the Alcatel to "split mode". If I connect the ConnectX-3 card with a 40GbE cable to a 40GbE port on the Alcatel, it works too.

 

Do I need to configure the ConnectX-3 card for some kind of "split mode" in VMware to get the 4x10GbE links running, as I did on the Alcatel switch?

 

Device Type:      ConnectX3
  Part Number:      08KP6W_0M9NW6
  Description:      ConnectX-3 Dual Port  QSFP 40GbE Adapter card for Dell PowerEdge
  PSID:             DEL0A70000023
  PCI Device Name:  mt4099_pci_cr0
  Port1 MAC:        248a07c081e0
  Port2 MAC:        248a07c081e1
  Versions:         Current        Available
     FW             2.40.5048      N/A
     PXE            3.4.0746       N/A
     UEFI           15.11.0040     N/A
  Status:           No matching image found

Re: NVMeOF SLES 12 SP3 : Initiator with 36 cores unable to discover/connect to target

We faced the same problem on SLES 12 SP3. We found two issues in the SP3 release version related to the NVMe-oF initiator.

 

First, kernel 4.4.73-5-default does not recognize the hostid argument (this causes the error message you observe). It was fixed in later updates; 4.4.92-6.18-default does not have this issue.

 

The second issue is in nvme-cli. As you may notice, the last character of the hostid is truncated: 'hostid=a61ecf3f-2925-49a7-9304-cea147f61ae', which causes the kernel module to reject the hostid argument. The root cause is in the nvme-cli patch that adds hostid support. It can be fixed with this simple patch added to the nvme-cli source RPM:

 

diff -crB nvme-cli-v1.2/linux/nvme.h nvme-cli-v1.2.patched/linux/nvme.h
*** nvme-cli-v1.2/linux/nvme.h Thu Dec  7 09:42:00 2017
--- nvme-cli-v1.2.patched/linux/nvme.h Thu Dec  7 09:50:32 2017
***************
*** 23,29 ****
  /* However the max length of a qualified name is another size */
  #define NVMF_NQN_SIZE 223
  
! #define NVMF_HOSTID_SIZE        36
  #define NVMF_TRSVCID_SIZE 32
  #define NVMF_TRADDR_SIZE 256
  #define NVMF_TSAS_SIZE 256
--- 23,29 ----
  /* However the max length of a qualified name is another size */
  #define NVMF_NQN_SIZE 223
  
! #define NVMF_HOSTID_SIZE        37
  #define NVMF_TRSVCID_SIZE 32
  #define NVMF_TRADDR_SIZE 256
  #define NVMF_TSAS_SIZE 256
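
The one-byte change can be illustrated with a small sketch. It assumes the host ID is copied into a C character buffer that must end with a NUL byte, which is a plausible reading of the patch rather than something the post states. The `copy_like_c` helper and the all-zero UUID are placeholders for illustration only.

```python
import uuid

UUID_TEXT_LEN = 36  # canonical UUID text: 32 hex digits plus 4 hyphens


def copy_like_c(text: str, buf_size: int) -> str:
    """Return what fits in a char[buf_size] that must keep one byte for the NUL."""
    return text[: max(buf_size - 1, 0)]


# Placeholder UUID, not the poster's real host ID.
host_id = str(uuid.UUID(int=0))

old = copy_like_c(host_id, 36)  # buffer size before the patch
new = copy_like_c(host_id, 37)  # buffer size after the patch

print(len(host_id), len(old), len(new))
```

Under that assumption, a 36-byte buffer keeps only 35 characters of the UUID, while 37 bytes leave room for all 36 characters plus the terminator.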

Unable create L3 Vlan interface with puppet

Hello!

 

I encountered an issue while trying to create an SVI interface on a Mellanox MSN2410-BB2F switch with Puppet.

 

I was using the Mellanox MLNX-OS® User Manual for Ethernet. According to this guide, the manifest should look like this:

 

class l3int{
    netdev_device { $hostname: }
    $vlans = {
        'vlan 347' => { ensure => present, ipaddress => '192.168.4.2', netmask => '255.255.255.0' }
    }
    create_resources( netdev_l3_interface, $vlans )
}

 

But after executing it, puppet-agent on the switch returns an error:

show puppet-agent log continuous

Wed Dec 13 13:20:35 +0000 2017 Puppet (err): Could not set 'present' on ensure: Error: return code = 1  return msg = line:1|Action /ifd/actions/ip_interface parameter ifindex has bad type: string expected uint32

Wed Dec 13 13:20:35 +0000 2017 Puppet (err): Could not set 'present' on ensure: Error: return code = 1  return msg = line:1|Action /ifd/actions/ip_interface parameter ifindex has bad type: string expected uint32

Wrapped exception:

Error: return code = 1  return msg = line:1|Action /ifd/actions/ip_interface parameter ifindex has bad type: string expected uint32

Wed Dec 13 13:20:35 +0000 2017 /Stage[main]/L3int/Netdev_l3_interface[vlan 347]/ensure (err): change from absent to present failed: Could not set 'present' on ensure: Error: return code = 1  return msg = line:1|Action /ifd/actions/ip_interface parameter ifindex has bad type: string expected uint32

 

While trying to solve this problem I found that, after manually creating an SVI interface for this VLAN, Puppet successfully assigns the IP address with the same manifest:

 

hostname [standalone: master] (config) # int vlan 347

hostname [standalone: master] (config interface vlan 347) # ex

hostname [standalone: master] (config) # ex

 

 

Wed Dec 13 13:11:23 +0000 2017 Puppet (notice): Starting Puppet client version 3.2.3

Wed Dec 13 13:11:33 +0000 2017 /Stage[main]/L3int/Netdev_l3_interface[vlan 347]/ipaddress (notice): ipaddress changed '0.0.0.0' to '192.168.4.2'

Wed Dec 13 13:11:33 +0000 2017 /Stage[main]/L3int/Netdev_l3_interface[vlan 347]/netmask (notice): netmask changed '0.0.0.0' to '255.255.255.0'

 

 

Could you provide any guidance to resolve this issue and make the manifest fully operational?

Re: Configuring inter-switch links between SX6036 QSFP ports

Thanks Yairi. It sounds like my configuration is good. Some IBM engineers who recommended it for their pureScale solution checked my configuration and advised that it is fine. We had a DB2 failure when one Mellanox switch was deliberately failed as a test, and we are still trying to isolate the problem. If DB2 fails again in the next test, they have asked that a ticket be opened with IBM support.

Thanks again for helping me further my knowledge on this.

Greg


Re: SX1012X and QSFP DAC?

Hi Brad, to implement the IPL, the SX1012X requires a couple of standard QSFP+/SFP+ cables (usually 0.5 m) plus a couple of QSA adapters to plug onto the SFP+ ends.

e.g.

     

Part number     Description                                                                      Qty
MC2309130-00A   Mellanox® passive copper hybrid cable, ETH 10GbE, 10Gb/s, QSFP to SFP+, 0.5m     2
MAM1Q00A-QSA    Mellanox® cable module, QSFP to SFP adaptor                                      2

 

Hope this helps

Roberto

Re: Configuring inter-switch links between SX6036 QSFP ports

I found something in the switch log that points to something that may be missing from setting up the switches:

"Master pm[5916]: (pid 6878): Found remote SM (0,31,1) with non-matching sm_key"

We were running more tests and all members failed in DB2 after all the CF links to the HA standby switch were disconnected.  It was expected that the HA master switch would be the active SM and all active links would be sufficient to keep DB2 running.

Any ideas?  Is there some sort of licence coordination that is required?

Thanks,

Greg

Re: connectx-5 max speed 68Gb?

Hi,

We don't expect this behavior.

Please look for errors in the ethtool statistics:

- CRC or PCI errors
- All phy counters
- Out of buffer counters
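
As a rough aid for the check suggested above, the sketch below filters the text printed by `ethtool -S <interface>` for non-zero counters whose names look error-related. The substrings in `SUSPECT_PATTERNS` and the counter names in `sample` are assumptions for illustration; the actual counter names depend on the driver and firmware.

```python
# Substrings treated as "suspicious"; an assumed, incomplete list.
SUSPECT_PATTERNS = ("crc", "out_of_buffer", "discard", "err")


def nonzero_suspects(ethtool_output, patterns=SUSPECT_PATTERNS):
    """Return {counter_name: value} for non-zero counters matching any pattern."""
    found = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        if not sep or not value.isdigit():
            continue  # skip headers such as "NIC statistics:"
        if int(value) != 0 and any(p in name for p in patterns):
            found[name.strip()] = int(value)
    return found


# Hypothetical sample in the "name: value" layout that ethtool -S prints.
sample = """NIC statistics:
     rx_packets: 1200
     rx_crc_errors_phy: 3
     rx_out_of_buffer: 42
     tx_errors_phy: 0
"""

suspects = nonzero_suspects(sample)
for counter, count in suspects.items():
    print(f"{counter}: {count}")
```

On a live system the input could come from `subprocess.run(["ethtool", "-S", "eth0"], capture_output=True, text=True).stdout`, where `eth0` stands in for the real interface name.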

 

Thanks,

Amir.

hpc-x for arm

Hi All

I want to use HPC-X on Ubuntu on ARM machines connected by EDR InfiniBand.

However, I found that binaries are not available for the required Mellanox drivers (OFED). Is it possible to compile them myself, i.e. is there a location from which I can download the source code and compile it with the required dependencies?

 

I have already compiled Open MPI but want to compare its performance with HPC-X.

 

Thanks

pg

Re: including different OS

Colin-san,

 

Thank you very much.

I understand it.

Thank you.

 

tetsu


cable issue to build infiniband

Hello,

We have a problem building an InfiniBand network: we can't reach the rated speed of 40 Gb/s and only get 6.5 Gb/s. Some threads we read on the Mellanox community suggest that the cable may be the problem. However, we don't have a dedicated InfiniBand cable yet. Must we use a dedicated InfiniBand cable, or can we use a non-InfiniBand cable that supports 40 Gb/s?

Our adapter is a Mellanox ConnectX-3 (link speed: 40 Gb/s), but our active-speed is just 10 Gb/s. We don't know what active-speed means. We searched for the term on the Internet but failed to find a proper explanation. Can someone explain it to us?
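
One way to read these numbers, sketched under stated assumptions: on a QDR InfiniBand link the speed is reported per lane, the usual link width is 4x, and QDR uses 8b/10b encoding. The `ib_rates` helper below is hypothetical and only shows the arithmetic; a 10 Gb/s per-lane speed is then consistent with a 40 Gb/s link.

```python
def ib_rates(lane_speed_gbps, lanes, encoding=(8, 10)):
    """Return (signalling rate, data rate) in Gb/s for an InfiniBand link.

    lane_speed_gbps: per-lane speed, the value reported as active speed
    lanes:           link width, e.g. 4 for a 4x link
    encoding:        (payload bits, line bits); 8b/10b is assumed here, as used by QDR
    """
    signalling = lane_speed_gbps * lanes
    data = signalling * encoding[0] / encoding[1]
    return signalling, data


signalling, data = ib_rates(10.0, 4)
print(f"signalling rate: {signalling:.0f} Gb/s, data rate: {data:.0f} Gb/s")
```

If this reading applies, an active speed of 10 Gb/s on a 4x link is the expected value for a 40 Gb/s adapter, so it would not by itself explain a measured 6.5 Gb/s.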

VPI Driver on Windows Server 2016 1709

Windows Server 2016 release 1709 is Server Core only, without a GUI. How do you install the MLNX driver for VPI cards? It looks like you can use the inbox driver that comes with Windows Server 2016. However, as there is no GUI, how do you switch a specific port to IPoIB or Ethernet mode? Is there a PowerShell command for this?

ConnectX-4 and flat SRIOV-networks

Hi,

 

trying to get OpenStack "flat" networks to work with VLAN tagging from inside the VM - without success

 

My setup:

ConnectX-4 LX (HPE rebranded should that matter)

Red Hat OSP10, RHEL 7.4 on the compute nodes

NIC driver 4.1-1.0.2

 

Scenario:

- created flat network on SRIOV-network

- create direct port on this network

- create VM with this port attached

- inside the VM, create a VLAN subinterface (eth0.667)

 

Observations:

- VF gets created as expected, no visible VLAN filtering so I'm expecting this VF to be in VGT mode

           vf 5 MAC fa:16:3e:ad:89:a4, spoof checking on, link-state enable, trust off, query_rss off

- the network (Cisco ACI based) learns the MAC/IP of this VM

- packets go out to the network with the correct VLAN tag

- doing a tcpdump on the PF (ens4f0) I see ARP reply packets coming from the network

          10:16:22.908153 fa:16:3e:ad:89:a4 > Broadcast, ethertype 802.1Q (0x8100), length 60: vlan 667, p 0, ethertype ARP, Request who-has 192.168.100.1 tell 192.168.100.21, length 42

          10:16:22.909352 00:22:bd:f8:19:ff > fa:16:3e:ad:89:a4, ethertype 802.1Q (0x8100), length 64: vlan 667, p 5, ethertype ARP, Reply 192.168.100.1 is-at 00:22:bd:f8:19:ff, length 46

- but these packets are not getting to the VM

 

I've tried re-creating the VM after setting rx-vlan-filter on the PF to "off", but this doesn't change anything...

 

Am I missing something here? From what I can see from the documentation, setting VLAN to 0 should enable VGT, and so to my understanding instruct the NIC to not touch tagged traffic...this seems to be true for outgoing traffic, but incoming tagged traffic seems to disappear

 

Any pointers/hints would be greatly appreciated

 

Thanks,

 

Jan

mellanox connectX3 adapter failed to reach the rational speed

Hello,

We are using the Mellanox ConnectX-3 adapter (link speed: 40 Gb/s) on Windows 10 PCs. We also use the proper InfiniBand cable for this kind of adapter, which should be able to deliver the rated InfiniBand speed.

However, we still fail to reach the rated speed. I used nd_read_bw to test the speed, and it was only about 6 Gb/s, far below our expectation.

So I used the command "vstat" to show some information about my PC:

This is the vstat result on host 11.4.12.11:

C:\Windows\system32>vstat

 

 

        hca_idx=0

        uplink={BUS=PCI_E Gen2, SPEED=5.0 Gbps, WIDTH=x2, CAPS=8.0*x8}

        MSI-X={ENABLED=1, SUPPORTED=128, GRANTED=10, ALL_MASKED=N}

        vendor_id=0x02c9

        vendor_part_id=4099

        hw_ver=0x0

        fw_ver=2.42.5000

        PSID=MT_1100110028

        node_guid=f452:1403:0047:63f0

        num_phys_ports=1

                port=1

                port_guid=f452:1403:0047:63f1

                port_state=PORT_ACTIVE (4)

                link_speed=10.00 Gbps

                link_width=4x (2)

                rate=40.00 Gbps

                real_rate=32.00 Gbps (QDR)

                port_phys_state=LINK_UP (5)

                active_speed=10.00 Gbps

                sm_lid=0x0004

                port_lid=0x0004

                port_lmc=0x0

                transport=IB

                max_mtu=4096 (5)

                active_mtu=4096 (5)

                GID[0]=fe80:0000:0000:0000:f452:1403:0047:63f1
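
The uplink line in the vstat output above reports a PCIe link negotiated at 5.0 Gbps x2, while the card advertises 8.0 x8. As a rough cross-check, the sketch below estimates the usable PCIe bandwidth from that line. It assumes 8b/10b encoding for lane speeds up to 5.0 GT/s and 128b/130b for 8.0 GT/s, and it ignores PCIe protocol overhead. The helper names are hypothetical.

```python
import re

UPLINK_RE = re.compile(
    r"SPEED=(?P<speed>[\d.]+) Gbps, WIDTH=x(?P<width>\d+), "
    r"CAPS=(?P<cap_speed>[\d.]+)\*x(?P<cap_width>\d+)"
)


def encoding_efficiency(lane_gtps):
    # PCIe Gen1/Gen2 (2.5 and 5.0 GT/s) use 8b/10b; Gen3 (8.0 GT/s) uses 128b/130b.
    return 8 / 10 if lane_gtps <= 5.0 else 128 / 130


def pcie_gbps(lane_gtps, lanes):
    """Approximate one-direction PCIe bandwidth in Gb/s, before protocol overhead."""
    return lane_gtps * lanes * encoding_efficiency(lane_gtps)


def uplink_summary(line):
    """Return (negotiated, capable) bandwidth in Gb/s from a vstat uplink line."""
    m = UPLINK_RE.search(line)
    if m is None:
        raise ValueError("unrecognised uplink line")
    negotiated = pcie_gbps(float(m["speed"]), int(m["width"]))
    capable = pcie_gbps(float(m["cap_speed"]), int(m["cap_width"]))
    return negotiated, capable


line = "uplink={BUS=PCI_E Gen2, SPEED=5.0 Gbps, WIDTH=x2, CAPS=8.0*x8}"
negotiated, capable = uplink_summary(line)
print(f"negotiated ~{negotiated:.1f} Gb/s, card capable of ~{capable:.1f} Gb/s")
```

Under these assumptions the negotiated slot offers roughly 8 Gb/s, well below the 32 Gb/s QDR data rate reported as real_rate, so the PCIe slot may be the limiting factor here rather than the cable.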

 

 

This is the test result on the same host, 11.4.12.11:

 

 

C:\Windows\system32>nd_read_bw -S 11.4.12.11

Listening for incoming connection request...

Connection accepted.

 

 

C:\Windows\system32>nd_read_bw -S 11.4.12.11

Listening for incoming connection request...

Connection accepted.

 

 

C:\Windows\system32>nd_read_bw -C 11.4.12.11

 

 

#qp #bytes #iterations    MR [Mmps]     Gb/s     CPU Util.

0   65536     100000       0.011        5.95     100.00

 

 

Test finished. Releasing resources...

 

 

C:\Windows\system32>nd_read_bw -C 11.4.12.11

 

 

#qp #bytes #iterations    MR [Mmps]     Gb/s     CPU Util.

0   65536     100000       0.011        5.94     100.00

 

 

Test finished. Releasing resources...

Can someone help us work out why we can't reach the rated speed?

MLNX_OFED and hard coded mac-addresses for VFs

System: HP ProLiant m710p Server Cartridge

OS: Ubuntu 16.04

IB Controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

OFED: MLNX_OFED_LINUX-4.2-1.0.0.0-ubuntu16.04-x86_64

 

Hi, I have a network-booted system: all hardware nodes boot from a prebuilt image.

I installed the MLNX_OFED driver with kernel modules inside the base image, using this command:

 

./mlnxofedinstall --kernel 4.4.0-98-generic --without-dkms --add-kernel-support --without-fw-update

 

I also wrote this to /etc/modprobe.d/mlx4_core.conf:

 

options mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1

 

But after booting and loading the driver with this command:

 

/etc/init.d/openibd start

 

Every node has the same MAC address on its VF interfaces.

It looks like the MAC addresses are generated during the mlnxofedinstall step. This is wrong: they should be generated automatically when new VFs are added, or the MAC addresses should be saved somewhere in the configuration, rather than being hardcoded somewhere in the built module.
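
One possible workaround, offered only as a sketch and not as documented driver behaviour, is to derive a distinct, locally administered MAC address per node at boot time and assign it to the VF from the host (for example with `ip link set <pf> vf 0 mac <address>`). The `vf_mac` helper and the seed strings are hypothetical, and whether the VF accepts an address set this way on this setup is an assumption.

```python
import hashlib


def vf_mac(seed: str) -> str:
    """Derive a stable, locally administered unicast MAC address from a seed string."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    octets = bytearray(digest[:6])
    # Set the locally-administered bit and clear the multicast bit of the first octet.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)


# Hypothetical seeds: something unique per node, such as a hostname or the PF MAC.
mac_a = vf_mac("node-01/vf0")
mac_b = vf_mac("node-02/vf0")
print(mac_a)
print(mac_b)
```

Because the address is a pure function of the seed, each node gets the same VF address on every boot from the shared image, while different nodes get different addresses.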

 

Thanks!
