Channel: Mellanox Interconnect Community: Message List

Re: ib_ep.c:297 MXM ERROR failed to create address handle: Cannot allocate memory


ipv6 is already enabled for the hosts.


Re: ib_ep.c:297 MXM ERROR failed to create address handle: Cannot allocate memory


Opir, does MXM work with RoCE? Our setup at the University of Houston is RoCE. I am beginning to suspect that MXM is strictly meant for the InfiniBand protocol. Any thoughts before I formally engage the support team?

 

 

-Jerry Ebalunode

Re: ib_ep.c:297 MXM ERROR failed to create address handle: Cannot allocate memory


Please, e-mail me and let's continue via e-mail.

Re: can not download MLNX_OFED_LINUX-1.5.3-3.1.0/3.0.0-rhel6.1

Re: ib_ep.c:297 MXM ERROR failed to create address handle: Cannot allocate memory


MXM uses verbs to transfer packets over Ethernet.

ConnectX-3 Pro BSD driver build fails


I am trying to install a ConnectX-3 Pro in a BSD system. I rebuilt and installed the new kernel, but when I try to build the modules, the "make && make install" step always fails. For the mlx4 module I get:

 

/tmp/MLNX_EN_FreeBSD_v2.1/modules/mlx4/../../ofed/drivers/net/mlx4/mcg.c:43:17: error: unused variable 'zero_gid' [-Werror,-Wunused-const-variable]

static const u8 zero_gid[16];   /* automatically initialized to 0 */

                ^

1 error generated.

*** Error code 1

 

For the mlxen module I get:

 

/tmp/MLNX_EN_FreeBSD_v2.1/modules/mlxen/../../ofed/drivers/net/mlx4/en_netdev.c:1694:19: error: unused variable 'fmt_u64' [-Werror,-Wunused-const-variable]

static const char fmt_u64[] = "%llu\n";

                  ^

1 error generated.

*** Error code 1

 

Stop.

 

This is with BSD 10.1 and the "MLNX_EN_FreeBSD_v2.1" tarball.

Does FCA or other accelerators work with RHEL6u4 or RHEL6u6 OFEDs?


Hello,

 

I was wondering if FCA or other accelerators work with RHEL6u4 or RHEL6u6 OFEDs?

 

Are there any issues to watch out for, or is some functionality available with the Mellanox OFEDs but not with the RHEL OFEDs?

 

Thanks!

Michael

Re: 4036E software update


After some more searching I found the solution to this issue is to update the software to version 3.62 and then to 3.91-987.  In my case I was able to skip the other versions in between.  I had to contact Mellanox support to get the correct software and firmware versions for the 4036E.

 

I hope this helps someone out there.

 

Kevin


MLNX_OFED_LINUX-1.5.3-3.0.0 kernel error.


Hi friends,

 

Can anybody explain to me the exact reason for the kernel error "kernel: ib0: failed to send RTU: -22"?

 

Thanks.

Re: MLNX_OFED_LINUX-1.5.3-3.0.0 kernel error.


Hi,

 

It can be normal under certain heavy traffic load scenarios. Every so often, connections get closed due to RNR NAK timeouts. When a connection is closed, it will be re-opened when traffic needs to be sent. The chain of events according to the spec would look similar to the following:

 

Host X initiates a connection by sending a REQ (starts the 3 way handshake).

Host X receives the REP message.

At this stage, according to the IB spec, Host X is allowed to start transmitting, so it does.

Unfortunately, Host X receives an RNR NAK and has to destroy the connection.

Once the connection is destroyed, an RTU message still needs to be sent (since the destroy above is asynchronous).

Finally, the RTU message fails because the connection is closed.

 

I would probably ignore this unless I see my application or other traffic dropping packets.
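
The chain of events above can be sketched as a toy state machine. This is only an illustration under a simplified model of the initiating side; the names `ActiveSide`, `on_handshake_reply`, `on_rnr_nak`, `send_rtu` and `simulate` are hypothetical and do not correspond to the IPoIB driver or to any Mellanox API. The -22 in the log line is modelled here as -EINVAL, which is what errno 22 means on Linux.

```python
from enum import Enum, auto

EINVAL = 22  # errno 22 on Linux; the kernel log reports it negated as -22


class ConnState(Enum):
    IDLE = auto()
    REQ_SENT = auto()
    CONNECTED = auto()
    DESTROYED = auto()


class ActiveSide:
    """Toy model of Host X, the side that initiates the connection."""

    def __init__(self):
        self.state = ConnState.IDLE

    def send_req(self):
        # Host X starts the handshake by sending a REQ.
        self.state = ConnState.REQ_SENT

    def on_handshake_reply(self):
        # The peer answered, so Host X is allowed to start transmitting.
        if self.state is ConnState.REQ_SENT:
            self.state = ConnState.CONNECTED

    def on_rnr_nak(self):
        # An RNR NAK forces Host X to destroy the connection.
        self.state = ConnState.DESTROYED

    def send_rtu(self):
        # The RTU can only be sent on a connection that still exists.
        if self.state is not ConnState.CONNECTED:
            return -EINVAL
        return 0


def simulate(rnr_nak_before_rtu):
    """Run the handshake, optionally destroying the connection before the RTU."""
    host_x = ActiveSide()
    host_x.send_req()
    host_x.on_handshake_reply()
    if rnr_nak_before_rtu:
        host_x.on_rnr_nak()
    return host_x.send_rtu()


if __name__ == "__main__":
    print("normal handshake:", simulate(False))
    print("RNR NAK before the RTU:", simulate(True))
```

The only point of the model is the ordering: the RTU is attempted after the connection has already been torn down, so the send is rejected.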

Re: MLNX_OFED_LINUX-1.5.3-3.0.0 kernel error.


Thanks a lot Erez Ferber.

In our case, it is causing HPC jobs to slow down or fail.

Is it possible to avoid this kind of error?

supporting 1000base with a connectx-3


Hello all,

 

I am still learning the ropes with these cards so hopefully I include the right details.

 

The card in this server is:

 

# mstflint -d 42:00.0 q

Image type:      FS2

FW Version:      2.32.5100

FW Release Date: 3.9.2014

Product Version: 02.32.51.00

Rom Info:        type=PXE version=3.4.306 devid=4099 proto=ETH

Device ID:       4099

Description:     Node             Port1            Port2            Sys image

GUIDs:           ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff

MACs:                                 f452141fd030     f452141fd031

VSD:

PSID:            MT_1080120023

 

I can see the following with ethtool:

 

# ethtool p4p1

Settings for p4p1:

        Supported ports: [ FIBRE ]

        Supported link modes:   1000baseKX/Full

                                10000baseKR/Full

        Supported pause frame use: Symmetric Receive-only

        Supports auto-negotiation: No

        Advertised link modes:  1000baseKX/Full

                                10000baseKR/Full

        Advertised pause frame use: Symmetric

        Advertised auto-negotiation: No

        Speed: Unknown!

        Duplex: Unknown! (255)

        Port: FIBRE

        PHYAD: 0

        Transceiver: internal

        Auto-negotiation: off

        Supports Wake-on: d

        Wake-on: d

        Current message level: 0x00000014 (20)

                               link ifdown

        Link detected: no

 

The network engineer on the Cisco side says he can see the SFP on that side is happy but no link. I can't see a link either and now the question is what have I done wrong in relation to getting this card working. The drivers for the card appear to be installed correctly etc.

 

And yet speed and duplex are unknown. The SFPs on the switch side can only support 1000.
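
When comparing this output across ports or hosts, it can help to reduce it to the few fields that matter for a link problem. The following is a minimal sketch that assumes the plain-text `ethtool` layout shown above ("Key: value" lines, with continuation lines that carry no colon); the helpers `parse_ethtool` and `summarize_link` are hypothetical names, and the sample is an excerpt of the output in this post.

```python
def parse_ethtool(text):
    """Map each 'Key: value' line to a list of values, folding in continuation lines."""
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep:
            last_key = key.strip()
            value = value.strip()
            fields[last_key] = [value] if value else []
        elif last_key is not None:
            # No colon: this line continues the previous key (e.g. a second link mode).
            fields[last_key].append(line)
    return fields


def summarize_link(fields):
    """Pick out the fields that are relevant to a 'no link' report."""
    def first(key):
        values = fields.get(key, [])
        return values[0] if values else "unknown"

    return {
        "link": first("Link detected"),
        "speed": first("Speed"),
        "autoneg": first("Auto-negotiation"),
        "supported_modes": fields.get("Supported link modes", []),
    }


SAMPLE = """\
Settings for p4p1:
        Supported link modes:   1000baseKX/Full
                                10000baseKR/Full
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: off
        Link detected: no
"""

if __name__ == "__main__":
    print(summarize_link(parse_ethtool(SAMPLE)))
```

For the output above, the summary shows no link, an unknown speed, auto-negotiation off, and two supported modes (1000baseKX/Full and 10000baseKR/Full).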

 

Thanks for your time all.

Re: what is the reason for below ib error,


Hello,

 

I'm not sure about the background of the issue or the point at which it happens, but this article may help you.

dev_queue_xmit

 

Thanks,

Masahiro

Does MLNX_OFED also support FreeBSD?


We want to implement I/O virtualization over InfiniBand. For this we need to meet the following system requirements:

• MLNX_OFED Driver

• A server/blade with an SR-IOV-capable motherboard BIOS

• Hypervisor that supports SR-IOV such as: Red Hat Enterprise Linux Server Version 6.*

• Mellanox ConnectX® VPI Adapter Card family with SR-IOV capability

 

As far as we can tell, Mellanox currently supports the following operating systems:

RHEL, CentOS, Ubuntu, SLES, OLE, Citrix, Fedora and Debian. Does MLNX_OFED also support FreeBSD? We need to implement the same on FreeBSD.

 

Thanks,

Sagar Borse

how to get a (full) shell access?


hello there!

i have an sx series switch (Ethernet Switches), and a particular use case (Virtualization for Infiniband and Ethernet) demands a full shell so that certain scripts / binaries can be downloaded (injected) and run locally on the box. wondering if it's possible to get full (bash) access to allow this?

 

thanks.

girish.


hca_self_test.ofed: Firmware Check on CA #0 (VPI) .......... FAIL:REASON: mismatch CA #0 firmware detected (found v2.11.500, required v2.32.5100)


I am a new user of InfiniBand. After a default installation of Mellanox OFED, I ran hca_self_test.ofed and got the following output:

 

root@gpu-cluster-4:/usr/bin# hca_self_test.ofed

 

 

---- Performing Adapter Device Self Test ----

Number of CAs Detected ................. 1

PCI Device Check ....................... PASS

Kernel Arch ............................ x86_64

Host Driver Version .................... MLNX_OFED_LINUX-2.3-2.0.0 (OFED-2.3-2.0.0): 3.13.0-32-generic

Host Driver RPM Check .................. PASS

Firmware on CA #0 VPI .................. v2.11.500

Firmware Check on CA #0 (VPI) .......... FAIL

    REASON: mismatch CA #0 firmware detected (found v2.11.500, required v2.32.5100)

Host Driver Initialization ............. PASS

Number of CA Ports Active .............. 2

Port State of Port #1 on CA #0 (VPI)..... UP 4X QDR (InfiniBand)

Port State of Port #2 on CA #0 (VPI)..... UP 4X QDR (InfiniBand)

Error Counter Check on CA #0 (VPI)...... PASS

Kernel Syslog Check .................... PASS

Node GUID on CA #0 (VPI) ............... 00:e0:81:00:00:2a:e9:5b

------------------ DONE ---------------------

 

But when I check the IB interface, it seems to be OK:

root@gpu-cluster-1:/usr/sbin# ibstat

CA 'mlx4_0'

  CA type: MT4099

  Number of ports: 2

  Firmware version: 2.11.500

  Hardware version: 0

  Node GUID: 0x00e08100002ae8a7

  System image GUID: 0x00e08100002ae8aa

  Port 1:

  State: Active

  Physical state: LinkUp

  Rate: 40

  Base lid: 1

  LMC: 0

  SM lid: 1

  Capability mask: 0x0251486a

  Port GUID: 0x00e08100002ae8a8

  Link layer: InfiniBand

  Port 2:

  State: Active

  Physical state: LinkUp

  Rate: 40

  Base lid: 9

  LMC: 0

  SM lid: 1

  Capability mask: 0x02514868

  Port GUID: 0x00e08100002ae8a9

  Link layer: InfiniBand

 

Do I need to do something to fix the FAIL result? What impact does it have?
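
The FAIL line carries both the installed and the required firmware version, so the gap can be checked mechanically. Below is a minimal sketch that assumes the REASON line keeps the "found vX, required vY" wording shown above; `firmware_mismatch` and `parse_version` are hypothetical helper names, not part of the OFED tools.

```python
import re

# Matches the "(found v2.11.500, required v2.32.5100)" part of the REASON line.
REASON_RE = re.compile(r"found v([\d.]+), required v([\d.]+)")


def parse_version(text):
    """Turn '2.11.500' into (2, 11, 500) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def firmware_mismatch(report):
    """Return (found, required) version tuples, or None if no mismatch is reported."""
    match = REASON_RE.search(report)
    if match is None:
        return None
    return parse_version(match.group(1)), parse_version(match.group(2))


REPORT = """\
Firmware on CA #0 VPI .................. v2.11.500
Firmware Check on CA #0 (VPI) .......... FAIL
    REASON: mismatch CA #0 firmware detected (found v2.11.500, required v2.32.5100)
"""

if __name__ == "__main__":
    result = firmware_mismatch(REPORT)
    if result is not None:
        found, required = result
        print("found:", found)
        print("required:", required)
        print("installed firmware is older:", found < required)
```

Comparing the versions as integer tuples rather than as strings avoids ordering mistakes when the components have different numbers of digits.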

Re: how to get a (full) shell access?


Hi Girish,

 

Can you share the use case mentioned in your question?

 

It's not possible to get bash access to the switches.

Re: how to get a (full) shell access?


thanks very much for following this up!

 

my use case is SDN. we have a lightweight controller agent that can be dispatched on a switch itself, so that the switch doubles as a controller. we've been testing this scenario on various Linux based switching platforms and are evaluating MLNX switch platforms for this purpose. in order to dispatch our agent and its dependencies we're at the mercy of the switch platform!

 

if i understand it right, MLNX switches support BGP by integrating Quagga. if we were to do something similar, would we be allowed? if so, what's the way? after a bit of google searching, i came across this posting (from somebody at EMC), where they tried to run uboot + Linux on a similar MLNX switch, but it seems that exercise went nowhere. any help?

 

thanks again!

best regards,

girish.

Re: what is the reason for below ib error,

How to connect a SX1012 Ethernet switch to 10Gb RJ45


We have just bought a new Mellanox SX1012 Ethernet switch. The idea was to interconnect a multicore server with a dual 10Gb Ethernet card (with RJ45 plugs) to a NAS that also has a pair of 10Gb (RJ45) Ethernet connectors. We also have another NAS which we will connect in the future. So far we have also got a Mellanox passive copper hybrid cable, ETH, 40GbE to 4x10GbE, QSFP to 4xSFP+. The problem is that the SFP+ cables can be connected neither to the RJ45 connectors of the server nor to the RJ45 connectors of the NAS. Thus, we are wondering:

 

1) Are there any cables that split directly from QSFP to 4xRJ45?

2) If not, with the current equipment, are there any kind of "adapters" from SFP+ to RJ45?

3) Do we need to buy a second switch with 1 or 2 SFP+ connectors to convert the SFP+ to RJ45?

     3.1) If this is the case, do you have any hint as to what the best option would be to avoid reducing the bandwidth of the interconnection between the NAS and the server?

 

 

Many thanks!!

Miguel Ángel
