Channel: Mellanox Interconnect Community: Message List

Problem installing MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-x86_64


Hi,

I am running Linux Mint 19, which is basically Ubuntu 18.04. I recently bought a ConnectX-3 CX311A and am trying to get it running.
I downloaded MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-x86_64 and tried to run the installer:

sudo ./mlnxofedinstall --add-kernel-support --distro ubuntu18.04

 

Result:

Note: This program will create MLNX_OFED_LINUX TGZ for ubuntu18.04 under /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic directory.

See log file /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs/mlnx_ofed_iso.5496.log

 

Checking if all needed packages are installed...

Building MLNX_OFED_LINUX RPMS . Please wait...

find: 'MLNX_OFED_SRC-4.4-1.0.0.0/RPMS': No such file or directory

Creating metadata-rpms for 4.15.0-29-generic ...

 

ERROR: Failed executing "/usr/bin/perl /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496/MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-ext/create_mlnx_ofed_installers.pl --with-hpc --tmpdir /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs --mofed /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496/MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-ext --rpms-tdir /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496/MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-ext/RPMS --output /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496/MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-ext --kernel 4.15.0-29-generic --ignore-groups eth-only"

ERROR: See /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs/mlnx_ofed_iso.5496.log

Failed to build MLNX_OFED_LINUX for 4.15.0-29-generic

 

When I check this log, it says:

Unsupported package: kmp

Logs dir: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs/OFED.5926.logs

General log file: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs/OFED.5926.logs/general.log

Below is the list of OFED packages that you have chosen

(some may have been added by the installer due to package dependencies):

ofed-scripts

mlnx-ofed-kernel-utils

mlnx-ofed-kernel-dkms

iser-dkms

isert-dkms

srp-dkms

mlnx-nfsrdma-dkms

mlnx-nvme-dkms

mlnx-rdma-rxe-dkms

kernel-mft-dkms

knem-dkms

knem

 

Checking SW Requirements...

This program will install the OFED package on your machine.

Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.

Those packages are removed due to conflicts with OFED, do not reinstall them.

 

Installing new packages

Building DEB for ofed-scripts-4.4 (ofed-scripts)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for mlnx-ofed-kernel-utils-4.4 (mlnx-ofed-kernel)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for iser-dkms-4.0 (iser)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for isert-dkms-4.0 (isert)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for srp-dkms-4.0 (srp)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for mlnx-nfsrdma-dkms-3.4 (mlnx-nfsrdma)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for mlnx-nvme-dkms-4.0 (mlnx-nvme)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for mlnx-rdma-rxe-dkms-4.0 (mlnx-rdma-rxe)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for kernel-mft-dkms-4.10.0 (kernel-mft)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Building DEB for knem-dkms-1.1.3.90mlnx1 (knem)...

Running  /usr/bin/dpkg-buildpackage -us -uc

Build passed successfully

-E- '' dir does not exist!

 

Strange!
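Just in case, I also made sure the build prerequisites were present before rerunning — the package list below is my best guess for an Ubuntu 18.04 base (the exact set varies by MLNX_OFED version, so treat it as an assumption):

sudo apt-get install build-essential dkms linux-headers-$(uname -r) \
    debhelper autoconf automake libtool autotools-dev quilt dpatch chrpath swig graphviz
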

Then I tried

sudo ./mlnxofedinstall --distro ubuntu18.04

which gives:

Logs dir: /tmp/MLNX_OFED_LINUX.12121.logs

General log file: /tmp/MLNX_OFED_LINUX.12121.logs/general.log

 

Below is the list of MLNX_OFED_LINUX packages that you have chosen

(some may have been added by the installer due to package dependencies):

 

ofed-scripts

mlnx-ofed-kernel-utils

mlnx-ofed-kernel-dkms

iser-dkms

isert-dkms

srp-dkms

mlnx-nfsrdma-dkms

mlnx-rdma-rxe-dkms

libibverbs1

ibverbs-utils

libibverbs-dev

libibverbs1-dbg

libmlx4-1

libmlx4-dev

libmlx4-1-dbg

libmlx5-1

libmlx5-dev

libmlx5-1-dbg

librxe-1

librxe-dev

librxe-1-dbg

libibumad

libibumad-static

libibumad-devel

ibacm

ibacm-dev

librdmacm1

librdmacm-utils

librdmacm-dev

mstflint

ibdump

libibmad

libibmad-static

libibmad-devel

libopensm

opensm

opensm-doc

libopensm-devel

infiniband-diags

infiniband-diags-compat

mft

kernel-mft-dkms

libibcm1

libibcm-dev

perftest

ibutils2

libibdm1

cc-mgr

ar-mgr

dump-pr

ibsim

ibsim-doc

knem-dkms

mxm

ucx

sharp

hcoll

openmpi

mpitests

knem

libdapl2

dapl2-utils

libdapl-dev

srptools

mlnx-ethtool

mlnx-iproute2

 

This program will install the MLNX_OFED_LINUX package on your machine.

Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.

Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.

 

Do you want to continue?[y/N]:y

 

Checking SW Requirements...

Removing old packages...

Installing new packages

Installing ofed-scripts-4.4...

Installing mlnx-ofed-kernel-utils-4.4...

Installing mlnx-ofed-kernel-dkms-4.4...

 

Error: mlnx-ofed-kernel-dkms installation failed!

Collecting debug info...

See:

    /tmp/MLNX_OFED_LINUX.12121.logs/mlnx-ofed-kernel-dkms.debinstall.log

Removing newly installed packages...

 

How can I install the drivers? Thank you for your help.


Need help updating firmware/speed for MNPA19-XTR adapters


I need faster peer-to-peer access between a server and a desktop computer. I installed an MNPA19-XTR 10Gb adapter in each machine in a peer-to-peer configuration with an SFP+ copper cable. The problem is that they are not performing as they should. When a large transfer is started, the speed starts at just under 700Mb/s (which is expected, or even better than expected, with SATA HDs in use). But after 5-6 seconds, the speed starts dropping to 100-150Mb/s. Intermittently, the speed will jump up to 400-500Mb/s for a second or two, then drop down again. Both systems have an SSD, a single SATA disk, and a SATA disk array, so I have tried the tests from SSD to SSD, SATA to SATA, etc., and the results are pretty much the same. All offloading is enabled, Jumbo Packet is at 9000, send/receive buffers are at max, and I have tried tuning as many settings as I can find, including specifying both 10Gb adapters in the Hosts file. It almost seems like a heat issue, even though there is plenty of air movement.

I don't know if speed will improve with a firmware update, but MLXUP.exe does not recognize my adapter (I am not sure if I am using the correct switch(es) on the command line; these are Windows machines). Any help with either or both speed and firmware updating would be highly appreciated. The cards currently have firmware rev 2.9.1000 and I have 2.9.1200 on hand to update. I will be extremely happy if I can get a reliable 400-500Mb/s out of this setup, which is what I believe it should be capable of.
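From what I have read, mlxup only covers ConnectX-3 and later, so for these ConnectX-2 cards the flint tool from the MFT package is apparently the way to go. A sketch of what I plan to try (the MST device name is a placeholder reported by mst status, and the image is the 2.9.1200 binary I have on hand):

mst status
flint -d <mst device> q
flint -d <mst device> -i <fw-2.9.1200 image>.bin burn
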

System 1:

ASUS X370-A MB, Ryzen 5 1600, 16GB RAM, 500GB SSD, 6 x 4TB NAS SATA disk array, 1 x 3TB NAS SATA disk.

System 2:

ASUS X370-Pro MB, Ryzen 7 1800X, 32GB RAM, 240GB SSD, 6 x 4TB NAS SATA disk array, 1 x 6TB NAS SATA disk.

Also it looks like I have to choose a group for this discussion, so I am just choosing the closest fit.

Thank you in advance for any help.

Re: mlx5_core enable hca failed, mlx5_load_one failed with error code -22


Hi Pharthiphan,

 

I am not sure if your issue is still relevant, as it was posted on 6/11; however, which Mellanox OFED driver did you install, and have you validated the FW version/compatibility?

 

You can download the MFT package from the following link:

http://www.mellanox.com/page/management_tools

To query the FW:

# mst start

# mst status -v

# flint -d <mst device> q

 

Note: Check, based on the Release Notes of the drivers, that the FW is supported/compatible. If not, I would suggest aligning the FW to a supported version.
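If the FW turns out to be unsupported, it can usually be aligned with flint from the same MFT package; a minimal sketch (the image file is a placeholder for the FW binary matching your card's PSID):

# flint -d <mst device> -i <fw image>.bin burn

After burning, reboot or restart the driver so the new FW takes effect.
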

 

Sophie.

Is Mellanox ConnectX-4 compatible with VPP 18.07?


Re: Ceph with OVS Offload


Hi Sophie,

 

I have read all about ASAP2 on the Mellanox website. My question is about the performance of running Ceph with ASAP2 OVS offload and VXLAN offload.

 

Best regards,

Re: Ceph with OVS Offload


Hi Lazuardi,

 

Ceph has not been tested against the ASAP2 OVS offload solution.

 

Sophie.

Re: Ceph with OVS Offload


Hi Sophie,

 

How can I ask Mellanox to run that test as a reference? I'm looking for a reference design for link redundancy for Ceph, but without MLAG on the switch and maximizing the offload features of the ConnectX-5 EN.

 

Best regards,


Re: rx_fifo_errors and rx_dropped errors using VMA where CPU user less than 40%


Number of channels - how many queues should be created.

Ring size - the size of each queue.

 

Generally, you shouldn't change the defaults, as they are based on vendor experience (any vendor); however, sometimes it is worth playing with these settings. For example, setting the number of receive queues to the number of CPUs on the host might not be a bad idea, as a larger number of queues leads to more context switches, which can cause degradation.

The same goes for the queue size - setting it to the maximum increases the amount of memory used by the queue, which might cause page swapping, and that can also cause degradation.

Bottom line: there is no single recipe, only sensible defaults. Every change needs to be validated by running benchmarks that closely mimic the behaviour of the real application, or by the application itself.

Do you still have dropped packets after changing these parameters?

I would also recommend checking the Red Hat network performance tuning guide if you work with TCP/UDP. For VMA it is not really applicable, as VMA bypasses the kernel.
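If you do want to experiment, both knobs can be inspected and changed with ethtool; a minimal sketch, assuming the interface name is ens2f0 (adapt the name and values):

# ethtool -l ens2f0 (show current/maximum number of channels)
# ethtool -L ens2f0 combined 8 (set the number of queues, e.g. to the CPU count)
# ethtool -g ens2f0 (show current/maximum ring sizes)
# ethtool -G ens2f0 rx 4096 tx 4096 (set the ring sizes)

Note that changing these resets the interface, so do it in a maintenance window.
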

ConnectX-2 10GbE Ethernet "Flash not found", FW update problem


Dear Community,

 

I have two Mellanox ConnectX-2 Ethernet cards (MNPA19-XTR).

In Linux, I see only:

01:00.0 Memory controller: Mellanox Technologies MT25408 [ConnectX IB SDR Flash Recovery] (rev b0)

 

I tried to update the FW with different versions of the MFT tools (2.7, 3.x, 4.x) but had no success. I have followed and tried many Google results.

I always get: "-E- Failed to query 0000:01:00.0 device, error : No such file or directory. MFE_NO_FLASH_DETECTED"

If I force the flash chip type, I get: "Flash write failed: Flash erase of address 0x80000 failed: MFE_WRITE_TIMEOUT"

 

("Flash is not present" jumper = opened)

 

I attached a text file with the relevant flashing/query outputs.

 

Thank you for your answer.

PS: Mellanox support does not help, because I have no "contract".

Is a p2p (dedicated link, without switch) fibre connection totally lossless?


Dear all,

 

I am doing RDMA transfers using ConnectX-5 100GbE fibre with RoCEv2 (UD unreliable datagram Send) between two servers (<10 meters).

 

The data size is around 8GB, or 80GB during tests.

Sometimes everything is fine and I don't have packet drops,

but I also frequently see a small number (around 0.01%) of packets silently lost (nothing visible with the verbs API, nor in dmesg or sysfs).

 

I am sure that some packets are dropped because I use the RDMA_SEND_WITH_IMM verb with a packet number that is checked while polling the RWQ on the destination host. The application is a loop that continuously posts 15360 work requests (3072 bytes each) at once.

 

This is not related to completion queue overrun (they are polled).

 

I pay attention to CPU affinity, I tried adding some amount of nanosleep on the source host between ibv_post_send calls, and I also set the ring parameters to the maximum (8192). I suspected a transceiver temperature issue and tried another 40GbE copper link, but I have the same issue.

 

My question: is some (very small, but nonzero) amount of packet loss unavoidable?
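For context, my understanding is that RoCE is only lossless if the link itself is configured to be lossless (global pause or PFC), even on a direct cable. A hedged sketch of enabling PFC on priority 3 with the mlnx_qos tool from MLNX_OFED (the interface name and priority are assumptions, and the RDMA traffic must actually be mapped to that priority):

mlnx_qos -i ens1f0 --pfc 0,0,0,1,0,0,0,0
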

 

 

cheers

Re: What are some good budget options for NICs and Switch?


Hi Kirill,

 

Many thanks for posting this question in the Mellanox Community.

 

Please use the following link to inquire about your needs and which Mellanox products are suitable for your request: https://mellanox.force.com/inquiry.

 

Many thanks.

 

Cheers,

~Mellanox Technical Support

Re: RoCE v2 configuration with Linux drivers and packages


Hi Vetri,

Thank you for posting your question on the Mellanox Community.

Instead of using MLNX_OFED, you can use the OS distributed drivers for our adapters. These are called INBOX drivers. 

The following link provides the User Manual and Release Notes for some of the distributions that include the INBOX driver for our adapters. The User Manual provides information on how to set the RoCE mode, where supported by the INBOX driver.

For any other inquiries regarding the OS distributed driver, you need to contact the OS vendor for the instructions.

The link is -> http://www.mellanox.com/page/inbox_drivers
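For reference, with recent INBOX drivers the default RoCE mode used by RDMA-CM can typically be set through configfs; a minimal sketch, assuming device mlx5_0 and port 1 (support varies by distribution, so please verify against its documentation):

# modprobe rdma_cm
# mkdir -p /sys/kernel/config/rdma_cm/mlx5_0
# echo "RoCE v2" > /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
# cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
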


Thanks and regards,

~Mellanox Technical Support

Failed to PXE boot Win10 if start type of mlx4_bus and ibbus is set to 0 (boot start)


We have a diskless (PXE) boot system. In our experience, in order to boot with a Mellanox ConnectX-2 NIC we have to set the mlx4_bus and ibbus services to boot start, and this worked well for Windows 7. Recently we moved to Windows 10, but if we do the same, Windows 10 doesn't boot. We debugged the mlx4_bus and ibbus load process and only found that, with the start type of mlx4_bus and ibbus set to 0, one \device\000000XX device is missing compared to a normal system. We don't know why Windows 7 is fine but Windows 10 fails to boot with the same NIC. Could anyone help me solve this?
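For clarity, this is how we set the start type from an elevated command prompt (equivalent to setting Start=0 under HKLM\SYSTEM\CurrentControlSet\Services for each service):

sc config mlx4_bus start= boot
sc config ibbus start= boot
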

Re: Mellanox Ethernet Adapters PRM is now available online!


Can we get an update?

The current Linux driver source uses additional opcodes that are not defined in version 0.40 of the manual.

The ones I am curious about are:

 

MLX5_CMD_OP_ALLOC_ENCAP_HEADER            = 0x93d,
MLX5_CMD_OP_DEALLOC_ENCAP_HEADER          = 0x93e,
MLX5_CMD_OP_ALLOC_MODIFY_HEADER_CONTEXT   = 0x940,
MLX5_CMD_OP_DEALLOC_MODIFY_HEADER_CONTEXT = 0x941,

Re: send_bw test between QSFP ports on Dual Port Adapter


Hi Dmitri,

 

For testing TCP performance in Windows we recommend using the NTttcp tool. From the command line, kindly run the NTttcp test and provide the output; the NTttcp tool is provided by Microsoft to test network performance.

 

For example:

Server side: ntttcp.exe -s -m 8,*,<client ip> -l 128k -a 2 -t 30

Client side: ntttcp.exe -r -m 8,*,<client ip> -l 128k -a 2 -t 30

 

For your reference kindly see the download and explanation link:

https://gallery.technet.microsoft.com/NTttcp-Version-528-Now-f8b12769

 

Please let me know about the results.

Karen.

Re: How to configure host chaining for ConnectX-5 VPI


Hi Daniel,

 

I wanted to thank you for these directions; they were very helpful. I was successful in linking three nodes together, all running Ubuntu 18.04, and was able to get ~96Gb/s between all the hosts using iperf2. I then took one of the boxes, loaded ESXi 6.7, and configured the same IP addresses on the two interfaces I had before. The VMware box cannot communicate with the others now, while I can still communicate through the NIC between the other Ubuntu boxes. When I run a tcpdump on the ESXi host I see the ARP request getting created, but I get no response. I am wondering if you have any idea why the host chaining feature does not seem to work with ESXi?
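For reference, this is roughly how I checked the driver side on ESXi (the module name nmlx5_core is my assumption for ConnectX-5 on 6.7):

esxcli network nic list (confirm the ConnectX-5 uplinks and their driver)
esxcli system module parameters list -m nmlx5_core (look for any host-chaining related parameter)
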

 

Thanks

Shawn

Can't ibping LID or GUID but can ping by IP


We are using an SB7790 unmanaged switch connected to:

  1. VMWARE (6.5) server with opensm on a guest Centos VM (7.5) - Mellanox ConnectX-4
  2. Server with Ubuntu (16.04.5 LTS) - Mellanox ConnectX-4
  3. Have all updated

 

Successful items:

  • Opensm is running (active) from Centos VM
  • ibstat finds all interfaces with active and linkup.
  • ibnetdiscover finds all interfaces connected
  • We can ping by ip to and from each server

 

Unsuccessful item:

  • Not able to ibping across switch

 

We're not sure what we might be missing.

 

We can't find many resources for further troubleshooting. Anyone who could help would be greatly appreciated!
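For completeness, this is how we are invoking it (the LID and GUID values are placeholders taken from ibstat):

ibping -S (on the remote node, start the ibping responder)
ibping -L <lid> (on the local node, ping by LID)
ibping -G <port guid> (or ping by port GUID)
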

 

Thanks

Brian

Re: SN2100B v3.6.8004


Hi Reginald,

 

The reason for this is an enhanced security feature added in all versions starting from Mellanox Onyx/OS 3.6.8004: HTTP is disabled by default. Therefore, the GUI cannot be reached over HTTP after upgrading to 3.6.8004 and above.

There are 2 possible solutions:

1.  Use HTTPS instead of HTTP to log into the GUI

2.  You can enable http by using the following commands:

      switch(config)# no web https ssl secure-cookie enable

      switch(config)# web http enable

      switch(config)# write memory

Now you can use both HTTP and HTTPS connections to log into the GUI.

 

Hope this helps

 

Thanks,

Pratik Pande

Re: Assign a MAC to a VLAN


Hi, that is not supported. The VLANs are separated only by port number and VLAN ID.
