I have two small IB clusters set up for testing:
- Each cluster has an SB7700 IB switch.
- Two servers, each with an MCX455A-ECAT ConnectX-4 VPI adapter, are connected to each switch.
Essential system and software info:
[root@fs00 ~]# uname -a
Linux fs00 3.10.0-327.22.2.el7.x86_64 #1 SMP Thu Jun 23 17:05:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@fs00 ~]# rpm -qa |grep ofed
ofed-scripts-3.3-OFED.3.3.1.0.0.x86_64
mlnx-ofed-all-3.3-1.0.0.0.noarch
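If it helps, I can also post adapter, firmware, and link details for both clusters. The commands below are what I would run to collect them (output omitted here; this assumes the standard MLNX_OFED / infiniband-diags utilities are installed):
ibstat                  (HCA state, firmware version, port LIDs)
ibv_devinfo -v          (verbose device and port attributes)
iblinkinfo              (link speed and width as seen from the switch side)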
I have been testing the two clusters using ib_send_lat and observed the following behavior that I don't understand:
Cluster A:
I got the latency numbers that I anticipated. Reversing the roles of client and server gave more or less the same results, which again is what I anticipated.
Server:
[root@fs01 ~]# ib_send_lat -a -c UD
************************************
* Waiting for client to connect... *
************************************
Max msg size in UD is MTU 4096
Changing to this MTU
---------------------------------------------------------------------------------------
Send Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
RX depth : 1000
Mtu : 4096[B]
Link type : IB
Max inline data : 188[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x02 QPN 0x0028 PSN 0xd0b6fe
remote address: LID 0x03 QPN 0x0028 PSN 0x91642e
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 0.81 4.47 0.83
4 1000 0.82 3.88 0.83
8 1000 0.81 2.95 0.83
16 1000 0.82 3.31 0.84
32 1000 0.88 3.40 0.90
64 1000 0.88 3.27 0.90
128 1000 0.91 3.54 0.93
256 1000 1.23 3.55 1.25
512 1000 1.29 4.17 1.32
1024 1000 1.49 3.15 1.51
2048 1000 1.72 4.32 1.74
4096 1000 2.15 4.32 2.20
---------------------------------------------------------------------------------------
Client:
[root@fs00 ~]# ib_send_lat -a -c UD 192.168.11.151
Max msg size in UD is MTU 4096
Changing to this MTU
---------------------------------------------------------------------------------------
Send Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
TX depth : 1
Mtu : 4096[B]
Link type : IB
Max inline data : 188[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x03 QPN 0x0028 PSN 0x91642e
remote address: LID 0x02 QPN 0x0028 PSN 0xd0b6fe
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 0.81 8.37 0.83
4 1000 0.82 3.87 0.83
8 1000 0.81 2.97 0.83
16 1000 0.82 3.31 0.84
32 1000 0.88 3.41 0.89
64 1000 0.88 3.27 0.90
128 1000 0.91 3.55 0.93
256 1000 1.23 3.56 1.25
512 1000 1.30 4.15 1.32
1024 1000 1.48 3.17 1.51
2048 1000 1.72 4.32 1.74
4096 1000 2.16 4.32 2.20
---------------------------------------------------------------------------------------
Cluster B:
As shown below, in Direction I the client-side max latency is roughly 10X larger. What's odd is that once I reversed the roles of client and server, both sides showed the latency numbers that I anticipated.
Direction I
Server:
[root@fs11 ~]# ib_send_lat -a -c UD
************************************
* Waiting for client to connect... *
************************************
Max msg size in UD is MTU 4096
Changing to this MTU
---------------------------------------------------------------------------------------
Send Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
RX depth : 1000
Mtu : 4096[B]
Link type : IB
Max inline data : 188[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x02 QPN 0x002b PSN 0x79fb69
remote address: LID 0x03 QPN 0x002b PSN 0xfbae7e
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 0.76 4.93 0.78
4 1000 0.77 3.60 0.79
8 1000 0.76 4.16 0.78
16 1000 0.77 3.54 0.79
32 1000 0.83 3.60 0.85
64 1000 0.83 3.74 0.85
128 1000 0.86 3.52 0.88
256 1000 1.18 4.68 1.20
512 1000 1.25 3.88 1.27
1024 1000 1.44 4.71 1.46
2048 1000 1.68 4.20 1.70
4096 1000 2.13 3.91 2.16
---------------------------------------------------------------------------------------
Client:
[root@fs10 ~]# ib_send_lat -a -c UD 192.168.12.151
Max msg size in UD is MTU 4096
Changing to this MTU
---------------------------------------------------------------------------------------
Send Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
TX depth : 1
Mtu : 4096[B]
Link type : IB
Max inline data : 188[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x03 QPN 0x002a PSN 0x544e64
remote address: LID 0x02 QPN 0x002a PSN 0x7babed
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 0.76 45.78 0.78
4 1000 0.77 30.98 0.79
8 1000 0.76 37.99 0.78
16 1000 0.77 43.70 0.79
32 1000 0.83 47.34 0.85
64 1000 0.84 39.94 0.86
128 1000 0.86 41.16 0.88
256 1000 1.18 37.54 1.20
512 1000 1.24 42.94 1.26
1024 1000 1.43 39.50 1.45
2048 1000 1.66 42.06 1.69
4096 1000 2.11 40.37 2.15
---------------------------------------------------------------------------------------
Direction II
Server:
[root@fs10 ~]# ib_send_lat -a -c UD
************************************
* Waiting for client to connect... *
************************************
Max msg size in UD is MTU 4096
Changing to this MTU
---------------------------------------------------------------------------------------
Send Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
RX depth : 1000
Mtu : 4096[B]
Link type : IB
Max inline data : 188[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x03 QPN 0x002c PSN 0x8f46d
remote address: LID 0x02 QPN 0x002c PSN 0x9c2fe5
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 0.76 5.30 0.78
4 1000 0.78 4.56 0.79
8 1000 0.76 3.80 0.78
16 1000 0.77 3.39 0.79
32 1000 0.83 3.07 0.84
64 1000 0.84 5.82 0.86
128 1000 0.86 3.95 0.88
256 1000 1.17 4.01 1.19
512 1000 1.25 4.64 1.27
1024 1000 1.45 3.70 1.46
2048 1000 1.67 5.21 1.70
4096 1000 2.13 4.72 2.16
---------------------------------------------------------------------------------------
Client:
[root@fs11 ~]# ib_send_lat -a -c UD 192.168.12.150
Max msg size in UD is MTU 4096
Changing to this MTU
---------------------------------------------------------------------------------------
Send Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
TX depth : 1
Mtu : 4096[B]
Link type : IB
Max inline data : 188[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x02 QPN 0x002c PSN 0x9c2fe5
remote address: LID 0x03 QPN 0x002c PSN 0x8f46d
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 0.76 5.29 0.78
4 1000 0.77 4.57 0.79
8 1000 0.77 3.80 0.78
16 1000 0.77 3.38 0.79
32 1000 0.83 3.06 0.84
64 1000 0.84 5.77 0.86
128 1000 0.86 3.95 0.88
256 1000 1.17 3.97 1.19
512 1000 1.25 4.65 1.27
1024 1000 1.44 3.69 1.47
2048 1000 1.67 5.18 1.70
4096 1000 2.13 4.68 2.16
---------------------------------------------------------------------------------------
I am very puzzled by the above outcome. I would appreciate any hints as to how I can figure out what is causing the large maximum latency.
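For reference, these are the follow-up checks I am planning to try on cluster B; the exact flags are my best guess from the perftest and infiniband-diags man pages, so please correct me if any of them are off:
cpupower frequency-info                                   (check whether CPU frequency scaling / C-states differ between fs10 and fs11)
taskset -c 2 ib_send_lat -a -c UD -F                      (server pinned to one core; -F to not fail on the cpufreq warning)
taskset -c 2 ib_send_lat -a -c UD -F -H 192.168.12.151    (client pinned; -H/--report_histogram to see where the outliers sit)
perfquery -x                                              (extended port counters, to rule out symbol/link errors)
ibqueryerrors                                             (scan the fabric for error counters above threshold)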