.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

=============================================
Chelsio N210 10Gb Ethernet Network Controller
=============================================

Driver Release Notes for Linux

Version 2.1.1

June 20, 2005

.. Contents

 INTRODUCTION
 FEATURES
 PERFORMANCE
 DRIVER MESSAGES
 KNOWN ISSUES
 SUPPORT


Introduction
============

 This document describes the Linux driver for the Chelsio 10Gb Ethernet
 Network Controller. This driver supports the Chelsio N210 NIC and is backward
 compatible with the Chelsio N110 model 10Gb NICs.


Features
========

Adaptive Interrupts (adaptive-rx)
---------------------------------

  This feature provides an adaptive algorithm that adjusts the interrupt
  coalescing parameters, allowing the driver to dynamically adapt the latency
  settings to achieve the highest performance during various types of network
  load.

  The interface used to control this feature is ethtool. Please see the
  ethtool manpage for additional usage information.

  By default, adaptive-rx is disabled.
  To enable adaptive-rx::

      ethtool -C <interface> adaptive-rx on

  To disable adaptive-rx, use ethtool::

      ethtool -C <interface> adaptive-rx off

  After disabling adaptive-rx, the timer latency value will be set to 50us.
  You may set the timer latency after disabling adaptive-rx::

      ethtool -C <interface> rx-usecs <microseconds>

  An example to set the timer latency value to 100us on eth0::

      ethtool -C eth0 rx-usecs 100

  You may also provide a timer latency value while disabling adaptive-rx::

      ethtool -C <interface> adaptive-rx off rx-usecs <microseconds>

  If adaptive-rx is disabled and a timer latency value is specified, the timer
  will be set to the specified value until changed by the user or until
  adaptive-rx is enabled.

  To view the status of the adaptive-rx and timer latency values::

      ethtool -c <interface>
     76
     77TCP Segmentation Offloading (TSO) Support
     78-----------------------------------------
     79
     80  This feature, also known as "large send", enables a system's protocol stack
     81  to offload portions of outbound TCP processing to a network interface card
     82  thereby reducing system CPU utilization and enhancing performance.
     83
     84  The interface used to control this feature is ethtool version 1.8 or higher.
     85  Please see the ethtool manpage for additional usage information.
     86
     87  By default, TSO is enabled.
     88  To disable TSO::
     89
     90      ethtool -K <interface> tso off
     91
     92  To enable TSO::
     93
     94      ethtool -K <interface> tso on
     95
     96  To view the status of TSO::
     97
     98      ethtool -k <interface>
     99
    100
Performance
===========

 The following information is provided as an example of how to change system
 parameters for "performance tuning" and what values to use. You may or may not
 want to change these system parameters, depending on your server/workstation
 application. Doing so is not warranted in any way by Chelsio Communications,
 and is done at "YOUR OWN RISK". Chelsio will not be held responsible for loss
 of data or damage to equipment.

 Your distribution may have a different way of doing things, or you may prefer
 a different method. These commands are shown only to provide an example of
 what to do and are by no means definitive.

 The following system changes last only until you reboot your system. You
 may want to write a script that runs at boot-up which includes the optimal
 settings for your system.

  Setting PCI Latency Timer::

      setpci -d 1425:* 0x0c.l=0x0000F800

  Disabling TCP timestamp::

      sysctl -w net.ipv4.tcp_timestamps=0

  Disabling SACK::

      sysctl -w net.ipv4.tcp_sack=0

  Setting large number of incoming connection requests::

      sysctl -w net.ipv4.tcp_max_syn_backlog=3000

  Setting maximum receive socket buffer size::

      sysctl -w net.core.rmem_max=1024000

  Setting maximum send socket buffer size::

      sysctl -w net.core.wmem_max=1024000

  Set smp_affinity (on a multiprocessor system) to a single CPU::

      echo 1 > /proc/irq/<interrupt_number>/smp_affinity
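
  The value written to ``smp_affinity`` is a hexadecimal CPU bitmask, so the
  ``1`` above selects CPU 0. A minimal sketch for computing the mask of a
  different single CPU follows. The function name ``cpu_mask`` is an
  assumption made for illustration, and the sketch covers CPU numbers 0-31
  only.

  .. code-block:: sh

      # cpu_mask <cpu_number>
      # Print the hexadecimal smp_affinity bitmask that selects only the
      # given CPU (CPU 0 gives 1, CPU 1 gives 2, CPU 3 gives 8).
      cpu_mask() {
          cpu=${1:-}
          case $cpu in
              ''|*[!0-9]*) echo "usage: cpu_mask <0-31>" >&2; return 1 ;;
          esac
          if [ "$cpu" -gt 31 ]; then
              echo "cpu_mask: CPU number must be 0-31" >&2
              return 1
          fi
          printf '%x\n' $((1 << $cpu))
      }

      # Example (as root; <interrupt_number> is a placeholder as above):
      #   cpu_mask 2 > /proc/irq/<interrupt_number>/smp_affinity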

  Setting default receive socket buffer size::

      sysctl -w net.core.rmem_default=524287

  Setting default send socket buffer size::

      sysctl -w net.core.wmem_default=524287

  Setting maximum option memory buffers::

      sysctl -w net.core.optmem_max=524287

  Setting maximum backlog (# of unprocessed packets before kernel drops)::

      sysctl -w net.core.netdev_max_backlog=300000

  Setting TCP read buffers (min/default/max)::

      sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"

  Setting TCP write buffers (min/default/max)::

      sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"

  Setting TCP buffer space (min/pressure/max)::

      sysctl -w net.ipv4.tcp_mem="10000000 10000000 10000000"
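
  Because these settings are lost at reboot, they can be gathered into a
  script that runs at boot-up. The sketch below applies a subset of the
  example values shown above; it is one possible layout, not a recommended
  configuration. The function name ``apply_tuning`` and the ``DRY_RUN``
  variable are assumptions made for this example.

  .. code-block:: sh

      # apply_tuning
      # Hypothetical boot-time helper that applies example sysctl values
      # from this document. With DRY_RUN=1 the commands are only printed.
      apply_tuning() {
          for setting in \
              "net.ipv4.tcp_timestamps=0" \
              "net.ipv4.tcp_sack=0" \
              "net.core.rmem_max=1024000" \
              "net.core.wmem_max=1024000" \
              "net.ipv4.tcp_rmem=10000000 10000000 10000000" \
              "net.ipv4.tcp_wmem=10000000 10000000 10000000"
          do
              if [ "${DRY_RUN:-0}" = 1 ]; then
                  echo "sysctl -w \"$setting\""
              else
                  # Stop at the first setting the kernel rejects.
                  sysctl -w "$setting" || return 1
              fi
          done
      }

  Running ``DRY_RUN=1 apply_tuning`` lists the commands without changing
  anything, so the values can be reviewed and adjusted for the system first.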

  TCP window size for single connections:

   The receive buffer (RX_WINDOW) size must be at least as large as the
   Bandwidth-Delay Product of the communication link between the sender and
   receiver. Due to the variations of RTT, you may want to increase the buffer
   size up to 2 times the Bandwidth-Delay Product. Reference page 289 of
   "TCP/IP Illustrated, Volume 1, The Protocols" by W. Richard Stevens.

   At 10Gb speeds, use the following formula::

       RX_WINDOW >= 1.25MBytes * RTT(in milliseconds)
       Example for an RTT of 100us: RX_WINDOW = (1,250,000 * 0.1) = 125,000

   RX_WINDOW sizes of 256KB - 512KB should be sufficient.

   Setting the min, default, and max receive buffer (RX_WINDOW) size::

       sysctl -w net.ipv4.tcp_rmem="<min> <default> <max>"
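
   The formula can be checked with shell integer arithmetic. Since 10Gb/s
   is 1.25MBytes per millisecond, it is also 1250 bytes per microsecond,
   which avoids fractional RTT values. This is only a sketch; the function
   name ``rx_window_bytes`` is an assumption made for illustration.

   .. code-block:: sh

       # rx_window_bytes <rtt_in_microseconds>
       # Print the minimum receive window in bytes for a 10Gb/s link:
       # 1250 bytes per microsecond multiplied by the round-trip time.
       rx_window_bytes() {
           case ${1:-} in
               ''|*[!0-9]*)
                   echo "usage: rx_window_bytes <rtt_in_microseconds>" >&2
                   return 1 ;;
           esac
           echo $((1250 * $1))
       }

       # An RTT of 100us gives 125000, matching the example above.
       rx_window_bytes 100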

  TCP window size for multiple connections:

   The receive buffer (RX_WINDOW) size may be calculated in the same way as
   for a single connection, but should be divided by the number of
   connections. The smaller window prevents congestion and facilitates better
   pacing, especially if/when MAC level flow control does not work well or
   when it is not supported on the machine. Experimentation may be necessary
   to attain the correct value. This method is provided as a starting point
   for the correct receive buffer size.

   Setting the min, default, and max receive buffer (RX_WINDOW) size is
   performed in the same manner as for a single connection.
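
   A starting value per connection can be sketched the same way, dividing
   the 10Gb/s window (1250 bytes per microsecond of RTT) by the number of
   connections. The function name ``per_conn_window`` is an assumption made
   for illustration, and the result is only a starting point for
   experimentation.

   .. code-block:: sh

       # per_conn_window <rtt_in_microseconds> <connections>
       # Print the 10Gb/s receive window in bytes divided evenly across
       # the given number of connections (integer division).
       per_conn_window() {
           case "${1:-}:${2:-}" in
               *[!0-9:]*|:*|*:)
                   echo "usage: per_conn_window <rtt_usec> <connections>" >&2
                   return 1 ;;
           esac
           if [ "$2" -lt 1 ]; then
               echo "per_conn_window: connections must be at least 1" >&2
               return 1
           fi
           echo $((1250 * $1 / $2))
       }

       # An RTT of 100us shared by 4 connections gives 31250 bytes each.
       per_conn_window 100 4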


Driver Messages
===============

 The following messages are the most common messages logged by syslog. These
 may be found in /var/log/messages.

  Driver up::

     Chelsio Network Driver - version 2.1.1

  NIC detected::

     eth#: Chelsio N210 1x10GBaseX NIC (rev #), PCIX 133MHz/64-bit

  Link up::

     eth#: link is up at 10 Gbps, full duplex

  Link down::

     eth#: link is down
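
 One way to pick these messages out of a log is a small grep filter, sketched
 below. The function name ``chelsio_msgs`` and the exact patterns are
 assumptions based on the message texts above; the link up/down patterns may
 also match lines logged by other network drivers.

 .. code-block:: sh

     # chelsio_msgs
     # Read syslog-style text on stdin and print only the lines that look
     # like the driver messages listed above.
     chelsio_msgs() {
         grep -E 'Chelsio Network Driver|Chelsio N[12]10 |eth[0-9]+: link is (up at|down)'
     }

     # Example: chelsio_msgs < /var/log/messages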


Known Issues
============

 These issues have been identified during testing. The following information
 is provided as a workaround to each problem. In some cases, the problem is
 inherent to Linux or to a particular Linux distribution and/or hardware
 platform.

  1. Large number of TCP retransmits on a multiprocessor (SMP) system.

      On a system with multiple CPUs, the interrupt (IRQ) for the network
      controller may be bound to more than one CPU. This causes TCP
      retransmits if the packet data is split across different CPUs and
      reassembled in a different order than expected.

      To eliminate the TCP retransmits, set smp_affinity on the particular
      interrupt to a single CPU. You can locate the interrupt (IRQ) used on
      the N110/N210 by using ifconfig::

          ifconfig <dev_name> | grep Interrupt

      Set the smp_affinity to a single CPU::

          echo 1 > /proc/irq/<interrupt_number>/smp_affinity

      It is highly suggested that you do not run the irqbalance daemon on your
      system, as this will change any smp_affinity setting you have applied.
      The irqbalance daemon runs on a 10 second interval and binds interrupts
      to the least loaded CPU determined by the daemon. To disable this daemon::

          chkconfig --level 2345 irqbalance off

      By default, some Linux distributions enable the kernel feature,
      irqbalance, which performs the same function as the daemon. To disable
      this feature, add the following option to the kernel command line in
      your bootloader configuration::

          noirqbalance

      Example using the Grub bootloader::

          title Red Hat Enterprise Linux AS (2.4.21-27.ELsmp)
          root (hd0,0)
          kernel /vmlinuz-2.4.21-27.ELsmp ro root=/dev/hda3 noirqbalance
          initrd /initrd-2.4.21-27.ELsmp.img

  2. After running insmod, the driver is loaded and the incorrect network
     interface is brought up without running ifup.

      When using 2.4.x kernels, including RHEL kernels, the Linux kernel
      invokes a script named "hotplug". This script is primarily used to
      automatically bring up USB devices when they are plugged in. However,
      the script also attempts to automatically bring up a network interface
      after loading the kernel module. The hotplug script does this by scanning
      the ifcfg-eth# config files in /etc/sysconfig/network-scripts, looking
      for HWADDR=<mac_address>.

      If the hotplug script does not find the HWADDR within any of the
      ifcfg-eth# files, it will bring up the device with the next available
      interface name. If this interface is already configured for a different
      network card, your new interface will have incorrect IP address and
      network settings.

      To solve this issue, you can add the HWADDR=<mac_address> key to the
      interface config file of your network controller.

      To disable this "hotplug" feature, you may add the driver (module name)
      to the "blacklist" file located in /etc/hotplug. However, it has been
      noted that this does not work for network devices because the net.agent
      script does not use the blacklist file. Instead, remove or rename the
      net.agent script located in /etc/hotplug to disable this feature.

  3. Transport Protocol (TP) hangs when running heavy multi-connection traffic
     on an AMD Opteron system with HyperTransport PCI-X Tunnel chipset.

      If your AMD Opteron system uses the AMD-8131 HyperTransport PCI-X Tunnel
      chipset, you may experience the "133-MHz Mode Split Completion Data
      Corruption" bug identified by AMD while using a 133MHz PCI-X card on the
      PCI-X bus.

      AMD states, "Under highly specific conditions, the AMD-8131 PCI-X Tunnel
      can provide stale data via split completion cycles to a PCI-X card that
      is operating at 133 MHz", causing data corruption.

      AMD provides three workarounds for this problem; however, Chelsio
      recommends the first option for best performance with this bug:

        For 133MHz secondary bus operation, limit the transaction length and
        the number of outstanding transactions, via BIOS configuration
        programming of the PCI-X card, to the following:

           Data Length (bytes): 1k

           Total allowed outstanding transactions: 2

      Please refer to AMD 8131-HT/PCI-X Errata 26310 Rev 3.08 August 2004,
      section 56, "133-MHz Mode Split Completion Data Corruption" for more
      details on this bug and the workarounds suggested by AMD.

      It may be possible to work outside AMD's recommended PCI-X settings. Try
      increasing the Data Length to 2k bytes for increased performance. If you
      have issues with these settings, please revert to the "safe" settings
      and duplicate the problem before submitting a bug or asking for support.

      .. note::

            The default setting on most systems is 8 outstanding transactions
            and 2k bytes data length.

  4. On multiprocessor systems, it has been noted that an application which
     is handling 10Gb networking can switch between CPUs, causing degraded
     and/or unstable performance.

      If running on an SMP system and taking performance measurements, it
      is suggested you either run netperf 2.4.0 or later, or use a binding
      tool such as Tim Hockin's procstate utilities (runon)
      <http://www.hockin.org/~thockin/procstate/>.

      Binding netserver and netperf (or other applications) to particular
      CPUs can make a significant difference in performance measurements.
      You may need to experiment to find which CPU to bind the application
      to in order to achieve the best performance for your system.

      If you are developing an application designed for 10Gb networking,
      please keep in mind you may want to look at the sched_setaffinity and
      sched_getaffinity system calls to bind your application.

      If you are just running user-space applications such as ftp, telnet,
      etc., you may want to try the runon tool provided by Tim Hockin's
      procstate utility. You could also try binding the interface to a
      particular CPU: ``runon 0 ifup eth0``
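
      Where the runon tool is not available, the ``taskset`` utility from
      util-linux can bind a command to a CPU in a similar way. It is not
      mentioned in the original release notes and is used here as an assumed
      substitute; the wrapper name ``bind_run`` and the ``DRY_RUN`` variable
      are likewise illustrative.

      .. code-block:: sh

          # bind_run <cpu> <command> [args...]
          # Hypothetical wrapper: run a command bound to one CPU through
          # taskset. With DRY_RUN=1 the command line is only printed.
          bind_run() {
              if [ "$#" -lt 2 ]; then
                  echo "usage: bind_run <cpu> <command> [args...]" >&2
                  return 1
              fi
              cpu=$1
              shift
              if [ "${DRY_RUN:-0}" = 1 ]; then
                  echo "taskset -c $cpu $*"
              else
                  taskset -c "$cpu" "$@"
              fi
          }

          # Example: bind_run 0 ifup eth0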


Support
=======

 If you have problems with the software or hardware, please contact our
 customer support team via email at support@chelsio.com or check our website
 at http://www.chelsio.com

-------------------------------------------------------------------------------

::

 Chelsio Communications
 370 San Aleso Ave.
 Suite 100
 Sunnyvale, CA 94085
 http://www.chelsio.com

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License, version 2, as
published by the Free Software Foundation.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.

THIS SOFTWARE IS PROVIDED ``AS IS`` AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

Copyright |copy| 2003-2005 Chelsio Communications. All rights reserved.
    393Copyright |copy| 2003-2005 Chelsio Communications. All rights reserved.