17.5. Packet Transmission


The most important tasks performed by network interfaces are data transmission and reception. We start with transmission because it is slightly easier to understand.

Transmission refers to the act of sending a packet over a network link. Whenever the kernel needs to transmit a data packet, it calls the driver's hard_start_xmit method to put the data on an outgoing queue. Each packet handled by the kernel is contained in a socket buffer structure (struct sk_buff), whose definition is found in <linux/skbuff.h>. The structure gets its name from the Unix abstraction used to represent a network connection, the socket. Even if the interface has nothing to do with sockets, each network packet belongs to a socket in the higher network layers, and the input/output buffers of any socket are lists of struct sk_buff structures. The same sk_buff structure is used to host network data throughout all the Linux network subsystems, but a socket buffer is just a packet as far as the interface is concerned.

A pointer to sk_buff is usually called skb, and we follow this practice both in the sample code and in the text.

The socket buffer is a complex structure, and the kernel offers a number of functions to act on it. The functions are described later in Section 17.10; for now, a few basic facts about sk_buff are enough for us to write a working driver.

The socket buffer passed to hard_start_xmit contains the physical packet as it should appear on the media, complete with the transmission-level headers. The interface doesn't need to modify the data being transmitted. skb->data points to the packet being transmitted, and skb->len is its length in octets. This situation gets a little more complicated if your driver can handle scatter/gather I/O; we get to that in Section 17.5.3.

The snull packet transmission code follows; the physical transmission machinery has been isolated in another function, because every interface driver must implement it according to the specific hardware being driven:

int snull_tx(struct sk_buff *skb, struct net_device *dev)
{
    int len;
    char *data, shortpkt[ETH_ZLEN];
    struct snull_priv *priv = netdev_priv(dev);

    data = skb->data;
    len = skb->len;
    if (len < ETH_ZLEN) {
        /* Zero-pad short packets so no stale stack data is sent */
        memset(shortpkt, 0, ETH_ZLEN);
        memcpy(shortpkt, skb->data, skb->len);
        len = ETH_ZLEN;
        data = shortpkt;
    }
    dev->trans_start = jiffies; /* save the timestamp */

    /* Remember the skb, so we can free it at interrupt time */
    priv->skb = skb;

    /* actual delivery of data is device-specific, and not shown here */
    snull_hw_tx(data, len, dev);

    return 0; /* Our simple device can not fail */
}

 

The transmission function, thus, just performs some sanity checks on the packet and transmits the data through the hardware-related function. Do note, however, the care that is taken when the packet to be transmitted is shorter than the minimum length supported by the underlying media (which, for snull, is our virtual "Ethernet"). Many Linux network drivers (and those for other operating systems as well) have been found to leak data in such situations. Rather than create that sort of security vulnerability, we copy short packets into a separate array that we can explicitly zero-pad out to the full length required by the media. (We can safely put that data on the stack, since the minimum length—60 bytes—is quite small).
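The padding step can be tried outside the kernel. The following user-space sketch is only an illustration: pad_short_frame is a hypothetical helper, not part of snull or the kernel, and ETH_ZLEN is redefined locally with the value the kernel headers use (60). It copies a short frame into a caller-supplied buffer and zero-fills the rest, so nothing left over in that buffer can reach the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ZLEN 60 /* minimum Ethernet frame length, without the FCS */

/*
 * Hypothetical helper (not a kernel API): if the frame is shorter than
 * ETH_ZLEN, copy it into 'padded' and zero-fill the tail. '*out' is set
 * to the bytes that should be transmitted; the return value is their length.
 */
static size_t pad_short_frame(const unsigned char *data, size_t len,
                              unsigned char padded[ETH_ZLEN],
                              const unsigned char **out)
{
    assert(data != NULL && padded != NULL && out != NULL);

    if (len >= ETH_ZLEN) {
        *out = data;              /* long enough: send the original bytes */
        return len;
    }
    memset(padded, 0, ETH_ZLEN);  /* clear first, so no stale bytes remain */
    memcpy(padded, data, len);
    *out = padded;
    return ETH_ZLEN;
}
```

A caller would declare an ETH_ZLEN-byte array on its own stack, as snull_tx does with shortpkt, and pass it in as the padded argument.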

The return value from hard_start_xmit should be 0 on success; at that point, your driver has taken responsibility for the packet, should make its best effort to ensure that transmission succeeds, and must free the skb at the end. A nonzero return value indicates that the packet could not be transmitted at this time; the kernel will retry later. In this situation, your driver should stop the queue until whatever situation caused the failure has been resolved.

The "hardware-related" transmission function (snull_hw_tx) is omitted here since it is entirely occupied with implementing the trickery of the snull device, including manipulating the source and destination addresses, and has little of interest to authors of real network drivers. It is present, of course, in the sample source for those who want to go in and see how it works.

17.5.1. Controlling Transmission Concurrency

The hard_start_xmit function is protected from concurrent calls by a spinlock (xmit_lock) in the net_device structure. As soon as the function returns, however, it may be called again. The function returns when the software is done instructing the hardware about packet transmission, but hardware transmission will likely not have been completed. This is not an issue with snull, which does all of its work using the CPU, so packet transmission is complete before the transmission function returns.

Real hardware interfaces, on the other hand, transmit packets asynchronously and have a limited amount of memory available to store outgoing packets. When that memory is exhausted (which, for some hardware, happens with a single outstanding packet to transmit), the driver needs to tell the networking system not to start any more transmissions until the hardware is ready to accept new data.

This notification is accomplished by calling netif_stop_queue, the function introduced earlier to stop the queue. Once your driver has stopped its queue, it must arrange to restart the queue at some point in the future, when it is again able to accept packets for transmission. To do so, it should call:

void netif_wake_queue(struct net_device *dev);

 

This function is just like netif_start_queue, except that it also pokes the networking system to make it start transmitting packets again.

Most modern network hardware maintains an internal queue with multiple packets to transmit; in this way it can get the best performance from the network. Network drivers for these devices must support having multiple transmissions outstanding at any given time, but device memory can fill up whether or not the hardware supports multiple outstanding transmissions. Whenever device memory fills to the point that there is no room for the largest possible packet, the driver should stop the queue until space becomes available again.
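The stop/wake discipline can be modeled in user space. The sketch below is an assumption-laden toy, not driver code: struct tx_ring, ring_xmit, ring_tx_done, and TX_RING_SIZE are hypothetical names, every packet is assumed to occupy exactly one slot, and a boolean stands in for the effect of netif_stop_queue and netif_wake_queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TX_RING_SIZE 4      /* hypothetical number of hardware TX slots */

struct tx_ring {
    size_t in_flight;       /* packets handed to the "hardware", not yet done */
    bool queue_stopped;     /* models the effect of netif_stop_queue() */
};

/* Models hard_start_xmit: returns 0 on success, nonzero if the ring is full. */
static int ring_xmit(struct tx_ring *r)
{
    assert(r != NULL);

    if (r->in_flight == TX_RING_SIZE) {
        r->queue_stopped = true;
        return 1;           /* caller is expected to retry later */
    }
    r->in_flight++;
    if (r->in_flight == TX_RING_SIZE)
        r->queue_stopped = true;    /* stop before the next call can fail */
    return 0;
}

/* Models the transmit-done interrupt: free a slot and wake the queue. */
static void ring_tx_done(struct tx_ring *r)
{
    assert(r != NULL && r->in_flight > 0);

    r->in_flight--;
    if (r->queue_stopped)
        r->queue_stopped = false;   /* models netif_wake_queue() */
}
```

Stopping the queue as soon as the last slot is taken, rather than waiting for a failed call, means the nonzero return path should rarely be exercised in practice.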

If you must disable packet transmission from anywhere other than your hard_start_xmit function (in response to a reconfiguration request, perhaps), the function you want to use is:

void netif_tx_disable(struct net_device *dev);

 

This function behaves much like netif_stop_queue, but it also ensures that, when it returns, your hard_start_xmit method is not running on another CPU. The queue can be restarted with netif_wake_queue, as usual.

17.5.2. Transmission Timeouts

Most drivers that deal with real hardware have to be prepared for that hardware to fail to respond occasionally. Interfaces can forget what they are doing, or the system can lose an interrupt. This sort of problem is common with some devices designed to run on personal computers.

Many drivers handle this problem by setting timers; if the operation has not completed by the time the timer expires, something is wrong. The network system, as it happens, is essentially a complicated assembly of state machines controlled by a mass of timers. As such, the networking code is in a good position to detect transmission timeouts as part of its regular operation.

Thus, network drivers need not worry about detecting such problems themselves. Instead, they need only set a timeout period, which goes in the watchdog_timeo field of the net_device structure. This period, which is in jiffies, should be long enough to account for normal transmission delays (such as collisions caused by congestion on the network media).

If the current system time exceeds the device's trans_start time by at least the timeout period, the networking layer eventually calls the driver's tx_timeout method. That method's job is to do whatever is needed to clear up the problem and to ensure the proper completion of any transmissions that were already in progress. It is important, in particular, that the driver not lose track of any socket buffers that have been entrusted to it by the networking code.
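The timeout condition amounts to a comparison of two jiffies values. The following user-space sketch models it under stated assumptions: tx_timed_out is a hypothetical function, not the kernel's own watchdog code, and all three arguments are taken to be jiffies counts held in unsigned long, as the kernel's jiffies counter is.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the watchdog test. All values are in jiffies.
 * Unsigned subtraction gives the elapsed time even if the counter has
 * wrapped around between trans_start and now.
 */
static bool tx_timed_out(unsigned long now, unsigned long trans_start,
                         unsigned long watchdog_timeo)
{
    assert(watchdog_timeo > 0);
    return (now - trans_start) >= watchdog_timeo;
}
```

Computing the difference first, instead of comparing now against trans_start + watchdog_timeo directly, is what keeps the check valid across a counter wrap.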

snull has the ability to simulate transmitter lockups, which is controlled by two load-time parameters:

static int lockup = 0;
module_param(lockup, int, 0);
static int timeout = SNULL_TIMEOUT;
module_param(timeout, int, 0);

 

If the driver is loaded with the parameter lockup=n, a lockup is simulated once every n packets transmitted, and the watchdog_timeo field is set to the given timeout value. When simulating lockups, snull also calls netif_stop_queue to prevent other transmission attempts from occurring.
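One plausible way to decide which packets trigger the simulated lockup is a modulo test on the transmit counter. The sketch below is an assumption about the shape of that check, not a quotation of the snull source; should_lock_up and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical predicate: with lockup == n, report a simulated lockup for
 * every n-th packet. 'packets_sent' counts packets already transmitted,
 * so the packet being sent now is number packets_sent + 1.
 * lockup == 0 disables the simulation.
 */
static bool should_lock_up(unsigned long packets_sent, int lockup)
{
    assert(lockup >= 0);

    if (lockup == 0)
        return false;
    return ((packets_sent + 1) % (unsigned long)lockup) == 0;
}
```

With lockup=3, for instance, this predicate is true for the third, sixth, and ninth packets and false for the rest.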

The snull transmission timeout handler looks like this:

void snull_tx_timeout (struct net_device *dev)
{
    struct snull_priv *priv = netdev_priv(dev);

    PDEBUG("Transmit timeout at %ld, latency %ld\n", jiffies,
            jiffies - dev->trans_start);
    /* Simulate a transmission interrupt to get things moving */
    priv->status = SNULL_TX_INTR;
    snull_interrupt(0, dev, NULL);
    priv->stats.tx_errors++;   /* record the error in the statistics */
    netif_wake_queue(dev);     /* let the kernel transmit again */
    return;
}

 

When a transmission timeout happens, the driver must mark the error in the interface statistics and arrange for the device to be reset to a sane state so that new packets can be transmitted. When a timeout happens in snull, the driver calls snull_interrupt to fill in the "missing" interrupt and restarts the transmit queue with netif_wake_queue.

17.5.3. Scatter/Gather I/O

The process of creating a packet for transmission on the network involves assembling multiple pieces. Packet data must often be copied in from user space, and the headers used by various levels of the network stack must be added as well. This assembly can require a fair amount of data copying. If, however, the network interface that is destined to transmit the packet can perform scatter/gather I/O, the packet need not be assembled into a single chunk, and much of that copying can be avoided. Scatter/gather I/O also enables "zero-copy" transmission of network data directly from user-space buffers.

The kernel does not pass scattered packets to your hard_start_xmit method unless the NETIF_F_SG bit has been set in the features field of your device structure. If you have set that flag, you need to look at a special "shared info" field within the skb to see whether the packet is made up of a single fragment or many and to find the scattered fragments if need be. A special macro exists to access this information; it is called skb_shinfo. The first step when transmitting potentially fragmented packets usually looks something like this:

if (skb_shinfo(skb)->nr_frags == 0) {
    /* Just use skb->data and skb->len as usual */
}

 

The nr_frags field tells how many fragments have been used to build the packet. If it is 0, the packet exists in a single piece and can be accessed via the data field as usual. If, however, it is nonzero, your driver must pass through and arrange to transfer each individual fragment. The data field of the skb structure points conveniently to the first fragment (as compared to the full packet, as in the unfragmented case). The length of the fragment must be calculated by subtracting skb->data_len from skb->len (which still contains the length of the full packet). The remaining fragments are to be found in an array called frags in the shared information structure; each entry in frags is an skb_frag_struct structure:

struct skb_frag_struct {
    struct page *page;
    __u16 page_offset;
    __u16 size;
};

 

As you can see, we are once again dealing with page structures, rather than kernel virtual addresses. Your driver should loop through the fragments, mapping each for a DMA transfer and not forgetting the first fragment, which is pointed to by the skb directly. Your hardware, of course, must assemble the fragments and transmit them as a single packet. Note that, if you have set the NETIF_F_HIGHDMA feature flag, some or all of the fragments may be located in high memory.
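The length bookkeeping for a fragmented packet can be checked with a small user-space model. Everything in this sketch is a stand-in: struct pkt and struct frag are hypothetical simplifications of sk_buff and skb_frag_struct, MAX_FRAGS is an arbitrary limit, and the loop only sums sizes where a real driver would map each page for DMA.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAGS 4     /* hypothetical limit for this model */

struct frag {
    size_t size;        /* stands in for skb_frag_struct's size field */
};

struct pkt {
    size_t len;             /* total packet length, like skb->len */
    size_t data_len;        /* bytes held in fragments, like skb->data_len */
    unsigned int nr_frags;  /* like skb_shinfo(skb)->nr_frags */
    struct frag frags[MAX_FRAGS];
};

/* Length of the first piece, the one skb->data would point to. */
static size_t head_len(const struct pkt *p)
{
    assert(p != NULL && p->data_len <= p->len);
    return p->len - p->data_len;
}

/* Visit every piece in transmit order and return the bytes covered. */
static size_t walk_pieces(const struct pkt *p)
{
    size_t total = head_len(p);     /* do not forget the first fragment */

    assert(p->nr_frags <= MAX_FRAGS);
    for (unsigned int i = 0; i < p->nr_frags; i++)
        total += p->frags[i].size;  /* a real driver would map the page here */
    return total;
}
```

If the head length and the fragment sizes do not add up to the total length, the driver has miscounted a piece, which makes this sum a useful sanity check while debugging.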
