16.1. Registration
Block drivers, like char drivers, must use a set of registration interfaces to make their devices available to the kernel. The concepts are similar, but the details of block device registration are all different. You have a whole new set of data structures and device operations to learn.
16.1.1. Block Driver Registration
The first step taken by most block drivers is to register themselves with the kernel. The function for this task is register_blkdev (which is declared in <linux/fs.h>):
int register_blkdev(unsigned int major, const char *name);
The arguments are the major number that your device will be using and the associated name (which the kernel will display in /proc/devices). If major is passed as 0, the kernel allocates a new major number and returns it to the caller. As always, a negative return value from register_blkdev indicates that an error has occurred.
The corresponding function for canceling a block driver registration is:
int unregister_blkdev(unsigned int major, const char *name);
Here, the arguments must match those passed to register_blkdev, or the function returns -EINVAL and does not unregister anything.
In the 2.6 kernel, the call to register_blkdev is entirely optional. The functions performed by register_blkdev have been decreasing over time; the only tasks performed by this call at this point are (1) allocating a dynamic major number if requested, and (2) creating an entry in /proc/devices. In future kernels, register_blkdev may be removed altogether. Meanwhile, however, most drivers still call it; it's traditional.
16.1.2. Disk Registration
While register_blkdev can be used to obtain a major number, it does not make any disk drives available to the system. There is a separate registration interface that you must use to manage individual drives. Using this interface requires familiarity with a pair of new structures, so that is where we start.
16.1.2.1 Block device operations
Char devices make their operations available to the system by way of the file_operations structure. A similar structure is used with block devices; it is struct block_device_operations, which is declared in <linux/fs.h>. The following is a brief overview of the fields found in this structure; we revisit them in more detail when we get into the details of the sbull driver:
int (*open)(struct inode *inode, struct file *filp);
int (*release)(struct inode *inode, struct file *filp);
Functions that work just like their char driver equivalents; they are called whenever the device is opened and closed. A block driver might respond to an open call by spinning up the device, locking the door (for removable media), etc. If you lock media into the device, you should certainly unlock it in the release method.
int (*ioctl)(struct inode *inode, struct file *filp, unsigned int cmd,
unsigned long arg);
Method that implements the ioctl system call. The block layer first intercepts a large number of standard requests, however, so most block driver ioctl methods are fairly short.
int (*media_changed) (struct gendisk *gd);
Method called by the kernel to check whether the user has changed the media in the drive, returning a nonzero value if so. Obviously, this method is only applicable to drives that support removable media (and that are smart enough to make a "media changed" flag available to the driver); it can be omitted in other cases.
The struct gendisk argument is how the kernel represents a single disk; we will be looking at that structure in the next section.
int (*revalidate_disk) (struct gendisk *gd);
The revalidate_disk method is called in response to a media change; it gives the driver a chance to perform whatever work is required to make the new media ready for use. The function returns an int value, but that value is ignored by the kernel.
struct module *owner;
A pointer to the module that owns this structure; it should usually be initialized to THIS_MODULE.
Attentive readers may have noticed an interesting omission from this list: there are no functions that actually read or write data. In the block I/O subsystem, these operations are handled by the request function, which deserves a large section of its own and is discussed later in the chapter. Before we can talk about servicing requests, we must complete our discussion of disk registration.
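To make the shape of such an operations table concrete, here is a minimal user-space sketch. It is only a model: mock_disk, mock_block_ops, and every function in it are hypothetical stand-ins invented for illustration, not the kernel's types. It shows open and release keeping a user count and locking the "door" while the device is in use, and a caller that treats an omitted media_changed method as "no change."

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins; none of these are the kernel's real types. */
struct mock_disk {
    int users;          /* how many opens are outstanding */
    int door_locked;    /* 1 while removable media is locked in */
    int media_flag;     /* nonzero when the "media" has been swapped */
};

struct mock_block_ops {
    int (*open)(struct mock_disk *d);
    int (*release)(struct mock_disk *d);
    int (*media_changed)(struct mock_disk *d);   /* optional, may be NULL */
};

static int mock_open(struct mock_disk *d)
{
    if (d->users++ == 0)
        d->door_locked = 1;     /* first opener locks the door */
    return 0;
}

static int mock_release(struct mock_disk *d)
{
    assert(d->users > 0);       /* release must pair with an open */
    if (--d->users == 0)
        d->door_locked = 0;     /* last closer unlocks it again */
    return 0;
}

static int mock_media_changed(struct mock_disk *d)
{
    return d->media_flag;       /* nonzero means the media changed */
}

/* Caller side: a table without media_changed reports "no change". */
static int check_media(const struct mock_block_ops *ops, struct mock_disk *d)
{
    if (ops->media_changed == NULL)
        return 0;
    return ops->media_changed(d) != 0;
}

/* A table for a removable device, and one for a fixed device. */
static const struct mock_block_ops removable_ops = {
    .open = mock_open,
    .release = mock_release,
    .media_changed = mock_media_changed,
};

static const struct mock_block_ops fixed_ops = {
    .open = mock_open,
    .release = mock_release,    /* media_changed deliberately omitted */
};
```

Calling check_media(&fixed_ops, &disk) always yields 0, while the removable table reports whatever the flag says. A real driver would fill in a struct block_device_operations instead and would normally set its owner field as well.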
16.1.2.2 The gendisk structure
struct gendisk (declared in <linux/genhd.h>) is the kernel's representation of an individual disk device. In fact, the kernel also uses gendisk structures to represent partitions, but driver authors need not be aware of that. There are several fields in struct gendisk that must be initialized by a block driver:
int major;
int first_minor;
int minors;
Fields that describe the device number(s) used by the disk. A drive must use at least one minor number. If your drive is to be partitionable, however (and most should be), you want to allocate one minor number for each possible partition as well. A common value for minors is 16, which allows for the "full disk" device and 15 partitions. Some disk drivers use 64 minor numbers for each device.
char disk_name[32];
Field that should be set to the name of the disk device. It shows up in /proc/partitions and sysfs.
struct block_device_operations *fops;
Set of device operations from the previous section.
struct request_queue *queue;
Structure used by the kernel to manage I/O requests for this device; we examine it in Section 16.3.
int flags;
A (little-used) set of flags describing the state of the drive. If your device has removable media, you should set GENHD_FL_REMOVABLE. CD-ROM drives can set GENHD_FL_CD. If, for some reason, you do not want partition information to show up in /proc/partitions, set GENHD_FL_SUPPRESS_PARTITION_INFO.
sector_t capacity;
The capacity of this drive, in 512-byte sectors. The sector_t type can be 64 bits wide. Drivers should not set this field directly; instead, pass the number of sectors to set_capacity.
void *private_data;
Block drivers may use this field for a pointer to their own internal data.
The kernel provides a small set of functions for working with gendisk structures. We introduce them here, then see how sbull uses them to make its disk devices available to the system.
struct gendisk is a dynamically allocated structure that requires special kernel manipulation to be initialized; drivers cannot allocate the structure on their own. Instead, you must call:
struct gendisk *alloc_disk(int minors);
The minors argument should be the number of minor numbers this disk uses; note that you cannot change the minors field later and expect things to work properly.
When a disk is no longer needed, it should be freed with:
void del_gendisk(struct gendisk *gd);
A gendisk is a reference-counted structure (it contains a kobject). There are get_disk and put_disk functions available to manipulate the reference count, but drivers should never need to do that. Normally, the call to del_gendisk removes the final reference to a gendisk, but there are no guarantees of that. Thus, it is possible that the structure could continue to exist (and your methods could be called) after a call to del_gendisk. If you delete the structure when there are no users (that is, after the final release or in your module cleanup function), however, you can be sure that you will not hear from it again.
Allocating a gendisk structure does not make the disk available to the system. To do that, you must initialize the structure and call add_disk:
void add_disk(struct gendisk *gd);
Keep one important thing in mind here: as soon as you call add_disk, the disk is "live" and its methods can be called at any time. In fact, the first such calls will probably happen even before add_disk returns; the kernel will read the first few blocks in an attempt to find a partition table. So you should not call add_disk until your driver is completely initialized and ready to respond to requests on that disk.
16.1.3. Initialization in sbull
It is time to get down to some examples. The sbull driver (available from O'Reilly's FTP site with the rest of the example source) implements a set of in-memory virtual disk drives. For each drive, sbull allocates (with vmalloc, for simplicity) an array of memory; it then makes that array available via block operations. The sbull driver can be tested by partitioning the virtual device, building filesystems on it, and mounting it in the system hierarchy.
Like our other example drivers, sbull allows a major number to be specified at compile or module load time. If no number is specified, one is allocated dynamically. Since a call to register_blkdev is required for dynamic allocation, sbull does so:
sbull_major = register_blkdev(sbull_major, "sbull");
if (sbull_major <= 0) {
    printk(KERN_WARNING "sbull: unable to get major number\n");
    return -EBUSY;
}
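The return value deserves a little care in a driver that supports both a fixed and a dynamic major number. To the best of our knowledge, the 2.6 implementation returns the newly allocated major only when 0 was requested; a successful call with a nonzero major returns 0, so a test of the form "<= 0" would treat that success as a failure. The user-space sketch below shows one way to interpret the result under that assumption. resolve_major is a hypothetical helper invented here, not a kernel function.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel API): given the major number that was
 * requested and the value register_blkdev returned, work out which major
 * the driver ended up with. Returns that major on success, or the negative
 * error code on failure.
 *
 * Assumed convention: a negative return is an error; when 0 was requested
 * the return value is the dynamically allocated major; when a nonzero
 * major was requested, a return of 0 means that major was granted.
 */
static int resolve_major(unsigned int requested, int ret)
{
    if (ret < 0)
        return ret;             /* registration failed */
    if (requested == 0) {
        assert(ret > 0);        /* dynamic allocation hands back the major */
        return ret;
    }
    return (int)requested;      /* the static major was accepted */
}
```

With this helper, a driver would keep its requested number after a successful static registration, adopt the returned number after a dynamic one, and bail out only on a negative result.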
Also, like the other virtual devices we have presented in this book, the sbull device is described by an internal structure:
struct sbull_dev {
    int size;                       /* Device size in sectors */
    u8 *data;                       /* The data array */
    short users;                    /* How many users */
    short media_change;             /* Flag a media change? */
    spinlock_t lock;                /* For mutual exclusion */
    struct request_queue *queue;    /* The device request queue */
    struct gendisk *gd;             /* The gendisk structure */
    struct timer_list timer;        /* For simulated media changes */
};
Several steps are required to initialize this structure and make the associated device available to the system. We start with basic initialization and allocation of the underlying memory:
memset (dev, 0, sizeof (struct sbull_dev));
dev->size = nsectors*hardsect_size;
dev->data = vmalloc(dev->size);
if (dev->data == NULL) {
    printk (KERN_NOTICE "vmalloc failure.\n");
    return;
}
spin_lock_init(&dev->lock);
It's important to allocate and initialize a spinlock before the next step, which is the allocation of the request queue. We look at this process in more detail when we get to request processing; for now, suffice it to say that the necessary call is:
dev->queue = blk_init_queue(sbull_request, &dev->lock);
Here, sbull_request is our request function, the function that actually performs block read and write requests. When we allocate a request queue, we must provide a spinlock that controls access to that queue. The lock is provided by the driver rather than the general parts of the kernel because, often, the request queue and other driver data structures fall within the same critical section; they tend to be accessed together. As with any function that allocates memory, blk_init_queue can fail, so you must check the return value before continuing.
Once we have our device memory and request queue in place, we can allocate, initialize, and install the corresponding gendisk structure. The code that does this work is:
dev->gd = alloc_disk(SBULL_MINORS);
if (! dev->gd) {
    printk (KERN_NOTICE "alloc_disk failure\n");
    goto out_vfree;
}
dev->gd->major = sbull_major;
dev->gd->first_minor = which*SBULL_MINORS;
dev->gd->fops = &sbull_ops;
dev->gd->queue = dev->queue;
dev->gd->private_data = dev;
snprintf (dev->gd->disk_name, 32, "sbull%c", which + 'a');
set_capacity(dev->gd, nsectors*(hardsect_size/KERNEL_SECTOR_SIZE));
add_disk(dev->gd);
Here, SBULL_MINORS is the number of minor numbers each sbull device supports. When we set the first minor number for each device, we must take into account all of the numbers taken by prior devices. The name of the disk is set such that the first one is sbulla, the second sbullb, and so on. User space can then add partition numbers so that the third partition on the second device might be /dev/sbullb3.
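The numbering and naming scheme just described can be worked through in ordinary user-space C. The following is a minimal sketch, not driver code: the value 16 for SBULL_MINORS is an assumption made for the example, and first_minor, partition_minor, and node_name are hypothetical helpers. It relies on partition N of a disk using the disk's first minor number plus N, with 0 standing for the whole disk.

```c
#include <assert.h>
#include <stdio.h>

#define SBULL_MINORS 16   /* assumed value for this sketch */

/* First minor number owned by device number `which` (counting from 0). */
static int first_minor(int which)
{
    return which * SBULL_MINORS;
}

/* Minor number of partition `part` on device `which`; 0 is the whole disk. */
static int partition_minor(int which, int part)
{
    assert(part >= 0 && part < SBULL_MINORS);
    return first_minor(which) + part;
}

/* Build names such as "sbullb" (whole disk) or "sbullb3" (partition 3). */
static void node_name(char *buf, size_t len, int which, int part)
{
    if (part == 0)
        snprintf(buf, len, "sbull%c", which + 'a');
    else
        snprintf(buf, len, "sbull%c%d", which + 'a', part);
}
```

Under these assumptions, the third partition on the second device (which = 1, part = 3) gets minor number 19 and the name sbullb3.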
Once everything is set up, we finish with a call to add_disk. Chances are that several of our methods will have been called for that disk by the time add_disk returns, so we take care to make that call the very last step in the initialization of our device.
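The ordering rule (acquire every resource, check each step, and go live only at the very end) can be illustrated with a small user-space model. This is a sketch under stated assumptions, not sbull's actual setup code: malloc stands in for the kernel allocators, the live flag stands in for the call to add_disk, and mock_dev, mock_setup, mock_teardown, and the fail_at parameter are all hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* User-space stand-in for a driver's per-device structure. */
struct mock_dev {
    unsigned char *data;   /* stands in for the vmalloc'ed data array */
    void *queue;           /* stands in for the request queue */
    void *disk;            /* stands in for the gendisk */
    int live;              /* set only once everything else is ready */
};

/*
 * Set up a device in three steps, unwinding in reverse order on failure.
 * fail_at: 0 = no simulated failure, 1..3 = pretend that step failed.
 * Returns 0 on success and -1 on failure.
 */
static int mock_setup(struct mock_dev *dev, size_t bytes, int fail_at)
{
    assert(bytes > 0);
    dev->data = NULL;
    dev->queue = NULL;
    dev->disk = NULL;
    dev->live = 0;

    dev->data = (fail_at == 1) ? NULL : malloc(bytes);
    if (!dev->data)
        goto out;
    dev->queue = (fail_at == 2) ? NULL : malloc(1);
    if (!dev->queue)
        goto out_data;
    dev->disk = (fail_at == 3) ? NULL : malloc(1);
    if (!dev->disk)
        goto out_queue;

    dev->live = 1;          /* the add_disk analogue: always the last step */
    return 0;

out_queue:
    free(dev->queue);
    dev->queue = NULL;
out_data:
    free(dev->data);
    dev->data = NULL;
out:
    return -1;
}

/* Undo a successful mock_setup. */
static void mock_teardown(struct mock_dev *dev)
{
    dev->live = 0;          /* stop accepting work before freeing anything */
    free(dev->disk);
    free(dev->queue);
    free(dev->data);
    dev->disk = NULL;
    dev->queue = NULL;
    dev->data = NULL;
}
```

Because live is set only after every allocation has succeeded, nothing can observe a half-built device; a failure at any step leaves the structure empty and not live.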
16.1.4. A Note on Sector Sizes
As we have mentioned before, the kernel treats every disk as a linear array of 512-byte sectors. Not all hardware uses that sector size, however. Getting a device with a different sector size to work is not particularly hard; it is just a matter of taking care of a few details. The sbull device exports a hardsect_size parameter that can be used to change the "hardware" sector size of the device; by looking at its implementation, you can see how to add this sort of support to your own drivers.
The first of those details is to inform the kernel of the sector size your device supports. The hardware sector size is a parameter in the request queue, rather than in the gendisk structure. This size is set with a call to blk_queue_hardsect_size immediately after the queue is allocated:
blk_queue_hardsect_size(dev->queue, hardsect_size);
Once that is done, the kernel adheres to your device's hardware sector size. All I/O requests are properly aligned at the beginning of a hardware sector, and the length of each request is an integral number of sectors. You must remember, however, that the kernel always expresses itself in 512-byte sectors; thus, it is necessary to translate all sector numbers accordingly. So, for example, when sbull sets the capacity of the device in its gendisk structure, the call looks like:
set_capacity(dev->gd, nsectors*(hardsect_size/KERNEL_SECTOR_SIZE));
KERNEL_SECTOR_SIZE is a locally-defined constant that we use to scale between the kernel's 512-byte sectors and whatever size we have been told to use. This sort of calculation pops up frequently as we look at the sbull request processing logic.
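The scaling arithmetic is easy to check in user space. The sketch below mirrors the set_capacity calculation and adds two related conversions; kernel_capacity, sector_to_offset, and kernel_to_hw_sector are hypothetical helper names, and the code assumes that the hardware sector size is a multiple of 512.

```c
#include <assert.h>

#define KERNEL_SECTOR_SIZE 512   /* the kernel's fixed sector unit */

/* Capacity in 512-byte sectors of a device with nsectors hardware sectors. */
static unsigned long kernel_capacity(unsigned long nsectors,
                                     unsigned int hardsect_size)
{
    /* Assumption of this sketch: sector size is a multiple of 512. */
    assert(hardsect_size >= KERNEL_SECTOR_SIZE);
    assert(hardsect_size % KERNEL_SECTOR_SIZE == 0);
    return nsectors * (hardsect_size / KERNEL_SECTOR_SIZE);
}

/* Byte offset into the device of a kernel (512-byte) sector number. */
static unsigned long long sector_to_offset(unsigned long long ksector)
{
    return ksector * KERNEL_SECTOR_SIZE;
}

/* Hardware sector that contains a given kernel sector. */
static unsigned long long kernel_to_hw_sector(unsigned long long ksector,
                                              unsigned int hardsect_size)
{
    return ksector / (hardsect_size / KERNEL_SECTOR_SIZE);
}
```

For instance, a device with 1024 hardware sectors of 4096 bytes each has a capacity of 8192 kernel sectors, and kernel sector 16 lies in hardware sector 2.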