[Pvfs2-developers] dbench and pvfs2_bufmap race

Phil Carns carns at mcs.anl.gov
Thu Jan 17 19:37:52 EST 2008


Somewhere along the line the PVFS trunk code picked up a regression that 
causes dbench to wedge the client machine.  I looked at this a little 
bit today and observed several indirect problems:

1) dbench somehow kills pvfs2-client-core (not sure why yet)

2) pvfs2-client-core gets restarted, but there isn't anything obvious 
logged anywhere to indicate this

3) There are race conditions between in-flight kernel operations that 
use the pvfs2_bufmap interface and pvfs2-client-core exiting and 
restarting.  The bufmap functions currently try to protect themselves from 
this scenario by checking an integer (bufmap_init) at the top of each 
function before proceeding.  This has a few sub-problems:

3a) a few functions, including notably bufmap_get(), were missing the 
bufmap_init check

3b) even the functions that do check bufmap_init were not truly safe, 
because bufmap_init is not locked, and there is nothing to prevent 
bufmap from being finalized while a bufmap function is in progress

3c) the functions that check bufmap_init all return an error code like 
EIO, and the callers don't know that they can retry once the 
pvfs2-client-core has been restarted

3d) this same style of safety check is used for the device file by way 
of the open_access_count integer and is_daemon_in_service() function. 
3a through 3c probably apply to this component as well.

Out of the above list, I just committed some changes to trunk to address 
3a) and 3b).  The existing safety check mechanism was added to the 
functions that were lacking it.  Also, all of the safety checks are now 
protected with a rw semaphore so that the finalize can't pull the rug 
out from under another bufmap function before it returns.  I'm not keen 
on changing locking in the kernel module at this point, so if it breaks 
something else fragile (or if someone knows a slicker way to handle 
this) we can back it out.

At any rate, even without fixing 1), 2), 3c), or 3d), this is enough to 
keep dbench from crashing a client with the current trunk code.

Now it just does this:

    # /mnt/dbench/dbench -c client.txt 10 -t 300
    dbench version 3.04 - Copyright Andrew Tridgell 1999-2004

    Running for 300 seconds with load 'client.txt' and minimum warmup 60 secs
    10 clients started
       10         2     0.00 MB/sec  warmup   1 sec
       10         2     0.00 MB/sec  warmup   2 sec
       10        10     0.00 MB/sec  warmup   3 sec
       10        10     0.00 MB/sec  warmup   4 sec
    write failed on handle 9938
       10        10     0.00 MB/sec  warmup   5 sec
       10        10     0.00 MB/sec  warmup   6 sec
    Child failed with status 1

With this in dmesg:

    pvfs_bufmap_get: not yet initialized.
    pvfs2: please confirm that pvfs2-client daemon is running.
    pvfs2: pvfs2_statfs -- wait timed out; aborting attempt.
    pvfs2: pvfs2_statfs -- wait timed out; aborting attempt.

Things seem OK after this (i.e., the mount point is still responsive).

I think the real dbench problem is that it is doing something that 
causes pvfs2-client-core to crash.  I just wanted to go ahead and 
comment on the secondary problems before I forgot over the weekend :)

-Phil

