[Pvfs2-developers] Copy commands segfault on 2.4 kernels

Bart Taylor batayl at gmail.com
Tue Apr 7 16:40:45 EDT 2009


Hey guys,

I am running into a problem with the system cp command segfaulting on 2.4
kernels. Specifically, I am seeing this on RHEL3 machines running a
patched version of PVFS 2.6. Machines running Linux 2.6 kernels do not
experience this problem.  I believe we mentioned this recently, but I had
hoped it would be fixed by some updates pulled into dcache. That,
apparently, is not the case.

The segfault is extremely consistent; it happens every time a cp is executed
with a PVFS2 file system as the target.  The target file is always created
with a size of zero, so at least part of the command is completing. 'dd'
commands execute normally.

The setup is simple:  1 server node (RHEL4 2.6 kernel) with the default
interactive genconfig output, and 1 client with a 2.4 kernel.  Mount the
file system on the client and copy a file onto it.

Here are the conf file contents:

<Defaults>
        UnexpectedRequests 50
        EventLogging none
        LogStamp datetime
        BMIModules bmi_tcp
        FlowModules flowproto_multiqueue
        PerfUpdateInterval 1000
        ServerJobBMITimeoutSecs 30
        ServerJobFlowTimeoutSecs 30
        ClientJobBMITimeoutSecs 300
        ClientJobFlowTimeoutSecs 300
        ClientRetryLimit 5
        ClientRetryDelayMilliSecs 2000
        TCPBindSpecific yes
</Defaults>

<Aliases>
        Alias node1 tcp://node1:3334
</Aliases>

<Filesystem>
        Name pvfs2-fs
        ID 1227216139
        RootHandle 1048576
        <MetaHandleRanges>
                Range node1 4-2147483650
        </MetaHandleRanges>
        <DataHandleRanges>
                Range node1 2147483651-4294967297
        </DataHandleRanges>
        <StorageHints>
                TroveSyncMeta no
                TroveSyncData no
                CoalescingHighWatermark infinity
                CoalescingLowWatermark 0
                TroveSyncMetaTimerSecs 5
                DBCacheSizeBytes 1073741824
        </StorageHints>
</Filesystem>

And here is the last bit of an strace on a copy command:

[root@node1 root]# strace cp test.file /mnt/pvfs2/
.....
brk(0)                                  = 0x95ce000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=32148976, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb73f4000
close(3)                                = 0
geteuid32()                             = 0
lstat64("/mnt/pvfs2/", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
stat64("/mnt/pvfs2/", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
stat64("test.file", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
stat64("/mnt/pvfs2/test.file", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
open("test.file", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
open("/mnt/pvfs2/test.file", O_WRONLY|O_TRUNC|O_LARGEFILE) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
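
The trace ends after the second fstat64 on the source, before any read or
write is issued. One way to narrow this down might be to replay the same
sequence of calls with a small fixed-size buffer and see whether it still
dies. The sketch below is only an illustration: the function name
replay_copy and the 4096-byte buffer are my own choices, not taken from cp,
and O_CREAT is added so it also works when the target does not exist yet
(the trace shows only O_WRONLY|O_TRUNC).

```python
import os


def replay_copy(src, dst, bufsize=4096):
    """Replay the stat/open/fstat sequence from the strace, then copy
    with a fixed-size buffer (an assumption, not what cp itself does)."""
    os.stat(src)
    src_fd = os.open(src, os.O_RDONLY)
    try:
        src_st = os.fstat(src_fd)
        # O_CREAT is an addition; the trace shows O_WRONLY|O_TRUNC only.
        dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            dst_st = os.fstat(dst_fd)
            copied = 0
            while True:
                chunk = os.read(src_fd, bufsize)
                if not chunk:
                    break
                # Handle short writes by looping until the chunk is written.
                view = memoryview(chunk)
                while view:
                    written = os.write(dst_fd, view)
                    view = view[written:]
                    copied += written
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    # Report the block sizes the file systems advertise, plus bytes copied.
    return {
        "src_blksize": src_st.st_blksize,
        "dst_blksize": dst_st.st_blksize,
        "copied": copied,
    }
```

Calling replay_copy("test.file", "/mnt/pvfs2/test.file") on the 2.4 client
and comparing the reported st_blksize values with those from a 2.6 client
would show whether the two kernels hand different stat results back to user
space. If this copy succeeds where cp fails, that would point at how cp uses
those results rather than at the read/write path.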


Neither the client nor the server logs show anything at the default logging
level; I have not yet turned on additional logging.

Are there any suggestions on what might be causing this? Can I provide any
additional information that will be helpful for debugging?

Bart.