Hey guys,<br><br>I am running into a problem with a system copy command segfaulting on 2.4 kernels. Specifically, I am seeing this show up on RHEL3 machines running a patched version of PVFS 2.6. Machines running Linux 2.6 kernels do not experience this problem. I believe we may have mentioned this recently but hoped it would be fixed by some updates pulled into dcache. That, apparently, is not the case.<br>
<br>The segfault is extremely consistent; it happens every time a cp is executed with a PVFS2 file system as the target. The target file is always created with a size of zero, so at least part of the command is completing. 'dd' commands execute normally. <br>
<br>The setup is simple: 1 server node (RHEL4 2.6 kernel) with the default interactive genconfig output, and 1 client with a 2.4 kernel. Mount the file system, then copy a file onto it. <br><br>Here are the conf file contents:<br>
<br><Defaults><br> UnexpectedRequests 50<br> EventLogging none<br> LogStamp datetime<br> BMIModules bmi_tcp<br> FlowModules flowproto_multiqueue<br> PerfUpdateInterval 1000<br>
ServerJobBMITimeoutSecs 30<br> ServerJobFlowTimeoutSecs 30<br> ClientJobBMITimeoutSecs 300<br> ClientJobFlowTimeoutSecs 300<br> ClientRetryLimit 5<br> ClientRetryDelayMilliSecs 2000<br>
TCPBindSpecific yes<br></Defaults><br><br><Aliases><br> Alias node1 tcp://node1:3334<br></Aliases><br><br><Filesystem><br> Name pvfs2-fs<br> ID 1227216139<br> RootHandle 1048576<br>
<MetaHandleRanges><br> Range node1 4-2147483650<br> </MetaHandleRanges><br> <DataHandleRanges><br> Range node1 2147483651-4294967297<br> </DataHandleRanges><br>
<StorageHints><br> TroveSyncMeta no<br> TroveSyncData no<br> CoalescingHighWatermark infinity<br> CoalescingLowWatermark 0<br> TroveSyncMetaTimerSecs 5<br>
DBCacheSizeBytes 1073741824<br> </StorageHints><br></Filesystem><br><br>And here is the last bit of an strace on a copy command:<br><br>[root@node1 root]# strace cp test.file /mnt/pvfs2/<br>
.....<br>brk(0) = 0x95ce000<br>open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3<br>fstat64(3, {st_mode=S_IFREG|0644, st_size=32148976, ...}) = 0<br>mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb73f4000<br>
close(3) = 0<br>geteuid32() = 0<br>lstat64("/mnt/pvfs2/", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0<br>stat64("/mnt/pvfs2/", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0<br>
stat64("test.file", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0<br>stat64("/mnt/pvfs2/test.file", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0<br>open("test.file", O_RDONLY|O_LARGEFILE) = 3<br>
fstat64(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0<br>open("/mnt/pvfs2/test.file", O_WRONLY|O_TRUNC|O_LARGEFILE) = 4<br>fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0<br>fstat64(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0<br>
--- SIGSEGV (Segmentation fault) @ 0 (0) ---<br>+++ killed by SIGSEGV +++<br><br><br>With the default logging level, nothing shows up in the client or server logs.<br><br>Are there any suggestions on what might be causing this? Can I provide any additional information that would be helpful for debugging?<br>
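In case it helps narrow things down, here is a minimal sketch of a probe that repeats the last calls that succeed in the trace above (open the target with O_WRONLY|O_TRUNC, then fstat it) and reports what the file system returns. The helper name `probe_copy_target` is made up for illustration. Reporting `st_blksize` rests on an assumption: cp is generally understood to size its copy buffer from the fstat result, whereas dd uses its own block size, so an unusual value there could explain why only cp fails. That has not been confirmed here. Note that the probe truncates the file it is pointed at, just as cp does.

```python
import os
import stat
import sys


def probe_copy_target(path):
    """Open an existing file the way the trace shows cp opening its
    target (O_WRONLY|O_TRUNC), fstat it, and return selected fields.

    Hypothetical helper for illustration only. It truncates `path`.
    """
    fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
    try:
        st = os.fstat(fd)
    finally:
        os.close(fd)
    return {
        "is_regular": stat.S_ISREG(st.st_mode),
        "st_size": st.st_size,
        # Assumption: this is the value cp would use to size its buffer.
        "st_blksize": st.st_blksize,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python probe.py /mnt/pvfs2/test.file
    for key, value in sorted(probe_copy_target(sys.argv[1]).items()):
        sys.stdout.write("%s = %s\n" % (key, value))
```

Running it against the zero-length file left on the PVFS2 mount and against a file on a local file system, on both the 2.4 and 2.6 clients, would show whether the reported values differ between the working and failing cases.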
<br>Bart.<br><br>