[Pvfs2-users] kernel crashes in client writes
Bill Wichser
bill at Princeton.EDU
Wed Sep 5 15:01:41 EDT 2007
We've never run a prior version on the Woodcrest cluster. We are
running the same version and kernel on another Xeon cluster without any
problems.
We've run pvfs2-fsck and have found 2 TB of bad files. Yeah, a lot!
But one user's application may have been the cause of all of this. He
is accessing via the kernel interface and had been terminating the job
mid-stream. This has left some "cp" command active, leaving I/O nodes
in a state where 100% of the CPU goes to the pvfs2-client-core daemon.
This eventually causes nodes to crash (the client nodes, not the I/O
nodes) and the cycle continues.
Since the pvfs2-fsck run we have not run this user's code. ROMIO access
still seems fine. We have tried to educate the user to use the pvfs2-cp
command instead, thus bypassing the kernel interface. We will see what
happens soon. In the meantime we are trying to duplicate these failures
on a small subset of nodes to determine whether it is merely this code
and the way that it terminates, or whether the kernel interface is
indeed causing the problems.
Bill
Sam Lang wrote:
>
> Hi Bill,
>
> Did the crashes start happening with PVFS 2.6.3, after things had
> worked with an older version of PVFS? Or did you patch the RHEL kernel?
> In other words, did things work at some point for you?
>
> -sam
>
> On Aug 31, 2007, at 11:35 AM, Bill Wichser wrote:
>
>> Murali,
>>
>> Yes, it isn't good. We still have hopes that an fsck is the fix, as it
>> reports a number of problems when run in non-fix mode. We just
>> haven't been able to obtain the system and the go-ahead from the users
>> who need it for their research, since many use the ROMIO access
>> method and are not experiencing any problems.
>>
>> Nothing special here for the kernel config. We're running a RHEL4 repo.
>>
>> Bill
>>
>> Murali Vilayannur wrote:
>>> Hi Bill,
>>> This is really bad. I wish I had a system to repro your setup.
>>> Is there something special in your kernel .config?
>>> (PREEMPT for example)
>>> What distro is this on btw?
>>> thanks,
>>> Murali
>>> On 8/24/07, Bill Wichser <bill at princeton.edu> wrote:
>>>> We have been experiencing frequent crashes in the PVFS2 kernel module
>>>> when applications use standard system I/O to write to PVFS2 files. We
>>>> are running the Linux 2.6.9-55.0.2 smp kernel, and PVFS2 v2.6.3.
>>>> The general protection fault almost always occurs at
>>>> pvfs2_devreq_writev+351.
>>>>
>>>> In our build, the invalid reference specifically occurs in the
>>>> qhash_del() operation, within the inline qhash_search_and_remove()
>>>> function called by pvfs2_devreq_writev(). See excerpts below:
>>>>
>>>> vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
>>>> static ssize_t pvfs2_devreq_writev(
>>>> struct file *file,
>>>> const struct iovec *iov,
>>>> unsigned long count,
>>>> loff_t * offset)
>>>> {
>>>> .
>>>> .
>>>> .
>>>> /* lookup (and remove) the op based on the tag */
>>>> hash_link = qhash_search_and_remove(htable_ops_in_progress,
>>>> &(tag));
>>>> if (hash_link)
>>>> {
>>>> .
>>>> .
>>>> .
>>>> }
>>>>
>>>> /* qhash_search_and_remove()
>>>> *
>>>> * searches for and removes a link in the hash table
>>>> * that matches the given key
>>>> *
>>>> * returns pointer to link on success, NULL on failure (or item
>>>> * not found). On success, link is removed from hashtable.
>>>> */
>>>> static inline struct qhash_head *qhash_search_and_remove(
>>>> struct qhash_table *table,
>>>> void *key)
>>>> {
>>>> int index = 0;
>>>> struct qhash_head *tmp_link = NULL;
>>>>
>>>> /* find the hash value */
>>>> index = table->hash(key, table->table_size);
>>>>
>>>> /* linear search at index to find match */
>>>> qhash_lock(&table->lock);
>>>> qhash_for_each(tmp_link, &(table->array[index]))
>>>> {
>>>> if (table->compare(key, tmp_link))
>>>> {
>>>> qhash_del(tmp_link);
>>>> qhash_unlock(&table->lock);
>>>> return (tmp_link);
>>>> }
>>>> }
>>>> qhash_unlock(&table->lock);
>>>> return (NULL);
>>>> }
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
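For anyone following the excerpt above, here is a minimal user-space sketch of the search-and-remove pattern, to show why a bad link would surface inside the delete step. It assumes a circular doubly linked list per bucket; the demo_* names, the modulo hash, and the fixed bucket count are hypothetical stand-ins, not the PVFS2 qhash definitions, and the locking is left out because the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the qhash types; NOT the PVFS2 definitions. */
struct demo_link {
    struct demo_link *next, *prev;
};

struct demo_op {
    struct demo_link link;  /* first member, so a link pointer casts to its op */
    unsigned long tag;
};

#define DEMO_BUCKETS 8      /* assumed table size, for illustration only */

struct demo_table {
    struct demo_link bucket[DEMO_BUCKETS];  /* circular list heads */
};

/* Every bucket starts as an empty circular list pointing at itself. */
void demo_table_init(struct demo_table *t)
{
    assert(t != NULL);
    for (int i = 0; i < DEMO_BUCKETS; i++)
        t->bucket[i].next = t->bucket[i].prev = &t->bucket[i];
}

/* Insert an op at the head of the bucket selected by its tag. */
void demo_add(struct demo_table *t, struct demo_op *op)
{
    struct demo_link *head = &t->bucket[op->tag % DEMO_BUCKETS];
    op->link.next = head->next;
    op->link.prev = head;
    head->next->prev = &op->link;
    head->next = &op->link;
}

/* Unlink: dereferences both l->next and l->prev, so a corrupted
 * pointer in either field is first touched here. */
void demo_del(struct demo_link *l)
{
    l->next->prev = l->prev;
    l->prev->next = l->next;
    l->next = l->prev = NULL;
}

/* Linear search of one bucket; removes and returns the match, or NULL. */
struct demo_op *demo_search_and_remove(struct demo_table *t, unsigned long tag)
{
    struct demo_link *head = &t->bucket[tag % DEMO_BUCKETS];
    for (struct demo_link *l = head->next; l != head; l = l->next) {
        struct demo_op *op = (struct demo_op *)l;
        if (op->tag == tag) {
            demo_del(l);
            return op;
        }
    }
    return NULL;
}
```

The unlink writes through both neighbours' pointers, so if either pointer in the matched link has been overwritten, the fault would show up at the delete rather than at the point of corruption. That would be consistent with the trace, though it does not say what did the overwriting.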
>>>>
>>>> We have since run pvfs2-fsck on the file system and have found some
>>>> corruption. So we're not sure if what we're seeing is just a
>>>> second-order effect of the corruption, or is the actual cause of the
>>>> corruption.
>>>>
>>>> So we're passing this along to you to see if you've had any similar
>>>> reports, or can point us in the right direction to help find the
>>>> problem.
>>>>
>>>> The crash file sys and bt info follows. Please let us know if you need
>>>> more information.
>>>>
>>>> Thanks,
>>>> Bill
>>>>
>>>> vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
>>>> crash> sys
>>>> SYSTEM MAP: /boot/System.map-2.6.9-55.0.2.ELsmp
>>>> DEBUG KERNEL: /home/jsbillin/vmlinux-2.6.9-55.ELsmp (2.6.9-55.ELsmp)
>>>> DUMPFILE: /var/crash/172.18.0.85-2007-08-06-07:47/vmcore
>>>> CPUS: 4
>>>> DATE: Fri Aug 17 11:53:17 2007
>>>> UPTIME: 11 days, 04:08:05
>>>> LOAD AVERAGE: 2.49, 2.10, 1.63
>>>> TASKS: 96
>>>> NODENAME: woodhen-085
>>>> RELEASE: 2.6.9-55.0.2.ELsmp
>>>> VERSION: #1 SMP Mon Jun 25 14:12:33 EDT 2007
>>>> MACHINE: x86_64 (2660 Mhz)
>>>> MEMORY: 9 GB
>>>> PANIC: ""
>>>> crash> bt
>>>> PID: 3454 TASK: 10236b63030 CPU: 0 COMMAND: "pvfs2-client-co"
>>>> #0 [10232fbbc60] netpoll_start_netdump at ffffffffa0249366
>>>> #1 [10232fbbc90] die at ffffffff80111c00
>>>> #2 [10232fbbcb0] do_general_protection at ffffffff801124e5
>>>> #3 [10232fbbcf0] error_exit at ffffffff80110d91
>>>> [exception RIP: pvfs2_devreq_writev+351]
>>>> RIP: ffffffffa0226948 RSP: 0000010232fbbda8 RFLAGS: 00010246
>>>> RAX: 0000000000000000 RBX: 40903a138d84f800 RCX: 0000000000000000
>>>> RDX: 40903a138d84f800 RSI: 00000101aeab1bd8 RDI: 0000010232fbbdc0
>>>> RBP: 0000010006bccd40 R8: 0000000000000000 R9: 0000000000000000
>>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>>> R13: 000001020557e600 R14: 000001020557e5f0 R15: 0000010232fbbe88
>>>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>>>> #4 [10232fbbda0] pvfs2_devreq_writev at ffffffffa022693e
>>>> #5 [10232fbbe00] sock_readv_writev at ffffffff802a91f9
>>>> #6 [10232fbbe60] do_readv_writev at ffffffff8017a45f
>>>> #7 [10232fbbf40] sys_writev at ffffffff8017a631
>>>> #8 [10232fbbf80] system_call at ffffffff8011026a
>>>> RIP: 00000035854bfcdb RSP: 0000007fbffff228 RFLAGS: 00010202
>>>> RAX: 0000000000000014 RBX: ffffffff8011026a RCX: 00000035854bf1e9
>>>> RDX: 0000000000000004 RSI: 0000007fbffff120 RDI: 0000000000000005
>>>> RBP: 0000000000000000 R8: 0000000000000001 R9: 0000000000000004
>>>> R10: 0000000000000001 R11: 0000000000000206 R12: 0000000000000005
>>>> R13: 0000007fbffff120 R14: 0000000000000004 R15: 0000000000000000
>>>> ORIG_RAX: 0000000000000014 CS: 0033 SS: 002b
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
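One detail in the register dump above may be worth checking. On x86_64, dereferencing a non-canonical address normally raises a general protection fault rather than a page fault, and RBX and RDX both hold 40903a138d84f800. The sketch below is a small check, assuming 48-bit virtual addressing (bits 63..47 must all be equal); is_canonical_48 is a hypothetical helper name, and whether RBX/RDX is actually the pointer being dereferenced at pvfs2_devreq_writev+351 is an assumption that has not been confirmed from the disassembly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is canonical under 48-bit
 * x86_64 virtual addressing, i.e. bits 63..47 are all 0 or all 1. */
int is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the 17 bits 63..47 */
    return top == 0 || top == 0x1FFFF;
}
```

Fed the values from the dump, this check rejects 40903a138d84f800 and accepts RSP (0000010232fbbda8) and RIP (ffffffffa0226948), which would fit a list pointer having been overwritten with non-pointer data.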
>>>> _______________________________________________
>>>> Pvfs2-users mailing list
>>>> Pvfs2-users at beowulf-underground.org
>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>>>>
>>
>