[Pvfs2-users] problem with pvfs2 over diskless cluster
Murali Vilayannur
murali.vilayannur at gmail.com
Thu Dec 28 13:13:21 EST 2006
Hi Scott,
There is a bug in pvfs2 with the new version of db. I don't know how
others have not hit this issue yet.
My fix is definitely not correct. Sam is probably the best person to
figure out this breakage. If not, I will try to probe tonight.
Can you roll back to an earlier version of db and run configure with
--with-db=<path to db>?
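For reference, both Berkeley DB messages quoted below concern how the caller supplies the DBT that receives the value. Per the Berkeley DB documentation, a handle opened with DB_THREAD requires the data DBT to carry a memory allocation flag such as DB_DBT_USERMEM or DB_DBT_MALLOC, and with DB_DBT_USERMEM a buffer that is too small makes DB->get return DB_BUFFER_SMALL with the size field set to the length needed. Below is a minimal sketch of the usual grow-and-retry pattern. It does not link against libdb: mock_dbt, mock_get, MOCK_BUFFER_SMALL and get_with_retry are hypothetical stand-ins that only mimic that contract, and this is not the PVFS2 code or a proposed patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins that mimic the DB_DBT_USERMEM contract.
 * None of these names belong to the real Berkeley DB API. */
#define MOCK_BUFFER_SMALL (-1)

struct mock_dbt {
    void  *data;  /* caller-owned buffer */
    size_t size;  /* out: bytes stored, or bytes required on failure */
    size_t ulen;  /* in: capacity of the buffer in data */
};

/* Pretend store that holds exactly one record. */
static const char mock_record[] = "handle-info-record-longer-than-8-bytes";

/* Copies the record into out->data if it fits; otherwise reports the
 * required length in out->size and returns MOCK_BUFFER_SMALL. */
static int mock_get(struct mock_dbt *out)
{
    size_t need = sizeof(mock_record);

    out->size = need;
    if (out->ulen < need)
        return MOCK_BUFFER_SMALL;
    memcpy(out->data, mock_record, need);
    return 0;
}

/* Tries once with initial_len bytes, then grows the buffer to the
 * reported size and retries once. Returns 0 on success, in which case
 * the caller owns out->data and frees it exactly once. On failure the
 * buffer is released here and out->data is set to NULL. */
static int get_with_retry(struct mock_dbt *out, size_t initial_len)
{
    int ret;

    assert(initial_len > 0);
    out->data = malloc(initial_len);
    if (out->data == NULL)
        return -2;
    out->ulen = initial_len;
    out->size = 0;

    ret = mock_get(out);
    if (ret == MOCK_BUFFER_SMALL) {
        void *bigger = realloc(out->data, out->size);
        if (bigger == NULL) {
            free(out->data);
            out->data = NULL;
            return -2;
        }
        out->data = bigger;
        out->ulen = out->size;
        ret = mock_get(out);
    }
    if (ret != 0) {
        free(out->data);
        out->data = NULL;
    }
    return ret;
}
```

A caller would declare a struct mock_dbt, call get_with_retry(&d, 8), use d.data on success, and then free(d.data) once. Keeping ownership of the buffer in one place also avoids the double-free class of error, though whether that is what goes wrong in the server here has not been established.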
BTW: It is awesome that you have gotten the MX stuff working!
Everything looks great!
thanks,
Murali
On 12/28/06, Scott Atchley <atchley at myri.com> wrote:
> Hi all,
>
> I am trying to test pvfs2 on some amd64 machines running 2.6.17.11
> with db-4.5.20. I am seeing some of the same error messages as those
> mentioned on this thread earlier. I used Murali's patch and I no
> longer see the infinite loop. I can ping and statfs without problem.
> When I try to cp, however, I get:
>
> % ./sbin/pvfs2-server -d ./etc/fs.conf ./etc/server.conf-shower01
> [D 09:34:56.235123] PVFS2 Server version 2.6.1pre1-2006-12-28-140946
> starting.
> [E 12/28 09:35] TROVE:DBPF:Berkeley DB: DB->get: DB_BUFFER_SMALL:
> User memory too small for return value
> [E 12/28 09:35]
> PVFS2 server got signal 2 (server_status_flag: 262143)
> *** glibc detected *** double free or corruption (!prev):
> 0x0000000000656fe0 ***
> zsh: abort ./sbin/pvfs2-server -d ./etc/fs.conf ./etc/server.conf
>
> The trove error happens at copy time. The double free happens after I
> hit ^c.
>
> If I restart the server and I try to use ls on the directory, I see
> on the server:
>
> % ./sbin/pvfs2-server -d ./etc/fs.conf ./etc/server.conf-shower01
> [D 09:40:29.881305] PVFS2 Server version 2.6.1pre1-2006-12-28-140946
> starting.
> [E 12/28 09:40] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory
> allocation flag on DBT data
> [E 12/28 09:40] TROVE:DBPF:Berkeley DB: keyval_db->get (handle info):
> Invalid argument
> [E 12/28 09:40] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory
> allocation flag on DBT data
> [E 12/28 09:40] TROVE:DBPF:Berkeley DB: keyval_db->get (handle info):
> Invalid argument
> [E 12/28 09:40] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory
> allocation flag on DBT data
> [E 12/28 09:40] TROVE:DBPF:Berkeley DB: keyval_db->get (handle info):
> Invalid argument
>
> and on the client:
>
> % pvfs2-ls -l /mnt/pvfs2
> Failed to get attributes on handle 1048574,1668743878
> Getattr failure: No such file or directory
> drwxrwxrwx 1 atchley softies 4096 2006-12-28 09:34 lost+found
>
> Attempting to validate the directory, I see:
>
> % pvfs2-validate -d /mnt/pvfs2
> pvfs2-validate starting validation at object [/mnt/pvfs2]
> *** glibc detected *** free(): invalid next size (fast):
> 0x00000000005ebc10 ***
> zsh: abort (core dumped) pvfs2-validate -d /mnt/pvfs2
>
> The server sees:
>
> [E 12/28 09:42] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory
> allocation flag on DBT data
> [E 12/28 09:42] TROVE:DBPF:Berkeley DB: keyval_db->get (handle info):
> Invalid argument
>
> and the core from validate shows:
>
> (gdb) bt
> #0 0x00002b12b35d40fd in raise () from /lib/libc.so.6
> #1 0x00002b12b35d582e in abort () from /lib/libc.so.6
> #2 0x00002b12b3608be1 in __fsetlocking () from /lib/libc.so.6
> #3 0x00002b12b360e7ee in malloc_trim () from /lib/libc.so.6
> #4 0x00002b12b360eb36 in free () from /lib/libc.so.6
> #5 0x000000000040d0e9 in validate_pvfs_object
> (fsck_options=0x5b3010, pref=0x7ffff7855ef0,
> creds=0x7ffff7855f00, cur_fs=0x7ffff7856010,
> current_path=0x7ffff7856ad5 "/mnt/pvfs2")
> at src/apps/admin/pvfs2-validate.c:281
> #6 0x000000000040ce13 in main (argc=3, argv=0x7ffff78560f8)
> at src/apps/admin/pvfs2-validate.c:186
>
> What should I try next?
>
> Scott
> _______________________________________________
> Pvfs2-users mailing list
> Pvfs2-users at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>