[Pvfs2-users] I'm in the dark, need someone to shed some light
belcampo
belcampo at zonnet.nl
Wed May 14 07:55:45 EDT 2008
Rob Ross wrote:
> Heh well that's a little different -- that's a read workload. The NFS
> client is reading ahead.
>
> Have a look at this:
>
> http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00074000000000000000
>
> and this:
>
> http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00077000000000000000
>
>
> Also this email and specifically the immutable option; you could set
> this on your files after you are done ripping and encoding:
>
> http://www.beowulf-underground.org/pipermail/pvfs2-developers/2006-September/002688.html
>
>
> You'd probably want to use the pvfs2-xattr utility to set the attribute
> so you don't have to sudo it.
>
>>> Other networked file systems can hide some of this latency by caching
>>> data (either coherently or not) on the client. PVFS does not do this,
>>> so each little operation goes across the wire.
>> Can this be investigated with some network tool, and if so, how?
>>> There's really no advantage to using a parallel file system for the
>>> workload you have described,
>> But should the disadvantage be of this order of magnitude?
>
> Apparently :). You could strace the app to see how big/small the IOs
> are. Some apps have options for block sizes for IO that can be used to
> improve performance. Also, there's no reason to bother with striping
> files in this case, since you're accessing serially. You should set the
> number of datafiles (objects holding data) to 1 on the directory
> you're storing into:
> setfattr -n "user.pvfs2.num_dfiles" -v "1" /mnt/pvfs2/directory
>
>>> unless you're planning on having a lot of systems doing this process
>>> in parallel and want a single place to store the output.
>>> What sort of network do you have in this system? What sort of nodes
>>> are you using for the PVFS servers?
>> All AMD 4000+ systems with 1Gb network cards and a 320GB disk in each of
>> them.
>> Copying from clients to these 3 servers runs at > 100MB/sec, pretty close
>> to what Gb ethernet can do.
>
> In what context do you get that performance? How do tools like pvfs2-cp
> compare in performance?
cp and pvfs2-cp take about the same wall-clock time, although the CPU load
with pvfs2-cp is a lot higher (73% vs. 22%), at least on the client side:
[mythtv@mm01 pvfs]$ time cp "/pvfs2/videos/In the Flesh - Roger Waters.mpg" .
0.00user 23.17system 1:42.64elapsed 22%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+1227minor)pagefaults 0swaps
[mythtv@mm01 pvfs]$ time pvfs2-cp "/pvfs2/videos/In the Flesh - Roger Waters.mpg" test.mpg
2.62user 71.50system 1:41.21elapsed 73%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (12major+3160minor)pagefaults 0swaps
I've found a workaround: piping the file through cat.
time strace -o strace.mplayer cat "/pvfs2/videos/In the Flesh - Roger Waters.mpg" | mplayer -dumpstream -dumpfile /pvfs2/videos/flesh.mpg -
This works very well. As I understand the strace output, cat writes in
4 MB blocks:
write(1, "\0\0\1\272G\21Uk\315\257\1\211\303\370\0\0\1\275\7\354"..., 4194304) = 4194304
Without the cat and the pipe, I get 2048-byte reads instead:
read(3, "\0\0\1\272D\2%StQ\1\211\303\370\0\0\1\275\7\354\201\200"..., 2048) = 2048
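To compare the request sizes in two traces without reading them line by
line, something like the following might help. It is only a sketch:
summarize_io is a made-up helper name, and it assumes the trace was written
with strace -o (as with strace.mplayer above), one completed call per line.
Traces taken with -f, or containing unfinished/resumed calls, would need
more care.

```shell
# Hypothetical helper: count read()/write() calls in a strace log,
# grouped by call name and by the number of bytes actually transferred.
# Output columns: call, bytes returned, number of calls.
summarize_io() {
    awk '
        # Only completed read/write calls with a non-negative return value;
        # failed calls ("= -1 EAGAIN ...") do not end in a bare number.
        /^(read|write)\(/ && / = [0-9]+$/ {
            call = substr($0, 1, index($0, "(") - 1)
            bytes = $NF
            count[call " " bytes]++
        }
        END {
            for (key in count) print key, count[key]
        }
    ' "$1" | sort -k1,1 -k2,2n
}

# Usage (file name taken from the command above):
#   summarize_io strace.mplayer
```

Run on both traces, this should make the difference visible at a glance:
mostly 2048-byte reads in one, 4194304-byte transfers in the other.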
What I have in mind is that more than 1000 nodes can read the same file(s)
concurrently, though not all at exactly the same moment, so striping is
crucial to what I hope to accomplish.
I think (and hope) pNFS would make this possible.
> Rob