[Pvfs2-users] Heavy read workload and "Permission denied" errors

Sam Lang slang at mcs.anl.gov
Tue Mar 18 10:22:39 EST 2008


Hi Tiankai,

I think I've been debugging something similar, but I'm not able to
reproduce the EACCES (Permission denied) error with only a few nodes.
It would be helpful to eliminate a few things to isolate the problem
and to see whether we're both looking at the same bug.

Can you disable the name and attribute caches in the client daemon? To
do that, you should be able to start pvfs2-client with -n 0 -a 0.
With those options, does the problem persist?

Are your nodes x86_64?

What happens if you just use one node as a metadata server instead of  
all 6?
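
To compare runs (caches on vs. off, one metadata server vs. six), it may
help to have a small reader that mimics the access pattern you describe
and records the errno of every failure. The following is only a minimal
sketch under my own assumptions, not your application code: the function
names (read_files, run, summarize) and the 4 MB chunk size are
hypothetical, and it uses plain Python threads on a single node.

```python
import errno
import threading
from collections import Counter


def read_files(paths, failures, lock, chunk_size=4 * 1024 * 1024):
    """Open, fully read, and close each file; record the errno on failure."""
    for path in paths:
        try:
            with open(path, "rb") as f:
                # Read the whole file in chunks, discarding the data.
                while f.read(chunk_size):
                    pass
        except OSError as exc:
            # Keep going so that one bad file does not hide later errors.
            with lock:
                failures.append((path, exc.errno))


def run(file_lists):
    """Start one reader thread per list of paths; return all failures."""
    failures = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=read_files, args=(paths, failures, lock))
        for paths in file_lists
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failures


def summarize(failures):
    """Count failures by symbolic errno name (e.g. EACCES, EAGAIN)."""
    return Counter(errno.errorcode.get(code, str(code)) for _, code in failures)
```

Each inner list passed to run() is the pre-determined sequence of files
for one thread. Comparing summarize() output across configurations would
show whether the EACCES count changes when the caches are disabled.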

Thanks,
-sam

On Mar 17, 2008, at 11:20 AM, Tu, Tiankai wrote:

> I have been testing whether PVFS2 can be used to support large-scale
> read-intensive parallel workload, in particular, post-simulation data
> analysis. Although the preliminary results (on a small cluster) are
> encouraging when everything worked, there have been a few occasions
> where mysterious "Permission Denied" errors occurred and the
> applications halted.
>
> Below are the system hardware/software setup:
>
> - 6 compute nodes each with 8 cores, 16 GB memory, 170 GB free disk
> space managed by xfs.
> - Nodes are interconnected by a 1 GigE cable to a 10 GigE switch
> - Linux kernel: 2.6.22.15-7smp
>
> PVFS setup
>
> - pvfs-2.7.0 installed
> - All the 6 nodes also used as both metadata servers and IO servers
> - The same 6 nodes used to run application codes (as pvfs clients)
> - pvfs kernel module installed on all the nodes
> - pvfs mounted with the local hostname specified as the metadata
> server on each node
> - regular unix open/read/close calls from within the applications
> - Default file striping on all the servers
>
> Application characteristics:
>
> - Parallel Python programs
> - A large number of parallel read threads
> - Mostly independent read traces; occasionally shared accesses to the
> same file but by no more than 2 threads
> - Large, equally-sized files (> 64 MB)
> - Each thread opens a file, reads in the content of the entire file
> (most of the time), extracts data of interest, closes the file and
> moves to the next file
> - The sequence of files to be accessed by each thread pre-determined
> (i.e., no runtime arbitration)
> - Experiments run on configurations with different numbers of nodes and
> different numbers of cores per node; total number of (read) threads
> determined by (number of nodes x cores per node)
>
> Error:
> - An example (6 nodes, 4 threads per node): Cannot open file
> /scratch/mnt/pvfs2/merged_frameset_64MB/p2auto/00000001/trj/frame000000844
> [Errno 13] Permission denied:
> '/scratch/mnt/pvfs2/merged_frameset_64MB/p2auto/00000001/trj/frame000000844'
> - Similar errors encountered in other node/thread configurations
> - The files being reported as inaccessible were all verified to be
> accessible from all the 6 compute/storage nodes
>
>
> Extra information:
> - On the first trial with PVFS, a different error "[Errno 11] Resource
> temporarily unavailable" occurred multiple times along with "[Errno 13]
> Permission denied."
> - PVFS configuration was changed to increase the number of retries from
> 5 to 10 and the delay from 2 to 2.5 sec
> - [Errno 11] did not show up again, but [Errno 13] showed up more often
>
> Thanks for the help.
> Tiankai
