[PVFS-users] pvfs_open, Invalid Argument
Rob Ross
rross at mcs.anl.gov
Tue May 11 14:25:34 EDT 2004
Hi Todd,
You are correct that this is a problem in 1.6.2. The attached patch to
1.6.2 (suggested for everyone) should fix this problem. The fix is also
in CVS and will be in the 1.6.3 release (hopefully soon, if we can work
out the second round of fixes!). We fixed it a slightly different way,
but I think your approach is ok too.
Regards,
Rob
Todd McAnally wrote:
> Sorry if this has already been covered. I did some searching but
> didn't come up with exactly what I am seeing.
>
> The problem is: when I call pvfs_open with O_META and pcount = -1 in
> the meta structure, I get an Invalid Argument (EINVAL) and pvfs_open
> fails.
>
> I want to do this so I can set the stripe size. I believe pcount=-1 is
> supposed to use all nodes without having to explicitly state the count.
>
> It worked the way I expected it to in 1.6.0 but now fails in 1.6.2. I
> found some changes after 1.6.0 that appear to be my problem.
>
> I'd appreciate it if someone would let me know if this is a real issue
> of if maybe it's just too dangerous for me to be poking around in the
> code. :-)
>
> In lib/pvfs_open.c at line 221
>
> ---clip---
> /* otherwise we have a PVFS file... */
> /* Prepare request for file system */
> if (flag & O_META) {
> req.req.open.meta.p_stat.base = meta_p->base;
> niodservers = req.req.open.meta.p_stat.pcount = meta_p->pcount;
> req.req.open.meta.p_stat.ssize = meta_p->ssize;
> }
> else {
> req.req.open.meta.p_stat.base = -1;
> req.req.open.meta.p_stat.pcount = -1;
> req.req.open.meta.p_stat.ssize = -1;
> /*
> * default_alloc is initialized to a default-value which
> * would most likely be set to the total number of I/O servers
> * in the cluster. It is changed after the very first open
> */
> niodservers = pvfs_open_optimization ? default_alloc : 0;
> }
> ---clip---
>
> With O_META set and pcount = -1, niodservers = -1 at the end of this
> code.
> req.req.open.ackdsize gets set negative. The first thing
> send_open_request does is check to see that ackdsize isn't negative.
> If so it fails with EINVAL.
>
> I believe my hack to replace the above code given here is what was
> supposed to happen...
> ---clip---
> /* otherwise we have a PVFS file... */
> /* Prepare request for file system */
> if (flag & O_META) {
> req.req.open.meta.p_stat.base = meta_p->base;
> req.req.open.meta.p_stat.pcount = meta_p->pcount;
> req.req.open.meta.p_stat.ssize = meta_p->ssize;
> }
> else {
> req.req.open.meta.p_stat.base = -1;
> req.req.open.meta.p_stat.pcount = -1;
> req.req.open.meta.p_stat.ssize = -1;
> }
>
> /*
> * default_alloc is initialized to a default-value which
> * would most likely be set to the total number of I/O servers
> * in the cluster. It is changed after the very first open
> */
> if (req.req.open.meta.p_stat.pcount >= 0) {
> niodservers = req.req.open.meta.p_stat.pcount;
> } else {
> niodservers = pvfs_open_optimization ? default_alloc : 0;
> }
> ---clip---
>
> My simple test framework now works. I have yet to test in real code
> beyond my little test harness.
>
> Did I do the right thing or am I way off in left field?
>
> Thanks,
> Todd
-------------- next part --------------
---------------------
PatchSet 654
Date: 2004/01/07 20:27:38
Author: pcarns
Branch: HEAD
Tag: (none)
Log:
fixed flaw in error reporting, spotted by Alexandr Konovalov
<avkon at imm.uran.ru>
Members:
iod/jobs.c:1.44->1.45
Index: pvfs/iod/jobs.c
diff -u pvfs/iod/jobs.c:1.44 pvfs/iod/jobs.c:1.45
--- pvfs/iod/jobs.c:1.44 Wed Nov 12 11:11:57 2003
+++ pvfs/iod/jobs.c Wed Jan 7 15:27:38 2004
@@ -383,7 +383,7 @@
if (!j_p || alist_empty(j_p->al_p)) /* that's the end of this job */ {
LOG("do_job: finished job - removing it\n");
if (j_rem(jobs_p, sock) < 0) {
- PERROR("do_job: j_rem");
+ ERR("do_job: j_rem failure");
}
return(0);
}
---------------------
PatchSet 657
Date: 2004/01/15 13:00:06
Author: rbross
Branch: HEAD
Tag: (none)
Log:
fix bug in pvfs_open.c where pcount of less than zero would cause bad math.
Members:
lib/pvfs_open.c:1.29->1.30
Index: pvfs/lib/pvfs_open.c
diff -u pvfs/lib/pvfs_open.c:1.29 pvfs/lib/pvfs_open.c:1.30
--- pvfs/lib/pvfs_open.c:1.29 Thu Oct 23 14:06:43 2003
+++ pvfs/lib/pvfs_open.c Thu Jan 15 08:00:06 2004
@@ -223,6 +223,7 @@
if (flag & O_META) {
req.req.open.meta.p_stat.base = meta_p->base;
niodservers = req.req.open.meta.p_stat.pcount = meta_p->pcount;
+ if (niodservers < 0) niodservers = 0; /* pcount < 0 input is valid */
req.req.open.meta.p_stat.ssize = meta_p->ssize;
}
else {
---------------------
PatchSet 658
Date: 2004/01/15 14:40:59
Author: rbross
Branch: HEAD
Tag: (none)
Log:
Added Don Porter's 12/24/2003 patch to reduce code in mgr.c by removing do_lstat().
Members:
mgr/mgr.c:1.87->1.88
Index: pvfs/mgr/mgr.c
diff -u pvfs/mgr/mgr.c:1.87 pvfs/mgr/mgr.c:1.88
--- pvfs/mgr/mgr.c:1.87 Mon Nov 24 20:51:43 2003
+++ pvfs/mgr/mgr.c Thu Jan 15 09:40:59 2004
@@ -69,7 +69,7 @@
do_chmod,
do_chown,
do_close,
- do_lstat,
+ do_stat, /*really this is do_lstat*/
do_noop,
do_open,
do_unlink,
@@ -799,107 +799,9 @@
return(do_stat(sock, &fakereq, f_p->f_name, ack_p));
} /* end of do_fstat() */
-int do_lstat(int sock, mreq_p req_p, void *data_p, mack_p ack_p) {
- int ret, i;
- fsinfo_p fs_p;
- ireq iodreq;
-
- /* variables used for calculating correct file size */
- uint64_t real_file_sz, tmp_file_sz, iod_file_sz;
- uint64_t strip_sz, stripe_sz, nr_strips, leftovers;
-
- /* check for reserved name first */
- if (resv_name(data_p) != 0) {
- ack_p->status = -1;
- ack_p->eno = ENOENT;
- return(send_error(-1, errno, sock, ack_p));
- }
-
- if ((fs_p = quick_mount(data_p, req_p->uid, req_p->gid, ack_p)) == NULL)
- {
- return(send_error(-1, errno, sock, ack_p));
- }
-
- /* We dont allow a lstat to follow a symbolic link */
- if ((ret = md_stat(data_p, req_p, &(ack_p->ack.stat.meta), NOFOLLOW_LINK)) != 0) {
- return(send_error(ret, errno, sock, ack_p));
- }
-
- /* if regular file, get size from iods */
- if (S_ISREG(ack_p->ack.stat.meta.u_stat.st_mode)) {
- iodreq.majik_nr = IOD_MAJIK_NR;
- iodreq.release_nr = PVFS_RELEASE_NR;
- iodreq.type = IOD_STAT;
- iodreq.dsize = 0;
- iodreq.req.stat.f_ino = ack_p->ack.stat.meta.u_stat.st_ino;
- if ((ret = send_req(fs_p->iod, fs_p->nr_iods,
- ack_p->ack.stat.meta.p_stat.base, ack_p->ack.stat.meta.p_stat.pcount,
- &iodreq, data_p, NULL)) < 0)
- {
- return(send_error(ret, errno, sock, ack_p));
- }
-
- strip_sz = ack_p->ack.stat.meta.p_stat.ssize;
- stripe_sz = ack_p->ack.stat.meta.p_stat.ssize *
- ack_p->ack.stat.meta.p_stat.pcount;
- real_file_sz = 0;
-
- for (i = 0; i < ack_p->ack.stat.meta.p_stat.pcount; i++) {
- /* i gives us the index into the iods, the ith iod in order */
- iod_file_sz = fs_p->iod[(i + ack_p->ack.stat.meta.p_stat.base) %
- fs_p->nr_iods].ack.ack.stat.fsize;
- if (iod_file_sz == 0) continue;
-
- /* the plan is to calculate the file size based on what
- * this particular iod has. the largest of these sizes
- * is the actual file size, taking into account sparse
- * issues.
- *
- * there are four components to the calculation:
- *
- * soff = should be 0 since it isn't implemented :)
- *
- * nr_strips * stripe_sz = this is a calculation of how many
- * complete stripes are present. note that any partial strip
- * or the last complete strip (if there are no partials)
- * may not be part of a whole stripe (thus the if() below).
- *
- * i * strip_sz = this accounts for full strips that should be
- * present on iods that come before us in the iod ordering
- *
- * leftovers = the last remaining bytes
- */
- nr_strips = iod_file_sz / strip_sz;
- leftovers = iod_file_sz % strip_sz;
-
- if (leftovers == 0) {
- nr_strips--;
- leftovers += strip_sz;
- }
- tmp_file_sz = nr_strips * stripe_sz + i * strip_sz +
- leftovers;
- if (tmp_file_sz > real_file_sz) real_file_sz = tmp_file_sz;
- }
-
- /* we're adding a 64-bit size field in addition to the st_size
- * later i'll remove the addition into st_size probably
- * i'm leaving it for now
- */
- ack_p->ack.stat.meta.u_stat.st_size = real_file_sz;
- ack_p->ack.stat.meta.fsize = real_file_sz;
- }
- else if(S_ISLNK(ack_p->ack.stat.meta.u_stat.st_mode)) {
- /* fill up the size of the link file here */
- ack_p->ack.stat.meta.fsize = ack_p->ack.stat.meta.u_stat.st_size;
- }
-
- if (!ack_p->status) return(bsend(sock, ack_p, sizeof(mack)));
- else send_error(-1, ack_p->eno, sock, ack_p);
-
- if (!errno) errno = ack_p->eno; /* return the old error */
- return(-1);
-} /* end of do_lstat() */
-
+/* do_stat() - also fills in for do_lstat. this is done by checking
+ * the request type; see below.
+ */
int do_stat(int sock, mreq_p req_p, void *data_p, mack_p ack_p) {
int ret, i;
fsinfo_p fs_p;
@@ -922,7 +824,7 @@
}
/* We allow a stat to follow a symbolic link */
- if ((ret = md_stat(data_p, req_p, &(ack_p->ack.stat.meta), FOLLOW_LINK)) != 0) {
+ if ((ret = md_stat(data_p, req_p, &(ack_p->ack.stat.meta), req_p->type == MGR_LSTAT ? NOFOLLOW_LINK :FOLLOW_LINK)) != 0) {
return(send_error(ret, errno, sock, ack_p));
}
@@ -1020,6 +922,11 @@
}
}
#endif
+ }
+ else if(S_ISLNK(ack_p->ack.stat.meta.u_stat.st_mode) && req_p->type == MGR_LSTAT) {
+ /* fill up the size of the link file here */
+ /*is this even necessary??? */
+ ack_p->ack.stat.meta.fsize = ack_p->ack.stat.meta.u_stat.st_size;
}
#ifdef MGR_USE_CACHED_FILE_SIZE
} /* if open or zero... */
---------------------
PatchSet 659
Date: 2004/01/15 14:47:07
Author: rbross
Branch: HEAD
Tag: (none)
Log:
Applied Don Porter's eno patch.
Members:
mgr/mgr.c:1.88->1.89
Index: pvfs/mgr/mgr.c
diff -u pvfs/mgr/mgr.c:1.88 pvfs/mgr/mgr.c:1.89
--- pvfs/mgr/mgr.c:1.88 Thu Jan 15 09:40:59 2004
+++ pvfs/mgr/mgr.c Thu Jan 15 09:47:07 2004
@@ -320,6 +320,7 @@
ack.type = req.type; /* everything else is filled in in reqfn() */
ack.status = 0;
ack.dsize = 0;
+ ack.eno = 0;
LOG2("req: %s on %d\n", reqtext[req.type], sock);
ret = (reqfn[req.type])(sock, &req, buf_p, &ack); /* handle request */
continue;
---------------------
PatchSet 660
Date: 2004/01/15 14:49:50
Author: rbross
Branch: HEAD
Tag: (none)
Log:
Applied Don Porter's correction to keep correct cached file size on truncate, plus permission check.
Members:
mgr/mgr.c:1.89->1.90
Index: pvfs/mgr/mgr.c
diff -u pvfs/mgr/mgr.c:1.89 pvfs/mgr/mgr.c:1.90
--- pvfs/mgr/mgr.c:1.89 Thu Jan 15 09:47:07 2004
+++ pvfs/mgr/mgr.c Thu Jan 15 09:49:50 2004
@@ -539,13 +539,12 @@
/* get metadata, check permission to write to file */
if ((fd = meta_open(data_p, O_RDONLY)) < 0
|| (meta_read(fd, &meta) < 0)
+ || ((meta_access(fd, data_p, req_p->uid, req_p->gid, W_OK)) < 0)
|| (meta_close(fd) < 0))
{
return(send_error(-1, errno, sock, ack_p));
}
- /* for the moment permissions aren't checked... */
-
/* build request for iods, send it */
memset(&iodreq, 0, sizeof(iodreq));
iodreq.majik_nr = IOD_MAJIK_NR;
@@ -562,6 +561,21 @@
{
return(send_error(ret, errno, sock, ack_p));
}
+
+#ifdef FAST_STATS
+ meta.u_stat.st_size = req_p->req.truncate.length;
+ meta.fsize = req_p->req.truncate.length;
+
+ /* write metadata*/
+ if ((fd = meta_open(data_p, O_RDONLY)) < 0
+ || (meta_write(fd, &meta) < 0)
+ || (meta_close(fd) < 0))
+ {
+ return(send_error(-1, errno, sock, ack_p));
+ }
+
+#endif
+
if (sock >= 0) /* real socket number, send the ack */ {
ret = bsend(sock, ack_p, sizeof(mack));
if (ret < 0) return(-1);
---------------------
PatchSet 661
Date: 2004/01/15 14:58:21
Author: pcarns
Branch: HEAD
Tag: (none)
Log:
removed the use of bvrecv() to pull in both the ack and trailing data from
the mgr at the same time, and broke into two seperate brecv() calls. This
gives us the chance to check the ack status before commiting to trying
to grab more data out of the socket
Members:
lib/mgrcomm.c:1.23->1.24
Index: pvfs/lib/mgrcomm.c
diff -u pvfs/lib/mgrcomm.c:1.23 pvfs/lib/mgrcomm.c:1.24
--- pvfs/lib/mgrcomm.c:1.23 Wed Oct 22 07:17:46 2003
+++ pvfs/lib/mgrcomm.c Thu Jan 15 09:58:21 2004
@@ -175,55 +175,22 @@
return(-1);
}
}
- /* Do things differently if it is a open request with ackdsize != 0 */
- if(req_p->type == MGR_OPEN
- && (ackdsize = req_p->req.open.ackdsize) != 0) {
- int bvrecvret = 0;
- if(recv_p == NULL) {
- close(fd);
- if (!newconn) {
- badmgrfd(fd);
- goto make_new_conn; /* try to reopen connection */
- }
- errno = EFAULT;
- PERROR("send_mreq_saddr (receiving ack with bvrecv)");
- return -1;
- }
- /* recv ack and trailing data with a single call */
- vec[0].iov_base = ack_p;
- vec[0].iov_len = sizeof(mack);
- vec[1].iov_base = recv_p;
- vec[1].iov_len = ackdsize;
- bvrecvret = bvrecv(fd, vec, 2);
- if(bvrecvret < (sizeof(mack) + ackdsize)) {
- err = errno;
- PERROR("send_mreq_saddr (receiving ack with bvrecv)");
- close(fd);
- if (!newconn) {
- badmgrfd(fd);
- goto make_new_conn; /* try to reopen connection */
- }
- errno = err;
- return(-1);
- }
- }
- else {
- /* receive ack. Give up if it does not happen in reasonable amount
- * of time.
- */
- if(brecv_timeout(fd, ack_p, sizeof(mack), MGRCOMM_BRECV_TIMEOUT_SECS)
- < (int)sizeof(mack)){
- err = errno;
- PERROR("send_mreq_saddr (receiving ack)");
- close(fd);
- if(!newconn){
- badmgrfd(fd);
- goto make_new_conn; /* try to reopen connection */
- }
- errno = err;
- return(-1);
+ /* receive ack. Give up if it does not happen in reasonable amount
+ * of time.
+ */
+ if(brecv_timeout(fd, ack_p, sizeof(mack), MGRCOMM_BRECV_TIMEOUT_SECS)
+ < (int)sizeof(mack)){
+ err = errno;
+ PERROR("send_mreq_saddr (receiving ack)");
+ close(fd);
+ if(!newconn){
+ badmgrfd(fd);
+ goto make_new_conn; /* try to reopen connection */
}
+ errno = err;
+ return(-1);
}
+
if (ack_p->majik_nr != MGR_MAJIK_NR) /* serious error -- nonsense ack */ {
close(fd);
if (!newconn) badmgrfd(fd);
@@ -250,6 +217,35 @@
errno = ack_p->eno;
return(-1);
}
+
+ /* if ackdsize != 0 on a MGR_OPEN request, then it indicates that we
+ * need to pull in some trailing data as well
+ */
+ if(req_p->type == MGR_OPEN
+ && (ackdsize = req_p->req.open.ackdsize) != 0) {
+ /* make sure that we were given a buffer for trailing data */
+ if(recv_p == NULL) {
+ close(fd);
+ errno = EFAULT;
+ PERROR("send_mreq_saddr (receiving ack with bvrecv)");
+ return -1;
+ }
+ /* recv trailing data, timeout if we don't get it in time */
+ if(brecv_timeout(fd, recv_p, ackdsize, MGRCOMM_BRECV_TIMEOUT_SECS)
+ < ackdsize)
+ {
+ err = errno;
+ PERROR("send_mreq_saddr (receiving trailing data with brecv)");
+ close(fd);
+ if (!newconn) {
+ badmgrfd(fd);
+ goto make_new_conn; /* try to reopen connection */
+ }
+ errno = err;
+ return(-1);
+ }
+ }
+
return(fd);
} /* end of send_mreq_saddr() */
More information about the PVFS-users
mailing list