DCC use of connect() and sendto() on FreeBSD

Vernon Schryver vjs@calcite.rhyolite.com
Thu Sep 23 14:16:55 UTC 2004


(second copy responding to the list this time)


> From: Jamie Clark <jamie@zeroth.org>

> Never mind. I'm just interested in the connect() behaviour. Point
> being that that port and distribution tarball behave the same
> wrt this problem.

that is a valid point.

> > more than 4 months old.
> >
> Same for FreeBSD 4.9-RELEASE :)

One might not expect to need to update kernels as often as spam filters.
When I get around to it, I'll probably go to 5.*.

Besides, you skipped the damning aspects of what you said about that
port/package.


> >One of my pet peeves is the widely held notion that `truss`, `strace`,
> >or other system call tracing is of great value for diagnosing problems.
> >When that's all that's available, that's what you use.  However, a
> >system call trace is not a substitute for running the application in a
> >debugger.
> >  
> >
> Also, running the application in a debugger is not a substitute for
> a system call trace.

That is more mistaken than not, because in many UNIX flavors one can
put breakpoints in/at the system call wrappers and so see the interesting
system calls the application is making, and in context.


> gdb, strace and truss are all just tools. I have no peeves with
> any of them, and I pick whichever shines a light on a problem.

Perhaps people don't send you thousands of irrelevant lines
of system call traces and expect you to figure out what is wrong.
If I had a dollar for every strace/truss report for a problem
that turns out to be a configuration error...


> I had actually been down this path days ago. I found that if
> I change the definition of DCC_UDP_DISCON in the FreeBSD case to:
>
> #define DCC_UDP_DISCON  sizeof(ctxt->conn_su)
>
> Then the disconnects seem to work. Just as my test code did.

Huh!  Did I miss that not-so-minor detail in your previous messages?


I call connect() with an address size of 0 because that seems to be
required on some UNIX systems.  I've forgotten exactly which.

Are you sure you are not using a jail?  The 4.9-RELEASE udp_connect()
has a test of p->p_prison and a use of prison_remote_ip().
That use of prison_remote_ip() looks bogus when the address size is zero,
because it uses the s_addr from the supposedly 0-length address.
What happens if you apply the following temporary/diagnostic patch?:

--- dcclib/clnt_send.c 2004/05/02 01:51:55     1.92
+++ dcclib/clnt_send.c 2004/09/23 13:31:25
@@ -1496,6 +1496,7 @@
 static void
 disconnect(DCC_CLNT_CTXT *ctxt)
 {
+       memset(&ctxt->conn_su.sa, 0, sizeof(ctxt->conn_su.sa));
        ctxt->conn_su.sa.sa_family = AF_UNSPEC;
        connect(ctxt->soc, &ctxt->conn_su.sa, DCC_UDP_DISCON);
 }
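
For trying the same thing outside DCC, here is a minimal sketch of a
standalone test.  It is not DCC code: udp_disconnect_works(), the
loopback peer, and the port number are assumptions made only for
illustration.  It connects a UDP socket, zeros the whole address, asks
connect() to dissolve the association with AF_UNSPEC and a caller-chosen
address length, and then judges the result with getpeername() instead of
the return value of connect(), since some systems are reported to return
an error from that connect() even when the disconnect takes effect.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not DCC code.
 * Returns 1 if the socket has no peer after the disconnect attempt,
 * 0 if it is still connected, and -1 if the setup itself failed. */
int
udp_disconnect_works(socklen_t discon_len)
{
	struct sockaddr_in peer;
	struct sockaddr_storage su;	/* plays the role of conn_su */
	struct sockaddr_storage cur;
	socklen_t cur_len = sizeof(cur);
	int soc, result;

	assert(discon_len <= sizeof(su));

	soc = socket(AF_INET, SOCK_DGRAM, 0);
	if (soc < 0)
		return -1;

	/* Connect to an arbitrary loopback port; no packet is sent. */
	memset(&peer, 0, sizeof(peer));
	peer.sin_family = AF_INET;
	peer.sin_port = htons(6277);
	peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (connect(soc, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
		close(soc);
		return -1;
	}

	/* Zero the whole address so that nothing stale is left in it,
	 * then ask for the association to be dissolved.  The return
	 * value is deliberately ignored; see the check below. */
	memset(&su, 0, sizeof(su));
	su.ss_family = AF_UNSPEC;
	(void)connect(soc, (struct sockaddr *)&su, discon_len);

	/* A disconnected socket has no peer name. */
	if (getpeername(soc, (struct sockaddr *)&cur, &cur_len) == 0)
		result = 0;
	else if (errno == ENOTCONN)
		result = 1;
	else
		result = -1;

	close(soc);
	return result;
}
```

Calling it once with 0 and once with sizeof(struct sockaddr) on the
system in question shows which address length that kernel accepts for
the disconnect.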


> Excuse my use of truss. I find that it concisely shows the socket
> operations that I want to see. How to do that easily in gdb?

try  "b connect" and "b sendto"

or try `grep 'connect(' */*.[ch]` and then set breakpoints at the
connect() system calls you care about in the application, thereby
excluding the uses of connect() in the resolver library and elsewhere.

or look for the error message you were seeing from cdcc in the cdcc source,
and put breakpoints in places suggested by that analysis.

I find "b printf" and similar handy when attacking an unfamiliar
application.

truss/strace is like a decompiler, mental or mechanical.  I've done
things such as patching instructions in running UNIX kernels under the
watchful eyes of armed soldiers in secure installations, but I avoid
that sort of incantation until I'm sure it's required.


> That discovery was the reason for my posting the original question.
> I saw that DCC_UDP_DISCON is being used as a precompiler condition,
> and also as an argument to connect(). I could not determine the
> intent of the distinction between a value of 0 for 'BSD and the
> size_t value for Linux. The size_t value seemed the only correct
> one as a third argument for connect().

yes, but the point of that connect() is to use incorrect arguments.
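
The double duty of such a macro can be sketched as follows.  This is
only an illustration under assumptions: the DEMO_* names and functions
are hypothetical and are not the DCC definitions.  It shows that a
macro defined as 0 still satisfies #ifdef, so the value can serve as
the address length handed to connect() while the mere existence of the
definition switches the feature on; only leaving the macro undefined
switches it off.  A sizeof() value could not be tested with #if in any
case, because the preprocessor does not evaluate sizeof.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical stand-ins for per-platform settings. */
#define DEMO_DISCON_ZERO	0
#define DEMO_DISCON_SIZED	sizeof(struct sockaddr)
/* DEMO_DISCON_ABSENT is deliberately left undefined. */

/* A value of 0 is still a definition, so this returns 1. */
int
demo_zero_enables(void)
{
#ifdef DEMO_DISCON_ZERO
	return 1;
#else
	return 0;
#endif
}

/* Only a missing definition turns the feature off; this returns 0. */
int
demo_absent_enables(void)
{
#ifdef DEMO_DISCON_ABSENT
	return 1;
#else
	return 0;
#endif
}

/* The same macro doubles as the length argument for connect(). */
size_t
demo_zero_len(void)
{
	return DEMO_DISCON_ZERO;
}

size_t
demo_sized_len(void)
{
	size_t len = DEMO_DISCON_SIZED;

	assert(len > 0);
	return len;
}
```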


> I guessed that perhaps the 0 value was meant to disable the
> use of connected sockets (didn't notice the default of the case)
>
> Hence my original post and question. I was trying to figure out
> what was intended. That question again:
>
> "My question is: is this the precompiler behaviour that was originally
> intended, or have I broken something?"

My original answer stands.  Your proposed change turns off the use
of connected sockets for all systems while leaving a lot of useless
nasty cruft in the DCC source.  In your situation, if I had felt
that DCC_UDP_DISCON was a bad idea, then I would have suggested
removing it, perhaps by offering a patch based on the result of
`unifdef -UDCC_UDP_DISCON`
Or move "FreeBSD" from that case statement down to the AIX|OpenUNIX case.

But I would not have proposed such changes because they would fail my
mental test of "Does it work in zillions of other installations?  If
so there must be something odd about what I'm doing and so such a
change should not be made without really understanding what's happening."
If I made it anyway just to get things going, I'd not talk about it in
public.


I hope that such a major kernel change as what you are implicitly
suggesting has not been made to a moribund system like FreeBSD 4.*.
It's possible that connect(fd,asdf,0) has been broken for disconnecting
a UDP socket between 4.9 and 4.10, but you'd hope not.

I'm not eager to change that connect() even just for FreeBSD to use a
non-zero address size, because of the hassles of testing the change
on old releases.


Vernon Schryver    vjs@rhyolite.com
