I've been trying to debug a problem in rpctrace which causes rpctrace to crash when I use it to wrap /hurd/ext2fs.
The bug is triggered by a memory_object_lock_request / memory_object_lock_completed sequence. Specifically, ext2fs sends a lock request to the kernel with a send-once reply_to port. Once the lock is complete, the kernel sends a memory_object_lock_completed message to the send-once right, including a send right to the memory control port (for identification purposes). That's where the trouble starts in rpctrace.
rpctrace is designed to trace multiple tasks simultaneously, and to identify which task is doing an RPC, it allocates separate ports for each task (even if they wrap the same port). So, if you pass out a send right to three tasks, three different receive rights will be allocated in rpctrace. Which receive right a message comes in on indicates which task is sending the message.
For this to work right, we need to identify which task is on the receiving end of a send right. Not which task the send right came from, mind you, but the ultimate destination. In the previous example, when we're transferring a copy of the original send right, we need to pick which of the three new send rights should be transferred, based on the ultimate destination of the message, to ensure that the right task gets the right version of the port. There's also a fourth case: we're transferring to a task that we're not tracing.
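To make the per-task wrapper idea concrete, here is a toy model in plain C. It is only a sketch, not rpctrace's actual code: ports are plain integers, and every name in it (wrapped_port, wrap_port, sender_of, right_for_destination, UNTRACED) is made up for illustration. What should happen in the untraced case is also my assumption; here the real right is simply passed through.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 3      /* traced tasks in this toy model */
#define UNTRACED  (-1)   /* destination is a task we are not tracing */

/* One wrapped port: the real port plus one wrapper per traced task.
   Integers stand in for Mach port rights (hypothetical model). */
struct wrapped_port {
    int real_port;            /* stand-in for the original send right */
    int wrapper[MAX_TASKS];   /* stand-ins for the per-task receive rights */
};

/* Give every traced task its own distinct wrapper id. */
static void wrap_port(struct wrapped_port *wp, int real_port, int *next_id)
{
    wp->real_port = real_port;
    for (int t = 0; t < MAX_TASKS; t++)
        wp->wrapper[t] = (*next_id)++;
}

/* The wrapper a message arrives on tells us which task sent it.
   Returns UNTRACED if the id is not one of our wrappers. */
static int sender_of(const struct wrapped_port *wp, int arrived_on)
{
    for (int t = 0; t < MAX_TASKS; t++)
        if (wp->wrapper[t] == arrived_on)
            return t;
    return UNTRACED;
}

/* Pick the right to transfer, based on the ultimate destination. */
static int right_for_destination(const struct wrapped_port *wp, int dest_task)
{
    if (dest_task == UNTRACED)
        return wp->real_port;   /* assumption: untraced tasks get the real right */
    assert(dest_task >= 0 && dest_task < MAX_TASKS);
    return wp->wrapper[dest_task];
}
```

The point of the model is that right_for_destination() needs dest_task as an input, so everything hinges on knowing the destination.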
All of this complexity is already built into rpctrace. It plays games like looking at a task's port space, extracting a send right from each remote receive right, and checking to see if it matches a local send right, in order to determine that the local send right's final destination is the remote receive right on the task in question. See discover_receive_right().
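Roughly, the matching trick looks like the following toy model. This is a sketch of the idea only, not the code in discover_receive_right(): a task's port space is a plain array, "extracting a send right" is modelled as reading the identity of the underlying port, and all the names here (port_entry, extract_send, find_receive_right) are hypothetical. In real Mach this would presumably go through calls such as mach_port_names and mach_port_extract_right.

```c
#include <assert.h>
#include <stddef.h>

enum right_type { RIGHT_NONE, RIGHT_SEND, RIGHT_RECEIVE };

/* One entry in a task's port name space (toy model). */
struct port_entry {
    int name;               /* task-local port name */
    enum right_type type;   /* kind of right held under that name */
    int object;             /* identity of the underlying port */
};

/* Model of extracting a send right from a remote receive right.
   Returns the port identity, or -1 if the entry is not a receive right. */
static int extract_send(const struct port_entry *e)
{
    return e->type == RIGHT_RECEIVE ? e->object : -1;
}

/* Scan the task's port space for the receive right that local_send
   leads to. Returns its name in the remote task, or -1 if the task
   does not hold the receive right. */
static int find_receive_right(const struct port_entry *space, size_t n,
                              int local_send)
{
    assert(space != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int candidate = extract_send(&space[i]);
        if (candidate != -1 && candidate == local_send)
            return space[i].name;
    }
    return -1;
}
```

Note that the scan only works if we already know which task's port space to look in, or are willing to try every traced task in turn.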
In my case, problems arise because a send-once right is used to return the lock completed message, and there's no way to know which task the send-once right ultimately goes to. There's a bad pointer dereference involved, but even once that's fixed, how do you know which send right to transfer? It's important to get it right, since the send right is used by the memory manager to identify different clients.
I've patched it up by assuming that the task sending the send-once right is the ultimate destination, which works in my case, but obviously it isn't right in general.
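In outline, the workaround amounts to something like the sketch below. It is not the actual patch: the types and names (reply_right, destination, guess_destination, UNKNOWN_TASK) are invented for illustration, and tasks are plain integers. The guessed flag is my addition, to make it explicit when the answer is only the fallback.

```c
#include <assert.h>
#include <stdbool.h>

#define UNKNOWN_TASK (-1)   /* lookup could not name a destination */

enum reply_right { REPLY_SEND, REPLY_SEND_ONCE };

struct destination {
    int task;       /* task assumed to hold the receive right */
    bool guessed;   /* true when this is the fallback, not a real lookup */
};

/* discovered_task is the result of a port-space lookup (UNKNOWN_TASK if
   it found nothing); sender_task is the task that sent the right. */
static struct destination
guess_destination(enum reply_right kind, int discovered_task, int sender_task)
{
    struct destination d;

    assert(sender_task >= 0);
    if (kind == REPLY_SEND && discovered_task != UNKNOWN_TASK) {
        d.task = discovered_task;   /* normal case: destination is known */
        d.guessed = false;
    } else {
        /* Send-once right, or lookup failed: assume the sender is the
           destination. This is the heuristic, and it can be wrong. */
        d.task = sender_task;
        d.guessed = true;
    }
    return d;
}
```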
The more I think about it, the more I'm thinking that it's a design flaw in rpctrace. We need to identify ultimate destinations, but can't do that reliably.
I read on the website's hurd/debugging/rpctrace page that somebody (zhenga) had come up with a new version of rpctrace. Do we have a copy of it around somewhere?
I could submit the patches that I've got, but they're not right, and I don't see any way to make them right. I'm thinking now that the way to fix it is to redesign rpctrace so that each wrapped task gets a separate rpctrace task wrapping it. That way, we should be able to determine which task makes which RPC without the problems I've described above.
I'm also thinking that I don't want to undertake rewriting rpctrace right now. I was just trying to fix it so that I could understand what ext2fs was doing.