BASH: Fifo troubles – seems selective about input

I’m having a problem with a Bash service on Debian 7 that I’ve been working on for quite a while: it has suddenly started having trouble with its fifo, or so it seems. It is based on the classic named-pipe example and worked fine for months, but today it started misbehaving. Whenever something like this happens, the cause usually turns out to be completely different from what I first concluded, so I will present what I have and maybe somebody can point out the obvious bit I’m not seeing.

As I said, my code for reading / writing from a named pipe is kinda standard. I made a boiled down version (150ish lines) that I thought I’d present but, of course, it worked fine and I have no idea why. So here is the super boiled down version for reference:

#--------------------------------Writer Script--------------------------------------#
#!/bin/bash

fifoIn=".../path/fifoIn"

#Read user input
IFS='' #Changed IFS so that spaces aren't trimmed from input
# Stop at end of input (Ctrl-D) instead of looping forever on a failed read
while read -e line; do
    printf "%b\n" "$line" >&4
done 4>"$fifoIn"

exit 0

#--------------------------------Reader Script--------------------------------------#
#!/bin/bash

fifoIn=".../path/fifoIn"
LogFile=".../path/srvc.log"
[ -d ".../path" ] || mkdir -p ".../path"
[ -e "$fifoIn" ] || mkfifo "$fifoIn"

printf "%b\n" "Flushing input pipe" >> "$LogFile"
dd if="$fifoIn" iflag=nonblock of=/dev/null >/dev/null 2>&1

while true; do
    if read -t 0.1 -a str; then
        printf "\n%s\n" "<${str[*]}>"
        case "${str[0]}" in
            "foo")
                printf '%b\n' "You said foo..."
                ;;
            "bar")
                printf '%b\n' "You said bar..."
                ;;
            "")
                ;;
            *)
                printf "%b\n" "${str[*]}:"
                printf "%b\n" "Uhhuh..."
                ;;
        esac
    fi
done <"$fifoIn" >> "$LogFile" 2>&1 3>"$fifoIn"

So you take ‘reader script’ and run it as a daemon, then talk to it by echoing or printfing or using the writer script to send messages to the named pipe, fifoIn. This has worked great from the get go but today it got weird.
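
For anyone who wants to try that round trip in isolation, here is a minimal sketch. It assumes a throwaway fifo created under mktemp -d rather than the real path, and a one-shot background reader standing in for the daemon; the names tmpdir, out and received are made up for the illustration.

```shell
#!/bin/bash
# Hypothetical stand-alone round trip: one reader, one writer, one message.

tmpdir=$(mktemp -d)
fifo="$tmpdir/fifoIn"
out="$tmpdir/received"
mkfifo "$fifo"

# Stand-in for the daemon: read one line from the fifo and record it
( read -r -a str <"$fifo"; printf '%s\n' "<${str[*]}>" >"$out" ) &
reader_pid=$!

# Writer side: this open blocks until some process has the fifo open for reading
printf '%b\n' "foo bar" >>"$fifo"

wait "$reader_pid"
received=$(cat "$out")
printf '%s\n' "$received"

rm -rf "$tmpdir"
```

With exactly one reader and one writer the message arrives every time, which is the baseline the rest of this post is measured against.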

For some reason, it started getting choosy about who could write to the pipe (or at least that is how it looked). I didn’t see any errors, but I would send text to the pipe and nothing would happen on the service side. I have cron jobs set up to write to the pipe, and those would go through with no problem, while echoing from a terminal got nothing through: no errors, no permission-denied messages. The cron jobs run as the same user as my terminal anyway, so I don’t think this is a permissions thing.

Every time I deleted the fifo and restarted the service, I could get a few terminal-entered messages through. Then usually, but not always, the pipe would seem to block or otherwise stop working after a cron-originated message was sent to the service. From that point I could no longer send messages through the pipe from the terminal, but the cron-originated messages would continue to go through just fine!

I did some googling and came across the strace command. I tried something like strace printf '%b\n' "foo" >> .../path/fifoIn and got a whole bunch of diagnostic system call output that I don’t really understand, but it looks like it all worked: there was nothing like “Hey! Right here! Something broke right here!!”, and it ended with:

...
write(1, "foo\n", 4)
close(1)
...

which I’m guessing is a good thing. Now the funny thing, the message went through and the daemon read it as expected! I removed the strace from that exact line and again, no dice.

So all you folks who know way more about io operations and system calls than I do, what happens differently between when you have strace as a preface and when you don’t? What can generally gum up a pipe without its having been closed for reading? Any other leads you may pick up on because I’m at a loss.
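
One thing that can produce these symptoms without any error on the writer side is a second process reading from the same fifo. The sketch below is only an illustration under stated assumptions: a throwaway fifo, head -n 1 standing in for whatever stray process has the fifo as its stdin, and fd 3 opened read-write so that nothing blocks in the demo (Linux behaviour). It is not a diagnosis of the service above.

```shell
#!/bin/bash
# Demo: a second reader on the same fifo can consume a message before the
# intended reader sees it. Everything here is a throwaway stand-in.

tmpdir=$(mktemp -d)
fifo="$tmpdir/fifoIn"
mkfifo "$fifo"

# The "service" holds the fifo open read-write on fd 3
exec 3<>"$fifo"

# A stray process whose stdin is the same fifo
head -n 1 <"$fifo" >"$tmpdir/stolen" &
stray_pid=$!

# The writer sees no error: the open and the write both succeed
printf '%b\n' "foo" >>"$fifo"
write_status=$?

wait "$stray_pid"
stolen=$(cat "$tmpdir/stolen")

# The service now finds nothing left to read and times out
if read -r -t 0.5 line <&3; then
    service_got=$line
else
    service_got=""
fi

printf 'writer exit status: %s\n' "$write_status"
printf 'stray reader got:   <%s>\n' "$stolen"
printf 'service got:        <%s>\n' "$service_got"

exec 3>&-
rm -rf "$tmpdir"
```

The writer exits with status 0, the stray reader ends up with the line, and the service's read times out with nothing, which matches the "no errors, but nothing arrives" pattern.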

UPDATE

@Gilles, I think you’re on to something in suggesting other processes trying to read that same pipe and causing problems… I wrote a new function that calls some instances of mutt that seem to have some association with fifoIn for some reason. I’m not super sure how to read the output of lsof, but it reads this after I execute that function (and consequently gum up my pipe):

COMMAND     PID   TID        USER   FD      TYPE DEVICE SIZE/OFF     NODE NAME
mutt      13874           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      13874           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      13897           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      13897           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      13932           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      13932           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      13971           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      13971           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      14012           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      14012           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      14051           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      14051           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      14096           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      14096           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      14124           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
mutt      14124           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
srvc      14298           uname    0r     FIFO   8,17      0t0   393222 .../path/fifoIn
srvc      14298           uname    3w     FIFO   8,17      0t0   393222 .../path/fifoIn
lsof      15587           uname    1w     FIFO    0,8      0t0   176516 pipe
lsof      15587           uname    5w     FIFO    0,8      0t0   176524 pipe
lsof      15587           uname    6r     FIFO    0,8      0t0   176525 pipe
grep      15588           uname    0r     FIFO    0,8      0t0   176516 pipe
lsof      15589           uname    4r     FIFO    0,8      0t0   176524 pipe
lsof      15589           uname    7w     FIFO    0,8      0t0   176525 pipe
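
On reading that output: in the FD column the number is the file descriptor and the letter is the access mode, so 0r means fd 0 is open for reading and 3w means fd 3 is open for writing. A rough way to get the same information without lsof is sketched below. It assumes Linux with /proc mounted; holders is a hypothetical helper name, and sleep stands in for a child that inherits the descriptor.

```shell
#!/bin/bash
# Sketch: list every /proc/PID/fd/N entry that refers to a given fifo.

holders() {
    local target=$1 entry
    for entry in /proc/[0-9]*/fd/*; do
        # -ef is true when both paths resolve to the same device and inode
        if [ "$entry" -ef "$target" ]; then
            printf '%s\n' "$entry"
        fi
    done
}

tmpdir=$(mktemp -d)
fifo="$tmpdir/fifoIn"
mkfifo "$fifo"

exec 3<>"$fifo"      # this shell holds the fifo on fd 3
sleep 30 &           # the child inherits fd 3 without asking for it
child_pid=$!

found=$(holders "$fifo")
printf '%s\n' "$found"

kill "$child_pid"
wait "$child_pid" 2>/dev/null
exec 3>&-
rm -rf "$tmpdir"
```

The listing includes both the shell and the sleep child on fd 3, the same pattern as the mutt and srvc rows above: the children never opened the fifo themselves, they just kept what the parent already had open.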

I’m guessing I miswrote the mutt calls (which wind up executed in subshells), and they are latching onto the inherited FDs because of whatever I did wrong with the command. I’d say that’s the answer and I’ll take it from there! If you post an ‘answer’ I’d be happy to select it!
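
A minimal sketch of one way to stop a child from latching onto those descriptors, assuming it is launched from inside the reader loop and so starts with fd 0 and fd 3 on the fifo. The helper name run_detached is hypothetical, readlink stands in for the real child, and the demo relies on Linux /proc to show what each child actually received.

```shell
#!/bin/bash
# Sketch: keep a child started inside the reader loop from inheriting the fifo.

tmpdir=$(mktemp -d)
fifo="$tmpdir/fifoIn"
mkfifo "$fifo"

# Hypothetical helper: run a command with stdin from /dev/null and fd 3 closed
run_detached() {
    "$@" </dev/null 3>&-
}

# Same descriptor numbers as the reader loop: fd 0 and fd 3 both on the fifo.
# fd 3 is opened first, read-write, so that neither open blocks in this demo.
{
    inherited0=$(readlink /proc/self/fd/0)
    inherited3=$(readlink /proc/self/fd/3)
    detached0=$(run_detached readlink /proc/self/fd/0)
    detached3=$(run_detached readlink /proc/self/fd/3 2>/dev/null)
} 3<>"$fifo" <"$fifo"

printf 'plain child:    fd0=%s fd3=%s\n' "$inherited0" "$inherited3"
printf 'detached child: fd0=%s fd3=%s\n' "$detached0" "${detached3:-closed}"

rm -rf "$tmpdir"
```

The plain child reports the fifo on both descriptors, while the detached one reports /dev/null on fd 0 and nothing on fd 3. Wrapping the mutt calls the same way would be the analogous change, on the assumption that they do not need anything from the service's stdin.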

