I’ve come across various articles and SO questions, and I am still confused about something that I use on a daily basis but never realized how confusing it can be. I am experimenting with (named) pipes in Linux.
1st try was simple: figure out how pipe buffers work:
#1
mkfifo /tmp/mypipe
#2
echo "Hello World" >/tmp/mypipe
ctrl+c
#3
cat /tmp/mypipe
Observation:
When I killed echo before cat read the data, nothing was written to the pipe (cat keeps running, but nothing is read from the pipe). I was assuming that when you type producer >named_pipe and then exit the producer, the part of the data that fits the pipe buffer size will be written to named_pipe and will remain there until it is read by the consumer (now I know that this is not how it works). So what I did next was:
2nd try was to connect a consumer to the other end of the pipe:
#1
mkfifo /tmp/mypipe
#2
echo "Hello World" >/tmp/mypipe
#3
cat /tmp/mypipe
Observation:
The cat command displays the "Hello World" message and both processes end. The interesting discovery here was that during step #2, ps -elf does not display the echo command. It seems that echo is waiting until somebody reads from the pipe, and this explains why nothing was written to the pipe in my first attempt.
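The 1st and 2nd tries can be replayed non-interactively. The following is a minimal sketch, assuming a Linux shell with GNU coreutils `timeout` available; the scratch directory, the variable names and the one-second pauses are illustrative choices, not part of the original commands. `kill` stands in for ctrl+c.

```shell
#!/bin/sh
# Scratch FIFO (illustrative path, not /tmp/mypipe).
dir=$(mktemp -d)
fifo="$dir/mypipe"
mkfifo "$fifo"

# 1st try: start a writer, then kill it before any reader shows up.
( echo "Hello World" > "$fifo" ) &
writer=$!
sleep 1
kill "$writer"
wait "$writer" 2>/dev/null
# No data was left behind: cat just waits, until timeout stops it
# (timeout reports that with exit status 124).
first=$(timeout 2 cat "$fifo")
first_status=$?

# 2nd try: leave the writer alone and attach a reader.
( echo "Hello World" > "$fifo" ) &
writer=$!
sleep 1
# The writer still exists because opening the FIFO has not completed yet.
if kill -0 "$writer" 2>/dev/null; then state="blocked"; else state="finished"; fi
second=$(cat "$fifo")
wait "$writer"

echo "1st try: read '$first', exit status $first_status"
echo "2nd try: writer was $state, read '$second'"
rm -rf "$dir"
```

In this sketch the writer is stuck while the shell opens the redirection, which happens before echo itself runs; that may be why ps shows no echo during step #2.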
3rd try was to pipe a command that runs “forever” and constantly writes to the pipe, and see what happens:
#1
mkfifo /tmp/mypipe
#2
yes >/tmp/mypipe
#3
cat /tmp/mypipe
Observation:
This worked as expected, and cat printed out what yes forwarded to the pipe. However, I then tried to replace cat with tail -f. When I did this, tail did not print anything until the yes command was killed.
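A possible reading of the tail result, offered as an assumption rather than a confirmed answer: tail has to reach the end of its input before it can know which lines are the last ones, so with a never-ending writer it has nothing to print. The sketch below uses a bounded writer (seq instead of yes) so that it terminates; the scratch path and variable names are illustrative.

```shell
#!/bin/sh
# Scratch FIFO (illustrative path).
dir=$(mktemp -d)
fifo="$dir/mypipe"
mkfifo "$fifo"

# Bounded writer: 20 lines, then it closes the FIFO (end-of-file for the reader).
seq 1 20 > "$fifo" &

# tail only returns once the writer has closed its end,
# and then prints just the final lines.
last=$(tail -n 2 "$fifo")
wait

echo "$last"
rm -rf "$dir"
```

With yes as the writer there is no end-of-file until yes is killed, which would match the observation above.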
4th try is the big mystery:
# 1#
mkfifo /tmp/mypipe
# 2#
for i in $(seq 1 10000); do echo -n $i"|"> /tmp/mypipe; done
# 3#
for i in $(seq 1 10); do echo "${i}# Read:"; cat /tmp/mypipe && echo ""; done
After this, the 3# command starts printing something like this:
1# Read:
1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31|32|33|34|35|36|37|38|39|40|41|42|43|44|45|46|47|48|49|50|51|52|53|54|55|56|57|58|59|60|61|62|63|64|65|66|67|68|69|70|71|72|73|74|75|76|77|78|79|80|81|82|83|84|85|86|87|88|89|90|91|92|93|94|95|96|97|98|99|100|101|102|103|104|105|106|107|108|
2# Read:
109|
3# Read:
110|
4# Read:
111|
5# Read:
112|
6# Read:
113|114|115|
7# Read:
116|
8# Read:
117|
9# Read:
118|119|120|121|122|123|124|125|126|127|128|129|130|131|132|133|134|135|136|137|138|139|140|141|142|143|144|145|146|147|148|149|150|151|152|153|154|155|156|157|158|159|160|161|162|163|164|165|166|167|168|169|170|171|172|173|174|175|176|177|178|179|180|181|182|183|184|185|186|187|188|189|190|191|192|193|194|195|196|197|198|199|200|201|202|203|204|205|206|207|208|209|210|211|212|213|214|215|216|217|218|219|220|221|222|223|224|225|226|227|228|229|230|231|232|233|234|235|236|237|238|239|240|241|242|243|244|245|246|247|248|249|250|251|252|253|254|255|256|257|258|259|260|261|262|263|264|265|266|267|268|269|270|271|272|273|274|275|276|277|278|279|280|281|282|283|284|285|286|287|288|289|290|291|292|293|294|295|
10# Read:
296|297|298|299|300|301|302|303|304|305|306|307|308|309|310|311|312|313|314|315|316|317|318|319|320|321|322|323|324|325|326|327|328|329|
Questions:
1st and 2nd try:
- Are named pipes equivalent to the classic | pipes as they are known e.g. from bash in this particular case?
- Does the producer always wait for the consumer? If yes, then what is the purpose of pipe buffers? Is this behavior known as blocking communication?
- How does Linux know when the consumer is connected to the pipe, and thus when the communication can happen? I’ve tried lsof named_pipe but it gives me nothing; where is this information stored? I have also tried the following, and the result was that cat cannot read from the pipe:
#1
mkfifo /tmp/mypipe
#2
echo 1 >/tmp/mypipe
#3
rm /tmp/mypipe
#4
mkfifo /tmp/mypipe
#5
cat /tmp/mypipe
- Is typing producer >/tmp/mypipe the equivalent of typing command | ? I mean the situation when somebody wants to pipe one command to another but forgets to type the other command after the pipe (ps in this case also did not show the first command).
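The rm experiment from the list above can be scripted as follows. This is a minimal sketch, assuming GNU coreutils `timeout`; the scratch directory and variable names are illustrative. The comments describe my understanding of what happens (rm removes only the name, and the second mkfifo creates a different FIFO), which the sketch supports but does not prove.

```shell
#!/bin/sh
# Scratch FIFO (illustrative path).
dir=$(mktemp -d)
fifo="$dir/mypipe"

mkfifo "$fifo"                  # step 1
( echo 1 > "$fifo" ) &          # step 2: the writer waits on the first FIFO
writer=$!
sleep 1
rm "$fifo"                      # step 3: only the name is removed
mkfifo "$fifo"                  # step 4: a new FIFO under the same name

# step 5: the reader opens the new FIFO, which has no writer,
# so it waits until timeout stops it (exit status 124).
out=$(timeout 2 cat "$fifo")
status=$?

# The original writer is still waiting on the old, now nameless FIFO.
if kill -0 "$writer" 2>/dev/null; then writer_state="still blocked"; else writer_state="gone"; fi

kill "$writer" 2>/dev/null
wait "$writer" 2>/dev/null
echo "reader got '$out' (exit status $status); writer is $writer_state"
rm -rf "$dir"
```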
3rd try:
- What is the difference between cat and tail -f in this particular case?
4th try:
- What is going on here? Why are the chunks of read data not all the same size? I was expecting output such as:
1# Read:
1|
2# Read:
2|
3# Read:
3|
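To check whether one read can return the data of several separate writes, the sketch below keeps the FIFO open on a spare file descriptor while five short-lived writers run one after another. It assumes Linux behaviour, where opening a FIFO read-write (`3<>`) does not block, and a `head` that supports `-c`; file descriptor 3, the scratch path and the variable names are illustrative.

```shell
#!/bin/sh
# Scratch FIFO (illustrative path).
dir=$(mktemp -d)
fifo="$dir/mypipe"
mkfifo "$fifo"

# Hold the FIFO open for reading and writing (assumption: Linux semantics).
exec 3<>"$fifo"

# Five separate open/write/close cycles, as in the loop from the 4th try.
# None of them waits, because a reader (fd 3) already exists.
for i in 1 2 3 4 5; do
    printf '%s|' "$i" > "$fifo"
done

# All five writes are sitting in the pipe buffer; read them in one go.
chunk=$(head -c 10 <&3)
exec 3<&-

echo "$chunk"
rm -rf "$dir"
```

In the original loops nothing holds the FIFO open permanently, so how many writes a single cat collects before it sees end-of-file probably depends on timing between the two loops; that part is an interpretation, not something this sketch verifies.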
PS:
I have also tried starting the commands in a different order (reading first and writing after), but the result was the same.
PPS:
I hope this is clear but:
Producer = process that writes to pipe.
Consumer = process that reads from pipe.
Is it possible to explain this to a guy who has mostly scripting knowledge with a bit of C? Thank you very much.
EDIT in reply to: Joe Sewell
1. OK, clear.
2. I understand that both run in parallel, or in other words, the following two are not the same:
find | less
vs
find > /tmp/file && less /tmp/file
My further observation is that when I run the following, the HDD is not working; it seems that find stops as soon as the less command has enough data to display:
find | less
When I hit shift+g (go to the end of the file in less), the HDD immediately starts working and data starts outputting. Does this mean that when the less command has enough data to display, it somehow tells find not to produce further data? Is this what you mean by synchronization? Also, does the amount of data written to the pipe correspond to its buffer size? I have also noticed that find changes its state (ps aux, STAT column) from S+ to D+ after I hit shift+g in less:
S interruptible sleep (waiting for an event to complete)
D uninterruptible sleep (usually IO)
+ is in the foreground process group.
┌─[wakatana@~] [63 files, 178Mb]
└──> ps aux | egrep -w 'less|find'
wakatana 6071 0.0 0.0 12736 1088 pts/5 S+ 23:15 0:00 find
wakatana 6072 0.0 0.0 7940 928 pts/5 S+ 23:15 0:00 less
wakatana 6183 0.0 0.0 7832 892 pts/6 S+ 23:20 0:00 egrep --color=auto -w less|find
┌─[wakatana@~] [63 files, 178Mb]
└──> ps aux | egrep -w 'less|find'
wakatana 6071 0.0 0.0 12808 1304 pts/5 D+ 23:15 0:00 find
wakatana 6072 0.0 0.0 9556 2508 pts/5 S+ 23:15 0:00 less
wakatana 6193 0.0 0.0 7832 892 pts/6 S+ 23:21 0:00 egrep --color=auto -w less|find
3. Who sends this signal, the consumer to the producer? If yes, then how does the consumer know that it is connected to a pipe which already has a producer (e.g. my example with rm of the pipe)?
4. OK, clear.
5. OK, clear.
6. I think that newlines are not what confuses me. Based on my previous observations (and you confirmed that: “Yes, both ends wait for each other.”), I was expecting this:
I. The 1st iteration of the 1st loop will write to the pipe and, because nobody is reading, it will wait there.
II. When the 2nd loop is started, the data written by the 1st iteration of the 1st loop will be read; nothing more was written, so nothing more can be read.
III. The 2nd loop will wait for the next data to be written by the 1st loop, or (because the order does not matter) the 1st loop will wait until the written data is read by the 2nd loop, and so on.
Because of this I was expecting that one write would correspond to one read. I also wanted to verify whether the loop keeps running, so I modified the original command a bit to see if something would be printed to STDOUT even if the consumer was not reading, but nothing was printed.
for i in $(seq 1 10000); do
if [ $(( $i % 5 )) -eq 0 ]; then
echo $i;
else
echo -n $i"|"> /tmp/mypipe;
fi;
done
“Since the writing process isn’t sending any newlines, the reader simply reads until it’s told it got ‘enough.’”
- Who will tell the consumer that it has got enough?
“In the first case it’s probably because the fifo’s buffer filled up,”
- How can I fill the buffer if communication is blocked (as I described above)?
“and therefore got flushed through to the reader.”
- What do you mean by this? Sorry for my English.
“While there are ways to make communication asynchronous …”
- Can you please briefly describe the difference between asynchronous and synchronous in this case?