How are Unix pipes implemented?

What exactly is the ingenuity of the Unix pipe?


I heard the story of how Douglas McIlroy came up with the concept and how Ken Thompson put it into practice in one night.

As far as I know, pipe is a system call that shares some memory between two processes, one of which is writing and another of which is reading.

As someone unfamiliar with operating system internals or concepts, I was wondering what exactly the "genius" in the story is. Is it the idea of two processes sharing memory? Or is it the implementation? Or both?

PS: I know how useful the pipe is and how it is used in the shell. The question is about the concept and implementation of the pipe itself.






Reply:


"As far as I know, pipe is a system call that shares some memory between two processes, one of which is writing and another of which is reading."

Actually, it is not about shared memory. The reader and writer do NOT share any part of their address space and do not use explicit synchronization.

The reader and the writer simply make read and write system calls, as if they were reading from or writing to a file. THAT is the genius ... the innovation: the idea that (simple) cross-process communication and file I/O can be handled the same way ... from the point of view of the application programmer and the user.

Once the pipe has been set up, the operating system (not the application code or user space libraries) takes care of the buffering and coordination. Transparently.
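To make that concrete, here is a minimal Python sketch (not how any particular shell does it). It assumes a POSIX system where os.fork is available; the function name send_through_pipe is made up for illustration. The point is that neither side touches shared memory or semaphores: the child calls write, the parent calls read, and the kernel does the rest.

```python
import os

def send_through_pipe(message: bytes) -> bytes:
    """Fork a child that writes `message` into a pipe; the parent reads it back.

    Hypothetical helper for illustration; assumes a short message.
    """
    read_fd, write_fd = os.pipe()       # two file descriptors onto one kernel buffer
    pid = os.fork()
    if pid == 0:
        # Child: the writer. It uses a plain write(), just as it would on a file.
        os.close(read_fd)
        os.write(write_fd, message)
        os.close(write_fd)
        os._exit(0)
    # Parent: the reader. It uses a plain read(), just as it would on a file.
    os.close(write_fd)                  # otherwise the reader never sees end of file
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)  # blocks until data arrives or the writer closes
        if not chunk:                   # an empty read means end of file
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)                  # reap the child
    return b"".join(chunks)

if __name__ == "__main__":
    print(send_through_pipe(b"hello through a pipe\n").decode(), end="")
```

Note that the two processes never agree on a buffer layout or a locking scheme. The blocking read is the only "synchronization" the application code sees.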


In contrast, prior to the invention of the pipe concept, when you had to do some "pipeline" processing, you typically had the first application write its output to a file and then, when it was done, ran the second application to read that file.

Alternatively, if you wanted a real pipeline, you could code both applications to establish a (real) shared memory segment and use semaphores (or something else) to coordinate reads and writes. Complicated ... and not often done as a result.






In my opinion, the genius of the idea of "pipes" lies in its ease of use.

There is no need to make system calls or allocate memory, nothing complicated. In the shell you use a single character: |. This gives extraordinary power in combining simple (or complex) tools for a given task.

Take a mundane task such as neatly sorting text. You might have a command that lists a whole bunch of names. (For my example, I'm using a file that contains a series of names, courtesy of listofrandomnames.com.) Pipes let you do the following:
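As a rough stand-in for that kind of command (this is not the answer's own command line), the Python sketch below wires a sort | uniq pipeline together by hand. It assumes the standard sort and uniq utilities are on the PATH; the function name sort_unique and the sample names are invented for illustration.

```python
import os
import subprocess

def sort_unique(names: str) -> str:
    """Feed `names` through sort, then uniq, with a pipe between the two processes."""
    env = dict(os.environ, LC_ALL="C")   # fixed locale so the sort order is predictable
    sort = subprocess.Popen(["sort"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, env=env)
    # uniq reads directly from sort's output; no temporary file is involved.
    uniq = subprocess.Popen(["uniq"], stdin=sort.stdout,
                            stdout=subprocess.PIPE, env=env)
    sort.stdout.close()                  # only uniq keeps the read end open now
    sort.stdin.write(names.encode())
    sort.stdin.close()                   # end of input lets sort emit its result
    output = uniq.communicate()[0]
    sort.wait()
    return output.decode()

if __name__ == "__main__":
    print(sort_unique("Carol\nAlice\nBob\nAlice\n"), end="")
```

All of that plumbing is what the shell hides behind the single | character, which is the ease of use being praised here.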

This is only one example; there are thousands. Refer to the "The Unix Philosophy" section on this page for a few more specific tasks that the use of pipes makes much easier.


To underline this answer, read slides 4 through 9 of the presentation, "Why Zsh Is Cooler Than Your Shell".


I am aware that the above command contains a UUOC (useless use of cat). I'll leave it because it's a placeholder for any command that generates text.





So I tried to do a bit of research by looking for manuals for PDP-10 / TOPS-10 to find out what the state of the art was before pipes. I found this, but TOPS-10 is remarkably difficult to google. There are a few good references on the invention of the pipe: an interview with McIlroy, about the history and impact of UNIX.

You need to put this in a historical context. Few of the modern tools and conveniences that we take for granted existed.

"In the beginning, Thompson didn't even program on the PDP itself, but instead used a set of macros for the GEMAP assembler on a GE-635 machine." (29) A paper tape was produced on the GE 635 and then tested on the PDP-7 until, according to Ritchie, "a primitive Unix kernel, an editor, an assembler, a simple shell (command interpreter) and a few utilities (such as the Unix rm, cat, cp commands) were completed. At this point, the operating system was self-supporting, programs could be written and tested without resorting to paper tape, and development continued on the PDP-7 itself."

A PDP-7 looks like this. Note the lack of an interactive display or hard drive. The "file system" would be stored on the magnetic tape. There was up to 64 KB of memory for programs and data.

In this environment, programmers tended to address the hardware directly, such as issuing commands to spin up the tape and sequentially processing characters read directly from the tape interface. UNIX provided abstractions over that, so that "read from teletype" and "read from tape" were no longer separate interfaces but were combined into one. The crucial addition made by pipes was "read from the output of another program" without saving a temporary copy on disk or on tape.

Here is McIlroy on the invention of grep. I think it does a good job of summarizing the amount of work involved in the pre-UNIX environment.

"Grep was invented for me. I was making a program to read text aloud through a voice synthesizer. As I invented phonetic rules, I would check Webster's dictionary for words on which they might fail. For example, how do you cope with the digraph 'ui', which is pronounced in many different ways: 'fruit', 'guile', 'guilty', 'anguish', 'intuit', 'beguine'? I would break the dictionary up into pieces that fit in ed's limited buffer and use a global command to select a list. I would whittle this list down by repeated scannings with ed to see how each proposed rule worked."

"The process was tedious and terribly wasteful, since the dictionary had to be split (one couldn't afford to keep a split copy online). Then ed copied each part into /tmp, scanned it twice to accomplish the g command, and finally threw it away, which takes time too."

"One afternoon I asked Ken Thompson if he could lift the regular expression recognizer out of the editor and make a one-pass program to do it. He said yes. The next morning I found a note in my mail announcing a program named grep. It worked like a charm. When asked what that funny name meant, Ken said it was obvious. It stood for the editor command that it simulated, g/re/p (global regular expression print)."

Compare the first part of that account with the name-sorting example above. If your options are "construct a command line" or "hand-write a program specifically for this purpose in assembler", it is worth constructing the command line, even if it takes a few hours of reading the (paper) manuals. You can then write it down for future reference.


The genius of pipes is that they combine three important ideas.

First, pipes are a practical implementation of "co-routines," a term coined by Conway in 1958 that held promise but found little practical use before pipes.
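The co-routine flavor can be sketched inside a single Python program with generators, which are only a rough stand-in for separate processes joined by pipes. Each stage runs just far enough to hand one item to the next stage and then pauses, much as a process blocked on a pipe does. The stage names source, only_matching and numbered are invented for this illustration.

```python
from typing import Iterable, Iterator

def source(lines: Iterable[str]) -> Iterator[str]:
    """First stage: emit lines one at a time, like a command producing text."""
    for line in lines:
        yield line                       # pause here until the next stage asks for more

def only_matching(lines: Iterable[str], needle: str) -> Iterator[str]:
    """Middle stage: pass along only the lines containing `needle`."""
    for line in lines:
        if needle in line:
            yield line

def numbered(lines: Iterable[str]) -> Iterator[str]:
    """Last stage: prefix each surviving line with a running count."""
    for count, line in enumerate(lines, 1):
        yield f"{count}: {line}"

if __name__ == "__main__":
    text = ["a pipe", "a file", "another pipe"]
    # Reads inside out, roughly like: source | only_matching pipe | numbered
    for line in numbered(only_matching(source(text), "pipe")):
        print(line)
```

No stage holds the whole data set. Control passes back and forth one line at a time, which is the behavior pipes gave to ordinary programs without those programs having to be written as co-routines.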

Second, by implementing pipes in the shell language, Thompson et al. created the first real "glue language".

With these two points, reusable software components can be efficiently developed in an optimized, low-level language and then assembled into much larger, more complex functionality. They called this "programming in the large".

Third, the implementation of pipes with the same system calls that were used to access files made it possible to write programs with universal interfaces. This enabled truly universal solutions to software problems that could be used interactively, with data from files, and as part of larger software systems without ever having to change the software components. No compiling, no configuration, just a few simple shell commands.
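A small Python sketch of that third point, under stated assumptions: shout is a made-up filter that knows nothing about where its input comes from, and run_on_file and run_on_pipe are hypothetical helpers that hand it a regular file and a pipe respectively. The pipe variant writes before reading within one process, so it assumes the text is small enough to fit in the kernel's pipe buffer.

```python
import io
import os
import tempfile

def shout(src, dst) -> None:
    """Copy src to dst in upper case. It only ever reads lines and writes text."""
    for line in src:
        dst.write(line.upper())

def run_on_file(text: str) -> str:
    """Run the filter with an ordinary (temporary) file as its input."""
    with tempfile.TemporaryFile("w+") as f:
        f.write(text)
        f.seek(0)
        out = io.StringIO()
        shout(f, out)
        return out.getvalue()

def run_on_pipe(text: str) -> str:
    """Run the same, unchanged filter with a pipe as its input."""
    read_fd, write_fd = os.pipe()
    with os.fdopen(write_fd, "w") as writer:
        writer.write(text)               # assumption: text fits in the pipe buffer
    out = io.StringIO()
    with os.fdopen(read_fd) as reader:
        shout(reader, out)
    return out.getvalue()

if __name__ == "__main__":
    sample = "pipes\nand files\n"
    print(run_on_file(sample) == run_on_pipe(sample))
```

The filter is never modified or reconfigured between the two runs. Only the caller decides what the input is connected to.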

If you are willing to go through the learning curve, UNIX shell software is just as useful today as it was 40 years ago. We are constantly reinventing things that they already understood and had developed solutions for. And the decisive breakthrough was the simple pipe. The only real innovation after that was the creation of the internet in the 1980s. Sadly, UNIX botched the implementation by creating a separate API for it. We're still suffering from the aftermath ... Oh yeah, there was something about video displays and mice that got popular in the late 80s. But that's for WIMPs.
