Parallelizing Bash Scripts

par is a tool that runs multiple commands in parallel and shows all of their output on a single, color-coded stdout stream. The inspiration for par’s appearance comes from Docker Compose, which clearly separates the output of one process from another. par is open source and available at https://github.com/ughe/par. Here’s an example of par in action with netcat servers and clients:

[Screenshot: par example]

The terminal screenshot above demonstrates the output of the following par command:

par './nc_listen.sh 1234' './nc_listen.sh 4321' 'echo bravo' './nc_hello.sh 1234' './nc_hello.sh 4321'

This runs five processes. Two are netcat servers listening on ports 1234 and 4321, two are netcat clients that connect to them, and one just echoes bravo. Here are the two scripts, nc_listen.sh and nc_hello.sh, that are run in parallel:

  1. nc_listen.sh
     #!/usr/bin/env sh
     if [ $# -ne 1 ]; then >&2 echo "usage: $0 1234"; exit 1; fi
     echo "Starting netcat on :$1 ..."
     lsof -i :$1 | awk '/nc /{print $2}' | xargs kill -9
     nc -l $1
    
  2. nc_hello.sh
     #!/usr/bin/env sh
     sleep 1
     if [ $# -ne 1 ]; then >&2 echo "usage: $0 1234"; exit 1; fi
     for i in `seq 1 3`; do echo hello world $i; done | nc 127.0.0.1 $1
    

Finally, here’s the plaintext of the image above:

par                 | Started 5 procs. Warning: lines truncated around 4096 chars. Waiting for any nonzero exit. PIDs: 7731 7740 7750 7757 7764 
./nc_listen.sh 1234 | Starting netcat on :1234 ...
./nc_listen.sh 4321 | Starting netcat on :4321 ...
echo bravo          | bravo
`echo bravo` exited with code 0 (PID: 7750)
./nc_listen.sh 1234 | hello world 1
./nc_listen.sh 1234 | hello world 2
./nc_listen.sh 1234 | hello world 3
./nc_listen.sh 4321 | hello world 1
./nc_listen.sh 4321 | hello world 2
./nc_listen.sh 4321 | hello world 3
`./nc_listen.sh 1234` exited with code 0 (PID: 7731)
`./nc_listen.sh 4321` exited with code 0 (PID: 7740)
`./nc_hello.sh 1234` exited with code 0 (PID: 7757)
`./nc_hello.sh 4321` exited with code 0 (PID: 7764)
par                 | Done. Exit code: 0

The ANSI color codes are not shown in the text above, although they would be present if the output were piped or redirected to a file.
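
Since the color codes stay in the stream when it is piped or redirected, a log file can end up containing raw escape sequences. Here is a minimal sketch of one way to strip them. It assumes the only escapes present are color sequences of the form ESC [ ... m, and strip_ansi is a hypothetical helper name, not something par provides:

```shell
#!/usr/bin/env sh
# strip_ansi: remove ANSI color escape sequences (ESC [ ... m) from stdin.
# Hypothetical helper for illustration; it is not part of par.
strip_ansi() {
  esc=$(printf '\033')          # build a literal ESC byte portably
  sed "s/${esc}\[[0-9;]*m//g"   # drop every ESC [ <digits and ;> m run
}
```

It could be used as a filter, for example: par 'echo alpha' 'echo bravo' | strip_ansi > par.log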

Inner Workings

The par command is about 81 lines long and is available at https://github.com/ughe/par. In order, it:

  1. Defines println (which calls print_color) to produce color-coded, tagged lines
  2. Defines pad and unpad for preventing interleaving. Inspired by http://catern.org/pipes.html
  3. For each argument (command)
    • Executes command in the background and captures its PID
    • Captures command’s stdout/err and pads it then appends to the UNIFIED pipe
  4. Prints the number and PIDs of all procs that were started
  5. Backgrounds a process to continuously unpad UNIFIED and write to stdout
  6. Registers a trap to end straggler processes on a Ctrl-C interrupt
  7. Waits until any PID has non-zero exit or all PIDs exit zero
    • Checks every PID every second to see if it has stopped yet. Once it stops:
    • Prints the command name, that it exited, the exit code, and its PID
    • Continues waiting if the exit code was zero (increments a counter too)
    • Otherwise kills all PIDs and their children, one level deep
  8. Cleans up and prints “Done” along with the exit code
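
The waiting logic in step 7 can be sketched roughly as follows. This is not par’s actual code: wait_all_or_fail is a hypothetical name, the PIDs must be background children of the calling shell, and, unlike par, this sketch does not kill the children of each process.

```shell
#!/usr/bin/env sh
# wait_all_or_fail PID...
# Poll the given background PIDs about once per second. Return 0 once all
# of them have exited with code 0. As soon as one exits non-zero, kill the
# PIDs still being tracked and return that exit code.
# Hypothetical sketch; it must be called directly (not inside $(...)),
# because wait only works on children of the current shell.
wait_all_or_fail() {
  remaining="$*"
  while [ -n "$remaining" ]; do
    alive=""
    for pid in $remaining; do
      if kill -0 "$pid" 2>/dev/null; then
        alive="$alive $pid"        # still running, check again next pass
        continue
      fi
      wait "$pid"                  # already stopped, so this only collects the code
      code=$?
      echo "PID $pid exited with code $code"
      if [ "$code" -ne 0 ]; then
        for other in $remaining; do
          [ "$other" != "$pid" ] && kill "$other" 2>/dev/null
        done
        return "$code"
      fi
    done
    remaining=$alive
    [ -n "$remaining" ] && sleep 1
  done
  return 0
}
```

A caller would start each command with & and record $! after each one, then pass the collected PIDs to the function. Depending on the shell, a finished process may only be noticed one polling pass after it exits.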

There’s also a Makefile that supports make test and make install. The tests only demonstrate output (there are no assertions). make install only works if Go is installed, since it attempts to copy the executable to $GOPATH/bin.

Alternatives

There are many alternatives to par. For example, you could start each process by hand, write a short script, or just use xargs such as:

printf '%s\n' './command1' './command2' | xargs -P "$(nproc)" -I {} sh -c "echo '{}'; {}"

Note: on macOS, use sysctl -n hw.ncpu instead of nproc.
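
To avoid editing the command for each platform, the CPU count lookup can be wrapped in a small function. This is a minimal sketch that assumes either nproc or sysctl is available; ncpu and run_parallel are hypothetical names chosen for illustration, and -P is an xargs extension supported by both the GNU and BSD versions:

```shell
#!/usr/bin/env sh
# ncpu: print the number of CPUs, preferring nproc and falling back to
# sysctl (macOS). Hypothetical helper name.
ncpu() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  else
    sysctl -n hw.ncpu
  fi
}

# run_parallel: read one command per line from stdin and run them with
# sh -c, at most $(ncpu) at a time. Output order is not guaranteed.
run_parallel() {
  xargs -P "$(ncpu)" -I {} sh -c '{}'
}
```

For example: printf '%s\n' './command1' './command2' | run_parallel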