You have various options to run programs or commands in parallel on Linux or Unix-like systems:
=> Use the GNU parallel command.
=> Use the wait built-in command with &.
=> Use the xargs command.
This page shows how to run commands or code in parallel in the bash shell on Linux/Unix systems.
Putting jobs in background
The syntax is:
command &
command arg1 arg2 &
custom_function &
OR
prog1 &
prog2 &
wait
prog3
In the above code sample, prog1 and prog2 are started in the background, and the shell waits until both complete before starting the next program, named prog3.
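As a quick sketch of that pattern, with sleep standing in for prog1, prog2, and prog3:

```shell
#!/bin/bash
# sleep commands stand in for prog1, prog2, and prog3
sleep 2 &      # "prog1" in the background
sleep 1 &      # "prog2" in the background
wait           # block until both background jobs exit
echo "prog3 would start now"
```

The echo only runs once both sleeps have finished, which is exactly what wait guarantees.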
Examples
The following example runs the sleep command in the background:
$ sleep 60 &
$ sleep 90 &
$ sleep 120 &
To display the status of jobs in the current shell session, run the jobs command as follows:
$ jobs
Sample outputs:
[1]   Running    sleep 60 &
[2]-  Running    sleep 90 &
[3]+  Running    sleep 120 &
Let us write a simple bash shell script:
#!/bin/bash
# Our custom function
cust_func(){
	echo "Do something $1 times..."
	sleep 1
}
# For loop 5 times
for i in {1..5}
do
	cust_func $i & # Put a function in the background
done
## Put all cust_func in the background and bash
## would wait until those are completed
## before displaying all done message
wait
echo "All done"
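The loop above launches all five jobs at once. If you need to cap how many copies run at a time, one approach (a sketch assuming bash 4.3+ for the wait -n builtin) is to check the running-job count before starting each one:

```shell
#!/bin/bash
# Variant of the script above that never runs more than 3 jobs at once.
# Assumes bash 4.3+ for "wait -n" (wait for any single job to finish).
max_jobs=3
cust_func(){
	echo "Do something $1 times..."
	sleep 1
}
for i in {1..5}
do
	while (( $(jobs -rp | wc -l) >= max_jobs ))
	do
		wait -n    # block until one running job exits
	done
	cust_func "$i" &
done
wait
echo "All done"
```

jobs -rp prints the PIDs of running background jobs, so the while loop simply throttles the launcher whenever the limit is reached.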
Let us say you have a text file as follows:
$ cat list.txt
Sample outputs:
https://server1.cyberciti.biz/20170406_15.jpg
https://server1.cyberciti.biz/20170406_16.jpg
https://server1.cyberciti.biz/20170406_17.jpg
https://server1.cyberciti.biz/20170406_14.jpg
https://server1.cyberciti.biz/20170406_18.jpg
https://server1.cyberciti.biz/20170406_19.jpg
https://server1.cyberciti.biz/20170406_20.jpg
https://server1.cyberciti.biz/20170406_22.jpg
https://server1.cyberciti.biz/20170406_23.jpg
https://server1.cyberciti.biz/20170406_21.jpg
https://server1.cyberciti.biz/20170420_15.jpg
https://server1.cyberciti.biz/20170406_24.jpg
To download all files in parallel using wget:
#!/bin/bash
# Our custom function
cust_func(){
	wget -q "$1"
}
while IFS= read -r url
do
	cust_func "$url" &
done < list.txt
wait
echo "All files are downloaded."
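The xargs command mentioned at the top of the page can do the same job without an explicit loop: -P sets the maximum number of processes run at once and -n 1 passes one URL per invocation. A sketch, with echo standing in for wget -q so it is safe to dry-run (the URLs are placeholders):

```shell
#!/bin/bash
# -P 4 runs up to four processes at once; -n 1 passes one URL per call.
# echo stands in for "wget -q" so this sketch is safe to run; swap it
# out and feed your real list.txt to download for real:
#   xargs -P 4 -n 1 wget -q < list.txt
tmp=$(mktemp)
printf '%s\n' https://server1.cyberciti.biz/a.jpg \
              https://server1.cyberciti.biz/b.jpg > "$tmp"
xargs -P 4 -n 1 echo < "$tmp"
rm -f "$tmp"
```

Note that with -P the output lines can appear in any order, since the processes run concurrently.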
GNU parallel examples to run commands or code in parallel in the bash shell
From the GNU project site:
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables.
The syntax is pretty simple:
parallel ::: prog1 prog2
For example, you can find all *.doc files and gzip (compress) them using the following syntax:
$ find . -type f -name '*.doc' | parallel gzip --best
$ find . -type f -name '*.doc.gz'
Install GNU parallel on Linux
Use the apt command/apt-get command on Debian or Ubuntu Linux:
$ sudo apt install parallel
On RHEL/CentOS Linux, try the yum command:
$ sudo yum install parallel
If you are using Fedora Linux, try the dnf command:
$ sudo dnf install parallel
Examples
Our above wget example can be simplified using GNU parallel as follows:
$ cat list.txt | parallel -j 4 wget -q {}
OR
$ parallel -j 4 wget -q {} < list.txt
See also
- Putting jobs in background
- Putting functions in background
- See process management commands: bg command, disown command, fg command, and jobs command
Could really do with showing use of $! to get the PID as well IMO; very handy, especially for a bash script, when you want to be able to kill a long-running (or never-ending) process later, or wait for a specific process to end.
Maybe it’s just me but I always felt it was good practice to store the PID from $! after every asynchronous call.
Handy for things like splitting off one process per core and such.
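A sketch of the $! pattern those comments describe (the variable names are illustrative):

```shell
#!/bin/bash
# $! holds the PID of the most recent background job; store it right
# after each asynchronous call so you can target that job later.
sleep 30 &
long_pid=$!        # a long-running job we may want to kill
sleep 1 &
short_pid=$!       # a job we want to wait for specifically
wait "$short_pid"                      # block on just the short job
echo "job $short_pid finished"
kill "$long_pid" 2>/dev/null           # stop the long-running job early
wait "$long_pid" 2>/dev/null || true   # reap it; status reflects the kill
echo "job $long_pid stopped"
```

Passing a PID to wait blocks on that one job only, and passing it to kill lets you cancel a never-ending process instead of waiting 30 seconds for the first sleep.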
I found GNU parallel to be way over-engineered, so I wrote “map”; see https://github.com/sitaramc/map (and especially https://github.com/sitaramc/map/blob/master/map-vs-gp.mkd I guess). I must warn you it’s a bit dated; I use it every day but have not needed any changes for a long time.