How to get domain name from URL in bash shell script

Last updated May 16, 2017

How can I extract a domain name from a URL string (e.g. https://www.cyberciti.biz/index.php) using bash shell scripting under a Linux or Unix-like operating system?

You can use standard Unix tools such as sed, awk, grep, Perl, or Python to get the domain name from a URL. A short one-liner is usually enough; no complex regex is needed.
Let us see various commands and options to grab the domain part from a given variable under a Linux or Unix-like system.

Get domain name from full URL

Say your URL is stored in a bash shell variable such as $x:
x='https://www.cyberciti.biz/faq/copy-command/'
You can use awk as follows:
echo "$x" | awk -F/ '{print $3}'
### OR ###
awk -F/ '{print $3}' <<<"$x"

Sample outputs:

www.cyberciti.biz
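
To see why the third field is the one that holds the domain, it may help to print every field that results from splitting on "/". The following is a minimal sketch, not part of the original tip; the variable name domain_cut is an assumption made for illustration. Because the scheme ends in "//", field 1 is "https:", field 2 is empty, and field 3 is the host. The cut command can pick the same field.

```shell
#!/bin/bash
x='https://www.cyberciti.biz/faq/copy-command/'

# Print each "/"-separated field with its index to see where the host lands.
printf '%s\n' "$x" | awk -F/ '{ for (i = 1; i <= NF; i++) printf "field %d = [%s]\n", i, $i }'

# cut can select the same third field (domain_cut is a hypothetical name).
domain_cut=$(printf '%s\n' "$x" | cut -d/ -f3)
echo "$domain_cut"
```

Note that this relies on the URL including a scheme such as https://. For a bare string like www.cyberciti.biz/faq/ the host would be field 1, not field 3.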

Extract domain name from URL using sed

Here is a sample sed command:
url="https://www.cyberciti.biz/faq/copy-command"
echo "$url" | sed -e 's|^[^/]*//||' -e 's|/.*$||'
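
The two sed expressions above can be easier to follow when run one at a time. This is a small sketch under the same sample URL; the variable names no_scheme and sed_domain are assumptions used only for illustration.

```shell
#!/bin/bash
url="https://www.cyberciti.biz/faq/copy-command"

# Step 1: delete everything up to and including the first "//" (the scheme).
no_scheme=$(printf '%s\n' "$url" | sed -e 's|^[^/]*//||')
echo "$no_scheme"     # www.cyberciti.biz/faq/copy-command

# Step 2: delete the first remaining "/" and everything after it (the path).
sed_domain=$(printf '%s\n' "$no_scheme" | sed -e 's|/.*$||')
echo "$sed_domain"    # www.cyberciti.biz
```

Using "|" as the sed delimiter avoids having to escape every "/" in the patterns.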

Extract domain name from URL using bash shell parameter substitution

Another option is to use bash shell parameter substitution:

# My shell variable 
f="https://www.cyberciti.biz/faq/copy-command/"
 
## Remove protocol part of url  ##
f="${f#http://}"
f="${f#https://}"
f="${f#ftp://}"
f="${f#scp://}"
f="${f#sftp://}"
 
## Remove username and/or username:password part of URL  ##
f="${f#*:*@}"
f="${f#*@}"
 
## Remove the rest of the URL (the path) ##
f=${f%%/*}
 
## Show domain name only ##
echo "$f"
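
The same parameter substitution steps can be wrapped in a reusable function. The sketch below is one possible arrangement, not the article's own code: get_host is a hypothetical name, the generic ${u#*://} pattern is swapped in for the per-protocol lines, and the port-stripping step is an addition. It assumes the host is not an IPv6 literal such as [::1]. It also strips the path before the user:password part, so that an "@" appearing later in the path is not mistaken for a login separator.

```shell
#!/bin/bash
# get_host is a hypothetical helper name, not part of the original article.
get_host() {
    local u=$1
    u=${u#*://}        # drop any scheme (http://, ftp://, ...)
    u=${u%%[/?#]*}     # drop path, query string and fragment
    u=${u##*@}         # drop user or user:password
    u=${u%%:*}         # drop :port (assumes no IPv6 literal such as [::1])
    printf '%s\n' "$u"
}

get_host 'https://www.cyberciti.biz/faq/copy-command/'
get_host 'ftp://user:secret@ftp.example.com:2121/pub/file.txt'
```

The first call prints www.cyberciti.biz and the second prints ftp.example.com. No external command is started, which can matter when the function is called in a loop over many URLs.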

Shell script example

A shell script that purges URLs from the Cloudflare cache, picking the API credentials by matching the domain name part of each URL:

#!/bin/bash
## Replace the ID_*, MY_KEY_* and EMAIL_ID_* placeholders below with your own values ##
zone_id=""
api_key=""
email_id=""
 
bon=$(tput bold)
boff=$(tput sgr0)
c=1
## At least one URL is required ##
[ $# -eq 0 ] && { echo "Usage: $0 url [url ...]"; exit 1; }
 
clear
echo "Purging..."
echo
## Loop over the arguments directly so that each URL stays one word ##
for u in "$@"
do
     echo -n "${bon}${c}${boff}.${u}: "
     ## Get domain name ##
     d="$(echo "$u" | awk -F/ '{ print $3 }')"
     ## Set API_KEY, Email_ID, and ZONE_ID as per domain ##
     case $d in
	     www.cyberciti.biz) zone_id="ID_1"; api_key="MY_KEY_1"; email_id="EMAIL_ID_1";;
	     theos.in) zone_id="ID_2"; api_key="MY_KEY_2"; email_id="EMAIL_ID_2";;
	     *) echo "Domain not configured."; continue;;
     esac
     ## Do it ##
     curl -X DELETE "https://api.cloudflare.com/client/v4/zones/${zone_id}/purge_cache" \
     -H "X-Auth-Email: ${email_id}" \
     -H "X-Auth-Key: ${api_key}" \
     -H "Content-Type: application/json" \
     --data "{\"files\":[\"${u}\"]}"
     echo
     (( c++ ))
done
echo

Posted by: Vivek Gite

The author is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector.

4 comments

  1. The pure bash version can be shortened to

    url="https://www.cyberciti.biz/faq/copy-command/"
    # Remove protocol
    url="${url#*://}"
    # Remove username and/or username:password
    url="${url#*@}"
    # Remove rest of url
    url=${url%%/*}
    # Show domain name only
    echo "$url"
