Lighttpd: block the Wget user agent for specific URLs

by Vivek Gite on August 12, 2007 · 3 comments

One of our regular readers asks:

My website is powered by the Lighttpd web server. I’d like to block the Wget user agent for my entire domain.com site except for the /download/ URL section. How do I configure lighttpd?

You need to combine two conditional fields, $HTTP["useragent"] and $HTTP["url"]. Open your lighttpd.conf file and append the configuration as follows.

Lighttpd block useragent wget configuration

# vi /etc/lighttpd/lighttpd.conf
Append the following config directives:

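# Requires mod_access in server.modules for url.access-deny to take effect.
$HTTP["useragent"] =~ "Wget" {
        $HTTP["url"] !~ "^/download($|/)" {
                # An empty suffix matches every URL, so everything is denied.
                url.access-deny = ( "" )
        }
}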

Where,

  • $HTTP["useragent"] : Match on the User-Agent request header, i.e. Wget
  • $HTTP["url"] : Match on the URL path, such as /download/*. If there are nested blocks, this must be the innermost block.
  • =~ : Perl-style regular expression match
  • !~ : Perl-style regular expression non-match
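
To see which requests the nested conditions catch, here is a minimal Python sketch that mimics the two regular expression tests. It is only an illustration and not part of lighttpd: the function name is_denied and the sample user agent strings are assumptions, and Python's re module stands in for lighttpd's PCRE matching, which should behave the same for these simple patterns.

```python
import re

# Patterns copied from the lighttpd configuration above.
USERAGENT_RE = re.compile(r"Wget")
ALLOWED_URL_RE = re.compile(r"^/download($|/)")


def is_denied(user_agent: str, url_path: str) -> bool:
    """Hypothetical helper: True when the nested conditions would deny the request."""
    # Outer block: =~ succeeds if the pattern matches anywhere in the header.
    if not USERAGENT_RE.search(user_agent):
        return False
    # Inner block: !~ succeeds when the URL path does NOT match the pattern.
    return ALLOWED_URL_RE.search(url_path) is None


if __name__ == "__main__":
    samples = [
        ("Wget/1.10.2", "/download/file.iso"),
        ("Wget/1.10.2", "/file.html"),
        ("Wget/1.10.2", "/downloads/file.iso"),
        ("Mozilla/5.0", "/file.html"),
    ]
    for agent, path in samples:
        verdict = "denied" if is_denied(agent, path) else "allowed"
        print(f"{agent:12} {path:22} {verdict}")
```

Note that the ($|/) group means only /download itself and paths below /download/ are exempt; a path such as /downloads/file.iso does not match and would still be denied for Wget.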

Restart the web server so the change takes effect:
# /etc/init.d/lighttpd restart

Users can now run wget against http://domain.com/download/* URLs, but not against http://domain.com/file.html or http://domain.com/dir/file


{ 3 comments… read them below or add one }

1 ear August 14, 2007

This is pointless imo.
Users will still be able to use wget on your whole domain if they use wget --user-agent ….


2 悉尼 July 8, 2009

Is it possible to match all blank user agents?


3 pat October 1, 2010

# reject a user agent that is empty or made up of any number of spaces
$HTTP["useragent"] =~ "^ *$" {
        url.access-deny = ( "" )
}

There’s a single space between ^ and *

