Apache - The HTTP server from the Apache Foundation

Apache configuration directives

KeepAlive
Enables persistent connections, so that several requests can be sent over the same TCP connection. Read this excellent article.
Options FollowSymlinks
The server will follow symbolic links in this directory. This is the default setting.
ScriptAlias urlPath /path/to/dir/
Allows Apache to execute the scripts contained in the local directory /path/to/dir/ when they are called from the given urlPath. For example :
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
allows execution of the script /usr/lib/cgi-bin/myScript.cgi when called by www.example.com/cgi-bin/myScript.cgi (source)
ServerAlias
ServerAlias *.example.com matches foo.example.com but also foo.bar.example.com and foo.bar.baz.example.com.

When a request arrives, the server will find the best (most specific) matching VirtualHost argument based on the IP address and port used by the request. If there is more than one virtual host containing this best-match address and port combination, Apache will further compare the ServerName and ServerAlias directives to the server name present in the request.
If you omit the ServerName directive from any name-based virtual host, the server will default to a FQDN derived from the system hostname. This implicitly set server name can lead to counter-intuitive virtual host matching and is discouraged.
If no matching ServerName or ServerAlias is found in the set of virtual hosts containing the most specific matching IP address and port combination, then the first listed virtual host that matches that will be used.

(source)
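
The ServerAlias wildcard and the "first listed virtual host wins" fallback described above can be sketched in Python. This is only an illustration, not Apache's actual code : the function names and the (ServerName, aliases) data layout are made up for the example, and the virtual hosts are assumed to share the same IP address and port.

```python
from fnmatch import fnmatchcase


def alias_matches(pattern, host):
    """Glob-style match: '*' spans any characters, dots included."""
    return fnmatchcase(host.lower(), pattern.lower())


def select_vhost(host, vhosts):
    """Pick a virtual host for the requested host name.

    vhosts is a list of (server_name, [aliases]) tuples in configuration
    order (hypothetical layout, for illustration only).
    """
    for server_name, aliases in vhosts:
        if host.lower() == server_name.lower():
            return server_name
        if any(alias_matches(alias, host) for alias in aliases):
            return server_name
    # No ServerName / ServerAlias matched: the first listed vhost is used.
    return vhosts[0][0]
```

With vhosts = [("www.example.org", []), ("www.example.com", ["*.example.com"])], a request for foo.bar.example.com selects www.example.com, while a request for an unknown name falls back to www.example.org.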

Why does Apache ignore RewriteRules ?

Have you enabled the rewrite module ?
a2enmod rewrite && apache2ctl restart
Have you enabled the RewriteEngine in the VirtualHost configuration ?
Add to the VirtualHost configuration file : RewriteEngine on, then reload the configuration.

For further debugging, consider mod_rewrite's logging directives.

Apache's access.log format

Common log format :

  1. IP address of the remote host
  2. identity of the client, or "-" if not available. This information is highly unreliable and should almost never be used except on tightly controlled internal networks
  3. userId of the person requesting the document as determined by HTTP authentication. Defaults to "-" if the document is not password-protected
  4. date + time the request was received
  5. request sent by the client : HTTP method (GET) + resource (/index.html) + protocol and version (HTTP/1.0)
  6. status code returned by Apache
  7. size of the object returned to the client, not including the response headers. If no content was returned to the client, this value will be "-"

Combined log format :

Same as above +
  1. Referrer
  2. User-Agent
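
To make the field order concrete, here is a minimal Python sketch that splits a log line into the fields listed above. The regular expression and the parse_line name are assumptions made for this example; it handles both the common and the combined format, but not every custom LogFormat (nor requests containing escaped double quotes).

```python
import re

# One named group per field, in the order of the list above.
# The last two fields (referrer, user agent) only exist in the combined format.
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # "-" means that no content was returned to the client.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields
```

For a common-format line, the referer and agent keys are simply None.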

Apache is suffering from load and logs are full of internal dummy connection

Situation

Apache load is increasing, and /var/log/apache2/access.log fills up with requests coming from the server itself, whose User-Agent ends with (internal dummy connection).

Details

When the Apache HTTP Server manages its child processes, it needs a way to wake up processes that are listening for new connections. To do this, it sends a simple HTTP request back to itself. This request will appear in the access_log file with the remote address set to the loop-back interface (typically 127.0.0.1 or ::1 if IPv6 is configured). If you log the User-Agent string (as in the combined log format), you will see the server signature followed by (internal dummy connection) on non-SSL servers. During certain periods you may see up to one such request for each httpd child process.

These requests are perfectly normal and you do not, in general, need to worry about them. They can simply be ignored.

In 2.2.6 and earlier, in certain configurations, these requests may hit a heavy-weight dynamic web page and cause unnecessary load on the server. You can avoid this by using mod_rewrite to respond with a redirect when accessed with that specific User-Agent or IP address.
(source)

Solution

Add to the VirtualHost definition (source):
	<IfModule mod_rewrite.c>
		RewriteEngine On

		RewriteCond %{HTTP_USER_AGENT} ^.*internal\ dummy\ connection.*$ [NC]
		RewriteRule .* - [F,L]
	</IfModule>

RewriteRules

Syntax of a RewriteRule :

Syntax : RewriteRule Pattern Substitution [Flags]

What is matched?

  • In VirtualHost context, the pattern will initially be matched against the part of the URL after the hostname and port, and before the query string.
    For instance, given the URL http://www.example.com:81/app/index.php?param=value, the pattern will be matched on /app/index.php.
  • In Directory and htaccess context, the pattern will initially be matched against the filesystem path, after removing the prefix that led the server to the current RewriteRule (e.g. "app1/index.html" or "index.html" depending on where the directives are defined).

Examples (source) :

In VirtualHost context, for request GET /somepath/pathinfo :
Given Rule                                        Resulting Substitution
^/somepath(.*) /otherpath$1                       /otherpath/pathinfo
^/somepath(.*) /otherpath$1 [R]                   http://thishost/otherpath/pathinfo via external redirection
^/somepath(.*) http://thishost/otherpath$1        /otherpath/pathinfo
^/somepath(.*) http://thishost/otherpath$1 [R]    http://thishost/otherpath/pathinfo via external redirection
^/somepath(.*) http://otherhost/otherpath$1       http://otherhost/otherpath/pathinfo via external redirection
^/somepath(.*) http://otherhost/otherpath$1 [R]   http://otherhost/otherpath/pathinfo via external redirection (the [R] flag is redundant)

To redirect without keeping the query string :

http://foo?123 ==> http://bar

end the substitution with a question mark :

RewriteRule ^.* http://bar/?

Since Apache 2.4, the [QSD] flag has the same effect.

http://stackoverflow.com/questions/9374566/htaccess-remove-query-string-from-url-no-redirection
http://httpd.apache.org/docs/current/en/mod/mod_rewrite.html (See "Modifying the Query String" paragraph)

In .htaccess context, with /physical/path/to/somepath/.htaccess having RewriteBase /somepath and a request such as : GET /somepath/localpath/pathinfo

Given Rule                                        Resulting Substitution
^localpath(.*) otherpath$1                        /somepath/otherpath/pathinfo
^localpath(.*) otherpath$1 [R]                    http://thishost/somepath/otherpath/pathinfo via external redirection
^localpath(.*) /otherpath$1                       /otherpath/pathinfo
^localpath(.*) /otherpath$1 [R]                   http://thishost/otherpath/pathinfo via external redirection
^localpath(.*) http://thishost/otherpath$1        /otherpath/pathinfo
^localpath(.*) http://thishost/otherpath$1 [R]    http://thishost/otherpath/pathinfo via external redirection
^localpath(.*) http://otherhost/otherpath$1       http://otherhost/otherpath/pathinfo via external redirection
^localpath(.*) http://otherhost/otherpath$1 [R]   http://otherhost/otherpath/pathinfo via external redirection (the [R] flag is redundant)
^localpath(.*) http://otherhost/otherpath$1 [P]   http://otherhost/otherpath/pathinfo via internal proxy
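
The difference between the two contexts (what the pattern is matched against, and what happens to a relative substitution) can be sketched in Python. This is a simplified model, not mod_rewrite itself : the function names are invented for the example, flags such as [R] or [P] are ignored, and the request path is assumed to start with the per-directory prefix.

```python
import re


def expand(pattern, substitution, path):
    """Return the substitution with $N back-references filled in,
    or None when the pattern does not match the path."""
    match = re.search(pattern, path)
    if match is None:
        return None
    return re.sub(r"\$(\d)", lambda ref: match.group(int(ref.group(1))) or "",
                  substitution)


def rewrite_vhost(pattern, substitution, url_path):
    """VirtualHost context: the pattern sees the full URL path."""
    return expand(pattern, substitution, url_path)


def rewrite_htaccess(pattern, substitution, url_path, dir_prefix, rewrite_base):
    """.htaccess context: the per-directory prefix is stripped before
    matching, and RewriteBase is prepended to a relative result."""
    local_path = url_path[len(dir_prefix):]
    result = expand(pattern, substitution, local_path)
    if result is None:
        return None
    if result.startswith("/") or "://" in result:
        return result  # absolute path or full URL: used as is
    return rewrite_base.rstrip("/") + "/" + result
```

For instance, rewrite_htaccess(r"^localpath(.*)", "otherpath$1", "/somepath/localpath/pathinfo", "/somepath/", "/somepath") gives /somepath/otherpath/pathinfo, matching the first row of the table above.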

Syntax of a RewriteCond :

In the RewriteCond TestString CondPattern [Flags] statement, CondPattern is usually a Perl-compatible regex. This answers all questions about wildcards.

When more than one RewriteCond is specified, they must all match for the RewriteRule to be applied, unless they are combined with the [OR] flag.

Implementation of SSL / TLS

Becoming a CA :

The CA structure :

  1. mkdir -p /srv/ssl; cd /srv/ssl; mkdir certs crl newcerts private; echo 01 > serial; touch index.txt; cp /usr/lib/ssl/openssl.cnf .
  2. edit /srv/ssl/openssl.cnf :
    • Define the working directory : dir = /srv/ssl
    • Within the [ req_distinguished_name ] section, set some default values (as you're becoming a recognized CA and want to automate as much as possible of your process) :
      • your country code in countryName_default
      • your company name in 0.organizationName_default

Generate the CA private key :

cd /srv/ssl && openssl genrsa -des3 -out private/ca.key.pem 4096

Create a self-signed CA certificate :

cd /srv/ssl && openssl req -config openssl.cnf -new -x509 -nodes -sha256 -days 1825 -key private/ca.key.pem -out ca.cert.pem

This outputs /srv/ssl/ca.cert.pem, which is the CA's certificate. This can already be shared publicly provided Apache can handle its MIME type :

echo "AddType application/x-x509-ca-cert .pem" > /etc/apache2/conf.d/certificates.conf && service apache2 reload

Create a certificate for ACME Corp.'s website :

Create ACME Corp.'s private key :

cd /srv/ssl && openssl genrsa -des3 -out acme.key.pem 4096

ACME Corp. now requests a new certificate (in CSR format) :

cd /srv/ssl && openssl req -config openssl.cnf -new -key acme.key.pem -out acme.csr.pem

As the CA, sign ACME Corp.'s certificate request, and generate its certificate :

cd /srv/ssl && openssl ca -config openssl.cnf -policy policy_anything -out acme.cert.pem -infiles acme.csr.pem

For convenience, let's remove the password on ACME Corp.'s private key file, since it is prompted for every time the key is read, e.g. whenever Apache restarts :

cd /srv/ssl && openssl rsa -in acme.key.pem -out acme.key.nopass.pem

Clean up (the acme.* pattern keeps the new acme directory itself out of the move) :

mkdir /srv/ssl/acme && mv /srv/ssl/acme.* /srv/ssl/acme/

Apache configuration :

Create the VirtualHost file /etc/apache2/sites-available/ssl.acme.com.conf :

<VirtualHost *:443>
	ServerName		ssl.acme.com
	DocumentRoot		/var/www/ssl.acme.com

	SSLEngine		On
	# Disable obsolete SSL versions (Apache does not allow a comment on the same line as a directive)
	SSLProtocol		+TLSv1 +TLSv1.1 +TLSv1.2 -SSLv2 -SSLv3

	SSLCertificateFile	/srv/ssl/acme/acme.cert.pem
	SSLCertificateKeyFile	/srv/ssl/acme/acme.key.nopass.pem
</VirtualHost>

TLSv1 is often said to be equivalent to SSLv3, but this is not true : TLSv1 is an evolution of SSLv3 with backward compatibility. TLSv1 should be preferred, though. (source : 1, 2)
See also : SSL/TLS Strong Encryption: How-To.

Other settings :

  1. echo 'NameVirtualHost *:443' >> /etc/apache2/ports.conf && a2enmod ssl, or, cleaner, edit /etc/apache2/ports.conf :
    <IfModule mod_ssl.c>
    	Listen 443
    	NameVirtualHost *:443
    </IfModule>

    NameVirtualHost is deprecated since Apache 2.3.11 (sources : 1, 2). This solution may not work as expected or generate errors.

    This fixes the [warn] _default_ VirtualHost overlap on port 443, the first has precedence error (source).
  2. a2ensite ssl.acme.com.conf && /etc/init.d/apache2 reload
  3. Then open (with a compatible browser) : https://ssl.acme.com/ (It works, even though you may get a warning because the certificate is signed by a CA that the browser does not recognize)

Further steps :

The ACME Corp. certificate should be shareable with a simple hyperlink : http://www.acme.com/acme.cert.pem, but Firefox refuses to import a certificate that's not from a recognized CA.

Create a 2nd SSL Virtualhost :

The problem (source)

Running multiple web sites that allow HTTPS connections on the same Apache httpd server is problematic. Typically, HTTP virtual servers (the individual web sites) use named virtual hosting, where the virtual host which serves the site is chosen based on the host name specified in the HTTP Host header.

Named virtual hosting does not work for HTTPS because the server cannot interpret the Host header until the connection has been made, and making the connection requires the completion of the SSL encryption handshake used by HTTPS.

SSL certificates (without extensions) can only have a single server host name as their subject, and thus the certificate and connection will only work for a single host name. As a result, the named virtual hosting mechanism never has a chance to operate on the incoming connection and the only remaining way to host multiple sites is to add multiple IP addresses to the host and use IP-based virtual servers.

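Solution 1 : Server Name Indication (aka SNI) :

With SNI, the client sends the requested host name during the TLS handshake, so the server can pick the matching virtual host and certificate before the encrypted connection is set up. Both the client and the server must support it.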

Solution 2 : wildcard names :

The idea is having one certificate to work for any number of hostnames below a given domain: *.example.com would allow any of a.example.com, foo.example.com, or elbow.example.com, but would not work for right.elbow.example.com or www.google.com. Wildcard certificates are much more expensive than standard certificates.

Solution 3 : Subject alternative names (aka SubjectAltNames) :

This allows a certificate to list a number of host names for which it is valid. For example, a single certificate with www.example.com as its (single) subject could list www.example.com, www.example.org, and webapp.example.com as alternate names. The certificate would be recognized as valid for any of those host names.

More details in RFC 5280, which obsoletes RFC 3280.
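
The matching rules of solutions 2 and 3 can be sketched in Python as follows. This is a simplified illustration (function names are invented here, and real TLS clients apply stricter rules) : a wildcard stands for exactly one leftmost label, and a certificate is accepted when the subject or any alternative name matches.

```python
def name_matches(cert_name, hostname):
    """Compare a certificate name with a host name, label by label.
    A leading '*' stands for exactly one label."""
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        return cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


def cert_valid_for(hostname, subject, alt_names=()):
    """True when the subject or one of the alternative names matches."""
    return any(name_matches(name, hostname) for name in (subject, *alt_names))
```

Unlike the ServerAlias wildcard, *.example.com here matches foo.example.com but not right.elbow.example.com.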

Security strategy (source) :

To avoid leaking a user's login and password from a login page, one may be interested in serving this page through HTTPS. The "optimized" strategy would then be to send the login page (and form) in standard HTTP, and just let the form POST its content through HTTPS. But, in case of a MitM attack, the attacker may be able to alter the form's "action" attribute, and get the credentials for himself. So the wise strategy is to serve both the form and the form "action" pages over HTTPS.

Order / Allow / Deny

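The configuration directives discussed here are deprecated in Apache 2.4.x, where they are only available through mod_access_compat and are replaced by the Require directive (Require all granted, Require host, Require ip, ...). Read this article for solution and links.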

Samples :

Order Deny,Allow
Deny from all
Allow from apache.org

Order Allow,Deny
Allow from apache.org
Deny from foo.apache.org

Order :

Order Allow,Deny or Order Deny,Allow works in 3 steps :
  1. Processes all directives named by the 1st parameter (Allow / Deny)
  2. Processes all directives named by the 2nd parameter (Deny / Allow)
  3. Applies the default to requests that matched none of them.
  • Allow,Deny or Deny,Allow matters because the last matching rule applies.
  • The order of the Allow from ... and Deny from ... lines in the configuration file doesn't matter.
  • If both or neither of the Allow / Deny filters match, the 2nd parameter of the Order directive decides.

Whitelisting IPs (for test/validation environments) :

Order Deny,Allow
Deny from all
Allow from apache.org
    1. Apache gets a request from the remote host apache.org (someone working for the Apache Foundation)
    2. Apache starts by processing the Deny directives : apache.org matches Deny from all
    3. Then Apache processes the Allow directives : apache.org matches Allow from apache.org
    4. Both Allow and Deny matched : request is allowed

    1. Apache gets a request from the remote IP 1.2.3.4
    2. Apache starts by processing the Deny directives : 1.2.3.4 matches Deny from all
    3. Then Apache processes the Allow directives : no match
    4. Only Deny matched : request is denied

Filtering IPs :

Order Allow,Deny
Allow from apache.org
Deny from foo.apache.org
    1. Apache gets a request from the remote host www.apache.org (someone working for the website of the Apache Foundation)
    2. Apache starts by processing the Allow directives : www.apache.org matches Allow from apache.org
    3. Then Apache processes the Deny directives : no match
    4. Only Allow matched : request is allowed

    1. Apache gets a request from the remote host foo.apache.org
    2. Apache starts by processing the Allow directives : foo.apache.org matches Allow from apache.org
    3. Then Apache processes the Deny directives : foo.apache.org matches Deny from foo.apache.org
    4. Both Allow and Deny matched : request is denied

    1. Apache gets a request from the remote IP 1.2.3.4
    2. Apache starts by processing the Allow directives : no match
    3. Then Apache processes the Deny directives : no match
    4. Neither Allow nor Deny matched : request is denied

Blacklisting IPs (for production environments) :

Order Allow,Deny
Allow from all
Deny from apache.org
    1. Apache gets a request from the remote host www.apache.org (someone working for the website of the Apache Foundation)
    2. Apache starts by processing the Allow directives : www.apache.org matches Allow from all
    3. Then Apache processes the Deny directives : www.apache.org matches Deny from apache.org
    4. Both Allow and Deny matched : request is denied

    1. Apache gets a request from the remote IP 1.2.3.4
    2. Apache starts by processing the Allow directives : 1.2.3.4 matches Allow from all
    3. Then Apache processes the Deny directives : no match
    4. Only Allow matched : request is allowed
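
The three scenarios above all follow the same rule, which can be condensed into a small Python sketch. It is a simplified model only : the function names are made up for the example, and host specifications are limited to "all" and (partial) domain names, whereas Apache also accepts IP addresses and networks and performs DNS lookups.

```python
def host_matches(spec, host):
    """'all' matches everything; a domain matches itself and its subdomains."""
    spec, host = spec.lower(), host.lower()
    return spec == "all" or host == spec or host.endswith("." + spec)


def is_allowed(order, allow, deny, host):
    """Evaluate Order / Allow from / Deny from for one remote host.

    allow and deny are lists of host specifications; their order in the
    configuration file does not matter, only the Order parameter does.
    """
    allowed = any(host_matches(spec, host) for spec in allow)
    denied = any(host_matches(spec, host) for spec in deny)
    normalized = order.replace(" ", "").lower()
    if normalized == "deny,allow":
        # Allow is evaluated last and wins; the default is to allow.
        return allowed or not denied
    if normalized == "allow,deny":
        # Deny is evaluated last and wins; the default is to deny.
        return allowed and not denied
    raise ValueError("unsupported Order value: " + order)
```

For example, is_allowed("Allow,Deny", ["apache.org"], ["foo.apache.org"], "foo.apache.org") returns False, as in the "Filtering IPs" scenario.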

Apache configuration rules for eZ Publish

On the RCT (acceptance testing) server, load images from PROD :

Add into the Apache configuration file :
	# Get images from PROD if not available here.
	RewriteCond %{DOCUMENT_ROOT}/$1 !-f
	RewriteRule ^/(var/[^/]+/storage/.*)$ http://www.example.com/$1 [R,L,NS]

Compress content

Compression is achieved thanks to Apache modules such as mod_deflate.

Natively compressed formats (source):

Some file formats are already compressed, meaning that any extra compression won't have any noticeable effect on filesize, but will increase processing time.
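
A quick way to see this effect is to compress the same data twice with Python's zlib module (the DEFLATE algorithm behind gzip). The sample text and the saved_fraction helper are made up for this illustration; the exact figures depend on the data.

```python
import random
import zlib


def saved_fraction(data):
    """Fraction of the original size saved by one compression pass."""
    return 1 - len(zlib.compress(data)) / len(data)


# Deterministic pseudo-text made of a few repeated words: compresses well.
rng = random.Random(0)
words = [b"apache", b"server", b"virtual", b"host",
         b"rewrite", b"module", b"log", b"ssl"]
text = b" ".join(rng.choice(words) for _ in range(5000))

# Stand-in for a natively compressed file (image, archive, ...).
once = zlib.compress(text)

print("plain text        :", round(saved_fraction(text), 2))
print("already compressed:", round(saved_fraction(once), 2))
```

The first pass saves most of the size, while the second pass saves next to nothing (it may even add a few bytes) and still costs CPU time.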

Apache offers to download .php5 files instead of processing them

Apache must be told that .php5 files are to be executed like .php files. To do so :
  1. edit /etc/apache2/mods-available/php5.conf
  2. you will find a configuration line like : <FilesMatch "\.ph(p3?|tml)$">, declaring which files are considered as .php files.
  3. to declare .php5 files, change this into <FilesMatch "\.ph(p3?|p5|tml)$">
  4. apache2ctl restart and voilà !

Another solution : a2enmod php5 (not tested)

Run CGI scripts (Perl, Python, ...)

  1. edit /etc/apache2/mods-available/mime.conf and uncomment the line : AddHandler cgi-script .cgi
  2. if you want *.pl files to be recognized, append .pl to that line, which gives : AddHandler cgi-script .cgi .pl
  3. in the configuration file of your CGI application (/etc/apache2/sites-available/myCgiApp), find the section <Directory /var/www/myCgiApp>, then the line Options, and append +ExecCGI, which may look like :
    <Directory /var/www/myCgiApp>
    Options Indexes FollowSymLinks MultiViews +ExecCGI
    AllowOverride None
    Order allow,deny
    allow from all
    </Directory>

Don't forget to have a look at the error log : tail -f /var/log/apache2/error.log