Apache - The HTTP server from the Apache Software Foundation

Apache error : AH00035: access to /page.html denied

While trying to set up a basic Apache configuration, /var/log/httpd/error_log reports :
[Wed Sep 20 15:48:30.346550 2017] [core:error] [pid 25989] (13)Permission denied: [client 10.27.25.137:55602] AH00035: access to /page.html denied (filesystem path '/data/rhelRepo/page.html') because search permissions are missing on a component of the path

Solution (part 1/2) :

This usually means the user account running Apache (httpd on Red Hat-like systems) cannot traverse every directory on the path to DocumentRoot, or cannot read the files in it.
  1. Just to make sure you're not running something exotic : check the username + group of the apache process owner :
    grep -E '^(User|Group) ' /etc/httpd/conf/httpd.conf
    	User apache
    	Group apache
  2. Check current configuration options :
    httpd -t -D DUMP_RUN_CFG
    ServerRoot: "/etc/httpd"
    Main DocumentRoot: "/data/rhelRepo"
    Main ErrorLog: "/etc/httpd/logs/error_log"
    ...
  3. Set permissions (source) :
    myDocumentRoot='/data/rhelRepo'; find "$myDocumentRoot" -type d -exec chmod 755 {} \; ; find "$myDocumentRoot" -type f -exec chmod 644 {} \;

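This should do the trick, provided every parent directory (here /data) also grants the search (execute) permission to the Apache user, and unless SELinux is in the way. Red Hat-like users, keep reading part 2.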
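The error message complains about "a component of the path", so it helps to look at every directory between / and the file. Here is a minimal sketch, assuming GNU stat (Linux); check_search_bits is a hypothetical helper name, and it only inspects the permission bits for "others", which is an approximation when the Apache user owns the directory or belongs to its group. Where available, namei -l gives a similar view.

```shell
#!/bin/sh
# Walk from the file's directory up to / and flag every directory that
# "others" cannot traverse (missing o+x, the "search" permission).
# Assumption: GNU stat (stat -c). check_search_bits is a made-up name.
check_search_bits() {
    path=$1
    # Start from the containing directory when given a file.
    [ -d "$path" ] || path=$(dirname "$path")
    while :; do
        mode=$(stat -c '%a' "$path") || return 1
        # Last octal digit = permissions for "others"; odd means x is set.
        others=${mode#"${mode%?}"}
        case $others in
            1|3|5|7) printf '%-7s %s %s\n' ok "$mode" "$path" ;;
            *)       printf '%-7s %s %s\n' BLOCKED "$mode" "$path" ;;
        esac
        # Stop at the filesystem root (or at "." for relative paths).
        case $path in
            /|.) break ;;
        esac
        path=$(dirname "$path")
    done
}
```

Usage : check_search_bits /data/rhelRepo/page.html, then fix any directory reported as BLOCKED (e.g. chmod o+x on it).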

Solution (part 2/2) (source) :

  1. Confirm SELinux is involved :
    tail -f /var/log/audit/audit.log | grep -E 'type=(AVC|SYSCALL)'
  2. Send requests to the web server. You found the culprit if you can see things like :
    type=AVC msg=audit(1505919935.243:23294): avc: denied { getattr } for pid=28387 comm="httpd" path="/data/rhelRepo/page.html" dev="dm-2" ino=786435 scontext=system_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:unlabeled_t:s0 tclass=file
    type=SYSCALL msg=audit(1505919935.243:23294): arch=c000003e syscall=4 success=no exit=-13 a0=7f511728e340 a1=7ffe6a31c6e0 a2=7ffe6a31c6e0 a3=7f510d20c792 items=1 ppid=28385 pid=28387 auid=4294967295 uid=48 gid=48 euid=48 suid=48 fsuid=48 egid=48 sgid=48 fsgid=48 tty=(none) ses=4294967295 comm="httpd" exe="/usr/sbin/httpd" subj=system_u:system_r:httpd_t:s0 key=(null)

    Also note that the uid=48 gid=48 of the SYSCALL record is the apache user :

    id apache
    uid=48(apache) gid=48(apache) groups=48(apache)

  3. Solution :
    chcon --user system_u --type httpd_sys_content_t -Rv "$myDocumentRoot"
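  4. It should work now ! Keep in mind that labels set with chcon are lost when the filesystem is relabeled; to make them permanent, declare the context with semanage fcontext and apply it with restorecon.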
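To read the audit log faster, the AVC records can be reduced to the three fields that matter here: the denied permission, the path, and the SELinux type of the target. This is only a sketch: avc_summary is a hypothetical helper name, and the sed expression assumes the field layout of the sample record above.

```shell
#!/bin/sh
# Print "permission path target_type" for each httpd AVC denial read on stdin.
# avc_summary is a made-up name; the regex assumes the record layout shown above.
avc_summary() {
    grep 'type=AVC' | grep 'comm="httpd"' |
    sed -n 's/.*denied *{ *\([a-z_ ]*[a-z_]\) *}.* path="\([^"]*\)".* tcontext=[^:]*:[^:]*:\([^:]*\):.*/\1 \2 \3/p'
}
```

Usage : avc_summary < /var/log/audit/audit.log. A target type such as unlabeled_t or default_t on web content is a strong hint that the files need the httpd_sys_content_t type.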

Apache error : AH01630: client denied by server configuration

Situation :

After upgrading Apache 2.2.x to Apache 2.4.10, /var/log/apache2/error.log started complaining :

[Sun Jan 03 15:55:13.845797 2016] [authz_core:error] [pid 2205:tid 139630284179200] [client 127.0.0.1:52045] AH01630: client denied by server configuration: /path/to/some/web/resource

Details :

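This is due to changes in access control between versions 2.2 and 2.4 : authorization now goes through the Require directive (mod_authz_core, with the host-based providers of mod_authz_host), and the old Order / Allow / Deny directives are only available through mod_access_compat.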

Solution :

The website we're considering is on a development workstation, and should only be accessed from that workstation. The virtualhost configuration had the following lines for Apache 2.2.x :

Order Deny,Allow
Deny from all
Allow from 127.0.0.1
These can be replaced, for Apache 2.4.10, with :
Require ip 127.0.0.1
(Require host expects a host name, not an IP address.)

This is a quick fix, and the Apache documentation should be studied further before applying this to production servers.

Other interesting Require options :
Require local
    allows connections from the local host
Require all granted
Require all denied
    respectively grant / deny access to all requests
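Before (or after) an upgrade to 2.4, it can be useful to list the 2.2-style access-control lines that still need converting to Require. Here is a rough sketch: find_legacy_acl is a hypothetical helper name, and the regex only covers Order, Allow from, Deny from and Satisfy.

```shell
#!/bin/sh
# List (file:line:text) the 2.2-style access-control directives found under a
# configuration directory. find_legacy_acl is a made-up name.
# Like grep, it returns a non-zero status when nothing is found.
find_legacy_acl() {
    grep -rnEi '^[[:space:]]*(Order[[:space:]]+(allow|deny)|(Allow|Deny)[[:space:]]+from[[:space:]]|Satisfy[[:space:]])' "$1"
}
```

Usage : find_legacy_acl /etc/apache2 (or /etc/httpd on Red Hat-like systems).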

Apache configuration directives

KeepAlive
    Read this excellent article.
Options FollowSymLinks
    The server will follow symbolic links in this directory. This is the default setting since Apache 2.3.11 (it was Options All before).
ScriptAlias urlPath /path/to/dir/
    Allows Apache to execute the scripts contained in the local directory /path/to/dir/ when they are called from the given urlPath. For example :
    ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    allows execution of the script /usr/lib/cgi-bin/myScript.cgi when called by www.example.com/cgi-bin/myScript.cgi (source)
ServerAlias
    ServerAlias *.example.com matches foo.example.com but also foo.bar.example.com and foo.bar.baz.example.com.

When a request arrives, the server will find the best (most specific) matching VirtualHost argument based on the IP address and port used by the request. If there is more than one virtual host containing this best-match address and port combination, Apache will further compare the ServerName and ServerAlias directives to the server name present in the request.
If you omit the ServerName directive from any name-based virtual host, the server will default to a FQDN derived from the system hostname. This implicitly set server name can lead to counter-intuitive virtual host matching and is discouraged.
If no matching ServerName or ServerAlias is found in the set of virtual hosts containing the most specific matching IP address and port combination, then the first listed virtual host that matches that will be used.

(source)
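The ServerAlias wildcard behaves much like a shell glob: * can span several labels, dots included. The sketch below only mimics that matching with a shell case statement to make the rule tangible; it is not Apache's own code, and server_alias_matches is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed when a host name matches a ServerAlias-style pattern.
# Illustration only: relies on shell globbing, where * also matches dots.
server_alias_matches() {
    pattern=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    host=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
    # Leaving $pattern unquoted lets the shell treat * and ? as wildcards.
    case $host in
        $pattern) return 0 ;;
        *)        return 1 ;;
    esac
}

server_alias_matches '*.example.com' foo.bar.example.com && echo "foo.bar.example.com matches"
server_alias_matches '*.example.com' example.com || echo "example.com does not match"
```

Note that the bare domain is not covered by *.example.com, so it needs its own ServerName or ServerAlias entry.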

Why does Apache ignore RewriteRules ?

Have you enabled the rewrite module ?
a2enmod rewrite && apache2ctl restart
Have you enabled the RewriteEngine in the VirtualHost configuration ?
Add to the VirtualHost configuration file : RewriteEngine on, then reload the configuration.

For further debugging, consider mod_rewrite's logging directives.

How to specify cache headers in the VirtualHost configuration ?

With mod_expires :

	<IfModule mod_expires.c>
		ExpiresActive On
		ExpiresDefault "access plus 1 month"

		ExpiresByType text/html "access plus 5 minutes"
		ExpiresByType text/css "access plus 10 minutes"
		ExpiresByType image/* "access plus 3 minutes"
	</IfModule>

With mod_headers :

<IfModule headers_module> and <IfModule mod_headers.c> are equivalent : IfModule accepts either the module identifier or the name of its source file.
	<IfModule headers_module>
		Header set Cache-Control "max-age=123456, public"
	</IfModule>

It is possible to mimic the ExpiresByType behavior by setting headers based on the file extension. Keep in mind that a file extension is not a content type, so this is only an approximation :

	<IfModule mod_headers.c>
		# (other settings)

		<FilesMatch "\.(jpg|jpeg|png|gif)$">
			Header set Cache-Control "max-age=42, public"
		</FilesMatch>
		<FilesMatch "\.(js|css)$">
			Header set Cache-Control "max-age=96, public"
		</FilesMatch>
	</IfModule>
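To check which caching headers a resource is actually served with, the response headers can be filtered as sketched below. cache_headers is a hypothetical helper name, and the URL in the usage line is a placeholder.

```shell
#!/bin/sh
# Keep only the caching-related lines of an HTTP response header block read
# on stdin. cache_headers is a made-up name.
cache_headers() {
    # Header lines end with CRLF: drop the CR so the output stays clean.
    tr -d '\r' | grep -iE '^(Cache-Control|Expires):'
}
```

Usage : curl -sI http://www.example.com/logo.png | cache_headers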

When Apache VirtualHosts are in a mess : Warning: DocumentRoot [...] does not exist

Situation :

Some colleagues are not the "Please-leave-this-place-as-clean-as-it-was-when-you-arrived" type : they add VirtualHosts to a shared Apache webserver (development platform) and just don't give a f*ck about the warnings at reload/restart when their VirtualHosts are not required / working anymore.

Details :

Upon reload/restart, Apache complains :
Warning: DocumentRoot [/path/to/docRoot1/] does not exist
Warning: DocumentRoot [/path/to/docRoot2/] does not exist
Warning: DocumentRoot [/path/to/docRoot3/] does not exist

Solution :

So let's do some cleaning :
cd /etc/apache2/sites-available/; for missingDocRoot in /path/to/docRoot1/ /path/to/docRoot2/ /path/to/docRoot3/; do a2dissite $(grep -l "$missingDocRoot" *); done; /etc/init.d/apache2 reload
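Rather than copying the paths from the warnings by hand, the stale VirtualHosts can be listed first. A minimal sketch, assuming one VirtualHost per file and DocumentRoot paths without spaces; missing_docroots is a hypothetical helper name.

```shell
#!/bin/sh
# For each configuration file of a directory, print "file docroot" when the
# first DocumentRoot it declares does not exist.
# Assumptions: one VirtualHost per file, no spaces in paths.
missing_docroots() {
    for conf in "$1"/*; do
        [ -f "$conf" ] || continue
        # First DocumentRoot of the file, optional double quotes removed.
        docroot=$(awk 'tolower($1) == "documentroot" { gsub(/"/, "", $2); print $2; exit }' "$conf")
        if [ -n "$docroot" ] && [ ! -d "$docroot" ]; then
            echo "$(basename "$conf") $docroot"
        fi
    done
    return 0
}
```

Usage : missing_docroots /etc/apache2/sites-enabled, then review the list before passing the file names to a2dissite.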

Apache's access.log format

Common log format :

  1. IP address of the remote host
  2. identity of the client, or "-" if not available. This information is highly unreliable and should almost never be used except on tightly controlled internal networks
  3. userId of the person requesting the document as determined by HTTP authentication. Defaults to "-" if the document is not password protected
  4. date + time the request was received
  5. request sent by the client : HTTP method (GET) + resource (/index.html) + protocol and version (HTTP/1.0)
  6. status code returned by Apache
  7. size of the object returned to the client, not including the response headers. If no content was returned to the client, this value will be "-"

Combined log format :

Same as above, plus :
  8. Referer header sent by the client
  9. User-Agent header sent by the client
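The field order above can be checked on real log lines with a few lines of awk. This is a simplified sketch: parse_combined is a hypothetical helper name, and splitting on double quotes breaks if the request or the User-Agent itself contains an escaped quote.

```shell
#!/bin/sh
# Extract the main fields of combined-format log lines read on stdin.
# Simplification: fields are split on double quotes.
parse_combined() {
    awk -F'"' '{
        split($1, head, " ")   # ip, identity, user, [date, zone]
        split($3, tail, " ")   # status, size
        printf "ip=%s user=%s status=%s size=%s request=%s agent=%s\n",
               head[1], head[3], tail[1], tail[2], $2, $6
    }'
}
```

Usage : tail -n 20 /var/log/apache2/access.log | parse_combined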

Apache is suffering from load and logs are full of internal dummy connection

Situation :

Apache load is increasing, and /var/log/apache2/access.log fills up with requests coming from the server itself, whose User-Agent ends with (internal dummy connection)

Details :

When the Apache HTTP Server manages its child processes, it needs a way to wake up processes that are listening for new connections. To do this, it sends a simple HTTP request back to itself. This request will appear in the access_log file with the remote address set to the loop-back interface (typically 127.0.0.1 or ::1 if IPv6 is configured). If you log the User-Agent string (as in the combined log format), you will see the server signature followed by (internal dummy connection) on non-SSL servers. During certain periods you may see up to one such request for each httpd child process.

These requests are perfectly normal and you do not, in general, need to worry about them. They can simply be ignored.

In 2.2.6 and earlier, in certain configurations, these requests may hit a heavy-weight dynamic web page and cause unnecessary load on the server. You can avoid this by using mod_rewrite to respond with a redirect when accessed with that specific User-Agent or IP address.
(source)

Solution :

Add to the VHost definition (source):
	<IfModule mod_rewrite.c>
		RewriteEngine On

		RewriteCond %{HTTP_USER_AGENT} ^.*internal\ dummy\ connection.*$ [NC]
		RewriteRule .* - [F,L]
	</IfModule>

RewriteRules

Syntax of a RewriteRule :

Syntax : RewriteRule Pattern Substitution [Flags]

What is matched?
  • In VirtualHost context, the pattern will initially be matched against the part of the URL after the hostname and port, and before the query string.
    For instance, given the URL http://www.example.com:81/app/index.php?param=value, the pattern will be matched on /app/index.php.
  • In Directory and htaccess context, the pattern will initially be matched against the filesystem path, after removing the prefix that led the server to the current RewriteRule (e.g. "app1/index.html" or "index.html" depending on where the directives are defined).
Examples (source) :
In VirtualHost context, for request GET /somepath/pathinfo :
Given Rule                                      | Resulting Substitution
------------------------------------------------+------------------------------------------------------------
^/somepath(.*) /otherpath$1                     | /otherpath/pathinfo
^/somepath(.*) /otherpath$1 [R]                 | http://thishost/otherpath/pathinfo via external redirection
^/somepath(.*) http://thishost/otherpath$1      | /otherpath/pathinfo
^/somepath(.*) http://thishost/otherpath$1 [R]  | http://thishost/otherpath/pathinfo via external redirection
^/somepath(.*) http://otherhost/otherpath$1     | http://otherhost/otherpath/pathinfo via external redirection
^/somepath(.*) http://otherhost/otherpath$1 [R] | http://otherhost/otherpath/pathinfo via external redirection (the [R] flag is redundant)
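The pattern / backreference mechanics of the first rule can be tried out with sed. This is only an analogy, not mod_rewrite itself: sed writes the backreference \1 where mod_rewrite writes $1, and sed replaces only the matched span whereas mod_rewrite replaces the whole path. Both give the same result here because the pattern is anchored with ^ and ends with (.*). simulate_rewrite is a hypothetical helper name.

```shell
#!/bin/sh
# Apply a regex substitution to a URL path, sed standing in for mod_rewrite.
# $1 = URL path, $2 = pattern, $3 = substitution (backreferences as \1, \2...).
# Limitation: pattern and substitution must not contain the # character.
simulate_rewrite() {
    printf '%s\n' "$1" | sed -E "s#$2#$3#"
}

simulate_rewrite /somepath/pathinfo '^/somepath(.*)' '/otherpath\1'
```

The call above prints /otherpath/pathinfo, the first row of the table.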
To redirect without keeping the query string :

http://foo?123 ==> http://bar

just end the substitution with a question mark (on Apache 2.4 and later, the [QSD] flag has the same effect) :

^.* http://bar/?

http://stackoverflow.com/questions/9374566/htaccess-remove-query-string-from-url-no-redirection
http://httpd.apache.org/docs/current/en/mod/mod_rewrite.html (See "Modifying the Query String" paragraph)

In .htaccess context, with /physical/path/to/somepath/.htaccess having RewriteBase /somepath and a request such as : GET /somepath/localpath/pathinfo

Given Rule                                      | Resulting Substitution
------------------------------------------------+------------------------------------------------------------
^localpath(.*) otherpath$1                      | /somepath/otherpath/pathinfo
^localpath(.*) otherpath$1 [R]                  | http://thishost/somepath/otherpath/pathinfo via external redirection
^localpath(.*) /otherpath$1                     | /otherpath/pathinfo
^localpath(.*) /otherpath$1 [R]                 | http://thishost/otherpath/pathinfo via external redirection
^localpath(.*) http://thishost/otherpath$1      | /otherpath/pathinfo
^localpath(.*) http://thishost/otherpath$1 [R]  | http://thishost/otherpath/pathinfo via external redirection
^localpath(.*) http://otherhost/otherpath$1     | http://otherhost/otherpath/pathinfo via external redirection
^localpath(.*) http://otherhost/otherpath$1 [R] | http://otherhost/otherpath/pathinfo via external redirection (the [R] flag is redundant)
^localpath(.*) http://otherhost/otherpath$1 [P] | http://otherhost/otherpath/pathinfo via internal proxy

Syntax of a RewriteCond :

In the RewriteCond TestString Condition [Flags] statement, Condition is usually a Perl-compatible regex, so regex syntax applies rather than shell-style wildcards.

When more than one RewriteCond is specified, they must all match for the RewriteRule to be applied, unless they are chained with the [OR] flag.

Implementation of SSL / TLS

Becoming a CA :

The CA structure :
  1. mkdir -p /srv/ssl; cd /srv/ssl; mkdir certs crl newcerts private; echo 01 > serial; touch index.txt; cp /usr/lib/ssl/openssl.cnf .
  2. edit /srv/ssl/openssl.cnf :
    • Define the working directory : dir = /srv/ssl
    • Within the [ req_distinguished_name ] section, set some default values (as you're becoming a recognized CA and want to automate as much as possible of your process) :
      • your country code in countryName_default
      • your company name in 0.organizationName_default
Generate the CA private key :
cd /srv/ssl && openssl genrsa -des3 -out private/ca.key.pem 4096
Create a self-signed CA certificate :
cd /srv/ssl && openssl req -config openssl.cnf -new -x509 -sha256 -days 1825 -key private/ca.key.pem -out ca.cert.pem

This outputs /srv/ssl/ca.cert.pem, which is the CA's certificate. This can already be shared publicly provided Apache can handle its MIME type :

echo "AddType application/x-x509-ca-cert .pem" > /etc/apache2/conf.d/certificates.conf && service apache2 reload

Create a certificate for ACME Corp.'s website :

Create ACME Corp.'s private key :
cd /srv/ssl && openssl genrsa -des3 -out acme.key.pem 4096
ACME Corp. now requests a new certificate (in CSR format) :
cd /srv/ssl && openssl req -config openssl.cnf -new -key acme.key.pem -out acme.csr.pem
As the CA, sign ACME Corp.'s certificate request, and generate its certificate :
cd /srv/ssl && openssl ca -config openssl.cnf -policy policy_anything -out acme.cert.pem -infiles acme.csr.pem
For convenience, let's remove the password on ACME Corp.'s private key file, since Apache would otherwise prompt for it every time it reads the key, e.g. at each restart :
cd /srv/ssl && openssl rsa -in acme.key.pem -out acme.key.nopass.pem
Clean up (the acme.* pattern keeps mv from trying to move the new acme directory into itself) :
mkdir /srv/ssl/acme && mv /srv/ssl/acme.* /srv/ssl/acme/
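Before pointing Apache at these files, it is worth checking that the certificate really belongs to the private key, since a mismatch prevents Apache from starting. A common way is to compare the RSA modulus of both files, as sketched below; cert_matches_key is a hypothetical helper name, and the check assumes RSA keys and an unencrypted key file (otherwise openssl prompts for the password).

```shell
#!/bin/sh
# Succeed when the certificate ($1) was issued for the RSA private key ($2).
# Returns 2 when one of the files cannot be read by openssl.
cert_matches_key() {
    cert_mod=$(openssl x509 -noout -modulus -in "$1") || return 2
    key_mod=$(openssl rsa -noout -modulus -in "$2") || return 2
    [ "$cert_mod" = "$key_mod" ]
}
```

Usage : cert_matches_key /srv/ssl/acme/acme.cert.pem /srv/ssl/acme/acme.key.nopass.pem && echo "certificate and key match"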

Apache configuration :

Create the VirtualHost file /etc/apache2/sites-available/ssl.acme.com.conf :
<VirtualHost *:443>
	ServerName		ssl.acme.com
	DocumentRoot		/var/www/ssl.acme.com

	SSLEngine		On
	# Disable obsolete SSL / TLS versions (Apache comments must sit on their own line)
	SSLProtocol		all -SSLv3 -TLSv1 -TLSv1.1

	SSLCertificateFile	/srv/ssl/acme/acme.cert.pem
	SSLCertificateKeyFile	/srv/ssl/acme/acme.key.nopass.pem
</VirtualHost>

TLSv1 is often said to be equivalent to SSLv3, but this is not true : TLSv1 is an evolution of SSLv3 and is preferable to it. Both are obsolete today, along with TLSv1.1 (RFC 8996) : prefer TLSv1.2 or later. (source : 1, 2)
See also : SSL/TLS Strong Encryption: How-To.

Other settings :
  1. echo 'NameVirtualHost *:443' >> /etc/apache2/ports.conf && a2enmod ssl, or, cleaner, edit /etc/apache2/ports.conf :
    <IfModule mod_ssl.c>
    	Listen 443
    	NameVirtualHost *:443
    </IfModule>

    NameVirtualHost is deprecated since Apache 2.3.11 (sources : 1, 2). This solution may not work as expected or generate errors.

    This fixes the [warn] _default_ VirtualHost overlap on port 443, the first has precedence error (source).
  2. a2ensite ssl.acme.com.conf && /etc/init.d/apache2 reload
  3. Then open (with a compatible browser) : https://ssl.acme.com/ (It works, even though you may get a warning regarding the self signed certificate)
Further steps :
The ACME Corp. certificate should be shareable with a simple hyperlink : http://www.acme.com/acme.cert.pem, but Firefox refuses to import a certificate that's not from a recognized CA.

Create a 2nd SSL Virtualhost :

The problem (source)

Running multiple web sites that allow HTTPS connections on the same Apache httpd server is problematic. Typically, HTTP virtual servers (the individual web sites) use named virtual hosting, where the virtual host which serves the site is chosen based on the host name specified in the HTTP Host header.

Named virtual hosting does not work for HTTPS because the server cannot interpret the Host header until the connection has been made, and making the connection requires the completion of the SSL encryption handshake used by HTTPS.

SSL certificates (without extensions) can only have a single server host name as their subject, and thus the certificate and connection will only work for a single host name. As a result, the named virtual hosting mechanism never has a chance to operate on the incoming connection and the only remaining way to host multiple sites is to add multiple IP addresses to the host and use IP-based virtual servers.

Solution 1 : Server Name Indication (aka SNI)
Solution 2 : wildcard names :
The idea is having one certificate to work for any number of hostnames below a given domain: *.example.com would allow any of a.example.com, foo.example.com, or elbow.example.com, but would not work for right.elbow.example.com or www.google.com. Wildcard certificates are much more expensive than standard certificates.
Solution 3 : Subject alternative names (aka SubjectAltNames) :

This allows a certificate to list a number of host names for which it is valid. For example, a single certificate with www.example.com as its (single) subject could list www.example.com, www.example.org, and webapp.example.com as alternate names. The certificate would be recognized as valid for any of those host names.

More details in RFC 5280 (which obsoletes RFC 3280).
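The wildcard rule of solution 2 (the * stands for exactly one label) can be expressed in a few lines of shell. This is a simplified sketch of that rule only, not a full certificate name check: wildcard_cert_matches is a hypothetical helper name, and the comparison is case-sensitive.

```shell
#!/bin/sh
# Succeed when a host name is covered by a certificate name, where a leading
# "*." stands for exactly one label.
wildcard_cert_matches() {
    pattern=$1
    host=$2
    case $pattern in
        '*.'*)
            suffix=${pattern#\*}                 # e.g. ".example.com"
            label=${host%"$suffix"}              # what "*" would have to cover
            [ "$label" != "$host" ] || return 1  # host does not end with suffix
            case $label in
                ''|*.*) return 1 ;;              # empty, or more than one label
                *)      return 0 ;;
            esac
            ;;
        *)
            [ "$pattern" = "$host" ]
            ;;
    esac
}

wildcard_cert_matches '*.example.com' elbow.example.com && echo "elbow.example.com is covered"
wildcard_cert_matches '*.example.com' right.elbow.example.com || echo "right.elbow.example.com is not covered"
```

This differs from ServerAlias matching, where *.example.com also matches names with several extra labels.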

Security strategy (source) :

To avoid leaking a user's login and password from a login page, one may want to serve this page through HTTPS. The "optimized" strategy would then be to send the login page (and form) over plain HTTP, and just let the form POST its content through HTTPS. But in case of a MitM attack, the attacker may be able to alter the form's "action" attribute and capture the credentials. So the wise strategy is to serve both the form and its "action" target over HTTPS.

Order / Allow / Deny

Configuration directives discussed here are deprecated for Apache 2.4.x. Read this article for solution and links.

Samples :

Order Deny,Allow
Deny from all
Allow from apache.org

Order Allow,Deny
Allow from apache.org
Deny from foo.apache.org

Order :

Order Allow,Deny or Order Deny,Allow works in 3 steps :
  1. Processes all directives specified by the 1st parameter (Allow / Deny)
  2. Processes all directives specified by the 2nd parameter (Deny / Allow)
  3. Applies the default to requests that matched neither an Allow nor a Deny directive.
  • Allow,Deny or Deny,Allow matters because the last matching rule applies.
  • The order of the Allow from ... and Deny from ... lines in the configuration file doesn't matter.
  • If both or none of the Allow / Deny filters match, defaults to the 2nd parameter of the Order directive.

Whitelisting IPs (for test/validation environments) :

Order Deny,Allow
Deny from all
Allow from apache.org
    Request from apache.org :
    1. Apache gets a request from the remote host apache.org (someone working for the Apache Foundation)
    2. Apache starts by processing the Deny directives : apache.org matches Deny from all
    3. Then Apache processes the Allow directives : apache.org matches Allow from apache.org
    4. Both Allow and Deny matched : request is allowed
    Request from 1.2.3.4 :
    1. Apache gets a request from the remote IP 1.2.3.4
    2. Apache starts by processing the Deny directives : 1.2.3.4 matches Deny from all
    3. Then Apache processes the Allow directives : no match
    4. Only Deny matched : request is denied

Filtering IPs :

Order Allow,Deny
Allow from apache.org
Deny from foo.apache.org
    Request from www.apache.org :
    1. Apache gets a request from the remote host www.apache.org (someone working for the website of the Apache Foundation)
    2. Apache starts by processing the Allow directives : www.apache.org matches Allow from apache.org
    3. Then Apache processes the Deny directives : no match
    4. Only Allow matched : request is allowed
    Request from foo.apache.org :
    1. Apache gets a request from the remote host foo.apache.org
    2. Apache starts by processing the Allow directives : foo.apache.org matches Allow from apache.org
    3. Then Apache processes the Deny directives : foo.apache.org matches Deny from foo.apache.org
    4. Both Allow and Deny matched : request is denied
    Request from 1.2.3.4 :
    1. Apache gets a request from the remote IP 1.2.3.4
    2. Apache starts by processing the Allow directives : no match
    3. Then Apache processes the Deny directives : no match
    4. Neither Allow nor Deny matched : request is denied

Blacklisting IPs (for production environments) :

Order Allow,Deny
Allow from all
Deny from apache.org
    Request from www.apache.org :
    1. Apache gets a request from the remote host www.apache.org (someone working for the website of the Apache Foundation)
    2. Apache starts by processing the Allow directives : www.apache.org matches Allow from all
    3. Then Apache processes the Deny directives : www.apache.org matches Deny from apache.org
    4. Both Allow and Deny matched : request is denied
    Request from 1.2.3.4 :
    1. Apache gets a request from the remote IP 1.2.3.4
    2. Apache starts by processing the Allow directives : 1.2.3.4 matches Allow from all
    3. Then Apache processes the Deny directives : no match
    4. Only Allow matched : request is allowed
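The walkthroughs above all reduce to one small decision rule, sketched here as a shell function. It is a model of the rule as described in this section, not Apache code; order_decision is a hypothetical helper name, and the yes/no arguments state whether at least one Allow (resp. Deny) directive matched the client.

```shell
#!/bin/sh
# order_decision ORDER ALLOW_MATCHED DENY_MATCHED
#   ORDER         : "Allow,Deny" or "Deny,Allow"
#   ALLOW_MATCHED : yes / no
#   DENY_MATCHED  : yes / no
# Prints "allowed" or "denied".
order_decision() {
    order=$1
    allow=$2
    deny=$3
    if [ "$allow" = "$deny" ]; then
        # Both matched, or neither did : the 2nd keyword of Order decides.
        case $order in
            Allow,Deny) echo denied ;;
            Deny,Allow) echo allowed ;;
            *)          return 2 ;;
        esac
    elif [ "$allow" = yes ]; then
        echo allowed
    else
        echo denied
    fi
}

order_decision Deny,Allow yes yes   # whitelisting, request from apache.org
order_decision Allow,Deny no no     # filtering, request from 1.2.3.4
```

The two calls print allowed and denied, matching the corresponding scenarios above.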

Protect an Apache directory with an .htaccess

Generic situation :

  1. Put the following directives either in the VirtualHost definition, or in a .htaccess file created in the directory to restrict (source) :
    AuthType Basic
    AuthName "[your prompt here]"
    AuthUserFile /path/to/.htpasswd
    Require valid-user
  2. Create the /path/to/.htpasswd file and populate it as shown below.

For Free.fr :

  1. In the directory to restrict, create a .htaccess file such as :
    PerlSetVar AuthFile /path/to/.htpasswd
    AuthName "Explicit Grant Required"
    AuthType Basic
    Require valid-user
  2. Then, create a .htpasswd file like :
    user1:password1
    user2:password2
    To do so :
    1. create a new .htpasswd file : htpasswd -c /path/to/.htpasswd userName. This will prompt for a password.
    2. add a new user to an existing .htpasswd file : htpasswd /path/to/.htpasswd userName
  3. There is some special stuff for Free.fr servers (sources : 1, 2) :
    • about the .htpasswd path : use the PerlSetVar AuthFile directive instead of AuthUserFile. Give it a path relative to the root of the virtualhost.
    • Don't encrypt the passwords : store them in plain text, as in the example above

Apache configuration rules for eZ Publish

On RCT server, load images from PROD :

Add into the Apache configuration file :
	# Get images from PROD if not available here.
	RewriteCond %{DOCUMENT_ROOT}/$1 !-f
	RewriteRule ^/(var/[^/]+/storage/.*)$ http://www.example.com/$1 [R,L,NS]

Compress content

Compression is achieved thanks to Apache modules :

Natively compressed formats (source):

Some file formats are already compressed, meaning that any extra compression won't have any noticeable effect on filesize, but will increase processing time.

How to list the loaded modules ?
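On Debian-like systems, apache2ctl -M prints the loaded modules (httpd -M on Red Hat-like systems), one per line, in the form " rewrite_module (shared)". The sketch below wraps the check for a single module; module_loaded is a hypothetical helper name that reads such a listing on stdin.

```shell
#!/bin/sh
# Succeed when the module list read on stdin contains <name>_module.
# module_loaded is a made-up name; feed it the output of "apache2ctl -M".
module_loaded() {
    grep -qE "^[[:space:]]*${1}_module[[:space:]]"
}
```

Usage : apache2ctl -M 2>/dev/null | module_loaded rewrite && echo "mod_rewrite is loaded"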

Apache offers to download .php5 files instead of processing them

Apache needs to be told that .php5 files must be processed like .php files. To do so :
  1. edit /etc/apache2/mods-available/php5.conf
  2. you will find a configuration line like : <FilesMatch "\.ph(p3?|tml)$">, declaring which files are considered as .php files.
  3. to declare .php5 files, change this into <FilesMatch "\.ph(p3?|p5|tml)$">
  4. apache2ctl restart and voilà !
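The modified expression can be tried against a few file names before restarting Apache. The sketch below uses grep -E, which behaves like Apache's regex engine for a pattern this simple; handled_as_php is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed when the file name matches the FilesMatch expression of step 3.
handled_as_php() {
    printf '%s\n' "$1" | grep -qE '\.ph(p3?|p5|tml)$'
}

for name in index.php index.php3 index.php5 page.phtml index.php4; do
    if handled_as_php "$name"; then
        echo "$name : processed by PHP"
    else
        echo "$name : served as a plain file"
    fi
done
```

Of the names tested above, only index.php4 is left out by the expression.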

Other solution with a2enmod php5 (not tested)

apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName

If the web server hosts a single site, and the server name matches the website's name :

echo "ServerName myServerName" > /etc/apache2/conf.d/fqdn

Otherwise (multiple websites, multiple virtualhosts, multiple server names per virtualhost) :

  1. echo "ServerName localhost" > /etc/apache2/conf.d/fqdn
    or, if the installation has no conf.d directory :
    echo "ServerName localhost" >> /etc/apache2/apache2.conf
  2. Make sure /etc/hosts has a line as short as : 127.0.0.1 localhost (no more aliases are necessary / welcome)

How to create a new web site on a virtual host ?

On the web server :

  1. make the new website available :
    1. cp /etc/apache2/sites-available/default /etc/apache2/sites-available/myNewWebSite
    2. edit /etc/apache2/sites-available/myNewWebSite and set values :
      • ServerName myNewWebSite (3rd line)
      • DocumentRoot /path/to/myNewWebSite/files
      • <Directory /path/to/myNewWebSite/files>
  2. enable the new website :
    1. cd /etc/apache2/sites-enabled
    2. ln -s ../sites-available/myNewWebSite
    3. apache2ctl restart

On the client :

  1. add a new entry to /etc/hosts such as : xxx.xxx.xxx.xxx myNewWebSite
  2. open your web browser at : http://myNewWebSite
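Instead of copying and editing the default file, the VirtualHost definition can be generated. This is a minimal sketch, assuming Apache 2.4 (hence Require all granted) and port 80; make_vhost is a hypothetical helper name and the generated block only contains the three settings edited above.

```shell
#!/bin/sh
# Print a minimal VirtualHost definition for a server name ($1) and a
# document root ($2). Assumption: Apache 2.4 syntax, plain HTTP on port 80.
make_vhost() {
    cat <<EOF
<VirtualHost *:80>
    ServerName $1
    DocumentRoot $2
    <Directory $2>
        Require all granted
    </Directory>
</VirtualHost>
EOF
}
```

Usage : make_vhost myNewWebSite /path/to/myNewWebSite/files > /etc/apache2/sites-available/myNewWebSite, then enable the site as described above.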

Run CGI scripts (Perl, Python, ...)

  1. edit /etc/apache2/mods-available/mime.conf and uncomment the line : AddHandler cgi-script .cgi
  2. if you want *.pl files to be recognized, append .pl to that line, which gives : AddHandler cgi-script .cgi .pl
  3. in the configuration file of your CGI application (/etc/apache2/sites-available/myCgiApp), find the section <Directory /var/www/myCgiApp>, then the line Options, and append +ExecCGI, which may look like :
    <Directory /var/www/myCgiApp>
    Options Indexes FollowSymLinks MultiViews +ExecCGI
    AllowOverride None
    # Apache 2.2 syntax; with Apache 2.4, replace the 2 lines below with : Require all granted
    Order allow,deny
    allow from all
    </Directory>

Don't forget to have a look at the error log : tail -f /var/log/apache2/error.log
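To test the setup, a CGI script only has to print its headers, a blank line, then the body. Here is a minimal sketch of such a script, written as a shell script; the file name hello.cgi is an assumption, and SERVER_NAME is one of the standard environment variables set by the web server for CGI programs.

```shell
#!/bin/sh
# Minimal CGI response : headers, a blank line, then the body.
# Save as e.g. /var/www/myCgiApp/hello.cgi (hypothetical name) and make it
# executable with chmod 755.
hello_cgi() {
    printf 'Content-Type: text/plain\n\n'
    printf 'Hello from %s\n' "${SERVER_NAME:-unknown host}"
}

hello_cgi
```

If the browser shows the script's source or offers to download it, the cgi-script handler or the ExecCGI option is not active for that directory.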