The Error 404 "Page not found" is the error page displayed whenever someone asks for a page that's simply not available on your site. The reason for this is that there may be a link on your site that was wrong, or the page might have been recently removed from the site. As there is no web page to display, the web server sends a page that simply says "404 Page not found".
The 404 error message is an HTTP (Hypertext Transfer Protocol) standard status code. This "Not Found" response code indicates that although the client could communicate with the server, the server could not find what was requested or was configured not to fulfill the request.
The 404 "Not Found" error is not the same as the "Server Not Found" error, which you see whenever a connection to the destination server could not be established at all.
The default 404 error page as shown on Internet Explorer is given below.
Whenever you visit a web page, your computer requests data from a server through HTTP. Even before the requested page is displayed in your browser, the web server sends the HTTP header that contains the status code. The status code provides information about the status of the request. A normal web page gets the status code 200, but we do not see this because the server proceeds to send the contents of the page. It is only when there is an error that we see a status code such as 404 Not Found.
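The first line of the HTTP response header carries the status code described above. The following is a minimal sketch of how that line can be split into its parts; the function name `parse_status_line` is an assumption made for this illustration, and the sketch expects a well-formed line that includes a reason phrase.

```python
def parse_status_line(line):
    """Split an HTTP status line into (version, code, reason phrase)."""
    # A status line looks like: "HTTP/1.1 404 Not Found"
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 404 Not Found"))
print(parse_status_line("HTTP/1.0 200 OK"))
```

The reason phrase may itself contain spaces, which is why the line is split at most twice.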
HTTP status codes were established in 1992 as part of the early HTTP specification work that followed HTTP 0.9, which had no status codes of its own. Tim Berners-Lee, who invented the web and the first web browser in 1990, defined the status codes.
A brief overview of HTTP status codes is given below.
Code | Meaning | Description |
100 | Continue | Confirms to the client that the first part of the request has been received and that it should continue with the rest of the request, or ignore this response if the request has already been completed. |
101 | Switching Protocols | Informs the client that the server is switching to the protocol specified in the Upgrade message header field for the current connection. |
200 | OK | Standard response for successful requests |
201 | Created | Request fulfilled and new resource created |
202 | Accepted | Request accepted, but not yet processed |
203 | Non-Authoritative Information | Returned meta information was not the definitive set from the origin server. |
204 | No Content | Request succeeded without requiring the return of an entity-body |
205 | Reset Content | Request succeeded, but the client should reset the document view that caused the request |
206 | Partial Content | Partial GET request was successful |
300 | Multiple Choices | Requested resource has multiple choices at different locations. |
301 | Moved Permanently | Resource permanently moved to a different URL. |
302 | Found | Requested resource was found under a different URL but the client should continue to use the original URL. |
303 | See Other | Response to the request can be found at a different URL and should be retrieved using a GET request. |
304 | Not Modified | Resource not modified since the last request. |
305 | Use Proxy | Requested resource should be accessed through the proxy specified in the location field. |
306 | No Longer Used | Reserved for future use |
307 | Temporary Redirect | Resource has been moved temporarily to a different URL. |
400 | Bad Request | Syntax of the request not understood by the server. |
401 | Unauthorized | Request requires user authentication |
402 | Payment Required | Reserved for future use. |
403 | Forbidden | Server refuses to fulfill the request. |
404 | Not Found | Document or file requested by the client was not found. |
405 | Method Not Allowed | Method specified in the Request-Line was not allowed for the specified resource. |
406 | Not Acceptable | Requested resource can only generate response entities whose content characteristics are not acceptable according to the Accept headers sent in the request. |
407 | Proxy Authentication Required | Request requires the authentication with the proxy. |
408 | Request Timeout | Client fails to send a request in the time allowed by the server. |
409 | Conflict | Request was unsuccessful due to a conflict in the state of the resource. |
410 | Gone | Resource requested is no longer available with no forwarding address |
411 | Length Required | Server doesn’t accept the request without a valid Content-Length header field. |
412 | Precondition Failed | Precondition specified in the Request-Header field returns false. |
413 | Request Entity Too Large | Request unsuccessful as the request entity is larger than that allowed by the server |
414 | Request-URI Too Long | Request unsuccessful as the URL specified is longer than the server is willing to process. |
415 | Unsupported Media Type | Request unsuccessful as the entity of the request is in a format not supported by the requested resource |
416 | Requested Range Not Satisfiable | Request included a Range request-header field, but none of the requested ranges overlap the current extent of the resource |
417 | Expectation Failed | Expectation given in the Expect request-header was not fulfilled by the server. |
422 | Unprocessable Entity | Request well-formed but unable to process because of semantic errors |
423 | Locked | Resource accessed was locked |
424 | Failed Dependency | Request failed because of the failure of a previous request |
426 | Upgrade Required | Client should switch to a different protocol, such as Transport Layer Security |
500 | Internal Server Error | Request unsuccessful because of an unexpected condition encountered by the server. |
501 | Not Implemented | Request unsuccessful as the server could not support the functionality needed to fulfill the request. |
502 | Bad Gateway | Server received an invalid response from the upstream server while trying to fulfill the request. |
503 | Service Unavailable | Request unsuccessful due to the server being down or overloaded. |
504 | Gateway Timeout | Upstream server failed to send a response in the time allowed by the gateway or proxy. |
505 | HTTP Version Not Supported | Server does not support the HTTP version specified in the request. |
When we expand the code 404, the first digit, 4, represents a client error. The server indicates that you made a mistake, such as misspelling the URL or requesting a page that is no longer available.
The middle digit, 0, and the last digit, 4, have no separate meaning in the HTTP specification; only the first digit defines the class of the response. Together, they identify this specific error within the 4xx group.
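The role of the first digit can be sketched in a few lines of Python. This is an illustration only: the `describe` function and the `STATUS_CLASSES` table are names made up for this example, while `http.HTTPStatus` is part of the Python standard library and supplies the standard reason phrases. Codes that the standard library does not define (such as 306) would raise a `ValueError` here.

```python
from http import HTTPStatus

# The first digit of a status code selects one of five response classes.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return the code, its standard reason phrase, and its class."""
    status_class = STATUS_CLASSES[code // 100]
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({status_class})"


print(describe(404))
print(describe(200))
```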
The World Wide Web Consortium (W3C) states that 404 Not Found should be used in cases where the server fails to find the requested location and is unsure of its status. Whenever a page has been permanently removed, the status code used should be 410. In practice, 410 pages are rarely seen. Instead, the 404 Not Found page has become the most commonly used error page.
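The distinction between 404 and 410 can be expressed as a small decision function. This is a sketch under stated assumptions: the function name `status_for` and the two sets of paths are hypothetical, and a real site would look this information up in its own records.

```python
def status_for(path, live_pages, removed_pages):
    """Pick a status code for a requested path."""
    if path in live_pages:
        return 200  # the page exists
    if path in removed_pages:
        return 410  # known to have been permanently removed
    return 404      # not found, and nothing is known about why


live = {"/index.htm", "/support.html"}
removed = {"/old-offer.htm"}
print(status_for("/old-offer.htm", live, removed))
print(status_for("/typo.htm", live, removed))
```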
A 404 response code is always followed by a human-readable reason phrase, as per the HTTP specification. By default, a web server generally issues an HTML page that includes the 404 code and the “Not Found” phrase. You can configure a web server to display a branded page with a better description and a search form. The protocol-level phrase needs no customization, as it is hidden from the user.
Soft 404 errors are “Not Found” errors that a web server returns as a standard web page with a 200 OK response code. Soft 404 errors are problematic for automated processes that discover broken links, because the status code signals success.
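A link checker can try to spot soft 404 errors by looking at the page text as well as the status code. The sketch below is only a heuristic: the phrase list and the name `is_soft_404` are assumptions for this illustration, and a page that legitimately contains one of the phrases would be flagged by mistake.

```python
# Phrases that suggest an error page (an illustrative, incomplete list).
NOT_FOUND_PHRASES = ("page not found", "404 not found", "error 404")


def is_soft_404(status, body):
    """Return True when a 200 response looks like a "Not Found" page."""
    if status != 200:
        return False  # a real error code is not a soft 404
    text = body.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)


print(is_soft_404(200, "<h1>Page Not Found</h1>"))
print(is_soft_404(404, "<h1>Page Not Found</h1>"))
```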
The BT Group in the UK operates the Cleanfeed content blocking system, which returns a 404 error for requests for content identified as illegal by the Internet Watch Foundation. A fake 404 error is also returned when a user tries to access websites censored by the government.
A sample web trends’ summary report by ARCHIVI shows the client error details for 404 Page.
Error | Hits | % of Failed Hits |
000 Incomplete / Undefined | 29,164 | 69.62% |
404 Page or File Not Found | 12,651 | 30.2% |
400 Bad Request | 57 | 0.13% |
18745 Incomplete / Undefined | 5 | 0.01% |
18747 Incomplete / Undefined | 4 | 0% |
401 Unauthorized Access | 4 | 0% |
Total | 41,885 | 100% |
Web statistics generally vary from month to month, and the percentage of 404 errors depends on the strategy used to eliminate them and on how active the website is. Most active websites that frequently change or add content generally experience a higher number of Page Not Found errors. However, many large and busy sites achieve zero percent 404 errors over a period. On average, around 7% of visits to any given website result in a 404 error page.
A sample line from a common transfer log file is given below.
revacsystems.com - - [18/June/2008:12:13:03 -0700]
GET /download/windows/happiness.zip HTTP/1.0 200 9887
http://www.payoneer.com/ Mozilla/4.7 [en]C-SYMPA (Win95; U)
Address or DNS | revacsystems.com |
RFC931 | - |
AuthUser | - |
TimeStamp | [18/June/2008:12:13:03 -0700] |
Access Request | GET /download/windows/happiness.zip HTTP/1.0 |
Status Code | 200 |
Transfer Volume | 9887 |
Referer URL | http://www.payoneer.com/ |
User Agent | Mozilla/4.7 [en]C-SYMPA (Win95; U) |
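The sample entry above, joined onto one line, can be split into the fields listed in the table with a regular expression. This is a minimal sketch that assumes the unquoted layout shown in the sample; many real server logs put the request, referer, and user agent in double quotes, so the pattern would need adjusting. The field names and the function name `parse_log_line` are labels chosen for this illustration.

```python
import re

# One named group per field in the table above.
LOG_PATTERN = re.compile(
    r"(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) "
    r"\[(?P<timestamp>[^\]]+)\] "
    r"(?P<request>\S+ \S+ \S+) "
    r"(?P<status>\d{3}) (?P<size>\d+|-) "
    r"(?P<referer>\S+) (?P<agent>.*)"
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields


SAMPLE = (
    "revacsystems.com - - [18/June/2008:12:13:03 -0700] "
    "GET /download/windows/happiness.zip HTTP/1.0 200 9887 "
    "http://www.payoneer.com/ Mozilla/4.7 [en]C-SYMPA (Win95; U)"
)
entry = parse_log_line(SAMPLE)
print(entry["status"], entry["request"])
```

Counting the entries whose `status` is 404 gives the kind of figure reported in the summary table earlier.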
Redirects are very useful when used in conjunction with a 404 error page. To redirect a page, simply follow the steps given below.
1. Create a file "notfound404.htm" with a message such as:
"Sorry, this page was not found. In a few seconds, you will be redirected to the Home page."
2. Allow 5 seconds for reading the message and then redirect.
3. A sample redirect code is:
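<html>
<head>
<!-- Wait 5 seconds, then load not404.htm -->
<meta http-equiv="refresh" content="5; url=not404.htm">
<title>Page Not Found</title>
</head>
<body>
<p>Sorry, this page was not found. In a few seconds, you will be redirected to the Home page.</p>
</body>
</html>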
Note: The number at the start of the CONTENT value specifies how many seconds the user has to read the message before being redirected.
A robots.txt file is useful when there are frequently changing sections on your website. To use a robots.txt file, simply follow the steps given below.
1. Create a file “robots.txt” in the root directory.
2. A sample robots.txt code is:
User-agent: *
Disallow: /disappearing/
Disallow: /soontobe404.htm
Note: User-agent: * applies to all search engines. The Disallow directive helps you block complete directories or only the individual files that change.
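The effect of the sample rules can be checked with the `urllib.robotparser` module from the Python standard library. In this sketch the rules are parsed from a string rather than fetched from a site; the crawler name `ExampleBot` and the tested paths other than those in the sample are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /disappearing/
Disallow: /soontobe404.htm
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if the sample rules let the crawler fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


print(is_allowed("/disappearing/page.htm"))
print(is_allowed("/soontobe404.htm"))
print(is_allowed("/index.htm"))
```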
According to a recent poll conducted by webmasters, only 23% of visitors who encounter a 404 page make a second attempt to find the missing page. The remaining 77% make no further effort to find it. Customizing the error page on your site therefore increases your chances of keeping those visitors. A custom error page is an added feature and an advantage for your website. It shows that you care about your visitors and have made an attempt to retain them.
Besides, there are many problems with the standard error page. They are:
To avoid these problems and provide your visitors with a better user experience, it is always ideal to customize a 404 error page.
A good 404 error page conveys the right message and leads visitors to where they intended to go.
Note: There is no need for you to do all of these things on a single error page. But following these simple tactics will help your visitors stay on your site.
Internet Explorer versions 5 and below do not display your customized error page if its size is less than 512 bytes. Make sure that your customized error page is larger than 512 bytes. Note that this does not include graphics. The best way is to add some filler text in comment lines in the source code of your file.
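The size check and the filler comment can be automated. The following is a rough sketch, assuming the page is available as a string and is saved as UTF-8; the function name `pad_error_page` is made up for this example, and the 512-byte threshold is the one given in the text.

```python
MIN_BYTES = 512


def pad_error_page(html, min_bytes=MIN_BYTES):
    """Append an HTML comment so the page is larger than min_bytes."""
    size = len(html.encode("utf-8"))
    if size > min_bytes:
        return html  # already large enough
    # "<!--" and "-->" add 7 bytes; spaces fill the remaining gap.
    filler = " " * max(min_bytes + 1 - size - 7, 0)
    return html + "<!--" + filler + "-->"


page = "<html><body><h1>Error 404</h1></body></html>"
padded = pad_error_page(page)
print(len(page.encode("utf-8")), len(padded.encode("utf-8")))
```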
Internet Explorer versions 5 and above have a friendly built-in error page, but this is no replacement for a customized error page. You can even change this default behavior of Internet Explorer.
1. Go to the Tools menu in Internet Explorer and click Internet Options.
2. The Internet Options dialog box appears. Click the Advanced tab.
3. Uncheck the "Show friendly HTTP error messages" check box and then click the OK button.
The hypertext access (.htaccess) file is Apache's directory-level configuration file, which allows you to manipulate the behavior of the server. Often, the .htaccess file is accompanied by an .htpasswd file that stores valid usernames and their passwords.
The .htaccess file is used to specify the security restrictions of a particular directory. It also lets the server rewrite URLs and control user agent caching so that bandwidth usage, server load, and perceived lag are reduced. More commonly, it is used for customizing the error messages for server errors.
Before using a .htaccess file, note that ".htaccess" is the full filename and not a file extension. For example, you cannot create a file called "error.htaccess"; the file is simply called ".htaccess".
When you place a ".htaccess" file in any directory, it is loaded by the Apache web server software. The file affects all other files in the directory it is placed in, including the subdirectories. You can use a text editor such as TextPad, UltraEdit, or MS WordPad to create a .htaccess file.
After you create the .htaccess file, you must upload it using an FTP (File Transfer Protocol) program. It is also important to upload the file in "ASCII" mode.
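If you prefer a script to an FTP program, Python's standard `ftplib` module can perform an ASCII-mode upload through its `storlines` method. The sketch below assumes an already opened and logged-in `ftplib.FTP` session; the helper name `upload_ascii`, the host name, and the credentials in the usage comment are placeholders, not values from this article.

```python
from ftplib import FTP


def upload_ascii(ftp, local_path, remote_name):
    """Upload a text file in ASCII mode using an open FTP session."""
    # storlines() transfers the file line by line in ASCII mode;
    # ftplib expects the local file to be opened in binary mode.
    with open(local_path, "rb") as handle:
        return ftp.storlines("STOR " + remote_name, handle)


# Usage (hypothetical host and credentials):
# with FTP("ftp.yourdomain.com") as ftp:
#     ftp.login("user", "password")
#     upload_ascii(ftp, ".htaccess", ".htaccess")
```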
You cannot customize your 404 error page unless your web host has enabled this facility for your website.
If your web host offers this facility, it is usually mentioned somewhere in their documentation. In fact, if they mention that you can customize a file named ".htaccess", it probably means that you can also customize your 404 File Not Found error page.
The steps in customizing a 404 error page are given below.
Step 1: Create a customized 404 File Not Found Error page.
Step 2: Create a .htaccess file
Step 3: Upload both files.
Step 4: Test the page.
Step 1: Create a customized 404 File Not Found Error page.
We’ll create a custom 404 error page based on an HTML/PHP script. This code sends an email notice whenever the error page loads. The notice contains the date and time of the visit, the IP address of the visitor, the attempted URL, the visitor’s browser information, and the bad link that led to the error page.
Here’s the sample code:
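<html>
<head>
<title>404 Error Page</title>
</head>
<body>
<h1 align="center">Error 404</h1>
<p align="center">Page Not Found</p>
<?php
// Collect details about the failed request from the server environment.
$ip = getenv("REMOTE_ADDR");
$requri = getenv("REQUEST_URI");
$servname = getenv("SERVER_NAME");
$combine = $ip . " tried to load " . $servname . $requri;
$httpref = getenv("HTTP_REFERER");       // the link that led here, if any
$httpagent = getenv("HTTP_USER_AGENT");  // the visitor's browser
$today = date("D M j Y g:i:s a T");
$note = "You are on the wrong page!";

// HTML version shown to the visitor. Request data is escaped with
// htmlspecialchars() so that a crafted URL cannot inject markup.
$message = htmlspecialchars($today) . "<br>\n"
    . htmlspecialchars((string) $combine) . "<br>\n"
    . "User Agent = " . htmlspecialchars((string) $httpagent) . "\n"
    . "<h2>" . $note . "</h2>\n"
    . "<br>" . htmlspecialchars((string) $httpref);

// Plain-text version sent by email.
$message2 = "$today\n$combine\nUser Agent = $httpagent\n$note\n$httpref";

$to = "error@yourdomain.com";
$subject = "yourdomain Error Page";
$from = "From: fake@yourdomain.com\r\n";
mail($to, $subject, $message2, $from);
echo $message;
?>
<p>Visit our Home Page: <a href="http://yourdomain.com/">yourdomain.com</a></p>
</body>
</html>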
1. Copy this code and paste it into Notepad as shown below.
2. Replace "yourdomain.com" with the URL of your website.
3. Save the file as 404NotFound.php.
Step 2: Create a .htaccess file
1. Open a new file in Notepad.
2. Add the following line to the file:
ErrorDocument 404 /errors/404NotFound.php
3. Save the file as .htaccess. The name must be all lowercase i.e. .htaccess
Note: When you create a new .htaccess file, the resulting file may be saved with a .txt extension (for example, .htaccess.txt). If that is the case, remove the extension and rename the file to .htaccess.
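One way to avoid the unwanted .txt extension is to create the file from a script instead of a text editor. The following is a minimal sketch in Python; the function name `write_htaccess` is an assumption for this example, and the directive is the one from the steps above.

```python
import os

ERROR_DOCUMENT_LINE = "ErrorDocument 404 /errors/404NotFound.php\n"


def write_htaccess(directory):
    """Create a file named exactly ".htaccess" in the given directory."""
    path = os.path.join(directory, ".htaccess")
    # newline="\n" keeps plain line endings regardless of platform.
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        handle.write(ERROR_DOCUMENT_LINE)
    return path
```

Calling `write_htaccess(".")` creates the file in the current directory, ready for upload.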
Step 3: Upload both files.
1. Open the FTP tool through which you upload your website files.
2. Upload the 404NotFound.php file to the folder named in the .htaccess file (/errors/ in this example).
3. Upload the .htaccess file to the top directory.
Step 4: Test the page.
1. Now test your error page by typing a URL that you know does not exist. Your custom error page should load in place of the default one.
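The same test can be scripted. The sketch below, using only the Python standard library, builds an address that almost certainly does not exist, fetches it, and checks that the server answers with a 404 status and the custom page. The function names are made up for this example, and the marker text "Error 404" assumes the sample page shown earlier.

```python
import uuid
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen


def probe_url(base_url):
    """Build a URL that almost certainly does not exist on the site."""
    return urljoin(base_url, "/missing-" + uuid.uuid4().hex + ".htm")


def fetch(url):
    """Return (status, body); urlopen raises HTTPError for 4xx and 5xx."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except HTTPError as error:
        return error.code, error.read().decode("utf-8", "replace")


def looks_correct(status, body, marker="Error 404"):
    """True when the response is a real 404 that shows the custom page."""
    return status == 404 and marker in body


# Usage (replace the address with your own site):
# status, body = fetch(probe_url("http://www.yourdomain.com/"))
# print(looks_correct(status, body))
```

A result of False with a status of 200 would point to a soft 404, where the page is displayed but the wrong status code is sent.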
The most common error is related to the URL. If you specify the URL incorrectly in the .htaccess file, the web server goes into a loop when a visitor tries to access a missing file.
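A quick way to guard against a wrong URL in the .htaccess file is to confirm that the configured error page really exists under the site's root folder before uploading. This is a simplified sketch: the function names are hypothetical, it only understands a plain "ErrorDocument 404 /path" line written with that exact capitalization, and it assumes you have a local copy of the site.

```python
import os


def error_document_target(htaccess_text, status="404"):
    """Return the path configured for the given status, or None."""
    for line in htaccess_text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "ErrorDocument" and parts[1] == status:
            return parts[2]
    return None


def target_exists(htaccess_text, document_root):
    """True when the configured error page is a file under document_root."""
    target = error_document_target(htaccess_text)
    if target is None or not target.startswith("/"):
        return False
    return os.path.isfile(os.path.join(document_root, target.lstrip("/")))


print(error_document_target("ErrorDocument 404 /errors/404NotFound.php"))
```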
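Another common error is the insertion of invalid hyperlinks on the 404 Not Found page. When you provide hyperlinks to other pages on the 404 Not Found page, make sure that they work and are not relative links. The error page is served at the address of the missing page, which can be in any directory, so a relative link would be resolved against that directory. For instance, use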
<a href="http://www.sample.com/support.html">Support</a>
instead of
<a href="support.html">Support</a>
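The difference between the two links can be demonstrated with `urljoin` from the Python standard library, which resolves a link the same way a browser does. The address of the missing page used here is a hypothetical example.

```python
from urllib.parse import urljoin


def resolve(error_page_url, href):
    """Resolve a link as a browser would on the page at error_page_url."""
    return urljoin(error_page_url, href)


# The error page is shown at the address of the missing page.
missing = "http://www.sample.com/products/old/missing.html"
print(resolve(missing, "support.html"))
print(resolve(missing, "http://www.sample.com/support.html"))
```

The relative link resolves to http://www.sample.com/products/old/support.html, which is unlikely to exist, while the absolute link always leads to http://www.sample.com/support.html.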