6 Things to Know About the Hash (#) in URLs

6 Things You Should Know About Fragment URLs

1. A Fragment URL Specifies A Location Within A Page

Any URL that contains a # character is a fragment URL. The portion of the URL to the left of the # identifies a resource that can be downloaded by a browser, and the portion on the right, known as the fragment identifier, specifies a location within the resource.

In HTML documents, the browser looks for an anchor tag <a> with a name attribute matching the fragment (an element whose id attribute matches works as well). For example, for a URL such as http://www.httpwatch.com/features.htm#print the browser finds a matching tag in the Printing Support heading:

<h3><a name="print"></a>Printing Support</h3>

and scrolls the page to display that section.
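The split described above can be sketched in a few lines of JavaScript. This is only an illustration: splitFragment is a hypothetical helper name, not something from the original article, and it simply cuts the string at the first # character.

```javascript
// Hypothetical helper: split a URL string at the first '#'.
// Everything before it names the resource; everything after it is the
// fragment identifier that the browser uses to pick a location.
function splitFragment(urlString) {
  const hashIndex = urlString.indexOf("#");
  if (hashIndex === -1) {
    // No '#': the whole string is the resource and there is no fragment.
    return { resource: urlString, fragment: null };
  }
  return {
    resource: urlString.slice(0, hashIndex),
    fragment: urlString.slice(hashIndex + 1),
  };
}

const parts = splitFragment("http://www.httpwatch.com/features.htm#print");
console.log(parts.resource); // http://www.httpwatch.com/features.htm
console.log(parts.fragment); // print
```

A browser would then look for the element named by parts.fragment and scroll to it; that step needs a DOM and is left out here.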

2. Fragments Are Not Sent in HTTP Request Messages

If you try using fragment URLs in an HTTP sniffer like HttpWatch, you’ll never see the fragment IDs in the requested URL or Referer header. The reason is that the fragment identifier is only used by the browser – it doesn’t affect which resource is returned from the server.

Here’s a screen shot of HttpWatch showing the traffic generated by refreshing a fragment URL:

So don’t expect to see fragment identifiers in your server-side code.

3. Anything After the First # is a Fragment Identifier

It doesn’t matter if the first # appears to be contained within the host name, path or query string – it always indicates where the fragment identifier starts.

For example, here’s a URL that attempts to encode an HTML color and shape into the query string:

http://example.com/?color=#ffff&shape=circle

Unfortunately, the # in the HTML color makes the rest of the URL a fragment identifier, so the server sees a single, empty color parameter in the query string and no shape parameter at all.
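The effect is easy to reproduce with the standard URL class, which is available in browsers and in Node. The sketch below parses the URL from the example and then shows one way to build a safe version; buildShapeUrl is a hypothetical helper introduced only for illustration.

```javascript
// Parse the problematic URL: the first '#' ends the query string.
const broken = new URL("http://example.com/?color=#ffff&shape=circle");
console.log(broken.searchParams.get("color")); // "" (empty value)
console.log(broken.searchParams.has("shape")); // false
console.log(broken.hash);                      // "#ffff&shape=circle"

// Hypothetical helper: let URLSearchParams do the escaping, so that a
// literal '#' in a value is sent as %23 instead of starting a fragment.
function buildShapeUrl(color, shape) {
  const url = new URL("http://example.com/");
  url.searchParams.set("color", color);
  url.searchParams.set("shape", shape);
  return url.href;
}

console.log(buildShapeUrl("#ffff", "circle"));
// http://example.com/?color=%23ffff&shape=circle
```

Building the query through searchParams rather than by string concatenation means the # never has to be escaped by hand.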

4. Changing A Fragment ID Doesn’t Reload a Page but Does Create History

Fragments have a couple of handy features. First, if you manually change a fragment URL from something like this:

http://www.httpwatch.com/features.htm#filter

to this:

http://www.httpwatch.com/features.htm#print

the browser scrolls the page to the new location but doesn’t reload the page.

However, it does add an entry in the browser’s history so that clicking the Back button will go back to the original location in the page.

These features are particularly useful when used with JavaScript (see below) to create linkable URLs and history for pages that either use top-level HTML frames or update their content dynamically with Ajax calls.

5. JavaScript Can Use window.location.hash to Change Fragment IDs

The hash property of window.location allows JavaScript to read and change the current page’s fragment identifier. As described in point 4, this can be used to add history entries for a page without forcing a complete reload.

We recently deployed the help and automation reference for HttpWatch on our web site using the frame-based HTML generated by the help authoring tool. Although the content was easily accessible in the browser, the URL in the location bar didn’t change as you moved between topics, making it practically impossible to share URLs for topics of interest.

The solution was to use fragment identifiers and JavaScript to create linkable URLs. The fragment identifier specifies the embedded help topic page.
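The article does not show the script itself, so the following is only a rough sketch of how such a scheme might look. The function names, the "topic-frame" element id, the welcome.htm default page and the file name pattern are all assumptions made for illustration.

```javascript
// Hypothetical sketch: derive the help topic page from the fragment.
// Only simple file names such as "intro.htm" are accepted, so that a
// crafted fragment cannot point the frame at an arbitrary site.
function topicFromHash(hash, defaultTopic) {
  const topic = hash.startsWith("#") ? hash.slice(1) : hash;
  return /^[\w-]+\.html?$/.test(topic) ? topic : defaultTopic;
}

// Load the topic named by the current fragment into the content frame.
function syncTopicFrame(win, frame, defaultTopic) {
  frame.src = topicFromHash(win.location.hash, defaultTopic);
}

// Browser-only wiring; skipped when run outside a browser.
if (typeof window !== "undefined") {
  const frame = document.getElementById("topic-frame"); // assumed id
  const update = () => syncTopicFrame(window, frame, "welcome.htm");
  window.addEventListener("hashchange", update, false);
  update();
}
```

With wiring like this, a link to the page followed by #intro.htm would open the frameset with that topic showing, and navigation code could set window.location.hash to keep the location bar in step with the visible topic.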

6. Googlebot Ignores Fragments By Default

The Googlebot is responsible for crawling sites to find content and embedded links that will become part of the Google search index. It fetches and parses HTML, but it’s not a full-blown browser and doesn’t have a JavaScript engine. As a consequence, it will normally ignore fragment identifiers and just look at the resource returned from the web server. Any JavaScript used by your page to load or build content will not be executed.

This means it would be impossible for Ajax driven sites to be indexed and have their fragment URLs returned directly in Google searches. To overcome this problem Google supports a convention that allows the Googlebot to turn fragment identifiers into query string parameters.

To use this indexing scheme you would first need to change all your fragment identifiers to start with a ! symbol:

http://www.example.com/ajax.html#mystate

would need to change to:

http://www.example.com/ajax.html#!mystate

The presence of the leading ! indicates to Google that you support this scheme.

Also, your page needs to be able to supply the HTML for a given state in response to a query string parameter named _escaped_fragment_. When the Googlebot needs the content for a given state it supplies the fragment identifier using a simple GET request and a query string value:

http://www.example.com/ajax.html?_escaped_fragment_=mystate
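The rewriting step can be sketched as a small function. This is an illustration rather than Google's actual implementation: toEscapedFragmentUrl is a hypothetical name, and the set of characters it escapes (%, #, &, + and space) is a simplifying assumption.

```javascript
// Hypothetical sketch of the "#!" to _escaped_fragment_ mapping.
// Returns null when the URL does not opt in with a leading "!".
function toEscapedFragmentUrl(urlString) {
  const hashIndex = urlString.indexOf("#");
  if (hashIndex === -1 || urlString[hashIndex + 1] !== "!") {
    return null;
  }
  const base = urlString.slice(0, hashIndex);
  const state = urlString.slice(hashIndex + 2);
  // Assumption: escape only the characters that would break a query string.
  const escaped = state.replace(/[%#&+ ]/g, (ch) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")
  );
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + escaped;
}

console.log(toEscapedFragmentUrl("http://www.example.com/ajax.html#!mystate"));
// http://www.example.com/ajax.html?_escaped_fragment_=mystate
console.log(toEscapedFragmentUrl("http://www.example.com/ajax.html#mystate"));
// null
```

On the server side, the page would read the _escaped_fragment_ parameter and return the HTML that the browser would otherwise have built for that state.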

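Supplement: the Chinese write-up

The Hash (#) in URLs

Author: Ruan Yifeng (阮一峰)

Date: March 9, 2011

Last September, Twitter launched a redesign.

One notable change was the appearance of "#!" in its URLs. For example, before the redesign a user's home page had the URL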

　　http://twitter.com/username

After the redesign it became

　　http://twitter.com/#!/username

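As far as I recall, this was the first time a mainstream site used "#" at scale in key URLs that users interact with directly. It suggests that the role of the hash (#) is being reassessed. Based on the HttpWatch article, this post collects the important points about the hash.

1. What # Means

# denotes a location within a web page. The characters to its right are the identifier of that location. For example,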

　　http://www.example.com/index.html#print

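refers to the print location in the page index.html. After reading this URL, the browser automatically scrolls the print location into view.

There are two ways to give a location in a page an identifier: an anchor, such as <a name="print"></a>, or an id attribute, such as <div id="print">.

2. HTTP Requests Do Not Include #

The # exists to direct the browser and is of no use to the server, so HTTP requests do not include it.

For example, when you visit the following URL,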

　　http://www.example.com/index.html#print

the request the browser actually sends looks like this:

　　GET /index.html HTTP/1.1

　　Host: www.example.com

As you can see, only index.html is requested; the "#print" part is not there at all.
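The request shown above can be derived mechanically from the URL. The sketch below uses the standard URL class; buildRequestHead is a hypothetical helper that produces only the request line and the Host header, not a complete HTTP client.

```javascript
// Hypothetical helper: the first two lines of the HTTP request a
// browser would send for a URL. The fragment (url.hash) is never used.
function buildRequestHead(urlString) {
  const url = new URL(urlString);
  const target = url.pathname + url.search; // path plus query, no fragment
  return [`GET ${target} HTTP/1.1`, `Host: ${url.host}`];
}

console.log(buildRequestHead("http://www.example.com/index.html#print").join("\n"));
// GET /index.html HTTP/1.1
// Host: www.example.com
```

Changing the fragment in the input leaves the output identical, which is why server-side code never sees it.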

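3. Characters After #

Any characters that appear after the first # are interpreted by the browser as a location identifier, which means none of them are sent to the server.

For example, the following URL is meant to specify a color value: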

　　http://www.example.com/?color=#fff

But the request the browser actually sends is:

　　GET /?color= HTTP/1.1

　　Host: www.example.com

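As you can see, "#fff" has been dropped. The browser treats # as a literal character only when it is encoded as %23. In other words, the URL above should be written as: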

　　http://www.example.com/?color=%23fff

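4. Changing # Does Not Trigger a Page Reload

Changing only the part after # makes the browser scroll to the corresponding location without reloading the page.

For example, changing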

　　http://www.example.com/index.html#location1

to

　　http://www.example.com/index.html#location2

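does not make the browser request index.html from the server again.

5. Changing # Changes the Browser History

Every change to the part after # adds an entry to the browser's history, so the Back button returns to the previous location.

This is especially useful for Ajax applications: different # values can represent different states, and users can be given a link that leads to a particular state.

Note that this rule does not hold for IE 6 and IE 7, which do not add a history entry when the # changes.

6. Reading the # Value with window.location.hash

The window.location.hash property is both readable and writable. Reading it lets you determine whether the page state has changed; writing it creates a history entry without reloading the page.

7. The onhashchange Event

This event is new in HTML 5 and fires whenever the # value changes. It is supported by IE8+, Firefox 3.6+, Chrome 5+, and Safari 4.0+.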

There are three ways to use it:

　　window.onhashchange = func;

　　<body onhashchange="func();">

　　window.addEventListener("hashchange", func, false);

For browsers that do not support onhashchange, you can use setInterval to monitor location.hash for changes.
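A polling fallback of this kind might look like the sketch below. The name createHashPoller and the 100 ms interval are assumptions chosen for illustration; the comparison logic is kept separate from the timer so that it can be exercised without a browser.

```javascript
// Hypothetical helper: returns a function that, each time it is called,
// compares the current hash with the last one seen and reports a change.
function createHashPoller(readHash, onChange) {
  let lastHash = readHash();
  return function check() {
    const currentHash = readHash();
    if (currentHash !== lastHash) {
      const previousHash = lastHash;
      lastHash = currentHash;
      onChange(currentHash, previousHash);
    }
  };
}

// Browser-only wiring: poll only when the native event is unavailable.
if (typeof window !== "undefined" && !("onhashchange" in window)) {
  const check = createHashPoller(
    () => window.location.hash,
    (newHash) => console.log("hash changed to", newHash)
  );
  setInterval(check, 100); // assumed polling interval
}
```

A shorter interval reacts faster at the cost of more wake-ups; where the hashchange event exists, it is the better choice.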

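8. How Google Crawls #

By default, Google's crawler ignores the # part of a URL.

However, Google also specifies that if you want Ajax-generated content to be read by the search engine, the URL can use "#!", and Google will automatically turn whatever follows it into the value of the _escaped_fragment_ query string parameter.

For example, when Google finds a URL on the new Twitter such as: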

　　http://twitter.com/#!/username

it automatically fetches another URL:

　　http://twitter.com/?_escaped_fragment_=/username

Through this mechanism, Google can index dynamic Ajax content.

(End)
