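This program uses Python's built-in HTMLParser to scrape a few fields from a given Yahoo Finance page. The code is about 30 lines long, simple and practical, a must-have at home or on the road :)
The code is adapted from the HTMLParser example in the official documentation: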
https://docs.python.org/2/library/htmlparser.html
The full code and a write-up are at:
https://bitbucket.org/lsz/html-parser
The code (Python 2) is as follows:
import urllib
import sys
import string
from HTMLParser import HTMLParser

ticker_list = ["ibb", "socl", "pnqi", "qqq", "vbk", "eirl", "ewi", "pbd", "ita", "dfe"]
ticker = ticker_list[0]

class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        # Text of the most recently opened start tag; None before any tag is seen,
        # so it is wrapped in str() before searching.
        starttag_text = str(self.get_starttag_text())
        ticker_str = "(%s)" % ticker
        # Fund name: text inside <h2> that contains "(TICKER)".
        if -1 != string.find(data, ticker_str.upper()) and -1 != string.find(starttag_text, "<h2>"):
            sys.stdout.write(data)
        # First value: element whose start tag contains yfs_g53_<ticker>; skip text containing "-".
        if -1 != string.find(starttag_text, "yfs_g53_%s" % ticker.lower()) and -1 == string.find(data, "-"):
            sys.stdout.write("\t")
            sys.stdout.write(data)
        # Second value: element whose start tag contains yfs_h53_<ticker>; print ends the line.
        if -1 != string.find(starttag_text, "yfs_h53_%s" % ticker.lower()):
            print "\t", data

for t in ticker_list:
    ticker = t
    parser = MyHTMLParser()
    f = urllib.urlopen("http://finance.yahoo.com/q?s=%s" % ticker)
    html_string = f.read()
    parser.feed(html_string)
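For readers on Python 3, here is a rough sketch of the same idea. It is not the author's code. In Python 3 the `HTMLParser` module became `html.parser`, and `urllib.urlopen` moved to `urllib.request.urlopen`. To keep the example runnable without network access, it parses a small hand-written HTML snippet. The snippet, the class name `QuoteParser`, and the `fields` list are all assumptions made for illustration. Only the `<h2>` tag and the `yfs_g53_` / `yfs_h53_` id prefixes come from the original code, and the real page layout may differ.

```python
from html.parser import HTMLParser


class QuoteParser(HTMLParser):
    """Collect the fund name and two quote fields for one ticker (illustrative)."""

    def __init__(self, ticker):
        super().__init__()
        self.ticker = ticker
        self.fields = []  # fund name, then the g53 and h53 values, in page order

    def handle_data(self, data):
        # get_starttag_text() returns the most recently opened start tag,
        # or None if no start tag has been seen yet.
        start_tag = self.get_starttag_text() or ""
        t = self.ticker.lower()
        if "(%s)" % self.ticker.upper() in data and "<h2>" in start_tag:
            self.fields.append(data)
        elif "yfs_g53_%s" % t in start_tag and "-" not in data:
            self.fields.append(data)
        elif "yfs_h53_%s" % t in start_tag:
            self.fields.append(data)


# Hypothetical markup modelled on the ids the original code searches for;
# it is NOT copied from a real Yahoo Finance page.
SAMPLE_HTML = (
    '<div><h2>iShares Nasdaq Biotechnology (IBB)</h2>'
    '<span id="yfs_g53_ibb">228.14</span> - '
    '<span id="yfs_h53_ibb">234.90</span></div>'
)

parser = QuoteParser("ibb")
parser.feed(SAMPLE_HTML)
parser.close()

# Join the collected fields with tabs, matching the original output layout.
row = "\t".join(parser.fields)
print(row)
```

Passing the ticker to the constructor avoids the module-level `ticker` global used in the original, and collecting the fields in a list instead of writing to stdout makes the result easier to reuse. To run it against a live page, the HTML string could be fetched with `urllib.request.urlopen(url).read().decode(...)`, assuming the page still uses the same ids.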
Sample output:
iShares Nasdaq Biotechnology (IBB)	228.14	234.90
Global X Social Media Index ETF (SOCL)	17.38	17.92
PowerShares NASDAQ Internet (PNQI)	61.73	63.17
PowerShares QQQ (QQQ)	87.31	88.15
Vanguard Small Cap Growth ETF (VBK)	118.53	120.54
iShares MSCI Ireland Capped (EIRL)	38.37	38.84
iShares MSCI Italy Capped (EWI)	17.95	18.09
PowerShares Global Clean Energy (PBD)	12.95	13.12
 Defense (ITA)	107.93	109.36
WisdomTree Europe SmallCap Dividend (DFE)	62.08	62.64