Darren Hobbs: Distributed Lucene
Interesting article by Mark Harwood here regarding distributed Lucene indexes. Using distributed indexes is how Google achieves its scalability, I believe, but they are a fairly special case. If scalability in the sense of concurrent users is the issue, I tend to favour multiple identical boxes, each holding a full copy of the index, behind a load balancer with an RPC frontend. This can be as simple as a servlet, or you can use SOAP, XML-RPC, etc. (possibly RMI, although I've never tried that across a load balancer). Doing things this way is probably a lot simpler to manage than splitting your indexes across boxes, and it means that even if your queries are asymmetric (i.e. 85% of the queries are for the same thing), the load can be fairly balanced, since any box can answer any query. Reliability comes for free as well: if a box dies, just stop sending requests there. Given Lucene's performance (it has been used to index collections of more than 10 million documents), it's pretty unlikely that your dataset will get so large that sheer size starts to affect your query times. Unless, of course, you are Google :)
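
To make the "identical boxes behind a load balancer" idea concrete, here is a minimal sketch in Python. It is not Lucene code and not how any particular load balancer works. `RoundRobinBalancer`, `ReplicaDown` and `make_replica` are hypothetical names invented for this illustration, and each "box" is just an in-process function standing in for a servlet or XML-RPC endpoint that fronts a full copy of the index.

```python
class ReplicaDown(Exception):
    """Raised by a (simulated) box that cannot serve a query."""


def make_replica(index, status):
    # Hypothetical stand-in for an RPC endpoint in front of a full index copy.
    # `status` is a mutable dict {"up": bool} so a caller can "kill" the box.
    def handle(query):
        if not status["up"]:
            raise ReplicaDown()
        return index.get(query, [])
    return handle


class RoundRobinBalancer:
    def __init__(self, replicas):
        # replicas: mapping of box name -> callable(query) -> list of hits
        self.replicas = dict(replicas)
        self.healthy = list(self.replicas)
        self._next = 0  # round-robin cursor into self.healthy

    def search(self, query):
        # Every box holds the same index, so any healthy box can answer.
        while self.healthy:
            name = self.healthy[self._next % len(self.healthy)]
            try:
                hits = self.replicas[name](query)
            except ReplicaDown:
                # "If a box dies, just stop sending requests there."
                self.healthy.remove(name)
                continue
            self._next = (self._next + 1) % len(self.healthy)
            return name, hits
        raise RuntimeError("no healthy replicas left")
```

Because the balancer ignores what the query is, a stream of identical queries is still spread evenly over the boxes, which is the asymmetric-query point above. A real deployment would also need some way to bring a recovered box back into rotation, which this sketch leaves out.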