17 Great Machine Learning Libraries

After wonderful feedback on my previous post on Scikit-learn from the guys at /r/MachineLearning, I decided to collect the list of machine learning libraries into this separate note. Let me know if there’s a library that should be included here.


Python

  • Scikit-learn: comprehensive and easy to use. I wrote a whole article on why I like this library.
  • PyBrain: neural networks are one thing missing from Scikit-learn, but this module makes up for it.
  • nltk: really useful if you’re doing anything NLP or text mining related.
  • Theano: efficient computation of mathematical expressions using the GPU. Excellent for deep learning.
  • Pylearn2: machine learning toolbox built on top of Theano, in very early stages of development.
  • MDP (Modular toolkit for Data Processing): a framework that is useful when setting up workflows.

Java

  • Spark: Apache’s new upstart, supposedly up to a hundred times faster than Hadoop. It now includes MLlib, which contains a good selection of machine learning algorithms, including classification, clustering and recommendation generation. Currently undergoing rapid development. Development can be in Python as well as JVM languages.
  • Mahout: Apache’s machine learning framework built on top of Hadoop. This looks promising, but comes with all the baggage and overhead of Hadoop.
  • Weka: a Java-based library with a graphical user interface that allows you to run experiments on small datasets. This is great if you restrict yourself to playing around to get a feel for what is possible with machine learning. However, I would avoid using this in production code at all costs: the API is very poorly designed, the algorithms are not optimised for production use and the documentation is often lacking.
  • Mallet: another Java-based library with an emphasis on document classification. I’m not so familiar with this one, but if you have to use Java this is bound to be better than Weka.
  • JSAT: stands for “Java Statistical Analysis Tool”. Created by Edward Raff, it was born out of his frustration with Weka (I know the feeling). Looks pretty cool.

.NET

  • Accord.NET: this seems to be pretty comprehensive, and comes recommended by primaryobjects on Reddit. There is perhaps a slight slant towards image processing and computer vision, as it builds on the popular library AForge.NET for this purpose.
  • Another option is to use one of the Java libraries compiled to .NET using IKVM - I have used this approach with success in production.

C++

  • Vowpal Wabbit: designed for very fast learning and released under a BSD license, this comes recommended by terath on Reddit.
  • MultiBoost: a fast C++ framework implementing some boosting algorithms as well as some cascades (like the Viola-Jones cascades). It’s mainly focused on AdaBoost.MH, so it is multi-class/multi-label.
  • Shogun: large machine learning library with a focus on kernel methods and support vector machines. Bindings to Matlab, R, Octave and Python.
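
To give a feel for what a boosting library like MultiBoost does under the hood, here is a minimal sketch of plain binary AdaBoost with decision stumps, written in pure Python. It is not MultiBoost's code or API, and it is the simpler two-class algorithm rather than AdaBoost.MH. The function names (`stump_predict`, `best_stump`, `adaboost`, `predict`) and the toy data are made up for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """Decision stump: vote `polarity` when x >= threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Try every stump; return (weighted_error, threshold, polarity) of the best."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train binary AdaBoost (labels +1/-1); return [(alpha, threshold, polarity)]."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1.0 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1.0 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Up-weight the examples this stump got wrong, then renormalise.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy 1-D data (hypothetical) that no single stump can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

On this toy set the best single stump still misclassifies three of the ten points, while three boosted stumps together fit all of them. That is the whole idea of boosting: many weak rules, each focused on what the previous ones got wrong.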

General

  • LibSVM and LibLinear: these are C libraries for support vector machines; there are also bindings or implementations for many other languages. These are the libraries used for support vector machine learning in Scikit-learn.
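
If you have never looked at what a linear support vector machine actually optimises, the following pure-Python sketch may help. It fits a linear SVM by plain subgradient descent on the L2-regularised hinge loss. This is only an illustration of the objective: LibSVM and LibLinear use far more sophisticated solvers, and none of the names here (`train_linear_svm`, `lam`, `lr`) come from their APIs. The data points are invented.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * |w|^2 + hinge loss by per-example subgradient steps.

    Labels must be +1 or -1. Returns the weight vector w and bias b.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            violated = y * (dot(w, x) + b) < 1.0
            # Gradient of the L2 penalty: shrink w a little on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if violated:
                # The example is inside the margin: push the boundary away.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Classify x by which side of the hyperplane it falls on."""
    return 1 if dot(w, x) + b >= 0 else -1


# Hypothetical, linearly separable toy data.
points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
print(w, b, [predict(w, b, x) for x in points])
```

For anything beyond a toy problem you would of course call one of the libraries above (or Scikit-learn, which wraps them) rather than roll your own.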

Conclusion

This article is a work in progress, so please send me your comments or criticisms!

