Getting Started
Classifier Reborn is a fork of cardmagic/classifier under more active development. The Classifier Reborn library is released under the terms of the GNU LGPL-2.1. It currently implements a Bayesian classifier and a Latent Semantic Indexer (LSI).
Here is a quick example to illustrate the usage.
$ gem install classifier-reborn
$ irb
irb(main):001:0> require 'classifier-reborn'
irb(main):002:0> classifier = ClassifierReborn::Bayes.new 'Ham', 'Spam'
irb(main):003:0> classifier.train "Ham", "Sunday is a holiday. Say no to work on Sunday!"
irb(main):004:0> classifier.train "Spam", "You are the lucky winner! Claim your holiday prize."
irb(main):005:0> classifier.classify "What's the plan for Sunday?"
#=> "Ham"
Here is a line-by-line explanation of what we just did.
- Installed the classifier-reborn gem (assuming that Ruby is installed already).
- Started the Interactive Ruby Shell (IRB).
- Loaded the classifier-reborn gem in the interactive Ruby session.
- Created an instance of the Bayesian classifier with two classes, Ham and Spam.
- Trained the classifier with an example of Ham.
- Trained the classifier with an example of Spam.
- Asked the classifier to classify a text and got the response Ham.
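To build some intuition for what training and classifying mean here, the following is a minimal pure-Ruby sketch of a naive Bayes text classifier with add-one smoothing. It is an illustration only and is not how classifier-reborn is implemented internally (the gem also applies stemming, for example). The class name ToyBayes and all of its methods are hypothetical; only the train/classify call shape is borrowed from the session above.

```ruby
# Toy naive Bayes text classifier (illustration only; NOT classifier-reborn's code).
class ToyBayes
  def initialize(*categories)
    # Per-category word counts, e.g. { "Ham" => { "sunday" => 2, ... } }
    @word_counts = categories.to_h { |category| [category, Hash.new(0)] }
    # Number of training documents seen per category
    @doc_counts = Hash.new(0)
  end

  # Record every word of the text under the given category.
  def train(category, text)
    tokenize(text).each { |word| @word_counts.fetch(category)[word] += 1 }
    @doc_counts[category] += 1
  end

  # Return the category with the highest log-probability score.
  def classify(text)
    words = tokenize(text)
    @word_counts.keys.max_by { |category| log_score(category, words) }
  end

  private

  # Lowercase the text and split it into simple word tokens.
  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end

  def log_score(category, words)
    counts = @word_counts[category]
    total_words = counts.values.sum
    vocabulary_size = @word_counts.values.flat_map(&:keys).uniq.size
    total_docs = @doc_counts.values.sum
    # Log prior: smoothed share of training documents in this category.
    prior = Math.log((@doc_counts[category] + 1.0) / (total_docs + @word_counts.size))
    # Add-one (Laplace) smoothing keeps unseen words from zeroing out the score.
    words.sum(prior) do |word|
      Math.log((counts[word] + 1.0) / (total_words + vocabulary_size))
    end
  end
end

classifier = ToyBayes.new('Ham', 'Spam')
classifier.train('Ham', 'Sunday is a holiday. Say no to work on Sunday!')
classifier.train('Spam', 'You are the lucky winner! Claim your holiday prize.')
puts classifier.classify("What's the plan for Sunday?")
```

The word "Sunday" appears twice in the Ham example and never in the Spam example, which is what tips the score for the query. With only one training example per class the margin is small, so a real classifier needs considerably more training data.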
Installation
To use classifier-reborn in your Ruby application, add the following line to your application’s Gemfile.
gem 'classifier-reborn'
Then from your application’s folder run the following command to install the gem and its dependencies.
$ bundle install
Alternatively, run the following command to manually install the gem.
$ gem install classifier-reborn
Dependencies
The only runtime dependency of this gem is Roman Shterenzon’s fast-stemmer gem. This should install automatically with RubyGems. Otherwise, install it manually as follows.
$ gem install fast-stemmer
In addition, it is recommended to install either Numo or GSL to speed up LSI classification by at least 10x.
Note that LSI will work without these libraries, but as soon as they are installed, classifier will make use of them. No configuration changes are needed; we like to keep things ridiculously easy for you.
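If you want to check which of these optional gems RubyGems can currently see, a small diagnostic along the following lines may help. This is only a sketch: the helper name optional_gem_status and the constant OPTIONAL_GEMS are hypothetical, and this is not the detection logic that classifier-reborn itself uses. It relies only on Gem::Specification.find_all_by_name from RubyGems.

```ruby
# Hypothetical helper: reports which optional LSI speed-up gems RubyGems can see.
# Diagnostic sketch only; not the detection logic used by classifier-reborn.
OPTIONAL_GEMS = %w[numo-narray numo-linalg gsl].freeze

def optional_gem_status(names = OPTIONAL_GEMS)
  names.to_h do |name|
    # find_all_by_name returns an empty array (rather than raising) when a gem is absent.
    [name, Gem::Specification.find_all_by_name(name).any?]
  end
end

optional_gem_status.each do |name, installed|
  puts format('%-12s %s', name, installed ? 'installed' : 'not installed')
end
```

When run under Bundler, only gems listed in the Gemfile are visible, so the result reflects the bundle rather than everything installed on the system.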
Install Numo Gems
Numo is a set of Numerical Module gems for Ruby that provide a Ruby interface to LAPACK. If classifier detects that the required Numo gems are installed, it will make use of them to perform LSI faster.
- Install LAPACKE
  - Ubuntu: apt-get install liblapacke-dev
  - macOS: brew install lapack
- Install OpenBLAS
  - Ubuntu: apt-get install libopenblas-dev
  - macOS: brew install openblas
- Install the Numo::NArray and Numo::Linalg gems. If you’re using Bundler, add numo-narray and numo-linalg to your Gemfile. (If using Bundler on macOS, you should set the build config like bundle config set --global build.numo-linalg --with-openblas-dir=$(brew --prefix openblas) --with-lapack-lib="$(brew --prefix lapack)/lib".)
  - Ubuntu: gem install numo-narray numo-linalg
  - macOS: gem install numo-narray, then gem install numo-linalg -- --with-openblas-dir=$(brew --prefix openblas) --with-lapack-lib="$(brew --prefix lapack)/lib"
Install GSL Gem
Note: The gsl gem is currently incompatible with Ruby 3. It is recommended to use Numo instead with Ruby 3.
The GNU Scientific Library (GSL) is an alternative to Numo/LAPACK that can be used to improve LSI performance. (Install one or the other; you do not need both.)
- Install the GNU Scientific Library
  - Ubuntu: apt-get install libgsl-dev
- Install the Ruby/GSL gem. If you’re using Bundler, add gsl to your Gemfile.
  - gem install gsl
Further Readings
For more information read the following documentation topics.