Getting Started

Classifier Reborn is a fork of cardmagic/classifier under more active development. The library is released under the terms of the GNU LGPL-2.1. It currently implements a Bayesian classifier and a Latent Semantic Indexer (LSI).

Here is a quick example to illustrate the usage.

$ gem install classifier-reborn
$ irb
irb(main):001:0> require 'classifier-reborn'
irb(main):002:0> classifier = ClassifierReborn::Bayes.new 'Ham', 'Spam'
irb(main):003:0> classifier.train "Ham", "Sunday is a holiday. Say no to work on Sunday!"
irb(main):004:0> classifier.train "Spam", "You are the lucky winner! Claim your holiday prize."
irb(main):005:0> classifier.classify "What's the plan for Sunday?"
#=> "Ham"

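Here is a line-by-line explanation of what we just did: we installed the classifier-reborn gem, started an interactive Ruby session with irb, loaded the library, created a Bayesian classifier with two categories (Ham and Spam), trained it with one sample sentence for each category, and asked it to classify a new sentence, which it placed in the Ham category.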
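The same steps can also be saved as a standalone script instead of being typed into irb. The following is a minimal sketch that uses only the calls shown in the session above; the file name quick_start.rb, the helper method build_classifier, and the LoadError guard are assumptions added for illustration.

```ruby
# quick_start.rb (hypothetical file name)
begin
  require 'classifier-reborn'
rescue LoadError
  # The guard is an addition for this sketch, so the script fails softly.
  warn 'classifier-reborn is not installed; run `gem install classifier-reborn` first.'
end

# Builds the two-category classifier and trains it with the same
# sentences as the irb session. (Hypothetical helper name.)
def build_classifier
  classifier = ClassifierReborn::Bayes.new 'Ham', 'Spam'
  classifier.train 'Ham', 'Sunday is a holiday. Say no to work on Sunday!'
  classifier.train 'Spam', 'You are the lucky winner! Claim your holiday prize.'
  classifier
end

if defined?(ClassifierReborn)
  classifier = build_classifier
  # Prints the category chosen for a sentence the classifier has not seen.
  puts classifier.classify("What's the plan for Sunday?")
end
```

Run it with ruby quick_start.rb; with the gem installed, it should print the same category as the irb session.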

Installation

To use classifier-reborn in your Ruby application, add the following line to your application’s Gemfile.

gem 'classifier-reborn'

Then, from your application’s folder, run the following command to install the gem and its dependencies.

$ bundle install

Alternatively, run the following command to manually install the gem.

$ gem install classifier-reborn

Dependencies

The only runtime dependency of this gem is Roman Shterenzon’s fast-stemmer gem. RubyGems should install it automatically. If it does not, install it manually as follows.

$ gem install fast-stemmer

In addition, it is recommended to install either Numo or GSL to speed up LSI classification by at least 10x.

Note that LSI will work without these libraries, but as soon as they are installed, classifier will make use of them. No configuration changes are needed; we like to keep things ridiculously easy for you.
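To check which optional library is present on a machine, you can ask RubyGems directly. The following is a minimal sketch, not the library's own detection logic: the gem names numo-narray, numo-linalg, and gsl, the helper names, and the preference for Numo over GSL are assumptions made for illustration.

```ruby
require 'rubygems'

# Gem names are assumptions for this sketch.
NUMO_GEMS = %w[numo-narray numo-linalg].freeze
GSL_GEMS  = %w[gsl].freeze

# True when at least one version of the named gem is installed.
def gem_installed?(name)
  Gem::Specification.find_all_by_name(name).any?
end

# Reports which optional backend could be used on this machine.
# Checking Numo before GSL is an assumed order, not documented behavior.
def available_lsi_backend
  return :numo if NUMO_GEMS.all? { |name| gem_installed?(name) }
  return :gsl if GSL_GEMS.all? { |name| gem_installed?(name) }

  :pure_ruby
end

puts "Optional LSI backend found: #{available_lsi_backend}"
```

A result of :pure_ruby only means that neither set of gems was found; LSI still works in that case, just more slowly.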

Install Numo Gems

Numo is a set of Numerical Module gems for Ruby that provide a Ruby interface to LAPACK. If classifier detects that the required Numo gems are installed, it will make use of them to perform LSI faster.

Install GSL Gem

Note: The gsl gem is currently incompatible with Ruby 3. With Ruby 3, use Numo instead.

The GNU Scientific Library (GSL) is an alternative to Numo/LAPACK that can be used to improve LSI performance. (Install one or the other; you do not need both.)

Further Reading

For more information, read the following documentation topics.