OleanderSoftware

Stemming Library
Overview

Stemming is a normalization process that reduces words to their root form, or stem. It removes inflectional and derivational suffixes so that morphological variants of the same word can be compared more easily. For example, “predicts”, “prediction”, and “predicted” all stem to “predict” and are therefore treated as the same word.
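To make the idea of suffix removal concrete, here is a minimal sketch in C++. It is a toy suffix stripper, not the Porter algorithm and not this library's code: the function name toy_stem, the suffix list, and the minimum stem length of three characters are all assumptions chosen only so that the three example words reduce to “predict”.

```cpp
#include <string>

// Toy suffix stripper for illustration only. It is NOT the Porter algorithm:
// it has none of Porter's region or measure rules and will mis-stem many
// real words. The name and the suffix list are hypothetical.
std::wstring toy_stem(std::wstring word)
{
    // Longer suffixes come first so that "ion" is tried before "s".
    static const std::wstring suffixes[] = {L"ion", L"ing", L"ed", L"s"};

    for (const std::wstring& suffix : suffixes)
    {
        // Require at least three characters to remain after stripping.
        const bool long_enough = word.size() > suffix.size() + 2;
        if (long_enough &&
            word.compare(word.size() - suffix.size(), suffix.size(), suffix) == 0)
        {
            word.erase(word.size() - suffix.size());
            break; // strip at most one suffix
        }
    }
    return word;
}
```

Calling toy_stem on L"predicts", L"prediction", or L"predicted" yields the same string, which is what allows the three variants to be compared as one word. A real stemmer applies many more rules in several ordered steps.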


Stemming is primarily used in information retrieval (IR) systems, where “fuzzy” string matching is a necessity. IR systems that let users search for a word and all of its variants, rather than matching only the exact query, normally rely on stemming to do so. Examples of IR systems include desktop and web-based search engines.
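The variant matching described above can be sketched as follows. This is an illustration under stated assumptions rather than how any particular search engine works: the Stemmer alias, find_variants, and table_stem are hypothetical names, and table_stem is a hand-written lookup table standing in for a real Porter stemmer.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Any callable that maps a word to its stem (hypothetical alias).
using Stemmer = std::function<std::wstring(const std::wstring&)>;

// Stand-in stemmer backed by a tiny hand-written table. A real system
// would call a Porter stemmer here instead.
std::wstring table_stem(const std::wstring& word)
{
    static const std::map<std::wstring, std::wstring> stems = {
        {L"predicts", L"predict"},
        {L"prediction", L"predict"},
        {L"predicted", L"predict"},
    };
    const auto it = stems.find(word);
    return it != stems.end() ? it->second : word; // unknown words pass through
}

// Returns the positions of the words in `document` whose stem equals
// the stem of `query`.
std::vector<std::size_t> find_variants(const std::vector<std::wstring>& document,
                                       const std::wstring& query,
                                       const Stemmer& stem)
{
    const std::wstring query_stem = stem(query);
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < document.size(); ++i)
    {
        if (stem(document[i]) == query_stem)
            hits.push_back(i);
    }
    return hits;
}
```

Because both the query and the document words pass through the same stemmer, a search for “prediction” also finds “predicts” and “predicted”. Production systems usually stem each document once at indexing time rather than on every query; this sketch stems on the fly to stay short.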


The Oleander C++ stemming library implements the Porter stemming algorithms and supports most Western European languages, as well as Russian.

Features
  • Full implementations of the Porter stemming algorithms
  • Includes stemmers for English, Danish, Dutch, French, Finnish, German, Italian, Norwegian, Portuguese, Spanish, Swedish, and Russian
  • Case-insensitive text handling
  • Designed for C++'s standard wstring class (Unicode strings)
  • BSD licensing
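As an illustration of what case-insensitive handling of wstring text involves, the sketch below lowercases two wide strings before comparing them. It does not show the library's internals: to_lower and equal_ignoring_case are hypothetical helper names, and the behavior of std::towlower for non-ASCII characters depends on the active C locale.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Returns a lowercased copy of `text`. std::towlower follows the current
// C locale, so only ASCII letters are guaranteed to be mapped in the
// default "C" locale.
std::wstring to_lower(std::wstring text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](wchar_t ch) { return static_cast<wchar_t>(std::towlower(ch)); });
    return text;
}

// Compares two wide strings while ignoring letter case (hypothetical helper).
bool equal_ignoring_case(const std::wstring& a, const std::wstring& b)
{
    return to_lower(a) == to_lower(b);
}
```

With case folded this way, “Predicts” and “PREDICTS” are treated as the same input, so a stemmer sees one form regardless of how the word was capitalized in the source text.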
Download
  • Oleander Stemming Library

Copyright © 2019, Oleander Software, Ltd. All rights reserved.