Tuesday, July 30, 2013

An N-Gram Parser in F#

This is a very basic n-gram parser in F#. It takes three arguments: an IEnumerable of the word tokens you want n-grams for, a number giving the n-gram order, and a delimiter for the output. It returns an IEnumerable of strings, each containing a single n-gram whose words are separated by that delimiter. For example, this call from C#:
NGramParsing.GetNGrams(new [] { "This", "is", "a", "test" }, 2, "|");
would yield:
{ "This|is", "is|a", "a|test" }
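A function with that behaviour can be sketched in a few lines of F#. This is a minimal sketch under assumptions, not necessarily the original listing: only the name NGramParsing.GetNGrams and the argument order come from the call above, while the parameter names, the use of Seq.windowed, and the check on n are illustrative choices.

```fsharp
open System

module NGramParsing =
    // Hypothetical reconstruction: parameter names and the body are assumptions.
    // The tupled signature compiles to a static method, so C# can call
    // NGramParsing.GetNGrams(tokens, n, delimiter) as in the example above.
    let GetNGrams (tokens: seq<string>, n: int, delimiter: string) : seq<string> =
        if n < 1 then invalidArg "n" "n must be at least 1"
        tokens
        |> Seq.windowed n                                          // every run of n consecutive tokens
        |> Seq.map (fun window -> String.Join(delimiter, window))  // join each run with the delimiter

// Usage from F#, with the same input as the example above.
let bigrams = NGramParsing.GetNGrams([ "This"; "is"; "a"; "test" ], 2, "|") |> List.ofSeq
printfn "%A" bigrams
```

Seq.windowed does the sliding-window work, and the result is lazy, so nothing is computed until the sequence is enumerated. If the input has fewer than n tokens, the result is simply empty.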
