public class IDF
extends java.lang.Object
Inverse document frequency (IDF), computed as idf = log((m + 1) / (d(t) + 1)), where m is the total number of documents and d(t) is the number of documents that contain term t.
This implementation supports filtering out terms that do not appear in a minimum number of documents (controlled by the variable minDocFreq). For terms that appear in fewer than minDocFreq documents, the IDF is set to 0, so the resulting TF-IDF values are also 0.
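To make the formula and the minDocFreq filter concrete, here is a minimal standalone sketch in plain Java. It is not this class's implementation: the class name `IdfSketch`, its method names, and the dense `double[][]` document representation are hypothetical, chosen only for illustration. It also assumes `log` means the natural logarithm, which the formula above does not state.

```java
import java.util.Arrays;

// Illustrative sketch only; IdfSketch and its methods are hypothetical names.
public class IdfSketch {

    /** Counts d(t): the number of documents whose term-frequency vector has a positive entry for term t. */
    public static long[] documentFrequencies(double[][] docs, int numTerms) {
        long[] df = new long[numTerms];
        for (double[] doc : docs) {
            for (int t = 0; t < numTerms; t++) {
                if (doc[t] > 0.0) {
                    df[t]++;
                }
            }
        }
        return df;
    }

    /**
     * Computes idf = log((m + 1) / (d(t) + 1)) for each term.
     * Terms with d(t) below minDocFreq keep an IDF of 0.
     * Assumption: natural logarithm.
     */
    public static double[] idf(long[] df, long m, int minDocFreq) {
        double[] result = new double[df.length]; // initialized to 0.0
        for (int t = 0; t < df.length; t++) {
            if (df[t] >= minDocFreq) {
                result[t] = Math.log((m + 1.0) / (df[t] + 1.0));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Four documents over a three-term vocabulary (raw term counts).
        double[][] docs = {
            {1, 0, 2},
            {3, 0, 1},
            {1, 1, 0},
            {2, 0, 0}
        };
        long[] df = documentFrequencies(docs, 3); // d(t) = [4, 1, 2]

        // No filtering: every term gets its smoothed IDF.
        System.out.println(Arrays.toString(idf(df, docs.length, 0)));
        // minDocFreq = 2: the term found in only one document is zeroed out.
        System.out.println(Arrays.toString(idf(df, docs.length, 2)));
    }
}
```

In this toy corpus the first term occurs in every document, so its IDF is log(5 / 5) = 0 even without filtering. With minDocFreq = 2, the second term (present in a single document) is forced to 0 as well, while the third term keeps log(5 / 3).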
param: minDocFreq minimum number of documents in which a term must appear to avoid being filtered out
Modifier and Type | Class and Description |
---|---|
static class | IDF.DocumentFrequencyAggregator: Document frequency aggregator. |