public class NaiveBayes extends ProbabilisticClassifier<Vector,NaiveBayes,NaiveBayesModel>
Naive Bayes classifier. It supports Multinomial NB
(http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html),
which can handle finitely supported discrete data. For example, by converting documents into
TF-IDF vectors, it can be used for document classification. By making every vector
binary (0/1), it can also be used as Bernoulli NB
(http://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html).
The input feature values must be nonnegative.

| Constructor and Description |
|---|
| NaiveBayes() |
| NaiveBayes(java.lang.String uid) |
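
The class description states two input requirements: feature values must be nonnegative, and vectors must be binary (0/1) for Bernoulli NB. The sketch below illustrates both checks on plain `double[]` arrays. It is not Spark code: `FeaturePrep`, `requireNonNegative`, and `binarize` are hypothetical names introduced only for this illustration, and treating every positive value as 1 is an assumption about how one might binarize term counts.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Spark: prepares raw feature arrays. */
final class FeaturePrep {
    private FeaturePrep() {}

    /** Rejects negative (or NaN) values, mirroring the nonnegativity requirement. */
    static void requireNonNegative(double[] features) {
        for (int i = 0; i < features.length; i++) {
            // The negated comparison also catches NaN, which fails every comparison.
            if (!(features[i] >= 0.0)) {
                throw new IllegalArgumentException(
                    "Feature " + i + " must be nonnegative, got " + features[i]);
            }
        }
    }

    /** Assumed binarization rule: any positive value becomes 1.0, zero stays 0.0. */
    static double[] binarize(double[] features) {
        requireNonNegative(features);
        double[] out = new double[features.length];
        for (int i = 0; i < features.length; i++) {
            out[i] = features[i] > 0.0 ? 1.0 : 0.0;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] termCounts = {3.0, 0.0, 1.0, 0.0};
        // Prints [1.0, 0.0, 1.0, 0.0]
        System.out.println(Arrays.toString(binarize(termCounts)));
    }
}
```

In a real pipeline the same preparation would be applied to the vectors in the features column before fitting.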
| Modifier and Type | Method and Description |
|---|---|
| NaiveBayes | copy(ParamMap extra): Creates a copy of this instance with the same UID and some extra params. |
| Param<java.lang.String> | featuresCol(): Param for features column name. |
| java.lang.String | getFeaturesCol() |
| java.lang.String | getLabelCol() |
| java.lang.String | getModelType() |
| java.lang.String | getPredictionCol() |
| java.lang.String | getRawPredictionCol() |
| double | getSmoothing() |
| Param<java.lang.String> | labelCol(): Param for label column name. |
| static NaiveBayes | load(java.lang.String path) |
| Param<java.lang.String> | modelType(): The model type, which is a string (case-sensitive). |
| Param<java.lang.String> | predictionCol(): Param for prediction column name. |
| Param<java.lang.String> | rawPredictionCol(): Param for raw prediction column name. |
| NaiveBayes | setModelType(java.lang.String value): Set the model type using a string (case-sensitive). |
| NaiveBayes | setSmoothing(double value): Set the smoothing parameter. |
| DoubleParam | smoothing(): The smoothing parameter. |
| protected NaiveBayesModel | train(DataFrame dataset): Train a model using the given dataset and parameters. |
| java.lang.String | uid(): An immutable unique ID for the object and its derivatives. |
| StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType) |
| StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType): Validates and transforms the input schema with the provided param map. |
| MLWriter | write(): Returns an MLWriter instance for this ML instance. |
Inherited methods (one line per declaring class or interface):
setProbabilityCol, setThresholds
setRawPredictionCol
extractLabeledPoints, fit, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema
transformSchema
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
save
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
toString
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning

public NaiveBayes(java.lang.String uid)
public NaiveBayes()
public static NaiveBayes load(java.lang.String path)
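public java.lang.String uid()
    An immutable unique ID for the object and its derivatives.
    Specified by: uid in interface Identifiable
public NaiveBayes setSmoothing(double value)
    Parameters: value - (undocumented)
public NaiveBayes setModelType(java.lang.String value)
    Parameters: value - (undocumented)
protected NaiveBayesModel train(DataFrame dataset)
    Train a model using the given dataset and parameters. Developers can implement this
    instead of fit() to avoid dealing with schema validation and copying parameters into the model.
    Specified by: train in class Predictor<Vector,NaiveBayes,NaiveBayesModel>
    Parameters: dataset - Training dataset
public NaiveBayes copy(ParamMap extra)
    Creates a copy of this instance with the same UID and some extra params.
    Specified by: copy in interface Params
    Specified by: copy in class Predictor<Vector,NaiveBayes,NaiveBayesModel>
    Parameters: extra - (undocumented)
    See also: defaultCopy()
public DoubleParam smoothing()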
public double getSmoothing()
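
The text here says only that `smoothing` is "the smoothing parameter". As a hedged illustration of what such a parameter typically controls, the sketch below applies standard additive (Laplace) smoothing to per-class feature totals. It is an assumption that Spark's implementation matches this formula term by term. `SmoothingSketch` and `smoothedLogProbabilities` are hypothetical names, not Spark API.

```java
/** Hypothetical sketch, not Spark code: additive smoothing for one class. */
final class SmoothingSketch {
    private SmoothingSketch() {}

    /**
     * Returns log((count_j + smoothing) / (total + smoothing * numFeatures))
     * for each feature j, where featureTotals holds the summed feature values
     * observed for a single class.
     */
    static double[] smoothedLogProbabilities(double[] featureTotals, double smoothing) {
        if (smoothing < 0.0) {
            throw new IllegalArgumentException("smoothing must be nonnegative");
        }
        int numFeatures = featureTotals.length;
        double total = 0.0;
        for (double count : featureTotals) {
            total += count;
        }
        double logDenominator = Math.log(total + smoothing * numFeatures);
        double[] logProbs = new double[numFeatures];
        for (int j = 0; j < numFeatures; j++) {
            logProbs[j] = Math.log(featureTotals[j] + smoothing) - logDenominator;
        }
        return logProbs;
    }
}
```

With totals `{3, 0, 1}` and a smoothing value of 1.0, the unseen second feature gets probability 1/7 instead of 0. With smoothing 0.0 its log-probability is negative infinity, which would rule out the class for any document containing that feature.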
public Param<java.lang.String> modelType()
public java.lang.String getModelType()
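
Because the model type is matched case-sensitively, a value such as `Bernoulli` would not be treated the same as `bernoulli`. The sketch below shows one way to check a value before passing it to `setModelType`. The accepted names `multinomial` and `bernoulli` come from Spark's documentation of this parameter rather than from the text here, so treat them as an assumption. `ModelTypeCheck` and `requireSupported` are hypothetical names, not Spark API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper, not part of Spark: case-sensitive model type check. */
final class ModelTypeCheck {
    private ModelTypeCheck() {}

    // Assumed set of accepted values; verify against the Spark version in use.
    static final Set<String> SUPPORTED = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList("multinomial", "bernoulli")));

    /** Returns the value unchanged if it matches exactly, otherwise throws. */
    static String requireSupported(String modelType) {
        if (!SUPPORTED.contains(modelType)) {
            throw new IllegalArgumentException(
                "Unsupported model type (names are case-sensitive): " + modelType);
        }
        return modelType;
    }
}
```

A caller could then write `new NaiveBayes().setModelType(ModelTypeCheck.requireSupported(userInput))` so that a wrongly cased value fails before it reaches the estimator.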
public MLWriter write()
    Returns an MLWriter instance for this ML instance.
    Specified by: write in interface MLWritable
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
public Param<java.lang.String> rawPredictionCol()
public java.lang.String getRawPredictionCol()
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
    Validates and transforms the input schema with the provided param map.
    Parameters:
    schema - input schema
    fitting - whether this is called during fitting
    featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.
public Param<java.lang.String> labelCol()
public java.lang.String getLabelCol()
public Param<java.lang.String> featuresCol()
public java.lang.String getFeaturesCol()
public Param<java.lang.String> predictionCol()
public java.lang.String getPredictionCol()