How about adding another word token to the bag-of-words model: "UNKNOWN"?
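A rough sketch of what I mean, in plain Python rather than Weka's own API. The names `UNKNOWN`, `build_vocabulary` and `to_word_vector` are made up for illustration, and the whitespace tokenizer is an assumption. The vocabulary is fixed from the training data alone, and any word seen only in the test data is counted under the single UNKNOWN slot, so both sets end up with the same attributes:

```python
from collections import Counter

# Hypothetical placeholder token. Words are lowercased below, so this
# uppercase name cannot collide with a real word from the data.
UNKNOWN = "UNKNOWN"


def build_vocabulary(training_docs):
    """Collect every word seen in the training data, plus the UNKNOWN slot."""
    words = {w for doc in training_docs for w in doc.lower().split()}
    return sorted(words) + [UNKNOWN]


def to_word_vector(doc, vocabulary):
    """Count words over a fixed vocabulary; unseen words are counted as UNKNOWN."""
    known = set(vocabulary)
    counts = Counter(w if w in known else UNKNOWN for w in doc.lower().split())
    return [counts[w] for w in vocabulary]


train = ["the cat sat", "the dog ran"]
vocab = build_vocabulary(train)

# "bird" never occurs in training, so it lands in the UNKNOWN column.
test_vector = to_word_vector("the bird sat", vocab)
```

Every vector built this way has `len(vocab)` entries, whichever data set the document comes from.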
[mailto:email@example.com] On Behalf Of ttou
Sent: Saturday, July 28, 2007 7:04 PM
Subject: [Wekalist] On String to Word Vector
I am having trouble with the conversion from string to word vector.
My training and test data sets contain some string data that exists only in
the test data set and not in the training data set. When the conversion is
applied, it results in incompatible datasets,
since those string values are replaced with different attributes in the test
data set's ARFF file. Any pointers to resolve the issue for model training and