Either way is fine. Unless the learning scheme is semi-supervised, you
should get statistically indistinguishable results. In some cases, the
actual concrete result may be slightly different, e.g., if the learning
scheme has a random component that depends on the entire dataset, i.e., both
the labelled and the unlabelled parts.
A lot of classifiers in WEKA will actually remove all unlabelled data
straightaway in the buildClassifier() method.
Which learning scheme gave you different results?
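
To illustrate why the two approaches should agree when unlabelled data is dropped up front, here is a minimal self-contained sketch in plain Java. It does not use the WEKA API: Row, ToyClassifier, build() and predict() are hypothetical names made up for this example, and the toy 1-nearest-neighbour learner merely stands in for a scheme whose training step discards instances with an unknown class.

```java
import java.util.ArrayList;
import java.util.List;

class MissingClassDemo {

    /** One instance: a single numeric attribute and a class label (null = unknown). */
    static final class Row {
        final double x;
        final String label;

        Row(double x, String label) {
            this.x = x;
            this.label = label;
        }
    }

    /** Toy 1-nearest-neighbour learner; a hypothetical stand-in, not a WEKA class. */
    static final class ToyClassifier {
        private final List<Row> train = new ArrayList<>();

        /** Drops rows with an unknown class first, then memorises the rest. */
        void build(List<Row> data) {
            train.clear();
            for (Row r : data) {
                if (r.label != null) {
                    train.add(r);
                }
            }
        }

        /** Returns the label of the closest training row (first one wins on ties). */
        String predict(double x) {
            Row best = null;
            for (Row r : train) {
                if (best == null || Math.abs(r.x - x) < Math.abs(best.x - x)) {
                    best = r;
                }
            }
            return best == null ? null : best.label;
        }
    }

    /** Method (i): train on everything, labelled and unlabelled rows together. */
    static List<String> predictCombined(List<Row> all) {
        ToyClassifier model = new ToyClassifier();
        model.build(all);
        return predictUnlabelled(model, all);
    }

    /** Method (ii): train on the labelled rows only, predict the separate test rows. */
    static List<String> predictSplit(List<Row> all) {
        List<Row> trainSet = new ArrayList<>();
        List<Row> testSet = new ArrayList<>();
        for (Row r : all) {
            (r.label != null ? trainSet : testSet).add(r);
        }
        ToyClassifier model = new ToyClassifier();
        model.build(trainSet);
        return predictUnlabelled(model, testSet);
    }

    /** Collects predictions for every row whose class is unknown. */
    private static List<String> predictUnlabelled(ToyClassifier model, List<Row> rows) {
        List<String> out = new ArrayList<>();
        for (Row r : rows) {
            if (r.label == null) {
                out.add(model.predict(r.x));
            }
        }
        return out;
    }

    /** Made-up data: four labelled rows and two rows with an unknown class. */
    static List<Row> sampleData() {
        List<Row> data = new ArrayList<>();
        data.add(new Row(1.0, "a"));
        data.add(new Row(2.0, "a"));
        data.add(new Row(8.0, "b"));
        data.add(new Row(9.0, "b"));
        data.add(new Row(1.5, null));
        data.add(new Row(7.0, null));
        return data;
    }

    public static void main(String[] args) {
        List<Row> data = sampleData();
        System.out.println("combined: " + predictCombined(data));
        System.out.println("split:    " + predictSplit(data));
    }
}
```

Both calls print the same list of predictions here, because the unlabelled rows never reach the learner in either case. A scheme that does make use of the unlabelled rows (semi-supervised, or with a random component that depends on the full dataset) would not be covered by this sketch.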
On Sun, 26 Nov 2017 at 9:00 AM, JonasB <jonas.bisscheroux(a)gmail.com> wrote:
What would be the common method of predicting the values of a 'class' based
on classification? I am using two methods right now:
i) I include the instances with an unknown 'class' value in the dataset; I
then simply use the dataset as training set and it will provide predicted
values for 'class'.
ii) I split the dataset into a separate training set (with all values
including 'class') and a test set (with all values known, except 'class')
and use the 'supplied test set' option to get the predicted values for
'class'.
I noticed that these two methods will give very different predictions; but
which method would be recommended?
Sent from: http://weka.8497.n7.nabble.com/
Wekalist mailing list
Send posts to: Wekalist(a)list.waikato.ac.nz