Hello, I cannot seem to find any information regarding the results
that Weka produces after a cross-validation. Assuming
10-fold CV, how does Weka produce the confusion matrix? Is it
based on the results of the best fold, the last fold, or averaged in
some weird and wonderful way?
Also, but less importantly, is it possible to capture the model(s) that
are produced during cross-validation?
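
To make the "combined in some way" option concrete, here is a minimal sketch of
pooling per-fold confusion matrices by summing their cells. It is plain Java,
does not call Weka, and makes no claim about what Weka does internally; the
class name PooledConfusion and the method pool are hypothetical names chosen
for illustration.

```java
import java.util.List;

// Hypothetical helper, not part of Weka: pools per-fold confusion matrices.
public class PooledConfusion {

    /**
     * Sums square confusion matrices cell by cell.
     * Rows are actual classes, columns are predicted classes.
     */
    public static int[][] pool(List<int[][]> foldMatrices) {
        if (foldMatrices.isEmpty()) {
            throw new IllegalArgumentException("need at least one fold matrix");
        }
        int n = foldMatrices.get(0).length;
        int[][] total = new int[n][n];
        for (int[][] fold : foldMatrices) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    total[i][j] += fold[i][j];
                }
            }
        }
        return total;
    }
}
```

Because each instance lands in exactly one test fold, a matrix pooled this way
counts every instance in the dataset once, unlike a best-fold or last-fold
matrix, which would cover only a tenth of the data under 10-fold CV.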
I'm working on a text-classification problem where the attributes fall
into some explicitly predefined groups: words, POS tags, word bigrams,
etc. What I would like to do is build a NaiveBayes classifier for each
group, and then use some kind of learned linear weighting function to
combine the output of each classifier. My problem is that there does not
appear to be any way to label attributes as belonging to a specific group.
I could rely on their order of appearance in the datafile (e.g.,
attributes 0-100 are words, 101-150 are POS tags...) but the ordering gets
messed up by the attribute selection filters that I am using. Any
ideas? I could put a marker in the attribute name, but I'd prefer
something a little less clunky.
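
For reference, the name-marker fallback can at least be kept mechanical. This
is a minimal sketch in plain Java, assuming a hypothetical "group:name" naming
convention (e.g. "word:cat", "pos:NN"); the class AttributeGroups and the
method byPrefix are illustrative names, not Weka API. Because the grouping is
recovered from names rather than positions, it still works after a filter has
removed or reordered attributes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Weka: groups attribute indices by name prefix.
public class AttributeGroups {

    /**
     * Maps each group prefix (the text before the first ':') to the indices
     * of the attributes carrying it. Names without ':' go to "ungrouped".
     */
    public static Map<String, List<Integer>> byPrefix(List<String> attributeNames) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < attributeNames.size(); i++) {
            String name = attributeNames.get(i);
            int sep = name.indexOf(':');
            String group = sep < 0 ? "ungrouped" : name.substring(0, sep);
            groups.computeIfAbsent(group, k -> new ArrayList<>()).add(i);
        }
        return groups;
    }
}
```

The resulting index lists could then be used to select the attribute subset
for each per-group classifier.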