The Explorer uses pooling to compute performance statistics. The command-line interface and the KnowledgeFlow use the same approach. Pooling makes it easier to draw ROC curves and similar plots, because the predictions from all test folds end up in a single set.


Due to this approach, the confusion matrix is computed from the pooled results for the k test sets in a k-fold cross-validation.
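To make the pooling concrete, here is a minimal sketch in plain Python (not Weka code). The function name pool_confusion_matrices and the per-fold counts are made up for illustration; the point is only that the pooled matrix is the element-wise sum of the per-fold matrices.

```python
def pool_confusion_matrices(fold_matrices):
    """Element-wise sum of per-fold confusion matrices (rows = actual, columns = predicted)."""
    n = len(fold_matrices[0])
    pooled = [[0] * n for _ in range(n)]
    for matrix in fold_matrices:
        for i in range(n):
            for j in range(n):
                pooled[i][j] += matrix[i][j]
    return pooled


# Hypothetical counts for a 3-fold run on a two-class problem.
folds = [
    [[8, 2], [1, 9]],
    [[7, 3], [2, 8]],
    [[9, 1], [0, 10]],
]

pooled = pool_confusion_matrices(folds)
print(pooled)
```

Every test instance appears in exactly one fold, so the entries of the pooled matrix add up to the total number of instances in the dataset.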


One disadvantage of this approach is that it does not produce estimates of variance. The Experimenter uses averaging rather than pooling and it will give you estimates of variance.
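The difference can be sketched in a few lines of plain Python (again, not Weka code). The per-fold counts are made up, and using the sample standard deviation as the spread measure is an assumption for illustration, not a statement about what the Experimenter reports internally.

```python
from statistics import mean, stdev


def correct_and_total(matrix):
    """Return (correctly classified, total) counts for one confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct, total


# Hypothetical per-fold confusion matrices; the last fold is smaller on purpose.
folds = [
    [[8, 2], [1, 9]],
    [[7, 3], [2, 8]],
    [[5, 0], [0, 5]],
]

counts = [correct_and_total(m) for m in folds]

# Pooling: one accuracy figure from the summed counts, no spread available.
pooled_accuracy = sum(c for c, _ in counts) / sum(t for _, t in counts)

# Averaging: one accuracy per fold, so a mean and a spread can be computed.
fold_accuracies = [c / t for c, t in counts]
mean_accuracy = mean(fold_accuracies)
accuracy_stdev = stdev(fold_accuracies)

print(pooled_accuracy, mean_accuracy, accuracy_stdev)
```

With folds of equal size the pooled and averaged accuracies coincide; they differ here only because the third fold is smaller. In either case, only the per-fold view yields a spread.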





From: Edward Wiskers
Sent: Monday, 1 January 2018 1:40 PM
To: Weka machine learning workbench list.
Subject: Re: [Wekalist] Confusion matrix in the Explorer for cross-validation mode


Thanks Peter. 


Why the sum and not the average?






On 1 Jan 2018 7:03 a.m., "Peter Reutemann" <> wrote:

> In the Explorer, is the confusion matrix computed as the sum or average of
> the 10 confusion matrices (ie, sum or average of the confusion matrices of
> each of the 10 folds)?

Sum - from the 10 test folds.

Cheers, Peter
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174