1. After installing Auto-WEKA, I see that the WEKA 3.7.10
UI won't start. 3.7.13
is fine. I didn't try other versions.
Ok, thanks for reporting. I'll have a look, but fixing things for outdated
versions of WEKA isn't high on the priority list.
2. How are instance weights handled? As we know, some classifiers support
weights and others don't.
There's no special handling in Auto-WEKA -- if the classifier supports
weights, they'll be used; if not, they'll be ignored.
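In WEKA, weight-aware classifiers are marked by implementing the `weka.core.WeightedInstancesHandler` interface. A minimal sketch of that pattern, using hypothetical stand-in classes rather than real WEKA types:

```java
// Illustration of how instance weights end up used or ignored.
// WeightedTree/PlainTree are hypothetical stand-ins; in real WEKA the
// marker interface is weka.core.WeightedInstancesHandler.
interface WeightedInstancesHandler {}

class WeightedTree implements WeightedInstancesHandler {}  // supports weights
class PlainTree {}                                         // does not

class WeightCheck {
    // Weight a classifier effectively assigns to an instance with the given weight.
    static double effectiveWeight(Object classifier, double instanceWeight) {
        // Weight-aware classifiers honour the weight; all others treat
        // every instance equally (weight 1).
        return (classifier instanceof WeightedInstancesHandler) ? instanceWeight : 1.0;
    }

    public static void main(String[] args) {
        System.out.println(effectiveWeight(new WeightedTree(), 2.5)); // 2.5
        System.out.println(effectiveWeight(new PlainTree(), 2.5));    // 1.0
    }
}
```

So a classifier that lacks the marker interface silently behaves as if all weights were equal; no error is raised.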
3. I have a dataset with ~80K instances which, due to
class imbalance, I
sample down to ~26K instances with equal class distribution. Autoweka has
been running on this data for several hours and shows the following log:
Are you saying that it has been running for much longer than it should?
22:20:51: Base relation is now
init_ddp-weka.filters.unsupervised.attribute.Remove-R1 (77289 instances)
22:22:13: Command: weka.filters.supervised.instance.SpreadSubsample -M 1.0
-X 0.0 -S 1
22:22:13: Base relation is now
22:23:06: Started Auto-WEKA for
22:38:35: Performed 10 evaluations, estimated error 37.03867403314917...
22:59:00: Performed 20 evaluations, estimated error 37.31123388581952...
02:03:58: Performed 30 evaluations, estimated error 37.46408839779006...
02:07:33: Performed 40 evaluations, estimated error 37.46408839779006...
04:24:21: Performed 50 evaluations, estimated error 37.46408839779006...
04:30:41: Performed 60 evaluations, estimated error 37.46408839779006...
06:15:54: Performed 70 evaluations, estimated error 37.46408839779006...
I would like to see additional logging that shows what has been done so far at
some level of detail (the models and parameters tried, for example), perhaps
controllable via a logging-level parameter.
The reason why we haven't enabled this by default is that you'll get a *lot* of
output which won't be relevant for most people. Auto-WEKA is also smart about
how classifiers are evaluated in that it doesn't necessarily evaluate all
configurations on the entire data. Internally, the data is partitioned into
training and test sets, and the training set further into 10 folds. If a
classifier has very bad performance, it may only be evaluated on one of these
folds.
What does that mean for the log? You'll get what looks like duplicate
evaluations that really aren't (which again makes the log larger) and a lot of
configurations that were only evaluated on parts of the data (i.e. if you try
the same configurations on the full data, you may get different results).
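The fold-by-fold evaluation described above can be sketched as follows. This is only an illustration of the idea (the `tolerance` cutoff rule is made up for this sketch), not Auto-WEKA's actual code:

```java
class RacingSketch {
    // foldErrors[f] is a configuration's error on cross-validation fold f
    // (Auto-WEKA partitions the training set into 10 such folds).
    // bestSoFar is the incumbent's error; tolerance is a hypothetical slack.
    static double racedError(double[] foldErrors, double bestSoFar, double tolerance) {
        double sum = 0.0;
        for (int fold = 0; fold < foldErrors.length; fold++) {
            sum += foldErrors[fold];
            double meanSoFar = sum / (fold + 1);
            // A clearly hopeless configuration is abandoned early: its
            // reported error is then based on only some of the folds.
            if (meanSoFar > bestSoFar + tolerance) {
                return meanSoFar;
            }
        }
        return sum / foldErrors.length; // full cross-validation estimate
    }

    public static void main(String[] args) {
        // Bad configuration: abandoned after the first fold.
        System.out.println(racedError(new double[]{0.9, 0.9, 0.9}, 0.3, 0.1));
        // Competitive configuration: evaluated on all folds.
        System.out.println(racedError(new double[]{0.2, 0.3, 0.1}, 0.3, 0.1));
    }
}
```

This is why the same configuration can appear more than once in the log with different error estimates: a partial, few-fold estimate isn't directly comparable to a full one.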
In addition, there will be some configurations that are invalid because of
particular properties of the data or weird interactions with other parts of
WEKA. So you'll get some configurations in the log that couldn't be evaluated.
While this information would be available in the log, it makes interpreting the
log considerably more difficult.
That said, we'll certainly consider adding such a parameter in future releases
(I've opened an issue at https://github.com/larskotthoff/autoweka/issues/17),
but we'll have to think about the best way of implementing it such that
interpreting the logs doesn't become too difficult.