We reported a couple of bugs in Jira, but we are no longer sure
whether it is actively used.
With Pentaho's sale to Hitachi and the subsequent departure of Mark
Hall from Pentaho, I doubt that this is still actively being used.
So we are posting them here as well:
- Tree-based classifiers crash with data close to MaxDouble
- BayesNet crashes with many starved categories
- Bug in SMO or Standardization with values close to MaxDouble
- Number of KMeans clusters in EM is not checked
crashing with FilteredDistance for MaxDouble
The issues contain all details for reproduction.
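
For readers wondering why values close to MaxDouble are troublesome at all, here is a minimal sketch in plain Java. It is not taken from the Jira issues and does not call any Weka code; the class and method names are hypothetical, and whether this is the actual mechanism behind the reported crashes is only an assumption. It shows that a naively accumulated mean overflows to Infinity, while an incremental update stays finite (possibly at some cost in precision).

```java
// Hypothetical illustration only: not Weka code and not a reproduction
// from the reported issues.
final class MaxDoubleSketch {

    // Naive mean: the running sum overflows to Infinity once it exceeds
    // Double.MAX_VALUE, so the result is Infinity as well.
    static double naiveMean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Incremental mean: each value and the current estimate are scaled
    // by 1/n before they are combined, so no large sum is ever formed.
    static double runningMean(double[] values) {
        double mean = 0.0;
        for (int i = 0; i < values.length; i++) {
            int n = i + 1;
            mean += values[i] / n - mean / n;
        }
        return mean;
    }

    public static void main(String[] args) {
        double[] data = {Double.MAX_VALUE, Double.MAX_VALUE};
        System.out.println(naiveMean(data));   // prints Infinity
        System.out.println(runningMean(data)); // prints 1.7976931348623157E308
    }
}
```

Any later computation that receives the Infinity (or a NaN derived from it, e.g. Infinity - Infinity) can then fail in ways that look unrelated to the input data.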
Two questions for the future:
- Should we report similar bugs or are these uninteresting corner
cases? I got another one, with a possibly preventable underflow in
IMHO, these edge cases are unlikely to occur for most users. However,
you could provide patches against the trunk branch of Weka's
Subversion repo to fix the issues (assuming they are not detrimental
to performance). Always happy to receive bug fixes. ;-)
- Where should we report them?
On this mailing list.
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 577-5304