I saw that Weka has its own random forest implementation,
following Leo's.
These days I am using Leo's implementation, and I am wondering whether
Weka's follows it exactly or whether there are some differences.
I am not aware of any differences, but I might be mistaken...
One of my concerns is the maximum number of levels for a categorical
variable. R's randomForest cannot handle categorical variables with
more than 32 levels.
Weka doesn't have this limitation.
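For context, here is a minimal sketch of why a 32-level cap can arise. As far as I understand it, R's randomForest stores a categorical split as an integer whose binary expansion says which levels go to which child, so the number of levels is bounded by the number of bits available. The class and method names below (CategoricalSplit, encode, goesLeft) are hypothetical and made up for illustration; they are not taken from R's randomForest or from Weka.

```java
// Illustrative sketch only: a categorical split packed into one 32-bit int.
// Assumption: bit i set means "level i goes to the left child".
final class CategoricalSplit {

    // A Java int has 32 bits, so at most 32 levels can be represented.
    static final int MAX_LEVELS = Integer.SIZE;

    // Rejects level indices that cannot be stored in the bitmask.
    private static void checkLevel(int level) {
        if (level < 0 || level >= MAX_LEVELS) {
            throw new IllegalArgumentException(
                "level " + level + " does not fit in a " + MAX_LEVELS + "-bit mask");
        }
    }

    // Packs the indices of the levels that go left into a single int.
    static int encode(int[] leftLevels) {
        int mask = 0;
        for (int level : leftLevels) {
            checkLevel(level);
            mask |= 1 << level;
        }
        return mask;
    }

    // Returns true if the given level is routed to the left child.
    static boolean goesLeft(int mask, int level) {
        checkLevel(level);
        return ((mask >>> level) & 1) == 1;
    }

    public static void main(String[] args) {
        int mask = encode(new int[] {0, 3, 31});
        System.out.println(goesLeft(mask, 3)); // true
        System.out.println(goesLeft(mask, 4)); // false
    }
}
```

A representation that keeps an explicit set of levels per branch, or one branch per level, would not be tied to the width of an integer. Whether Weka does exactly that is something to confirm in its source.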
I think this question should be addressed directly to the author of
Weka's random forest, but I just cannot find his/her address so I
Just have a look at the Javadoc, especially the @author tag.
A general suggestion: Weka should add more information on this so
that it is easier for people to use, such as the paper it follows,
the authors' contact info, or even documentation of the pseudocode and
implementation. I believe these would be very helpful.
You'll find paper references in the Javadoc as well as in the GUI in the
GenericObjectEditor (click the "More" button) when you choose a
classifier. In the case of RandomForest, it is:
Leo Breiman. "Random Forests". Machine Learning 45 (1):5-32, October 2001.
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
+64 (7) 838-4466 Ext. 5174