How can I perform attribute selection in KnowledgeFlow using the Ranker
search method and InfoGainAttributeEval? I could not connect Ranker to any
component, including InfoGainAttributeEval.
Note that I am not interested in using AttributeSelectedClassifier here.
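For reference, the calculation that an information-gain evaluator paired with a ranker performs can be sketched in a few lines. This is only an illustration of the technique for nominal attributes, not Weka's implementation and not a KnowledgeFlow layout; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def info_gain(values, labels):
    """Information gain of one nominal attribute with respect to the class."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [c for x, c in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def rank_attributes(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    scores = {name: info_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: two nominal attributes and a binary class.
columns = {
    "outlook": ["sunny", "sunny", "rainy", "rainy"],
    "windy": ["yes", "no", "yes", "no"],
}
labels = ["0", "0", "1", "1"]
ranking = rank_attributes(columns, labels)
print(ranking)
```

In this toy data "outlook" separates the classes perfectly while "windy" carries no information, so "outlook" is ranked first.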
I am using the WEKA GUI (3.8.5) for a classification task. I have built several classification models that classify each instance as 0 or 1.
The training set consists of 791 instances for class 1 and 18939 instances for class 0.
The test set consists of 993 instances for class 1 and 3939 instances for class 0.
The MCC values on the training and test sets are above 0.95 for all classification models.
After saving the models, I have loaded them to make predictions for unlabeled data with 84656 instances.
I followed the steps below:
1. prepare unlabeled test data with notepad: insert '?' in class labels
2. load train data in preprocess window
3. load a saved model in result list in classify window
4. load unlabeled test data on supplied test set
5. click 'more options' and choose 'Plaintext' for the output predictions
6. click 're-evaluate model on current test set'
However, all models predicted every instance in the unlabeled test data as class 1.
In the summary section, all instances were counted under 'ignored class unknown instances'.
Is the model simply overfitted to the training data, or is this due to a mistake in my workflow?
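Step 1 of the workflow above (replacing the class labels with '?') can be sketched as a small script instead of hand-editing in Notepad. This is a minimal sketch under stated assumptions: the file is a dense ARFF file, the class is the last attribute, and no value contains a quoted comma. The function name and the example data are hypothetical. The header is copied unchanged so that the attribute declarations still match the training file.

```python
def mask_class_labels(arff_text):
    """Replace the last field of every data row with '?'.

    Assumes a dense ARFF file whose class is the last attribute and whose
    values contain no quoted commas. Header lines are copied unchanged.
    """
    out = []
    in_data = False
    for line in arff_text.splitlines():
        stripped = line.strip()
        if not in_data:
            out.append(line)
            if stripped.lower() == "@data":
                in_data = True
        elif not stripped or stripped.startswith("%"):
            out.append(line)  # keep blank lines and comments as they are
        else:
            fields = stripped.split(",")
            fields[-1] = "?"  # mask the class value
            out.append(",".join(fields))
    return "\n".join(out) + "\n"

# Hypothetical example with one numeric attribute and a binary class.
example = """@relation demo
@attribute x numeric
@attribute class {0,1}
@data
1.5,0
2.5,1
"""
masked = mask_class_labels(example)
print(masked)
```

With every class value set to '?', there is no ground truth to score against, so only the per-instance prediction output is informative for such a file.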
Good day to everyone
My question is whether it is possible to add outliers to
regression data using the Weka Explorer.
There are already some outliers in my data, but I want outliers at specific
levels, i.e. 20%, 40%, 70% and 100% differences from the original
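One way to produce such data outside the Explorer is to perturb the target column before loading it. The sketch below assumes that "difference from the original" means shifting a value by that percentage of itself, and that a fixed fraction of rows should be affected; both are assumptions, and the function name, the fraction and the sample values are hypothetical.

```python
import random

def inject_outliers(targets, fraction, pct_change, seed=0):
    """Return a copy of `targets` in which a random `fraction` of the values
    is shifted by `pct_change` percent of its original value."""
    rng = random.Random(seed)
    n_outliers = round(len(targets) * fraction)
    chosen = set(rng.sample(range(len(targets)), n_outliers))
    return [
        y * (1 + pct_change / 100.0) if i in chosen else y
        for i, y in enumerate(targets)
    ]

# Hypothetical target column; 10% of the rows become outliers at each level.
targets = [10.0, 12.0, 9.5, 11.0, 10.5, 13.0, 9.0, 12.5, 11.5, 10.0]
for pct in (20, 40, 70, 100):
    noisy = inject_outliers(targets, fraction=0.1, pct_change=pct)
    changed = [(a, b) for a, b in zip(targets, noisy) if a != b]
    print(pct, changed)
```

Using the same seed for every level perturbs the same rows each time, which keeps the four datasets comparable.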
Looking at the documentation [https://www.cs.waikato.ac.nz/ml/weka/Witten_et_al_2016_appendix.pdf], it seems that whenever a MultiClassClassifier is used, the multi-class problem is treated as a binary one (I guess by comparing one class vs. the rest). However, there is no mention of how the problem is handled without tuning the base classifier. To be clear:
1. If I use, e.g., a Random Forest or J48 classifier for multi-class classification, is the problem treated as a binary one by default?
2. How can I tell whether the problem is handled as binary or multi-class?
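The "one class vs. the rest" reading guessed at above can be illustrated with a short sketch. It only shows how a multi-class label column is turned into one binary problem per class; it is not Weka code, and the function name and labels are hypothetical.

```python
def one_vs_rest(labels):
    """Map each class to the binary label list used for its sub-problem."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

# Hypothetical three-class label column.
labels = ["a", "b", "c", "a", "b"]
binary_problems = one_vs_rest(labels)
for cls, targets in binary_problems.items():
    print(cls, targets)
```

A decomposition like this is only needed for base learners that are inherently binary; decision-tree learners can separate more than two classes directly.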
First: Weka is a great tool, congratulations!
I am testing different algorithms, and CatBoost is quite interesting for some lines of work.
I can run XGBoost inside Weka without problems (via R-Plugin). Is there a way to do something similar with CatBoost?
Perhaps via wekaPython? If so, I would appreciate an example, just toy code.
Thanks again to the Weka devs!