I have a dataset with somewhat unbalanced classes. Class A has roughly 10
times as many instances as class B. Let R be the raw data set, and let S
be the manually boosted dataset, where I have included 10 copies of every
instance of class B, to even out the classes and force the classifier to
try to separate the classes. As is to be expected, I get very different
classifiers if I run SimpleLogistic on R vs. S. Let Model M be the model
built on dataset R, and let Model N be the model built on dataset S.
Assuming we ultimately calibrate the predictions of Model N using isotonic
regression, is it a bad idea to manually boost when using SimpleLogistic?
Should the parameters to SimpleLogistic be adjusted to help prevent
LogitBoost from overfitting? How unbalanced should classes be before you
recommend using a manual boosting method such as this?
Just wanting some answers from people with more experience than me.
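For reference on the manual boosting described above, here is a minimal sketch in plain Python (not Weka code; the function name correct_oversampled_probability is hypothetical). It assumes every class B instance is replicated k times and that the model trained on S gives a reasonably well-fitted probability for class B. Under those assumptions, replication multiplies the odds of class B by k, so dividing the odds by k maps a score back to the class balance of R.

```python
def correct_oversampled_probability(p_s: float, k: float) -> float:
    """Map a class-B probability from a model trained on S back to R's class balance.

    Assumes class B was replicated k times in S, which multiplies the odds of B by k.
    """
    if not 0.0 <= p_s <= 1.0:
        raise ValueError("p_s must be in [0, 1]")
    if k <= 0:
        raise ValueError("k must be positive")
    # odds_R = odds_S / k, rewritten directly in terms of probabilities
    return p_s / (p_s + k * (1.0 - p_s))


# A score of 0.5 on the boosted set S corresponds to odds of 1:10 on R.
print(correct_oversampled_probability(0.5, 10))
```

This only describes the shift in class priors; whether Model N separates the classes better than Model M is a separate question that needs evaluation on data with the original class balance.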
At a minimum, you need to know the structure of the training data in order
to initialise an incremental classifier in Weka. In other words, the
buildClassifier() method needs to be called once with an empty set of
Instances (just attribute information defined). After this, you can call
updateClassifier() with individual instances.
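The pattern described above (initialise from structure only, then feed instances one at a time) can be sketched in plain Python. This is not Weka's API: the class IncrementalLogisticSGD and its build(), update() and predict_proba() methods are hypothetical stand-ins for buildClassifier() on an empty set of Instances and updateClassifier() on single instances, and the attribute names are made up.

```python
import math


class IncrementalLogisticSGD:
    """Toy incremental learner: build() sees only the schema, update() sees one instance."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.attribute_names = None
        self.weights = None  # unknown until the schema is supplied

    def build(self, attribute_names):
        # Counterpart of building on an empty dataset: only the structure
        # is known, so all weights start at zero.
        self.attribute_names = list(attribute_names)
        self.weights = [0.0] * (len(self.attribute_names) + 1)  # last entry is the bias

    def predict_proba(self, x):
        # zip stops after len(x) pairs, so the bias entry is added separately.
        z = self.weights[-1] + sum(w * v for w, v in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # Counterpart of a per-instance update: one gradient step on the log loss.
        if self.weights is None:
            raise RuntimeError("call build() with the schema first")
        if len(x) != len(self.attribute_names):
            raise ValueError("instance does not match the schema")
        error = y - self.predict_proba(x)
        for i, v in enumerate(x):
            self.weights[i] += self.learning_rate * error * v
        self.weights[-1] += self.learning_rate * error


model = IncrementalLogisticSGD()
model.build(["width", "height"])  # no training instances needed yet
for x, y in [([1.0, 0.0], 1), ([0.0, 1.0], 0)] * 50:
    model.update(x, y)  # instances arrive one at a time
print(model.predict_proba([1.0, 0.0]))
```

The point of the sketch is the ordering: the schema has to be fixed before the first instance arrives, but no instance has to be available at that time.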
From: <wekalist-bounces(a)list.waikato.ac.nz> on behalf of Yanchao YU
Reply-To: "Weka machine learning workbench list."
Date: Monday, 9 November 2015 10:09 pm
To: "Weka machine learning workbench list." <wekalist(a)list.waikato.ac.nz>
Subject: [Wekalist] Questions related to build an incremental Clasisfier
> Hi All,
> I am new to Weka and am using it in my project. I am trying to implement an
> incremental classifier (SGD) for learning unknown objects. I followed a
> tutorial on building a classifier from existing data instances
> (http://www.slideshare.net/vrohit13/weka-incremental-learning). However, in my
> situation, I do not have any training samples at the start. Is it
> possible to build a classifier with no instances, or with only a few?
> My code is shown below:
> _isExisted = true;
> Thanks for your help in advance.
I am building an incremental SGD classifier for my project, and I would like to output a probability for each instance. However, I only receive binary results, like below:
Does anyone know what is happening here? Do I need to set some parameters when building the classifier to get probabilistic results?
Many thanks for your help in advance,
I have a training file with numerical features and my target label has 2
I use Weka 3.7.12 and have installed gridSearch via the package manager.
Using grid search with SMO gives the following error:
java -cp /users/amita/software/weka-3-7-12/weka.jar weka.Run
-t /Users/amita/trainingdata_balanced.arff -x 10 -c last -E ACC
-y-property classifier.kernel.gamma -y-min -3.0 -y-max 3.0 -y-step 1.0
-y-base 10.0 -y-expression pow\(BASE,I\) -x-property classifier.c -x-min
-3.0 -x-max 3.0 -x-step 1.0 -x-base 10.0 -x-expression pow\(BASE,I\)
-sample-size 100.0 -traversal ROW-WISE -log-file /Users/amita -num-slots 1
-S 1 -W weka.classifiers.functions.SMO -- -C 1.0 -L 0.001 -P 1.0E-12 -N 0
-V -1 -W 1 -K "weka.classifiers.functions.supportVector.RBFKernel -G 0.01
java.beans.IntrospectionException: Method not found: isClassifier
If I execute only SMO, I do not get any error.
The following command works.
java -cp /users/amita/software/weka-3-7-12/weka.jar
-C 1.0 -L 0.001 -P 1.0E-12 -N 0 -V -1 -W 1 -K
-E 1.0 -C 250007" -t /Users/amita/trainingdata_balanced.arff -x 10 -c last
Any help is highly appreciated. Thanks for your time.
Graduate Student Researcher
Natural Language and Dialogue Systems Lab
Baskin School of Engineering
University of California Santa Cruz
The JavaDoc for the interface weka.classifiers.Classifier says:
For classifyInstance: "Note that a classifier MUST implement either this or distributionForInstance()."
For distributionForInstance: "Note that a classifier MUST implement either this or classifyInstance()"
So does this mean one has to expect that either of the two might not be implemented?
What is the best way to detect this? There does not seem to be a capability
and the JavaDoc does not say anything about what the expected result is when
the method is not implemented.
Do all the algorithms that come with the Weka standard distribution (current development)
implement both methods correctly?
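As far as I understand it, weka.classifiers.AbstractClassifier supplies a default for each of the two methods written in terms of the other, which is worth checking against the source of the Weka version in use. The plain-Python sketch below illustrates that pattern only; it is not Weka code, and the class names BaseClassifier, ThresholdClassifier and CoinClassifier are hypothetical.

```python
class BaseClassifier:
    """Toy base class: each method has a default written in terms of the other."""

    num_classes = 2

    def classify_instance(self, x):
        # Default: index of the most probable class.
        dist = self.distribution_for_instance(x)
        return max(range(len(dist)), key=dist.__getitem__)

    def distribution_for_instance(self, x):
        # Default: all probability mass on the predicted class.
        dist = [0.0] * self.num_classes
        dist[self.classify_instance(x)] = 1.0
        return dist


class ThresholdClassifier(BaseClassifier):
    # Overrides only the hard prediction; the inherited distribution still works.
    def classify_instance(self, x):
        return 1 if x > 0 else 0


class CoinClassifier(BaseClassifier):
    # Overrides only the distribution; the inherited hard prediction still works.
    def distribution_for_instance(self, x):
        return [0.3, 0.7]


print(ThresholdClassifier().distribution_for_instance(5))
print(CoinClassifier().classify_instance(None))
```

In this sketch a subclass that overrides neither method would recurse without end, which is one reading of why the JavaDoc requires at least one of the two. A subclass that overrides only the hard prediction yields a distribution with all mass on one class, so a 0/1 distribution is not by itself a sign of an error.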
I've built a decent predictor in Weka of whether two sperm whale flukes are
a match in two separate images. When I take a similar approach with dolphin
dorsals, I get great model fits but terrible prediction power. Many, many
attempts at filtering, reducing attributes, switching models, examining my
algorithms, etc. have not resolved the situation.
I am a newbie. I am RTFM'ing...but there's a lot of FM!
I have ARFF, data visualization pages, and lots of insight into my data.
Are you an expert Weka user? Interested in being a co-author? I could use
your help as a specialist. Please contact me at jason(a)wildme.org.
View this message in context: http://weka.8497.n7.nabble.com/Weka-Help-for-Dolphins-tp35880.html
Sent from the WEKA mailing list archive at Nabble.com.
I have a question which may be simple for you.
I want to use a simple recurrent network (SRN) algorithm for my data but, as
far as I know, Weka doesn't have this algorithm. Should I use a multilayer
perceptron instead of an SRN?
Or, more specifically, what is the difference between the two?
Sorry to bother you with this question, but I couldn't find anything about
Weka including SRN.
Enes Avcu, M.Sc.
Department of Linguistics and Cognitive Science
University of Delaware
I'm evaluating a dataset using a filtered classifier and a number of
different classification algorithms.
For each of these, I'd like to view the actual model parameters that were
trained.
For example, I'd like to see the coefficients of each feature in
logistic regression, the variable importance of each feature in the
RF model, etc.
Looking at the Weka results and the visualization tab, I don't see how to
obtain these. Can anyone help me understand where they can be found? :-(