I am performing attribute selection using the Wrapper with J48. I
tested with 23 attributes and it took about 8 hours to finish. Now my data
has grown to over 120 features, so imagine the time needed. The worst thing is
not the time but not seeing any progress while waiting.
Is there any way I can implement this process in my own code, or use the
command line, so that I can:
- see the number of attribute subsets to be searched
- see every subset that has been traversed
- see the result of each subset tested
I know that my only option, if the above is not doable, is to perform
unsupervised feature reduction (preprocessing) and then use
the wrapper, but I wanted to see whether it is possible to keep track of
the wrapper's progress.
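One way to get this visibility is to drive the subset search yourself rather than through Weka's AttributeSelection, printing each candidate subset and its score as it is evaluated. Below is a minimal, self-contained sketch of greedy forward selection with progress logging; the `score` function here is a hypothetical stand-in for a cross-validated J48 evaluation (in Weka it could be backed by WrapperSubsetEval's evaluateSubset()).

```java
import java.util.*;
import java.util.function.Function;

public class ForwardSelect {
    // Greedy forward selection: each round, try adding every remaining
    // attribute, log the candidate subset and its score, keep the best.
    static List<Integer> search(int nAttrs, Function<Set<Integer>, Double> score) {
        Set<Integer> current = new TreeSet<>();
        double best = Double.NEGATIVE_INFINITY;
        boolean improved = true;
        while (improved) {
            improved = false;
            int bestAttr = -1;
            for (int a = 0; a < nAttrs; a++) {
                if (current.contains(a)) continue;
                Set<Integer> cand = new TreeSet<>(current);
                cand.add(a);
                double s = score.apply(cand);
                System.out.println("tested " + cand + " -> " + s); // progress line
                if (s > best) { best = s; bestAttr = a; improved = true; }
            }
            if (improved) current.add(bestAttr);
        }
        return new ArrayList<>(current);
    }

    public static void main(String[] args) {
        // Toy score: attributes 1 and 3 are useful, extra attributes cost a bit.
        Function<Set<Integer>, Double> score = s ->
            (s.contains(1) ? 1.0 : 0.0) + (s.contains(3) ? 0.5 : 0.0) - 0.01 * s.size();
        System.out.println("selected: " + search(5, score)); // selected: [1, 3]
    }
}
```

Because each round evaluates at most one subset per unselected attribute, the log also tells you how many evaluations remain per round, which is exactly the progress information missing from the GUI.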
From: wekalist-bounces(a)list.waikato.ac.nz <wekalist-bounces(a)list.waikato.ac.nz> on behalf of wekalist-request(a)list.waikato.ac.nz <wekalist-request(a)list.waikato.ac.nz>
Sent: Friday, March 4, 2016 6:00 PM
Subject: Wekalist Digest, Vol 157, Issue 12
1. Re: Release of Auto-WEKA (Lars Kotthoff)
2. Re: Multi-Label Classification or Single-Label Classification Problem? (Maria Kerthin)
Date: Fri, 4 Mar 2016 13:35:49 -0800
From: Lars Kotthoff <larsko(a)cs.ubc.ca>
To: Eibe Frank <eibe(a)waikato.ac.nz>
Cc: "Weka machine learning workbench list."
Subject: Re: [Wekalist] Release of Auto-WEKA
> I don't think it's worthwhile to do this. However, the Description.props file
> for the package should be changed:
> What about the evaluation process? When data is split into training and test
> sets for Auto-WEKA's internal evaluation, weights are probably not taken into
> account to determine the split? Also, error on the test set is probably
> computed by taking weights into account?
The weights are not taken into account for splitting. The error is computed
using WEKA's own methods.
> WEKA's evaluation module does take weights into account when computing
> performance statistics, but weights are not used to determine splits into
> training and test sets (i.e., in a 90/10 split, you cannot be sure that 90%
> of the weight mass is in the training set and the rest in the test set).
Sounds like Auto-WEKA is doing exactly the same thing :)
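The point about splits ignoring weights is easy to see with a small stand-alone example (plain Java, not Weka code): a 90/10 split by instance count can put almost all of the weight mass in the test set.

```java
public class WeightSplit {
    /** Fraction of total weight mass in train vs. test when the first
     *  trainCount instances (by position) go to training. */
    static double[] massFractions(double[] weights, int trainCount) {
        double train = 0, total = 0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i];
            if (i < trainCount) train += weights[i];
        }
        return new double[] {train / total, 1 - train / total};
    }

    public static void main(String[] args) {
        // 10 instances; the last one carries almost all of the weight.
        double[] weights = {1, 1, 1, 1, 1, 1, 1, 1, 1, 91};
        double[] f = massFractions(weights, 9); // 90/10 split by instance count
        // 90% of the instances, but only 9% of the weight mass, lands in training.
        System.out.printf("train mass: %.2f  test mass: %.2f%n", f[0], f[1]);
    }
}
```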
Date: Sat, 5 Mar 2016 06:39:45 +0800
From: Maria Kerthin <mariakerthin(a)gmail.com>
To: "Weka machine learning workbench list."
Subject: Re: [Wekalist] Multi-Label Classification or Single-Label Classification Problem?
If each class attribute is binary (e.g., 1 or 0), it is a multi-label
learning problem, while if a class attribute can take more than two values, it
is a multi-target learning problem. You need to know your data to select the
proper approach.
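For example, with two class attributes like those in the dataset quoted below, the difference is only whether each class attribute is binary or can take more values. In ARFF terms (attribute names and values here are hypothetical):

```text
% Multi-label: every class attribute is binary
@attribute depressionDiag {0,1}
@attribute comorbidWith   {0,1}

% Multi-target: a class attribute may take more than two values
@attribute depressionDiag {none,mild,major}
@attribute comorbidWith   {none,anxiety,physical}
```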
On 5 Mar 2016 02:58, "Blessing Ojeme" <bojeme(a)cs.uct.ac.za> wrote:
> Dear Weka users,
> Please, I need clarification on which classification problem my research
> falls into. I have a depression dataset in this format:
> patient 1, symp1, symp2, symp3, depression diag, depressionComorbidWith.
> I have 1090 data instances, 23 attributes (symp1 to symp23) and two class
> attributes (depression diag, depressionComorbidWith).
> My study is about separating depressive disorders from physical disorders
> using machine learning algorithms. My question: is this a multi-label
> classification problem or a single-label classification problem? Whatever
> classification problem it is, any suggestion on how to go about it would
> be appreciated.
> Thank you
> Blessing Ojeme
> PhD student, Dept of Computer Science,
> University of Cape Town, +27725574409
> Wekalist mailing list
> Send posts to: Wekalist(a)list.waikato.ac.nz
I apologize for a very awkward question. I made an effort to be more
descriptive, but honestly, even I am not too pleased with this description.
I have a dataset as follows,
Patient X, test type A, result1, result2, result3
Patient X, test type B, result1, result2
Patient Y, test type A, result1
Patient Y, test type B, result1, result2
Basically, each patient can have many test types, and for each test type,
each patient may have 1 to n actual tests performed.
Obviously, this data format is not going to work for Weka, so I need to
shrink it - for example, I need to come up with a way to take the
3 results from test type A (patient X) and merge them into one single
value (let's not argue about this - given the nature of the data, I can
confirm that this is valid).
Basically, if I can do that and have a single value for each patient and
test type combination, I can use that to process my dataset. As an example,
I'd need:
Patient id, test type A, test type b
id, value X, value Y
id, value c, value d
A colleague tells me that he uses multi-instance learning for the
above. I looked at multi-instance learning and tried it out. As far as I can
tell, this is NOT the purpose of multi-instance learning. Besides, multi-instance
learning is supervised, so it doesn't suit my needs.
I'm considering the MergeManyValues filter instead, as an alternative.
Could someone please confirm that I'm right that multi-instance learning is NOT
the way, and that I should be doing something like MergeManyValues?
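For the reshaping step itself (before Weka ever sees the data), a plain-Java sketch like the following may help. The field names and the choice of the mean as the merge function are assumptions; the right single value depends on your data.

```java
import java.util.*;

public class PivotTests {
    /** Collapse (patient, testType, result...) rows into one mean value per
     *  patient/testType pair, i.e. one wide row per patient. */
    static Map<String, Map<String, Double>> pivot(List<Object[]> rows) {
        // Gather all results per (patient, testType).
        Map<String, Map<String, List<Double>>> acc = new TreeMap<>();
        for (Object[] r : rows) {
            String patient = (String) r[0], type = (String) r[1];
            List<Double> vals = acc
                .computeIfAbsent(patient, p -> new TreeMap<>())
                .computeIfAbsent(type, t -> new ArrayList<>());
            for (int i = 2; i < r.length; i++) vals.add((Double) r[i]);
        }
        // Merge each result list into a single value (here: the mean).
        Map<String, Map<String, Double>> out = new TreeMap<>();
        acc.forEach((patient, byType) -> {
            Map<String, Double> merged = new TreeMap<>();
            byType.forEach((type, vals) -> merged.put(type,
                vals.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN)));
            out.put(patient, merged);
        });
        return out;
    }

    public static void main(String[] args) {
        List<Object[]> rows = List.of(
            new Object[]{"X", "A", 1.0, 2.0, 3.0},  // three type-A tests for X
            new Object[]{"X", "B", 4.0, 6.0},
            new Object[]{"Y", "A", 5.0},
            new Object[]{"Y", "B", 2.0, 4.0});
        System.out.println(pivot(rows));
        // {X={A=2.0, B=5.0}, Y={A=5.0, B=3.0}}
    }
}
```

Each inner map is then one "Patient id, test type A, test type B" row, which can be written out as CSV/ARFF for Weka.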
We have just released a new version of Auto-WEKA, which provides automatic
model selection and hyperparameter tuning -- you give it some data, and it goes
away and figures out what the best classifier (and the best options for that
classifier) is.
The package is available through the package manager. After installing, you can
use it through the new Auto-WEKA tab. Any feedback appreciated!
You can find more information at the project website at
http://www.cs.ubc.ca/labs/beta/Projects/autoweka/ and the source at
Hi Mr Eibe,
I can't run the FLDA and A2DE algorithms in Eclipse with JDK 8.
I get an error in the evaluation command.
Could you please send me the correct form of this code?
FLDA flda = new FLDA();
Evaluation eval = new Evaluation(newAttTrain);
eval.crossValidateModel(flda, newAttTrain, 10, new Random(1));
System.out.println("with attribute selection" + "\n" + "F-Score: " + eval.fMeasure(0));
I'm building a classifier using NaiveBayesUpdateable. I hope to calculate the final results based on the predicted probabilities. However, all probabilities I receive from the classifier are extremely close to 0 or 1.
There seems to be something wrong with the probability calculation. I expected to get a more natural prediction like "positive: 0.72, negative: 0.28" rather than a set of extreme values. I don't know whether there is anything wrong with my configuration or samples, as below:
NaiveBayesUpdateable _classifier = new NaiveBayesUpdateable();
Size of Training Examples: 500 (including 89 positive instances)
I would appreciate any advice about how to fix this.
Thanks for the help,
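For what it's worth, extreme probabilities are common with naive Bayes rather than necessarily a bug: the conditional-independence assumption multiplies one likelihood per attribute, so many small pieces of evidence compound into posteriors near 0 or 1, and correlated attributes make this worse by double-counting evidence. A small stand-alone illustration, not Weka code:

```java
public class NaiveBayesExtreme {
    /** Posterior for the positive class after nAttrs conditionally
     *  independent attributes, each with the same likelihood ratio,
     *  starting from equal priors. */
    static double posterior(int nAttrs, double likelihoodRatio) {
        // Log-odds add up, one term per attribute.
        double logOdds = nAttrs * Math.log(likelihoodRatio);
        return 1.0 / (1.0 + Math.exp(-logOdds));
    }

    public static void main(String[] args) {
        // 20 attributes, each only mildly favouring the positive class (1.5:1),
        // already push the posterior close to 1.
        System.out.printf("P(positive) = %.4f%n", posterior(20, 1.5));
    }
}
```

So with dozens of attributes, posteriors like 0.9999 are expected behaviour; if calibrated probabilities are needed, reducing redundant attributes or applying a calibration step tends to help more than changing the classifier's configuration.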
Hi, I am trying to convert my CSV file to ARFF using the Weka converters. While doing so, the converters treat attributes that have blank values as strings, which causes problems when applying the algorithms.
Hence I want to use the RemoveType filter which is available in the Weka GUI. In the Weka Java API, I used the filter the following way:
RemoveType rt = new RemoveType();
rt.setAttributeType(new SelectedTag(Attribute.STRING, RemoveType.TAGS_ATTRIBUTETYPE));
rt.setInputFormat(data);
Instances newData = Filter.useFilter(data, rt);
I am getting errors using useFilter. Also, can anyone explain the SelectedTag type? I had to add it since I was getting errors when I directly used the next statement.
Sent from http://weka.8497.n7.nabble.com