The problem I am having may be a trivial one. I have a total of 8 columns in
my dataset (the last one is the class attribute). I want to apply the
standardization filter to the first six attributes and leave the last two
attributes untouched. So far I know that I can apply the standardization
filter to all attributes except the class attribute. How can I avoid applying
the filter to the last two columns of my dataset?
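To make the goal concrete, here is a minimal plain-Java sketch of the computation being asked for. It does not use the Weka API: the class name, the method name, and the choice of the sample standard deviation (n - 1 in the denominator) are assumptions made for illustration only.

```java
import java.util.Arrays;

final class PartialStandardize {

    /**
     * Standardizes the first numToScale columns of data in place
     * (zero mean, unit sample standard deviation) and leaves the
     * remaining columns untouched. Constant columns are set to 0.
     */
    static void standardizeFirstColumns(double[][] data, int numToScale) {
        int n = data.length;
        for (int col = 0; col < numToScale; col++) {
            double sum = 0.0;
            for (double[] row : data) {
                sum += row[col];
            }
            double mean = sum / n;

            double squaredDeviations = 0.0;
            for (double[] row : data) {
                double d = row[col] - mean;
                squaredDeviations += d * d;
            }
            // Assumption: sample standard deviation (n - 1 in the denominator).
            double std = n > 1 ? Math.sqrt(squaredDeviations / (n - 1)) : 0.0;

            for (double[] row : data) {
                row[col] = std > 0.0 ? (row[col] - mean) / std : 0.0;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical data: 6 attributes to scale, 1 attribute to keep, 1 class column.
        double[][] data = {
            {1, 10, 5, 0, 2, 7, 100, 1},
            {2, 20, 5, 1, 4, 8, 200, 0},
            {3, 30, 5, 2, 6, 9, 300, 1}
        };
        standardizeFirstColumns(data, 6);
        for (double[] row : data) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Only the column range differs from standardizing everything; the last two columns are never read or written.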
Good morning everyone,
I am facing a problem with the minimum Redundancy Maximum Relevance (mRMR) method in Weka. I read a tutorial saying that I can find it under the Select attributes panel, using CfsSubsetEval as the Attribute Evaluator together with a Ranking Search Method. In the Ranking Search Method, I should leave all parameters unchanged except rankedMethod, and there select Peng's Max-Relevance and Min-Redundancy.
Are these steps correct? If not, can anyone explain the correct steps to me?
Thanks a lot!
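For readers unfamiliar with the criterion itself, the following is a minimal plain-Java sketch of greedy mRMR selection using the difference form (relevance minus mean redundancy with the features already selected). It is not the Weka implementation. The class and method names are hypothetical, and the mutual-information values are assumed to be precomputed and passed in.

```java
import java.util.ArrayList;
import java.util.List;

final class MrmrSketch {

    /**
     * Greedy mRMR selection with the difference criterion:
     * score(f) = relevance[f] - mean redundancy of f with the selected features.
     *
     * relevance[i]     : mutual information between feature i and the class
     * redundancy[i][j] : mutual information between features i and j
     * Returns the indices of the selected features in selection order.
     */
    static List<Integer> select(double[] relevance, double[][] redundancy, int k) {
        int n = relevance.length;
        boolean[] chosen = new boolean[n];
        List<Integer> selected = new ArrayList<>();

        while (selected.size() < Math.min(k, n)) {
            int best = -1;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (int f = 0; f < n; f++) {
                if (chosen[f]) {
                    continue;
                }
                double meanRedundancy = 0.0;
                for (int s : selected) {
                    meanRedundancy += redundancy[f][s];
                }
                if (!selected.isEmpty()) {
                    meanRedundancy /= selected.size();
                }
                double score = relevance[f] - meanRedundancy;
                if (score > bestScore) {
                    bestScore = score;
                    best = f;
                }
            }
            chosen[best] = true;
            selected.add(best);
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical values: features 0 and 1 are both relevant but highly redundant.
        double[] relevance = {0.9, 0.85, 0.3};
        double[][] redundancy = {
            {0.0, 0.8, 0.05},
            {0.8, 0.0, 0.1},
            {0.05, 0.1, 0.0}
        };
        System.out.println(select(relevance, redundancy, 2));
    }
}
```

The first pick is simply the most relevant feature; every later pick trades relevance against overlap with what has already been chosen.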
New versions of Weka are available for download from the Weka homepage:
* Weka 3.8.2 - stable version. It is available as ZIP, with Win32 installer, Win32 installer incl. JRE 1.8.0_152, Win64 installer, Win64 installer incl. 64 bit JRE 1.8.0_152 and Mac OS X application with Oracle 64 bit JRE 1.8.0_152.
* Weka 3.9.2 - development version. It is available as ZIP, with Win32 installer, Win32 installer incl. JRE 1.8.0_152, Win64 installer, Win64 installer incl. 64 bit JRE 1.8.0_152 and Mac OS X application with Oracle 64 bit JRE 1.8.0_152.
Stable 3.8 receives bug fixes and new features that do not include breaking API changes and maintain serialized model compatibility. 3.9 (development) receives bug fixes and new features that might include breaking API changes and/or render models serialized using earlier versions incompatible.
Pentaho data mining community documentation:
Packages for Weka>=3.7.2 can be browsed online at:
What's new in 3.8.2/3.9.2?
These releases include a *lot* of bug fixes and improvements. Some of these are detailed at
As usual, for a complete list of changes refer to the changelogs.
The Weka Team
I am getting the attached error log, which prevents me from opening WEKA.
I uninstalled and reinstalled WEKA, but to no avail, as the issue still
persists. Any suggestions on how to solve it?
What approach can be used to find the "empirical risk" of a particular
classifier in Weka? For instance, how is empirical risk minimization
carried out for logistic regression?
In sum, I need an implementation of empirical risk minimization for the
PART algorithm and the logistic regression algorithm.
I would appreciate it if any of the Weka developers could help me.
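As a point of reference, the empirical risk of a classifier is the average loss of its predictions over the training data. The sketch below computes it for two common losses in plain Java. It does not use the Weka API; the class and method names are hypothetical, and the predictions are assumed to have been obtained already from whichever classifier is being studied.

```java
final class EmpiricalRisk {

    /** Mean 0-1 loss: the fraction of training examples that are misclassified. */
    static double zeroOneRisk(int[] actual, int[] predicted) {
        int errors = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != predicted[i]) {
                errors++;
            }
        }
        return (double) errors / actual.length;
    }

    /**
     * Mean logistic (log) loss for binary labels in {0, 1}.
     * probOfOne[i] is the predicted probability that example i has label 1.
     */
    static double logLossRisk(int[] actual, double[] probOfOne) {
        double eps = 1e-15; // keep probabilities away from 0 and 1 to avoid log(0)
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double p = Math.min(1.0 - eps, Math.max(eps, probOfOne[i]));
            sum -= (actual[i] == 1) ? Math.log(p) : Math.log(1.0 - p);
        }
        return sum / actual.length;
    }

    public static void main(String[] args) {
        // Hypothetical labels and predictions on a training set of four examples.
        int[] actual = {1, 0, 1, 1};
        int[] predicted = {1, 1, 1, 0};
        double[] probOfOne = {0.9, 0.6, 0.8, 0.4};
        System.out.println("0-1 risk:  " + zeroOneRisk(actual, predicted));
        System.out.println("log risk:  " + logLossRisk(actual, probOfOne));
    }
}
```

The 0-1 version corresponds to the training-set error rate, while the log-loss version is the kind of quantity a logistic regression fit drives down (usually with a regularization term added).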
Thanks for your reply. Yes, the three subsets are selected at random from the overall dataset.
Date: Sat, 16 Dec 2017 18:42:36 +1300
From: Eibe Frank <eibe(a)waikato.ac.nz>
To: Weka machine learning workbench list.
Subject: Re: [Wekalist] Select attributes - why are the rankings
different for subsets of the same dataset?
Have you shuffled the data before you created the three subsets? The Randomize filter in WEKA can be used for that. Alternatively, you can use the RemoveFolds filter (configuring it for a three-fold cross-validation).
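The idea of shuffling before splitting can be sketched in plain Java as follows. This is only an illustration of the principle, not the behaviour of the Randomize or RemoveFolds filters; the class name, method name, and seed are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class ShuffleSplit {

    /**
     * Shuffles the row indices 0..n-1 with a fixed seed and deals them
     * round-robin into numFolds subsets of (nearly) equal size.
     */
    static List<List<Integer>> split(int n, int numFolds, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // A fixed seed makes the split reproducible.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % numFolds).add(indices.get(i));
        }
        return folds;
    }

    public static void main(String[] args) {
        // Ten rows split into three random subsets.
        for (List<Integer> fold : split(10, 3, 42L)) {
            System.out.println(fold);
        }
    }
}
```

Without the shuffle, subsets taken as contiguous blocks can differ systematically (for example by speaker or recording session), which is one way rankings could diverge.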
From: Ronan Flynn
Sent: Saturday, 16 December 2017 12:50 AM
Subject: [Wekalist] Select attributes - why are the rankings different for subsets of the same dataset?
I have a speech dataset that is divided into three subsets. There are approximately 90 attributes and the target is a numerical correlation value. I want to rank the attributes and have used the following:
Evaluator: weka.attributeSelection.WrapperSubsetEval -B weka.classifiers.functions.SMOreg -F 5 -T 0.01 -R 1 -E CORR-COEFF -- -C 0.0302 -N 0 -I "weka.classifiers.functions.supportVector.RegSMOImproved -T 0.001 -V -P 1.0E-12 -L 0.001 -W 1" -K "weka.classifiers.functions.supportVector.PolyKernel -E 1.0 -C 250007"
Search: weka.attributeSelection.GreedyStepwise -R -T -1.7976931348623157E308 -N -1 -num-slots 1
When I run the attribute selection on each of the three speech subsets I get three very different ranked lists. I would have expected the rankings for the three subsets to be similar given that they are taken from the same overall speech dataset. Can anyone suggest possible reasons as to why the rankings are so different for each of the three speech subsets?
Also, is it possible when doing the ranking to output the correlation for each attribute individually? I would like to see the correlation for the individual attributes.
Regards and thanks,
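On the question of per-attribute correlation, one simple quantity is the Pearson correlation between each attribute and the numeric target. The plain-Java sketch below computes it. Note that this is a different measure from the cross-validated CORR-COEFF merit reported by the wrapper evaluator above; the class and method names are hypothetical and no Weka API is used.

```java
import java.util.Arrays;

final class AttributeCorrelation {

    /** Pearson correlation between two equally long arrays (0 if either is constant). */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        double denom = Math.sqrt(varX * varY);
        return denom == 0.0 ? 0.0 : cov / denom;
    }

    /** Correlation of every attribute with the target; columns[j] holds attribute j. */
    static double[] correlations(double[][] columns, double[] target) {
        double[] result = new double[columns.length];
        for (int j = 0; j < columns.length; j++) {
            result[j] = pearson(columns[j], target);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: three attributes, four instances.
        double[][] columns = {
            {1, 2, 3, 4},
            {8, 6, 4, 2},
            {5, 5, 5, 5}
        };
        double[] target = {2, 4, 6, 8};
        System.out.println(Arrays.toString(correlations(columns, target)));
    }
}
```

Sorting attributes by the absolute value of this correlation gives a simple filter-style ranking to compare against the wrapper-based one.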
I am wondering whether calling filter.setInputFormat(dataset) right after
creating the filter technically differs from calling it just before
applying the filter. More specifically, does code 1 technically differ from
code 2?
Code 1:
StringToWordVector filter = new StringToWordVector();
filter.setInputFormat(trainingDataset); // called at the start
Code 2:
StringToWordVector filter = new StringToWordVector();
filter.setInputFormat(trainingDataset); // called at the end
I have a question.
It applies to multiple attributes, but the same answer would apply if we only consider the 'class' Attribute.
Can I create the class Attribute once and then add that very same attribute to multiple Instance sets?
Or is it better to make a shallow copy of that attribute each time and add the copy to the new instance set?
I have been trying to use the KnowledgeFlow in order to create an ARFF
file. However, every time I have tried (multiple times), it has failed
miserably. I am attaching a screenshot of the partial log (rather obscure
to me). Can you suggest a resolution? I am stuck and cannot progress in
Advanced Data Mining with WEKA.