Hi. I'm working on a regression problem where I have some attributes
which I see as useful, but I can't convince many learning algorithms of that.
The problem seems to be that a lot of algorithms expect attributes to
have a normal distribution or at least some distribution that is
somewhat bell-like. The attributes in question have a distribution much
more like an exponential or geometric distribution.
I tried discretizing the variables. I didn't like equal width bins for
such a distribution, but this made some of the variables "available" to
the algorithms. Using equal height bins was useful too, but not enough.
Also, I found important split points using DecisionStump recursively,
leading to a really good binning, but this was a manual process, not an
automated one.
Does anyone have good references or rules of thumb for dealing with this
sort of attribute?
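
For readers following this thread, here is a minimal sketch of two of the ideas mentioned above: compressing a long right tail with a log transform, and equal-height (equal-frequency) binning. It is plain Python, not Weka code; the function names and the synthetic exponential data are assumptions made for illustration only.

```python
import math
import random

def log_transform(values):
    """Compress a long right tail with log(1 + x); assumes all values >= 0."""
    return [math.log1p(v) for v in values]

def equal_frequency_edges(values, n_bins):
    """Return n_bins - 1 cut points so each bin holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

def assign_bin(value, edges):
    """Index of the bin the value falls into (0 .. len(edges))."""
    return sum(value >= e for e in edges)

# Synthetic, exponentially distributed attribute (illustration only).
random.seed(0)
skewed = [random.expovariate(1.0) for _ in range(1000)]

edges = equal_frequency_edges(skewed, 4)
counts = [0] * 4
for v in skewed:
    counts[assign_bin(v, edges)] += 1

logged = log_transform(skewed)
```

Equal-frequency bins come out the same whether they are computed before or after a monotone transform such as the log, so the transform mainly matters for algorithms that use the raw numeric values.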
For true time-series problems I suggest you look at the approaches typically used by econometricians. There are some excellent tools available. Perhaps you should look at Ox and the related systems available at http://www.nuff.ox.ac.uk/Users/Doornik/.
From: Gopal Annasundaram
Sent: Thu 18.9.2003 6:39
Subject: [Wekalist] time series
Based on weekly consumption of items, I'm trying to predict the inventory trend for x weeks into the future for each item. The consumption can vary based on seasonality, promotions, etc. Is there a WEKA class available to classify and predict time series data? (Something other than TClass by Waleed Kadous, posted in an earlier mail thread.)
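
One common workaround, independent of any particular WEKA class, is to turn the weekly series into ordinary regression rows with a sliding window, so that any regression learner can be tried on it. The sketch below is plain Python; the function name, the window length and the consumption numbers are all made up for illustration.

```python
def make_lagged_rows(series, n_lags, horizon=1):
    """Turn a series into (features, target) rows.

    features = the n_lags values before time t,
    target   = the value `horizon` steps after the last feature.
    """
    rows = []
    for t in range(n_lags, len(series) - horizon + 1):
        features = series[t - n_lags:t]
        target = series[t + horizon - 1]
        rows.append((features, target))
    return rows

# Made-up weekly consumption for a single item.
weekly_consumption = [12, 15, 14, 20, 22, 19, 25]

one_week_ahead = make_lagged_rows(weekly_consumption, n_lags=3, horizon=1)
two_weeks_ahead = make_lagged_rows(weekly_consumption, n_lags=3, horizon=2)
```

Seasonality or promotion indicators could be appended to each feature list as extra columns; whether that is enough for a given item would have to be checked on real data.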
I've tried to run KEA (a Weka-based keyphrase extractor) on a
collection of 2,000 documents. Sadly, it crashed after about one hour of
processing, returning an "out of memory" error.
Since I was thinking of using the WEKA platform for intensive document
processing, I wonder whether it can handle the amount of data to
process. I would like to know whether anybody else has had similar
experiences, or whether any of you knows more about WEKA's limitations.
I guess that these problems arise whenever I have a large number of
large instances. In my implementations I have to use either an external
database or external files to store intermediate processing results and
work in an almost
Any suggestions about it?
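
As an illustration of the external-files workaround described above, here is a rough sketch that processes documents in fixed-size batches and writes each batch's results to disk, so that only one batch is held in memory at a time. It is plain Python rather than KEA or WEKA code, and `process_document`, the batch size and the file naming are hypothetical placeholders.

```python
import json
import os

def batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def process_document(text):
    """Hypothetical stand-in for the expensive per-document step."""
    return {"length": len(text), "words": len(text.split())}

def run(documents, out_dir, batch_size=100):
    """Process documents batch by batch, writing each batch's results to its own file."""
    paths = []
    for i, batch in enumerate(batches(documents, batch_size)):
        results = [process_document(doc) for doc in batch]
        path = os.path.join(out_dir, f"batch_{i:05d}.json")
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(results, fh)
        paths.append(path)
    return paths
```

Passing a generator as `documents` keeps the input side lazy as well, so the whole collection never has to be loaded at once.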
___________________________Arturo Montejo Raez
ETT-EC-EX (CERN) http://cern.ch/amontejo
I am a graduate research student at West Virginia University.
I am using some of the methods in the WEKA tool.
I would like to know some details of the neural network used in WEKA, such as
what type of backpropagation algorithm is used and what the decay constant is, if any.
Thank you.
Sujan T.V. Parthasaradhi.
Does anybody have experience
with Weka and data that are
a mix of fixed and random attributes
and multiple states, as often used in
clinical statistics? For example:
Id  Gender  Profit/TimePeriod  TargetClass
1   girl    100                Normal
1   girl    120                Premium
2   boy     30                 Normal
2   boy     10                 Loosed
3   girl    25                 Normal
3   girl    47                 Premium
So my intention is to get transition probabilities for every person. The time
period (each row tracks one period) could be one month, so that
I know how probable it is that customer Id 1 stays in the
state Premium for the next period, or changes to the state Loosed.
Until now I have been working with Bayesian and hidden Markov approaches.
Many thanks for a starting point or more.
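
As a possible starting point, the simplest estimate is a first-order Markov chain: count how often one state follows another within the same Id and normalise the counts. The sketch below is plain Python, not a Weka class. It pools the transitions of all customers into one matrix, which is an assumption made here because two rows per person are too few to estimate per-person probabilities.

```python
from collections import Counter, defaultdict

# (Id, TargetClass) rows in time order, copied from the table above.
rows = [
    (1, "Normal"), (1, "Premium"),
    (2, "Normal"), (2, "Loosed"),
    (3, "Normal"), (3, "Premium"),
]

def transition_probabilities(rows):
    """Estimate P(next state | current state) from consecutive rows of the same Id."""
    sequences = defaultdict(list)
    for customer_id, state in rows:
        sequences[customer_id].append(state)

    counts = defaultdict(Counter)
    for states in sequences.values():
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1

    return {
        current: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for current, nexts in counts.items()
    }

probs = transition_probabilities(rows)
```

With the six rows above, the only observed starting state is Normal, followed twice by Premium and once by Loosed. Attributes such as Gender or Profit could be brought in by estimating one matrix per group, at the cost of needing more data per group.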
I'm currently developing a CRM module using the WEKA tool to extract
association rules that will eventually be used by a JESS expert system. I
have been looking at automating the updating of the JESS rules using the
rules contained in the concept description output by WEKA, and was wondering
if anyone has had any similar experience that I might benefit from?
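
To make the idea concrete, here is a rough sketch of such a conversion in plain Python. It rests on several assumptions: the rule line is modelled on the textual form "attr=value ... ==> attr=value ... conf:(...)", the `attr` fact name and the rule name are hypothetical, and the generated text follows the CLIPS-style `defrule` form that Jess uses, as far as I understand it. The real output format of your WEKA version should be checked before relying on the parser.

```python
import re

# One attribute=value pair, e.g. "outlook=overcast".
PAIR = re.compile(r"([\w-]+)=([\w-]+)")

def parse_rule(line):
    """Split 'a=x b=y 4 ==> c=z 4 conf:(1)' into antecedent and consequent pairs."""
    left, right = line.split("==>")
    return PAIR.findall(left), PAIR.findall(right)

def to_jess(name, antecedent, consequent):
    """Render one rule as defrule text using ordered (attr name value) facts."""
    lhs = " ".join(f"(attr {a} {v})" for a, v in antecedent)
    rhs = " ".join(f"(assert (attr {a} {v}))" for a, v in consequent)
    return f"(defrule {name} {lhs} => {rhs})"

# Assumed example line; support counts and confidence are ignored here.
line = "outlook=overcast 4 ==> play=yes 4 conf:(1)"
antecedent, consequent = parse_rule(line)
jess_rule = to_jess("rule-1", antecedent, consequent)
# jess_rule == "(defrule rule-1 (attr outlook overcast) => (assert (attr play yes)))"
```

Regenerating the whole rule file from the latest WEKA output, rather than patching individual rules, is probably the simplest way to keep the two in step.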
Looking through the list archive I also came across the mail concerning the
Academy project that appears to be analogous to my project. In the mail it
was mentioned that there were PMML output classes that may be implemented
into the next release of WEKA. I was wondering as to the status of this
implementation? Thanks in advance for any responses.
Dear Tony, hi!
My name is Andreas Symeonidis and I am a Research Associate at CERTH, Greece.
I am also the assigned contact person for Agent Academy, a platform for embedding intelligence extracted through data mining techniques.
As far as your question is concerned, I am happy to inform you that the platform is almost finished, so by the end of this week (or the next one at the latest) we will have released the beta version of our platform.
Agent Academy uses core functionalities of the WEKA suite to perform data mining, creates PMML documents, and uses JESS to incorporate this knowledge into JADE agents. Until now, we can use knowledge extracted through decision trees, association rules and clustering. Results seem quite promising...
Agent Academy has been uploaded to SourceForge (http://sourceforge.net/), so you can find all necessary information about the platform, as well as the source code, there.
Nevertheless, if you need any additional information, I would be most willing to help.
Best regards to all the wekans,