Re: [Wekalist] Re: Retail Application with WEKA - Please share your
by Hans van Rijnberk, Assort Vision, Utrecht
The type of forecasting meant here, I guess, spans 2 or 3 years at most, so
time trend analysis isn't of much practical use, while classification
analysis might be. The forecasting is then more like predicting the near
future by use of a static model of the near past. I am inclined to think of
classification trees, logistic regression, discriminant functions and
neural nets (and other techniques I am not familiar with) as more
appropriate for this task. It is, though, of utmost importance to select the
correct predictor variables (by attribute selection in Weka). I would try
the GUI version and prepare a training and a test set, processing them with a
model selection scheme with
- attribute selection
- classification (logistic regression, C4.5 (J48 in Weka) or a neural net) and
- the application of the same scheme or model to the test set.
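To make the three steps concrete, here is a minimal sketch in plain Java. It does not use the Weka API: the correlation ranking stands in for Weka's attribute selection, and the nearest-centroid rule stands in for J48 or logistic regression. The class name, the method names and the tiny shoe-sales data are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

public class SelectThenClassify {

    /** Absolute Pearson correlation between one attribute column and the 0/1 class label. */
    static double absCorrelation(double[][] x, int[] y, int col) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i][col]; my += y[i]; }
        mx /= n; my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i][col] - mx, dy = y[i] - my;
            sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
        }
        return (sxx == 0 || syy == 0) ? 0 : Math.abs(sxy) / Math.sqrt(sxx * syy);
    }

    /** Step 1: keep the k attributes most correlated with the class (training data only). */
    static int[] selectAttributes(double[][] x, int[] y, int k) {
        Integer[] idx = new Integer[x[0].length];
        for (int j = 0; j < idx.length; j++) idx[j] = j;
        Arrays.sort(idx, Comparator.comparingDouble((Integer j) -> -absCorrelation(x, y, j)));
        int[] keep = new int[k];
        for (int j = 0; j < k; j++) keep[j] = idx[j];
        return keep;
    }

    /** Step 2: per-class mean of the selected attributes (a nearest-centroid classifier). */
    static double[][] train(double[][] x, int[] y, int[] keep) {
        double[][] centroid = new double[2][keep.length];
        int[] count = new int[2];
        for (int i = 0; i < x.length; i++) {
            count[y[i]]++;
            for (int j = 0; j < keep.length; j++) centroid[y[i]][j] += x[i][keep[j]];
        }
        for (int c = 0; c < 2; c++)
            for (int j = 0; j < keep.length; j++) centroid[c][j] /= count[c];
        return centroid;
    }

    /** Step 3: apply the same attribute subset and model to an unseen row. */
    static int predict(double[][] centroid, int[] keep, double[] row) {
        double d0 = 0, d1 = 0;
        for (int j = 0; j < keep.length; j++) {
            d0 += Math.pow(row[keep[j]] - centroid[0][j], 2);
            d1 += Math.pow(row[keep[j]] - centroid[1][j], 2);
        }
        return d1 < d0 ? 1 : 0;
    }

    public static void main(String[] args) {
        // Made-up rows: {price, promotion flag, noise}; class 1 = "sold well".
        double[][] trainX = {{40, 1, 7}, {45, 1, 2}, {90, 0, 6}, {95, 0, 3}, {42, 1, 5}, {88, 0, 4}};
        int[] trainY = {1, 1, 0, 0, 1, 0};
        double[][] testX = {{44, 1, 9}, {92, 0, 1}};
        int[] testY = {1, 0};

        int[] keep = selectAttributes(trainX, trainY, 2);
        double[][] model = train(trainX, trainY, keep);
        int correct = 0;
        for (int i = 0; i < testX.length; i++)
            if (predict(model, keep, testX[i]) == testY[i]) correct++;
        System.out.println("kept attributes " + Arrays.toString(keep)
                + ", test accuracy " + correct + "/" + testX.length);
    }
}
```

The point of the design is that the attribute subset and the model are both derived from the training set only, and the test set is touched just once, at the end.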
At 14:11 4-4-2004 +1200, Sionep wrote:
>Harry Wells wrote:
>> Dear All,
>> I was wondering how data mining might be applicable to sales forecasting,
>> for example shoes.
>I am not sure if you can do forecasting using WEKA. Perhaps the WEKA
>team can clarify this for the list. As I understand the area of
>forecasting, based on my experience in numerical computing, you can
>attack this problem with different methodologies:
>- Use System Identification techniques (such as ARMA - Autoregressive
>  Moving Average)
>- Use Kalman Filtering techniques
>- Use Wavelet Analysis techniques
>- Use Hybrid Soft Computing techniques such
>  as ANFIS (Adaptive Neuro-Fuzzy Inference Systems) and
>  CANFIS (Co-Active Neuro-Fuzzy Inference Systems)
>- and many other methods.
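For readers who have not met the first item on this list, here is a minimal sketch of the simplest case, a first-order autoregressive model AR(1), which is ARMA with no moving-average part. It is plain Java and not taken from any of the toolboxes mentioned; the class name, method names and sales figures are made up for illustration.

```java
public class Ar1Forecast {

    /** Least-squares fit of x[t] = c + phi * x[t-1]; returns {c, phi}. */
    static double[] fit(double[] x) {
        int n = x.length - 1;              // number of (x[t-1], x[t]) pairs
        double meanPrev = 0, meanNext = 0;
        for (int t = 1; t < x.length; t++) { meanPrev += x[t - 1]; meanNext += x[t]; }
        meanPrev /= n; meanNext /= n;
        double cov = 0, var = 0;
        for (int t = 1; t < x.length; t++) {
            cov += (x[t - 1] - meanPrev) * (x[t] - meanNext);
            var += (x[t - 1] - meanPrev) * (x[t - 1] - meanPrev);
        }
        double phi = cov / var;
        return new double[] {meanNext - phi * meanPrev, phi};
    }

    /** Iterates the fitted recursion to forecast `steps` values past the end of x. */
    static double[] forecast(double[] x, int steps) {
        double[] p = fit(x);
        double[] out = new double[steps];
        double last = x[x.length - 1];
        for (int i = 0; i < steps; i++) {
            last = p[0] + p[1] * last;     // each forecast feeds the next one
            out[i] = last;
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up monthly sales figures.
        double[] monthlySales = {120, 132, 128, 141, 150, 147, 158, 163};
        for (double f : forecast(monthlySales, 3)) System.out.printf("%.1f%n", f);
    }
}
```

A full ARMA model would add a moving-average term over past forecast errors, which needs an iterative estimation procedure rather than this one-pass regression.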
>I have used all of the above in MATLAB. The System ID toolbox in
>MATLAB is very sophisticated. ANFIS & CANFIS are available in the Neural
>Network & Fuzzy Logic toolbox. I have written some Java classes for
>System ID time-series analysis, including forecasting. I included
>some wavelet classes to fine-tune the System ID algorithms. I developed
>this for a computational finance project I was involved in before, for
>the analysis and forecasting of stock-price movements. Wavelets do
>a very good job of decomposing the frequency components of the
>time series, which made it easier for the System ID algorithm to
>identify trends and shocks and to forecast to a good
>degree. I have not done any Java work on Kalman filtering yet, but it is
>still on my to-do list. I am currently trying to combine FuzzyJ (Fuzzy
>Logic Java Toolkit API, a commercial tool that can be
>downloaded from the internet) and JOONE (Java Object Oriented Neural
>Engine), which is an open source project in Java. My aim is to develop a
>Neuro-Fuzzy package where ANFIS & CANFIS would be available as well.
>Neuro-Fuzzy computing (soft computing in general) is very popular in
>data mining at the moment.
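As an illustration of what "decomposing the frequency components" means, here is a minimal sketch of one level of the Haar wavelet transform in plain Java. It is not the code referred to above; I am assuming the simple averaging form of the Haar step (no 1/sqrt(2) normalisation), and the class name, method names and price series are made up.

```java
import java.util.Arrays;

public class HaarStep {

    /** One Haar level: returns {approximation, detail}; input length must be even. */
    static double[][] decompose(double[] x) {
        int half = x.length / 2;
        double[] approx = new double[half], detail = new double[half];
        for (int i = 0; i < half; i++) {
            approx[i] = (x[2 * i] + x[2 * i + 1]) / 2;  // low-frequency part (local mean)
            detail[i] = (x[2 * i] - x[2 * i + 1]) / 2;  // high-frequency part (local change)
        }
        return new double[][] {approx, detail};
    }

    /** Inverse of decompose: rebuilds the original series from both parts. */
    static double[] reconstruct(double[] approx, double[] detail) {
        double[] x = new double[approx.length * 2];
        for (int i = 0; i < approx.length; i++) {
            x[2 * i] = approx[i] + detail[i];
            x[2 * i + 1] = approx[i] - detail[i];
        }
        return x;
    }

    public static void main(String[] args) {
        double[] prices = {10, 12, 11, 15, 14, 14, 18, 16};  // made-up series
        double[][] parts = decompose(prices);
        System.out.println("trend:  " + Arrays.toString(parts[0]));
        System.out.println("detail: " + Arrays.toString(parts[1]));
    }
}
```

The smoothed approximation series is what one would hand to a forecasting model for the trend, while the detail series isolates the short-term fluctuations. Applying decompose again to the approximation gives the next, coarser level.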
>There was an industrial mathematics week in January 2004 (from the 26th to
>the 30th) held at Auckland University, New Zealand, which I
>attended, and there were some presentations on forecasting. One showed how
>Neuro-Fuzzy techniques can predict weather patterns (time series)
>for a local power company (Transpower New Zealand). It is vital for this
>company to be able to forecast the power demand and the wind
>velocity in advance in order to operate its wind-farm
>generators optimally. Another group of mathematicians showed, using the
>same data, that Kalman filtering is also a robust method for forecasting.
>Another group used ARMA to do forecasting on exactly the same
>Transpower data. A local government research institute showed how
>to use wavelet techniques to forecast the frequency of earthquakes and
>where they are likely to occur. Now enough of that.
>The JDMAPI (Java Data Mining API), which is JSR-73, will most likely
>implement forecasting modules in the next version, version 2. This is a
>comment made to me by the spec lead of JSR-73, Mark Hornick of Oracle.
>Mark has also invited me personally to join the expert group for JSR-73,
>which is responsible for the development of JDMAPI. If I join this
>group (I have not decided yet), my main aim is to push for:
> - a multivariate statistical analysis sub-package
> - system ID algorithms
> - soft-computing hybrids (Neuro-Fuzzy, Neuro-Genetic, Fuzzy-Bayesian,
>   etc.)
> - numerical computing techniques such as digital filters, wavelets
>   and Kalman filtering
> - Rough Sets analysis
>I have developed a full Java statistical API (univariate & multivariate)
>for my own work, plus some wavelet and system ID modules, and maybe this work
>could be used as a basis, to be modified and included in JSR-73
>for version 2, if I ever join the expert group.
>Finally, there may be a way to do forecasting in WEKA, but I will
>leave that verdict to the WEKA team.
Hans van Rijnberk
Assort Vision (machine vision software & information services)
3524 KM Utrecht,
031 (0)30 2148681 / 2889531