I can run it on Win2k Client.
I used the following command:
C:\DJFHQ\Info_Extraction\Weka>java -classpath weka-3-2-3\weka-3-2-3\weka.jar
I have WekaMetal.jar in the CLASSPATH. I also used WinZip to "unzip" the JAR
files for Weka 3.2.3 and WekaMetal after downloading them.
My version of Java is the following:
java version "1.3.0_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1
Java HotSpot(TM) Client VM (build 1.3.0_02, mixed mode)
PS: I also noticed a possible typo in your quoted string, which should perhaps be
"G:\Program Files\Weka-3-2-3\weka.jar", i.e. the "-3" is missing.
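For reference, here is a fuller form of that command (a sketch: weka.gui.GUIChooser is the usual GUI entry point in Weka 3.2, and the paths are from my install, so adjust both to yours):

```bat
REM Windows command prompt: put both jars on the classpath (separated by
REM a semicolon on Windows) and name the class to run explicitly.
java -classpath weka-3-2-3\weka-3-2-3\weka.jar;WekaMetal.jar weka.gui.GUIChooser
```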
From: Chris Bacon [mailto:email@example.com]
Sent: Saturday, 17 August 2002 10:29 AM
Subject: [Wekalist] WekaMetal and Windows?
Has anyone been able to run WekaMetal on Windows? I've tried on Win2k
Server from a command prompt, where I get this:
G:\Program Files\WekaMetal>java -jar WekaMetal.jar:"G:\Program
Exception in thread "main" java.util.zip.ZipException: The filename,
directory name, or volume label syntax is incorrect
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(Unknown Source)
at java.util.jar.JarFile.<init>(Unknown Source)
at java.util.jar.JarFile.<init>(Unknown Source)
I've played around with the CLASSPATH, but it didn't help. I've also
tried running it from within IBM WebSphere Application Developer, where it gets an IO error
because it can't find the cache files (ap.cache and dc.cache), even though
they're both in the same directory as the WekaMetal.jar file.
Any help would be appreciated.
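A likely culprit: `-jar` takes exactly one jar file and ignores the classpath entirely, so everything after the colon is read as part of the file name, and Windows rejects the embedded `:` (hence the "volume label syntax is incorrect" ZipException). Something along these lines should work instead (the main class is a placeholder here, since I don't know WekaMetal's entry point):

```bat
REM Instead of "java -jar WekaMetal.jar:...", list both jars on the
REM classpath and name the class to run:
java -classpath WekaMetal.jar;"G:\Program Files\Weka-3-2-3\weka.jar" <MainClassOfWekaMetal>
```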
I am applying ID3 to a medical data set and was wondering if there is a way to
get Weka to output the results of the ID3 algorithm graphically so that the
Specialist can interpret the results easily?
I have written a classification scheme in Java using the Weka library.
Now, I want to use the 'Experimenter' for doing some experiments
with this new classification scheme.
(Up till now I have written my own experiments in Java, but the
Experimenter seems like a better option.)
In the README file it is indicated that the file
GenericObjectEditor.props is the place to be. I thus copied this
file to my home directory and I added ...
monotone.MinMaxExtension,\ #line added
monotone.OSDL #line added
This works, in the sense that in the Experimenter GUI
I can now choose 'MinMaxExtension' and 'OSDL',
but upon choosing one of these I get the error message
'could not create an example of weka.MinMaxExtension from the
current classpath' ...
I have to say that
1) I find it strange that the 'monotone' part is cut from the classname,
and that 'weka' is prefixed ...
2) the class 'MinMaxExtension' is part of the package 'monotone'.
I.e. the first line of 'MinMaxExtension.java' is 'package monotone;'
3) the file 'MinMaxExtension.class' is in a directory called
$SOMETHING/Source/monotone and this directory is part of my
classpath; in my .bashrc I have
Any hints on how to proceed are welcome. If possible I would like to
keep these files in the current package ....
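Two things worth checking here. First, in a Java .properties file a backslash only continues the line when it is the very last character on it, and '#' starts a comment only at the beginning of a line, so the trailing "#line added" comments become part of the value and break the continuation, which could well mangle the class names GenericObjectEditor sees. Second, since the class declares 'package monotone;', the classpath entry should be the directory *containing* monotone (i.e. $SOMETHING/Source), not $SOMETHING/Source/monotone itself. A cleaned-up fragment (the key name and surrounding entry are illustrative, from memory):

```properties
# GenericObjectEditor.props -- comments must be on their own line;
# a trailing "\" only continues the line when it is the LAST character.
weka.classifiers.Classifier=\
 weka.classifiers.ZeroR,\
 monotone.MinMaxExtension,\
 monotone.OSDL
```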
I'm trying to build a classifier. I have a dataset with around 260K
instances. Each instance has 4 attributes; the fourth attribute is the
class, but there are about 560 different classes. Will I have to list
all 560 different classes in the attribute section of the ARFF, or is
there a better way to do it?
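ARFF does require every value of a nominal attribute to be declared in the header, but nothing says the header has to be typed by hand. A minimal sketch (the class labels here are invented) that builds the declaration from the values actually present in the data:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ArffHeader {
    // Build the "@attribute <name> {...}" line from the distinct class
    // values seen in the data, in order of first appearance.
    public static String classAttribute(String name, String[] values) {
        Set<String> distinct = new LinkedHashSet<>();
        for (String v : values) distinct.add(v);
        return "@attribute " + name + " {" + String.join(",", distinct) + "}";
    }

    public static void main(String[] args) {
        String[] labels = {"flu", "cold", "flu", "measles"};
        System.out.println(classAttribute("class", labels));
    }
}
```

In practice you would read the 260K instances once, collect the distinct values of the fourth column, and emit the header followed by the data rows.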
I am working through the example of how to interact with WEKA
programmatically in chapter 8 of the book and am adapting it to use a Naïve
Bayes classifier.
The example relies on a hard-coded list of attributes which are described in
the comments as "Our (rather arbitrary) set of keywords".
My question: Why do I have to provide a set of keywords? Isn't it the job
of the naïve Bayes classifier to take the training data I have, which is
already sorted into "hit" and "miss" categories, and compute what the most
telling attributes are?
Here is how I understand how a classifier should work: I should be able to
feed it a bunch of files and for each one say "this category" or "that
category" and then, when I have fed it enough to train it, I should be able
to send it a new file and it should try and predict the best category for
me. I don't understand where the list of user-provided keywords comes into play.
Thanks to anyone who can help me with this. Sorry for being a newbie.
Decision Systems Group
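On the keyword question above: naive Bayes needs a fixed set of attributes before it sees any instance, so something has to turn free text into attributes, and that is what the keyword list in the book's example does. It doesn't have to be hand-picked, though; one common approach is to take the most frequent words in the training data as the attribute set. A rough sketch of that idea (not the book's code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Vocabulary {
    // Pick the k most frequent words across the training texts;
    // these become the (boolean or count) attributes the classifier uses.
    public static List<String> topWords(List<String> texts, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String t : texts)
            for (String w : t.toLowerCase().split("\\W+"))
                if (!w.isEmpty())
                    counts.merge(w, 1, Integer::sum);
        List<String> words = new ArrayList<>(counts.keySet());
        words.sort((a, b) -> counts.get(b) - counts.get(a));
        return words.subList(0, Math.min(k, words.size()));
    }

    public static void main(String[] args) {
        System.out.println(topWords(
                Arrays.asList("spam spam ham", "spam eggs eggs"), 2));
    }
}
```

Each training file is then converted to one instance over exactly this vocabulary, which is why the attribute list must be fixed up front.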
I am trying to use the Information Gain attribute selection through my Java code.
How do I do this?
If I use, for instance, DiscretizeFilter:
private Filter m_Filter = new DiscretizeFilter();
then before a new instance is added to the classifier, the instances are filtered:
Instances filteredData = Filter.useFilter(m_Data, m_Filter);
and the filtered data are used to update the classifier,
Just like the basic tutorial on Weka describes in the messageClassifier.java
Now I have to use the Information Gain filter. Is this applied to each instance, as
with the previous one? I think it is used after the classifier is built. Then it
evaluates the attributes and the classifier is rebuilt again??
Generally, I'm lost.. :<<
I will keep searching, but meanwhile I thought I'd make this post to ask
how to use information gain in my Java code. I am working on a text
classification problem. I have 1000 attributes: the most frequent words of a
folder of already categorized texts.
Thanks for your consideration.
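On the information-gain question above: attribute evaluation is normally done once on the whole training set, before building (or while rebuilding) the classifier, not on each incoming instance. The quantity itself is just the drop in class entropy when the data is split on an attribute. A plain-Java sketch of that calculation for a binary word-presence attribute (this illustrates the measure, it is not the Weka API):

```java
public class InfoGain {
    static double log2(double x) { return Math.log(x) / Math.log(2); }

    // Entropy of a class distribution given as counts per class.
    static double entropy(int[] counts) {
        int total = 0;
        for (int c : counts) total += c;
        double h = 0.0;
        for (int c : counts)
            if (c > 0) {
                double p = (double) c / total;
                h -= p * log2(p);
            }
        return h;
    }

    // Information gain of a binary attribute: class counts among the
    // documents where the word is present vs. where it is absent.
    static double infoGain(int[] present, int[] absent) {
        int nP = 0, nA = 0;
        for (int c : present) nP += c;
        for (int c : absent) nA += c;
        int n = nP + nA;
        int[] all = new int[present.length];
        for (int i = 0; i < all.length; i++) all[i] = present[i] + absent[i];
        return entropy(all)
             - ((double) nP / n) * entropy(present)
             - ((double) nA / n) * entropy(absent);
    }

    public static void main(String[] args) {
        // A perfectly predictive word has gain equal to the class entropy.
        System.out.println(infoGain(new int[]{5, 0}, new int[]{0, 5}));
    }
}
```

Ranking the 1000 word attributes by this score and keeping the top ones, then training on the reduced dataset, is the usual workflow.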
I have a question about using the coefficients of the WEKA Logistic
Regression model to compute the probability distribution over the target
classes.
I train a WEKA Logistic Regression model on some training data, and get
the coefficients which the model outputs. Then using those coefficients
I compute (outside of WEKA) the probability distribution over the target
classes for the instances of the training set. When I compare that to
the WEKA-produced probability distribution of the training set, there
are differences, sometimes major. It makes me think that I don't
understand how WEKA comes up with the probability distribution. Please
help, if you can.
Here is what I do:
My dataset has 9 attributes, and each one is discretized into 10 buckets
(effectively producing 90 binary attributes).
The target class is discretized into 4 classes/buckets.
I run the Logistic Regression classifier (maxIts=-1, ridge=0), and it
produces 3 sets of coefficients (90 in each set, plus the intercept), as
well as 3 sets of odds ratios (90 in each set).
For each training set instance (with attributes [x_1 .. x_90]), I
compute three log odds ratios, based on the three coefficient sets [c1_1
.. c1_90], [c2_1 .. c2_90], and [c3_1 .. c3_90]:
g1 = c1_0 + x_1*c1_1 + x_2*c1_2 + ... + x_90*c1_90
g2 = c2_0 + x_1*c2_1 + x_2*c2_2 + ... + x_90*c2_90
g3 = c3_0 + x_1*c3_1 + x_2*c3_2 + ... + x_90*c3_90.
Let P1 stand for P(instance is in class 1),
P2 stand for P(instance is in class 2),
P3 stand for P(instance is in class 3), and
P4 stand for P(instance is in class 4).
g1 = log (P1/P4)
g2 = log (P2/P4)
g3 = log (P3/P4)
P1 + P2 + P3 + P4 = 1
P1 = exp(g1)*P4
P2 = exp(g2)*P4
P3 = exp(g3)*P4
P4 = 1/(1 + exp(g1) + exp(g2) + exp(g3))
Is this correct? The probabilities I get for the test set using the
above formulas match those produced by WEKA perfectly for some
instances, but for others are quite different. What's going on - ideas?
Thanks in advance for your help -
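The back-transformation above can at least be checked numerically. A small plain-Java sketch (not Weka code) of recovering the class distribution from the three log-odds:

```java
public class LogisticCheck {
    // Recover the class distribution from the three log-odds
    // g1..g3, where each g_i = log(P_i / P_4).
    static double[] distribution(double g1, double g2, double g3) {
        double p4 = 1.0 / (1.0 + Math.exp(g1) + Math.exp(g2) + Math.exp(g3));
        return new double[] { Math.exp(g1) * p4, Math.exp(g2) * p4,
                              Math.exp(g3) * p4, p4 };
    }

    public static void main(String[] args) {
        double[] p = distribution(1.0, -0.5, 2.0);
        System.out.println(p[0] + " " + p[1] + " " + p[2] + " " + p[3]);
    }
}
```

If this arithmetic matches your hand computation, the remaining discrepancy may come from how WEKA encodes the instances internally (e.g. the exact 0/1 coding of the discretized attributes), which is worth verifying against the model output before suspecting the formulas.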