This is my first time using this list, so I really hope this message gets through.
My question: when running the following program, I get an error that
I do not know how to deal with.
The code is:
Instances shotInstances =
BufferedWriter writer = new BufferedWriter(new
and the error (if necessary, I can also post the full error output here):
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
A little about my Instances: it is an Instances object with five relational
attributes, each of which has a different size, from 0 to maybe 4 or 5.
There are 240 instances in total.
Can anyone help me with this error? Even some tips to help me edit
With best regards to all
I am using Weka and I have a list of words which are saved as Nominal by default, but I want them to be Strings. On the other hand, the Naive Bayes classifier cannot handle String values, so I have decided to use two filters: first filter my .csv data to change Nominals to Strings, and then change the Strings to word vectors.
For the first part, I am using the following code. But when I look at the created file, my attribute is still saved as Nominal.
Here is my code:
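String file = "G:\\1_training_feats.csv";
String res_file = "G:\\training_set_without_nominals.arff";
// load CSV
CSVLoader loader = new CSVLoader();
loader.setSource(new File(file));
Instances data = loader.getDataSet();
// convert nominal attributes to string
NominalToString filter1 = new NominalToString();
filter1.setInputFormat(data);
data = Filter.useFilter(data, filter1);
// save ARFF
ArffSaver saver = new ArffSaver();
saver.setInstances(data);
saver.setFile(new File(res_file));
saver.writeBatch();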
I get no error and the "res_file" is created, but the type of the attributes is still Nominal. I also tried loading the "res_file", which is an ARFF file, and filtering that data, but again I got the same result: a file containing nominal types.
Can anybody please tell me what is wrong with my code?
I am trying to write some code to add a value to a nominal attribute like this:
It seems that if the attribute is nominal, I get an error: Value not defined for given nominal attribute!
I tried to add the value first, as in:
but nothing happened and I still get the same error message.
Can you please help me?
I am using Weka 3.6.11-SNAPSHOT in my Jython project.
I use MultiFilter in my Jython code and it works fine. Here is a snippet:
multiFilter = MultiFilter()
dataSet = Filter.useFilter(dataSet, multiFilter);
where myFilters is a list of filters I defined.
This is ok so far and I can even store the instances relation in an ARFF file and open it with the Explorer.
Now I want to reuse the MultiFilter, in this way:
myFilters = 
filter9 = Remove()
dataSet = Filter.useFilter(dataSet, multiFilter)
I get a NullPointerException with the following trace and error message:
dataSet = Filter.useFilter(dataSet, multiFilter)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
java.lang.NullPointerException: java.lang.NullPointerException: No output instance format defined
I could not find any method in the Filter class to set the output instance format.
Can anyone please help me?
WEKA version: weka-3-6-10
Operating System: Ubuntu 12.04 LTS
Java version: 1.6.0_26
I am trying to apply clustering with 6 clusters to an .arff file with two
attributes, but the SimpleKMeans algorithm produces only 5 clusters. If I
request any other number of clusters it works fine, but with 6 clusters it
I attach the .arff file and the command line I use to apply clustering with
java -Xmx2048M -cp /usr/share/java/weka.jar weka.clusterers.SimpleKMeans -t
"training.arff" -d "clustering_6k.model" -N 6 > clustering_6k.txt
The problem is the same if you try to do it with the graphical interface of
Thank you for your time.
Jesús Virseda Jerez.
I have an ARFF file where the vast majority of (binary) values (over 90%) I want to predict are 0. When I run the Weka Associator on this file, it gives me rules that predict when these values are 0. However, because of the "sparsity" of this data, I really need the rules to only focus on cases where the value is 1.
Is there a way I can get Weka to do this?
After developing and saving a clustering model computed off-line from a data set of a few thousand data points (each with 10 to 20 features), can this clustering model be exported directly to a streaming data environment? Or does the clustering algorithm have to be re-implemented for streaming data?
We would like the clustering model to aggregate a few thousand streaming data points, cluster those data points, and report any interesting results.
Thanks for your suggestions.
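To illustrate the first option only: if the saved model is centroid-based (k-means, for example), one possible way to reuse it on streaming points without retraining is to export the centroids and assign each arriving point to the nearest one. The sketch below is plain Java, not Weka API. The class name CentroidAssignerSketch, the centroid values, and the choice of squared Euclidean distance on unnormalized features are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch, not Weka API: applies fixed, previously learned
// centroids to new points without updating the model.
final class CentroidAssignerSketch {
    private final double[][] centroids;

    CentroidAssignerSketch(double[][] centroids) {
        this.centroids = centroids;
    }

    // Returns the index of the centroid with the smallest squared
    // Euclidean distance to the given point.
    int assign(double[] point) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed centroids, standing in for values exported from an off-line model.
        double[][] saved = { {0.0, 0.0}, {10.0, 10.0} };
        CentroidAssignerSketch model = new CentroidAssignerSketch(saved);
        // Assumed incoming points, standing in for a stream.
        double[][] stream = { {1.0, 0.5}, {9.0, 11.0}, {4.0, 4.0} };
        for (double[] p : stream) {
            System.out.println(Arrays.toString(p) + " -> cluster " + model.assign(p));
        }
    }
}
```

A fixed-centroid assigner like this never adapts to drift in the stream. If the clusters themselves must change as data arrives, an incremental algorithm would be needed instead.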
What is the correct way to do hierarchical clustering dynamically? That is,
as new sets of data arrive, I would like to keep applying a clustering model
that was trained on the previous data sets, and so on.
How can I do this in the Weka GUI or using the API?
Greetings Weka users,
I'm currently trying to use Weka to fit a simple logistic regression to
my dataset. What's challenging for me is that I'm using the same data set for
both training and testing.
My plan is to apply a percentage split of 90% to the data. That way, I'll
train on 90% of the data and test on the remaining 10%. Next, I will repeat
this test multiple times, assuming that a different 10% of the data will be
tested each time.
To make sure that a new set of data gets selected for training / testing
each time, I will specify different values for the 'Random seed for XVal /
% split' value.
(1) Is this an appropriate way of making sure that different data will get
selected each time?
(2) Is there a more convenient approach to automate running a logistic
regression multiple times with different values, instead of doing so
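
The repeated-split plan described above can be sketched in plain Java to show what a seed does and does not guarantee. This is not Weka's own randomization code. The class RepeatedSplitSketch, the method testIndices, and the dataset size of 100 are hypothetical names and values chosen for illustration. The sketch shuffles the instance indices with a seeded generator and holds out everything after the first 90%.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Weka: illustrates seeded 90/10 splits.
final class RepeatedSplitSketch {

    // Returns the indices that fall into the test portion for a given seed.
    static List<Integer> testIndices(int numInstances, double trainFraction, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            indices.add(i);
        }
        // Shuffle deterministically: the same seed always gives the same order.
        Collections.shuffle(indices, new Random(seed));
        int trainSize = (int) Math.round(numInstances * trainFraction);
        // Everything after the first trainSize shuffled indices is held out for testing.
        return new ArrayList<>(indices.subList(trainSize, numInstances));
    }

    public static void main(String[] args) {
        int numInstances = 100; // assumed dataset size, for illustration only
        for (long seed = 1; seed <= 3; seed++) {
            System.out.println("seed " + seed + " -> test indices "
                    + testIndices(numInstances, 0.9, seed));
        }
    }
}
```

Different seeds give different shuffles, but nothing stops the held-out 10% of one run from overlapping with that of another run. If every instance should be tested exactly once, 10-fold cross-validation provides that guarantee by construction, whereas repeated random splits do not.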