Hi all,
I saw this topic in the archives and I want to know whether it is possible to transpose rows (173 rows) and columns (6152 columns):
name  Col1  Col2  Col3
row1   0.2     1  -0.2
row2     1  -0.2   0.3
row4  -0.2   0.3    -1
I want the transpose:
name  row1  row2  row3  row4
Col1   0.2     1  -0.1  -0.2
Col2     1  -0.2     0   0.3
Col3  -0.2   0.3   0.1    -1
Thanks in advance
[Wekalist] Permut rows and collonnes
Peter Reutemann fracpete at cs.waikato.ac.nz
Sat Nov 11 14:57:53 NZDT 2006
> Is there any way to permute rows (instances) and columns (attributes)
> using Weka?
> I can't use Excel to do that because I have 6000 attributes.
Use the weka.filters.unsupervised.attribute.Reorder filter to get your
attributes into a new order. It is probably a wise idea to write a
script/program to generate the index list for 6000 attributes...
You can use the weka.filters.unsupervised.instance.Randomize filter to
randomize the order of your instances. BTW, datasets normally get
randomized before they are passed on to the schemes (e.g., classifiers),
unless you are calling the "build" methods yourself.
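Since the original poster actually wants a transpose rather than a reordering, and Reorder only rearranges attributes within instances, a small script is probably the easiest route for a one-off job. A minimal stdlib-Python sketch (file names are placeholders; it assumes the data has been exported to a simple delimited text file):

```python
import csv

def transpose_csv(src, dst, delimiter=","):
    """Read a delimited text file and write its transpose.

    Loads everything into memory, which is fine for a table of
    173 rows x 6152 columns of short numeric fields.
    """
    with open(src, newline="") as f:
        rows = list(csv.reader(f, delimiter=delimiter))
    # zip(*rows) pairs up the i-th field of every row, turning
    # columns into rows (it stops at the shortest row, so the
    # input should be rectangular).
    with open(dst, "w", newline="") as f:
        csv.writer(f, delimiter=delimiter).writerows(zip(*rows))
```

For an ARFF file you would export to CSV first (e.g. via Weka's CSVSaver), transpose, and reload the result.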
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/ Ph. +64 (7) 858-5174
I'm now trying to connect Weka to a MySQL database. But after I select data in
the SQL-Viewer and press "OK" I get this error:
> couldn't read from database: unknown data type: INT. Add entry in
> If the type contains blanks, either escape them with a backslash or
> use underscores instead of blanks.
> I have read several threads about this error but still cannot find out what
> is wrong. It would be great if someone could take a look at my
> DatabaseUtils.props and give me a hint about what is wrong.
Here is my DatabaseUtils.props:
# General information on database access can be found here:
# Version: $Revision: 5836 $

# The comma-separated list of jdbc drivers to use

# The url to the experiment database

# the method that is used to retrieve values from the db
# (java datatype + RecordSet.<method>)
# string, getString() = 0;    --> nominal
# boolean, getBoolean() = 1;  --> nominal
# double, getDouble() = 2;    --> numeric
# byte, getByte() = 3;        --> numeric
# short, getByte() = 4;       --> numeric
# int, getInteger() = 5;      --> numeric
# INT, getInteger() = 5;      --> numeric
# INT(11), getInteger() = 5;  --> numeric
# INT., getInteger() = 5;     --> numeric
# long, getLong() = 6;        --> numeric
# float, getFloat() = 7;      --> numeric
# date, getDate() = 8;        --> date
# text, getString() = 9;      --> string
# time, getTime() = 10;       --> date

# the original conversion: <column type>=<conversion>

# mappings for table creation

# All the reserved keywords for this database

# The character to append to attribute names to avoid exceptions due to
# clashes between keywords and attribute names

# flags for loading and saving instances using DatabaseLoader/Saver
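If I remember the file format correctly, only plain `<column type>=<conversion>` property lines actually take effect; the `getInteger() = 5` lines are explanatory comments, not mappings. So for MySQL's INT type the missing entry would look something like the fragment below (an assumption based on the numeric conversion index 5 in the comment listing; verify against your version's stock DatabaseUtils.props):

```properties
# map MySQL's INT (and the parameterized INT(11) form) to Weka's
# numeric conversion, index 5 in the comment listing above
INT=5
INT(11)=5
```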
I am using the Weka 3.5.6 developer version. When I run Weka from the console,
it prints error messages like "Trying to add JDBC driver: RmiJdbc.RJDriver -
Error, not in CLASSPATH?" etc. I can't figure out what this means.
I am using Weka on Ubuntu 11.10,
but I don't know how to enlarge the Weka GUI font size.
(It is agony to read such a tiny font.)
I have searched the mailing list archive, but can't find a solution.
Any answer will be appreciated!
Please tell me how the SMOreg technique works and how to analyse its output.
Regression techniques are usually used for prediction, but when using SMOreg
the output is given in normalized form. I want to know how it can be mapped
back to the actual data.
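I can't say for certain how your version reports SMOreg predictions, but if they really come out min-max normalized (SMOreg's default filterType does normalize the training data), mapping them back is just the inverse of the scaling. A minimal sketch, assuming the target was scaled to [0, 1] using the training minimum and maximum:

```python
def denormalize(y_norm, y_min, y_max):
    """Invert min-max scaling: y_norm = (y - y_min) / (y_max - y_min)."""
    return y_norm * (y_max - y_min) + y_min

# Example: if the training target ranged over [10, 50], a normalized
# prediction of 0.25 maps back to 0.25 * 40 + 10 = 20.0.
```

The y_min and y_max must come from the training data, not from the test set, since those are the values the filter was built with.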
Dear Weka Users,
in order to better understand the operating principles of the parameter
optimization performed by GridSearch and CVParameterSelection, I have had a
look at the code and I have a couple of doubts about the results produced,
in particular about the nested cross-validation setting.
Both the parameter search class and the classifier panel class perform
cross-validation. There is therefore an "outer" cross-validation, set by the
classifier panel class, that divides the whole dataset into n folds and
applies the parameter optimization to each of them, collecting fold by fold
the predictions for all the instances. But there is also an inner
cross-validation, set by the parameter search, that for each value of the
investigated parameter further splits each sub-dataset into m folds,
applies the classifier to each of them, collects the predictions, and
finally returns the best parameter value (or the best combination of
parameter values).
This means that, most likely, each fold of the outer CV will correspond to
a classification model with different parameter value(s), i.e. a different
model. Therefore, if I understood correctly, I am a bit confused about the
meaning of the predictions vector collected in this way. The only
explanation I can deduce is that this procedure does not aim to validate
the model but rather the procedure itself.
Similar considerations apply to the AttributeSelectedClassifier: in this
case too we have two nested cross-validations, and the model used for each
fold of the outer CV is likely different from fold to fold. The summary
statistics are therefore calculated on a predictions vector whose
different portions are likely produced by different models.
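To make the concern concrete, here is a toy, self-contained Python sketch of the nested scheme described above (the "model" is just a shifted training mean, not Weka's actual code): the inner CV picks a possibly different parameter on each outer training split, and the outer predictions are pooled across those differing models.

```python
def cross_val_mse(data, param, k):
    """Plain k-fold CV with a toy 'model' (the training mean shifted by
    `param`); returns the mean squared error over the held-out folds."""
    preds = [0.0] * len(data)
    for fold in range(k):
        train = [y for i, y in enumerate(data) if i % k != fold]
        model = sum(train) / len(train) + param   # toy "training"
        for i in range(fold, len(data), k):       # held-out instances
            preds[i] = model
    return sum((p - y) ** 2 for p, y in zip(preds, data)) / len(data)

def nested_cv(data, grid, outer_k=3, inner_k=2):
    """Outer CV pools predictions; the inner CV grid-searches the
    parameter independently on each outer training split, so each
    outer fold may end up with a different model."""
    pooled = [0.0] * len(data)
    chosen = []
    for fold in range(outer_k):
        train = [y for i, y in enumerate(data) if i % outer_k != fold]
        # Inner CV on the training split only: pick the best parameter.
        best = min(grid, key=lambda p: cross_val_mse(train, p, inner_k))
        chosen.append(best)
        model = sum(train) / len(train) + best
        for i in range(fold, len(data), outer_k):
            pooled[i] = model
    return pooled, chosen
```

Inspecting `chosen` after a run shows exactly the situation described: the pooled predictions vector mixes outputs of up to outer_k differently parameterized models, which is why the resulting statistics validate the procedure rather than any single model.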
Thanks in advance for your precious help.
I have a dataset with 2320 instances and I performed an SVM classification
with the optimized parameters C=500, gamma=0.1. The RMSE for training is
0.3122, which can be considered successful. Then I wanted to supply a test
set (18 instances) to measure how well the model can predict the correct
results. The result was satisfactory: RMSE 0.3162, 15 correctly classified
instances and 3 misclassified. The problem is that when I output the
predictions, I saw that all the instances were misclassified. There is a
pattern in the misclassified values (e.g. all "excellent" values were
predicted "very bad", etc.) but none of them were predicted well. I also
supplied different data sets but the problem was the same. I couldn't find
the cause. How can I solve this issue? Here are the output predictions:
=== Predictions on test set ===
6,3:'VERY GOOD',1:VERY BAD,+,0.333
and the result window
=== Summary ===
Correctly Classified Instances 15 83.3333 %
Incorrectly Classified Instances 3 16.6667 %
Kappa statistic 0.7823
Mean absolute error 0.2259
Root mean squared error 0.3162
Relative absolute error 93.5738 %
Root relative squared error 85.7026 %
Coverage of cases (0.95 level) 100 %
Mean rel. region size (0.95 level) 83.3333 %
Total Number of Instances 18
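An RMSE of 0.3162 alongside 15/18 correct is less puzzling if, for a nominal class, the squared error is computed from the predicted class-probability distributions rather than from hard labels. The sketch below is an assumption about the general form of that computation, not a transcript of Weka's code (check weka.classifiers.Evaluation for the authoritative formula):

```python
def classification_rmse(prob_dists, actual_idx):
    """RMSE over class-probability estimates: for each instance, compare
    the predicted probability of every class against a 0/1 indicator
    vector for the actual class, then average over all those terms."""
    sq_sum, n = 0.0, 0
    for dist, actual in zip(prob_dists, actual_idx):
        for j, p in enumerate(dist):
            target = 1.0 if j == actual else 0.0
            sq_sum += (p - target) ** 2
            n += 1
    return (sq_sum / n) ** 0.5
```

Under a definition like this, even a correctly classified instance contributes error whenever its predicted distribution is not exactly 1.0 on the true class, so RMSE and the correctly/incorrectly classified counts measure different things.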