
Dear Eibe
Thank you very much for your quick response - very helpful explanations!
Yes, indeed, if I evaluate on the training data, I get 100 % accuracy for the Logistic Regression Classifier, and by increasing the ridge parameter I get more useful class probabilities for my test data.
My next question, though, is how much to increase the ridge parameter. I started at 1.0E-8 and went all the way up to 3, with the higher values making more sense. I then tried to optimise the -R parameter using CVParameterSelection, ranging from 1.0E-8 to 4 in six steps. When it finished running, I saw the following output, which I interpret as CVParameterSelection having chosen 1.0E-8 as the best value for the ridge parameter (which is the one that resulted in probabilities of only 0 and 1).
-----------------------------------------------------------------------------
=== Classifier model (full training set) ===
Cross-validated Parameter selection.
Classifier: weka.classifiers.functions.Logistic
Cross-validation Parameter: '-R' ranged from 1.0E-8 to 4.0 with 6.0 steps
Classifier Options: -R 1.0E-8 -M -1 -num-decimal-places 4
Logistic Regression with ridge parameter of 1.0E-8
----------------------------------------------------------------------------
How do I choose a sensible ridge parameter and how do I justify why I've used it?
Many thanks for your time and help!
Best wishes, Christel
********************************************
Christel Krueger
Epigenetics/Bioinformatics (B570)
Babraham Institute
Cambridge CB22 3AT, UK
Tel: +44(0)1223 496245
Fax: +44(0)1223 496022
email: christel.krueger(a)babraham.ac.uk
*******************************************
-----Original Message-----
From: wekalist-bounces(a)list.waikato.ac.nz [mailto:wekalist-bounces@list.waikato.ac.nz] On Behalf Of wekalist-request(a)list.waikato.ac.nz
Sent: 01 March 2018 06:18
To: wekalist(a)list.waikato.ac.nz
Subject: Wekalist Digest, Vol 181, Issue 3
Send Wekalist mailing list submissions to
wekalist(a)list.waikato.ac.nz
To subscribe or unsubscribe via the World Wide Web, visit
https://list.waikato.ac.nz/mailman/listinfo/wekalist
or, via email, send a message with subject or body 'help' to
wekalist-request(a)list.waikato.ac.nz
You can reach the person managing the list at
wekalist-owner(a)list.waikato.ac.nz
When replying, please edit your Subject line so it is more specific than "Re: Contents of Wekalist digest..."
Today's Topics:
1. Re: Tie breaking in kNN (Eibe Frank)
2. Re: Cannot handle numeric class when using Logistic Regression algorithm (Eibe Frank)
3. Re: classification confidence (Eibe Frank)
4. Re: Cannot handle numeric class when using Logistic Regression algorithm (Derrick Peh)
5. Unable to access rules learned by PART in java. (Kuleshwar Sahu)
----------------------------------------------------------------------
Message: 1
Date: Thu, 1 Mar 2018 14:21:56 +1300
From: Eibe Frank <eibe(a)waikato.ac.nz>
To: "Weka machine learning workbench list."
<wekalist(a)list.waikato.ac.nz>
Subject: Re: [Wekalist] Tie breaking in kNN
Message-ID: <832F980D-A0B6-46BB-BB4E-924ABC036FFD(a)waikato.ac.nz>
Content-Type: text/plain; charset=us-ascii
NearestNeighbourSearch in WEKA does not break ties. If there are ties for the k-th neighbour, all instances that are tied are included in the neighbourhood.
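The behaviour described above can be sketched in plain Java. This is a simplified illustration of the idea, not WEKA's actual NearestNeighbourSearch code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TieInclusiveKnn {
    // Return the indices of the k nearest neighbours, including every
    // instance tied with the k-th smallest distance.
    static List<Integer> neighbourhood(double[] distances, int k) {
        Integer[] idx = new Integer[distances.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(distances[a], distances[b]));
        double kthDistance = distances[idx[k - 1]];
        List<Integer> result = new ArrayList<>();
        for (int i : idx) {
            if (distances[i] <= kthDistance) result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // k = 3, but four instances are tied at the minimum distance,
        // so all four end up in the neighbourhood.
        double[] d = {0.5, 0.5, 0.5, 0.5, 2.0};
        System.out.println(neighbourhood(d, 3).size()); // prints 4
    }
}
```

This matches the situation in the question below: with k=3 and four instances tied at the minimum distance, no tie-breaking happens and all four are included.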
Cheers,
Eibe
> On 28/02/2018, at 11:04 PM, asharma <arnabsh91(a)gmail.com> wrote:
>
> Hi,
>
> I have a question regarding the tie-breaking condition implemented in Weka.
> Suppose we have taken k=3 and the minimum distance from the predicted
> instance is the same for four neighbouring instances (for example d1, d2, d3, d4).
> Which three of them are chosen? How is this decided in Weka? In which
> class/package has the code for the tie-breaking condition been implemented?
>
> Thanks and Regards,
> Arnab Sharma
>
>
>
> --
> Sent from: http://weka.8497.n7.nabble.com/
> _______________________________________________
> Wekalist mailing list
> Send posts to: Wekalist(a)list.waikato.ac.nz List info and subscription
> status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
> List etiquette:
> http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
------------------------------
Message: 2
Date: Thu, 1 Mar 2018 14:25:51 +1300
From: Eibe Frank <eibe(a)waikato.ac.nz>
To: "Weka machine learning workbench list."
<wekalist(a)list.waikato.ac.nz>
Subject: Re: [Wekalist] Cannot handle numeric class when using
Logistic Regression algorithm
Message-ID: <B711AB37-B491-4015-8DD7-AD1DEEF2CE87(a)waikato.ac.nz>
Content-Type: text/plain; charset=utf-8
This is not a problem with WEKA. :-) Logistic regression is designed for classification problems, i.e., problems where the class is nominal, not numeric.
If your Intensity_Score has only a few different values, e.g., -1, 0, +1, you can code the attribute as
@attribute Intensity_Score {-1, 0, +1}
and run Logistic regression with this nominal target attribute.
If your target is truly continuous and it doesn't make sense to code it as a nominal attribute, you can use RegressionByDiscretization to apply logistic regression. RegressionByDiscretization will discretize the target into intervals/bins, and then treat each bin as a discrete class value.
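The discretization step can be illustrated with a small self-contained sketch (equal-width binning only; the actual filter supports several discretization schemes, and the bin count and example values here are hypothetical):

```java
import java.util.Arrays;

public class DiscretizeTarget {
    // Map each numeric target value to an equal-width bin index in [0, numBins).
    static int[] discretize(double[] y, int numBins) {
        double min = Arrays.stream(y).min().getAsDouble();
        double max = Arrays.stream(y).max().getAsDouble();
        double width = (max - min) / numBins;
        int[] bins = new int[y.length];
        for (int i = 0; i < y.length; i++) {
            int b = (int) ((y[i] - min) / width);
            bins[i] = Math.min(b, numBins - 1); // clamp the max value into the last bin
        }
        return bins;
    }

    public static void main(String[] args) {
        // Hypothetical intensity scores discretized into 3 "classes".
        double[] intensity = {0.05, 0.32, 0.56, 0.91};
        System.out.println(Arrays.toString(discretize(intensity, 3))); // [0, 0, 1, 2]
    }
}
```

Each resulting bin index then plays the role of a discrete class value for the classifier.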
Cheers,
Eibe
> On 1/03/2018, at 3:11 AM, Derrick Peh <derrickpehjh(a)gmail.com> wrote:
>
> I am relatively new to using WEKA and have been trying to solve this problem but to no avail.
>
> I have previously tried to train an SVM regression (from LibLinear) using these parameters and it worked perfectly. I was using L2-regularized L2-loss support vector regression (dual), which is the option -S 12 in weka.classifiers.functions.LibLINEAR, as shown below.
>
> java -Xmx4G -cp %WEKA_FOLDER%/weka.jar weka.Run
> weka.classifiers.meta.FilteredClassifier -t EI-reg-En-anger-train.arff
> -T 2018-EI-reg-En-anger-test.arff -classifications
> "weka.classifiers.evaluation.output.prediction.CSV -use-tab -p
> first-last -file EI-reg-En-anger-weka-predictions.csv" -F
> "weka.filters.MultiFilter -F
> \"weka.filters.unsupervised.attribute.TweetToSparseFeatureVector -E 5
> -D 3 -I 0 -F -M 2 -G 0 -taggerFile
> %HOME%/wekafiles/packages/AffectiveTweets/resources/model.20120919
> -wordClustFile
> %HOME%/wekafiles/packages/AffectiveTweets/resources/50mpaths2.txt.gz
> -Q 1 -stemmer weka.core.stemmers.NullStemmer -stopwords-handler
> \\\"weka.core.stopwords.Null \\\" -I 2 -U -tokenizer
> \\\"weka.core.tokenizers.TweetNLPTokenizer \\\"\" -F
> \"weka.filters.unsupervised.attribute.Reorder -R 5-last,4\"" -W
> weka.classifiers.functions.LibLINEAR -- -S 12 -C 1.0 -E 0.001 -B 1.0
> -L 0.1 -I 1000
>
> However, when I changed to option -S 7, which is L2-regularized logistic regression, it gave me the error weka.core.UnsupportedAttributeTypeException: weka.classifiers.meta.FilteredClassifier: Cannot handle numeric class!
>
> My dataset attributes are:
> @attribute ID string
> @attribute Tweet string
> @attribute Affect_Dimension string
> @attribute Intensity_Score numeric
>
> An example of a @data row is:
> '2017-En-10264','@xandraaa5 @amayaallyn6 shut up hashtags are cool
> #offended','anger',0.562
>
>
------------------------------
Message: 3
Date: Thu, 1 Mar 2018 14:35:32 +1300
From: Eibe Frank <eibe(a)waikato.ac.nz>
To: "Weka machine learning workbench list."
<wekalist(a)list.waikato.ac.nz>
Subject: Re: [Wekalist] classification confidence
Message-ID: <75F0EB98-1756-4FFC-AAAF-7682EFFFD7DD(a)waikato.ac.nz>
Content-Type: text/plain; charset=us-ascii
J48 will give 0 and 1 probability estimates if the leaf nodes of the tree are pure, i.e., contain training instances of only one class, because class probability estimates for a leaf node are obtained by counting how often each class occurs at the leaf node concerned. There is an option in J48 to turn on the Laplace correction of estimated probabilities. This will give somewhat more meaningful probability estimates (by initialising the count for each class with 1 instead of 0).
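The counting idea behind the Laplace correction can be sketched as follows (an illustration of the principle, not J48's actual code):

```java
public class LaplaceLeaf {
    // Class probability estimates at a leaf from raw class counts,
    // optionally applying the Laplace correction (+1 per class).
    static double[] leafProbabilities(int[] classCounts, boolean laplace) {
        double total = 0;
        for (int c : classCounts) total += c;
        if (laplace) total += classCounts.length; // one extra pseudo-count per class
        double[] p = new double[classCounts.length];
        for (int i = 0; i < p.length; i++) {
            p[i] = (classCounts[i] + (laplace ? 1 : 0)) / total;
        }
        return p;
    }

    public static void main(String[] args) {
        int[] pureLeaf = {8, 0}; // a leaf containing only class-0 instances
        double[] raw = leafProbabilities(pureLeaf, false);
        double[] smoothed = leafProbabilities(pureLeaf, true);
        System.out.println("raw: " + raw[0] + "/" + raw[1]);           // raw: 1.0/0.0
        System.out.println("Laplace: " + smoothed[0] + "/" + smoothed[1]); // Laplace: 0.9/0.1
    }
}
```

A pure leaf thus moves from the extreme 1/0 estimate to a softer 0.9/0.1.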
Logistic regression can give extreme probability estimates if the data is linearly separable (i.e., you can put hyperplanes through the data that perfectly separate the classes). Your data is linearly separable if your classification accuracy on the training data is 100%. You can address this problem by increasing the value of the ridge parameter in Logistic. Increasing this parameter will reduce the fit of the model to the training set.
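The effect of the ridge parameter can be demonstrated on a toy example. This is a minimal one-feature, no-intercept logistic regression fitted by gradient descent, not WEKA's Logistic implementation; it just shows that on linearly separable data a larger ridge penalty yields less extreme probabilities:

```java
public class RidgeLogistic {
    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

    // Fit a one-feature logistic model (no intercept) by gradient descent,
    // minimising negative log-likelihood + ridge * w^2.
    static double fit(double[] x, int[] y, double ridge) {
        double w = 0.0, lr = 0.1;
        for (int iter = 0; iter < 5000; iter++) {
            double grad = 2 * ridge * w; // derivative of the ridge penalty
            for (int i = 0; i < x.length; i++) {
                grad += (sigmoid(w * x[i]) - y[i]) * x[i];
            }
            w -= lr * grad;
        }
        return w;
    }

    public static void main(String[] args) {
        // Linearly separable toy data: negatives at x = -1, positives at x = +1.
        double[] x = {-1, -1, 1, 1};
        int[] y = {0, 0, 1, 1};
        double pNoRidge = sigmoid(fit(x, y, 1e-8)); // probability of class 1 at x = 1
        double pRidged  = sigmoid(fit(x, y, 1.0));
        System.out.println(pNoRidge > 0.99); // true: near-certain prediction
        System.out.println(pRidged < 0.9);   // true: more moderate probability
    }
}
```

With a negligible ridge the weight keeps growing and the predicted probability approaches 1; with ridge = 1 the weight is shrunk towards 0 and the probability stays moderate.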
Cheers,
Eibe
> On 1/03/2018, at 4:27 AM, Christel Krueger <christel.krueger(a)babraham.ac.uk> wrote:
>
> Hello!
>
> I'm using Weka to predict the nature of a sequencing library from sequence composition. I've tried different classifiers which I thought would make sense, and they generally perform very well. My favourite one here is J48.
>
> While I mostly end up with the right classification, for downstream processing I would like to include a measure of how confident I am that the class prediction is correct. I've used the option '-distribution' to report probabilities for each class. As I wasn't sure how probabilities were handled with J48, I also tried logistic regression to report probabilities.
>
> My expectation would have been that I see a very high probability for the chosen class when things are clear cut, but a lower probability for the chosen class when the data is more ambiguous. What puzzles me is that I always end up with a probability of 1 for the chosen class and probabilities of 0 for all other classes - even when the prediction is incorrect (I've only ever seen 2 cases where the probabilities were different from 1 or 0).
>
> Am I doing something wrong? Is there a better way to report the confidence in the classification?
>
> Many thanks for help!
> Best wishes, Christel
>
>
> The Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT Registered Charity No. 1053902.
> The information transmitted in this email is directed only to the
> addressee. If you received this in error, please contact the sender
> and delete this email from your system. The contents of this e-mail
> are the views of the sender and do not necessarily represent the views
> of the Babraham Institute. Full conditions at:
> www.babraham.ac.uk<http://www.babraham.ac.uk/terms>
------------------------------
Message: 4
Date: Thu, 1 Mar 2018 11:20:12 +0800
From: Derrick Peh <derrickpehjh(a)gmail.com>
To: "Weka machine learning workbench list."
<wekalist(a)list.waikato.ac.nz>
Subject: Re: [Wekalist] Cannot handle numeric class when using
Logistic Regression algorithm
Message-ID: <1C877CFA-BFE7-40D7-8217-157B9330F7D7(a)gmail.com>
Content-Type: text/plain; charset=utf-8
Thank you for your explanation. ^^
Regards,
Derrick Peh
> [Quoted text from Eibe Frank's reply (Message 2 above) trimmed.]
------------------------------
Message: 5
Date: Wed, 28 Feb 2018 23:17:16 -0700 (MST)
From: Kuleshwar Sahu <kuleshwar03(a)gmail.com>
To: wekalist(a)list.waikato.ac.nz
Subject: [Wekalist] Unable to access rules learned by PART in java.
Message-ID: <1519885036810-0.post(a)n7.nabble.com>
Content-Type: text/plain; charset=us-ascii
I want to extract the rules learned by PART, and I also want to check which rules cover given new instances, using the existing Java implementation in Weka 3.6.
Ideally, I would love to have the above for all rule-based classifiers.
Thanks in advance...
--
Sent from: http://weka.8497.n7.nabble.com/
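The coverage-checking half of this question can be sketched independently of the Weka API. Once rules have been extracted (e.g., by parsing the model's toString() output), each rule can be represented as a conjunction of attribute tests. The rule conditions, thresholds, and class labels below are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class RuleCoverage {
    // A rule is a conjunction of attribute tests plus a predicted class.
    static class Rule {
        final String predictedClass;
        final List<Predicate<double[]>> conditions;
        Rule(String predictedClass, List<Predicate<double[]>> conditions) {
            this.predictedClass = predictedClass;
            this.conditions = conditions;
        }
        boolean covers(double[] instance) {
            return conditions.stream().allMatch(c -> c.test(instance));
        }
    }

    public static void main(String[] args) {
        // Hypothetical rules, as might be parsed from a rule learner's output:
        //   rule0: attr0 <= 2.5 AND attr1 > 1.0 -> "yes"
        //   rule1: attr0 > 2.5 -> "no"
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule("yes", List.of(x -> x[0] <= 2.5, x -> x[1] > 1.0)));
        rules.add(new Rule("no", List.of(x -> x[0] > 2.5)));

        double[] newInstance = {2.0, 3.0};
        for (int i = 0; i < rules.size(); i++) {
            if (rules.get(i).covers(newInstance)) {
                System.out.println("rule" + i + " covers -> " + rules.get(i).predictedClass);
            }
        }
    }
}
```

Note that PART applies its rules as an ordered decision list (the first covering rule fires), whereas the loop above reports every covering rule.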
------------------------------
_______________________________________________
Wekalist mailing list
Wekalist(a)list.waikato.ac.nz
https://list.waikato.ac.nz/mailman/listinfo/wekalist
End of Wekalist Digest, Vol 181, Issue 3
****************************************

I'm not sure I understand your question. Every network constructed by BayesNet should be a directed acyclic graph.
Cheers,
Eibe
> From: Angel Marya <angelmarya9812(a)gmail.com>
> Subject: Algorithm for GBN
> Date: 2 March 2018 at 8:14:06 AM NZDT
> To: wekalist(a)list.waikato.ac.nz
>
>
> Is there any algorithm available in Weka for a General Bayesian Network?
> ICS is similar to a general Bayesian structure, but I can see that its accuracy
> is low and also that it is not exactly a Directed Acyclic Graph. I could see
> that the class is a parent of an attribute and the attribute itself is a parent
> of the class. How can you have a Bayesian Network which is not a DAG?
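The situation described in the question (class is a parent of the attribute and the attribute is a parent of the class) forms a two-node cycle. Whether a learned structure is actually a DAG can be checked with a standard depth-first search; a minimal sketch on adjacency lists (node numbering hypothetical):

```java
import java.util.List;

public class DagCheck {
    // DFS-based cycle detection: 0 = unvisited, 1 = on current path, 2 = done.
    static boolean isDag(List<List<Integer>> adj) {
        int[] state = new int[adj.size()];
        for (int v = 0; v < adj.size(); v++) {
            if (state[v] == 0 && hasCycle(adj, v, state)) return false;
        }
        return true;
    }

    static boolean hasCycle(List<List<Integer>> adj, int v, int[] state) {
        state[v] = 1;
        for (int w : adj.get(v)) {
            if (state[w] == 1) return true; // back edge to the current path: cycle
            if (state[w] == 0 && hasCycle(adj, w, state)) return true;
        }
        state[v] = 2;
        return false;
    }

    public static void main(String[] args) {
        // Node 0 = class, node 1 = attribute: edges in both directions form a cycle.
        List<List<Integer>> cyclic = List.of(List.of(1), List.of(0));
        // Only class -> attribute: a proper DAG.
        List<List<Integer>> acyclic = List.of(List.of(1), List.of());
        System.out.println(isDag(cyclic));  // false
        System.out.println(isDag(acyclic)); // true
    }
}
```

A structure with both class-to-attribute and attribute-to-class edges would fail this check, which is why such a graph cannot be a valid Bayesian network structure.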