Probability Returning Infinity for most Categories #3

Nath5 · 2015-01-12T21:16:23Z

Hello,

I know you haven't worked on this in a while, but I was wondering if you have any idea why I keep seeing this issue. I have added about 25 categories to the model, with lots of data in each category. No matter what text I feed in, most of the categories come back with a probability of Infinity when I classify it.

ex.

Classification[
category=friends_gatherings,
probability=Infinity,
featureset=[
after,
school,
soccerabout,
this,
...
--
]
]

windweller · 2015-02-13T15:46:14Z

I literally encountered the same issue LOL. I think it may be because the author didn't apply any smoothing technique.

ptnplanet · 2015-09-10T14:00:37Z

Hello, yes, unfortunately there is no smoothing technique applied. PROD(P(feat_i|cat)) becomes pretty big with lots of features and categories. You can, however, provide your own IFeatureProbability<T, K> calculator. This requires you to provide your own Classifier<T, K>, though (or to override featuresProbabilityProduct(Collection<T> features, K category) in BayesClassifier<T, K>).
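For anyone who wants to try the override route, here is a minimal, self-contained sketch of the two usual remedies: add-one (Laplace) smoothing, so that unseen features never get a zero estimate, and summing logarithms instead of multiplying raw probabilities, so that the score stays finite. This is not the library's code. The class `SmoothedLogScore`, its methods, and the `vocabularySize` parameter are hypothetical names made up for illustration, and the sketch models a single category only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of ONE category; not part of the library.
final class SmoothedLogScore {
    // feature -> number of times it was seen in this category
    private final Map<String, Integer> featureCounts = new HashMap<>();
    private int totalFeatureCount = 0;

    // Record every feature of one training document.
    void learn(List<String> features) {
        for (String feature : features) {
            featureCounts.merge(feature, 1, Integer::sum);
            totalFeatureCount++;
        }
    }

    // Laplace-smoothed estimate of P(feature | category).
    // vocabularySize is the assumed number of distinct features overall.
    double smoothedProbability(String feature, int vocabularySize) {
        int count = featureCounts.getOrDefault(feature, 0);
        return (count + 1.0) / (totalFeatureCount + vocabularySize);
    }

    // Sum of log-probabilities; this replaces the raw product.
    double logProbabilityProduct(List<String> features, int vocabularySize) {
        double sum = 0.0;
        for (String feature : features) {
            sum += Math.log(smoothedProbability(feature, vocabularySize));
        }
        return sum;
    }

    public static void main(String[] args) {
        SmoothedLogScore category = new SmoothedLogScore();
        category.learn(Arrays.asList("after", "school", "after"));
        // "unseen" never occurred in training, yet the score stays finite.
        double score = category.logProbabilityProduct(
                Arrays.asList("after", "unseen"), 4);
        System.out.println(Double.isFinite(score));
    }
}
```

To pick a category you would compare these log scores (plus the log of the category prior) across categories and take the largest. The scores are negative numbers rather than probabilities, which is fine as long as only their order matters.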


ptnplanet · 2017-02-03T10:10:35Z

Hi all. You might want to explore the latest feature branch (feature/weight).

Take the feature weight into consideration when calculating the featureProbabilityProduct

Made BayesClassifier.featureProbabilityProduct public to enable other
implementations to overwrite the calculation

By default now take the feature weight and the assumed probability
into consideration when calculating the featuresProbabilityProduct

Added a test to test with high number of categories
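As a rough illustration of what taking a feature weight and an assumed probability into account can look like, the sketch below blends the observed estimate with a prior guess, so that rarely seen features are pulled toward the assumed value. This is one common form of such a weighted estimate; whether it matches the feature/weight branch exactly is an assumption, and the class and method names are hypothetical.

```java
// Hypothetical helper; not taken from the library.
final class WeightedProbability {
    // basicProbability:   observed estimate of P(feature | category)
    // observations:       how often the feature has been seen in total
    // weight:             how strongly the assumed probability counts
    // assumedProbability: prior guess used when there is little data
    // Requires weight + observations > 0.
    static double weightedAverage(double basicProbability, int observations,
                                  double weight, double assumedProbability) {
        return (weight * assumedProbability + observations * basicProbability)
                / (weight + observations);
    }

    public static void main(String[] args) {
        // Never-seen feature: falls back to the assumed probability.
        System.out.println(weightedAverage(0.0, 0, 1.0, 0.5));
        // Feature seen 3 times, always in this category.
        System.out.println(weightedAverage(1.0, 3, 1.0, 0.5));
    }
}
```

With a weight of 1 and an assumed probability of 0.5, a feature that was never seen gets 0.5 instead of 0, and the estimate moves toward the observed value as more observations come in.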

barovehicles · 2017-09-14T11:17:51Z

I'm comparing the results from Python with NumPy against the results from this routine, and they are completely different. This routine definitely doesn't work.

ptnplanet added the enhancement label Sep 10, 2015

marcnause mentioned this issue Mar 7, 2016

Problems parsing numbers: JSON forbids NaN and infinities loklak/loklak_server#256

Closed

Orbiter added a commit to loklak/loklak_server that referenced this issue Mar 8, 2016

added patch from Low012 as submitted in

65ede16

#256 (comment) which addresses a problem in the Bayesian Classifier source code as discussed in ptnplanet/Java-Naive-Bayes-Classifier#3

ptnplanet self-assigned this Feb 3, 2017

ptnplanet added this to the v1.1 milestone Feb 3, 2017
