Bug in the computeQ function - v2 classifier #41

bharat-biradar · 2021-08-20T06:46:22Z

Describe the issue
In computeQ, a threshold of 1.0 yields a granularity of 10, but thresholds of 0.95, 0.99, and 0.999 yield granularities of 19, 99, and 999, respectively. The granularity grows sharply as the threshold approaches 1.0, and each of these values is greater than the granularity of 10 used at the maximum threshold (1.0).

Is this intentional?
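To make the numbers above easy to reproduce, here is a minimal sketch in Python. It is not the classifier's source: `compute_q` is a hypothetical stand-in that assumes the granularity is the truncated value of 1/(1 - threshold), with a hard-coded 10 at a threshold of exactly 1.0. That assumption is consistent with the values reported in this issue.

```python
def compute_q(threshold: float) -> int:
    """Hypothetical reconstruction of the granularity calculation.

    Assumption: q = trunc(1 / (1 - threshold)), with a special case of 10
    when threshold == 1.0 to avoid dividing by zero.
    """
    if threshold == 1.0:
        return 10
    # int() truncates toward zero, so a quotient just below 20 becomes 19.
    return int(1.0 / (1.0 - threshold))


if __name__ == "__main__":
    for t in (0.9, 0.95, 0.99, 0.999, 1.0):
        print(f"threshold={t} -> q={compute_q(t)}")
```

Under this assumption the values come out as 10, 19, 99, 999, and 10. On paper 1/(1 - 0.95) is exactly 20; the 19 appears because the floating-point quotient lands just below 20 and is then truncated.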

A problem caused by this is that with a threshold of 0.95 or greater, many licenses go undetected that are easily detected with a threshold of 0.9.

I ran the program on around 17,300 license files. Around 2,950 BSD-3-Clause files, 850 BSD-2-Clause files, and some other licenses were not detected at all, even though they were detected at a granularity of 10. The cause is that at the higher threshold the granularity is greater than 20 and nearly reaches 100.

A possible solution would be to set the granularity to 10 for any threshold greater than 0.9. This would also handle the divide-by-zero case.
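A rough sketch of that proposal, again in Python and again under the assumption that the uncapped granularity is the truncated value of 1/(1 - threshold). The name `compute_q_capped` is hypothetical and only illustrates the idea.

```python
def compute_q_capped(threshold: float) -> int:
    """Hypothetical sketch of the proposed fix.

    Any threshold above 0.9 gets a fixed granularity of 10. This covers
    threshold == 1.0 as well, so no separate divide-by-zero check is needed.
    """
    if threshold > 0.9:
        return 10
    # Assumed uncapped formula for lower thresholds.
    return int(1.0 / (1.0 - threshold))


if __name__ == "__main__":
    for t in (0.5, 0.8, 0.9, 0.95, 0.99, 0.999, 1.0):
        print(f"threshold={t} -> q={compute_q_capped(t)}")
```

With this cap, thresholds of 0.95, 0.99, 0.999, and 1.0 all map to 10, so the granularity no longer jumps above the value used at the maximum threshold.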

bharat-biradar mentioned this issue Aug 20, 2021

Set q=10, for threshold>0.9 #42

Open

rspier assigned wcn3 Aug 23, 2021
