Mutating labels when training MulticlassLDA #205

Open
grero opened this issue Oct 7, 2022 · 1 comment

Comments

grero commented Oct 7, 2022

This recent change really threw a wrench into my pipeline:

idxs = toindices(y)

I am training LDAs on one set of trials and testing the decoding performance on a separate set of trials. All of a sudden, my performance dropped to chance, and after about a day of digging around I realised that toindices actually mutates the label names. In other words, when I decoded the test set by finding the projected mean that each sample was closest to, I was comparing against the original labels, while the class means were indexed by the remapped ones, so the class assignments were essentially random.

As a stopgap measure for my pipeline, I defined

MultivariateStats.toindices(label::AbstractVector{T}) where T <: Integer = label

which fixed my issue, but I realise that this is not a general solution. In particular, if there are gaps in label, such that maximum(label) !== length(unique(label)), this could also cause problems.
Is there currently an array type that fulfils that criterion?
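
To make the gap case concrete, here is a minimal sketch in plain Julia. It does not call into MultivariateStats; the label values and the variable names (`classes`, `label_to_index`, `is_dense`) are made up for illustration and are not part of the package.

```julia
# Hypothetical integer labels with gaps (illustrative data, not from my pipeline).
labels = [1, 3, 7, 3, 1, 7]

classes = sort(unique(labels))      # the distinct labels, in sorted order
nclasses = length(classes)

# Using a label directly as an index is only safe when the labels are exactly 1:nclasses.
is_dense = classes == collect(1:nclasses)

# A dense remapping: each label becomes its position in `classes`.
label_to_index = Dict(c => i for (i, c) in enumerate(classes))
idxs = [label_to_index[l] for l in labels]
```

With these labels `maximum(labels)` is 7 while there are only 3 classes, so the identity mapping in my stopgap would index past the end of anything sized by the number of classes.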

Collaborator

wildart commented Oct 7, 2022

I see. That was a problem with the previous implementation. The labels and indices were conflated, which caused bounds errors if labels weren't properly defined, #187. It looks like a design problem, because the LDA model doesn't carry any explicit information about labels. Class centroids relate to the index of a label rather than the label itself. You can use toindices to get a map of labels to indices and use this map to get the correct class centroid and weight data.
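
A rough sketch of that idea, written in plain Julia rather than against the package API: the helper names `label_maps` and `nearest_label` are hypothetical, the data is made up, and it assumes that column i of the centroid matrix belongs to `classes[i]`. Whether that ordering matches what toindices produces in your version is something to check, not something this sketch guarantees.

```julia
# Hypothetical helpers; nothing here calls into MultivariateStats.

# Build the label <-> index maps once, from the training labels.
function label_maps(y::AbstractVector)
    classes = unique(y)   # index i corresponds to classes[i] (order of first appearance)
    label_to_index = Dict(c => i for (i, c) in enumerate(classes))
    return classes, label_to_index
end

# Assign each projected sample (a column of Z) to the nearest centroid
# (a column of M) and report the original label rather than the index.
function nearest_label(Z::AbstractMatrix, M::AbstractMatrix, classes::AbstractVector)
    map(eachcol(Z)) do z
        d = [sum(abs2, z .- m) for m in eachcol(M)]   # squared Euclidean distances
        classes[argmin(d)]
    end
end

# Made-up training labels and one-dimensional projections.
y_train = ["left", "right", "up", "right", "left", "up"]
classes, label_to_index = label_maps(y_train)

M = [0.0 5.0 10.0]       # assumed projected class means, one column per class index
Z = [0.4 9.1 5.2 4.8]    # assumed projected test samples, one column per sample
predicted = nearest_label(Z, M, classes)
```

The point is that the same `classes` vector is used for both training and testing, so predictions come back in the original label space and can be compared directly with the test labels.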
