The influence of t (temperature) in the E5 Model paper #1588
Comments
Hi @daegonYu, this is a hyperparameter that needs tuning. Empirically, we observe that a lower temperature leads to better performance but might cause training instability under float16 precision for large models. A lower temperature allows the logits to vary over a wider range and thus gives the model more flexibility.
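To make the float16 remark concrete, here is a minimal sketch. It assumes the logit is the cosine similarity divided by the temperature, so its magnitude is bounded by 1/t, and it compares that bound with the largest finite float16 value. The helper names are hypothetical, and a naive exponential overflowing is only one plausible source of instability, not necessarily the one the authors observed.

```python
import math

FLOAT16_MAX = 65504.0  # largest finite value representable in float16


def logit_bound(temperature):
    """Largest |logit| when logit = cosine_similarity / temperature (assumed form)."""
    return 1.0 / temperature


def naive_exp_overflows_fp16(temperature):
    """True if exp(max logit) would exceed the float16 range.

    This models a softmax computed without subtracting the maximum logit;
    numerically stable implementations avoid this particular failure.
    """
    return logit_bound(temperature) > math.log(FLOAT16_MAX)


for t in (1.0, 0.1, 0.05, 0.02, 0.01):
    print(f"t={t}: |logit| <= {logit_bound(t):.0f}, "
          f"naive exp overflows float16: {naive_exp_overflows_fp16(t)}")
```

Under these assumptions, the headroom shrinks quickly as t decreases, which matches the observation that lower temperatures are harder to train at reduced precision.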
“A lower temperature allows the logits to vary in a wider range and thus has more flexibility.” I read this as saying that the embeddings can more easily learn diverse representations. But according to https://huggingface.co/intfloat/multilingual-e5-base, the cosine similarity is distributed between 0.7 and 1.0.
If embeddings can be expressed over a wider range, I would expect the cosine similarity to be spread over a wide range as well, so this seems contradictory and is hard to understand. Simply put, I wonder why lowering the temperature allows the model to learn a wider range of logits.
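The logits are calculated as the cosine similarity divided by the temperature t, so a lower temperature scales the logits over a wider range. However, this does not mean the learned cosine similarity will be in a wider range. On the contrary, the cosine similarity tends to concentrate as the temperature becomes lower.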
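One way to build intuition for this is the sketch below. It assumes an InfoNCE-style softmax over one positive and a batch of equally similar negatives, with logit = cosine similarity / t. The function names and the example similarities (0.9 for the positive, 0.8 for the negatives) are illustrative assumptions, not values from the paper or the authors' stated explanation.

```python
import math


def positive_probability(cos_pos, cos_neg, num_negatives, temperature):
    """Softmax probability assigned to the positive pair.

    Assumes one positive at cosine cos_pos and num_negatives negatives
    that all sit at cosine cos_neg; logits are cosine / temperature.
    """
    pos_logit = cos_pos / temperature
    neg_logit = cos_neg / temperature
    shift = max(pos_logit, neg_logit)  # subtract the max for numerical stability
    exp_pos = math.exp(pos_logit - shift)
    exp_neg = num_negatives * math.exp(neg_logit - shift)
    return exp_pos / (exp_pos + exp_neg)


def info_nce_loss(cos_pos, cos_neg, num_negatives, temperature):
    """Negative log-probability of the positive pair."""
    return -math.log(positive_probability(cos_pos, cos_neg, num_negatives, temperature))


# Same cosine gap of 0.1, evaluated at the temperatures mentioned in this thread.
for t in (0.05, 0.02, 0.01):
    p = positive_probability(0.9, 0.8, num_negatives=1000, temperature=t)
    print(f"t={t}: P(positive)={p:.4f}, loss={-math.log(p):.4f}")
```

In this toy setting, a cosine gap of only 0.1 already drives the loss close to zero at t = 0.01, while the same gap leaves a large loss at t = 0.05. That suggests the model has little pressure to spread cosine similarities apart at low temperature, which is consistent with the scores concentrating in a narrow band.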
All right, I understand what you said. But why does "the cosine similarity tend to concentrate as the temperature becomes lower"? Can you explain why this happens?
Describe
Model I am using (UniLM, MiniLM, LayoutLM ...): E5
Hello. I am a student studying sentence similarity.
“Paper: Text Embeddings by Weakly-Supervised Contrastive Pre-training”
While reading this paper, a question arose about the temperature t being set to 0.01. In the SimCSE paper, the temperature is set to 0.05 for the sentence similarity (STS) task, and other papers set it to 0.02, but this paper sets it to 0.01. Can you tell us what effects can be achieved by lowering the temperature?