michaelfeil / infinity Public

Issues: michaelfeil/infinity

26 Open 77 Closed

Issues list

Infinity Docs Are Offline

#300 opened Jul 16, 2024 by Senslyze-Docuwise

1 of 4 tasks

How to accelerate the bge-m3 sparse embedding module during inference?

#294 opened Jul 2, 2024 by seetimee

Classify endpoint not available for a finetuned DebertaV2ForSequenceClassification model

#276 opened Jun 20, 2024 by dblakely

2 of 4 tasks

[Benchmark] - clip embeddings

Labels: good first issue, help wanted

#259 opened Jun 10, 2024 by michaelfeil

[CLIP][Server/Engine] Send images to engine / accept PIL images

Labels: good first issue, help wanted

#258 opened Jun 10, 2024 by michaelfeil

[Benchmark] - embedding quantization

Labels: help wanted

#250 opened Jun 8, 2024 by michaelfeil

3 of 4 tasks

BUG ERROR: Server stops accepting new requests after _core_batch(self) exceptions

#242 opened Jun 2, 2024 by vitteloil

2 of 4 tasks

nvidia/NV-Embed-v1

Labels: new model

#239 opened May 31, 2024 by Strive-for-excellence

3 tasks done

ValueError: No onnx files found

#225 opened May 17, 2024 by netw0rkf10w

Tensor-parallelism for multi-gpu support

Labels: wontfix

#213 opened Apr 29, 2024 by SalomonKisters

Add a TextSplitter in LangChain to share the model of the embedding model

#193 opened Apr 4, 2024 by Jimmy-Newtron

float16 and other optimizations help?

#159 opened Mar 18, 2024 by BBC-Esq

Love the repo! Wish I could help!

#157 opened Mar 18, 2024 by BBC-Esq

Move .detach().cpu() into encode_core, and option to use cuda streams

#155 opened Mar 18, 2024 by jobright-jiyuan

Dynamic loading - different models at request time / multiple models

#151 opened Mar 17, 2024 by cduk

Question: Support for sparse embeddings?

Labels: new model, question

#146 opened Mar 16, 2024 by Matheus-Garbelini

Content-Encoding: gzip

#136 opened Mar 14, 2024 by andrew-at-rise

Support for instructur/instructor-xl models

#125 opened Mar 2, 2024 by BBC-Esq

Create llama-index InfinityEmbeddings as langchain

#111 opened Feb 21, 2024 by semoal

How does this compare to Huggingface's Text Embedding Inference?

#108 opened Feb 21, 2024 by alpayariyak

AWQ-Bert / 4-bit Bert

Labels: enhancement, help wanted

#95 opened Feb 10, 2024 by michaelfeil

AMD ROCm docker images support (+ optimization)

Labels: help wanted

#94 opened Feb 9, 2024 by michaelfeil

3 tasks

Return actual token count on forward pass

Labels: good first issue

#92 opened Feb 8, 2024 by michaelfeil

Adding max token budget per batch

#87 opened Feb 5, 2024 by michaelfeil

Idea: add a parameter to configure number of decimals in JSON output

Labels: enhancement

#64 opened Jan 17, 2024 by lasttero
