Training is really fast, but inference is very slow. I read the documentation and used batching and multiple cores, but it is still very slow. Is there any other way to speed up inference? #205

xiaohuzi1996 · 2023-06-26T07:10:24Z

I ran into the same problem: with 100 GB of memory in use and 40 cores enabled, inference on texts shorter than 5,000 words runs at only about 2 documents/s.

narayanacharya6 · 2023-08-15T07:08:52Z

I've had a similar experience. Using the DMRModel, I get only around 20 docs/minute on my MBP (2.6 GHz 6-core Intel Core i7, 32 GB RAM).
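The numbers reported in this thread use different units (2 documents/s versus 20 docs/minute), so it may help to measure throughput the same way on each setup. Below is a minimal sketch of a timing harness using only the standard library. `measure_throughput`, `docs_per_minute_to_per_second`, and the `infer_batch` callable are hypothetical names for illustration, not part of any library; `infer_batch` stands in for whatever batched inference call you are actually using.

```python
import time
from typing import Callable, Sequence


def measure_throughput(
    infer_batch: Callable[[Sequence[str]], object],  # hypothetical: your batched inference call
    docs: Sequence[str],
    batch_size: int = 64,
) -> float:
    """Return documents processed per second for a batched inference callable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    start = time.perf_counter()
    # Feed the documents to the callable in fixed-size batches.
    for i in range(0, len(docs), batch_size):
        infer_batch(docs[i:i + batch_size])
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small inputs.
    return len(docs) / elapsed if elapsed > 0 else float("inf")


def docs_per_minute_to_per_second(rate: float) -> float:
    """Convert a docs/minute figure to docs/second for comparison."""
    return rate / 60.0
```

Running this with a few different `batch_size` values (and worker counts, if your inference call accepts one) should show whether batching or parallelism is actually changing the rate on your machine.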
