Error using vllm OpenAI server #357

Closed
Shamepoo opened this issue Jul 4, 2024 · 11 comments

Comments

@Shamepoo

Shamepoo commented Jul 4, 2024

INFO: 172.16.80.35:48532 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 411, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 89, in create_chat_completion
    generator = await openai_serving_chat.create_chat_completion(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/serving_chat.py", line 68, in create_chat_completion
    sampling_params = request.to_sampling_params()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/protocol.py", line 157, in to_sampling_params
    return SamplingParams(
           ^^^^^^^^^^^^^^^
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/vllm/sampling_params.py", line 157, in __init__
    self._verify_args()
  File "/home/bigdata/anaconda3/envs/vllm/lib/python3.11/site-packages/vllm/sampling_params.py", line 172, in _verify_args
    if self.n < 1:
       ^^^^^^^^^^
TypeError: '<' not supported between instances of 'NoneType' and 'int'
INFO 07-04 07:24:41 metrics.py:218] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%
[... the same all-zero metrics line repeated every 10 seconds through 07:27:21 ...]
INFO: 172.16.80.35:53354 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
[... identical traceback to the first request above ...]
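For reference, the 500 can be reproduced outside GraphRAG by posting a chat completion with an explicit null n. A minimal sketch in Python, assuming a vLLM OpenAI-compatible server is reachable at the URL below (adjust the base URL and model name to your deployment):

import requests

# Illustrative endpoint and model name; substitute your own deployment.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "Qwen2-72B-Instruct-AWQ",
        "messages": [{"role": "user", "content": "hello"}],
        "n": None,  # an explicit null reaches SamplingParams as n=None and fails _verify_args
    },
    timeout=60,
)
print(resp.status_code)  # 500 on the affected vLLM version
print(resp.text)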

@mxchinegod

Possible to share your config with keys redacted? Thanks.

@Shamepoo
Author

Shamepoo commented Jul 4, 2024

Sure. Here is my settings.yaml

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ""
  type: openai_chat # or azure_openai_chat
  model: Qwen2-72B-Instruct-AWQ
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 4000
  request_timeout: 180.0
  api_base: http://172.16.80.35:7005/v1
  api_version: Qwen2-72B-Instruct-AWQ
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  tokens_per_minute: 150_000 # set a leaky bucket throttle
  requests_per_minute: 10_000 # set a leaky bucket throttle
  max_retries: 10
  max_retry_wait: 10.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  concurrent_requests: 1 # the number of parallel inflight requests that may be made
  model_parameters:
    n: 1

parallelization:
  stagger: 0.3
  num_threads: 64 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  # parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: text-embedding-3-small
    api_base: http://ip:post/v1
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    batch_size: 16 # the number of documents to send in a single request
    batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional

chunks:
  size: 1500
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  # llm: override the global llm settings for this task
  # parallelization: override the global parallelization settings for this task
  # async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 0

summarize_descriptions:
  # llm: override the global llm settings for this task
  # parallelization: override the global parallelization settings for this task
  # async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  # llm: override the global llm settings for this task
  # parallelization: override the global parallelization settings for this task
  # async_mode: override the global async_mode settings for this task
  enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  # llm: override the global llm settings for this task
  # parallelization: override the global parallelization settings for this task
  # async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  num_walks: 10
  walk_length: 40
  window_size: 2
  iterations: 3
  random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:
  text_unit_prop: 0.5
  community_prop: 0.1
  conversation_history_max_turns: 5
  top_k_mapped_entities: 10
  top_k_relationships: 10
  max_tokens: 12000

global_search:
  max_tokens: 12000
  data_max_tokens: 12000
  map_max_tokens: 1000
  reduce_max_tokens: 2000
  concurrency: 32
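As a quick way to rule out the server itself, the same endpoint can be called directly with n set to a concrete integer. A minimal sketch using the api_base and model from the config above (the api_key value is just a placeholder for a local vLLM server):

from openai import OpenAI

# Local vLLM servers ignore the key, but the client requires one to be set.
client = OpenAI(base_url="http://172.16.80.35:7005/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen2-72B-Instruct-AWQ",
    messages=[{"role": "user", "content": "ping"}],
    n=1,  # a concrete integer avoids the NoneType comparison in SamplingParams._verify_args
)
print(resp.choices[0].message.content)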

@mxchinegod

Initially I thought a default value still wrapped in < > placeholders might be the cause, but it doesn't look like anything snuck in there. Which version of vLLM are you on?

@AlonsoGuevara
Contributor

Hi @Shamepoo
Looking into this, it seems the issue is some faulty parsing of the n parameter when building the configuration class on our end, which ends up sending a None to the vLLM backend.

I will submit a fix for this.
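For illustration only (this is a sketch of the kind of guard, not the actual patch), the idea is to drop or default unset sampling values before they reach the request payload, so "n": null never gets sent. The helper name below is hypothetical:

from typing import Any

def build_model_parameters(raw: dict[str, Any] | None) -> dict[str, Any]:
    """Hypothetical helper: drop None values and default n to 1 before calling the backend."""
    params = {k: v for k, v in (raw or {}).items() if v is not None}
    params.setdefault("n", 1)
    return params

print(build_model_parameters({"n": None, "temperature": 0.0}))
# {'temperature': 0.0, 'n': 1}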

@Shamepoo
Author

Shamepoo commented Jul 5, 2024

@AlonsoGuevara Yes, looking forward to that.

@AntoninLeroy

Support for vLLM would be highly appreciated.

A tutorial on how to adapt an arbitrary LLM / embedding endpoint to fit the GraphRAG specifications would also be welcome.

@AlonsoGuevara
Contributor

AlonsoGuevara commented Jul 6, 2024

Just submitted #390. This should fix the error you're experiencing with vLLM.

@AlonsoGuevara
Contributor

The code is now merged; if you are running from source, this should fix it. If you are running from PyPI, I'll be including this fix in v0.1.2.

@Shamepoo
Author

Shamepoo commented Jul 9, 2024

Now works for me, thanks!

@riyajatar37003

riyajatar37003 commented Jul 16, 2024

⠦ GraphRAG Indexer
├── Loading Input (csv) - 1 files loaded (1 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
└── create_final_covariates
❌ create_final_covariates

ERROR 07-16 12:08:18 api_server.py:247] Error in applying chat template from request: Conversation roles must alternate user/assistant/user/assistant/...

logs.json

{"type": "error", "data": "Error Invoking LLM", "stack": "Traceback (most recent call last):\n File "//graphrag/graphrag/llm/base/base_llm.py", line 53, in _invoke\n output = await self._execute_llm(input, **kwargs)\n File "/graphrag/graphrag/llm/openai/openai_chat_llm.py", line 55, in _execute_llm\n completion = await self.client.chat.completions.create(\n File "/tmp/.conda/envs/grag/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 1289, in create\n return await self._post(\n File "/tmp/.conda/envs/grag/lib/python3.10/site-packages/openai/_base_client.py", line 1826, in post\n return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)\n File "/tmp/.conda/envs/grag/lib/python3.10/site-packages/openai/_base_client.py", line 1519, in request\n return await self._request(\n File "/tmp/.conda/envs/grag/lib/python3.10/site-packages/openai/_base_client.py", line 1620, in _request\n raise self._make_status_error_from_response(err.response) from None\nopenai.BadRequestError: Error code: 400 - {'object': 'error', 'message': 'Conversation roles must alternate user/assistant/user/assistant/...', 'type': 'invalid_request_error', 'param': None, 'code': None}\n", "source": "Error code: 400 - {'object': 'error', 'message': 'Conversation roles must alternate user/assistant/user/assistant/...', 'type': 'invalid_request_error', 'param': None, 'code': None}", "details": {"input": "MANY entities were missed in the last extraction. Add them below using the same format:\n"}}
{"type": "error", "data": "Entity Extraction Error", "stack": "Traceback (most recent call last):\n File "/graphrag/graphrag/index/graph/extractors/graph/graph_extractor.py", line 123, in call\n result = await self._process_document(text, prompt_variables)\n File

@riyajatar37003

vLLM is up and running properly, but at this step it's showing that message.
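The 400 comes from applying the model's chat template (as the vLLM log says), which requires roles to strictly alternate user/assistant; the gleaning follow-up ("MANY entities were missed in the last extraction...") apparently arrives in a history the template rejects. One debugging workaround is to normalize the message list before it reaches the server, for example in a small shim. A sketch (illustrative only, not GraphRAG or vLLM code):

def normalize_messages(messages: list[dict]) -> list[dict]:
    """Fold system messages and consecutive same-role messages into a strictly
    alternating user/assistant history (illustrative workaround only)."""
    normalized: list[dict] = []
    for msg in messages:
        role = "user" if msg["role"] == "system" else msg["role"]
        if normalized and normalized[-1]["role"] == role:
            normalized[-1]["content"] += "\n\n" + msg["content"]
        else:
            normalized.append({"role": role, "content": msg["content"]})
    return normalized

print(normalize_messages([
    {"role": "system", "content": "You are an entity extraction assistant."},
    {"role": "user", "content": "-Goal- Extract entities from the following text ..."},
    {"role": "assistant", "content": "(previous extraction output)"},
    {"role": "user", "content": "MANY entities were missed in the last extraction. Add them below using the same format:"},
]))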
