Sharing my config to switch to a local LLM and embedding #374

Open
KylinMountain opened this issue Jul 5, 2024 · 22 comments
Comments

@KylinMountain
Contributor

settings.yaml

Configure the llm block to point at llama3 on Groq, or at any other model served behind an OpenAI-compatible API:

llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: llama3-8b-8192
  model_supports_json: false # recommended if this is available for your model.
  api_base: https://api.groq.com/openai/v1
  max_tokens: 8192
  concurrent_requests: 1 # the number of parallel inflight requests that may be made
  tokens_per_minute: 28000 # set a leaky bucket throttle
  requests_per_minute: 29 # set a leaky bucket throttle
  # request_timeout: 180.0
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  max_retries: 10
  max_retry_wait: 60.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
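
Before running a full index, it is worth a quick sanity check that the endpoint and model name respond. This is just a minimal curl against the OpenAI-compatible chat route implied by the api_base above; adjust the model name to whatever your provider actually serves:

curl "https://api.groq.com/openai/v1/chat/completions" \
  -H "Authorization: Bearer ${GRAPHRAG_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3-8b-8192", "messages": [{"role": "user", "content": "ping"}]}'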

I use llama.cpp to serve the embedding API; its server is compatible with the OpenAI API.
The start command:

./server -m ./models/mymodels/qwen1.5-chat-ggml-model-Q4_K_M.gguf -c 8192 -n -1 -t 7 --embeddings
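
To confirm the server is up, a minimal check against the embeddings route (assuming your llama.cpp build exposes the OpenAI-style /v1/embeddings endpoint):

curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-ada-002", "input": "hello world"}'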

So the embeddings section of the settings config is:

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: text-embedding-ada-002
    api_base: http://localhost:8080
    batch_size: 1 # the number of documents to send in a single request
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional

But....

⠦ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
├── create_final_entities
├── create_final_nodes
├── create_final_communities
├── join_text_units_to_entity_ids
├── create_final_relationships
├── join_text_units_to_relationship_ids
└── create_final_community_reports
❌ Errors occurred during the pipeline run, see logs for more details.

@gdhua

gdhua commented Jul 5, 2024

Does that work? I would also like to switch to a local ollama-supported model.

@qwaszaq

qwaszaq commented Jul 5, 2024

I also switched to Mixtral 8x7B under LM Studio, and to Nomic for embeddings, using the connection details LM Studio reports.
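
For anyone wanting to replicate that, a rough sketch of the relevant settings.yaml fields; the api_base is LM Studio's default local server address, and the model identifiers are placeholders for whatever names LM Studio shows for the models you have loaded:

llm:
  api_key: ${GRAPHRAG_API_KEY}       # LM Studio's local server typically doesn't check the key
  type: openai_chat
  model: mixtral-8x7b-instruct       # placeholder: use the identifier LM Studio reports
  model_supports_json: false
  api_base: http://localhost:1234/v1 # LM Studio's default local server

embeddings:
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: nomic-embed-text-v1.5     # placeholder
    api_base: http://localhost:1234/v1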

@KylinMountain
Contributor Author

Finally, it works. There is a bug in create_final_community_reports: it uses the global llm block from settings.yaml instead of the llm override under community_reports.

Llama3's context window is only 8192 tokens, which is not enough for the summarization in create_final_community_reports, so you need a model with a larger context window, e.g. 32k (or shrink the report input budget; see the sketch after the log below).

⠙ GraphRAG Indexer 
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
├── create_final_entities
├── create_final_nodes
├── create_final_communities
├── join_text_units_to_entity_ids
├── create_final_relationships
├── join_text_units_to_relationship_ids
├── create_final_community_reports
├── create_final_text_units
├── create_base_documents
└── create_final_documents
🚀 All workflows completed successfully.
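
If you are stuck on an 8k-context model, one possible workaround (untested here) is to shrink the community report budgets in settings.yaml so prompt plus context stays well under the window:

community_reports:
  max_length: 1500       # tokens the generated report may use
  max_input_length: 4000 # keep prompt + community context comfortably inside an 8k window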

@KylinMountain
Contributor Author

Does that work? I would also like to switch to a local ollama-supported model.

ollama doesn't support an OpenAI-compatible embedding API, so try using llama.cpp to serve the embedding model, for example as sketched below.
Otherwise you can modify the code to use a HuggingFace embedding model.
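
A rough sketch of such a server command; the GGUF path and model name are hypothetical, so point it at whichever embedding GGUF you actually have locally:

# hypothetical model file — substitute any locally converted embedding GGUF
./server -m ./models/nomic-embed-text-v1.5.Q4_K_M.gguf -c 2048 --embeddings --port 8080

Then set embeddings.llm.api_base to http://localhost:8080, as in the config above.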

@bernardmaltais

bernardmaltais commented Jul 5, 2024

Llama3's context window is only 8192 tokens, which is not enough for the summarization in create_final_community_reports, so you need a model with a larger context window, e.g. 32k.

What model do you recommend for the task?

@qwaszaq

qwaszaq commented Jul 5, 2024 via email

@KylinMountain
Contributor Author

I am using Moonshot and Qwen Max.
You can also try Mixtral 8x7B, which has a 32k context window. Note that on Groq's free tier the limit is only 5000 tokens per minute, which is not workable for indexing; if you stay on such an endpoint, throttle accordingly (see the sketch below).
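
The leaky-bucket keys shown in the config above are the relevant knobs; illustrative values for a 5000-TPM tier:

llm:
  tokens_per_minute: 5000  # match the provider's advertised TPM limit
  requests_per_minute: 25  # illustrative; check your tier's RPM limit
  concurrent_requests: 1
  max_retries: 10
  max_retry_wait: 60.0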

@emrgnt-cmplxty

Are people finding that OSS models are strong enough to actually do meaningful work with the graphRAG approach in this repository?

@bmaltais

bmaltais commented Jul 5, 2024

Hard to tell. Even commercial models like gpt-3.5-turbo are not providing mind-blowing results when compared to something like Google's NotebookLM.

A lot of the time GraphRAG fails to provide the correct answer where NotebookLM nails it.

Example GraphRAG global:

python -m graphrag.query --method global --root . "What is an example of a windows virtual machine name structure?"


INFO: Reading settings from settings.yaml
creating llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_chat", 'model': 'gpt-3.5-turbo', 'max_tokens': 4000, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}

SUCCESS: Global Search Response: ### Windows Virtual Machine Name Structure

Analysts have highlighted that a typical example of a Windows virtual machine name structure in the Azure environment follows a specific convention. For instance, a virtual machine name could be structured as SCPC-CTO-VDC-CORE-VM01. In this example, the initial segment, SCPC-CTO-VDC-CORE, signifies the resource group to which the virtual machine belongs. The latter part, VM01, indicates the specific instance of the virtual machine [Data: Reports (71, 78, 77, 60, 46, +more)].

This naming convention showcases the importance of incorporating various identifiers within the virtual machine name to denote crucial information such as the purpose, ownership, and sequence of the virtual machine within the infrastructure. Such structured naming not only aids in easy identification but also plays a significant role in efficient resource management within the Cloud Computing domain.

Analysts also emphasize that the naming structure for Windows virtual machines may include elements like the resource group name, department abbreviation, environment designation, and sequential numbering for individual instances. These components collectively contribute to creating a standardized and organized naming system that facilitates effective resource allocation and management [Data: Reports (71, 78, 77, 60, 46, +more)].

In essence, the meticulous design of the Windows virtual machine name structure, as outlined by the analysts, serves as a fundamental aspect of maintaining clarity, consistency, and operational efficiency within the Azure environment. Understanding and adhering to such naming conventions shall play a vital role in streamlining processes and enhancing overall system coherence.

Example GraphRAG local:

python -m graphrag.query --method local --root . "What is an example of a windows virtual machine name structure?" 


INFO: Reading settings from settings.yaml
creating llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_chat", 'model': 'gpt-3.5-turbo', 'max_tokens': 4000, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
creating embedding llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_embedding", 'model': 'text-embedding-3-small', 'max_tokens': 4000, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': None, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}

SUCCESS: Local Search Response: An example of a Windows virtual machine name structure within the Azure environment follows a specific naming convention outlined in the data. For instance, a Windows virtual machine name typically consists of various components that adhere to the standard naming patterns. In the context of the provided data, a Windows virtual machine name might include elements such as the department code, environment, CSP region, device type, and a user-defined string. An example of such a naming structure could be "ScPcSWA-MyApp01" [Data: Sources (38), Sources (36)].

This naming convention ensures consistency and clarity in identifying and managing Windows virtual machines within the Azure cloud computing environment. By incorporating specific elements in the name, such as the department code and user-defined string, administrators can easily track and categorize virtual machines based on their purpose, location, or ownership. Adhering to these naming standards is crucial for effective resource management and governance in cloud environments like Azure.

Same question with NotebookLM:

Windows Virtual Machine Name Structure Example

An example of a Windows virtual machine name structure is **ScPcSWA-MyApp01**. This structure uses the following fields: 

* **ScPc:** This is the four-character prefix representing the department code (Sc), environment (P for Production), and cloud service provider region (c for Azure Central). 
* **SWA:** This is the optional device type, representing Server, Windows, and Domain Controller. 
* **-:** This is the mandatory field delimiter.
* **MyApp01:** This is the user-defined string.

The device type field is optional and aligns with SSC's end-state data center naming standard and asset tracking systems. The user-defined string allows for flexibility and customization based on specific requirements.

GraphRAG global was totally wrong. Local was better... but NotebookLM provided a more relevant answer.

Maybe GPT-4o would do better... but I am not willing to pay to find out.

@jgbradley1
Contributor

jgbradley1 commented Jul 5, 2024

I can understand the cost argument for development/testing reasons. What we’ve found so far is that use of OSS models leads to more noise in the knowledge graph and therefore a degradation in the overall quality of the graph that is constructed. With a subpar graph, you’re likely to see a wide range of issues in the query response.


We encourage testing with other models, but we find that the gpt-4-turbo and gpt-4o LLMs provide the best quality in practice (at this time). Models that produce low-precision results can cause problems in the knowledge graph due to the noise they introduce. With the GPT-4 family, those models are strongly biased toward precision and the noise is minimal (even when compared to gpt-3.5-turbo).


For a better quality knowledge graph construction, also consider taking a closer look at the prompts generated from the auto-templating process. These prompts are a vital component of the graphrag approach. Our docs don’t currently cover this feature in detail but you can increase the quality of your knowledge graphs by manually reviewing the auto-generated prompts and editing/tuning them (if there are clear errors) to your own data before indexing.
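
For reference, a minimal invocation of the auto-templating step looks roughly like this; the exact flags vary by graphrag version, so check python -m graphrag.prompt_tune --help on your install:

# generate data-specific prompts (they land in the project's prompts/ folder),
# then hand-review and edit them before indexing
python -m graphrag.prompt_tune --root .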

@bmaltais

bmaltais commented Jul 5, 2024

@jgbradley1 Thank you for the info. I did create the prompts for the document using the auto-generation feature of graphrag. It still performed worse than expected, probably because I used gpt-3.5-turbo for the whole process.

@COPILOT-WDP

What dataset have you indexed? Would be curious to run the process using GPT-4o and compare to NotebookLM (running on Gemini 1.5 Pro I believe).

@ayushjadia

Does that work? I would also like to switch to a local ollama-supported model.

ollama doesn't support an OpenAI-compatible embedding API, so try using llama.cpp to serve the embedding model. Otherwise you can modify the code to use a HuggingFace embedding model.

How can we modify the code to support HuggingFace embeddings too?

@RicardoLeeV587

RicardoLeeV587 commented Jul 10, 2024

Hi everyone, I'd like to share my configuration for running GraphRAG with a local LLM and embedding model.

For the LLM, I use Mistral-7B-Instruct-v0.3. It has a 60K+ token input length, so it can handle create_community_reports easily.

For the embedding model, I use e5-mistral-7b-instruct, which is the best open-source sentence embedding model I found through a literature review.

Both models can be served through vLLM, so you can build your local RAG system with the speed boost vLLM provides.

Besides, there is a small issue in the query phase. Since GraphRAG calls the LLM server in the OpenAI style, the "system" role is not supported by the Mistral chat template. However, you can supply your own custom chat template to overcome this issue. Here is the template I use:

{%- for message in messages %}
    {%- if message['role'] == 'system' -%}
        {{- message['content'] -}}
    {%- else -%}
        {%- if message['role'] == 'user' -%}
            {{-'[INST] ' + message['content'].rstrip() + ' [/INST]'-}}
        {%- else -%}
            {{-'' + message['content'] + '</s>' -}}
        {%- endif -%}
    {%- endif -%}
{%- endfor -%}
{%- if add_generation_prompt -%}
    {{-''-}}
{%- endif -%}

I have already run the whole process locally on the novel “A Christmas Carol”. I hope this helps everyone who wants to build their own local GraphRAG 🎉.

@menghongtao

Hi everyone, I'd like to share my configuration for running GraphRAG with a local LLM and embedding model. [...]

Thanks for sharing. I also want to use the Mistral model; could you please paste your settings.yaml file?

@ayushjadia

Hi everyone, I'd like to share my configuration for running GraphRAG with a local LLM and embedding model. [...]

Please share your settings.yaml file.

@RicardoLeeV587

Hi everyone. Here is the settings.yaml I used

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  # model: gpt-4-turbo-preview
  # model: "/data3/litian/Redemption/LLama-3/Meta-Llama-3-8B-Instruct"
  model: "/data3/litian/Redemption/generativeModel/Mistral-7B-Instruct-v0.3"
  # model: "/data3/litian/Redemption/generativeModel/Meta-Llama-3-8B-Instruct"
  model_supports_json: false # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  # api_base: https://<instance>.openai.azure.com
  api_base: http://localhost:8000/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    # model: text-embedding-3-small
    model: "/data3/litian/Redemption/embeddingModel/test/e5-mistral-7b-instruct"
    # api_base: https://<instance>.openai.azure.com
    api_base: http://localhost:8001/v1
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional
  


chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents
    
input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 0

summarize_descriptions:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  # enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_reports:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  # num_walks: 10
  # walk_length: 40
  # window_size: 2
  # iterations: 3
  # random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:
  # text_unit_prop: 0.5
  # community_prop: 0.1
  # conversation_history_max_turns: 5
  # top_k_mapped_entities: 10
  # top_k_relationships: 10
  # max_tokens: 12000

global_search:
  # max_tokens: 12000
  # data_max_tokens: 12000
  # map_max_tokens: 1000
  # reduce_max_tokens: 2000
  # concurrency: 32

@menghongtao

Thanks for sharing! I have another question: where do I set the template you pasted earlier?

@RicardoLeeV587

Thanks for sharing! I have another question: where do I set the template you pasted earlier?

With vLLM you can use --chat-template to specify your own template. The bash script is shown as follows:

base_model="/data3/litian/Redemption/generativeModel/Mistral-7B-Instruct-v0.3"

api_key="12345"
n_gpu=1

python -m vllm.entrypoints.openai.api_server \
  --model ${base_model} \
  --dtype float16 \
  --tensor-parallel-size ${n_gpu} \
  --api-key ${api_key} \
  --enforce-eager \
  --chat-template=./template/mistral.jinja

@1193700079

Thanks for sharing! I have another question: where do I set the template you pasted earlier?

With vLLM you can use --chat-template to specify your own template. The bash script is shown as follows:

base_model="/data3/litian/Redemption/generativeModel/Mistral-7B-Instruct-v0.3"

api_key="12345"
n_gpu=1

python -m vllm.entrypoints.openai.api_server \
  --model ${base_model} \
  --dtype float16 \
  --tensor-parallel-size ${n_gpu} \
  --api-key ${api_key} \
  --enforce-eager \
  --chat-template=./template/mistral.jinja

Hello, I would like to know how to use vLLM to start the embedding model.

I looked at your settings.yaml:

embeddings:
  llm:
    model: "/data3/litian/Redemption/embeddingModel/test/e5-mistral-7b-instruct"
    api_base: http://localhost:8001/v1
llm:
  model: "/data3/litian/Redemption/generativeModel/Mistral-7B-Instruct-v0.3"
  api_base: http://localhost:8000/v1

@RicardoLeeV587

Hello, I would like to know how to use vLLM to start the embedding model.

Hi, actually vLLM does support e5-mistral-7b-instruct. I think this is the only embedding model that vLLM supports officially (if I am wrong, please correct me 😊). You can start it with the following command:

base_model="/data3/litian/Redemption/embeddingModel/test/e5-mistral-7b-instruct"

api_key="12345"
n_gpu=1

python -m vllm.entrypoints.openai.api_server --port 8001 --model ${base_model} --dtype auto --tensor-parallel-size ${n_gpu} --api-key ${api_key}

@s106916
Contributor

s106916 commented Jul 13, 2024

This is a temporary solution for local ollama:
https://github.com/s106916/graphrag
