Getting KeyError for most use cases in a simple DataFrame using the open-source llama3:8b-instruct model via Ollama #1243

sktg84 · 2024-06-20T04:39:13Z

System Info

Server:
mac m1

Packages:
pandas==1.5.3
pandasai==2.1

Python 3.10.9

🐛 Describe the bug

I get a KeyError for most use cases, and PandasAI is not able to parse DataFrames when using a local LLM.

2024-06-20 10:05:10 [ERROR] Failed with error: Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandasai/pipelines/chat/code_execution.py", line 85, in execute
result = self.execute_code(code_to_run, code_context)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandasai/pipelines/chat/code_execution.py", line 171, in execute_code
exec(code, environment)
File "<string>", line 3, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 8402, in groupby
return DataFrameGroupBy(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 965, in __init__
grouper, exclusions, obj = get_grouper(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/groupby/grouper.py", line 888, in get_grouper
raise KeyError(gpr)
KeyError: 'software_version'

2024-06-20 10:05:10 [ERROR] Pipeline failed on step 6: 'software_version'
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandasai/pipelines/chat/generate_chat_pipeline.py", line 310, in run
).run(input)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandasai/pipelines/pipeline.py", line 137, in run
raise e
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandasai/pipelines/pipeline.py", line 101, in run
step_output = logic.execute(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandasai/pipelines/chat/code_execution.py", line 113, in execute
raise e
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandasai/pipelines/chat/code_execution.py", line 85, in execute
result = self.execute_code(code_to_run, code_context)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandasai/pipelines/chat/code_execution.py", line 171, in execute_code
exec(code, environment)
File "<string>", line 3, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 8402, in groupby
return DataFrameGroupBy(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 965, in __init__
grouper, exclusions, obj = get_grouper(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/groupby/grouper.py", line 888, in get_grouper
raise KeyError(gpr)
KeyError: 'software_version'
2024-06-20 10:05:38 [INFO] HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-20 10:05:38 [INFO] Prompt used:

dosubot · 2024-06-20T04:43:01Z

Hey there, @sktg84! 👋 I'm here to help you out with any bugs or questions you have. I'm a bot that can assist with solving bugs, answering questions, and becoming a contributor. Let's tackle this issue together!

To resolve the KeyError related to the 'software_version' key and successfully parse the DataFrame while using the llama3:8b-instruct model via Ollama, you need to ensure that the 'software_version' key is included in the extras dictionary when calling the serialize method. Here is an example of how you can do this:

import pandas as pd
from pandasai.helpers.dataframe_serializer import DataframeSerializer, DataframeSerializerType

# Sample DataFrame
data = {
    'column1': [1, 2, 3],
    'column2': [4, 5, 6]
}
df = pd.DataFrame(data)

# Create an instance of DataframeSerializer
serializer = DataframeSerializer()

# Define the extras dictionary with the 'software_version' key
extras = {
    'index': 0,
    'type': 'example_type',
    'software_version': '1.0.0'  # Ensure this key is included
}

# Serialize the DataFrame to JSON
json_output = serializer.serialize(df, extras, type_=DataframeSerializerType.JSON)

print(json_output)

This code ensures that the 'software_version' key is present in the extras dictionary, which should prevent the KeyError and allow the DataFrame to be successfully parsed ^[1].

To continue talking to Dosu, mention @dosu.

sktg84 · 2024-06-20T11:33:56Z

This code is auto-generated by the LLM.

sktg84 · 2024-06-25T08:25:45Z

@dosu any updates?

dosubot · 2024-06-25T08:31:52Z

To resolve the KeyError: 'software_version', you need to ensure that the 'software_version' column exists in the DataFrame before performing operations like groupby. Since the code is auto-generated by the LLM, you can add a check in the execute_code method to handle this gracefully.
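
As a rough illustration of that check (not part of pandasai), the sketch below compares the column names a generated snippet asks for against the names the DataFrame actually has, and suggests near matches. The helper name `missing_columns` and the sample column names are assumptions made for this example; in practice `available` would be something like `list(df.columns)`.

```python
from difflib import get_close_matches
from typing import Dict, Iterable, List


def _normalize(name: str) -> str:
    """Lower-case a name and replace spaces with underscores for comparison."""
    return name.strip().lower().replace(" ", "_")


def missing_columns(required: Iterable[str], available: Iterable[str]) -> Dict[str, List[str]]:
    """Map each required column that is absent to its closest available names."""
    available = [str(name) for name in available]
    # Map normalized spellings back to the real column names.
    by_normalized = {_normalize(name): name for name in available}
    report: Dict[str, List[str]] = {}
    for name in required:
        if name in available:
            continue
        close = get_close_matches(_normalize(name), list(by_normalized), n=3, cutoff=0.6)
        report[name] = [by_normalized[match] for match in close]
    return report


# Hypothetical column names; the real DataFrame in this issue is not shown.
columns = ["Device", "Software Version", "Count"]
print(missing_columns(["software_version"], columns))
# prints {'software_version': ['Software Version']}
```

If the report is non-empty, the mismatch can be handled before the generated `groupby` call runs, for example by renaming the columns or by naming the real columns in the prompt.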

Here is an updated version of the execute_code method that includes a check for the 'software_version' column:

def execute_code(self, code: str, context: CodeExecutionContext) -> Any:
    """
    Execute the python code generated by LLMs to answer the question
    about the input dataframe. Run the code in the current context and return the
    result.

    Args:
        code (str): Python code to execute.
        context (CodeExecutionContext): Code Execution Context
                with prompt id and skills.

    Returns:
        Any: The result of the code execution. The type of the result depends
            on the generated code.

    """
    # List the required dfs so we can avoid running the connectors
    # if the code does not need them
    dfs = self._required_dfs(code)
    environment: dict = get_environment(self._additional_dependencies)
    environment["dfs"] = self._get_originals(dfs)
    if len(environment["dfs"]) == 1:
        environment["df"] = environment["dfs"][0]

    if self._config.direct_sql:
        environment["execute_sql_query"] = self._dfs[0].execute_direct_sql_query

    # Add skills to the env
    if context.skills_manager.used_skills:
        for skill_func_name in context.skills_manager.used_skills:
            skill = context.skills_manager.get_skill_by_func_name(skill_func_name)
            environment[skill_func_name] = skill

    # Check if 'software_version' column exists in the DataFrame.
    # "df" is only set above when exactly one dataframe is required.
    if "df" in environment and "software_version" not in environment["df"].columns:
        raise KeyError("The 'software_version' column is missing from the DataFrame.")

    # Execute the code
    exec(code, environment)

    # Get the result
    if "result" not in environment:
        raise NoResultFoundError("No result returned")

    return environment["result"]

This code snippet adds a check to ensure that the 'software_version' column exists in the DataFrame before executing the code. If the column is missing, it raises a KeyError with a descriptive message ^[1].
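
Because the failing column name can change with each generated snippet, a check for one hardcoded name only covers this one case. A more general pre-check could read the column names out of the generated code before `exec` runs. The sketch below is a heuristic under stated assumptions: `referenced_columns` and `_string_literals` are hypothetical helpers, not pandasai functions; it assumes Python 3.9+ (where `ast.Subscript.slice` holds the expression directly); and it only sees literal strings in `x["name"]` subscripts and `.groupby(...)` calls, so it can both miss columns and report strings that are not columns.

```python
import ast
from typing import Set


def referenced_columns(code: str) -> Set[str]:
    """Collect string literals used in subscripts or as groupby keys."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Subscript):
            # e.g. df["device_id"] or df[["a", "b"]]
            names.update(_string_literals(node.slice))
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "groupby"
        ):
            # e.g. df.groupby("a") or df.groupby(by=["a", "b"])
            for arg in node.args:
                names.update(_string_literals(arg))
            for keyword in node.keywords:
                if keyword.arg == "by":
                    names.update(_string_literals(keyword.value))
    return names


def _string_literals(node: ast.AST) -> Set[str]:
    """Return the strings in a literal, or in a list/tuple of literals."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return {node.value}
    if isinstance(node, (ast.List, ast.Tuple)):
        return {text for element in node.elts for text in _string_literals(element)}
    return set()


# Hypothetical generated code and column names, for illustration only.
generated = '''
df = dfs[0]
counts = df.groupby("software_version")["device_id"].count()
result = {"type": "dataframe", "value": counts}
'''
actual_columns = {"Device ID", "Software Version"}
print(sorted(referenced_columns(generated) - actual_columns))
# prints ['device_id', 'software_version']
```

A non-empty difference could be turned into a descriptive error, or fed back to the model as a correction hint, instead of letting pandas raise the bare `KeyError`.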


dosubot bot added the "bug" (Something isn't working) label · Jun 20, 2024
