
Last code generated never used for prompt generation #1235

Open · sujeendran opened this issue on Jun 14, 2024 · 2 comments
Labels: bug (Something isn't working)

Comments

@sujeendran commented:

System Info

OS version: Windows 10
Python version: 3.10
PandasAI version: 2.0.40

🐛 Describe the bug

While using this library, I noticed that last_code_generated is always None during prompt generation here:

```python
last_code_generated=context.get("last_code_generated"),
```

because the context's intermediate values are reset at the start of each run:

```python
self.context.reset_intermediate_values()
```

Is there any case where it is actually populated? I am asking because, during an ongoing chat session, if I want the LLM to make a small correction to the existing result, it cannot do so without regenerating the code from scratch. Passing the previous code, together with a check that the last conversation is still relevant, should give better results. (I just realized that the agent function for this was removed in the last release.)

Is there a way to pass agent.last_code_generated into prompt generation somehow?
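
The reported behavior can be reproduced in isolation. The MockContext below is a simplified, illustrative stand-in for the real PipelineContext, not the actual PandasAI class:

```python
# Minimal illustration of the reported behavior. MockContext is a
# simplified stand-in for PipelineContext, not the real class.
class MockContext:
    def __init__(self):
        self.intermediate_values = {}

    def add(self, key, value):
        self.intermediate_values[key] = value

    def get(self, key):
        return self.intermediate_values.get(key)

    def reset_intermediate_values(self):
        self.intermediate_values = {}


context = MockContext()

# Turn 1: code is generated and stored on the context.
context.add("last_code_generated", "result = df.describe()")

# Turn 2: the pipeline resets intermediate values before prompt
# generation, so the stored code is gone by the time it is read.
context.reset_intermediate_values()
print(context.get("last_code_generated"))  # None
```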

dosubot bot added the bug label on Jun 14, 2024

dosubot bot commented on Jun 14, 2024:

Hello, @sujeendran! I'm here to assist you with any bugs, questions, or contributions you have for the pandas-ai project. Let me know how I can help you!

Yes, there is a way to pass agent.last_code_generated into prompt generation in the PandasAI library. The PipelineContext object can store and retrieve the last_code_generated value, and the PromptGeneration step can be modified to include it in the prompt.

Here is an example of how you can modify the PromptGeneration step to include last_code_generated:

```python
from typing import Any, Union

from pandasai.pipelines.logic_unit_output import LogicUnitOutput
from ...helpers.logger import Logger
from ...prompts.base import BasePrompt
from ...prompts.generate_python_code import GeneratePythonCodePrompt
from ...prompts.generate_python_code_with_sql import GeneratePythonCodeWithSQLPrompt
from ..base_logic_unit import BaseLogicUnit
from ..pipeline_context import PipelineContext


class PromptGeneration(BaseLogicUnit):
    """
    Code Prompt Generation Stage
    """

    def execute(self, input: Any, **kwargs) -> Any:
        """
        This method will return output according to
        Implementation.

        :param input: Your input data.
        :param kwargs: A dictionary of keyword arguments.
            - 'logger' (any): The logger for logging.
            - 'config' (Config): Global configurations for the test
            - 'context' (any): The execution context.

        :return: The result of the execution.
        """
        self.context: PipelineContext = kwargs.get("context")
        self.logger: Logger = kwargs.get("logger")

        prompt = self.get_chat_prompt(self.context)
        self.logger.log(f"Using prompt: {prompt}")

        return LogicUnitOutput(
            prompt,
            True,
            "Prompt Generated Successfully",
            {"content_type": "prompt", "value": prompt.to_string()},
        )

    def get_chat_prompt(self, context: PipelineContext) -> Union[str, BasePrompt]:
        # set matplotlib as the default library
        viz_lib = "matplotlib"
        if context.config.data_viz_library:
            viz_lib = context.config.data_viz_library

        output_type = context.get("output_type")

        return (
            GeneratePythonCodeWithSQLPrompt(
                context=context,
                last_code_generated=context.get("last_code_generated"),
                viz_lib=viz_lib,
                output_type=output_type,
            )
            if context.config.direct_sql
            else GeneratePythonCodePrompt(
                context=context,
                last_code_generated=context.get("last_code_generated"),
                viz_lib=viz_lib,
                output_type=output_type,
            )
        )
```

In the GenerateChatPipeline class, ensure that the last_code_generated is stored in the context after code generation:

```python
def run_generate_code(self, input: ChatPipelineInput) -> dict:
    self._logger.log(f"Executing Pipeline: {self.__class__.__name__}")

    # Reset intermediate values
    self.context.reset_intermediate_values()

    # Start New Tracking for Query
    self.query_exec_tracker.start_new_track(input)

    self.query_exec_tracker.add_skills(self.context)

    self.query_exec_tracker.add_dataframes(self.context.dfs)

    # Add Query to memory
    self.context.memory.add(input.query, True)

    self.context.add_many(
        {
            "output_type": input.output_type,
            "last_prompt_id": input.prompt_id,
        }
    )
    try:
        output = self.code_generation_pipeline.run(input)

        # Store the last code generated in the context
        self.context.add("last_code_generated", output.get("value"))

        self.query_exec_tracker.success = True

        self.query_exec_tracker.publish()

        return output

    except Exception as e:
        import traceback

        traceback.print_exc()

        self.last_error = str(e)
        self.query_exec_tracker.success = False
        self.query_exec_tracker.publish()

        return (
            "Unfortunately, I was not able to answer your question, "
            "because of the following error:\n"
            f"\n{e}\n"
        )
```

Additionally, you can use the Callbacks class to handle the last_code_generated variable. This class provides methods that are called at different stages of the code generation process, ensuring that the last_code_generated is always up-to-date:

```python
from ..prompts import BasePrompt


class Callbacks:
    def __init__(self, agent):
        self.agent = agent

    def on_prompt_generation(self, prompt: BasePrompt) -> str:
        """
        A method to be called after prompt generation.

        Args:
            prompt (BasePrompt): The generated prompt
        """
        self.agent.last_prompt = str(prompt)

    def on_code_generation(self, code: str):
        """
        A method to be called after code generation.

        Args:
            code (str): The generated Python code
        """
        self.agent.last_code_generated = code

    def before_code_execution(self, code: str):
        """
        A method to be called before code execution.

        Args:
            code (str): The Python code about to run
        """
        self.agent.last_code_executed = code

    def on_result(self, result):
        """
        A method to be called after code execution.

        Args:
            result (Any): The execution result
        """
        self.agent.last_result = result
```

By making these modifications, the last_code_generated value will be included in prompt generation, allowing small corrections to an existing result without regenerating the code from scratch. Additionally, the BaseAgent class in the PandasAI library already includes attributes like last_code_generated and methods for handling the context and callbacks, which can be leveraged to ensure last_code_generated is properly managed [1][2][3][4].
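
Assuming the modifications above are in place, the intended round trip between turns can be sketched as follows. Everything here (Context, run_turn, the lambdas) is a simplified illustrative stand-in, not the actual PandasAI API:

```python
# Sketch of the intended round trip between chat turns. Context and
# run_turn are simplified stand-ins for the real pipeline objects.
class Context:
    def __init__(self):
        self._values = {}

    def add(self, key, value):
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)


def run_turn(ctx, query, generate_code):
    # Note: "last_code_generated" is NOT reset here, so the previous
    # turn's code is available when building the prompt for this turn.
    previous = ctx.get("last_code_generated")
    code = generate_code(query, previous)
    ctx.add("last_code_generated", code)  # persist for the next turn
    return code


ctx = Context()
first = run_turn(ctx, "plot sales", lambda q, prev: "df.plot()")
second = run_turn(
    ctx,
    "make the line red",
    lambda q, prev: prev + "  # adjusted: color='red'",
)
print(second)  # df.plot()  # adjusted: color='red'
```

Because the previous code survives in the context, the second turn can amend it instead of regenerating from scratch.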


@robustgoku commented:

@dosu what about the relevant dataframe or dataset it used to run the code on? How to get that?
