
Last code generated never used for prompt generation #1235

Open · sujeendran opened this issue on Jun 14, 2024 · 2 comments
Labels: bug (Something isn't working)

Comments

@sujeendran commented:

System Info

OS version: Windows 10
Python version: 3.10
PandasAI version: 2.0.40

🐛 Describe the bug

While using this library, I noticed that last_code_generated is always None during prompt generation here:

```python
last_code_generated=context.get("last_code_generated"),
```

because the context's intermediate values are reset at the start of each run:

```python
self.context.reset_intermediate_values()
```

Is there any case where it is actually populated? I am asking because, during an ongoing chat session, if I want the LLM to make a small correction to the existing result, it cannot do so without regenerating the code from scratch. Passing the previous code, together with a check that the last conversation is still relevant, should give better results. (I just realized that the agent function for this was removed in the last release.)

Is there a way to pass agent.last_code_generated into prompt generation somehow?
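
The reported behavior can be reproduced in isolation. The MockContext below is a simplified, illustrative stand-in for the real PipelineContext, not the actual PandasAI class:

```python
# Minimal illustration of the reported behavior. MockContext is a
# simplified stand-in for PipelineContext, not the real class.
class MockContext:
    def __init__(self):
        self.intermediate_values = {}

    def add(self, key, value):
        self.intermediate_values[key] = value

    def get(self, key):
        return self.intermediate_values.get(key)

    def reset_intermediate_values(self):
        self.intermediate_values = {}


context = MockContext()

# Turn 1: code is generated and stored on the context.
context.add("last_code_generated", "result = df.describe()")

# Turn 2: the pipeline resets intermediate values before prompt
# generation, so the stored code is gone by the time it is read.
context.reset_intermediate_values()
print(context.get("last_code_generated"))  # None
```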

dosubot bot added the bug label on Jun 14, 2024

dosubot bot commented on Jun 14, 2024:

Hello, @sujeendran! I'm here to assist you with any bugs, questions, or contributions you have for the pandas-ai project. Let me know how I can help you!

Yes, there is a way to pass agent.last_code_generated into prompt generation in the PandasAI library. The PipelineContext object can store and retrieve the last_code_generated value, and the PromptGeneration step can be modified to include it in the prompt.

Here is an example of how you can modify the PromptGeneration step to include last_code_generated:

```python
from typing import Any, Union

from pandasai.pipelines.logic_unit_output import LogicUnitOutput
from ...helpers.logger import Logger
from ...prompts.base import BasePrompt
from ...prompts.generate_python_code import GeneratePythonCodePrompt
from ...prompts.generate_python_code_with_sql import GeneratePythonCodeWithSQLPrompt
from ..base_logic_unit import BaseLogicUnit
from ..pipeline_context import PipelineContext


class PromptGeneration(BaseLogicUnit):
    """
    Code Prompt Generation Stage
    """

    def execute(self, input: Any, **kwargs) -> Any:
        """
        This method will return output according to
        Implementation.

        :param input: Your input data.
        :param kwargs: A dictionary of keyword arguments.
            - 'logger' (any): The logger for logging.
            - 'config' (Config): Global configurations for the test
            - 'context' (any): The execution context.

        :return: The result of the execution.
        """
        self.context: PipelineContext = kwargs.get("context")
        self.logger: Logger = kwargs.get("logger")

        prompt = self.get_chat_prompt(self.context)
        self.logger.log(f"Using prompt: {prompt}")

        return LogicUnitOutput(
            prompt,
            True,
            "Prompt Generated Successfully",
            {"content_type": "prompt", "value": prompt.to_string()},
        )

    def get_chat_prompt(self, context: PipelineContext) -> Union[str, BasePrompt]:
        # set matplotlib as the default library
        viz_lib = "matplotlib"
        if context.config.data_viz_library:
            viz_lib = context.config.data_viz_library

        output_type = context.get("output_type")

        return (
            GeneratePythonCodeWithSQLPrompt(
                context=context,
                last_code_generated=context.get("last_code_generated"),
                viz_lib=viz_lib,
                output_type=output_type,
            )
            if context.config.direct_sql
            else GeneratePythonCodePrompt(
                context=context,
                last_code_generated=context.get("last_code_generated"),
                viz_lib=viz_lib,
                output_type=output_type,
            )
        )
```

In the GenerateChatPipeline class, ensure that the last_code_generated is stored in the context after code generation:

```python
def run_generate_code(self, input: ChatPipelineInput) -> dict:
    self._logger.log(f"Executing Pipeline: {self.__class__.__name__}")

    # Reset intermediate values
    self.context.reset_intermediate_values()

    # Start New Tracking for Query
    self.query_exec_tracker.start_new_track(input)

    self.query_exec_tracker.add_skills(self.context)

    self.query_exec_tracker.add_dataframes(self.context.dfs)

    # Add Query to memory
    self.context.memory.add(input.query, True)

    self.context.add_many(
        {
            "output_type": input.output_type,
            "last_prompt_id": input.prompt_id,
        }
    )
    try:
        output = self.code_generation_pipeline.run(input)

        # Store the last code generated in the context
        self.context.add("last_code_generated", output.get("value"))

        self.query_exec_tracker.success = True

        self.query_exec_tracker.publish()

        return output

    except Exception as e:
        import traceback

        traceback.print_exc()

        self.last_error = str(e)
        self.query_exec_tracker.success = False
        self.query_exec_tracker.publish()

        return (
            "Unfortunately, I was not able to answer your question, "
            "because of the following error:\n"
            f"\n{e}\n"
        )
```

Additionally, you can use the Callbacks class to handle the last_code_generated variable. This class provides methods that are called at different stages of the code generation process, ensuring that the last_code_generated is always up-to-date:

```python
from ..prompts import BasePrompt


class Callbacks:
    def __init__(self, agent):
        self.agent = agent

    def on_prompt_generation(self, prompt: BasePrompt) -> str:
        """
        A method to be called after prompt generation.

        Args:
            prompt (BasePrompt): The generated prompt
        """
        self.agent.last_prompt = str(prompt)

    def on_code_generation(self, code: str):
        """
        A method to be called after code generation.

        Args:
            code (str): The generated Python code
        """
        self.agent.last_code_generated = code

    def before_code_execution(self, code: str):
        """
        A method to be called before code execution.

        Args:
            code (str): The Python code about to run
        """
        self.agent.last_code_executed = code

    def on_result(self, result):
        """
        A method to be called after code execution.

        Args:
            result (Any): The execution result
        """
        self.agent.last_result = result
```

By making these modifications, the last_code_generated value will be included in prompt generation, allowing small corrections to an existing result without regenerating the code from scratch. Additionally, the BaseAgent class in the PandasAI library already includes attributes like last_code_generated and methods for handling the context and callbacks, which can be leveraged to ensure last_code_generated is properly managed [1][2][3][4].
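
Assuming the modifications above are in place, the intended round trip between turns can be sketched as follows. Everything here (Context, run_turn, the lambdas) is a simplified illustrative stand-in, not the actual PandasAI API:

```python
# Sketch of the intended round trip between chat turns. Context and
# run_turn are simplified stand-ins for the real pipeline objects.
class Context:
    def __init__(self):
        self._values = {}

    def add(self, key, value):
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)


def run_turn(ctx, query, generate_code):
    # Note: "last_code_generated" is NOT reset here, so the previous
    # turn's code is available when building the prompt for this turn.
    previous = ctx.get("last_code_generated")
    code = generate_code(query, previous)
    ctx.add("last_code_generated", code)  # persist for the next turn
    return code


ctx = Context()
first = run_turn(ctx, "plot sales", lambda q, prev: "df.plot()")
second = run_turn(
    ctx,
    "make the line red",
    lambda q, prev: prev + "  # adjusted: color='red'",
)
print(second)  # df.plot()  # adjusted: color='red'
```

Because the previous code survives in the context, the second turn can amend it instead of regenerating from scratch.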


@robustgoku commented:

@dosu what about the relevant dataframe or dataset it used to run the code on? How to get that?
