
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

🤗 Data • 📃 Paper

Effective weakness discovery can guide model enhancement. AutoDetect achieves high identification success rates on the instruction-following, mathematics, and coding tasks (A). Moreover, by leveraging the data it produces, we can further improve LLMs (B).

(Figure: AutoDetect overview)


Table of Contents

  • Data
  • Quick Start
  • Acknowledgement
  • Citation

Data

Our dataset can be found on Hugging Face.

We release data from the AutoDetect process for the models listed in the paper. Each entry has the following format:

{
    "key_point": {key_point},
    "prompt": {prompt},
    "answer": {answer},
    "ref_ans": {ref_ans},
    "comparison": {comparison},
    "score": {score}
}
  • {key_point}: The summary of the key knowledge encapsulated in the {prompt}.
  • {prompt}: The question used to evaluate the target LLM.
  • {answer}: The response generated by the target LLM.
  • {ref_ans}: The reference response generated by GPT-4 Turbo.
  • {comparison}: The comparison output from the LLM judge (GPT-4 Turbo).
  • {score}: The final evaluation score, ranging from 1 to 10.

By leveraging the prompt alongside the ref_ans, we can effectively enhance the performance of the target LLM.
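
For example, one simple way to use this data is to turn low-scoring cases into supervised fine-tuning pairs. The sketch below is illustrative only: the filename, the assumption that the file holds a JSON array of records, and the score threshold are ours rather than part of the released code; only the field names follow the format above.

import json

# Illustrative sketch (not part of the released code): build fine-tuning pairs
# from AutoDetect records. "autodetect_data.json" is a placeholder filename and
# is assumed to contain a JSON array in the format documented above.
with open("autodetect_data.json") as f:
    records = json.load(f)

# Keep cases where the target LLM scored poorly (the threshold of 5 is
# illustrative) and pair each prompt with the GPT-4 Turbo reference answer.
sft_pairs = [
    {"instruction": r["prompt"], "output": r["ref_ans"]}
    for r in records
    if r["score"] <= 5
]
print(f"Collected {len(sft_pairs)} training pairs from {len(records)} records")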

Quick Start

Throughout the code, we have added #TODO comments to mark places that need modification before running. Please update the relevant parts as noted before executing each file.

Setup

pip install -r requirements.txt

Weakness Detection

We provide the code for AutoDetect on the instruction-following, mathematics, and coding tasks; the usage is identical for all three.

Here we take the instruction-following task as an example. To run AutoDetect, execute the following commands:

cd autodetect_if/scripts
bash autodetect_if

The results will be stored under autodetect_if/result.

To calculate the weakness identification success rate, run the following command:

python count_res.py
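
Conceptually, the success rate is the fraction of generated test cases on which the target LLM receives a low judge score. The sketch below only illustrates that computation; the result file layout and the score threshold are assumptions, and count_res.py remains the authoritative implementation.

import json
from pathlib import Path

# Illustrative sketch only (see count_res.py for the actual implementation).
# A case is counted as an identified weakness when the judge score falls at
# or below an assumed threshold on the 1-10 scale.
THRESHOLD = 5

def success_rate(result_dir="autodetect_if/result"):
    scores = []
    for path in Path(result_dir).glob("*.json"):  # assumed result file layout
        with open(path) as f:
            for record in json.load(f):
                scores.append(record["score"])
    if not scores:
        return 0.0
    return sum(s <= THRESHOLD for s in scores) / len(scores)

print(f"Identification success rate: {success_rate():.2%}")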

Acknowledgement

Citation

@article{cheng2024autodetect,
  title={AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models},
  author={Cheng, Jiale and Lu, Yida and Gu, Xiaotao and Ke, Pei and Liu, Xiao and Dong, Yuxiao and Wang, Hongning and Tang, Jie and Huang, Minlie},
  journal={arXiv preprint arXiv:2406.16714},
  year={2024}
}
