What is the procedure for inference and code extraction on HumanEval?
My score is only 20.73 on HumanEval (k=1, model = CodeLlama-7b-Instruct-hf+DPO).
Hi! My sincere apologies for the delayed response.
For HumanEval+, we used the EvalPlus implementation: https://github.com/evalplus/evalplus. You can clone the repo and add the generate.py file to the /evalplus subfolder. You also need to add the evalplus/templates.py file.
Next, you can run generation for HumanEval+ as follows:
In the paper, we use --temperature=0.8 and n_samples=10 to report Pass@1 and Pass@10.
After generating the responses, you can use evalplus/evaluate.py to obtain the metrics.
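For reference, here is a minimal sketch of how Pass@k is commonly estimated from n samples per problem. It uses the standard unbiased estimator, 1 - C(n-c, k) / C(n, k), where c is the number of samples that pass all tests. The function names `pass_at_k` and `mean_pass_at_k` are illustrative and not taken from evalplus/evaluate.py; treat this as a way to sanity-check reported numbers rather than as the exact code path the script runs.

```python
from math import comb
from typing import Sequence


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate for one problem.

    n: number of generated samples, c: number of samples that pass, k: budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples fail)
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts: Sequence[int], n: int, k: int) -> float:
    """Average the per-problem estimate over the whole benchmark."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


if __name__ == "__main__":
    # Hypothetical per-problem pass counts with n_samples=10 (made-up data).
    counts = [3, 0, 10, 1]
    print("Pass@1 :", mean_pass_at_k(counts, n=10, k=1))
    print("Pass@10:", mean_pass_at_k(counts, n=10, k=10))
```

Note that with n_samples=10, Pass@1 is simply the average fraction of passing samples per problem, so a score obtained from a single sample per problem (for example with greedy decoding) is not directly comparable to a Pass@1 computed from 10 samples at temperature 0.8.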