It would be nice to know the rough number of expected tokens in and out, to estimate the potential cost if I were to run it myself, especially for the hard benchmark, as this doesn't seem to be detailed in your paper.
In the paper you mention that BigCodeBench-Complete has roughly 1,112.5 chars per prompt and 426 chars per answer across 1,140 questions. That's enough for a very rough cost estimate.
This roughly gives, for Complete: 1140 * (1112.5 + 426) = 1,753,890 chars * 0.75 = 1,315,418 tokens.
Or for Instruct: 1140 * (663.2 + 426) = 1,241,688 chars * 0.75 = 931,266 tokens.
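For reference, here is the estimate above as a small Python sketch. The 0.75 chars-to-tokens factor is my own rough assumption (actual tokenizer ratios vary by model), and the per-question character averages are the ones reported in the paper:

```python
def estimate_tokens(avg_prompt_chars, avg_answer_chars,
                    n_tasks=1140, tokens_per_char=0.75):
    """Very rough total-token estimate for a full benchmark run.

    tokens_per_char is an assumed conversion factor, not an official figure.
    """
    total_chars = n_tasks * (avg_prompt_chars + avg_answer_chars)
    return round(total_chars * tokens_per_char)

# BigCodeBench-Complete: ~1,112.5 prompt chars + ~426 answer chars per task
print(estimate_tokens(1112.5, 426))  # 1315418

# BigCodeBench-Instruct: ~663.2 prompt chars + ~426 answer chars per task
print(estimate_tokens(663.2, 426))   # 931266
```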
Could we potentially get similar figures for the hard benchmark dataset?