What exactly counts toward the 16MB artifact size?
train_gpt.py script.
The cap is decimal 16MB (16,000,000 total bytes), not 16 MiB (16,777,216 bytes).
No external downloads, training dataset access, or network calls are allowed during evaluation. The artifact must be fully self-contained and reproducible. Specifically:
- train_gpt.py is measured as raw UTF-8 bytes
- The model is measured as compressed bytes in the final_model.int8.ptz artifact (int8-quantized weights, zlib-compressed)
- Any external data your model needs at eval time must be baked into the 16MB limit
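
As a rough pre-submission check, the size accounting described above can be sketched in a few lines of Python. This is only an illustration, not the official measurement script: it assumes the total is the on-disk size of train_gpt.py plus the on-disk size of the compressed model file, and it assumes a total of exactly 16,000,000 bytes is still acceptable. The function names are hypothetical.

```python
import os

# Decimal 16MB as stated in the FAQ, not 16 MiB (16,777,216 bytes).
ARTIFACT_CAP_BYTES = 16_000_000


def total_artifact_bytes(code_bytes: int, model_bytes: int) -> int:
    """Sum code and compressed-model sizes (assumed accounting, see lead-in)."""
    return code_bytes + model_bytes


def fits_cap(total_bytes: int, cap: int = ARTIFACT_CAP_BYTES) -> bool:
    """Assumption: a total equal to the cap is treated as within the limit."""
    return total_bytes <= cap


def measure_files(code_path: str, model_path: str) -> int:
    """Measure both files on disk.

    For a UTF-8 source file, the on-disk size equals its raw UTF-8 byte count.
    """
    return total_artifact_bytes(
        os.path.getsize(code_path), os.path.getsize(model_path)
    )
```

For example, calling fits_cap(measure_files("train_gpt.py", "final_model.int8.ptz")) after a run gives a quick yes/no answer. Note that a 16,500,000-byte artifact would pass a 16 MiB check but fails this decimal one.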
Are scores independently verified by OpenAI?
train_gpt.py under logs/) is required for all submissions and is the primary means of verification.
What counts as 'external compute'? Is it fair to tune hyperparameters offline?
What are the restrictions on evaluation?
- Time limit: Submissions must complete evaluation in under 10 minutes on 8xH100 SXM (note: this is in addition to the 10-minute training limit)
- Sequence length: Evaluation at any sequence length is allowed
- Training data access: You cannot access any training data during evaluation unless you pay for those bits within the 16MB limit
- Evaluation methods: Aggressive, creative evaluation strategies are explicitly encouraged — push the bounds just as you would with training
Can I use a different tokenizer?
val_bpb is correctly calculated. Tokenizer bugs, such as miscounting bytes per token, can unjustly improve your score and will result in disqualification.
If you retokenize from scratch, use data/download_hf_docs_and_tokenize.py with the published docs_selected.jsonl and docs_selected.source_manifest.json to ensure you are operating on the exact same document set as the baseline.
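
To see why byte counting matters, here is a minimal sketch of a bits-per-byte calculation. It assumes val_bpb means total cross-entropy in bits divided by the total number of UTF-8 bytes in the validation text; the repository's actual computation may differ in detail, and the function names here are hypothetical.

```python
import math
from typing import Iterable


def count_utf8_bytes(texts: Iterable[str]) -> int:
    """Count bytes from the original text, independent of any tokenizer."""
    return sum(len(t.encode("utf-8")) for t in texts)


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert summed token losses (in nats) to bits, then divide by bytes.

    The denominator must be the true byte count of the evaluated text.
    Overcounting bytes per token inflates it and lowers the reported score.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)
```

Because the numerator depends on the tokenizer but the denominator should not, one sanity check for a custom tokenizer is to confirm that the bytes attributed to all tokens sum to count_utf8_bytes over the same documents.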
What GPUs can I use for development?
What Python dependencies are pre-installed on Runpod?
requirements.txt are pre-installed in the official Runpod template image. After cloning the repo, you can immediately run the training script and data download commands without any pip install steps.
If you are running on a different machine or image, install the required packages manually before proceeding.
Can I access the FineWeb training data during evaluation?
Where do I report issues with a submission?
How are inter-run variance and statistical significance handled?
