PRIME
Published:
Scalable RL solution for advanced reasoning of language models.
Published:
A large-scale, fine-grained, diverse preference dataset (and models).
Published:
The implementation and evaluation of Mistral-Interact, a powerful model that proactively assesses task vagueness, inquires about user intentions, and refines them into actionable goals before starting downstream agent task execution.
Published:
NeurIPS 2022 Datasets & Benchmarks. The pipeline of the OpenBackdoor toolkit: