Notice ID: CAW-AISI-0002
NIST is performing market research to identify potential sources for an anticipated contract to assist in developing evaluations and benchmarks of AI models’ relevant cyber capabilities and risks.
The Contractor must provide or develop resources for various aspects of assessing frontier AI model cyber capabilities and risks. The Contractor would be responsible for conducting one or more of the tasks in List A to assess one or more of the capabilities in List B.
Contractor Tasks:
LIST A – Contractors must provide or develop resources for one or more of the following.
- Developing benchmarks and scoring mechanisms for automated evaluation of AI models’ relevant cyber capabilities based on real or realistic offensive cyber tasks or workflows.
- Developing tasks for automated evaluation of AI models’ relevant cyber capabilities with accompanying data on human baseline performance (e.g., how long the tasks take human experts to complete) …
LIST B – Relevant frontier model capabilities to elicit, evaluate, and benchmark include:
- Capabilities that enable a model to discover vulnerabilities in real or realistic code bases, web resources, or networks;
- Capabilities that enable a model to develop working exploits for discovered or known vulnerabilities in real or realistic code bases, web resources, or networks…