---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---
# Requirement Checker Adapters
## Model Summary
This **Requirement Checker** family of adapters is designed to check whether a specified requirement was satisfied by the last model generation. Only one requirement is checked per call; multiple requirements can be checked with parallel model calls.
- **Developer:** IBM Research
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
## Usage
### Intended use
**Usage steps.** Given a generation task and a set of requirements:
1. Use the base model to generate a response as normal (via the `assistant` role), with the prompt describing the task followed by "Requirements:" and the list of active requirements.
2. Repeat the single requirement to be checked (see the sketch after this list).
3. The Requirement Checker model will respond with "true" or "false", where "true" means the requirement is satisfied.
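A minimal sketch of the conversation layout these steps describe is below. The exact serialization of the requirement-check turn is adapter-specific and, in practice, is constructed for you by `granite_common` (see the quickstart below); this sketch assumes, for illustration only, that the repeated requirement is appended as a follow-up `user` turn.

```python
# Illustrative only: the real prompt construction is handled by granite_common.
messages = [
    {
        # Step 1: the task prompt followed by "Requirements:" and the active requirements.
        "role": "user",
        "content": "What is IBM?\nRequirements: Use a formal tone.\n Do not use long words.",
    },
    {
        # The base model's response, generated as normal.
        "role": "assistant",
        "content": "<response generated by the base model>",
    },
    {
        # Step 2 (assumed layout): repeat the single requirement to be checked.
        "role": "user",
        "content": "Use a formal tone.",
    },
]
# Step 3: the Requirement Checker adapter answers with "true" or "false".
```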
### Quickstart Example
First, follow the instructions elsewhere in this repo to start a vLLM server hosting the LoRAs and/or aLoRAs. Once the server is running, it can be queried via the OpenAI API. An example for this intrinsic follows.
```python
import openai

import granite_common

PROMPT = "What is IBM?"
REQUIREMENTS = "Use a formal tone.\n Do not use long words."
RESPONSE = ...  # this should be generated by the base model corresponding to the chosen adapter
REQUIREMENT_TO_CHECK = "Use a formal tone."

# Original conversation: the task prompt plus the active requirements,
# followed by the base model's response.
request = {
    "messages": [
        {
            "role": "user",
            "content": PROMPT + "\nRequirements: " + REQUIREMENTS,
        },
        {
            "role": "assistant",
            "content": RESPONSE,
        },
    ],
    "model": "requirement_check",
    "temperature": 0.0,
}

openai_base_url = ...
openai_api_key = ...

# The io.yaml file describes how to rewrite requests for, and parse results from,
# the requirement_check intrinsic.
io_yaml_file = "./rag_intrinsics_lib/requirement_check/.../io.yaml"
rewriter = granite_common.IntrinsicsRewriter(config_file=io_yaml_file)
result_processor = granite_common.IntrinsicsResultProcessor(config_file=io_yaml_file)

# Rewrite the request so that it checks the single requirement of interest.
rewritten_request = rewriter.transform(request, requirement=REQUIREMENT_TO_CHECK)

# Send the rewritten request to the vLLM server via the OpenAI client.
client = openai.OpenAI(base_url=openai_base_url, api_key=openai_api_key)
chat_completion = client.chat.completions.create(**rewritten_request.model_dump())

# Post-process the raw completion into the intrinsic's output format.
transformed_completion = result_processor.transform(chat_completion)
print(transformed_completion.model_dump_json(indent=2))
```
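Continuing from the quickstart above, the sketch below reads the adapter's verdict and checks several requirements with one independent call per requirement. It assumes the transformed completion keeps the standard OpenAI chat-completion shape, with the "true"/"false" answer in `choices[0].message.content`; the helper `check_requirement` is a hypothetical convenience wrapper, not part of `granite_common`.

```python
def check_requirement(requirement: str) -> bool:
    """Hypothetical helper: run one requirement check and return True if satisfied."""
    rewritten = rewriter.transform(request, requirement=requirement)
    completion = client.chat.completions.create(**rewritten.model_dump())
    completion = result_processor.transform(completion)
    # Assumed output shape: the adapter answers with the string "true" or "false".
    return completion.choices[0].message.content.strip().lower() == "true"


# Check each active requirement with its own call.
for req in REQUIREMENTS.split("\n"):
    print(req.strip(), "->", check_requirement(req.strip()))
```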
## Evaluation
The model was evaluated on 200 rows of held-out synthetic data. Error rates are as follows:

| Base model | aLoRA error rate | LoRA error rate |
|---|---|---|
| Granite 3.3 2B | 6.0% | 4.5% |
| Granite 3.3 8B | 5.75% | 4.0% |
| GPT-OSS 20B | 5.75% | 4.0% |
### Training Data
Synthetic data generated by Mixtral 8x22b and GPT-OSS 120B.
## Model Card Authors
Kristjan Greenewald
Bo Wu