Commit 2a9be9d (verified) · committed by nanditasn · 1 Parent(s): 64562fd

Update README.md

Files changed (1)
  1. README.md +78 -75
README.md CHANGED
@@ -4,41 +4,90 @@ license: cc-by-nc-sa-4.0
4
  pipeline_tag: text-ranking
5
  ---
6
 
 
 
7
  # Contextual AI Reranker v2 6B-NVFP4
8
 
9
  ## Highlights
10
 
11
- Our reranker is on the cost/performance Pareto frontier across 5 key areas:
12
- - Instruction following (including capability to rank more recent information higher)
13
- - Question answering
14
- - Multilinguality
15
- - Product search / recommendation systems
 
16
  - Real-world use cases
17
 
18
  <p align="center">
19
  <img src="main_benchmark.png" width="1200"/>
20
  </p>
21
 
22
- For more details on these and other benchmarks, please refer to our [blogpost](https://contextual.ai/blog/rerank-v2).
23
 
24
  ## Overview
25
 
26
- - Model Type: Text Reranking
27
- - Supported Languages: 100+
28
- - Number of Paramaters: 6B
29
- - Quantization: NVFP4
30
- - Context Length: up to 32K
31
- - Blogpost: https://contextual.ai/blog/rerank-v2
32
 
33
  ## Quickstart
34
 
35
- ### vLLM usage
36
 
37
- Requires vllm==0.10.0 for NVFP4 or vllm>=0.8.5 for BF16.
 
 
38
 
39
  ```python
40
  import os
41
- os.environ['VLLM_USE_V1'] = '0' # v1 engine doesnt support logits processor yet
42
 
43
  import torch
44
  from vllm import LLM, SamplingParams
@@ -98,66 +147,20 @@ def infer_w_vllm(model_path: str, query: str, instruction: str, documents: list[
98
  print(f"Instruction: {instruction}")
99
  for score, doc_id, doc in results:
100
  print(f"Score: {score:.4f} | Doc: {doc}")
101
- ```
102
-
103
 
104
- ### Transformers Usage
105
 
106
- Requires transformers>=4.51.0 for BF16. Not supported for NVFP4.
107
-
108
- ```python
109
- import torch
110
- from transformers import AutoTokenizer, AutoModelForCausalLM
111
-
112
-
113
- def format_prompts(query: str, instruction: str, documents: list[str]) -> list[str]:
114
- """Format query and documents into prompts for reranking."""
115
- if instruction:
116
- instruction = f" {instruction}"
117
- prompts = []
118
- for doc in documents:
119
- prompt = f"Check whether a given document contains information helpful to answer the query.\n<Document> {doc}\n<Query> {query}{instruction} ??"
120
- prompts.append(prompt)
121
- return prompts
122
-
123
-
124
- def infer_w_hf(model_path: str, query: str, instruction: str, documents: list[str]):
125
- device = "cuda" if torch.cuda.is_available() else "cpu"
126
- dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32
127
-
128
- tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)
129
- if tokenizer.pad_token is None:
130
- tokenizer.pad_token = tokenizer.eos_token
131
- tokenizer.padding_side = "left" # so -1 is the real last token for all prompts
132
-
133
- model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=dtype).to(device)
134
- model.eval()
135
-
136
- prompts = format_prompts(query, instruction, documents)
137
- enc = tokenizer(
138
- prompts,
139
- return_tensors="pt",
140
- padding=True,
141
- truncation=True,
142
- )
143
- input_ids = enc["input_ids"].to(device)
144
- attention_mask = enc["attention_mask"].to(device)
145
-
146
- with torch.no_grad():
147
- out = model(input_ids=input_ids, attention_mask=attention_mask)
148
-
149
- next_logits = out.logits[:, -1, :] # [batch, vocab]
150
-
151
- scores_bf16 = next_logits[:, 0].to(torch.bfloat16)
152
- scores = scores_bf16.float().tolist()
153
-
154
- # Sort by score (descending)
155
- results = sorted([(s, i, documents[i]) for i, s in enumerate(scores)], key=lambda x: x[0], reverse=True)
156
-
157
- print(f"Query: {query}")
158
- print(f"Instruction: {instruction}")
159
- for score, doc_id, doc in results:
160
- print(f"Score: {score:.4f} | Doc: {doc}")
161
  ```
162
 
163
  ## Citation
@@ -167,7 +170,7 @@ If you use this model, please cite:
167
  ```bibtex
168
  @misc{ctxl_rerank_v2_instruct_multilingual,
169
  title={Contextual AI Reranker v2},
170
- author={George Halal, Sheshansh Agrawal},
171
  year={2025},
172
  url={https://contextual.ai/blog/rerank-v2},
173
  }
@@ -179,4 +182,4 @@ Creative Commons Attribution Non Commercial Share Alike 4.0 (cc-by-nc-sa-4.0)
179
 
180
  ## Contact
181
 
182
- For questions or issues, please open an issue on the model repository or contact george@contextual.ai.
 
4
  pipeline_tag: text-ranking
5
  ---
6
 
7
+ <div align="center">
8
+
9
  # Contextual AI Reranker v2 6B-NVFP4
10
 
11
+ <img src="Contextual_AI_Brand_Mark_Dark.png" width="10%" alt="Contextual_AI"/>
12
+
13
+ [![Blog Post](https://img.shields.io/badge/📝%20Blog-ContextualReranker-green)](https://contextual.ai/blog/rerank-v2)
14
+ [![Hugging Face Collection](https://img.shields.io/badge/🤗%20Hugging%20Face-Model%20Collection-yellow)](https://huggingface.co/collections/ContextualAI/contextual-ai-reranker-v2)
15
+
16
+ </div>
17
+
18
+ <hr>
19
+
20
  ## Highlights
21
 
22
+ Contextual AI's reranker is the **first instruction-following reranker** capable of handling retrieval conflicts and ranking with custom instructions (e.g., prioritizing recent information). It achieves state-of-the-art performance on BEIR and sits on the cost/performance Pareto frontier across:
23
+
24
+ - Instruction following
25
+ - Question answering
26
+ - Multilinguality (100+ languages)
27
+ - Product search & recommendation
28
  - Real-world use cases
29
 
30
  <p align="center">
31
  <img src="main_benchmark.png" width="1200"/>
32
  </p>
33
 
34
+ For detailed benchmarks, see our [blog post](https://contextual.ai/blog/rerank-v2).
35
 
36
  ## Overview
37
 
38
+ - **Model Type**: Text Reranking
39
+ - **Supported Languages**: 100+
40
+ - **Parameters**: 6B
41
+ - **Precision**: NVFP4 (4-bit floating point)
42
+ - **Context Length**: up to 32K
43
+
44
+ ## When to Use This Model
45
+
46
+ Use this reranker when you need to:
47
+ - Re-rank retrieved documents with custom instructions
48
+ - Handle conflicting information in retrieval results
49
+ - Prioritize documents by recency or other criteria
50
+ - Support multilingual search (100+ languages)
51
+ - Process long contexts (up to 32K tokens)
52
+ - **Maximize efficiency with 4-bit precision (NVFP4)**
53
 
54
  ## Quickstart
55
 
56
+ ### Basic Usage
57
+
58
+ ```python
59
+ # Requires vLLM==0.10.0 for NVFP4 support
60
+ # See full implementation below
61
+
62
+ model_path = "ContextualAI/ctxl-rerank-v2-instruct-multilingual-6b-nvfp4"
63
+
64
+ query = "What are the health benefits of exercise?"
65
+ instruction = "Prioritize recent medical research"
66
+ documents = [
67
+ "Regular exercise reduces risk of heart disease and improves mental health.",
68
+ "A 2024 study shows exercise enhances cognitive function in older adults.",
69
+ "Ancient Greeks valued physical fitness for military training."
70
+ ]
71
+
72
+ infer_w_vllm(model_path, query, instruction, documents)
73
+ ```
74
+
75
+ **Expected Output:**
76
+ ```
77
+ Query: What are the health benefits of exercise?
78
+ Instruction: Prioritize recent medical research
79
+ Score: 0.8542 | Doc: A 2024 study shows exercise enhances cognitive function in older adults.
80
+ Score: 0.7891 | Doc: Regular exercise reduces risk of heart disease and improves mental health.
81
+ Score: 0.4123 | Doc: Ancient Greeks valued physical fitness for military training.
82
+ ```
83
 
84
+ ### vLLM Usage
85
+
86
+ Requires `vllm==0.10.0` for NVFP4 support.
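
Because NVFP4 support is tied to a pinned vLLM release, it can help to check the installed version before loading the model. The following is a minimal sketch using only the standard library. The helper names `installed_vllm_version` and `vllm_version_ok` are hypothetical and not part of vLLM, and treating a local build suffix such as `+cu128` as matching the pin is an assumption.

```python
from importlib import metadata
from typing import Optional

REQUIRED_VLLM = "0.10.0"  # pinned version stated in this README for NVFP4


def installed_vllm_version() -> Optional[str]:
    """Return the installed vLLM version string, or None if vLLM is absent."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


def vllm_version_ok(installed: Optional[str], required: str = REQUIRED_VLLM) -> bool:
    """Return True only when the installed version equals the pinned one.

    Assumption: a local build label after '+' (e.g. '0.10.0+cu128') is ignored.
    """
    if installed is None:
        return False
    return installed.split("+")[0] == required


if __name__ == "__main__":
    found = installed_vllm_version()
    if not vllm_version_ok(found):
        print(f"Expected vllm=={REQUIRED_VLLM} for NVFP4, found: {found}")
```

Running the check at the top of a script, before `from vllm import LLM`, surfaces a version mismatch early rather than during model loading.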
87
 
88
  ```python
89
  import os
90
+ os.environ['VLLM_USE_V1'] = '0' # v1 engine doesn't support logits processor yet
91
 
92
  import torch
93
  from vllm import LLM, SamplingParams
 
147
  print(f"Instruction: {instruction}")
148
  for score, doc_id, doc in results:
149
  print(f"Score: {score:.4f} | Doc: {doc}")
 
 
150
 
 
151
 
152
+ # Example usage
153
+ if __name__ == "__main__":
154
+ model_path = "ContextualAI/ctxl-rerank-v2-instruct-multilingual-6b-nvfp4"
155
+ query = "What are the health benefits of exercise?"
156
+ instruction = "Prioritize recent medical research"
157
+ documents = [
158
+ "Regular exercise reduces risk of heart disease and improves mental health.",
159
+ "A 2024 study shows exercise enhances cognitive function in older adults.",
160
+ "Ancient Greeks valued physical fitness for military training."
161
+ ]
162
+
163
+ infer_w_vllm(model_path, query, instruction, documents)
164
  ```
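
The diff elides most of `infer_w_vllm`, so the sketch below does not reproduce it. It only illustrates, without loading any model, the two steps visible in this commit: the prompt template from the removed Transformers example and the descending sort that produces the printed ranking. The function `rank_documents` and the scores passed to it are hypothetical placeholders, not model output.

```python
def format_prompts(query: str, instruction: str, documents: list[str]) -> list[str]:
    """Build one reranking prompt per document, using the template from the README."""
    if instruction:
        instruction = f" {instruction}"
    return [
        "Check whether a given document contains information helpful to answer the query.\n"
        f"<Document> {doc}\n<Query> {query}{instruction} ??"
        for doc in documents
    ]


def rank_documents(documents: list[str], scores: list[float]) -> list[tuple[float, int, str]]:
    """Pair each score with its document index and text, highest score first."""
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")
    triples = [(score, i, documents[i]) for i, score in enumerate(scores)]
    return sorted(triples, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    docs = [
        "Regular exercise reduces risk of heart disease and improves mental health.",
        "A 2024 study shows exercise enhances cognitive function in older adults.",
        "Ancient Greeks valued physical fitness for military training.",
    ]
    prompts = format_prompts(
        "What are the health benefits of exercise?",
        "Prioritize recent medical research",
        docs,
    )
    print(prompts[0])

    # Placeholder scores for illustration only; a real run would take them from the model.
    placeholder_scores = [0.2, 0.9, -0.5]
    for score, doc_id, doc in rank_documents(docs, placeholder_scores):
        print(f"Score: {score:.4f} | Doc: {doc}")
```

Keeping prompt construction and sorting separate from model inference makes both steps easy to unit test without a GPU.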
165
 
166
  ## Citation
 
170
  ```bibtex
171
  @misc{ctxl_rerank_v2_instruct_multilingual,
172
  title={Contextual AI Reranker v2},
173
+ author={Halal, George and Agrawal, Sheshansh},
174
  year={2025},
175
  url={https://contextual.ai/blog/rerank-v2},
176
  }
 
182
 
183
  ## Contact
184
 
185
+ For questions or issues, please open an issue on the model repository.