MCP-1st-Birthday/DeepBoner

VibecoderMcSwaggins committed 25 days ago

Commit 11ad1dd (1 parent: d1f91a4)

fix(judges): fallback to heuristic extraction when HF quota exhausted

Files changed (3)

docs/bugs/INVESTIGATION_QUOTA_BLOCKER.md +49 -0
src/agent_factory/judges.py +23 -6
tests/unit/agent_factory/test_judges_hf_quota.py +49 -0

docs/bugs/INVESTIGATION_QUOTA_BLOCKER.md ADDED

	@@ -0,0 +1,49 @@

+# Bug Investigation: HF Free Tier Quota Exhaustion
+## Status
+- **Date:** 2025-11-29
+- **Reporter:** CLI User
+- **Component:** `HFInferenceJudgeHandler`
+- **Priority:** High (UX Blocker for Free Tier)
+- **Resolution:** FIXED
+## Issue Description
+On a fresh run with a simple query ("What drugs improve female libido post-menopause?"), the system retrieved 20 valid sources but failed during the Judge/Analysis phase with:
+`⚠️ Free Tier Quota Exceeded ⚠️`
+This results in a "Synthesis" step that has 0 candidates and 0 findings, rendering the application useless for free users once the (very low) limit is hit, despite having valid search results.
+## Evidence
+Output provided:
+```
+### Citations (20 sources)
+...
+### Reasoning
+⚠️ **Free Tier Quota Exceeded** ⚠️
+```
+## Root Cause Analysis
+1. **Search Success:** `SearchAgent` correctly found 20 documents (PubMed/EuropePMC).
+2. **Judge Failure:** `HFInferenceJudgeHandler` called the HF Inference API.
+3. **Quota Trap:** The API returned a 402 (Payment Required) or Quota error.
+4. **Previous Handling:** The handler caught this error and returned a `JudgeAssessment` with `sufficient=True` (to stop the loop) and *empty* fields.
+5. **Data Loss:** The 20 valid search results were effectively discarded from the "Analysis" perspective.
+## The "Deep Blocker"
+The system had a "hard failure" mode for quota exhaustion, assuming that if the LLM can't judge, we have *no* useful information. This "bricked" the UX for free users immediately upon hitting the limit.
+## Solution Implemented
+Modified `HFInferenceJudgeHandler._create_quota_exhausted_assessment` to:
+1. Accept the `evidence` list as an argument.
+2. Perform basic heuristic extraction (borrowed from `MockJudgeHandler` logic):
+   - Use titles as "Key Findings" (first 5 sources).
+   - Add a clear message in "Drug Candidates" telling the user to upgrade.
+3. Return this "Partial" assessment instead of an empty one.
+## Verification
+- Created `tests/unit/agent_factory/test_judges_hf_quota.py` to verify that:
+  - 402 errors are caught.
+  - `sufficient` is set to `True` (stops loop).
+  - `key_findings` are populated from search result titles.
+  - `reasoning` contains the warning message.
+- Ran existing tests `tests/unit/agent_factory/test_judges_hf.py` - All passed.
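
The heuristic described under "Solution Implemented" can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `Citation` and `Evidence` here are simplified stand-in dataclasses (the real models live in `src.utils.models` and carry more fields), and `extract_fallback_findings`, `MAX_FINDINGS`, and `MAX_TITLE_LEN` are hypothetical names chosen for the example. The limits (5 sources, 150 characters) mirror the values in the diff.

```python
from dataclasses import dataclass

# Hypothetical constants mirroring the literals used in the commit.
MAX_FINDINGS = 5
MAX_TITLE_LEN = 150


@dataclass
class Citation:
    """Simplified stand-in for the project's Citation model (assumption)."""
    title: str


@dataclass
class Evidence:
    """Simplified stand-in for the project's Evidence model (assumption)."""
    citation: Citation


def extract_fallback_findings(evidence: list[Evidence]) -> list[str]:
    """Use the first few source titles as findings when no LLM is available."""
    findings = []
    for item in evidence[:MAX_FINDINGS]:
        title = item.citation.title
        # Truncate long titles so the result is at most MAX_TITLE_LEN characters.
        if len(title) > MAX_TITLE_LEN:
            title = title[: MAX_TITLE_LEN - 3] + "..."
        findings.append(title)
    if not findings:
        # Nothing was retrieved, so say so explicitly instead of returning [].
        findings = ["No findings available (Quota exceeded and no search results)."]
    return findings
```

Keeping the extraction free of any network call is what lets it run after the quota has already been exhausted.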

src/agent_factory/judges.py CHANGED

@@ -218,7 +218,7 @@ class HFInferenceJudgeHandler:
                     or "payment required" in error_str.lower()
                 ):
                     logger.error("HF Quota Exhausted", error=error_str)
-                    return self._create_quota_exhausted_assessment(question)
+                    return self._create_quota_exhausted_assessment(question, evidence)
                 logger.warning("Model failed", model=model, error=str(e))
                 last_error = e
@@ -342,16 +342,31 @@ IMPORTANT: Respond with ONLY valid JSON matching this schema:
         return None
-    def _create_quota_exhausted_assessment(self, question: str) -> JudgeAssessment:
+    def _create_quota_exhausted_assessment(
+        self, question: str, evidence: list[Evidence]
+    ) -> JudgeAssessment:
         """Create an assessment that stops the loop when quota is exhausted."""
+        # Heuristic extraction for fallback
+        findings = []
+        for e in evidence[:5]:
+            title = e.citation.title
+            if len(title) > 150:
+                title = title[:147] + "..."
+            findings.append(title)
+        if not findings:
+            findings = ["No findings available (Quota exceeded and no search results)."]
         return JudgeAssessment(
             details=AssessmentDetails(
                 mechanism_score=0,
-                mechanism_reasoning="Free tier quota exhausted.",
+                mechanism_reasoning="Free tier quota exhausted. Unable to analyze mechanism.",
                 clinical_evidence_score=0,
-                clinical_reasoning="Free tier quota exhausted.",
-                drug_candidates=[],
-                key_findings=[],
+                clinical_reasoning=(
+                    "Free tier quota exhausted. Unable to analyze clinical evidence."
+                ),
+                drug_candidates=["Upgrade to paid API for drug extraction."],
+                key_findings=findings,
             ),
             sufficient=True,  # STOP THE LOOP
             confidence=0.0,
@@ -360,6 +375,8 @@ IMPORTANT: Respond with ONLY valid JSON matching this schema:
             reasoning=(
                 "⚠️ **Free Tier Quota Exceeded** ⚠️\n\n"
                 "The HuggingFace Inference API free tier limit has been reached. "
+                "The search results listed below were retrieved but could not be "
+                "analyzed by the AI. "
                 "Please try again later, or add an OpenAI/Anthropic API key above "
                 "for unlimited access."
             ),
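
The first hunk shows only the tail of the condition that routes an exception into the quota fallback. A rough sketch of that kind of string-based check is below. Only the `"payment required"` marker appears in the diff; the `"402"` and `"quota"` markers are assumptions inferred from the investigation notes, and `is_quota_error` and `QUOTA_MARKERS` are hypothetical names, not part of `judges.py`.

```python
# Substrings that suggest a quota/billing failure.
# Only "payment required" is confirmed by the diff; the others are assumed.
QUOTA_MARKERS = ("402", "quota", "payment required")


def is_quota_error(exc: Exception) -> bool:
    """Return True if the exception message looks like a quota/billing error."""
    message = str(exc).lower()
    return any(marker in message for marker in QUOTA_MARKERS)
```

Substring matching is coarse: a message that happens to contain `402` for unrelated reasons would also match, so checking a structured status code is preferable when the client library exposes one.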

tests/unit/agent_factory/test_judges_hf_quota.py ADDED

	@@ -0,0 +1,49 @@

+"""Unit tests for HFInferenceJudgeHandler Quota Logic."""
+from unittest.mock import patch
+import pytest
+from src.agent_factory.judges import HFInferenceJudgeHandler
+from src.utils.models import Citation, Evidence
+@pytest.mark.unit
+class TestHFInferenceJudgeHandlerQuota:
+    """Tests for HFInferenceJudgeHandler Quota handling."""
+    @pytest.mark.asyncio
+    async def test_assess_quota_exhausted(self):
+        """Test that quota exhaustion triggers fallback extraction."""
+        handler = HFInferenceJudgeHandler()
+        # Create some dummy evidence
+        evidence = [
+            Evidence(
+                content="Content 1",
+                citation=Citation(
+                    source="pubmed", title="Important Drug A Findings", url="u1", date="d1"
+                ),
+            ),
+            Evidence(
+                content="Content 2",
+                citation=Citation(
+                    source="pubmed", title="Clinical Trial of Drug B", url="u2", date="d2"
+                ),
+            ),
+        ]
+        # Mock _call_with_retry to raise a Quota error
+        with patch.object(
+            handler, "_call_with_retry", side_effect=Exception("402 Payment Required")
+        ):
+            result = await handler.assess("question", evidence)
+            # Check that it caught the error and stopped
+            assert result.sufficient is True
+            assert "Free Tier Quota Exceeded" in result.reasoning
+            # CRITICAL: Check that it extracted findings from titles
+            assert "Important Drug A Findings" in result.details.key_findings
+            assert "Clinical Trial of Drug B" in result.details.key_findings
+            assert result.details.drug_candidates == ["Upgrade to paid API for drug extraction."]
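
The test above relies on `patch.object(..., side_effect=...)` to make an async method raise on demand. A self-contained sketch of that pattern follows. `TinyJudge` is a hypothetical stand-in for the real handler, and its fallback logic is deliberately simplified; it uses `asyncio.run` instead of `pytest.mark.asyncio` so it runs without any plugin.

```python
import asyncio
from unittest.mock import patch


class TinyJudge:
    """Hypothetical stand-in for a handler with a retrying async model call."""

    async def _call_with_retry(self, question: str) -> str:
        return "model answer"

    async def assess(self, question: str) -> str:
        try:
            return await self._call_with_retry(question)
        except Exception as exc:
            # Simplified quota check; other errors propagate unchanged.
            if "402" in str(exc):
                return "fallback"
            raise


def run_demo() -> tuple[str, str]:
    judge = TinyJudge()
    # While the patch is active, calling _call_with_retry raises the given exception.
    with patch.object(
        judge, "_call_with_retry", side_effect=Exception("402 Payment Required")
    ):
        patched = asyncio.run(judge.assess("question"))
    # Outside the context manager the original method is restored.
    unpatched = asyncio.run(judge.assess("question"))
    return patched, unpatched
```

Patching the instance rather than the class keeps the override local to one object, so other handlers created in the same test session are unaffected.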