ChAbhishek28 commited on
Commit
5065491
Β·
1 Parent(s): a2ca191

Add 899999999999999999999999999999

Browse files
CONVERSATIONAL_ENHANCEMENT.md ADDED
@@ -0,0 +1,197 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # 🎯 Conversational Voice Bot Enhancement
2
+
3
+ ## Problem Identified
4
+ Your friend's voice bot is **conversational and interactive** - when it doesn't understand context, it asks for more information instead of giving generic responses. Your current bot gives generic answers even when the query is unclear.
5
+
6
+ ## πŸš€ Conversational Features Implemented
7
+
8
+ ### 1. **Smart Clarity Detection** (`conversational_service.py`)
9
+ The bot now analyzes queries and detects when they need clarification:
10
+
11
+ ```python
12
+ # Examples of queries that trigger clarification:
13
+ "pension" β†’ "What specifically about pension?"
14
+ "what" β†’ "Could you be more specific?"
15
+ "help" β†’ "What topic can I help you with?"
16
+ ```
17
+
18
+ ### 2. **Context-Aware Clarification Questions**
19
+ Different types of unclear queries get different clarification approaches:
20
+
21
+ #### **Vague Pension Queries:**
22
+ ```
23
+ User: "pension"
24
+ Bot: "I can help with pension-related questions! To give you the most accurate information, could you specify:
25
+
26
+ πŸ“‹ What specifically about pension?
27
+ β€’ How to calculate pension amount?
28
+ β€’ Pension eligibility criteria?
29
+ β€’ How to apply for pension?
30
+ β€’ Family pension benefits?
31
+ β€’ Commutation of pension?
32
+ β€’ Required documents for pension?"
33
+ ```
34
+
35
+ #### **Generic Procurement Queries:**
36
+ ```
37
+ User: "tender"
38
+ Bot: "I can help with procurement and tendering! To provide the right information, could you clarify:
39
+
40
+ 🏒 What specifically about procurement?
41
+ β€’ How to participate in tenders?
42
+ β€’ Procurement rules and thresholds?
43
+ β€’ GeM portal registration?
44
+ β€’ MSME benefits in procurement?
45
+
46
+ Are you a vendor looking to participate, or an officer managing procurement?"
47
+ ```
48
+
49
+ ### 3. **Follow-Up Context Handling**
50
+ The bot remembers clarification requests and uses enhanced queries:
51
+
52
+ ```python
53
+ # Original unclear query: "pension"
54
+ # User clarification: "calculation"
55
+ # Enhanced query: "pension calculation formula amount computation"
56
+ ```
57
+
58
+ ### 4. **Conversational Response Enhancement**
59
+ Responses now include encouraging follow-up questions:
60
+
61
+ ```
62
+ "Here's the pension calculation formula...
63
+
64
+ Is there anything specific about this topic you'd like me to explain further?"
65
+ ```
66
+
67
+ ## 🎯 Enhanced User Experience
68
+
69
+ ### **Before (Generic Responses):**
70
+ ```
71
+ User: "What about pension?"
72
+ Bot: [Returns random training/salary information] ❌
73
+ ```
74
+
75
+ ### **After (Conversational Interaction):**
76
+ ```
77
+ User: "What about pension?"
78
+ Bot: "I can help with pension-related questions! To give you the most accurate information, could you specify:
79
+
80
+ πŸ“‹ What specifically about pension?
81
+ β€’ How to calculate pension amount?
82
+ β€’ Pension eligibility criteria?
83
+ β€’ How to apply for pension?
84
+ β€’ Family pension benefits?
85
+
86
+ Please let me know which aspect interests you most!" βœ…
87
+
88
+ User: "How to calculate"
89
+ Bot: "Here's the pension calculation formula:
90
+ (Last drawn basic pay + DA) Γ— service years Γ· 70
91
+
92
+ The minimum pension is β‚Ή9,000 per month...
93
+
94
+ Would you like me to walk through a specific calculation example?" βœ…
95
+ ```
96
+
97
+ ## πŸ”§ Technical Implementation
98
+
99
+ ### **1. Query Clarity Analysis**
100
+ ```python
101
+ def analyze_query_clarity(query, search_results):
102
+ # Detects vague patterns
103
+ # Checks pension/procurement ambiguity
104
+ # Analyzes search result relevance
105
+ # Returns clarification questions if needed
106
+ ```
107
+
108
+ ### **2. Enhanced WebSocket Handler**
109
+ ```python
110
+ # Before processing search results, check for clarity
111
+ conversational_analysis = conversational_service.generate_conversational_response(
112
+ user_message, docs, session_id
113
+ )
114
+
115
+ if conversational_analysis['needs_clarification']:
116
+ # Send clarification request instead of generic response
117
+ return clarification_response
118
+ ```
119
+
120
+ ### **3. Frontend Clarification Handling**
121
+ ```javascript
122
+ // Handle clarification requests in frontend
123
+ if (msg.type === "streaming_response" && msg.needs_clarification) {
124
+ // Display clarification questions
125
+ // Keep mic active for user response
126
+ // Mark as clarification in conversation history
127
+ }
128
+ ```
129
+
130
+ ### **4. Session Management**
131
+ ```python
132
+ # Remember conversation context
133
+ conversation_history[session_id] = {
134
+ 'last_query': query,
135
+ 'awaiting_clarification': True,
136
+ 'clarification_type': 'pension_generic'
137
+ }
138
+ ```
139
+
140
+ ## 🎯 Specific Improvements
141
+
142
+ ### **1. Vague Query Detection**
143
+ - βœ… Single word queries: "pension", "tender", "help"
144
+ - βœ… Too generic: "what", "how", "tell me about"
145
+ - βœ… Unclear pronouns: "this", "that", "it"
146
+
147
+ ### **2. Domain-Specific Clarification**
148
+ - βœ… **Pension queries**: Asks about calculation, eligibility, application, etc.
149
+ - βœ… **Procurement queries**: Asks about vendor vs officer role, specific process
150
+ - βœ… **Finance queries**: Asks about budget, approval, audit context
151
+
152
+ ### **3. Conversational Flow**
153
+ - βœ… **Friendly tone**: "I'd be happy to help!"
154
+ - βœ… **Structured options**: Bullet-pointed choices
155
+ - βœ… **Follow-up encouragement**: "Please let me know..."
156
+ - βœ… **Context retention**: Remembers what user is asking about
157
+
158
+ ### **4. Enhanced Search Integration**
159
+ - οΏ½οΏ½οΏ½ **Low relevance detection**: Asks for clarification when search results don't match
160
+ - βœ… **Enhanced follow-up queries**: Combines original + clarification
161
+ - βœ… **Better document targeting**: Uses improved search terms
162
+
163
+ ## πŸŽ‰ Expected Results
164
+
165
+ Your voice bot will now behave like your friend's bot:
166
+
167
+ 1. **πŸ€” Asks for clarification** when queries are unclear
168
+ 2. **πŸ’¬ Provides structured options** for user to choose from
169
+ 3. **πŸ”„ Remembers context** from clarification requests
170
+ 4. **βœ… Gives precise answers** after understanding intent
171
+ 5. **🎯 Encourages conversation** with follow-up questions
172
+
173
+ ## πŸ“‹ Testing the Enhancement
174
+
175
+ Try these scenarios:
176
+
177
+ ### **Scenario 1: Vague Query**
178
+ ```
179
+ Say: "pension"
180
+ Expected: Bot asks for specific pension topic
181
+ ```
182
+
183
+ ### **Scenario 2: Follow-up**
184
+ ```
185
+ Say: "pension"
186
+ Bot: "What specifically about pension?"
187
+ Say: "calculation"
188
+ Expected: Bot gives pension calculation details
189
+ ```
190
+
191
+ ### **Scenario 3: Unclear Context**
192
+ ```
193
+ Say: "what"
194
+ Expected: Bot asks what topic you want help with
195
+ ```
196
+
197
+ Your voice bot is now **conversational and interactive** just like your friend's! πŸŽ‰
app.py CHANGED
@@ -8,6 +8,7 @@ from fastapi.middleware.cors import CORSMiddleware
8
  from fastapi.responses import JSONResponse, FileResponse
9
  from websocket_handler import handle_websocket_connection
10
  from enhanced_websocket_handler import handle_enhanced_websocket_connection
 
11
  from hybrid_llm_service import HybridLLMService
12
  from voice_service import VoiceService
13
  from groq_voice_service import groq_voice_service # Import the new Groq voice service
@@ -257,6 +258,14 @@ async def websocket_stream_endpoint(websocket: WebSocket):
257
  finally:
258
  await groq_websocket_handler.disconnect(session_id)
259
 
 
 
 
 
 
 
 
 
260
  if __name__ == "__main__":
261
  import uvicorn
262
  uvicorn.run(app, host="0.0.0.0", port=7860)
 
8
  from fastapi.responses import JSONResponse, FileResponse
9
  from websocket_handler import handle_websocket_connection
10
  from enhanced_websocket_handler import handle_enhanced_websocket_connection
11
+ from conversational_websocket_handler import handle_conversational_websocket
12
  from hybrid_llm_service import HybridLLMService
13
  from voice_service import VoiceService
14
  from groq_voice_service import groq_voice_service # Import the new Groq voice service
 
258
  finally:
259
  await groq_websocket_handler.disconnect(session_id)
260
 
261
+ @app.websocket("/ws/conversational")
262
+ async def websocket_conversational_endpoint(websocket: WebSocket):
263
+ """
264
+ Enhanced Conversational WebSocket endpoint with session memory and personalization
265
+ Based on friend's conversational implementation for better user experience
266
+ """
267
+ await handle_conversational_websocket(websocket)
268
+
269
  if __name__ == "__main__":
270
  import uvicorn
271
  uvicorn.run(app, host="0.0.0.0", port=7860)
conversational_audio_service.py ADDED
@@ -0,0 +1,194 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Enhanced Audio Services for Conversational Voice Bot
3
+ Based on friend's implementation with Murf TTS and Groq ASR
4
+ """
5
+ from groq import AsyncGroq
6
+ from config import GROQ_API_KEY, MURF_API_KEY
7
+ import asyncio
8
+ import logging
9
+ import re
10
+
11
+ logger = logging.getLogger(__name__)
12
+
13
+ # Initialize Groq client
14
+ groq_client = AsyncGroq(api_key=GROQ_API_KEY)
15
+
16
+ # Initialize Murf client
17
+ try:
18
+ from murf import AsyncMurf
19
+ murf_client = AsyncMurf(api_key=MURF_API_KEY)
20
+ MURF_AVAILABLE = True
21
+ except ImportError:
22
+ logger.warning("Murf package not available. Install with: pip install murf")
23
+ MURF_AVAILABLE = False
24
+
25
+ def clean_markdown_for_tts(text: str) -> str:
26
+ """
27
+ Clean markdown and formatting from text to make it suitable for TTS
28
+ """
29
+ if not text:
30
+ return ""
31
+
32
+ # Remove markdown links [text](url) -> text
33
+ text = re.sub(r'\[([^\]]+)\]\([^\)]+\)', r'\1', text)
34
+
35
+ # Remove markdown emphasis **bold** and *italic* -> text
36
+ text = re.sub(r'\*\*([^\*]+)\*\*', r'\1', text)
37
+ text = re.sub(r'\*([^\*]+)\*', r'\1', text)
38
+
39
+ # Remove markdown headers ### ->
40
+ text = re.sub(r'^#+\s*', '', text, flags=re.MULTILINE)
41
+
42
+ # Remove code blocks ```code``` -> code
43
+ text = re.sub(r'```[^`]*```', '', text)
44
+ text = re.sub(r'`([^`]+)`', r'\1', text)
45
+
46
+ # Remove HTML tags
47
+ text = re.sub(r'<[^>]+>', '', text)
48
+
49
+ # Clean up bullet points and lists
50
+ text = re.sub(r'^\s*[-\*\+]\s*', '', text, flags=re.MULTILINE)
51
+ text = re.sub(r'^\s*\d+\.\s*', '', text, flags=re.MULTILINE)
52
+
53
+ # Remove extra whitespace and line breaks
54
+ text = re.sub(r'\n\s*\n', '. ', text)
55
+ text = re.sub(r'\s+', ' ', text)
56
+
57
+ # Remove special characters that TTS might struggle with
58
+ text = re.sub(r'[#\*_~`]', '', text)
59
+
60
+ return text.strip()
61
+
62
+ async def groq_asr_bytes(audio_bytes: bytes, model: str = "whisper-large-v3", language: str = "en") -> str:
63
+ """
64
+ Transcribe audio using Groq Whisper ASR
65
+ Enhanced version similar to friend's implementation
66
+ """
67
+ try:
68
+ logger.info(f"🎀 Transcribing audio with Groq ASR (model: {model}, language: {language})")
69
+
70
+ # Groq client is async, so we can use it directly
71
+ response = await groq_client.audio.transcriptions.create(
72
+ model=model,
73
+ file=("audio.wav", audio_bytes, "audio/wav"),
74
+ response_format="text",
75
+ language=language,
76
+ temperature=0.0 # For more consistent results
77
+ )
78
+
79
+ transcription = response.strip() if response else ""
80
+ logger.info(f"🎯 Transcription result: {transcription}")
81
+
82
+ return transcription
83
+
84
+ except Exception as e:
85
+ logger.error(f"❌ Groq ASR failed: {e}")
86
+ return ""
87
+
88
+ async def murf_tts(text: str, voice_id: str = "en-IN-isha", format: str = "MP3") -> bytes:
89
+ """
90
+ Convert text to speech using Murf TTS
91
+ Enhanced version similar to friend's implementation
92
+ """
93
+ if not MURF_AVAILABLE:
94
+ logger.error("❌ Murf TTS not available")
95
+ return b""
96
+
97
+ if not text or not text.strip():
98
+ logger.warning("⚠️ Empty text provided to TTS")
99
+ return b""
100
+
101
+ try:
102
+ # Clean text for TTS
103
+ clean_text = clean_markdown_for_tts(text)
104
+ if not clean_text:
105
+ logger.warning("⚠️ Text became empty after cleaning")
106
+ return b""
107
+
108
+ logger.info(f"πŸ”Š Generating speech with Murf TTS (voice: {voice_id})")
109
+ logger.debug(f"TTS text: {clean_text}")
110
+
111
+ # Generate speech using async streaming
112
+ response = murf_client.text_to_speech.stream(
113
+ text=clean_text,
114
+ voice_id=voice_id,
115
+ format=format,
116
+ sample_rate=44100.0
117
+ )
118
+
119
+ # Collect all chunks
120
+ chunks = [chunk async for chunk in response]
121
+ full_audio = b''.join(chunks)
122
+
123
+ logger.info(f"βœ… Generated {len(full_audio)} bytes of audio")
124
+ return full_audio
125
+
126
+ except Exception as e:
127
+ logger.error(f"❌ Murf TTS failed: {e}")
128
+ return b""
129
+
130
+ async def edge_tts_fallback(text: str, voice: str = "en-IN-Neerja") -> bytes:
131
+ """
132
+ Fallback TTS using edge-tts if Murf is not available
133
+ """
134
+ try:
135
+ import edge_tts
136
+
137
+ clean_text = clean_markdown_for_tts(text)
138
+ if not clean_text:
139
+ return b""
140
+
141
+ logger.info(f"πŸ”Š Using Edge TTS fallback (voice: {voice})")
142
+
143
+ communicate = edge_tts.Communicate(clean_text, voice)
144
+ audio_chunks = []
145
+
146
+ async for chunk in communicate.stream():
147
+ if chunk["type"] == "audio":
148
+ audio_chunks.append(chunk["data"])
149
+
150
+ audio_data = b"".join(audio_chunks)
151
+ logger.info(f"βœ… Generated {len(audio_data)} bytes of audio with Edge TTS")
152
+ return audio_data
153
+
154
+ except ImportError:
155
+ logger.error("❌ Edge TTS not available. Install with: pip install edge-tts")
156
+ return b""
157
+ except Exception as e:
158
+ logger.error(f"❌ Edge TTS failed: {e}")
159
+ return b""
160
+
161
+ class ConversationalAudioService:
162
+ """
163
+ Main audio service class for conversational voice bot
164
+ """
165
+
166
+ def __init__(self):
167
+ self.groq_client = groq_client
168
+ self.murf_available = MURF_AVAILABLE
169
+ self.default_voice = "en-IN-isha" # Indian English voice
170
+
171
+ async def transcribe_audio(self, audio_bytes: bytes, language: str = "en") -> str:
172
+ """Transcribe audio to text using Groq ASR"""
173
+ return await groq_asr_bytes(audio_bytes, language=language)
174
+
175
+ async def synthesize_speech(self, text: str, voice_id: str = None) -> bytes:
176
+ """Convert text to speech using best available TTS"""
177
+ voice = voice_id or self.default_voice
178
+
179
+ if self.murf_available:
180
+ # Try Murf TTS first
181
+ audio = await murf_tts(text, voice_id=voice)
182
+ if audio:
183
+ return audio
184
+
185
+ # Fallback to Edge TTS
186
+ return await edge_tts_fallback(text, voice="en-IN-Neerja")
187
+
188
+ def set_default_voice(self, voice_id: str):
189
+ """Set default voice for TTS"""
190
+ self.default_voice = voice_id
191
+ logger.info(f"🎡 Default voice set to: {voice_id}")
192
+
193
+ # Global audio service instance
194
+ conversational_audio_service = ConversationalAudioService()
conversational_service.py ADDED
@@ -0,0 +1,332 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Conversational Voice Bot Service
3
+ Makes the bot more interactive by asking for clarification when context is unclear
4
+ """
5
+
6
+ import logging
7
+ from typing import Dict, Any, List, Optional
8
+ import re
9
+
10
+ logger = logging.getLogger("voicebot")
11
+
12
+ class ConversationalService:
13
+ def __init__(self):
14
+ self.conversation_history = {}
15
+
16
+ def analyze_query_clarity(self, query: str, search_results: List[Dict[str, Any]]) -> Dict[str, Any]:
17
+ """
18
+ Analyze if the query is clear enough or needs clarification
19
+ Returns clarification questions if needed
20
+ """
21
+ query_lower = query.lower().strip()
22
+
23
+ # Remove filler words
24
+ filler_words = ['um', 'uh', 'like', 'you know', 'actually', 'basically']
25
+ clean_query = query_lower
26
+ for filler in filler_words:
27
+ clean_query = clean_query.replace(filler, '')
28
+
29
+ clarity_analysis = {
30
+ 'is_clear': True,
31
+ 'confidence': 1.0,
32
+ 'clarification_needed': False,
33
+ 'clarification_questions': [],
34
+ 'suggested_responses': [],
35
+ 'query_type': 'specific'
36
+ }
37
+
38
+ # Check for vague queries
39
+ vague_patterns = [
40
+ r'\b(what|how|tell me|explain)\s+(about|regarding)?\s*$',
41
+ r'^(help|info|information)$',
42
+ r'^(what|how)$',
43
+ r'\b(anything|something|stuff)\b',
44
+ r'^(yes|no|okay|ok)$',
45
+ r'\b(this|that|it)\b.*\?$'
46
+ ]
47
+
48
+ is_vague = any(re.search(pattern, clean_query) for pattern in vague_patterns)
49
+
50
+ # Check for ambiguous pension queries
51
+ if 'pension' in query_lower:
52
+ pension_ambiguity = self._check_pension_ambiguity(query_lower)
53
+ if pension_ambiguity['needs_clarification']:
54
+ clarity_analysis.update(pension_ambiguity)
55
+ return clarity_analysis
56
+
57
+ # Check for ambiguous procurement queries
58
+ elif any(word in query_lower for word in ['tender', 'procurement', 'bid']):
59
+ procurement_ambiguity = self._check_procurement_ambiguity(query_lower)
60
+ if procurement_ambiguity['needs_clarification']:
61
+ clarity_analysis.update(procurement_ambiguity)
62
+ return clarity_analysis
63
+
64
+ # Check for too generic queries
65
+ elif is_vague or len(clean_query.split()) <= 2:
66
+ clarity_analysis.update({
67
+ 'is_clear': False,
68
+ 'confidence': 0.3,
69
+ 'clarification_needed': True,
70
+ 'query_type': 'vague',
71
+ 'clarification_questions': [
72
+ "I'd be happy to help! Could you be more specific about what you're looking for?",
73
+ "Are you asking about:",
74
+ "β€’ Pension rules and benefits?",
75
+ "β€’ Leave policies and applications?",
76
+ "β€’ Procurement and tender processes?",
77
+ "β€’ Salary and allowances?",
78
+ "β€’ Or something else?"
79
+ ],
80
+ 'suggested_responses': [
81
+ "Please let me know which topic interests you most, and I'll provide detailed information."
82
+ ]
83
+ })
84
+ return clarity_analysis
85
+
86
+ # Check if search results are relevant
87
+ if search_results:
88
+ relevance_score = self._calculate_relevance_score(query_lower, search_results)
89
+ if relevance_score < 0.3: # Low relevance
90
+ clarity_analysis.update({
91
+ 'is_clear': False,
92
+ 'confidence': relevance_score,
93
+ 'clarification_needed': True,
94
+ 'query_type': 'low_relevance',
95
+ 'clarification_questions': [
96
+ f"I found some information, but I want to make sure I understand your question correctly.",
97
+ f"When you asked '{query}', did you mean:",
98
+ "β€’ Something specific about government policies?",
99
+ "β€’ A particular process or procedure?",
100
+ "β€’ Information for a specific situation?",
101
+ "",
102
+ "Could you provide a bit more context about what you're trying to accomplish?"
103
+ ]
104
+ })
105
+
106
+ return clarity_analysis
107
+
108
+ def _check_pension_ambiguity(self, query: str) -> Dict[str, Any]:
109
+ """Check if pension query needs clarification"""
110
+
111
+ # Specific pension queries that are clear
112
+ clear_pension_terms = [
113
+ 'pension calculation', 'pension formula', 'pension eligibility',
114
+ 'pension application', 'family pension', 'commutation',
115
+ 'pension documents', 'pension process'
116
+ ]
117
+
118
+ if any(term in query for term in clear_pension_terms):
119
+ return {'needs_clarification': False}
120
+
121
+ # Generic pension queries that need clarification
122
+ generic_pension_patterns = [
123
+ r'^pension\??$',
124
+ r'^what.*pension\??$',
125
+ r'^tell me.*pension\??$',
126
+ r'^pension.*rules?\??$',
127
+ r'^how.*pension\??$'
128
+ ]
129
+
130
+ if any(re.search(pattern, query) for pattern in generic_pension_patterns):
131
+ return {
132
+ 'needs_clarification': True,
133
+ 'is_clear': False,
134
+ 'confidence': 0.4,
135
+ 'clarification_needed': True,
136
+ 'query_type': 'pension_generic',
137
+ 'clarification_questions': [
138
+ "I can help with pension-related questions! To give you the most accurate information, could you specify:",
139
+ "",
140
+ "πŸ“‹ **What specifically about pension?**",
141
+ "β€’ How to calculate pension amount?",
142
+ "β€’ Pension eligibility criteria?",
143
+ "β€’ How to apply for pension?",
144
+ "β€’ Family pension benefits?",
145
+ "β€’ Commutation of pension?",
146
+ "β€’ Required documents for pension?",
147
+ "",
148
+ "Or do you have a specific pension situation you need help with?"
149
+ ],
150
+ 'suggested_responses': [
151
+ "Please let me know which aspect of pension you're interested in, and I'll provide detailed guidance."
152
+ ]
153
+ }
154
+
155
+ return {'needs_clarification': False}
156
+
157
+ def _check_procurement_ambiguity(self, query: str) -> Dict[str, Any]:
158
+ """Check if procurement query needs clarification"""
159
+
160
+ clear_procurement_terms = [
161
+ 'tender process', 'bid submission', 'gem portal', 'msme benefits',
162
+ 'vendor registration', 'procurement threshold', 'tender documents'
163
+ ]
164
+
165
+ if any(term in query for term in clear_procurement_terms):
166
+ return {'needs_clarification': False}
167
+
168
+ generic_procurement_patterns = [
169
+ r'^tender\??$',
170
+ r'^procurement\??$',
171
+ r'^bid\??$',
172
+ r'^what.*tender\??$',
173
+ r'^how.*procurement\??$'
174
+ ]
175
+
176
+ if any(re.search(pattern, query) for pattern in generic_procurement_patterns):
177
+ return {
178
+ 'needs_clarification': True,
179
+ 'is_clear': False,
180
+ 'confidence': 0.4,
181
+ 'clarification_needed': True,
182
+ 'query_type': 'procurement_generic',
183
+ 'clarification_questions': [
184
+ "I can help with procurement and tendering! To provide the right information, could you clarify:",
185
+ "",
186
+ "🏒 **What specifically about procurement?**",
187
+ "β€’ How to participate in tenders?",
188
+ "β€’ Procurement rules and thresholds?",
189
+ "β€’ GeM portal registration?",
190
+ "β€’ MSME benefits in procurement?",
191
+ "β€’ Vendor empanelment process?",
192
+ "β€’ Bid preparation and submission?",
193
+ "",
194
+ "Are you a vendor looking to participate, or an officer managing procurement?"
195
+ ],
196
+ 'suggested_responses': [
197
+ "Please specify your role and what aspect of procurement you need help with."
198
+ ]
199
+ }
200
+
201
+ return {'needs_clarification': False}
202
+
203
+ def _calculate_relevance_score(self, query: str, search_results: List[Dict[str, Any]]) -> float:
204
+ """Calculate how relevant search results are to the query"""
205
+ if not search_results:
206
+ return 0.0
207
+
208
+ query_words = set(query.lower().split())
209
+ query_words = {word for word in query_words if len(word) > 2} # Remove short words
210
+
211
+ total_relevance = 0.0
212
+ for result in search_results:
213
+ content = result.get('content', '').lower()
214
+ filename = result.get('filename', '').lower()
215
+
216
+ # Count query word matches in content
217
+ content_matches = sum(1 for word in query_words if word in content)
218
+ filename_matches = sum(1 for word in query_words if word in filename)
219
+
220
+ # Calculate relevance score for this result
221
+ max_possible_matches = len(query_words)
222
+ if max_possible_matches > 0:
223
+ relevance = (content_matches + filename_matches * 2) / (max_possible_matches * 2)
224
+ total_relevance += relevance
225
+
226
+ # Average relevance across all results
227
+ return total_relevance / len(search_results) if search_results else 0.0
228
+
229
+ def generate_conversational_response(self,
230
+ query: str,
231
+ search_results: List[Dict[str, Any]],
232
+ session_id: str = None) -> Dict[str, Any]:
233
+ """
234
+ Generate a conversational response that asks for clarification when needed
235
+ """
236
+ clarity = self.analyze_query_clarity(query, search_results)
237
+
238
+ if clarity['clarification_needed']:
239
+ # Generate clarification response
240
+ clarification_text = "\n".join(clarity['clarification_questions'])
241
+
242
+ response = {
243
+ 'needs_clarification': True,
244
+ 'response': clarification_text,
245
+ 'type': 'clarification_request',
246
+ 'suggested_follow_ups': clarity.get('suggested_responses', []),
247
+ 'query_type': clarity.get('query_type', 'unclear'),
248
+ 'confidence': clarity.get('confidence', 0.5)
249
+ }
250
+
251
+ # Store conversation context for follow-up
252
+ if session_id:
253
+ self.conversation_history[session_id] = {
254
+ 'last_query': query,
255
+ 'awaiting_clarification': True,
256
+ 'clarification_type': clarity.get('query_type', 'unclear'),
257
+ 'timestamp': __import__('time').time()
258
+ }
259
+
260
+ return response
261
+
262
+ else:
263
+ # Query is clear, proceed with normal response
264
+ return {
265
+ 'needs_clarification': False,
266
+ 'response': None, # Will be filled by normal RAG processing
267
+ 'type': 'information_request',
268
+ 'confidence': clarity.get('confidence', 1.0)
269
+ }
270
+
271
+ def handle_follow_up(self, query: str, session_id: str) -> Dict[str, Any]:
272
+ """Handle follow-up queries after clarification request"""
273
+ if session_id not in self.conversation_history:
274
+ return {'is_follow_up': False}
275
+
276
+ context = self.conversation_history[session_id]
277
+
278
+ if not context.get('awaiting_clarification', False):
279
+ return {'is_follow_up': False}
280
+
281
+ # Check if this is a clarification response
282
+ query_lower = query.lower().strip()
283
+
284
+ # Clear clarification state
285
+ context['awaiting_clarification'] = False
286
+
287
+ # Enhanced query based on clarification
288
+ clarification_type = context.get('clarification_type', '')
289
+ original_query = context.get('last_query', '')
290
+
291
+ enhanced_query = f"{original_query} {query}"
292
+
293
+ return {
294
+ 'is_follow_up': True,
295
+ 'enhanced_query': enhanced_query,
296
+ 'context_type': clarification_type,
297
+ 'original_query': original_query
298
+ }
299
+
300
+ def generate_friendly_greeting(self) -> str:
301
+ """Generate a friendly greeting that encourages conversation"""
302
+ greetings = [
303
+ "Hello! I'm here to help you with government policies and procedures. What would you like to know about?",
304
+ "Hi there! I can assist you with information about pensions, procurement, leave policies, and more. What's on your mind?",
305
+ "Welcome! I'm your government assistant. Feel free to ask me about any policies, rules, or procedures you need help with.",
306
+ "Hello! I'm ready to help you navigate government policies and processes. What information are you looking for today?"
307
+ ]
308
+
309
+ import random
310
+ return random.choice(greetings)
311
+
312
+ def generate_helpful_response(self, response_text: str, sources: List[Dict[str, Any]]) -> str:
313
+ """Make responses more helpful and conversational"""
314
+
315
+ # Add conversational elements
316
+ if response_text:
317
+ # Add follow-up encouragement
318
+ follow_up_phrases = [
319
+ "\n\nIs there anything specific about this topic you'd like me to explain further?",
320
+ "\n\nDo you have any follow-up questions about this information?",
321
+ "\n\nWould you like me to provide more details on any particular aspect?",
322
+ "\n\nIs there a specific situation you're dealing with that I can help you navigate?"
323
+ ]
324
+
325
+ import random
326
+ follow_up = random.choice(follow_up_phrases)
327
+ response_text += follow_up
328
+
329
+ return response_text
330
+
331
+ # Global instance
332
+ conversational_service = ConversationalService()
conversational_websocket_handler.py ADDED
@@ -0,0 +1,274 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Enhanced Conversational WebSocket Handler
3
+ Based on friend's implementation with session memory, personalization, and better conversational flow
4
+ """
5
+ from fastapi import WebSocket, WebSocketDisconnect
6
+ from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
7
+ import logging
8
+ import json
9
+ import asyncio
10
+ import uuid
11
+ from typing import Dict, Any, List
12
+ from session_service import session_service
13
+ from conversational_audio_service import conversational_audio_service
14
+ from enhanced_search_service import enhanced_pension_search
15
+ from llm_service import create_graph, create_basic_graph
16
+
17
+ logger = logging.getLogger("conversational_voicebot")
18
+
19
+ async def handle_conversational_websocket(websocket: WebSocket):
20
+ """
21
+ Enhanced conversational websocket handler similar to friend's implementation
22
+ """
23
+ await websocket.accept()
24
+ logger.info("πŸ”Œ Conversational WebSocket client connected.")
25
+
26
+ # Initialize conversation variables
27
+ initial_data = await websocket.receive_json()
28
+ messages = []
29
+ conversation_history = []
30
+
31
+ # Check if user authentication is provided
32
+ flag = "user_id" in initial_data
33
+ if flag:
34
+ thread_id = initial_data.get("user_id")
35
+
36
+ # Get user session context and preferences
37
+ conversation_context = await session_service.get_conversation_context(thread_id)
38
+ user_preferences = await session_service.get_user_preferences(thread_id)
39
+
40
+ # Create graph with RAG capabilities
41
+ graph = await create_graph(kb_tool=True, mcp_config=None)
42
+
43
+ # Set conversational audio service voice based on preferences
44
+ voice_id = user_preferences.get('voice_id', 'en-IN-isha')
45
+ conversational_audio_service.set_default_voice(voice_id)
46
+
47
+ language_code = user_preferences.get('language', 'english')
48
+ lang_map = {
49
+ 'english': 'en',
50
+ 'hindi': 'hi',
51
+ 'hinglish': 'en'
52
+ }
53
+ lang_code = lang_map.get(language_code, 'en')
54
+
55
+ config = {
56
+ "configurable": {
57
+ "thread_id": thread_id,
58
+ "knowledge_base": "pension_docs",
59
+ }
60
+ }
61
+
62
+ # Enhanced conversational system prompt with personalization
63
+ base_prompt = """You are a warm, friendly, and knowledgeable PWC Pension Assistant. Your responses should be:
64
+
65
+ - Conversational and natural, as if speaking to a trusted friend
66
+ - Concise and informative - aim for 1-3 sentences unless more detail is specifically requested
67
+ - Clear and easy to understand when spoken aloud
68
+ - Professional yet personable and approachable
69
+ - Avoid overly complex jargon or long lists that are hard to follow in audio format
70
+
71
+ When responding about pension policies:
72
+ - Use a warm, reassuring tone appropriate for financial guidance
73
+ - Speak in a natural rhythm suitable for text-to-speech
74
+ - Break complex information into digestible, conversational chunks
75
+ - Ask clarifying questions to better understand their specific situation
76
+ - Remember this is voice interaction - structure responses to be easily understood when heard
77
+ - Reference specific pension documents when relevant
78
+ - If uncertain, clearly state limitations and suggest authoritative sources
79
+
80
+ Keep responses short and conversational. Don't use abbreviations or complex numerical content in your responses.
81
+ Focus on being helpful, accurate, and easy to understand through voice."""
82
+
83
+ # Add conversation context if available
84
+ if conversation_context:
85
+ system_message = f"{base_prompt}\n\nPersonalization context: {conversation_context}"
86
+ else:
87
+ system_message = base_prompt
88
+
89
+ messages.append(SystemMessage(content=system_message))
90
+
91
+ else:
92
+ # Anonymous user
93
+ graph = create_basic_graph()
94
+ thread_id = str(uuid.uuid4())
95
+ voice_id = "en-IN-isha"
96
+ lang_code = "en"
97
+ config = {"configurable": {"thread_id": thread_id}}
98
+
99
+ # Generate personalized greeting
100
+ if flag:
101
+ greeting_response = await session_service.generate_personalized_greeting(thread_id, messages)
102
+ else:
103
+ greeting_response = "Hello! I'm your PWC Pension Assistant. I'm here to help you navigate pension policies, calculations, and retirement planning. What pension question can I help you with today?"
104
+
105
+ # Add greeting to conversation
106
+ messages.append(AIMessage(content=greeting_response))
107
+ conversation_history.append({
108
+ 'type': 'assistant',
109
+ 'content': greeting_response,
110
+ 'timestamp': asyncio.get_event_loop().time()
111
+ })
112
+
113
+ # Generate and send greeting audio
114
+ try:
115
+ greeting_audio = await conversational_audio_service.synthesize_speech(greeting_response, voice_id)
116
+ await websocket.send_json({"type": "connection_successful"})
117
+ if greeting_audio:
118
+ await websocket.send_bytes(greeting_audio)
119
+ except Exception as e:
120
+ logger.error(f"❌ Error sending greeting: {e}")
121
+ await websocket.send_json({"type": "connection_successful"})
122
+
123
+ try:
124
+ while True:
125
+ try:
126
+ # Handle different message types
127
+ data = await websocket.receive_json()
128
+
129
+ if data.get("type") == "end_call":
130
+ logger.info("πŸ“ž Call ended by client")
131
+ await websocket.close()
132
+ break
133
+
134
+ # Get language preference
135
+ lang = data.get("lang", "english").lower()
136
+
137
+ # Receive audio data
138
+ audio_bytes = await websocket.receive_bytes()
139
+
140
+ # --- Enhanced ASR with Groq ---
141
+ if flag:
142
+ transcription = await conversational_audio_service.transcribe_audio(
143
+ audio_bytes, language=lang_code
144
+ )
145
+ else:
146
+ transcription = await conversational_audio_service.transcribe_audio(audio_bytes)
147
+
148
+ if not transcription or not transcription.strip():
149
+ logger.warning("⚠️ Empty transcription received")
150
+ continue
151
+
152
+ # Send transcription to client
153
+ await websocket.send_json(
154
+ {"type": "transcription", "text": transcription}
155
+ )
156
+
157
+ # Add to conversation history
158
+ messages.append(HumanMessage(content=transcription))
159
+ conversation_history.append({
160
+ 'type': 'user',
161
+ 'content': transcription,
162
+ 'timestamp': asyncio.get_event_loop().time()
163
+ })
164
+
165
+ # --- Enhanced Document Search ---
166
+ search_results = None
167
+ try:
168
+ search_results = await enhanced_pension_search(transcription, limit=3)
169
+ logger.info(f"πŸ” Found {len(search_results) if search_results else 0} relevant documents")
170
+ except Exception as search_error:
171
+ logger.warning(f"⚠️ Document search failed: {search_error}")
172
+
173
+ # --- Enhanced LLM Response with Context ---
174
+ try:
175
+ if search_results and len(search_results) > 0:
176
+ # Enrich the message with search context
177
+ context_message = f"User query: {transcription}\n\nRelevant pension documents found:\n"
178
+ for i, doc in enumerate(search_results[:2], 1):
179
+ content_preview = doc.get('content', '')[:300]
180
+ context_message += f"\n{i}. {doc.get('source', 'Document')}: {content_preview}...\n"
181
+
182
+ context_message += f"\nBased on the above pension documents, provide a helpful and conversational response to: {transcription}"
183
+
184
+ # Replace user message with enriched version for LLM
185
+ messages[-1] = HumanMessage(content=context_message)
186
+
187
+ # Get LLM response
188
+ result = await graph.ainvoke({"messages": messages}, config=config)
189
+ llm_response = result["messages"][-1].content
190
+
191
+ # Send LLM response to client
192
+ await websocket.send_json(
193
+ {"type": "llm_response", "text": llm_response}
194
+ )
195
+
196
+ # Add to conversation
197
+ messages.append(AIMessage(content=llm_response))
198
+ conversation_history.append({
199
+ 'type': 'assistant',
200
+ 'content': llm_response,
201
+ 'timestamp': asyncio.get_event_loop().time()
202
+ })
203
+
204
+ # --- Enhanced TTS with Murf ---
205
+ try:
206
+ if flag:
207
+ audio_stream = await conversational_audio_service.synthesize_speech(
208
+ llm_response, voice_id=voice_id
209
+ )
210
+ else:
211
+ audio_stream = await conversational_audio_service.synthesize_speech(llm_response)
212
+
213
+ await websocket.send_json({"type": "tts_start"})
214
+ if audio_stream:
215
+ await websocket.send_bytes(audio_stream)
216
+ await websocket.send_json({"type": "tts_end"})
217
+
218
+ except Exception as tts_error:
219
+ logger.error(f"❌ TTS failed: {tts_error}")
220
+ await websocket.send_json({"type": "tts_error", "message": "Audio generation failed"})
221
+
222
+ except Exception as llm_error:
223
+ logger.error(f"❌ LLM processing failed: {llm_error}")
224
+ error_response = "I apologize, but I encountered an issue processing your question. Could you please try rephrasing it?"
225
+
226
+ await websocket.send_json(
227
+ {"type": "llm_response", "text": error_response}
228
+ )
229
+
230
+ # Try to generate error audio
231
+ try:
232
+ error_audio = await conversational_audio_service.synthesize_speech(error_response)
233
+ await websocket.send_json({"type": "tts_start"})
234
+ if error_audio:
235
+ await websocket.send_bytes(error_audio)
236
+ await websocket.send_json({"type": "tts_end"})
237
+ except:
238
+ pass
239
+
240
+ except WebSocketDisconnect:
241
+ logger.info("πŸ”Œ WebSocket disconnected.")
242
+ break
243
+ except Exception as e:
244
+ logger.exception(f"❌ Error during conversation: {e}")
245
+ try:
246
+ await websocket.send_json({"error": str(e)})
247
+ except:
248
+ pass
249
+ break
250
+
251
+ finally:
252
+ # Session summary generation (like friend's bot)
253
+ if flag and len(conversation_history) > 2:
254
+ try:
255
+ logger.info("πŸ’Ύ Generating session summary...")
256
+
257
+ # Generate session summary
258
+ summary = await session_service.generate_session_summary(messages, thread_id)
259
+
260
+ # Store the session summary
261
+ await session_service.store_session_summary(
262
+ thread_id,
263
+ summary,
264
+ conversation_history
265
+ )
266
+
267
+ logger.info(f"βœ… Session summary stored for user {thread_id}")
268
+
269
+ except Exception as e:
270
+ logger.exception(f"❌ Error storing session summary: {e}")
271
+ else:
272
+ logger.info("πŸ”„ Session ended - no summary needed")
273
+
274
+ logger.info(f"πŸ”„ Conversational session {thread_id} ended")
enhanced_websocket_handler.py CHANGED
@@ -20,6 +20,7 @@ from hybrid_llm_service import HybridLLMService
20
  from voice_service import voice_service
21
  from rag_service import search_government_docs
22
  from policy_chart_generator import PolicyChartGenerator
 
23
 
24
  # Initialize hybrid LLM service
25
  hybrid_llm_service = HybridLLMService()
@@ -633,7 +634,7 @@ async def handle_text_message(websocket: WebSocket, data: dict, session_data: di
633
  response_chunks = []
634
  provider_used = None
635
  async for chunk in get_hybrid_response(
636
- user_message, session_data["context"], config, knowledge_base
637
  ):
638
  response_chunks.append(chunk)
639
  # Send each chunk as structured data
@@ -829,7 +830,7 @@ async def handle_voice_message(websocket: WebSocket, data: dict, session_data: d
829
  if use_hybrid:
830
  response_chunks = []
831
  async for chunk in get_hybrid_response(
832
- enhanced_message, session_data["context"], config, knowledge_base
833
  ):
834
  response_chunks.append(chunk)
835
  # Send each chunk as structured data
@@ -895,18 +896,53 @@ async def handle_voice_message(websocket: WebSocket, data: dict, session_data: d
895
  "message": f"Error processing voice message: {str(e)}. Please try again or switch to text mode."
896
  })
897
 
898
- async def get_hybrid_response(user_message: str, context: str, config: dict, knowledge_base: str):
899
- """Get response using hybrid LLM with intelligent document search and fallback (streaming)"""
900
  try:
901
  # First, determine if this is a government document query or general query
902
  query_context = analyze_query_context(user_message)
903
  logger.info(f"πŸ” Query analysis: {query_context}")
904
 
905
- logger.info(f"πŸ” Searching documents for: '{user_message}' in knowledge base: {knowledge_base}")
 
 
 
 
 
 
 
 
 
 
906
  from rag_service import search_documents_async
907
- docs = await search_documents_async(user_message, limit=3)
908
  logger.info(f"πŸ“Š Document search returned {len(docs) if docs else 0} results")
909
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
910
  # Check if we have relevant documents
911
  has_relevant_docs = docs and any(doc.get("score", 0) > 0.5 for doc in docs)
912
 
 
20
  from voice_service import voice_service
21
  from rag_service import search_government_docs
22
  from policy_chart_generator import PolicyChartGenerator
23
+ from conversational_service import conversational_service
24
 
25
  # Initialize hybrid LLM service
26
  hybrid_llm_service = HybridLLMService()
 
634
  response_chunks = []
635
  provider_used = None
636
  async for chunk in get_hybrid_response(
637
+ user_message, session_data["context"], config, knowledge_base, session_data.get("session_id")
638
  ):
639
  response_chunks.append(chunk)
640
  # Send each chunk as structured data
 
830
  if use_hybrid:
831
  response_chunks = []
832
  async for chunk in get_hybrid_response(
833
+ enhanced_message, session_data["context"], config, knowledge_base, session_data.get("session_id")
834
  ):
835
  response_chunks.append(chunk)
836
  # Send each chunk as structured data
 
896
  "message": f"Error processing voice message: {str(e)}. Please try again or switch to text mode."
897
  })
898
 
899
+ async def get_hybrid_response(user_message: str, context: str, config: dict, knowledge_base: str, session_id: str = None):
900
+ """Get response using hybrid LLM with conversational clarity checks and intelligent document search"""
901
  try:
902
  # First, determine if this is a government document query or general query
903
  query_context = analyze_query_context(user_message)
904
  logger.info(f"πŸ” Query analysis: {query_context}")
905
 
906
+ # Check for follow-up context from previous clarification requests
907
+ follow_up_context = conversational_service.handle_follow_up(user_message, session_id) if session_id else {'is_follow_up': False}
908
+
909
+ if follow_up_context['is_follow_up']:
910
+ # Use enhanced query from follow-up context
911
+ search_query = follow_up_context['enhanced_query']
912
+ logger.info(f"οΏ½ Using follow-up enhanced query: '{search_query}'")
913
+ else:
914
+ search_query = user_message
915
+
916
+ logger.info(f"οΏ½πŸ” Searching documents for: '{search_query}' in knowledge base: {knowledge_base}")
917
  from rag_service import search_documents_async
918
+ docs = await search_documents_async(search_query, limit=5) # Increased limit for better results
919
  logger.info(f"πŸ“Š Document search returned {len(docs) if docs else 0} results")
920
 
921
+ # Conversational clarity analysis - check if we need clarification
922
+ if not follow_up_context['is_follow_up']: # Don't ask for clarification on follow-ups
923
+ conversational_analysis = conversational_service.generate_conversational_response(
924
+ user_message, docs, session_id
925
+ )
926
+
927
+ if conversational_analysis['needs_clarification']:
928
+ logger.info("❓ Query needs clarification - asking user for more context")
929
+ # Return clarification request instead of search results
930
+ yield {
931
+ "clause_text": conversational_analysis['response'],
932
+ "summary": "Clarification request to better understand your question",
933
+ "role_checklist": ["Please provide more specific information"],
934
+ "source_title": "Conversational Assistant",
935
+ "clause_id": "CLARIFICATION_REQUEST",
936
+ "date": "2024",
937
+ "url": "",
938
+ "score": 1.0,
939
+ "scenario_analysis": None,
940
+ "charts": [],
941
+ "needs_clarification": True,
942
+ "query_type": conversational_analysis.get('query_type', 'unclear')
943
+ }
944
+ return
945
+
946
  # Check if we have relevant documents
947
  has_relevant_docs = docs and any(doc.get("score", 0) > 0.5 for doc in docs)
948
 
session_service.py ADDED
@@ -0,0 +1,212 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Session management service for storing and retrieving conversation context
3
+ to make the voice bot more conversational and personalized.
4
+ """
5
+ import json
6
+ import os
7
+ from datetime import datetime
8
+ from typing import Dict, Optional, List
9
+ import logging
10
+
11
+ logger = logging.getLogger(__name__)
12
+
13
+ class SessionService:
14
+ def __init__(self, storage_dir: str = "./session_data"):
15
+ self.storage_dir = storage_dir
16
+ os.makedirs(storage_dir, exist_ok=True)
17
+
18
+ def get_session_file(self, user_id: str) -> str:
19
+ """Get the file path for a user's session data"""
20
+ return os.path.join(self.storage_dir, f"session_{user_id}.json")
21
+
22
+ async def get_session_summary(self, user_id: str) -> Dict:
23
+ """Retrieve session summary for a user"""
24
+ try:
25
+ session_file = self.get_session_file(user_id)
26
+ if os.path.exists(session_file):
27
+ with open(session_file, 'r', encoding='utf-8') as f:
28
+ data = json.load(f)
29
+ return data.get('session_summary', {})
30
+ return {}
31
+ except Exception as e:
32
+ logger.error(f"Error retrieving session summary for {user_id}: {e}")
33
+ return {}
34
+
35
+ async def get_user_preferences(self, user_id: str) -> Dict:
36
+ """Get user preferences and settings"""
37
+ try:
38
+ session_file = self.get_session_file(user_id)
39
+ if os.path.exists(session_file):
40
+ with open(session_file, 'r', encoding='utf-8') as f:
41
+ data = json.load(f)
42
+ return data.get('preferences', {
43
+ 'language': 'english',
44
+ 'voice_id': 'en-IN-isha',
45
+ 'conversation_style': 'friendly'
46
+ })
47
+ return {
48
+ 'language': 'english',
49
+ 'voice_id': 'en-IN-isha',
50
+ 'conversation_style': 'friendly'
51
+ }
52
+ except Exception as e:
53
+ logger.error(f"Error retrieving user preferences for {user_id}: {e}")
54
+ return {
55
+ 'language': 'english',
56
+ 'voice_id': 'en-IN-isha',
57
+ 'conversation_style': 'friendly'
58
+ }
59
+
60
+ async def store_session_summary(self, user_id: str, summary: str, conversation_history: List[Dict] = None) -> bool:
61
+ """Store session summary and conversation context"""
62
+ try:
63
+ session_file = self.get_session_file(user_id)
64
+
65
+ # Load existing data
66
+ existing_data = {}
67
+ if os.path.exists(session_file):
68
+ with open(session_file, 'r', encoding='utf-8') as f:
69
+ existing_data = json.load(f)
70
+
71
+ # Update session data
72
+ session_data = {
73
+ 'user_id': user_id,
74
+ 'last_updated': datetime.now().isoformat(),
75
+ 'session_summary': summary,
76
+ 'preferences': existing_data.get('preferences', {
77
+ 'language': 'english',
78
+ 'voice_id': 'en-IN-isha',
79
+ 'conversation_style': 'friendly'
80
+ }),
81
+ 'conversation_count': existing_data.get('conversation_count', 0) + 1,
82
+ 'recent_topics': existing_data.get('recent_topics', [])
83
+ }
84
+
85
+ # Add conversation history if provided
86
+ if conversation_history:
87
+ session_data['last_conversation'] = conversation_history[-10:] # Keep last 10 messages
88
+
89
+ # Extract recent topics
90
+ topics = []
91
+ for msg in conversation_history:
92
+ if msg.get('type') == 'user':
93
+ content = msg.get('content', '').lower()
94
+ if 'pension' in content:
95
+ topics.append('pension policies')
96
+ if 'medical' in content or 'allowance' in content:
97
+ topics.append('medical allowances')
98
+ if 'retirement' in content:
99
+ topics.append('retirement planning')
100
+ if 'dearness' in content or 'dr' in content:
101
+ topics.append('dearness relief')
102
+
103
+ # Keep unique recent topics
104
+ session_data['recent_topics'] = list(set(topics + session_data['recent_topics']))[:5]
105
+
106
+ # Save to file
107
+ with open(session_file, 'w', encoding='utf-8') as f:
108
+ json.dump(session_data, f, indent=2, ensure_ascii=False)
109
+
110
+ logger.info(f"βœ… Session summary stored for user {user_id}")
111
+ return True
112
+
113
+ except Exception as e:
114
+ logger.error(f"❌ Error storing session summary for {user_id}: {e}")
115
+ return False
116
+
117
+ async def get_conversation_context(self, user_id: str) -> str:
118
+ """Get conversation context string for system prompt"""
119
+ try:
120
+ session_data = await self.get_session_summary(user_id)
121
+ if not session_data:
122
+ return ""
123
+
124
+ context_parts = []
125
+
126
+ # Add conversation count
127
+ conv_count = session_data.get('conversation_count', 0)
128
+ if conv_count > 1:
129
+ context_parts.append(f"This is conversation #{conv_count} with this user.")
130
+
131
+ # Add recent topics
132
+ recent_topics = session_data.get('recent_topics', [])
133
+ if recent_topics:
134
+ topics_str = ", ".join(recent_topics)
135
+ context_parts.append(f"User has previously discussed: {topics_str}.")
136
+
137
+ # Add session summary if available
138
+ if isinstance(session_data, str):
139
+ context_parts.append(f"Previous conversation context: {session_data}")
140
+ elif isinstance(session_data, dict) and 'summary' in session_data:
141
+ context_parts.append(f"Previous conversation context: {session_data['summary']}")
142
+
143
+ return " ".join(context_parts)
144
+
145
+ except Exception as e:
146
+ logger.error(f"Error getting conversation context for {user_id}: {e}")
147
+ return ""
148
+
149
+ async def generate_personalized_greeting(self, user_id: str, messages_context: List = None) -> str:
150
+ """Generate a personalized greeting based on user history"""
151
+ try:
152
+ session_data = await self.get_session_summary(user_id)
153
+ preferences = await self.get_user_preferences(user_id)
154
+
155
+ conv_count = session_data.get('conversation_count', 0) if isinstance(session_data, dict) else 0
156
+ recent_topics = session_data.get('recent_topics', []) if isinstance(session_data, dict) else []
157
+
158
+ if conv_count == 0:
159
+ # First time user
160
+ return "Hello! I'm your PWC Pension Assistant. I'm here to help you navigate pension policies, calculations, and retirement planning from our comprehensive knowledge base. What pension question can I help you with today?"
161
+ elif conv_count == 1:
162
+ # Second conversation
163
+ return "Welcome back! I'm glad to see you again. How can I assist you with your pension and policy questions today?"
164
+ else:
165
+ # Returning user with history
166
+ if recent_topics:
167
+ topics_mention = f"I remember we discussed {recent_topics[0]} before. "
168
+ else:
169
+ topics_mention = ""
170
+
171
+ return f"Hello again! {topics_mention}I'm here to help with any pension or policy questions you have. What's on your mind today?"
172
+
173
+ except Exception as e:
174
+ logger.error(f"Error generating personalized greeting for {user_id}: {e}")
175
+ return "Hello! I'm your PWC Pension Assistant. How can I help you with your pension questions today?"
176
+
177
+ async def generate_session_summary(self, messages: List, user_id: str) -> str:
178
+ """Generate a session summary from conversation messages"""
179
+ try:
180
+ # Extract key topics and user preferences from conversation
181
+ user_messages = [msg for msg in messages if hasattr(msg, 'content') and 'user' in str(type(msg)).lower()]
182
+
183
+ if len(user_messages) < 2:
184
+ return "Brief conversation about pension policies."
185
+
186
+ # Simple topic extraction
187
+ topics = []
188
+ for msg in user_messages:
189
+ content = msg.content.lower()
190
+ if 'pension' in content:
191
+ topics.append('pension queries')
192
+ if 'medical' in content or 'allowance' in content:
193
+ topics.append('medical allowances')
194
+ if 'retirement' in content:
195
+ topics.append('retirement planning')
196
+ if 'dearness' in content or 'dr' in content:
197
+ topics.append('dearness relief')
198
+ if 'calculation' in content or 'calculate' in content:
199
+ topics.append('pension calculations')
200
+
201
+ unique_topics = list(set(topics))
202
+ if unique_topics:
203
+ return f"User discussed: {', '.join(unique_topics)}. Showed interest in pension policies and government benefits."
204
+ else:
205
+ return "General conversation about pension and policy matters."
206
+
207
+ except Exception as e:
208
+ logger.error(f"Error generating session summary for {user_id}: {e}")
209
+ return "Conversation session completed."
210
+
211
+ # Global session service instance
212
+ session_service = SessionService()
test_conversational.py ADDED
@@ -0,0 +1,150 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Test Conversational Voice Bot Features
4
+ Demonstrates how the bot now asks for clarification like your friend's bot
5
+ """
6
+
7
+ import asyncio
8
+ import logging
9
+
10
+ # Setup logging
11
+ logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
12
+ logger = logging.getLogger(__name__)
13
+
14
+ async def test_conversational_features():
15
+ """Test the conversational clarity detection"""
16
+
17
+ print("🎯 Testing Conversational Voice Bot Features")
18
+ print("=" * 60)
19
+ print("This demonstrates how your bot now asks for clarification")
20
+ print("just like your friend's superior voice bot!")
21
+ print()
22
+
23
+ try:
24
+ from conversational_service import conversational_service
25
+
26
+ # Test queries that should trigger clarification
27
+ test_cases = [
28
+ {
29
+ "query": "pension",
30
+ "description": "Vague pension query",
31
+ "expected": "Should ask for specific pension topic"
32
+ },
33
+ {
34
+ "query": "what",
35
+ "description": "Too generic query",
36
+ "expected": "Should ask what topic user wants help with"
37
+ },
38
+ {
39
+ "query": "help",
40
+ "description": "General help request",
41
+ "expected": "Should provide topic options"
42
+ },
43
+ {
44
+ "query": "tender",
45
+ "description": "Vague procurement query",
46
+ "expected": "Should ask about specific procurement aspect"
47
+ },
48
+ {
49
+ "query": "What are the pension rules?",
50
+ "description": "Clear specific query",
51
+ "expected": "Should proceed with search (no clarification needed)"
52
+ }
53
+ ]
54
+
55
+ for i, test_case in enumerate(test_cases, 1):
56
+ print(f"πŸ” Test {i}: {test_case['description']}")
57
+ print(f"Query: \"{test_case['query']}\"")
58
+ print(f"Expected: {test_case['expected']}")
59
+ print("-" * 50)
60
+
61
+ # Analyze query clarity
62
+ clarity_analysis = conversational_service.analyze_query_clarity(
63
+ test_case['query'],
64
+ [] # Empty search results for testing
65
+ )
66
+
67
+ if clarity_analysis['clarification_needed']:
68
+ print("βœ… RESULT: Bot will ask for clarification")
69
+ print("πŸ€– Bot response:")
70
+ print()
71
+ for question in clarity_analysis['clarification_questions']:
72
+ print(f" {question}")
73
+ print()
74
+
75
+ if 'suggested_responses' in clarity_analysis:
76
+ for suggestion in clarity_analysis['suggested_responses']:
77
+ print(f" {suggestion}")
78
+
79
+ else:
80
+ print("βœ… RESULT: Query is clear, will proceed with search")
81
+ print(f"🎯 Confidence: {clarity_analysis['confidence']:.2f}")
82
+
83
+ print()
84
+ print("=" * 60)
85
+ print()
86
+
87
+ except ImportError as e:
88
+ print(f"❌ Import error: {e}")
89
+ print("πŸ’‘ Make sure you're running from the PensionBot directory")
90
+ except Exception as e:
91
+ print(f"❌ Test error: {e}")
92
+
93
+ def demonstrate_conversation_flow():
94
+ """Demonstrate a full conversation flow"""
95
+
96
+ print("πŸ’¬ Conversation Flow Demonstration")
97
+ print("=" * 40)
98
+ print("Here's how a conversation now works:")
99
+ print()
100
+
101
+ conversation = [
102
+ {"speaker": "πŸ‘€ User", "message": "pension"},
103
+ {"speaker": "πŸ€– Bot", "message": """I can help with pension-related questions! To give you the most accurate information, could you specify:
104
+
105
+ πŸ“‹ What specifically about pension?
106
+ β€’ How to calculate pension amount?
107
+ β€’ Pension eligibility criteria?
108
+ β€’ How to apply for pension?
109
+ β€’ Family pension benefits?
110
+ β€’ Commutation of pension?
111
+
112
+ Please let me know which aspect interests you most!"""},
113
+ {"speaker": "πŸ‘€ User", "message": "calculation"},
114
+ {"speaker": "πŸ€– Bot", "message": """Here's the pension calculation formula:
115
+
116
+ **Formula**: (Last drawn basic pay + DA) Γ— service years Γ· 70
117
+ **Minimum pension**: β‚Ή9,000 per month
118
+ **Maximum pension**: No upper limit
119
+
120
+ **Example calculation**:
121
+ - Last basic pay: β‚Ή50,000
122
+ - DA: β‚Ή25,000 (50% of basic)
123
+ - Service years: 30
124
+ - Pension = (50,000 + 25,000) Γ— 30 Γ· 70 = β‚Ή32,143 per month
125
+
126
+ Would you like me to help with a specific pension calculation for your situation?"""}
127
+ ]
128
+
129
+ for exchange in conversation:
130
+ print(f"{exchange['speaker']}: {exchange['message']}")
131
+ print()
132
+
133
+ print("🎯 Key Improvements:")
134
+ print("βœ… Bot asks for clarification instead of guessing")
135
+ print("βœ… Provides structured options for user to choose")
136
+ print("βœ… Follows up with specific, detailed information")
137
+ print("βœ… Encourages further conversation")
138
+ print()
139
+ print("This is exactly how your friend's bot works! πŸŽ‰")
140
+
141
+ if __name__ == "__main__":
142
+ print("🎯 Conversational Voice Bot Test")
143
+ print("Testing enhanced conversational features")
144
+ print()
145
+
146
+ # Run the tests
147
+ asyncio.run(test_conversational_features())
148
+
149
+ print("\n" + "="*60)
150
+ demonstrate_conversation_flow()