yashgori20 committed
Commit fcd8049 · 1 Parent(s): 8850406

Update Evolusis AI Agent with enhanced features


- Improved audio transcription with better file format handling
- Added Loom video demonstration link
- Enhanced error messages for better user experience
- Merged improvements from GitHub repository

Files changed (2)
  1. README.md +2 -3
  2. app.py +21 -4
README.md CHANGED
@@ -9,10 +9,9 @@ pinned: false
 
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
-
-
 # Evolusis AI Agent
-
+## Live Deployement - https://yashgori20-evolusis.hf.space
+## Loom Video - https://www.loom.com/share/57867c0d51be40f48e47646023ceaeb0
 Backend Developer Assignment - AI-powered chat assistant with LLM reasoning and external API integration.
 
 ## 🎯 Overview
app.py CHANGED
@@ -87,9 +87,25 @@ class ToolRegistry:
         try:
             if not groq_client:
                 return None
-
+
+            # Ensure file pointer is at the beginning
+            if hasattr(audio_file, 'seek'):
+                audio_file.seek(0)
+
+            # Get the original filename or create a default one with proper extension
+            # Streamlit's audio_input typically records in WAV format
+            filename = getattr(audio_file, 'name', 'audio.wav')
+
+            # Ensure filename has an extension
+            if not any(filename.lower().endswith(ext) for ext in ['.wav', '.mp3', '.webm', '.m4a', '.ogg']):
+                filename = 'audio.wav'
+
+            # Create a tuple with (filename, file_object) for Groq API
+            # This ensures Groq can properly detect the audio format
+            file_tuple = (filename, audio_file)
+
             transcription = groq_client.audio.transcriptions.create(
-                file=audio_file,
+                file=file_tuple,
                 model="whisper-large-v3-turbo",
                 response_format="text"
             )
@@ -775,7 +791,7 @@ if True:
                 process_query(transcription)
                 st.rerun()
             else:
-                st.error("Failed to transcribe audio. Please check your GROQ_API_KEY.")
+                st.error("Failed to transcribe audio. Please check your GROQ_API_KEY and audio format.")
 
     # Text input
     user_input = st.text_input("⌨️ Or type your question...", key="chat_input_text")
@@ -859,10 +875,11 @@ if True:
             with st.spinner("🎧 Transcribing your voice..."):
                 transcription = agent.tools.transcribe_audio(audio_input)
                 if transcription:
+                    st.success(f"You said: {transcription}")
                     process_query(transcription)
                     st.rerun()
                 else:
-                    st.error("Failed to transcribe audio. Please check your GROQ_API_KEY.")
+                    st.error("Failed to transcribe audio. Please check your GROQ_API_KEY and audio format.")
 
     # Text input for follow-up
     user_input = st.text_input("⌨️ Continue the conversation...", key="followup_text")
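The core of the first app.py hunk is wrapping the recorded audio in a (filename, file object) tuple so Groq's Whisper endpoint can detect the container format. The sketch below exercises that same pattern outside Streamlit; it is a minimal reconstruction, not the repository's code: the helper name `transcribe_uploaded_audio`, the `sample.wav` path, and constructing a standalone `Groq()` client are illustrative assumptions, while the filename normalization, the tuple-based `file=` argument, and the `whisper-large-v3-turbo` call mirror the committed diff.

```python
# Minimal sketch of the transcription fix above, runnable outside Streamlit.
# Assumed for illustration: the helper name, the sample.wav path, and creating
# Groq() here; the tuple file argument and model name come from the diff.
from groq import Groq

groq_client = Groq()  # reads GROQ_API_KEY from the environment

AUDIO_EXTENSIONS = ('.wav', '.mp3', '.webm', '.m4a', '.ogg')


def transcribe_uploaded_audio(audio_file):
    """Transcribe a file-like object, normalizing its name so Groq can infer the format."""
    try:
        # Rewind in case a previous read left the pointer at the end.
        if hasattr(audio_file, 'seek'):
            audio_file.seek(0)

        # Streamlit's audio_input records WAV, so fall back to a .wav name.
        filename = getattr(audio_file, 'name', 'audio.wav')
        if not filename.lower().endswith(AUDIO_EXTENSIONS):
            filename = 'audio.wav'

        # Passing (filename, file_object) lets the API detect the audio format.
        return groq_client.audio.transcriptions.create(
            file=(filename, audio_file),
            model="whisper-large-v3-turbo",
            response_format="text",
        )
    except Exception:
        return None


if __name__ == "__main__":
    # Hypothetical local recording; replace with any supported audio file.
    with open("sample.wav", "rb") as f:
        print(transcribe_uploaded_audio(f) or "Transcription failed - check GROQ_API_KEY.")
```

Passing the tuple rather than the bare file object is the essential change; without a recognizable filename the upload can be rejected as an unsupported format, which appears to be what the broadened "and audio format" error message in the later hunks is addressing.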