
Crafting a Custom Voice Assistant with Perplexity

By Admin · August 31, 2025 · Machine Learning

Google Assistant, Alexa, and Siri are the dominant voice assistants available for everyday use. These assistants have become ubiquitous in almost every home, carrying out tasks from home automation, note taking, and recipe guidance to answering simple questions. When it comes to answering questions, though, in the age of LLMs, getting a concise, context-based answer from these voice assistants can be difficult, if not impossible. For example, if you ask Google Assistant how the market is reacting to Jerome Powell's speech in Jackson Hole on Aug 22, it will simply reply that it doesn't know the answer and offer a few links you can peruse. That's if you have the screen-based Google Assistant.

Sometimes you just want a quick answer on current events, or you want to know whether an apple tree would survive the winter in Ohio, and often voice assistants like Google Assistant and Siri fall short of providing a satisfying answer. This got me interested in building my own voice assistant, one that could give me a simple, single-sentence answer based on its search of the web.

Photo by Aerps.com on Unsplash

Of the various LLM-powered search engines available, I have been an avid user of Perplexity for more than a year now, and I use it for nearly all my searches except simple ones, where I still return to Google or Bing. Perplexity, with its live web index that enables it to provide up-to-date, accurate, sourced answers, gives users access to its functionality through a powerful API. Using this API together with a simple Raspberry Pi, I set out to create a voice assistant that could:

  • Respond to a wake word and be ready to answer my question
  • Answer my question in a simple, concise sentence
  • Return to passive listening without selling my data or serving me pointless ads

The Hardware for the Assistant

Photo by Axel Richter on Unsplash

To build our voice assistant, a few key hardware components are required. The core of the project is a Raspberry Pi 5, which serves as the central processor for our application. For the assistant's audio input, I chose a simple USB gooseneck microphone. This type of microphone is omnidirectional, making it effective at hearing the wake word from different parts of a room, and its plug-and-play nature simplifies the setup. For the assistant's output, a compact USB-powered speaker provides the audio. A key advantage of this speaker is that it uses a single USB cable for both power and the audio signal, which minimizes cable clutter.

Block diagram showing the functionality of the custom voice assistant (image by author)

This approach of using readily available USB peripherals makes the hardware assembly straightforward, allowing us to focus our efforts on the software.

Getting the environment ready

To query Perplexity with custom queries and to have a wake word for the voice assistant, we need to generate a couple of API keys. To generate a Perplexity API key, sign up for a Perplexity account, go to the Settings menu, select the API tab, and click "Generate API Key" to create and copy your personal key for use in applications. Access to API key generation usually requires a paid plan or a payment method, so make sure the account is eligible before proceeding.

Platforms that offer wake-word customization include Picovoice Porcupine, Sensory TrulyHandsfree, and Snowboy, with Picovoice Porcupine providing a straightforward online console for generating, testing, and deploying custom wake words across desktop, mobile, and embedded devices. A new user can create a custom wake word for Porcupine by signing up for a free Picovoice Console account, navigating to the Porcupine page, selecting the desired language, typing in the custom wake word, and clicking "Train" to produce and download the platform-specific model file (.ppn). Make sure to test the wake word before finalizing it, as this ensures reliable detection and minimal false positives. The wake word I trained and will use is "Hey Krishna".

Coding the Assistant

The complete Python script for this project is available on my GitHub repository. In this section, let's look at the key components of the code to understand how the assistant functions.
The script is organized into a few core functions that handle the assistant's senses and intelligence, all managed by a central loop.

Configuration and Initialization

The first part of the script is dedicated to setup. It handles loading the required API keys and model files and initializing the clients for the services we'll use.

# --- 1. Configuration ---
import os
from dotenv import load_dotenv

load_dotenv()
PICOVOICE_ACCESS_KEY = os.environ.get("PICOVOICE_ACCESS_KEY")
PERPLEXITY_API_KEY = os.environ.get("PERPLEXITY_API_KEY")
KEYWORD_PATHS = ["Krishna_raspberry-pi.ppn"]  # My wake word path
MODEL_NAME = "sonar"

This section uses the dotenv library to securely load your secret API keys from a .env file, a best practice that keeps them out of your source code. It also defines key variables such as the path to your custom wake word file and the specific Perplexity model we want to query.
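For readers curious about what load_dotenv does under the hood, here is a minimal stdlib-only sketch. This is an illustration, not the actual python-dotenv implementation: it reads KEY=VALUE lines from a file into os.environ and skips blanks and comments.

```python
import os

def load_env_file(path=".env"):
    # Minimal illustration of dotenv-style loading: parse KEY=VALUE
    # lines, ignore blanks and comments, and only set variables that
    # are not already present in the environment.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

The real library also handles quoting, export prefixes, and variable interpolation; for anything beyond a toy script, use python-dotenv itself.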

Wake Word Detection

For the assistant to be truly hands-free, it needs to listen continuously for a specific wake word without using significant system resources. This is handled by the while True: loop in the main function, which uses the Picovoice Porcupine engine.

# This is the main loop that runs continuously
while True:
    # Read a small chunk of raw audio data from the microphone
    pcm = audio_stream.read(porcupine.frame_length)
    pcm = struct.unpack_from("h" * porcupine.frame_length, pcm)

    # Feed the audio chunk into the Porcupine engine for analysis
    keyword_index = porcupine.process(pcm)

    if keyword_index >= 0:
        # Wake word was detected, proceed to handle the command...
        print("Wake word detected!")

This loop is the heart of the assistant's "passive listening" state. It continuously reads small, raw audio frames from the microphone stream. Each frame is then passed to the porcupine.process() function. This is a highly efficient, offline process that analyzes the audio for the specific acoustic pattern of your custom wake word ("Hey Krishna"). If the pattern is detected, porcupine.process() returns a non-negative number, and the script proceeds to the active phase of listening for a full command.
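The struct.unpack_from call in the loop converts a frame of raw little-endian 16-bit PCM bytes into a tuple of Python ints that Porcupine can consume. A standalone sketch with a synthetic silent frame (the frame length of 512 is Porcupine's usual value, stated here as an assumption):

```python
import struct

FRAME_LENGTH = 512                # Porcupine's typical frame length (assumption)
raw = b"\x00\x00" * FRAME_LENGTH  # one frame of 16-bit silence (1024 bytes)

# "h" is the struct code for a signed 16-bit integer, so this format
# string decodes FRAME_LENGTH consecutive samples in one call.
pcm = struct.unpack_from("h" * FRAME_LENGTH, raw)

print(len(pcm), pcm[0])  # 512 0
```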

Speech-to-Text: Converting user questions to text

After the wake word is detected, the assistant needs to listen for and understand the user's question. This is handled by the Speech-to-Text (STT) component.

# --- This logic is inside the main 'if keyword_index >= 0:' block ---

print("Listening for command...")
frames = []
# Record audio from the stream for a fixed duration (~10 seconds)
for _ in range(0, int(porcupine.sample_rate / porcupine.frame_length * 10)):
    frames.append(audio_stream.read(porcupine.frame_length))

# Convert the raw audio frames into an object the library can use
audio_data = sr.AudioData(b"".join(frames), porcupine.sample_rate, 2)

try:
    # Send the audio data to Google's service for transcription
    command = recognizer.recognize_google(audio_data)
    print(f"You (command): {command}")
except sr.UnknownValueError:
    speak_text("Sorry, I didn't catch that.")

Once the wake word is detected, the code actively records audio from the microphone for roughly 10 seconds, capturing the user's spoken command. It then packages this raw audio data and sends it to Google's speech recognition service using the speech_recognition library. The service processes the audio and returns the transcribed text, which is stored in the command variable.
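The loop bound in the recording snippet is simple arithmetic: the number of frames needed to cover ten seconds of audio at the engine's sample rate. With Porcupine's documented defaults of a 16,000 Hz sample rate and 512-sample frames (assumed here), it works out as follows:

```python
SAMPLE_RATE = 16000   # Porcupine's sample rate in Hz (assumption)
FRAME_LENGTH = 512    # samples per frame (assumption)
RECORD_SECONDS = 10

# Frames needed to cover ~10 seconds of audio; int() truncates the
# fractional frame, so the captured window is slightly under 10 s.
num_frames = int(SAMPLE_RATE / FRAME_LENGTH * RECORD_SECONDS)
print(num_frames)  # 312
```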

Getting Answers from Perplexity

Once the user's command has been converted to text, it is sent to the Perplexity API to get an intelligent, up-to-date answer.

# --- This logic runs if a command was successfully transcribed ---

if command:
    # Define the instructions and context for the AI
    messages = [{"role": "system", "content": "You are an AI assistant. You are located in Twinsburg, Ohio. All answers must be relevant to Cleveland, Ohio unless asked for differently by the user.  You MUST answer all questions in a single and VERY concise sentence."}]
    messages.append({"role": "user", "content": command})

    # Send the request to the Perplexity API
    response = perplexity_client.chat.completions.create(
        model=MODEL_NAME,
        messages=messages
    )
    assistant_response_text = response.choices[0].message.content.strip()
    speak_text(assistant_response_text)

This code block is the "brain" of the operation. It first constructs a messages list, which includes a critical system prompt. This prompt gives the AI its persona and rules, such as answering in a single sentence and being aware of its location in Ohio. The user's command is then appended to this list, and the whole package is sent to the Perplexity API. The script then extracts the text from the AI's response and passes it to the speak_text function to be read aloud.
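The perplexity_client object is not shown being created in these excerpts. Perplexity's chat API is OpenAI-compatible, so a plausible setup (a sketch; check Perplexity's current API documentation for the exact base URL and model names) points the openai client at Perplexity's endpoint. Assembling the messages list itself is plain Python and can be factored into a small helper:

```python
# Sketch of client setup (requires the openai package and a valid key):
# from openai import OpenAI
# perplexity_client = OpenAI(api_key=PERPLEXITY_API_KEY,
#                            base_url="https://api.perplexity.ai")

def build_messages(command, system_prompt):
    # Assemble the chat payload: one system message carrying the rules,
    # followed by the user's transcribed command.
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": command},
    ]

msgs = build_messages("What's the weather in Cleveland?",
                      "Answer in a single concise sentence.")
print(msgs[1]["role"])  # user
```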

Text-to-Speech: Converting the Perplexity response to voice

The speak_text function is what gives the assistant its voice.

def speak_text(text_to_speak, lang='en'):
    # Define a function that converts text to speech; default language is English

    print(f"Assistant (speaking): {text_to_speak}")
    # Print the text for reference so the user can see what is being spoken

    try:
        pygame.mixer.init()
        # Initialize the Pygame mixer module for audio playback

        tts = gTTS(text=text_to_speak, lang=lang, slow=False)
        # Create a Google Text-to-Speech (gTTS) object with the supplied text and language
        # 'slow=False' makes the speech sound more natural (not slow-paced)

        mp3_filename = "response_audio.mp3"
        # Set the filename where the generated speech will be saved

        tts.save(mp3_filename)
        # Save the generated speech as an MP3 file

        pygame.mixer.music.load(mp3_filename)
        # Load the MP3 file into Pygame's music player for playback

        pygame.mixer.music.play()
        # Start playing the speech audio

        while pygame.mixer.music.get_busy():
            pygame.time.Clock().tick(10)
        # Keep the program waiting while playback is ongoing
        # This prevents the script from moving on before the speech finishes
        # The Clock().tick(10) ensures it checks 10 times per second

        pygame.mixer.quit()
        # Quit the Pygame mixer once playback is complete to free resources

        os.remove(mp3_filename)
        # Delete the temporary MP3 file after playback to clean up

    except Exception as e:
        print(f"Error in Text-to-Speech: {e}")
        # Catch and display any errors that occur during speech generation or playback

This function takes a text string, prints it for reference, then uses the gTTS (Google Text-to-Speech) library to generate a temporary MP3 audio file. It plays the file through the system's speakers using the pygame library, waits until playback is finished, and then deletes the file. Error handling is included to catch issues during the process.
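One small design caveat: the function writes to a fixed filename, so overlapping calls would clobber each other's audio file. A hedged alternative (illustrative, not part of the original script) is to let the standard tempfile module generate a unique path and to clean it up in a finally block:

```python
import os
import tempfile

def make_temp_mp3_path():
    # Create a uniquely named temporary file for the synthesized audio
    # and return its path; the caller removes it after playback.
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)  # gTTS's save() reopens the file by path
    return path

path = make_temp_mp3_path()
try:
    pass  # tts.save(path); play it with pygame as in speak_text()
finally:
    os.remove(path)  # runs even if synthesis or playback fails
```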

Testing the assistant

Below is a demonstration of the custom voice assistant in action. To compare its performance with Google Assistant, I asked the same question of Google as well as of the custom assistant.

As you can see, Google provides links to the answer rather than the brief summary the user wants. The custom assistant goes further, providing a summary, and is more helpful and informative.

Conclusion

In this article, we looked at the process of building a fully functional, hands-free voice assistant on a Raspberry Pi. By combining a custom wake word with the Perplexity API using Python, we created a simple voice assistant that helps in getting information quickly.

The key advantage of this LLM-based approach is its ability to deliver direct, synthesized answers to complex and current questions, a task where assistants like Google Assistant often fall short by simply providing a list of search links. Instead of acting as a mere voice interface for a search engine, our assistant functions as a true answer engine, parsing real-time web results to give a single, concise response. The future of voice assistants lies in this deeper, more intelligent integration, and building your own is the best way to explore it.

Tags: Assistant, Crafting, Custom, Perplexity, Voice
