How to configure Voice Agents

This document describes the Voice Agent feature and provides setup instructions, use cases, and an overview of how you can benefit from using Voice Agents with x-bees and Collaboration 7.

Developer documentation: https://docs.wildix.com/.

Created: April 2025

Updated: October 2025

Permalink: https://wildix.atlassian.net/wiki/x/AQBWS

Introduction

Voice Agent is a powerful tool that automates responses and routes customers' and your team’s queries via voice AI assistance. Voice Agents can be added via the Voice Bot integration in WMS. There are several Voice Agent integration types:

  • Generative AI lets you create a highly interactive and intelligent voice agent without any coding expertise. Based on the specific instructions you provide, the AI model generates dynamic, context-aware responses that enhance user engagement. You can also easily incorporate custom functions to interact with third-party servers, allowing your bot to perform actions like fetching real-time data, updating records, or triggering external processes during a conversation.

  • Webhooks and AWS SQS allow you to take full control over your voice agent sessions. These options are ideal if your voice agent requires custom handling of conversations; you can build a service to analyze events and generate responses.

  • Dialogflow CX is a versatile AI platform that excels at handling natural language and managing conversations, even complicated ones. It is well suited for automating customer interactions, handling queries, and passing customers to human agents when needed, especially when you have clear, predefined use cases where the communication scenarios are generally the same.

  • OpenAI Assistant connects your voice agent to OpenAI’s language models, letting it handle complex conversations and provide natural responses to user queries. With OpenAI’s advanced AI capabilities, your Voice Bot can understand context, manage dialogues, and deliver personalized interactions.

Voice agents support the following languages:

  • Arabic

  • Catalan

  • Danish

  • Dutch

  • English (British)

  • English (US)

  • French

  • German

  • Italian

  • Portuguese

  • Spanish

  • Swedish

  • Swiss German

Requirements 

Use Cases

Voice agents can be used in a variety of ways. Here are some examples: 

  • Shop assistance: assisting customers in finding products, checking availability, or making recommendations based on their interests.

  • Order processing and tracking: helping customers place orders and providing updates on delivery status.

  • Call centers: handling routine calls and reducing wait times, so that human agents can focus on more complex issues.

  • Customer service: answering common customer questions, providing account information, or helping with troubleshooting over the phone.

  • Language support: communicating with customers in multiple languages to address queries of a more diverse audience.

  • Collecting feedback: gathering customer reviews or feedback through voice interactions to improve services.

Step 1. Create Voice Agent


Note: It is possible to create up to 100 Voice agents per organization.

To create a voice agent, proceed with the following steps:

  • Navigate to WMS -> PBX -> Integrations -> Cloud integrations -> Voice Bots:

  • Click Add New Voicebot:

  • Enter voice agent name

  • Enter First message (optional)

  • Select the integration type for processing events:

    • Generative AI

    • Webhook

    • AWS SQS

    • Dialogflow CX

    • OpenAI Assistant

  • Fill out the necessary fields depending on the selected integration type (see instructions below)

  • Add Tools:

Tools allow a voice agent to execute specific tasks during a call. By integrating tools, you can align your voice agents with your existing workflow. See the list of available tools below:

Transfer

Allows the voice agent to hand over the call to specific extensions in the Dialplan. This is particularly useful when complex inquiries require human intervention for better customer satisfaction. 

Description:
The description feature enables your voice agent to determine the appropriate moments to transfer calls. To enhance the bot's decision-making, it's essential to accurately set the description by including comprehensive information about the case.

Example:

  • If the caller requests to speak with a representative or expresses frustration, transfer the call to a human agent

  • For billing inquiries, transfer the call to the billing department queue

  • If the caller provides account information that cannot be verified, transfer them to the security verification IVR

When the option Generate a reply as instructed and transfer the call after playback is selected, you can provide specific instructions that guide the model on what to say to the caller before initiating the transfer. The transfer will be executed immediately after the generated response is played back to the user, and the user will not have the option to cancel the transfer.

Example:

  • Reply "I’ll transfer you to a representative now. Please hold while I connect you."

  • Reply "I’m transferring you to our billing department. Please stay on the line."

Delegate

Allows the main voice agent to delegate user requests to specialised voice agents for more accurate and efficient processing. Acting as a router, the main voice agent identifies the type of request and directs it to the appropriate expert voice agent, ensuring precise handling of areas like scheduling, support, or sales.

Wait

Allows the voice agent to bypass its response and wait for the next user input if it detects that the user has not fully completed their statement. This ensures that the bot does not interrupt or misinterpret partial information, leading to a smoother conversation flow.

By setting a custom Description, you can guide the voice agent to make more effective decisions regarding when to wait for additional user input. This ensures that the bot remains silent until all information has been provided by the user, reducing the chances of miscommunication or incomplete responses.

Example:

  • If the user pauses while giving an address, wait for them to finish before responding.

  • If there’s background noise or the user is interrupted, wait for them to resume speaking.

Hangup

Allows the voice agent to end the call once the conversation has concluded, or if the user explicitly requests to end it. By setting a custom Description, you can help the voice agent determine when to end the call, ensuring a smoother, more natural user experience.

Third-party Function

Allows you to integrate the voice agent with various APIs.

To add a tool:

  1. Click Add Tool -> choose the necessary option:

  2. Fill out the necessary details:

  3. Set up Advanced Configuration:

  • Model: choose the preferred AI model for generating responses. If no model is selected, the system uses the default model.

voice-bot-model.png
  • Interruption Detection: if enabled, customers can interrupt the agent and the system will stop the playback of the voice agent's response. By default, the option is disabled.

  • Silence Timeout: set the timeout before a call is automatically ended due to inactivity and the action (hangup or transfer) that should be performed when the call ends.
    In case you choose to transfer the call after the voice agent reaches the silence timeout, you need to specify:

    • Context: the Dialplan procedure

    • Extension: extension to which the call should be transferred

  • Maximum Duration: the maximum duration of a call in seconds and the action (hangup or transfer) that should be performed when the call ends.

  • Click Add to save your voice agent and proceed with the Dialplan configuration (step 2 below).

Types of Voice Agents

Generative AI

When configuring Generative AI as the integration type, you need to create a clear and precise prompt with instructions for the AI agent, which directly impacts the voice agent's performance and reliability. Prompt engineering is an iterative process: based on user feedback, you can refine your prompts for even better voice agent efficiency.

You can divide your system prompt into distinct sections, each focusing on a specific element of the AI agent's behavior. For example:

  1. Identity: define who the AI agent is, outline its persona and role to set the context for interactions.

  2. Style: establish guidelines for the agent's communication style, including tone, language, and formality.

  3. Response Guidelines: specify preferences for the response format, including any limitations or requirements in terms of the response structure.

  4. Task and Goals: indicate the objectives the agent should achieve and outline the steps it should follow.

generative-ai-voice-bot.png

Starting from WMS Beta 7.05.20251008.1, you can also include the following caller details in the metadata:

  • User Name: allows you to configure the Voice bot to address a caller by name

  • User Phone Number: gives the Voice bot access to the user's phone number, useful for identifying existing customers and verifying accounts

  • User Email: gives the Voice bot access to the user's email (if the information is available), useful for sending booking confirmations and follow-up information

  • User Company: gives the Voice bot access to the user's company (if available), useful for handling corporate accounts, event bookings, or offering business-specific services

  • Date & Time (dynamic value): allows Voice bot to get information about the current date and time. With this option enabled, the bot can correctly interpret time-related expressions such as “now” (e.g., “Is an agent available now?”), “tomorrow” (e.g., “Can I book a meeting for tomorrow?”), “in two hours” (e.g., “Can I have a delivery in two hours”), or specific days of the week, etc.

How to use the feature:

  1. Add the required metadata (e.g. User Name) under the Instructions field by clicking Add context → choose the necessary option:

generative-ai-metadata.png
  2. Make sure to reference it in the prompt depending on the context in which it should be used, e.g.: “Please use the User Name metadata when greeting the user.”

Webhook

Specify the following fields when configuring Webhook as the integration type:

  1. Target: enter the URL that the Webhook will use to send POST requests with the event payload.

  2. Secret: the secret ensures that only requests from the Wildix system are accepted, preventing unauthorized access or potential security breaches. The secret key is included in the headers of each POST request sent by the Webhook. Your server should validate this key to ensure the request is legitimate before processing the event data.
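As a minimal sketch, your server could validate the secret before processing an event like this. Note that the header name used here is an assumption for illustration; inspect a real request from the Webhook to confirm which header actually carries the secret:

```python
import hmac

# Assumed header name -- inspect a real request from the Webhook
# to confirm which header actually carries the secret.
SECRET_HEADER = "X-Webhook-Secret"

def is_authentic(headers: dict, secret: str) -> bool:
    """Return True only if the request carries the expected shared secret.

    hmac.compare_digest performs a constant-time comparison, which
    avoids leaking the secret through timing differences.
    """
    received = headers.get(SECRET_HEADER, "")
    return hmac.compare_digest(received, secret)
```

Reject any request for which this check fails (e.g. respond with HTTP 401) and only then parse the event payload.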

AWS SQS

If you configure AWS SQS as the integration type, you need to provide the following details to establish the connection with your AWS SQS queue:

  • Target: enter the URL of your SQS queue. This is where the events are sent, for example, https://sqs.amazonaws.com/11111/wildix-events-queue

  • Key: enter your AWS Access Key ID. It is used to sign the request that x-bees / Collaboration 7 sends to AWS SQS.

  • Secret: enter your AWS Secret Access Key, which is paired with your AWS Key to sign the requests securely.
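As an illustrative sketch of the consumer side, a service could long-poll the queue with boto3 and process each event. The "type" field and the region value below are assumptions; inspect the payloads your queue actually receives and use your queue's real region:

```python
import json

def parse_event(body: str) -> dict:
    """Decode one SQS message body into an event dict.

    The "type" field referenced below is an assumption -- inspect the
    payloads your queue actually receives to confirm the schema.
    """
    return json.loads(body)

def poll_queue(queue_url: str, key: str, secret: str) -> None:
    """Long-poll the queue and process voice-agent events (needs boto3)."""
    import boto3  # imported lazily so parse_event stays stdlib-only

    sqs = boto3.client(
        "sqs",
        region_name="us-east-1",        # assumption: use your queue's region
        aws_access_key_id=key,          # the Key field from WMS
        aws_secret_access_key=secret,   # the Secret field from WMS
    )
    while True:
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # long polling reduces empty responses
        )
        for msg in resp.get("Messages", []):
            event = parse_event(msg["Body"])
            print(event.get("type"))  # route on the event type here
            # Acknowledge so the message is not redelivered.
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"]
            )
```

Deleting each message after successful processing is what prevents SQS from redelivering it once the visibility timeout expires.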

Dialogflow CX

If you configure Dialogflow CX as the integration type, you need to fill out the following fields to establish the connection between x-bees / Collaboration 7 and your Dialogflow CX agent:

  • Private Key: click Upload and upload the private key file associated with your Google Cloud service account

  • Location: fill out the region where your Dialogflow CX agent is deployed (typically it is a region-specific identifier, for example, europe-west1, us-central1)

  • Language: indicate the language that your Dialogflow CX agent will use to understand and respond to user inputs. Make sure the language code matches the languages supported by your Dialogflow CX agent, e.g. en for English

  • Agent ID: provide the unique identifier of your Dialogflow CX agent; it links your voice agent to the specific Dialogflow CX agent that you’ve configured in Google Cloud.

OpenAI Assistant

If you configure OpenAI Assistant as the integration type, you need to fill out the following fields to enable the connection between x-bees / Collaboration 7 and OpenAI's API:

  1. API Key: enter the unique identifier that grants access to the OpenAI API, letting you send requests to and receive responses from the Assistant

  2. Assistant ID: fill out the unique identifier of the specific OpenAI Assistant you created.

Step 2. Configure Dialplan

To add a voice agent to a Dialplan, use the Voice Bot application. Before adding the voice agent, make sure to set the alaw/ulaw codecs, as the voice agent cannot start if the call was answered with the opus codec:

1. Add the Set application -> Codecs -> alaw, ulaw

2. Then, add the Voice Bot application:

  • Choose the necessary voice agent

  • Select language

  • Choose Voice 

Note: Starting from WMS 7.04.20250929.2, it is possible to set a custom ElevenLabs voice for Voice bots. To do this, in the Voice field, add a link to the preferred voice from ElevenLabs in the following format:

elevenlabs://voice-id?apiKey=api-key

Where "voice-id" is the ID of the preferred voice from ElevenLabs and "api-key" is the ElevenLabs API Key.

  • To get "voice-id" in ElevenLabs, proceed to Voices (1) -> My Voices (2) -> click the three dots in front of the preferred voice -> click Copy Voice ID (3):

  • To get ElevenLabs API Key, proceed to the Developers tab (1) -> API Keys (2): 

elevenlabs-api-keys.png

You can either use an existing API Key or create a new one. 
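The Voice field value can be assembled programmatically. As a small sketch, the helper below builds the elevenlabs:// link from the two values copied above, URL-encoding them defensively in case they ever contain reserved characters:

```python
from urllib.parse import quote

def elevenlabs_voice_uri(voice_id: str, api_key: str) -> str:
    """Build the Voice field value: elevenlabs://voice-id?apiKey=api-key.

    Both parts are percent-encoded defensively; typical ElevenLabs IDs
    and keys are alphanumeric, so they usually pass through unchanged.
    """
    return f"elevenlabs://{quote(voice_id, safe='')}?apiKey={quote(api_key, safe='')}"
```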

  • Add Welcome message if required 

Note:

  • The following languages are supported: Arabic, Catalan, Danish, Dutch, English (British), English (US), French, German, Italian, Portuguese, Spanish, Swedish, Swiss German.

  • For some languages, it may not be possible to select a specific voice. In such cases, the default voice is used.

  • Arabic language is not available in the drop-down menu, but you can use the Set application to define the language

  • Basque and Estonian languages are not supported

Manage Voice Agents

The voice agents that you have created are displayed in WMS -> PBX -> Integrations -> Cloud integrations -> Voice Bots section. You can see the voice agent name, ID, and Integration type.

Edit a Voice Agent

  1. To edit a voice agent, click the Edit (pencil) icon:

  2. Make the necessary changes and click Save:

Delete a Voice Agent

  1. To delete a voice agent, click the Delete icon:

  2. On the screen that pops up, type the word “delete” and click Delete:

Traces

In the Traces section, you can see a table with the following information: session ID, voice agent name, caller, call duration, date, and language:

Click a session in Traces to view the events of that session:

Voice Agents API

You can find voice agents API here.

Use Case: Using Voice and Chat Agents

You can enhance the Voice Agent feature by combining it with a Chat Agent (you can find Chat Agent documentation here). For example, you can set up a voice agent that gathers information from a customer and sends it to a conversation with managers via a chat agent:

Step 1. Create a Chat Agent

  1. Go to WMS -> PBX -> Integrations -> Cloud integrations

  2. Select Chat Bots and click Add new Chatbot

  3. Enter a name for your chat agent

  4. Select the Webhook integration type for processing chat events

  5. Fill out the Target field

  6. Enable Allow users to find the chat agent using search checkbox to let users interact with it

  7. Click Add to save and activate your chat agent 

After creating the chat agent, click Manage API keys to create an API key:

  1. Click Create new API Key

  2. Enter a name for identification

  3. Click Create and copy the secret using the Click to reveal button. You will need the secret when configuring voice agent.

Step 2. Create a conversation

Create a conversation in x-bees / Collaboration 7 where you need to add the chat agent you’ve created as well as the managers who should receive the notifications.

Also, make sure to copy conversation ID, which will be required during voice agent creation:

Step 3. Create Voice Agent

Configure a voice agent with the Generative AI integration type.

In our example, we’ve used the following text as the First Message:
Hello! Do you have any complaints or suggestions regarding Wildix products?

And added the following instructions:

You are a customer care agent that collects all the complaints and suggestions about Wildix products. Try to understand with which product customer is having problems or has a suggestion. Carefully collect all the details. Then pass them to the manager in the chat.
Share with the manager any emotions or sentiments the customer had if any. Ask the customer their name before passing the information to the manager and remember to pass the customer's name as well. Hang up after saying thank you and good-bye if the customer says they have nothing else to add.

In Tools section, add Hangup and Third-party Function options:

We’ve used the following parameters in the Parameters section of the Third-party Function:

{
  "type": "object",
  "properties": {
    "text": {
      "type": "string",
      "description": "The message to the manager containing all the details about complaints and suggestions collected from a customer"
    }
  },
  "required": ["text"]
}

In the Integration section, in the URL field (1) next to the POST method, we entered the following data:

https://api.x-bees.com/v2/conversations/channels/{Conversation_ID}/messages

Where {Conversation_ID} is the ID of the conversation from Step 2.

Click Add authorization and enter the Secret from Step 1 into the Bearer field (2):

Click Save to save the changes.
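For reference, the request that the Third-party Function performs can be sketched in code. The helper below assembles the same POST: the URL from above, a Bearer authorization header with the API key secret, and a JSON body matching the Parameters schema ({"text": ...}); confirm the exact payload shape against the x-bees API reference:

```python
import json

API_BASE = "https://api.x-bees.com/v2"

def build_message_request(conversation_id: str, secret: str, text: str):
    """Assemble the POST request the Third-party Function performs.

    The {"text": ...} body mirrors the Parameters schema above; confirm
    the exact payload shape against the x-bees API reference.
    """
    url = f"{API_BASE}/conversations/channels/{conversation_id}/messages"
    headers = {
        "Authorization": f"Bearer {secret}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"text": text})
    return url, headers, body
```

To actually send it, pass these values to any HTTP client, e.g. urllib.request.Request(url, data=body.encode(), headers=headers, method="POST").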

Step 4. Configure Dialplan

Set up voice agent in the Dialplan:

When calling the number set in the Dialplan, the call is answered by the voice agent, which gathers the required information and sends it to the conversation: