Integration demos

Introduction

In this section, you will find practical use cases demonstrating how to adapt and use Kroko's on-premise models or the Kroko hosted API with your infrastructure. The documentation guides you through each demo, with detailed information on setup and usage. Follow the step-by-step instructions to integrate these models into your own environment.

Auto Uploader

Introduction

The Auto Uploader app is a straightforward application that runs in the background and continuously scans the audio directories you configure. It automatically transcribes any audio files found in these directories and saves the transcripts in a text file of your choice.

Prerequisites:

Python >= 3.9
Ensure your models for pre-recorded audio are ready. You have two options: use on-premise models or the Kroko hosted API.
- For on-premise models, follow the setup instructions here.
- For the hosted API, register and generate your API key following the instructions here.

Supported Audio Formats:

.wav
.gsm
.mp3

Setup

Clone the Auto Uploader repository

git clone https://github.com/banafo-ai/banafo-asr.git
cd banafo-asr/integration-demos/auto_uploader/scripts

Run the installation script:
```
./install.sh
```
Configure your directories to monitor for audio files (you can add multiple directories). Make sure to specify either --api and --lang for our Kroko hosted server or --uri for your on-premise server.
- If using the Kroko hosted API, provide your API key with the --api parameter.
```
    cd ../python/
    ./auto_uploader.py -x insert --lang en-US --api 1b4a82e0-54b0-11ef-2ac2-a5860d6a4a87.86b3fec75502ad01438446d15ca57ea79ab6a4f8 --path /urs/local/my-audio-location/ --txt /urs/local/my-transcripts-location/

    systemctl restart auto_uploader_events.service
```
- If using your on-premise server, provide the server URI with the --uri parameter for the Auto Uploader to generate your transcripts.
```
    cd ../python/
    ./auto_uploader.py -x insert --uri ws://127.0.0.1:6006/ --path /urs/local/my-audio-location/ --txt /urs/local/my-transcripts-location/

    systemctl restart auto_uploader_events.service
```
Note:
- Replace /urs/local/my-audio-location/ with your preferred audio directory.
- Replace /urs/local/my-transcripts-location/ with your preferred directory for transcript storage. If omitted, transcripts will be stored in a .txt file inside the Auto Uploader folder.
- You can find the full list of supported languages here.

To see the full list of configuration parameters (details here), run the script without arguments:
```
./auto_uploader.py
```
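
The rule stated in the setup steps, that a monitored directory is registered with either --api and --lang or --uri, can be captured in a small helper that builds the argument list before anything is executed. This is a sketch only: the function name is hypothetical, and rejecting a mix of both backends is an interpretation of the text above (the script itself may be more lenient). Only flags that appear in this document are used.

```python
def build_insert_command(path, txt=None, api_key=None, lang=None, uri=None):
    """Build the argument list for `./auto_uploader.py -x insert`.

    Exactly one backend is expected: api_key plus lang (hosted API)
    or uri (on-premise server). Hypothetical helper for illustration.
    """
    hosted = api_key is not None or lang is not None
    if hosted and uri is not None:
        raise ValueError("use either --api/--lang or --uri, not both")
    if hosted and (api_key is None or lang is None):
        raise ValueError("the hosted API needs both --api and --lang")
    if not hosted and uri is None:
        raise ValueError("provide --api and --lang, or --uri")

    command = ["./auto_uploader.py", "-x", "insert"]
    if hosted:
        command += ["--lang", lang, "--api", api_key]
    else:
        command += ["--uri", uri]
    command += ["--path", path]
    if txt is not None:  # --txt may be omitted, as the note above explains
        command += ["--txt", txt]
    return command
```

The returned list can be passed to subprocess.run from the python/ directory; remember that the service still has to be restarted afterwards.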

When you use the default installation script, the Auto Uploader automatically detects whether there is an Asterisk or FreeSwitch installation on your machine. If such installations are found, their directories are added to the list of directories to monitor.

After a successful installation, you will notice:

- The application is installed in the /usr/local/auto_uploader/ directory. Make sure you run the app from the source or installation directory.
- A new systemd service called auto_uploader_events is enabled. This daemon monitors the preconfigured (or default) directories and generates a transcript whenever a new audio file appears. If you decide to disable this service, you will need to run the scan manually using the upload option listed here.
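
To confirm that the auto_uploader_events service is actually running after installation, a check along the following lines may help. It is a minimal sketch that relies on `systemctl is-active --quiet`, which exits with status 0 only when the unit is active; the function name and the injectable `run` parameter are illustrative choices, not part of the Auto Uploader.

```python
import subprocess

SERVICE = "auto_uploader_events.service"  # unit name from the text above


def is_service_active(unit=SERVICE, run=subprocess.run):
    """Return True if systemd reports the unit as active.

    `run` defaults to subprocess.run and can be replaced in tests,
    so the check can be exercised on machines without systemd.
    """
    result = run(["systemctl", "is-active", "--quiet", unit], check=False)
    return result.returncode == 0
```

If the function returns False, newly added audio files will not be picked up automatically and the manual upload option is the fallback.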

Additional options

In this section, you will find a list of additional customization possibilities for your Auto Uploader setup. After following the instructions in the Setup section, you will have the default version of the Auto Uploader installed. The following options allow you to configure the app to suit your needs.

For each of the customizations below, run the ./auto_uploader.py script with the specified parameters.

Most common parameters:

-x list: Lists the directories currently monitored by the Auto Uploader.

Example:

./auto_uploader.py -x list

-x insert --path=PATH --uri=URI --txt=DIR: Adds a directory to the list of monitored audio directories, using the specified on-premise server URI and transcript (txt) location.

Example:

./auto_uploader.py -x insert --uri ws://127.0.0.1:6006/ --path /urs/local/my-audio-location/ --txt /urs/local/my-transcripts-location/

Note: After executing the insert command, you need to restart the Auto Uploader service:

systemctl restart auto_uploader_events.service

-x insert --path=PATH --api=API-KEY --txt=DIR: Adds a directory to the list of monitored audio directories, using the Kroko hosted API with the specified API key and transcript (txt) location.

Example:

./auto_uploader.py -x insert --api 1b4a82e0-54b0-11ef-2ac2-a5860d6a4a87.86b3fec75502ad01438446d15ca57ea79ab6a4f8 --path /urs/local/my-audio-location/ --txt /urs/local/my-transcripts-location/

Note: After executing the insert command, you need to restart the Auto Uploader service:

systemctl restart auto_uploader_events.service

-x remove --id=ID: Removes a directory from the list of monitored directories by its ID (ID can be checked using the list command).

Example:

./auto_uploader.py -x remove --id=2

Note: After executing the remove command, you need to restart the Auto Uploader service:

systemctl restart auto_uploader_events.service

-x pending: Shows a list of files pending transcription.

Example:

./auto_uploader.py -x pending

-x success: Shows a list of successfully transcribed files.

Example:

./auto_uploader.py -x success

-x errors: Shows a list of files that failed to be transcribed, along with their errors.

Example:

./auto_uploader.py -x errors

-x upload: Triggers a manual scan of the directories in the list and generates transcripts for the files that have not yet been transcribed. Make sure you execute this command as root as it will need root permissions to write the transcript in a file.

Example:

sudo ./auto_uploader.py -x upload

-x delete: Clears all files in the database regardless of their status (pending, success, error). After executing this command, all files are treated as not transcribed, so a manual upload can be run again.

Example:

./auto_uploader.py -x delete

Full list of options:

./auto_uploader.py
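
The notes above imply two simple rules: insert and remove must be followed by a service restart, and upload needs root. The sketch below encodes those rules so that a wrapper script does not forget them. The function and constant names are hypothetical; the commands themselves are the ones shown in this section.

```python
RESTART = ["systemctl", "restart", "auto_uploader_events.service"]
NEEDS_RESTART = {"insert", "remove"}  # per the notes after those commands
NEEDS_ROOT = {"upload"}               # per the -x upload description


def plan(action, *extra_args):
    """Return the commands to run, in order, for one -x action."""
    command = ["./auto_uploader.py", "-x", action, *extra_args]
    if action in NEEDS_ROOT:
        command = ["sudo", *command]
    steps = [command]
    if action in NEEDS_RESTART:
        steps.append(RESTART)
    return steps
```

For example, plan("remove", "--id=2") yields the remove command followed by the restart command, while plan("list") yields a single command.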

Uninstall:

In case you want to remove the Auto Uploader app, run the following uninstall script:

./uninstall.sh

Note: The script will only remove the Kroko Auto Uploader application files. Your audio files and transcripts generated so far will not be removed.

FreeSwitch

Kroko transcripts using the Auto Uploader App

By setting up our Auto Uploader app as described below, you can obtain transcripts for your FreeSwitch calls. This approach gives you the flexibility to retrieve our transcripts and customize your solution according to your specific needs.

Ensure your models for pre-recorded audio are ready. You have two options: use on-premise models or the Kroko hosted API.
- For on-premise models, follow the setup instructions here.
- For the hosted API, register and generate your API key following the instructions here.

Clone the Auto Uploader repository

git clone https://github.com/banafo-ai/banafo-asr.git
cd banafo-asr/integration-demos/auto_uploader/scripts

Run the installation script:
```
./install.sh
```
If your FreeSwitch installation is not the default one or is not detected by the Auto Uploader app, you need to manually configure the recordings directory to be monitored. Make sure to specify either --api and --lang for our Kroko hosted server or --uri for your on-premise server.

If using the Kroko hosted API, provide your API key with the --api parameter.
```
cd /usr/local/auto_uploader/python/
./auto_uploader.py -x insert --lang languageHere --api apiKeyHere --path /urs/local/my-FREESWITCH-audio-location/ --txt /urs/local/my-transcripts-location/

systemctl restart auto_uploader_events.service
```
If using your on-premise server, provide the server URI with the --uri parameter for the Auto Uploader to generate your transcripts.
```
cd /usr/local/auto_uploader/python/
./auto_uploader.py -x insert --uri ws://127.0.0.1:6006/ --path /urs/local/my-FREESWITCH-audio-location/ --txt /urs/local/my-transcripts-location/

systemctl restart auto_uploader_events.service
```
Note:
- Replace /urs/local/my-FREESWITCH-audio-location/ with your FreeSwitch audio directory.
- Replace /urs/local/my-transcripts-location/ with your preferred directory for transcript storage. If omitted, transcripts will be stored in a .txt file inside the Auto Uploader folder.
- Replace languageHere with the code for your preferred language. See the list of supported languages here.
- Replace apiKeyHere with your apiKey. If you don't have one yet, generate it by following these guidelines.

For the full list of Auto Uploader options, click here.

Kroko module for FreeSwitch real-time transcripts

By setting up the freeswitch-kroko module as described below, you can stream audio from your FreeSwitch calls directly to the Kroko ASR engine and obtain real-time transcripts. This integration provides full flexibility to choose between on-premise or cloud-based speech recognition, and to customize how transcription data is handled within your own FreeSwitch setup or downstream applications.

Setup

Ensure your STT models are ready. You have two options: use on-premise models or the Kroko hosted API.
- For on-premise models, follow the setup instructions here.
- For the hosted API, register and generate your API key following the instructions here.

Clone the Integration demos repository

git clone https://github.com/kroko-ai/integration-demos.git /usr/src/integration-demos/

Clone the FreeSwitch repository and add the Kroko module to the FreeSwitch build configuration

git clone --branch v1.10.12 --depth 1 https://github.com/signalwire/freeswitch /usr/src/freeswitch

cd /usr/src/freeswitch

sed -i '/src\/mod\/asr_tts\/mod_tts_commandline\/Makefile/a\\t\tsrc\/mod\/asr_tts\/mod_kroko_transcribe\/Makefile' configure.ac

Add the line 'asr_tts/mod_kroko_transcribe' to modules.conf.

cp -fvR ../integration-demos/freeswitch-kroko/mod_kroko_transcribe/ src/mod/asr_tts/mod_kroko_transcribe/

Compile and install FreeSwitch

cd /usr/src/freeswitch

./bootstrap.sh -j
./configure
make -j`nproc` && make install

Configuration

By default, the Kroko module configuration file is not located in the default FreeSWITCH configuration directory. You should copy it manually to the autoload configs directory:

    cd /usr/src/integration-demos/freeswitch-kroko/mod_kroko_transcribe/
    cp -v kroko_transcribe.conf.xml /usr/local/freeswitch/conf/autoload_configs/

In the kroko_transcribe.conf.xml file, make sure to adjust the configuration according to your setup and requirements.

The following parameters are crucial for proper module operation:

host and port – Set these to match your Kroko ASR (STT) server or API endpoint.
result_mode – Defines how transcription results are returned:
- none – Raw transcript output from the ASR server (default behavior).
- text – Returns only finalized text segments (e.g., full sentences).
- json – Returns detailed developer-friendly JSON output.
channel_mode – Defines which side of the conversation is transcribed:
- ro – Transcribe caller only.
- wo – Transcribe callee only.
- rw – Transcribe both caller and callee.
callback_url – (Optional) Specify this if you want the module to send transcription results to a custom URL, for example, to update your own UI or application in real time.

You can explore and fine-tune the remaining options in the configuration file to better match your environment and workflow.
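
The allowed values listed above lend themselves to a quick sanity check before FreeSwitch is restarted. The sketch below validates a plain dictionary of parameters. It deliberately does not parse kroko_transcribe.conf.xml, because the layout of that file is not shown here; the function name and the idea of collecting the values into a dict are assumptions.

```python
RESULT_MODES = {"none", "text", "json"}
CHANNEL_MODES = {"ro", "wo", "rw"}


def check_settings(settings):
    """Return a list of problems found in a dict of module parameters."""
    problems = []
    for name in ("host", "port"):
        if not settings.get(name):
            problems.append(f"{name} is not set")
    port = settings.get("port")
    if port and not str(port).isdigit():
        problems.append(f"port must be numeric, got {port!r}")
    # "none" is described above as the default behavior.
    result_mode = settings.get("result_mode", "none")
    if result_mode not in RESULT_MODES:
        problems.append(f"result_mode must be one of {sorted(RESULT_MODES)}")
    channel_mode = settings.get("channel_mode")
    if channel_mode is not None and channel_mode not in CHANNEL_MODES:
        problems.append(f"channel_mode must be one of {sorted(CHANNEL_MODES)}")
    return problems
```

An empty list means the values are consistent with the options documented above; it does not prove that the ASR server is reachable.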

Usage

The module provides both an API and a dialplan application for transcription:

API command: uuid_kroko_transcribe
Dialplan application: kroko_transcribe

API Example:

Run from fs_cli or your scripts:

<freeswitch@kroko-fs-mod> uuid_kroko_transcribe b49d710e-e49f-4bb7-b077-43075835bca8 start en-US
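
From a script, the same API command can be sent through fs_cli with its -x option, which executes a single command and exits. The following is a minimal sketch: the helper names are hypothetical, only the start action shown above is used, and validating the channel UUID with Python's uuid module is a convenience that assumes your channels use standard UUIDs.

```python
import subprocess
import uuid


def transcribe_command(call_uuid, lang):
    """Return the API command string that starts transcription for a call."""
    uuid.UUID(call_uuid)  # raises ValueError if this does not parse as a UUID
    return f"uuid_kroko_transcribe {call_uuid} start {lang}"


def start_transcription(call_uuid, lang):
    """Send the command via fs_cli; needs fs_cli on PATH and a running FreeSwitch."""
    return subprocess.run(
        ["fs_cli", "-x", transcribe_command(call_uuid, lang)],
        capture_output=True,
        text=True,
        check=False,
    )
```

The completed process returned by start_transcription carries the fs_cli output in its stdout attribute, which is useful for logging.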

Dialplan Example:

<!-- QA Team accounts -->
<context name="qa-team" >
 <extension name="QA_Team_Extension">

  <condition field="destination_number" expression="^(189[0-9][0-9])$" break="on-true">
    <action application="set" data="answer_delay=0"/>
    <action application="set" data="call_timeout=45"/>
    <action application="set" data="hangup_after_bridge=true"/>
    <action application="set" data="continue_on_fail=true"/>
    <action application="set" data="bypass_media=true"/>
    <action application="bridge" data="user/$1" />
    <action application="hangup" />
  </condition>

 </extension>

 <extension name="playback">
  <condition field="destination_number" expression="^9999$">
    <action application="set" data="answer_delay=0"/>
    <action application="answer"/>
    <action application="playback" data="viki_test.wav"/>
    <action application="hangup" />
  </condition>
 </extension>

 <extension name="local-ws">
    <condition field="destination_number" expression="^99996$">
      <action application="set" data="answer_delay=0"/>
      <action application="answer" />
      <action application="log" data="uuid: ${uuid}"/>
      <action application="set" data="lang=en"/>
      <action application="kroko_transcribe" data="${uuid} start ${lang}" />
      <action application="bridge" data="loopback/9999/qa-team"/>
      <action application="hangup" />
    </condition>
 </extension>

 <extension name="api-wss">
    <condition field="destination_number" expression="^99997$">
      <action application="set" data="answer_delay=0"/>
      <action application="answer" />
      <action application="log" data="uuid: ${uuid}"/>
      <action application="set" data="KROKO_API_KEY=<apiKey>"/>
      <action application="set" data="KROKO_SPEECH_ENDPOINTS=true"/>
      <action application="set" data="lang=en-US"/>
      <action application="kroko_transcribe" data="${uuid} start ${lang}" />
      <action application="bridge" data="loopback/9999/qa-team"/>
      <action application="hangup" />
    </condition>
 </extension>

 <extension name="local-conf-room">
    <condition field="destination_number" expression="^99998$">
      <action application="set" data="answer_delay=0"/>
      <action application="answer" />
      <action application="log" data="uuid: ${uuid}"/>
      <action application="set" data="lang=en"/>
      <action application="kroko_transcribe" data="${uuid} start ${lang}" />
      <action application="conference" data="$1-${domain_name}@default"/>
      <action application="hangup" />
    </condition>
 </extension>

</context>

Note:

local-ws → connects to the Kroko on-premise ASR server
api-wss → connects to the Kroko Streaming API

For API-based connections, register and generate your API key following the instructions here, then replace <apiKey> in the api-wss extension with your actual key.

After updating the dialplan, reload FreeSwitch configuration:

    reloadxml
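
To see which extension of the dialplan example a dialed number would reach, the destination_number expressions can be tried out offline. The sketch below copies the five expressions from the example and tests them with Python's re module; FreeSwitch uses PCRE, but for patterns this simple the two engines behave the same. The function name is hypothetical.

```python
import re

# (extension name, destination_number expression) copied from the dialplan example.
EXTENSIONS = [
    ("QA_Team_Extension", r"^(189[0-9][0-9])$"),
    ("playback", r"^9999$"),
    ("local-ws", r"^99996$"),
    ("api-wss", r"^99997$"),
    ("local-conf-room", r"^99998$"),
]


def matching_extension(destination_number):
    """Return the name of the first extension whose expression matches, or None."""
    for name, expression in EXTENSIONS:
        if re.search(expression, destination_number):
            return name
    return None
```

For instance, matching_extension("18902") selects QA_Team_Extension, and matching_extension("99997") selects api-wss.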

Testing freeswitch-kroko module with Docker

You can test the module using a preconfigured Docker environment. All required files are located in the ./docker/ subdirectory.

With the kroko-fs Docker container, you’ll have a ready-to-use FreeSwitch setup including test SIP accounts, dialplan examples, and the freeswitch-kroko module preconfigured. Please note that the Kroko transcription service itself is not included in this Docker setup — you’ll need to either configure your own on-premise Kroko server or connect to the Kroko Streaming API as described above.

(Optional) In case you're using the Kroko hosted API:
```
cd docker/
vim Dockerfile
```
Find this line and update it:
```
ENV apiKey "<yourApiKey>"
```
Note:
- Replace <yourApiKey> with your API key. If you don't have one yet, generate it by following these guidelines.

Build and Run

cd docker/
docker-compose up -d --build

Check Container Status
```
docker ps -a
docker logs kroko-fs
```

The first command lists all containers, including stopped ones, and the second shows the logs of the kroko-fs container.

Test SIP Accounts

Account 1:

SIP proxy: localhost or 127.0.0.1
SIP proxy port: 9096
SIP username: 18902
SIP password: QA*VoIP-Test_902
audio codecs: g711,opus,speex,etc

Account 2:

SIP proxy: localhost or 127.0.0.1
SIP proxy port: 9096
SIP username: 18903
SIP password: QA*VoIP-Test_903
audio codecs: g711,opus,speex,etc

Test extensions:

99996 to call the Kroko on-premise server ( localhost )
99997 to call the Kroko API ( https://app.kroko.ai )
99998 to call a conference room and send the audio stream to the Kroko on-premise server

Asterisk

Kroko transcripts using the Auto Uploader App

By setting up our Auto Uploader app as described below, you can obtain transcripts for your Asterisk calls. This approach gives you the flexibility to retrieve our transcripts and customize your solution according to your specific needs.

Ensure your models for pre-recorded audio are ready. You have two options: use on-premise models or the Kroko hosted API.
- For on-premise models, follow the setup instructions here.
- For the hosted API, register and generate your API key following the instructions here.

Clone the Auto Uploader repository

git clone https://github.com/banafo-ai/banafo-asr.git
cd banafo-asr/integration-demos/auto_uploader/scripts

Run the installation script:
```
./install.sh
```
If your Asterisk installation is not the default one or is not detected by the Auto Uploader app, you need to manually configure the recordings directory to be monitored. Make sure to specify either --api and --lang for our Kroko hosted server or --uri for your on-premise server.

If using the Kroko hosted API, provide your API key with the --api parameter.
```
cd /usr/local/auto_uploader/python/
./auto_uploader.py -x insert --lang languageHere --api apiKeyHere --path /urs/local/my-ASTERISK-audio-location/ --txt /urs/local/my-transcripts-location/

systemctl restart auto_uploader_events.service
```
If using your on-premise server, provide the server URI with the --uri parameter for the Auto Uploader to generate your transcripts.
```
cd /usr/local/auto_uploader/python/
./auto_uploader.py -x insert --uri ws://127.0.0.1:6006/ --path /urs/local/my-ASTERISK-audio-location/ --txt /urs/local/my-transcripts-location/

systemctl restart auto_uploader_events.service
```
Note:
- Replace /urs/local/my-ASTERISK-audio-location/ with your Asterisk audio directory.
- Replace /urs/local/my-transcripts-location/ with your preferred directory for transcript storage. If omitted, transcripts will be stored in a .txt file inside the Auto Uploader folder.
- Replace languageHere with the code for your preferred language. See the list of supported languages here.
- Replace apiKeyHere with your apiKey. If you don't have one yet, generate it by following these guidelines.

For the full list of Auto Uploader options, click here.

Kroko module for Asterisk real-time transcripts

By setting up the asterisk-kroko module as described below, you can stream audio from your Asterisk calls directly to the Kroko ASR engine and obtain real-time transcripts. This integration provides full flexibility to choose between on-premise or cloud-based speech recognition, and to customize how transcription data is handled within your own Asterisk setup or downstream applications.

We provide two modules that enable the Kroko STT integration:

Speech (res_speech_kroko module)
Audiohook (res_audiohook_kroko module) — depends on the res_speech_kroko module

Setup via Docker

Ensure your STT models are ready. You have two options: use on-premise models or the Kroko hosted API.
- For on-premise models, follow the setup instructions here.
- For the hosted API, register and generate your API key following the instructions here.

Clone the Integration demos repository

git clone https://github.com/kroko-ai/integration-demos.git /usr/src/integration-demos/
cd /usr/src/integration-demos/asterisk-kroko/

Build and run (for Kroko hosted API usage):

If you plan to use the Kroko hosted API STT, you must provide your apiKey when starting the Docker setup:

    cd docker/
    API_KEY="apiKey" docker compose up -d --build

Note:

If the container has already been created, you can update the API key directly in the mounted configuration file. Open the file, set the apiKey value, save it, and then restart the container:

mcedit /var/lib/docker/volumes/docker_asterisk_etc/_data/res_speech_kroko.conf
docker restart kroko-ast

Build and run (for Kroko on-premise STT):

cd docker/
docker compose up -d --build

Setup manually

Ensure your STT models are ready. You have two options: use on-premise models or the Kroko hosted API.
- For on-premise models, follow the setup instructions here.
- For the hosted API, register and generate your API key following the instructions here.

Clone the Integration demos repository

git clone https://github.com/kroko-ai/integration-demos.git /usr/src/integration-demos/
cd /usr/src/integration-demos/asterisk-kroko/

Download or use your existing Asterisk installation

Compile and install the Kroko modules against your Asterisk source tree

./bootstrap
./configure --with-asterisk=/usr/src/asterisk/ --prefix=/usr/
make
make install

Enable Kroko modules in your Asterisk setup by modifying modules.conf
```
load = res_speech.so
load = res_http_websocket.so
load = res_speech_kroko.so
load = res_audiohook_kroko.so
```
Note:
- res_audiohook_kroko.so depends on res_speech_kroko.so, which means res_speech_kroko.so must always be loaded first.
- You can load modules manually from the Asterisk CLI using:
```
module load module_name
```
- You can unload modules manually from the Asterisk CLI using:
```
module unload module_name
```
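
Because res_audiohook_kroko.so depends on res_speech_kroko.so, it can be worth checking the order of the load lines before restarting Asterisk. The sketch below does only that: it reads the text of modules.conf and compares line positions. The function name is hypothetical, and whether line order alone determines the actual load order may depend on your Asterisk version.

```python
def load_order_ok(modules_conf_text):
    """Return True if res_speech_kroko.so is listed before res_audiohook_kroko.so."""
    loaded = []
    for line in modules_conf_text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ; comments
        if line.startswith("load") and "=" in line:
            # Accept both "load = x.so" and "load => x.so".
            loaded.append(line.split("=", 1)[1].strip().lstrip(">").strip())
    if "res_audiohook_kroko.so" not in loaded:
        return True  # nothing listed that depends on the speech module
    return (
        "res_speech_kroko.so" in loaded
        and loaded.index("res_speech_kroko.so") < loaded.index("res_audiohook_kroko.so")
    )
```

The four load lines shown above pass this check; swapping the last two makes it fail.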

Configuration

To configure the Kroko module, modify the /etc/asterisk/res_speech_kroko.conf file:

    [general]
    debug = no

    [kro-en]
    url = ws://localhost:6006
    ; sample_rate = 16000
    ; callback_url = ws://127.0.0.1:8000
    ; result_mode = none|text|json
    result_mode = text
    ; channel_mode = ro|wo|rw
    channel_mode = wo

    [kro-bg]
    url = ws://localhost:6007

    [kro-en-api]
    url = wss://app.kroko.ai/api/v1/transcripts/streaming
    apiKey = apiKey-from-kroko.ai
    lang = en-US
    ;endpoints = false

Important Configuration Parameters:

url – Specifies the Kroko STT server or API endpoint to connect to.
apiKey – Required only when using the hosted API (e.g., kro-en-api).
result_mode – Controls how transcription results are returned:
- none – Raw transcript output from the ASR server (default behavior).
- text – Returns only finalized text segments (e.g., full sentences).
- json – Returns detailed developer-friendly JSON output.
channel_mode – Defines which side of the conversation is transcribed:
- ro – Transcribe caller only.
- wo – Transcribe callee only.
- rw – Transcribe both caller and callee.
sample_rate (optional) – Defaults to 16000 if not set.
callback_url (optional) – Use this if you have a server prepared to receive forwarded transcripts from the Kroko module.
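
Since res_speech_kroko.conf uses an INI-like layout, the parameters above can be checked with Python's configparser before reloading the module. The following is a sketch under stated assumptions: the function name is hypothetical, a profile is treated as using the hosted API when its url contains app.kroko.ai, and the ws:// or wss:// check simply mirrors the examples in this document. The sample text is an abridged copy of the configuration shown earlier.

```python
import configparser

SAMPLE = """\
[general]
debug = no

[kro-en]
url = ws://localhost:6006
result_mode = text
channel_mode = wo

[kro-en-api]
url = wss://app.kroko.ai/api/v1/transcripts/streaming
apiKey = apiKey-from-kroko.ai
lang = en-US
"""

RESULT_MODES = {"none", "text", "json"}
CHANNEL_MODES = {"ro", "wo", "rw"}


def check_profiles(text):
    """Return {profile: [problems]} for every section except [general]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names such as apiKey case-sensitive
    parser.read_string(text)
    report = {}
    for name in parser.sections():
        if name == "general":
            continue
        section = parser[name]
        problems = []
        url = section.get("url", "")
        if not url.startswith(("ws://", "wss://")):
            problems.append("url should be a ws:// or wss:// address")
        # Assumption: the hosted API is recognized by its host name.
        if "app.kroko.ai" in url and not section.get("apiKey"):
            problems.append("apiKey is required for the hosted API")
        if section.get("result_mode", "none") not in RESULT_MODES:
            problems.append("result_mode must be none, text or json")
        channel_mode = section.get("channel_mode")
        if channel_mode is not None and channel_mode not in CHANNEL_MODES:
            problems.append("channel_mode must be ro, wo or rw")
        report[name] = problems
    return report
```

Running check_profiles(SAMPLE) reports no problems for either profile; removing the apiKey line from kro-en-api produces one.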

Usage

Dialplan Example:

    [sip-test]
    ; to Kroko API GW with ENG models
    exten => _8881,1,NoOp()
    same => n,Answer()
    same => n,SpeechCreate(kro-en-api)
    same => n,SpeechBackground(hello-world)
    same => n,Verbose(0,${SPEECH_TEXT(0)})
    same => n,Hangup()

    ; to Kroko ASR server with ENG models
    exten => _8882,1,NoOp()
    same => n,Answer()
    same => n,SpeechCreate(kro-en)
    same => n,SpeechBackground(hello-world)
    same => n,Verbose(0,${SPEECH_TEXT(0)})
    same => n,Hangup()

    ; to Kroko ASR server with ENG models
    exten => _8883,1,NoOp()
    same => n,Answer()
    same => n,KrokoAudioHook(kro-en)
    same => n,Playback(viki_test)
    same => n,Hangup()

    ; to Kroko API GW with ENG models
    exten => _8884,1,NoOp()
    same => n,Answer()
    same => n,KrokoAudioHook(kro-en-api)
    same => n,Playback(viki_test)
    same => n,Hangup()

Testing asterisk-kroko module with the Docker setup

Download and install Zoiper5.

Configure a SIP account using the following parameters:

SIP server : 127.0.0.1:5297
SIP username : 1107
SIP password : TestQA*1107
audio codecs : g711, opus or gsm

You can dial the following extensions:
- 8881 → Calls kroko.ai API streaming, English model, using the Asterisk ASR interface (via the res_speech_kroko module)
- 8882 → Calls the on-premise Kroko ASR server, English model, using the Asterisk ASR interface (via the res_speech_kroko module)
- 8883 → Calls the on-premise Kroko ASR server, English model, using the Asterisk Audiohook interface (via the res_speech_kroko and res_audiohook_kroko modules)
- 8884 → Calls kroko.ai API streaming, English model, using the Asterisk Audiohook interface (via the res_speech_kroko and res_audiohook_kroko modules)

Note:

Use the following command to view the current Asterisk log output:
```
docker logs -f kroko-ast
```

FreePBX

Get Kroko transcripts for your PBX calls

By setting up our Auto Uploader app as described below, you can obtain transcripts for your FreePBX calls. This approach gives you the flexibility to retrieve our transcripts and customize your solution according to your specific needs.

There are two options to set up transcripts for your FreePBX recordings using Kroko.

Option 1: Using our Auto Uploader app, which includes the following steps:

Ensure your models for pre-recorded audio are ready. You have two options: use on-premise models or the Kroko hosted API.
- For on-premise models, follow the setup instructions here.
- For the hosted API, register and generate your API key following the instructions here.

Clone the Auto Uploader repository

git clone https://github.com/banafo-ai/banafo-asr.git
cd banafo-asr/integration-demos/auto_uploader/scripts

Run the installation script:
```
./install.sh
```
If your FreePBX installation is not the default one or is not detected by the Auto Uploader app, you need to manually configure the recordings directory to be monitored. Make sure to specify either --api and --lang for our Kroko hosted server or --uri for your on-premise server.

If using the Kroko hosted API, provide your API key with the --api parameter.
```
cd /usr/local/auto_uploader/python/
./auto_uploader.py -x insert --lang languageHere --api apiKeyHere --path /urs/local/my-FreePBX-audio-location/ --txt /urs/local/my-transcripts-location/

systemctl restart auto_uploader_events.service
```
If using your on-premise server, provide the server URI with the --uri parameter for the Auto Uploader to generate your transcripts.
```
cd /usr/local/auto_uploader/python/
./auto_uploader.py -x insert --uri ws://127.0.0.1:6006/ --path /urs/local/my-FreePBX-audio-location/ --txt /urs/local/my-transcripts-location/

systemctl restart auto_uploader_events.service
```
Note:
- Replace /urs/local/my-FreePBX-audio-location/ with your FreePBX audio directory.
- Replace /urs/local/my-transcripts-location/ with your preferred directory for transcript storage. If omitted, transcripts will be stored in a .txt file inside the Auto Uploader folder.
- Replace languageHere with the code for your preferred language. See the list of supported languages here.
- Replace apiKeyHere with your apiKey. If you don't have one yet, generate it by following these guidelines.
Ensure that call recordings are enabled in the FreePBX interface.

For the full list of Auto Uploader options, click here.