Voxist Speech-to-text | Scaleway Marketplace


Presentation

Async Speech-to-Text

The Voxist API can transcribe pre-recorded audio or video files in seconds, with human-level accuracy. It scales to tens of thousands of files processed in parallel.

Realtime Speech-to-Text

The Voxist API can transcribe speech in real time, either for your CallBots or to assist your customer representatives during calls.

Auto Punctuation and Casing

Automatically adds punctuation, and casing for proper nouns, to the transcription text.

Speaker Diarization

Detects the number of speakers in your audio file and associates each word in the text with its speaker.
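
To illustrate what word-level speaker labels make possible, here is a minimal sketch that counts speakers and merges consecutive words from the same speaker into turns. The input shape (a list of dicts with "word" and "speaker" keys) is an assumption made for this example, not the documented Voxist response format; check the swagger documentation for the actual field names.

```python
from itertools import groupby


def count_speakers(words):
    """Return the number of distinct speaker labels in the word list."""
    return len({w["speaker"] for w in words})


def speaker_turns(words):
    """Merge consecutive words with the same speaker into (speaker, text) turns.

    `words` is assumed to be in transcript order; the "word" and "speaker"
    keys are hypothetical names chosen for this sketch.
    """
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        turns.append((speaker, " ".join(w["word"] for w in group)))
    return turns
```

A speaker who talks, is interrupted, and talks again produces two separate turns, which is usually what a call transcript needs.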

Word Timings

View word-by-word timestamps across the entire transcript text.
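
Word-level timestamps are what subtitle generation builds on. The sketch below groups timed words into SubRip (SRT) cues. It assumes each word is a dict with "word", "start" and "end" keys, with times in seconds; those names and units are hypothetical and should be adapted to the actual API response.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, max_words=7):
    """Build SRT text from timed words, with at most `max_words` per cue.

    Each cue spans from the start of its first word to the end of its last.
    The "word", "start" and "end" keys are assumptions made for this sketch.
    """
    cues = []
    for index, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        start = format_timestamp(chunk[0]["start"])
        end = format_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{index}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + ("\n" if cues else "")
```

Splitting on a fixed word count is the simplest possible policy; a production subtitler would more likely break cues at punctuation or pauses.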

Who is the solution for?

CallBot developers who need real-time speech-to-text for seamless interactions

Call centers that want transcripts for after-the-fact reporting, or in real time to help their customer representatives

Media companies, or any company that needs to create subtitles for videos, conferences, or meetings

How to get started

First obtain an API key; you can then call the APIs as described in the Swagger documentation.
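
As a rough illustration of that flow, the sketch below prepares an authenticated upload request using only the Python standard library. Nothing here is taken from the Voxist documentation: the endpoint URL, the header that carries the key, and the request body format are all assumptions, so they are passed in as parameters to be filled from the Swagger documentation.

```python
import urllib.request


def build_transcription_request(
    endpoint_url,
    api_key,
    audio_bytes,
    key_header="Authorization",  # assumption: the real header name may differ
    content_type="application/octet-stream",  # assumption: see the Swagger docs
):
    """Prepare (but do not send) a POST request carrying an audio payload.

    `endpoint_url` must come from the Swagger documentation; this sketch
    deliberately does not hard-code any Voxist URL or path.
    """
    return urllib.request.Request(
        endpoint_url,
        data=audio_bytes,
        headers={key_header: api_key, "Content-Type": content_type},
        method="POST",
    )
```

The prepared request can then be sent with `urllib.request.urlopen(request)`. Keeping construction separate from sending makes it easy to inspect the headers before any call is made.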

Pricing

For a quote on specific models or on-premises deployment, contact Voxist by clicking the 'Contact' button.

Minutes	Hours	Voxist
100 000	1 667	0,65 €
500 000	8 333	0,52 €
1 000 000	16 667	0,42 €
3 000 000	50 000	0,32 €
6 000 000	100 000	0,32 €
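
The Hours column is the Minutes column divided by 60 and rounded to the nearest hour (for example, 100 000 / 60 ≈ 1 667). The sketch below reproduces that conversion and looks up the listed price for a given volume. It assumes each row is a minimum-volume threshold, which the table does not state, and it returns the listed figure as-is because the table does not say which unit the price applies to.

```python
from decimal import Decimal

# (minimum minutes, listed price in euros), copied from the table above.
# Treating each row as a minimum-volume threshold is an assumption.
TIERS = [
    (100_000, Decimal("0.65")),
    (500_000, Decimal("0.52")),
    (1_000_000, Decimal("0.42")),
    (3_000_000, Decimal("0.32")),
    (6_000_000, Decimal("0.32")),
]


def minutes_to_hours(minutes):
    """Convert minutes to whole hours, rounded as in the table."""
    return round(minutes / 60)


def listed_price(minutes):
    """Return the listed price of the highest tier reached, or None below the first tier."""
    price = None
    for threshold, tier_price in TIERS:
        if minutes >= threshold:
            price = tier_price
    return price
```

For volumes below the first row the function returns None, since the table lists no price for them.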

Support

Standard support is available by email; phone support requires a subscription.

Learn more about Voxist Terms and Conditions.