Leopard Speech-to-Text

Build accurate and private transcription software.

On-device transcription with cloud-level accuracy, bringing control back to enterprises.


Trusted by thousands of enterprises - from startups to Fortune 500s

Loved by 50,000+ developers

What is Leopard Speech-to-Text?

Leopard Speech-to-Text is software that converts audio and video recordings into text with cloud-level accuracy without sacrificing privacy.

Leopard Speech-to-Text brings speech recognition to where the data resides, enabling transcription on devices, in mobile apps, in web browsers, on-premises, or in the cloud.

Build with intuitive Speech-to-Text SDKs

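import pvleopard

# access_key is the AccessKey from Picovoice Console
o = pvleopard.create(access_key=access_key)

# path points to the audio file to transcribe
transcript, words = o.process_file(path)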
Build with Python

Why Leopard Speech-to-Text?

Cloud speech-to-text APIs require enterprises to send their data to a third-party cloud, giving away control over their data and product.

Leopard Speech-to-Text offers the same performance with no compromises.

Don't leave any data behind

Creating new possibilities for your content, product, and database

Just like “best” cloud transcription APIs…

Leopard Speech-to-Text offers cloud-level accuracy, model customization, and cross-platform support…

…with no compromises

…without sacrificing privacy, reliability, or affordability, enabling use cases that were impossible before.

Scientifically-Proven Accuracy

Your product, your decision

Evaluate the accuracy of Leopard Speech-to-Text vs other transcription APIs scientifically with the open-source speech-to-text benchmark, enabling you to make decisions confidently with your data.
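Accuracy comparisons of this kind are commonly reported as word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the engine's output into the reference transcript, divided by the number of reference words. The sketch below is a minimal illustration of that metric, not the benchmark's own code. The function name and the lowercase-and-split normalization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of four reference words is missing from the hypothesis.
    print(word_error_rate("the quick brown fox", "the quick fox"))
```

Running the same function over the transcripts from each engine on your own recordings gives a like-for-like number to compare.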

Speech-to-Text Model Adaptation

Boost accuracy with custom models

Customize pre-trained speech-to-text models instantly on a self-service platform by adding domain-specific vocabulary and boosting frequently used words, raising accuracy for your use case.

Cloud speech-to-text APIs transfer voice input to the cloud to transcribe it into text, creating privacy and reliability issues as well as additional costs.

Cross-platform support

Create seamless experiences

Deploy Leopard Speech-to-Text anywhere and offer seamless experiences across devices, mobile apps, web browsers, on-premises servers, the cloud, or all of them.

Privacy by design

Do not rely on “check the box” compliance models

Process voice data without sharing it with third parties, supporting compliance with GDPR, HIPAA, CCPA, and other regulations, including those introduced in the future.

No network downtime or latency

Develop dependable products

Build reliable products with predictable response times by bringing speech-to-text closer to your data to bypass network latency, congestion, outages, or throttling.

Cost-effective at scale

Scale your business, not cloud providers’

Do not bear the cost of running bulky models in the cloud. Big Tech uses on-device speech-to-text for their products because running large models in the cloud is costly, even for them.

Get started with Leopard Speech-to-Text

The best way to learn about Leopard Speech-to-Text is to use it!

Start Now

Forever Free Plan

Pre-trained models
Custom vocabulary
Keyword boosting
Intuitive SDKs
Speaker Diarization
Truecasing and Punctuation
Word-level Confidence Scores
Word-level Timestamps
English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish

JavaScript Speech Recognition

Whisper Speech-to-Text Alternative for Real-time Transcription

Lumina - AI Art Generator using Voice Prompts in Python

Transcribe and Summarize YouTube videos using Twilio, ChatGPT, and Leopard Speech-to-Text in Node.JS

Top Free and Commercial Speaker Diarization APIs and SDKs

Offline Speech-to-Text Features

What is speech-to-text?

Speech-to-text (STT), also known as Automatic Speech Recognition (ASR) and Open-Domain Large Vocabulary Speech Recognition (LVSR), refers to the technology and methodologies that convert voice data into text.

How does on-device speech-to-text differ from cloud-based speech-to-text?

Cloud-based speech-to-text APIs send voice data to vendors’ servers, where the transcription engine resides. On-device voice processing brings voice recognition to where the voice data is, eliminating all the steps related to cloud processing.

What are the benefits of on-device speech-to-text over cloud speech-to-text APIs?

On-device speech-to-text empowers enterprises to retain ownership and control over their data and product. Sending voice data to the cloud has privacy, latency, reliability, and cost implications. On-device speech-to-text overcomes these challenges, bringing control back to enterprises.

Does Leopard Speech-to-Text support real-time transcription?

Leopard Speech-to-Text doesn’t, but Cheetah Streaming Speech-to-Text does. Cheetah is Picovoice’s on-device streaming speech-to-text engine that provides text output in real time.
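The difference in practice is that a streaming engine consumes audio in small fixed-size frames and returns partial text as it goes, rather than taking a whole file at once. The sketch below illustrates that loop against any engine object that exposes `frame_length`, `process(frame)` returning a `(partial_transcript, is_endpoint)` pair, and `flush()`. That interface is assumed here to mirror Cheetah's Python SDK; the `stream_transcribe` helper is hypothetical and written only for illustration.

```python
def stream_transcribe(engine, samples):
    """Feed audio samples to a streaming engine frame by frame.

    `engine` is assumed to expose `frame_length`, `process(frame)` and
    `flush()`, as described above. `samples` is a sequence of PCM samples.
    """
    transcript = ""
    frame_length = engine.frame_length

    # Only whole frames are sent; a trailing partial frame is dropped.
    usable = len(samples) - len(samples) % frame_length
    for start in range(0, usable, frame_length):
        frame = samples[start:start + frame_length]
        partial, _is_endpoint = engine.process(frame)
        transcript += partial  # partial text is available immediately

    # Collect whatever text the engine is still buffering.
    transcript += engine.flush()
    return transcript
```

In a live application the frames would come from a microphone instead of a prerecorded list, and each partial result would be displayed as soon as it arrives.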

Can I use Leopard Speech-to-Text in the cloud?

Yes. You can run Leopard Speech-to-Text in the cloud, whether private, public, or hybrid. Picovoice’s on-device voice recognition technology makes the cloud optional rather than mandatory: whichever platform you choose, data doesn’t have to leave the enterprise’s premises. Don’t forget to check the tutorials for serverless speech-to-text with AWS Lambda and a transcription microservice with gRPC.

Does Leopard Speech-to-Text support Speaker Diarization?

Yes. Leopard Speech-to-Text embeds an optimized version of Falcon Speaker Diarization to simplify the development process. Please check the Leopard Speech-to-Text documentation for more information.
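With diarization enabled, each returned word is expected to carry a speaker label, so a transcript can be regrouped into speaker turns. The sketch below assumes each word object has `word` and `speaker_tag` attributes; the `Word` tuple is a stand-in defined for the example, and `speaker_turns` is a hypothetical helper rather than part of the SDK.

```python
from collections import namedtuple
from itertools import groupby

# Stand-in for the SDK's word objects (field names are assumptions).
Word = namedtuple("Word", ["word", "speaker_tag"])


def speaker_turns(words):
    """Merge consecutive words with the same speaker tag into one turn."""
    turns = []
    for tag, group in groupby(words, key=lambda w: w.speaker_tag):
        turns.append((tag, " ".join(w.word for w in group)))
    return turns


if __name__ == "__main__":
    sample = [
        Word("hello", 1), Word("there", 1),
        Word("hi", 2),
        Word("bye", 1),
    ]
    for tag, text in speaker_turns(sample):
        print(f"Speaker {tag}: {text}")
```

Grouping only consecutive words keeps the order of the conversation, so a speaker who talks twice appears as two separate turns.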

Does Leopard Speech-to-Text perform Truecasing and Punctuation?

Yes. Leopard Speech-to-Text performs Truecasing and Punctuation. Please refer to the Leopard Speech-to-Text documentation to enable or disable automatic punctuation.

Does Leopard Speech-to-Text return Word-level Confidence Scores?

Leopard Speech-to-Text returns Word-level Confidence Scores. Please refer to the Leopard Speech-to-Text documentation for more information.
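One common use of word-level confidence is flagging words for human review. The sketch below assumes each word object has `word` and `confidence` attributes, with confidence between 0 and 1. The `Word` tuple is a stand-in for the example, and both the helper name and the 0.5 default threshold are arbitrary illustrative choices.

```python
from collections import namedtuple

# Stand-in for the SDK's word objects (field names are assumptions).
Word = namedtuple("Word", ["word", "confidence"])


def low_confidence_words(words, threshold=0.5):
    """Return the words whose confidence falls below the threshold."""
    return [w for w in words if w.confidence < threshold]


if __name__ == "__main__":
    sample = [Word("send", 0.97), Word("tu", 0.31), Word("Alice", 0.88)]
    for w in low_confidence_words(sample):
        print(f"review: {w.word} ({w.confidence:.2f})")
```

The right threshold depends on the audio and the cost of a missed error, so it is worth tuning on a sample of your own recordings.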

Does Leopard Speech-to-Text generate Word-level Timestamps?

Leopard Speech-to-Text generates Word-level Timestamps. Please refer to the Leopard Speech-to-Text documentation for more information.
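Word-level timestamps make it straightforward to produce subtitles. The sketch below converts a list of timed words into SubRip (SRT) text, assuming each word object has `word`, `start_sec`, and `end_sec` attributes. The `Word` tuple is a stand-in for the example, and splitting captions every `max_words` words is a simplification chosen for brevity.

```python
from collections import namedtuple

# Stand-in for the SDK's word objects (field names are assumptions).
Word = namedtuple("Word", ["word", "start_sec", "end_sec"])


def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm form used by SRT files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, max_words=7):
    """Group words into fixed-size captions and render them as SRT."""
    blocks = []
    starts = range(0, len(words), max_words)
    for index, start in enumerate(starts, start=1):
        chunk = words[start:start + max_words]
        begin = srt_timestamp(chunk[0].start_sec)
        end = srt_timestamp(chunk[-1].end_sec)
        text = " ".join(w.word for w in chunk)
        blocks.append(f"{index}\n{begin} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        Word("hello", 0.0, 0.4),
        Word("world", 0.5, 1.0),
        Word("again", 1.2, 1.8),
    ]
    print(words_to_srt(sample, max_words=2))
```

A production subtitle generator would more likely break captions at pauses or punctuation than at a fixed word count.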

How do I choose the best speech-to-text for my project?

“Best” is a subjective term. Every use case has different business requirements. Several factors, such as accuracy, availability of features, the total cost of ownership, and data privacy and governance, have different weights in different use cases.

Which platforms does Leopard Speech-to-Text support?

Desktop and Servers: Linux, macOS, and Windows
Web Browsers: Chrome, Safari, Edge, and Firefox
Mobile Devices: Android and iOS
Single Board Computers: Raspberry Pi and NVIDIA Jetson

Which languages does Leopard Speech-to-Text support?

Leopard Speech-to-Text supports English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish.

What should I do if I need support for other languages?

Reach out to Picovoice Sales to tell us about your commercial endeavor.

How do I get technical support for Leopard Speech-to-Text?

Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice AI, Picovoice technology, and how to start building transcription products. You can report bugs and issues on GitHub. If you need help with developing your product, you can purchase the optional Support Add-on or upgrade your account to the Developer Plan.

How can I get informed about updates and upgrades?

Version changes are announced on LinkedIn. Subscribing to GitHub is the best way to get notified of patch releases. If you enjoy building with Leopard Speech-to-Text, show it by giving a GitHub star!
