Cloud Speech API Limited Preview
Speech to text conversion powered by machine learning
Powerful Speech Recognition
Google Cloud Speech API enables developers to convert audio to text by applying powerful neural network models in an easy-to-use API. The API recognizes over 80 languages and variants to support your global user base. You can transcribe the speech of users dictating into an application's microphone, enable command-and-control through voice, or transcribe audio files, among many other use cases. The API recognizes audio uploaded in the request; upcoming releases will integrate with audio stored on Google Cloud Storage.
Over 80 Languages
Speech API recognizes over 80 languages and variants to support your global user base. You can also filter inappropriate content in text results for some languages.
Return Text Results In Real-Time
Speech API can stream text results, returning partial recognition results as they become available, so the recognized text appears while the user is still speaking. Alternatively, Speech API can return recognized text from audio stored in a file.
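To illustrate how an application might display partial results as they arrive, here is a minimal Python sketch. It does not call Speech API. The `(text, is_final)` tuple shape, the `render_transcript` function, and the simulated results are all assumptions made for illustration, not the service's actual response format.

```python
def render_transcript(results):
    """Yield the text to display after each recognition result.

    `results` is an iterable of (text, is_final) tuples. This shape is
    hypothetical and chosen only for this sketch.
    """
    committed = []  # segments the recognizer has marked as final
    for text, is_final in results:
        if is_final:
            # A final result replaces any partial text for this segment.
            committed.append(text)
            yield " ".join(committed)
        else:
            # A partial result is shown provisionally and may be revised.
            yield " ".join(committed + [text])


# Simulated stream: partial results are refined until a final one arrives.
simulated = [
    ("turn", False),
    ("turn on the", False),
    ("turn on the lights", True),
    ("and", False),
    ("and lock the door", True),
]

for line in render_transcript(simulated):
    print(line)
```

Keeping finalized segments separate from the current partial result lets the display update on every result without duplicating text that has already been confirmed.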
Accurate In Noisy Environments
You don’t need advanced signal processing or noise cancellation before sending audio to Speech API. The service can successfully handle noisy audio from a variety of environments.
Powered by Machine Learning
Apply the most advanced deep learning neural network algorithms to your users’ audio for speech recognition with unparalleled accuracy. Speech API accuracy improves over time as new terms are introduced and usage grows.
Speech API Features
- Automatic Speech Recognition: Automatic Speech Recognition (ASR), powered by deep learning neural networks, for applications such as voice search or speech transcription.
- Global Vocabulary: Recognizes over 80 languages and variants with an extensive vocabulary.
- Streaming Recognition: Returns partial recognition results immediately, as they become available.
- Inappropriate Content Filtering: Filters inappropriate content in text results for some languages.
- Real-time or Buffered Audio Support: Audio input can be captured by an application's microphone or sent from a pre-recorded audio file. Multiple audio encodings are supported, including FLAC, AMR, PCMU, and linear-16.
- Noisy Audio Handling: Handles noisy audio from many environments without requiring additional noise cancellation.
- Integrated API: Audio files can be uploaded in the request and, in future releases, integrated with Google Cloud Storage.
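As an illustration of preparing audio in one of the listed encodings, the following Python sketch converts floating-point samples to linear-16, assuming that term means uncompressed 16-bit signed little-endian PCM. The helper names, the 16 kHz sample rate, and the use of base64 to embed audio in a text-based request body are assumptions for this example; the source does not specify the request format.

```python
import base64
import math
import struct


def float_to_linear16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit signed little-endian PCM bytes."""
    # Clip out-of-range samples so they cannot overflow the 16-bit range.
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack("<%dh" % len(ints), *ints)


def sine_wave(freq_hz, sample_rate_hz, duration_s):
    """Generate a test tone as a list of float samples."""
    n = int(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


# Half a second of a 440 Hz tone at an assumed 16 kHz sample rate.
pcm = float_to_linear16(sine_wave(440, 16000, 0.5))

# Base64 is one common way to carry binary audio inside a text payload.
encoded_audio = base64.b64encode(pcm).decode("ascii")
```

Each sample occupies two bytes, so the byte length of `pcm` is twice the number of samples.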
CLOUD SPEECH API PRICING
The service is free to use during the Limited Preview phase. Pricing will be introduced in future phases.