Google Cloud Speech-to-Text is Google Cloud's automatic speech recognition (ASR) service. Its API lets developers convert spoken language into written text with low latency. The API supports multiple languages, accents, and audio formats, which makes it suitable for a wide range of applications. It also offers speaker diarization and word-level confidence scores, which add speaker labels and per-word reliability estimates to the transcript.

With the Google Cloud Speech-to-Text API, developers can integrate speech recognition into their applications to support transcription and voice-based interactions.

Supported Operations

Cloud Speech-to-Text API

Lists operations that match the specified filter in the request. If the server doesn't support this method, it returns `UNIMPLEMENTED`.
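As an illustration of consuming the list operation, here is a minimal sketch of paging through results. It assumes the response is decoded JSON with an `operations` array and an optional `nextPageToken`, as in the standard long-running operations list response. `fetch_page` is a hypothetical stand-in for the actual HTTP call and is not part of any Google library.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_operations(
    fetch_page: Callable[[Optional[str]], Dict],
) -> Iterator[Dict]:
    """Yield every operation across all pages.

    `fetch_page` is a hypothetical callable: it takes a page token (or None
    for the first page) and returns the decoded JSON of one list response.
    """
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        # Assumed response fields: "operations" and "nextPageToken".
        yield from page.get("operations", [])
        token = page.get("nextPageToken")
        if not token:
            break


# Stand-in for the real API call, so the sketch runs offline.
_fake_pages = {
    None: {"operations": [{"name": "op-1"}], "nextPageToken": "t1"},
    "t1": {"operations": [{"name": "op-2"}]},
}
names = [op["name"] for op in iter_operations(lambda t: _fake_pages[t])]
```

Passing the fetcher in as a callable keeps the paging logic separate from authentication and transport, which differ between environments.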

Gets the latest state of a long-running operation. Clients can use this method to poll the operation result at intervals as recommended by the API service.
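The polling pattern described above can be sketched roughly as follows. The `done`, `error`, and `response` fields follow the long-running `Operation` message. `get_operation` is a hypothetical callable that returns the operation as a decoded JSON dict, and the interval and timeout defaults are illustrative assumptions rather than values recommended by the service.

```python
import time
from typing import Callable, Dict


def wait_for_operation(
    get_operation: Callable[[], Dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the operation reports done, then return its response.

    `get_operation` is a hypothetical callable that fetches the latest
    operation state as a dict (for example, via the get operation).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        op = get_operation()
        if op.get("done"):
            # A finished operation carries either an error or a response.
            if "error" in op:
                raise RuntimeError(f"operation failed: {op['error']}")
            return op.get("response", {})
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish in time")
        sleep(interval_s)


# Offline demonstration with canned states instead of real API calls.
_states = iter([
    {"done": False},
    {"done": False},
    {"done": True, "response": {"results": []}},
])
result = wait_for_operation(lambda: next(_states), sleep=lambda s: None)
```

The injectable `sleep` parameter exists only so the loop can be exercised without waiting. In real use the default `time.sleep` applies.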

Performs asynchronous speech recognition: receive results via the `google.longrunning.Operations` interface. Returns either an `Operation.error` or an `Operation.response` which contains a `LongRunningRecognizeResponse` message. For more information on asynchronous speech recognition, see the how-to guide.
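For orientation, a request body for asynchronous recognition might be assembled as in the sketch below. It assumes the v1 REST shape (a `config` object with `encoding`, `sampleRateHertz`, and `languageCode`, plus an `audio` object pointing at a Cloud Storage URI). The helper name, the bucket path, and the default values are hypothetical, and the code only builds the JSON payload without sending it.

```python
import json
from typing import Dict


def build_long_running_request(
    gcs_uri: str,
    language_code: str = "en-US",
    sample_rate_hz: int = 16000,
) -> Dict:
    """Build a request body for asynchronous recognition (assumed v1 shape)."""
    return {
        "config": {
            "encoding": "LINEAR16",  # assumes uncompressed 16-bit PCM audio
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
        # Longer recordings are typically referenced by URI rather than inlined.
        "audio": {"uri": gcs_uri},
    }


# Hypothetical bucket and object name, for illustration only.
body = build_long_running_request("gs://example-bucket/audio.wav")
payload = json.dumps(body)
```

The resulting operation can then be polled until it reports completion.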

Performs synchronous speech recognition: receive results after all audio has been sent and processed.
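Once a synchronous response arrives, the transcript still has to be pulled out of it. The sketch below assumes the response is decoded JSON with a `results` list, where each result holds `alternatives` ordered best-first and each alternative has a `transcript` and an optional `confidence`. The helper name and the sample data are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple


def best_transcripts(response: Dict) -> List[Tuple[str, Optional[float]]]:
    """Return (transcript, confidence) for the top alternative of each result."""
    out: List[Tuple[str, Optional[float]]] = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            top = alternatives[0]  # assumed to be the most likely hypothesis
            out.append((top.get("transcript", ""), top.get("confidence")))
    return out


# Fabricated sample response, used only to exercise the helper offline.
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "hello world", "confidence": 0.98}]},
        {"alternatives": [{"transcript": "second segment", "confidence": 0.91}]},
    ]
}
segments = best_transcripts(sample_response)
full_text = " ".join(text for text, _ in segments)
```

Confidence is returned as `None` when the field is absent, so callers can tell a missing score apart from a score of zero.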


This item is available for early access. It is still in development and may contain experimental features or limitations.
