Description

Azure AI Vision is a unified service that offers innovative computer vision capabilities. Give your apps the ability to analyze images, read text, and detect faces with prebuilt image tagging, text extraction with optical character recognition (OCR), and responsible facial recognition. Incorporate vision features into your projects with no machine learning experience required.

The Computer Vision API provides state-of-the-art algorithms to process images and return information. For example, it can determine whether an image contains mature content, or find all the faces in an image. Other features include estimating dominant and accent colors, categorizing the content of images, and describing an image in complete English sentences. It can also generate image thumbnails for displaying large images effectively.

Supported Operations

Computer Vision API (v3.2)

Analyze Image

This operation extracts a rich set of visual features based on the image content.

Two input methods are supported: (1) uploading an image or (2) specifying an image URL. Within your request, there is an optional parameter to allow you to choose which features to return. By default, image categories are returned in the response.

A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.

HTTP Method: POST
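The two input methods can be sketched as follows, using only the Python standard library. This is an illustration, not the service's official client: the route `/vision/v3.2/analyze`, the `visualFeatures` query parameter, the `Ocp-Apim-Subscription-Key` header, and the `{"url": ...}` body shape are assumptions not stated on this page, and `build_analyze_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request


def build_analyze_request(endpoint, key, *, image_url=None, image_bytes=None,
                          features=None):
    """Build (but do not send) a POST request for Analyze Image.

    Exactly one of image_url or image_bytes must be given.
    """
    if (image_url is None) == (image_bytes is None):
        raise ValueError("pass exactly one of image_url or image_bytes")

    # Assumed v3.2 route; adjust it to match your resource.
    url = endpoint.rstrip("/") + "/vision/v3.2/analyze"
    if features:
        # Optional parameter: choose which visual features to return.
        query = urllib.parse.urlencode({"visualFeatures": ",".join(features)})
        url += "?" + query

    headers = {"Ocp-Apim-Subscription-Key": key}  # assumed header name
    if image_url is not None:
        # Input method 2: specify an image URL in a JSON body.
        body = json.dumps({"url": image_url}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    else:
        # Input method 1: upload the raw image bytes.
        body = image_bytes
        headers["Content-Type"] = "application/octet-stream"

    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response. Omitting `features` leaves the query string off, so the service falls back to its default of returning image categories.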

Get Area of Interest

This operation returns a bounding box around the most important area of the image.

A successful response will be returned in JSON. Upon failure, the error code and an error message are returned. The error code could be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, InvalidThumbnailSize, NotSupportedImage, FailedToProcess, Timeout, or InternalServerError.

HTTP Method: POST
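A caller typically wants to turn a failure body into a typed exception. The sketch below does that for the error codes listed above. The JSON shape of the error body is an assumption: the code accepts either a flat `{"code", "message"}` object or the same fields wrapped in an `"error"` object, since this page does not specify which one the service uses. `VisionError` and `parse_error` are hypothetical names.

```python
import json

# Error codes listed for this operation in the text above.
KNOWN_CODES = frozenset({
    "InvalidImageUrl", "InvalidImageFormat", "InvalidImageSize",
    "InvalidThumbnailSize", "NotSupportedImage", "FailedToProcess",
    "Timeout", "InternalServerError",
})


class VisionError(Exception):
    """Carries the service's error code and message."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        # True when the code is one of the documented values.
        self.known = code in KNOWN_CODES


def parse_error(body):
    """Convert a JSON error body (str or bytes) into a VisionError."""
    data = json.loads(body)
    # Assumption: fields are either top-level or nested under "error".
    err = data["error"] if isinstance(data.get("error"), dict) else data
    return VisionError(err.get("code", "Unknown"), err.get("message", ""))
```

For example, `parse_error('{"code": "InvalidImageUrl", "message": "bad url"}')` yields an exception whose `code` is `InvalidImageUrl` and whose `known` flag is set.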

Describe Image

This operation generates a description of an image in human readable language with complete sentences. The description is based on a collection of content tags, which are also returned by the operation. More than one description can be generated for each image. Descriptions are ordered by their confidence score. All descriptions are in English.

Two input methods are supported: (1) uploading an image or (2) specifying an image URL.

A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.

HTTP Method: POST
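Because several descriptions can come back with confidence scores, a client usually picks the most confident one. The sketch below assumes the parsed JSON response has the shape `{"description": {"tags": [...], "captions": [{"text": ..., "confidence": ...}]}}`; that layout is not given on this page, and `best_caption` is a hypothetical helper.

```python
def best_caption(response):
    """Return (text, confidence) of the most confident caption, or None."""
    captions = response.get("description", {}).get("captions", [])
    if not captions:
        return None
    # The service orders captions by confidence; max() is a defensive check
    # that does not rely on that ordering.
    top = max(captions, key=lambda c: c.get("confidence", 0.0))
    return top["text"], top.get("confidence", 0.0)
```

The content tags the description is based on would sit alongside the captions, under `response["description"]["tags"]` in this assumed layout.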

Detect Objects

This operation performs object detection on the specified image.

Two input methods are supported: (1) uploading an image or (2) specifying an image URL.

A successful response is returned in JSON. If the request fails, the response contains an error code and a message to help understand what went wrong.

HTTP Method: POST

Get Thumbnail

This operation generates a thumbnail image with the user-specified width and height. By default, the service analyzes the image, identifies the region of interest (ROI), and generates smart cropping coordinates based on the ROI. Smart cropping helps when you specify an aspect ratio that differs from that of the input image.

A successful response contains the thumbnail image binary. If the request fails, the response contains an error code and a message to help determine what went wrong. The error code could be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, InvalidThumbnailSize, NotSupportedImage, FailedToProcess, Timeout, or InternalServerError.

HTTP Method: POST
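The relationship between the requested size and smart cropping can be sketched as below. The route `/vision/v3.2/generateThumbnail` and the query parameter names `width`, `height`, and `smartCropping` are assumptions not stated on this page; both function names are hypothetical.

```python
import urllib.parse


def aspect_ratio_differs(src_w, src_h, dst_w, dst_h):
    """True when the thumbnail's aspect ratio differs from the source's.

    Compares by cross-multiplication to avoid floating-point division.
    """
    return src_w * dst_h != src_h * dst_w


def thumbnail_url(endpoint, width, height, smart_cropping=True):
    """Build the request URL for a thumbnail of the given size."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    query = urllib.parse.urlencode({
        "width": width,
        "height": height,
        # Assumed flag name; serialized as lowercase true/false.
        "smartCropping": str(bool(smart_cropping)).lower(),
    })
    return f"{endpoint.rstrip('/')}/vision/v3.2/generateThumbnail?{query}"
```

A 1920x1080 source rendered as a 100x100 thumbnail has a different aspect ratio, which is the case where cropping around the region of interest matters. Since a successful response is binary image data, write it to disk in binary mode (`open(path, "wb")`) rather than decoding it as JSON.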

List Domain Specific Models

This operation returns the list of domain-specific models that are supported by the Computer Vision API. Currently, the API supports the following domain-specific models: celebrity recognizer and landmark recognizer.

A successful response is returned in JSON. If the request fails, the response contains an error code and a message to help understand what went wrong.

HTTP Method: GET

Recognize Domain Specific Content

This operation recognizes content within an image by applying a domain-specific model. The list of domain-specific models that are supported by the Computer Vision API can be retrieved using the /models GET request. Currently, the API provides the following domain-specific models: celebrities and landmarks.

Two input methods are supported: (1) uploading an image or (2) specifying an image URL.

A successful response is returned in JSON. If the request fails, the response contains an error code and a message to help understand what went wrong.

HTTP Method: POST
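The pairing of the /models listing with a per-model analysis call can be sketched as two URL builders. The `/vision/v3.2` prefix and the `/models/{model}/analyze` route are assumptions, since only the `/models` GET request is named on this page. The helper names are hypothetical, and the hard-coded model tuple mirrors the two models listed above rather than a live lookup.

```python
import urllib.parse

# The two models named in the text; a real client would fetch this list
# from the /models GET request instead of hard-coding it.
SUPPORTED_MODELS = ("celebrities", "landmarks")


def models_url(endpoint):
    """URL for listing domain-specific models (GET)."""
    return endpoint.rstrip("/") + "/vision/v3.2/models"


def domain_analyze_url(endpoint, model):
    """URL for applying one domain-specific model to an image (POST)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    # Assumed route shape: /models/{model}/analyze
    safe_model = urllib.parse.quote(model, safe="")
    return f"{models_url(endpoint)}/{safe_model}/analyze"
```

Validating the model name before sending avoids a round trip that the service would reject anyway.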

OCR

Optical Character Recognition (OCR) detects text in an image and extracts the recognized characters into a machine-usable character stream.

Upon success, the OCR results will be returned.

Upon failure, the error code together with an error message will be returned. The error code can be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, NotSupportedImage, NotSupportedLanguage, or InternalServerError.

HTTP Method: POST
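Turning the OCR results into a plain character stream might look like the sketch below. It assumes the parsed JSON nests text as regions, then lines, then words, with each word carrying a `"text"` field. That layout is not described on this page, and `ocr_text` is a hypothetical helper.

```python
def ocr_text(result):
    """Flatten an OCR result into newline-separated lines of text."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            # Join the recognized words of one line with single spaces.
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return "\n".join(lines)
```

Joining words with a space suits languages that separate words with whitespace; other scripts may need a different separator.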

Read

Use this call to perform a Read operation. The Read API is optimized for text-heavy images and for multi-page, mixed-language, and mixed-type documents. The Read operation executes asynchronously. When you call the Read operation, the call returns with a response header called 'Operation-Location', which contains a URL with the Operation Id to be used in the second step. In the second step, you use the Get Read Result operation to fetch the detected text lines and words as part of the JSON response. The time needed to complete text extraction depends on the volume of text and the number of pages in the document.

See https://aka.ms/ocr-languages for the list of supported languages.

Get Read Result

Use this operation to retrieve the status and OCR result of a Read operation. The input is the 'operationId' from the 'Operation-Location' response header returned by the Read operation. For example, if the Operation Id in that header is 49a36324-fc4b-4387-aa06-090cfbf0064f, that value is used as the 'operationId' parameter to the Get Read Result operation.
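The two-step flow (submit, then poll with the Operation Id) can be sketched as below. Several details are assumptions not given on this page: that the Operation Id is the last path segment of the 'Operation-Location' URL, that the result JSON carries a `"status"` field with values such as `"running"`, `"succeeded"`, and `"failed"`, and that recognized lines sit under `analyzeResult.readResults[].lines[].text`. The caller-supplied `fetch_status` function stands in for the actual HTTP GET; all names are hypothetical.

```python
import time
from urllib.parse import urlparse


def operation_id_from_location(operation_location):
    """Extract the Operation Id (assumed: last URL path segment)."""
    path = urlparse(operation_location).path
    return path.rstrip("/").rsplit("/", 1)[-1]


def wait_for_read_result(fetch_status, operation_id, *, interval=1.0,
                         max_attempts=30, sleep=time.sleep):
    """Poll Get Read Result until the operation finishes.

    fetch_status(operation_id) must return the parsed JSON response.
    """
    for _ in range(max_attempts):
        result = fetch_status(operation_id)
        status = result.get("status")
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError(f"Read operation {operation_id} failed")
        # Still queued or running: wait before asking again.
        sleep(interval)
    raise TimeoutError(f"Read operation {operation_id} did not finish")


def extract_lines(result):
    """Collect the text of every detected line, page by page."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]
```

Because completion time grows with the amount of text and the number of pages, `interval` and `max_attempts` are worth tuning per workload rather than leaving at these placeholder defaults.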

Tag Image

This operation generates a list of words, or tags, that are relevant to the content of the supplied image. The Computer Vision API can return tags based on objects, living beings, scenery or actions found in images. Unlike categories, tags are not organized according to a hierarchical classification system, but correspond to image content. Tags may contain hints to avoid ambiguity or provide context, for example the tag "ascomycete" may be accompanied by the hint "fungus".

Two input methods are supported: (1) uploading an image or (2) specifying an image URL.

A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.

HTTP Method: POST
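Displaying tags together with their hints can be sketched as below. The response layout `{"tags": [{"name": ..., "confidence": ..., "hint": ...}]}` is an assumption, with `hint` present only on some tags. `format_tags` and its `min_confidence` threshold are hypothetical additions for illustration.

```python
def format_tags(response, min_confidence=0.0):
    """Return display labels for tags, appending the hint when present."""
    labels = []
    for tag in response.get("tags", []):
        if tag.get("confidence", 0.0) < min_confidence:
            continue  # drop low-confidence tags
        label = tag["name"]
        hint = tag.get("hint")
        if hint:
            # e.g. "ascomycete (fungus)"
            label = f"{label} ({hint})"
        labels.append(label)
    return labels
```

Since tags form a flat list rather than a hierarchy, filtering by confidence is the simplest way to trim the list for display.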

Details
Preview

This item is available for early access. It is still in development and may contain experimental features or limitations.

Last Update

4 months ago

Includes
azure-ai-vision-api