Image analysis has become a crucial tool in various applications, driving innovation and enhancing the user experience. In this context, integrating Azure’s Computer Vision API with the .NET ecosystem provides a powerful and accessible approach for developers seeking to enrich their applications with advanced artificial intelligence features.
What Is Computer Vision?
Computer Vision is a component of Microsoft’s Cognitive Services, specifically designed to perform advanced computer vision tasks. This service offers a cloud-based API that enables developers to integrate image analysis features into their applications, utilizing cutting-edge machine learning and artificial intelligence technologies. Its capabilities include object recognition, text extraction, face detection, scene categorization, and much more, providing valuable insights from images.
How Computer Vision Works
The service works by sending images to the API on Azure, which then processes the images using machine learning models trained for various computer vision tasks. The results are returned in JSON format, containing detailed information about the content of the images, such as tags, descriptions, categories, and detected objects.
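As a rough illustration of this request/response flow, the Analyze Image REST operation can be called directly with HttpClient before we move to the SDK. This is only a sketch: the endpoint, key, and image file name below are placeholders, and v3.2 is one of the published API versions.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class RawAnalyzeExample
{
    // Placeholders: replace with the endpoint and key of your own resource.
    const string Endpoint = "https://YOUR_RESOURCE.cognitiveservices.azure.com";
    const string Key = "YOUR_KEY";

    static async Task Main()
    {
        using var http = new HttpClient();
        http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", Key);

        // The features to extract are passed in the query string;
        // the image itself is sent as raw binary in the request body.
        var url = $"{Endpoint}/vision/v3.2/analyze?visualFeatures=Description,Tags";
        using var content = new ByteArrayContent(File.ReadAllBytes("sample.jpg"));
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

        var response = await http.PostAsync(url, content);

        // The service answers with a JSON document describing the image.
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
```

The SDK used in the rest of this article wraps exactly this exchange, so you rarely need to craft the HTTP request by hand.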
Common Applications
Computer Vision is versatile and can be applied across various industries and use cases. Below are some highlights:
Detection of Suspicious Objects or Intruders in Surveillance Systems: With the computer vision service, security cameras can be configured to automatically identify suspicious objects, such as weapons or abandoned bags, as well as anomalous behavior in monitored areas. Identity verification can also be performed through facial detection and recognition, comparing faces against a database of known individuals.
Analysis of Medical Images to Assist in Diagnosis: Monitor the progression of diseases over time by comparing medical images at different stages of treatment. Assist doctors in interpreting complex images by highlighting areas of interest and providing second opinions based on trained models.
Obstacle Detection and Scene Analysis in Autonomous Vehicles: Identify and classify obstacles on the road, such as other vehicles, pedestrians, animals, and static objects. Understand the context of a scene, such as recognizing traffic signs, road markings, and traffic lights. Trigger automatic braking systems, obstacle avoidance, and other corrective actions in response to identified imminent dangers.
Benefits
Among the benefits of the Computer Vision service, we can highlight:
Accuracy and Efficiency: The use of advanced algorithms ensures precise and efficient image analysis.
Scalability: The service is highly scalable, allowing the processing of large volumes of images without compromising performance.
Ease of Integration: Well-documented RESTful APIs make it easy to integrate with various applications and platforms.
Maintenance and Updates: Microsoft continuously updates and improves its models, ensuring that analyses are always based on the most recent and effective algorithms.
Creating the Computer Vision Service
To use the computer vision service, you need to have an active Azure subscription. Then, access the Azure portal and click on Create a new resource. Search for “Computer Vision”.
Click on the “Computer Vision” item, and then a screen similar to the image below will appear. Click on “Create Computer Vision.”
Fill in the following information:
- Subscription
- Resource Group
- Resource Group Region (if creating a new Resource Group)
- Region
- Name
- Pricing Tier
In the image below, you can see the settings used in this example.
After filling in all the information, click on Review + Create, and then click Create.
Next, the application screen will appear, where you will find the two access keys, the location/region, and the application endpoint. These three pieces of information will be important when configuring our API.
Creating the Sample Project
For our demonstration, we will integrate the Azure Computer Vision API into an ASP.NET Core application. The code will demonstrate how to configure an endpoint that receives an image and uses Azure Cognitive Services to analyze it. The analysis will include descriptions, tags, categories, object detection, brands, faces, image types, adult content, and colors.
To get started, create a new ASP.NET Web API project using the following command:
dotnet new webapi --name VisionAPI --use-controllers
With the ASP.NET Core Web API project created, install the following package:
dotnet add package Microsoft.Azure.CognitiveServices.Vision.ComputerVision
Create a controller named “VisionController” with an action “AnalyzeImage” that will be responsible for receiving the image to be analyzed. The initial structure of this controller can be seen below:
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

namespace VisionAPI.Controllers
{
    [Route("api/[controller]")]
    [ApiController]
    public class VisionController : ControllerBase
    {
        [HttpPost]
        public async Task<IActionResult> AnalyzeImage(IFormFile imageFile)
        {
            try
            {
                // main code here
            }
            catch (Exception ex)
            {
                return StatusCode(500, $"Error analyzing the image: {ex.Message}");
            }
        }
    }
}
Now, inside our “try” block, we will add the data and configuration needed to connect to the Computer Vision API. Remember that the endpoint includes the name you assigned to the resource when you created it.
const string SubscriptionKey = "YOUR_KEY";
const string Endpoint = "https://YOUR_APP_NAME.cognitiveservices.azure.com/";
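Hard-coding the key and endpoint is fine for a demo, but in a real project it is safer to read them from configuration. Below is a possible sketch, assuming a hypothetical "AzureVision" section in appsettings.json (the section and key names are my own, not a convention of the SDK):

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Configuration;

// appsettings.json (hypothetical section):
// "AzureVision": {
//   "Key": "YOUR_KEY",
//   "Endpoint": "https://YOUR_APP_NAME.cognitiveservices.azure.com/"
// }

[Route("api/[controller]")]
[ApiController]
public class VisionController : ControllerBase
{
    private readonly string _subscriptionKey;
    private readonly string _endpoint;

    // ASP.NET Core injects IConfiguration automatically,
    // so the secrets never need to live in source code.
    public VisionController(IConfiguration configuration)
    {
        _subscriptionKey = configuration["AzureVision:Key"];
        _endpoint = configuration["AzureVision:Endpoint"];
    }
}
```

For production scenarios, the same keys can come from environment variables or Azure Key Vault without any change to the controller.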
Next, we will create the client by instantiating the ComputerVisionClient class and providing the necessary service credentials. The credentials are supplied through the ApiKeyServiceClientCredentials class, which receives our subscription key.
var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials(SubscriptionKey))
{
Endpoint = Endpoint
};
Next, we will declare the imageStream variable, which opens a read stream over the image file. This stream (imageStream) contains the binary data of the image uploaded by the user.
using (var imageStream = imageFile.OpenReadStream())
{
// analysis code here
}
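Before opening the stream, it is worth guarding against an empty upload. A minimal check at the start of the action might look like this (the error message is just a suggestion):

```csharp
// At the beginning of AnalyzeImage, reject requests without a usable file
// so we never open a stream over a null or zero-length upload.
if (imageFile is null || imageFile.Length == 0)
{
    return BadRequest("No image file was provided.");
}
```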
Within the code above, we will define the visual features that we want the API to extract from the image. This is done using a list of VisualFeatureTypes.
var visualFeatures = new List<VisualFeatureTypes?>
{
VisualFeatureTypes.Description,
VisualFeatureTypes.Tags,
VisualFeatureTypes.Categories,
VisualFeatureTypes.Objects,
VisualFeatureTypes.Brands,
VisualFeatureTypes.Faces,
VisualFeatureTypes.ImageType,
VisualFeatureTypes.Adult,
VisualFeatureTypes.Color,
};
Each item in the list represents a type of analysis that will be performed on the image:
- Description: Generates a textual description of the image, including captions and tags summarizing the content.
- Tags: Produces a list of keywords that describe the visual elements in the image.
- Categories: Classifies the image into one or more predefined categories that help group similar images.
- Objects: Detects and identifies specific objects within the image, providing their locations and types.
- Brands: Recognizes logos and trademarks present in the image.
- Faces: Identifies human faces in the image and provides information such as age, gender, and coordinates of the faces.
- ImageType: Determines the type of the image, such as a photo, drawing, clipart, etc.
- Adult: Evaluates the image to detect adult, explicit, or inappropriate content.
- Color: Analyzes the predominant colors in the image, identifying background and primary colors.
After defining the visual features, the next step is to execute the analysis. This is done by calling the AnalyzeImageInStreamAsync method of the Azure Computer Vision API.
var analysis = await client.AnalyzeImageInStreamAsync(imageStream, visualFeatures: visualFeatures, language: "pt");
In this code, “imageStream” is the data stream of the image to be analyzed; here it comes from the file uploaded by the user via “imageFile.OpenReadStream()”. “visualFeatures” is the list of visual features we defined earlier. The “language” parameter specifies the language of the results returned by the API, set here to “pt” (Portuguese).
After that, we return the result in JSON format:
return Ok(analysis);
To test our implementation, we can now analyze an image. The returned JSON will contain the visual features listed in the “visualFeatures” variable, along with the results of the analysis.
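If you prefer not to expose the entire raw result, you can also project only the fields you care about before returning. The sketch below assumes the response shape of the SDK's ImageAnalysis model and requires a using directive for System.Linq:

```csharp
using System.Linq;

// Build a small anonymous summary from the ImageAnalysis result
// instead of returning the full object graph.
var summary = new
{
    // First caption produced by the Description feature, if any.
    Caption = analysis.Description?.Captions?.FirstOrDefault()?.Text,

    // Tag names with their confidence scores.
    Tags = analysis.Tags?.Select(t => new { t.Name, t.Confidence }),

    // Names of the objects detected in the image.
    Objects = analysis.Objects?.Select(o => o.ObjectProperty)
};
return Ok(summary);
```

This keeps the API contract small and stable even if Azure adds new fields to the analysis response.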
Conclusion
In this article, we explored how to integrate Azure’s Computer Vision API into an ASP.NET Core application. Through code examples, we saw how to configure an endpoint to receive an image and use Azure’s cognitive services to perform detailed analysis, including descriptions, tags, object detection, face recognition, and more.
By following the steps described here, developers can enhance their applications with advanced computer vision capabilities, providing users with a richer and smarter experience. Additionally, the flexibility and ease of integration of Azure Cognitive Services allow these features to be implemented efficiently and scalably in a wide range of application scenarios.
Therefore, by harnessing the powerful features of Azure’s Computer Vision API, developers can open up new creative possibilities and deliver innovative solutions that meet the ever-evolving demands of the market.