Empower your editors with ExamineX's AI capabilities to transform your media library! Seamlessly search using auto-generated Descriptions, Tags, Categories, Locations, People, and more.
You can also have the contents of your Umbraco media files (e.g. PDF, MS Word, etc.) indexed automatically with ExamineX without the need for additional indexes.
Requirements
This feature is specifically targeted at Umbraco media when configured with the Umbraco.StorageProviders.AzureBlob package or, depending on your Umbraco version, the UmbracoFileSystemProviders.Azure.Media (>= 2.0.0) package.
If you are using Azure to host your Umbraco website it is recommended to use blob storage as your media provider. This provides more flexibility with scaling your solution along with the benefits of CDN support.
With this ExamineX extension, the content of your media document files, such as PDFs and Microsoft Office documents, is indexed automatically. With AI capabilities enabled, your images are also scanned to produce auto-generated metadata such as OCR-extracted text, Descriptions, Tags, Categories, Locations, People, and more.
Installation
After installing and configuring ExamineX:
Install, configure and test the Umbraco.StorageProviders.AzureBlob package for your media.
Then install the ExamineX.AzureSearch.Umbraco.Media NuGet package.
Then you'll need to enable the integration in your Startup.cs by adding the .AddExamineXAzureSearchForMedia() call after the .AddExamineXAzureSearch() call:
public void ConfigureServices(IServiceCollection services)
{
    services.AddUmbraco(_env, _config)
        .AddBackOffice()
        .AddWebsite()
        .AddDeliveryApi()
        .AddComposers()
        .AddExamineXAzureSearch()
        .AddExamineXAzureSearchForMedia()
        .Build();
}
Install, configure and test the Umbraco.StorageProviders.AzureBlob package for your media.
Then install the ExamineX.AzureSearch.Umbraco.BlobMedia NuGet package.
Then you'll need to enable the integration in your Startup.cs by adding the .AddExamineXForBlobMedia() call after the .AddExamineXAzureSearch() call:
public void ConfigureServices(IServiceCollection services)
{
    services.AddUmbraco(_env, _config)
        .AddBackOffice()
        .AddWebsite()
        .AddDeliveryApi()
        .AddComposers()
        .AddExamineXAzureSearch()
        .AddExamineXForBlobMedia()
        .Build();
}
Install, configure and test the Umbraco.StorageProviders.AzureBlob package for your media.
Then install the ExamineX.AzureSearch.Umbraco.BlobMedia NuGet package.
Install, configure and test the UmbracoFileSystemProviders.Azure.Media package for your media.
Then install the ExamineX.AzureSearch.Umbraco.BlobMedia NuGet package.
Once the package is installed, any PDF files, MS Office document files, and other file types will automatically be indexed and stored in your corresponding internal/external Umbraco indexes with the field name content.
NOTE: The field name "content" cannot be changed. This is a limitation of Azure Search's field mapping. For this to work you should not have a Property Type called content.
AI Integration (v6.1+)
Once the above package is installed, you can configure the AI integration for image analysis.
Quick config
The simplest configuration to enable this feature with the default settings, in your appsettings.json file, is:
{
  "ExamineX": {
    "AzureSearch": {
      "Media": {
        "EnableImageAnalysis": true
      }
    }
  }
}
Once this is enabled and the configured Azure Search indexer has run (e.g. after a media item is saved), the media items in the index will be populated with AI-generated details as per your configuration. ExamineX also automatically configures the Umbraco back office search to include the relevant fields so that your editors can easily find media based on the AI-generated information. For example, for this image:
It will generate and index this information:
Now, when your editors search in the backoffice, media will be found based on what is in the image:
Similarly, when searching for media in a media picker, ExamineX has configured the search to work against the index so your editors can find the media they need quickly:
Integrating UI
The ExamineX.AzureSearch.Umbraco.Media package will install a new Property Editor called ExamineX Media Info, which is a read-only property editor that displays the generated/indexed information for a media item based on the fields populated by Azure AI Search.
The simplest way to integrate this is to:
- Create a new Data Type based on the ExamineX Media Info property editor.
- Update your Image Media Type and add a Property Type that uses your newly created Data Type.
ExamineX Media configuration options
Name | Description | Default value |
---|---|---|
EnableImageAnalysis | Enables image analysis | FALSE |
ExcludedFileNameExtensions | File types to be excluded from being indexed | |
ImageAnalysisDefaultLanguage | The default language code applied to the Azure Search ImageAnalysisSkill | "en" |
ImageAnalysisFeatures | An array of the image analysis features to be enabled. The options are: "ocr", "categories", "description", "brands", "tags", "celebrities", "landmarks". See Azure Search docs for more details. | ["ocr", "categories", "description", "brands", "tags"] |
AzureAiServicesKey | [Optional] Sets the Azure AI (Cognitive Services) Key to use for billing | When empty, image analysis billing will be attached to the same account as the Azure Search service. See Azure Search docs for more details. |
IndexingScheduleInterval | The time interval configured for the Azure Search indexer that scans blob storage media files to re-index new changes. Whenever media is changed in Umbraco, the indexer is manually triggered so the scanning will happen in near real time. | 5 hours |
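As a rough sketch of how several of the options above might be combined, the fragment below extends the quick config. The key names come from the table; placing them next to EnableImageAnalysis under ExamineX:AzureSearch:Media is an assumption based on the quick config example, and the key value is a placeholder. ExcludedFileNameExtensions and IndexingScheduleInterval are left out because this page does not state the value format they expect.

```json
{
  "ExamineX": {
    "AzureSearch": {
      "Media": {
        // Assumption: these options sit alongside EnableImageAnalysis
        "EnableImageAnalysis": true,
        "ImageAnalysisDefaultLanguage": "en",
        // Defaults plus "landmarks", using the option names listed in the table
        "ImageAnalysisFeatures": [ "ocr", "categories", "description", "brands", "tags", "landmarks" ],
        // Placeholder value; leave this out to bill against the Azure Search service account
        "AzureAiServicesKey": "<your-azure-ai-services-key>"
      }
    }
  }
}
```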
Index fields
Several index fields will be created/used based on the image analysis features enabled:
Field name | Description |
---|---|
content | The contents of document media files such as PDFs, MS Word, etc. |
imageOcr | The extracted OCR text of the image |
imageDescription | The AI generated description of the image |
imageBrands | The detected Brand names found in the image |
imageCategories | The AI generated categories of the image |
imageCategoriesCelebrities | The detected Celebrity names found in the image |
imageCategoriesLandmarks | The detected Landmark names found in the image |
imageTags | The AI generated tags of the image |
Searching
Searching this content works exactly the same way as searching any other field in Examine. For example, if you wanted to search for a term within the contents of a media file in the ExternalIndex, you could do:
// Assumes ExamineManager (an IExamineManager) and searchTerm (a string) are already in scope
if (ExamineManager.TryGetIndex("ExternalIndex", out var index))
{
    var searcher = index.GetSearcher();
    // Query on the 'content' field for media
    var results = searcher
        .CreateQuery("media")
        .Field("content", searchTerm)
        .Execute();
}
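The same approach should carry over to the image analysis fields. The sketch below is one possible way to search several of them at once using Examine's GroupedOr query; it is not taken from the ExamineX documentation. The class name MediaImageSearchService and the method FindMediaIds are hypothetical, the field names come from the Index fields table, and which fields actually hold data depends on the ImageAnalysisFeatures you have enabled. It mirrors the GetSearcher() call used above; if your Examine version exposes a Searcher property on the index instead, substitute that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Examine;

// Hypothetical helper for illustration; not part of ExamineX.
public class MediaImageSearchService
{
    // Field names from the "Index fields" table. Assumption: these four
    // are populated by the image analysis features you have enabled.
    private static readonly string[] ImageFields =
    {
        "imageDescription", "imageTags", "imageCategories", "imageOcr"
    };

    private readonly IExamineManager _examineManager;

    public MediaImageSearchService(IExamineManager examineManager)
        => _examineManager = examineManager;

    // Returns the IDs of media items where any image field matches the term.
    // Returns an empty list for a blank term or when the index is not found.
    public IReadOnlyList<string> FindMediaIds(string searchTerm)
    {
        if (string.IsNullOrWhiteSpace(searchTerm)
            || !_examineManager.TryGetIndex("ExternalIndex", out var index))
        {
            return Array.Empty<string>();
        }

        var results = index.GetSearcher()
            .CreateQuery("media")
            .GroupedOr(ImageFields, searchTerm)
            .Execute();

        return results.Select(r => r.Id).ToList();
    }
}
```

Registering the class with dependency injection lets Umbraco supply the IExamineManager. Keeping the field list in one place makes it easy to adjust when you enable or disable image analysis features.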