Azure Search comes with many more features than the default Lucene engine in Examine and you can leverage these features with ExamineX.

Analyzers

Azure Search offers many more analyzers than Lucene does, and some of them may work better than Lucene’s, depending on your requirements. For example, Microsoft’s documentation states:

The default analyzer is Standard Lucene, which works well for English, but perhaps not as well as Lucene’s English analyzer or Microsoft’s English analyzer.

Default analyzers

The default analyzers used in ExamineX are configured to match the ones shipped by default in Umbraco:

  • InternalIndex: keyword_lowercase_asciifolding. This is a custom analyzer that ExamineX adds to the index: the keyword (i.e. whitespace) analyzer with both the lowercase and asciifolding filters applied.
  • ExternalIndex: standard.lucene, the Lucene standard analyzer
  • MembersIndex: keyword_lowercase_asciifolding (as above)
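
As a rough illustration of what the lowercase and asciifolding filters do to a token, the sketch below approximates them in plain C#. The LowercaseAsciiFold helper is hypothetical and is not part of ExamineX or Azure Search. It only strips combining accents, whereas the real asciifolding filter covers more characters (for example, it is expected to fold "ß" to "ss", which this sketch does not do).

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AnalyzerSketch
{
    // Hypothetical helper for illustration only: approximates the effect of
    // the lowercase + asciifolding filters on a single token.
    public static string LowercaseAsciiFold(string token)
    {
        // Decompose accented characters into base letter + combining mark(s).
        var decomposed = token.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder(decomposed.Length);

        foreach (var c in decomposed)
        {
            // Drop the combining marks, keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }

        return builder.ToString().ToLowerInvariant();
    }

    public static void Main()
    {
        Console.WriteLine(LowercaseAsciiFold("Crème Brûlée")); // prints "creme brulee"
    }
}
```

With filters like these applied, a search for "creme" can match content stored as "Crème", which is usually what you want for back-office style lookups.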

You can change the default analyzer used for each index by replacing the IUmbracoIndexesCreator in your own Umbraco composer.

Example:

public class MyComposer : IComposer
{
    public void Compose(Composition composition)
    {
        // Replace the default ExamineX implementation
        composition.RegisterUnique<IUmbracoIndexesCreator, MyUmbracoIndexesCreator>();
    }
}

// Override the default ExamineX implementation
public class MyUmbracoIndexesCreator : ExamineXIndexFactory
{
    public MyUmbracoIndexesCreator(
        ExamineXConfig examineXConfig, IUmbracoIndexConfig umbracoIndexConfig, IExamineXLogger logger, 
        ILicenseManager licenceManager, IRuntimeState runtimeState, UmbracoIndexesCreator defaultFactory) 
    : base(examineXConfig, umbracoIndexConfig, logger, licenceManager, runtimeState, defaultFactory)
    {
    }

    // Replace the default analyzer for the internal index with
    // Microsoft's English analyzer
    protected override IIndex CreateInternalIndex(string defaultAnalyzer) 
        => base.CreateInternalIndex(AnalyzerName.AsString.EnMicrosoft);
}

Custom field analyzers

You can change the analyzer used per field in ExamineX in almost the same way as in Examine. However, ExamineX has slightly different field definition types:

  • AzureSearchFieldDefinitionTypes.FullText - Default. The field will be indexed with the index’s default Analyzer without any fancy type storage or sortability. Generally this is fine for normal text searching.
  • AzureSearchFieldDefinitionTypes.FullTextSortable - Indexed in the same way as FullText, but also enables sorting on this field in search results.
  • AzureSearchFieldDefinitionTypes.FullTextMultiValue - Indexed in the same way as FullText, but allows multiple values per field. Sorting is not possible on fields with multiple values.
  • AzureSearchFieldDefinitionTypes.Integer - Stored as a numerical structure.
  • AzureSearchFieldDefinitionTypes.Double - Stored as a numerical structure.
  • AzureSearchFieldDefinitionTypes.Long - Stored as a numerical structure.
  • AzureSearchFieldDefinitionTypes.DateTime - Stored as a DateTime.
  • AzureSearchFieldDefinitionTypes.Raw - Will be indexed with the keyword analyzer so searching will only match with an exact value.

The simplest way to add, remove or modify field value types is at runtime in your own Umbraco component.

You can modify the field definitions for an index at runtime by using any of the following methods:

  • index.FieldDefinitionCollection.TryAdd
  • index.FieldDefinitionCollection.AddOrUpdate
  • index.FieldDefinitionCollection.GetOrAdd

Example:

public class MyComposer : ComponentComposer<MyComponent>
{
}

public class MyComponent : IComponent
{
    private readonly IExamineManager _examineManager;

    public MyComponent(IExamineManager examineManager)
    {
        _examineManager = examineManager;
    }

    public void Initialize()
    {
        if (!_examineManager.TryGetIndex(
            Constants.UmbracoIndexes.ExternalIndexName,
            out var externalIndex))
        {
            throw new InvalidOperationException(
                $"No index found with name {Constants.UmbracoIndexes.ExternalIndexName}");
        }

        // Set the field definition for "productPrice" to be a Double
        externalIndex.FieldDefinitionCollection.TryAdd(
            new FieldDefinition("productPrice", AzureSearchFieldDefinitionTypes.Double));
    }

    public void Terminate() { }
}
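
The three methods listed above differ in how they treat a field that already has a definition. The sketch below models this with a plain ConcurrentDictionary, on the assumption (not confirmed by this page) that FieldDefinitionCollection follows the semantics of the same-named ConcurrentDictionary methods. The string values are stand-ins for real field definition types, and the real methods may take different arguments.

```csharp
using System;
using System.Collections.Concurrent;

public static class FieldDefinitionSemanticsSketch
{
    // Models the add/update behaviour with field name -> type name pairs.
    // This is a stand-in, not the Examine FieldDefinitionCollection itself.
    public static string Run()
    {
        var definitions = new ConcurrentDictionary<string, string>();

        // TryAdd only succeeds when the field has no definition yet.
        bool added = definitions.TryAdd("productPrice", "Double");
        bool addedAgain = definitions.TryAdd("productPrice", "Integer"); // ignored

        // AddOrUpdate replaces whatever definition is already there.
        definitions.AddOrUpdate("productPrice", "Long", (name, existing) => "Long");

        // GetOrAdd returns the existing definition, or adds one if missing.
        string price = definitions.GetOrAdd("productPrice", name => "FullText");
        string sku = definitions.GetOrAdd("sku", name => "Raw");

        return $"{added} {addedAgain} {price} {sku}";
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "True False Long Raw"
    }
}
```

If that assumption holds, TryAdd in the example above leaves any existing "productPrice" definition untouched, so AddOrUpdate would be the method to use when you need to override a definition that has already been registered.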

Events

All of the underlying Examine events are available in ExamineX, such as TransformingIndexValues, IndexingError and OperationComplete.

These additional events are available in ExamineX:

AzureSearchIndex.CreatingOrUpdatingIndex - Allows you to modify the definition of the index before it is created in Azure Search.

Do not remove any indexes, indexers, fields, field mappings or custom analyzers created by ExamineX, otherwise unexpected errors may result.

Customizing the Azure Search index

The AzureSearchIndex.CreatingOrUpdatingIndex event can be quite powerful if you want to get more out of Azure Cognitive Search than what is provided by default. For example, with this event you could create custom scoring profiles.

An example of adding an event handler for CreatingOrUpdatingIndex:

if (examineManager.TryGetIndex("ExternalIndex", out var index) 
    && index is AzureSearchIndex azureIndex)
{
    azureIndex.CreatingOrUpdatingIndex += AzureIndex_CreatingOrUpdatingIndex;
}

An example of creating a custom scoring profile:

private void AzureIndex_CreatingOrUpdatingIndex(object sender, CreatingOrUpdatingIndexEventArgs e)
{
    // NOTE: You cannot add a scoring rule for a field unless that field exists in the index definition!
    //       When ExamineX first creates the index it will only contain the fields defined in the
    //       initial field definitions. When new items are indexed and new fields are detected then
    //       the Azure Cognitive Search index is updated.

    // get the azure cognitive search definition
    var index = e.AzureSearchIndexDefinition;

    // get or create scoring profiles list (will be null for new indexes)
    index.ScoringProfiles = index.ScoringProfiles ?? new List<ScoringProfile>();

    // this example will create a scoring profile called "pages"
    const string scoringProfileName = "pages";

    // get or create a scoring profile
    var scoringProfile = index.ScoringProfiles.FirstOrDefault(x => x.Name == scoringProfileName);
    if (scoringProfile == null)
        index.ScoringProfiles.Add(scoringProfile = new ScoringProfile
        {
            Name = scoringProfileName,
            FunctionAggregation = ScoringFunctionAggregation.Sum
        });

    // add a 'boost' of 3 for the "pageTitle" field if the field exists
    if (index.Fields.Any(x => x.Name == "pageTitle"))
    {
        // ensure the object exists
        scoringProfile.TextWeights = scoringProfile.TextWeights ?? new TextWeights(new Dictionary<string, double>());
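        // use the indexer rather than Add so that re-running this handler for an
        // existing index does not throw on a duplicate key
        scoringProfile.TextWeights.Weights["pageTitle"] = 3;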
    }

    // add a 'boost' for pages that have been updated within the last two days
    if (index.Fields.Any(x => x.Name == "updateDate"))
    {
        // ensure the object exists
        scoringProfile.Functions = scoringProfile.Functions ?? new List<ScoringFunction>();

        // check existing or add
        var updateDateFreshness = scoringProfile.Functions.FirstOrDefault(x => x.FieldName == "updateDate");
        if (updateDateFreshness == null)
            scoringProfile.Functions.Add(updateDateFreshness = new FreshnessScoringFunction
            {
                FieldName = "updateDate",
                Boost = 3,
                Parameters = new FreshnessScoringParameters(new TimeSpan(2, 0, 0, 0)),
                Interpolation = ScoringFunctionInterpolation.Logarithmic
            });
    }

    // Set the default scoring profile
    index.DefaultScoringProfile = scoringProfileName;
}