Metadata-Based Search

Metadata search improves the searchability and organization of information by using the ontology, or metadata, of an article. Unlike traditional search methods that rely solely on keyword matches in the document content, metadata search uses descriptive tags, keywords, Subjects, and Topics to locate and retrieve articles more efficiently and accurately. Users can find relevant documents, files, and other data assets by searching through predefined categories and properties, such as author, date created, file type, and subject matter. Metadata search is particularly useful in large datasets and complex information systems, where it can improve both the speed and the precision of finding the needed information.

Luma Knowledge uses various search algorithms to identify relevant artifacts. As an administrator, you can configure how the search is performed for your tenant:

Dynamic Search:

The Metadata search in Luma Knowledge uses ontology to perform the search. This method is also referred to as Dynamic Search.
When using the Metadata search algorithm, the system generates metadata (ontology) from the user phrase and compares it against the metadata of the available Artifacts. The system looks for Artifacts (Question-Answer pairs and Artifacts) with similar Topics, Subjects, Actions, and Motivation, and calculates Relevance and Confidence scores for each matching Artifact.

The Relevance score determines how relevant the Knowledge Artifact is to the user’s question. It is a relative score, calculated against the best-matching Artifact. All matching Artifacts that fall within a specific Relevance score range are considered the Best Response, or result, for the user’s request. The Relevance score range is derived from the Score Range Percentage set in the Tenant configuration.

Parameter: Score Range Percentage
Default value: 60
Description: Represents the relevance score range for the Best Response. The system uses the set percentage to calculate the relevance range. All artifacts that fall within the range are considered the Best Response.

For example, if the Score Range Percentage for your tenant is set to 60 and the highest Relevance score of an Artifact for your search is 100, the system calculates the lowest permissible Relevance score as 60 (that is, 60% of 100). This means the artifacts that fall in the range between 60 and 100 are considered relevant and presented as search results.
Increase the percentage to increase the artifacts in the result set.
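
The range calculation can be sketched as follows. This is a minimal illustration rather than the product's implementation: it takes the worked example literally (the lower bound is the configured percentage of the highest score), and the `ScoredArtifact` class, its fields, and the `best_response` function are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ScoredArtifact:
    # Hypothetical structure; field names are assumptions for illustration.
    title: str
    relevance: float


def best_response(artifacts, score_range_percentage=60):
    """Keep the artifacts whose relevance falls inside the Best Response range."""
    if not artifacts:
        return []
    top = max(a.relevance for a in artifacts)
    # Lower bound as in the worked example: 60% of a top score of 100 gives 60.
    lowest = top * score_range_percentage / 100
    return [a for a in artifacts if lowest <= a.relevance <= top]


results = best_response([
    ScoredArtifact("Reset password", 100),
    ScoredArtifact("Unlock account", 75),
    ScoredArtifact("Order a laptop", 40),
])
print([a.title for a in results])  # ['Reset password', 'Unlock account']
```

Because the score is relative to the best match, the same artifact can fall inside the range for one query and outside it for another.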

The Confidence score is calculated based on the matching metadata found in the artifact. It indicates how confident the system is that the Knowledge Artifact answers the user’s query. In Luma Knowledge, each metadata type is assigned a predefined confidence weightage. As matching metadata is found, the respective score is awarded to the artifact. The more matching metadata is found, the higher the confidence score. For more information, refer to Calculating Confidence score.
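
As a rough illustration of weighted metadata matching, the sketch below adds up a weight for every metadata type on which the query and the artifact agree. The weights and the dictionary layout are assumptions made for this example; the actual weightage per metadata type is defined by Luma Knowledge and is not listed here.

```python
# Hypothetical weights; the real per-type weightage is defined by the product.
CONFIDENCE_WEIGHTS = {"topic": 40, "subject": 30, "action": 20, "motivation": 10}


def confidence_score(query_metadata, artifact_metadata, weights=CONFIDENCE_WEIGHTS):
    """Sum the weights of all metadata types that match between query and artifact."""
    score = 0
    for metadata_type, weight in weights.items():
        query_value = query_metadata.get(metadata_type)
        # A type missing from the query cannot contribute to the score.
        if query_value is not None and query_value == artifact_metadata.get(metadata_type):
            score += weight
    return score


query = {"topic": "password", "subject": "account", "action": "reset"}
artifact = {
    "topic": "password",
    "subject": "account",
    "action": "change",
    "motivation": "locked out",
}
print(confidence_score(query, artifact))  # 70: topic and subject match
```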

The Topic identified from the user’s query is used to filter Artifacts with matching Topic as well as Subject. This ensures that the end-user is presented with all matching Artifacts.

Advanced Search:

Advanced Search in Luma Knowledge uses a trilateral approach to identify the results for a user query. In addition to the Metadata search, the results are enhanced using the Key Phrase search and the Full-text search.

Key Phrase search in Luma Knowledge is based on the KeyBERT algorithm, a keyword extraction technique that leverages BERT embeddings to create the keywords and key phrases that are most similar to a document. Luma Knowledge uses this technique to match the available Knowledge to the user query. Whenever an Artifact is curated, the system automatically generates Keywords for the artifact based on the Knowledge content. Similarly, Keywords are identified from the user’s search queries. These keywords are matched against the Knowledge Articles, and relevant data is identified.
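
The matching step can be pictured with the sketch below. Note the substitution: to stay self-contained it does not use KeyBERT or BERT embeddings, but a naive frequency-based extractor as a stand-in, and it compares keyword sets with a simple Jaccard overlap. The function names, the stop-word list, and the overlap measure are all assumptions for illustration, not the product's behavior.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the product's list).
STOP_WORDS = {
    "a", "an", "the", "to", "of", "in", "on", "for", "from", "and", "or",
    "is", "how", "do", "i", "my", "your",
}


def extract_keywords(text, top_n=5):
    """Naive stand-in for KeyBERT: the most frequent non-stop-word tokens."""
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]
    return {word for word, _ in Counter(tokens).most_common(top_n)}


def keyword_overlap(query, artifact_keywords):
    """Jaccard overlap between the query keywords and an artifact's stored keywords."""
    query_keywords = extract_keywords(query)
    if not query_keywords or not artifact_keywords:
        return 0.0
    shared = query_keywords & artifact_keywords
    return len(shared) / len(query_keywords | artifact_keywords)


# Keywords are generated once when the artifact is curated, then reused per query.
artifact_keywords = extract_keywords(
    "Reset your VPN password from the self-service portal. "
    "The VPN password expires every 90 days."
)
print(keyword_overlap("How do I reset my VPN password?", artifact_keywords))
```

An embedding-based extractor such as KeyBERT would also surface phrases that are semantically close to the document, which plain token counting cannot do.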

Full-text search is the fallback option taken by the system when neither the Metadata search nor the Key Phrase search generates appropriate results. In this case, Luma Knowledge searches for the requested information in the complete artifact. Any Artifact with information similar to the user query appears in the search result. This type of search ensures that Luma Knowledge can find search results even if the Metadata or Keywords are not correctly created. You may update the tenant configuration to filter the results presented in the case of a Full-text search.
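
The fallback order described above can be sketched like this. The three search callables and the use of plain artifact IDs are hypothetical; how Luma Knowledge actually merges and ranks the results of the individual searches is not specified here.

```python
def advanced_search(query, metadata_search, key_phrase_search, full_text_search):
    """Combine Metadata and Key Phrase results; use Full-text only as a fallback."""
    results = []
    seen = set()
    for search in (metadata_search, key_phrase_search):
        for artifact_id in search(query):
            # Keep the first occurrence of each artifact, preserving order.
            if artifact_id not in seen:
                seen.add(artifact_id)
                results.append(artifact_id)
    if not results:
        # Neither search produced results, so search the complete artifact text.
        results = list(full_text_search(query))
    return results


# Hypothetical searches that return artifact IDs.
print(advanced_search(
    "reset vpn password",
    metadata_search=lambda q: [],
    key_phrase_search=lambda q: [],
    full_text_search=lambda q: ["KB-9"],
))  # ['KB-9']
```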

Displaying Search Results

Once the Best Response is identified, the information is presented to the end user. The Best Response may contain matching Precise Answers or a list of Knowledge Articles. The identified results are presented based on the following Tenant-level configurations, which determine which Artifacts are presented and how they are presented: in a list or in a guided conversation. For more information, refer to Tenant Configurations.

Parameter: Maximum Artifacts per Query
Default value: 30
Description: Indicates the maximum number of artifacts that can be returned as Best Response.

Parameter: Maximum decision levels
Default value: 3
Description: Indicates the number of decision levels, or questions, that Luma Knowledge may ask to locate the Artifact when the Best Response is a guided conversation with various Artifacts and Topics. Increase the count to increase the decision levels in the result set.

Parameter: Maximum Topics per Decision Level
Default value: 12
Description: Indicates the number of topics that can be listed per decision level in the Guided Conversation. The count determines the options that can be presented at each decision level in a guided search. Increase the count to reduce the decision levels in the result set.

Parameter: Maximum Artifacts per Topic
Default value: 10
Description: Indicates the number of artifacts that can be listed under a topic in the result. The count impacts the decision levels presented in a Guided Conversation. Increase the count to reduce the decision levels in the result set.
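
To see why larger topic and artifact counts reduce the decision levels, consider the rough model below. It assumes that each decision level fans out into at most the configured number of topics and that each final topic lists at most the configured number of artifacts. This fan-out model, the `DisplayConfig` class, and `decision_levels_needed` are assumptions for illustration only; the defaults are the values listed above.

```python
from dataclasses import dataclass


@dataclass
class DisplayConfig:
    # Defaults mirror the tenant configuration values listed above.
    max_artifacts_per_query: int = 30
    max_decision_levels: int = 3
    max_topics_per_decision_level: int = 12
    max_artifacts_per_topic: int = 10


def decision_levels_needed(artifact_count, config):
    """Estimate the decision levels needed under the assumed fan-out model."""
    count = min(artifact_count, config.max_artifacts_per_query)
    levels = 0
    capacity = config.max_artifacts_per_topic  # a plain list needs no questions
    while capacity < count and levels < config.max_decision_levels:
        levels += 1
        capacity *= config.max_topics_per_decision_level
    return levels


print(decision_levels_needed(30, DisplayConfig()))  # 1
small = DisplayConfig(max_topics_per_decision_level=3, max_artifacts_per_topic=2)
print(decision_levels_needed(30, small))  # 3
```

With the defaults, 30 artifacts fit under a single level of topics, while the smaller counts in the second call need all three levels.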