Calculating Confidence score
Confidence Score is one of the most important factors in Knowledge search. It reflects how confident the system is that the Knowledge Artifact answers the user’s query. The confidence score for an Artifact is calculated based on the metadata matches. It determines the final search results displayed to end-users.
In Luma Knowledge, the Confidence score is a number ranging from 0 to 1 that indicates the relevancy of an Artifact. The higher the confidence score, the more relevant the Artifact is to the query.
Each Metadata type in Luma Knowledge is allocated a specific Confidence Score Weightage, which is used to calculate the Confidence score for the matching artifact. Confidence Score Weightage is allocated based on the importance of metadata type and the structure of your Knowledge base.
When an end-user searches for Knowledge, the system generates metadata from the user query. The query’s metadata is matched with the metadata of the Artifacts in the Knowledge Base to filter the relevant artifacts. Each time matching metadata is found, the respective score is awarded to the artifact. The more metadata that matches, the higher the confidence score.
In Luma Knowledge, there are two ways to calculate the Confidence score:
Static Confidence Weightage Allocation
This is the default confidence weightage allocation method used in Luma Knowledge. In 'Static' allocation, the confidence score weightage assigned to each Metadata Type is fixed and the Artifact's Confidence score is evaluated accordingly. By default, the following weightage is assigned to each metadata type:
Topic | Path (Parent Topics if Artifact Topic does not match) | Subject | Action | Motivation |
---|---|---|---|---|
40 | 35 | 30 | 25 | 5 |
For a user query, confidence score weightage is assigned based on the ontology generated from the user phrase. This means that the more metadata types are identified from the user phrase, the more confidence score is available for allocation to matching Artifacts.
Let us discuss the following example to understand how the Confidence score is calculated:
Example 1: “Describe Knowledge Management” (the query generates Topic, Subject, and Action metadata)
From the user query 'Describe Knowledge Management', metadata types Topic, Subject, and Action are generated. Based on the identified Metadata, the confidence score weightage is allocated to each metadata type. Since Motivation is not identified from the user phrase, it is not used for confidence score evaluation.
Metadata type | Ontology generated | Weightage assigned based on the user phrase |
---|---|---|
Topic | Knowledge Management* | 40 |
Subject | Knowledge Management, Knowledge Management* | 30 |
Action | Describe | 25 |
Motivation | | NA |
Path | | 0 |
Luma Knowledge identifies Artifacts that match with the ontology generated from the user query. Based on the metadata type and number of metadata matches, the confidence score for the matching Artifacts is calculated.
Artifact | Matched Metadata | Confidence Score |
---|---|---|
What is Luma Knowledge | path : Luma Knowledge Management, Knowledge Management | 0.95 (40+30+25) |
What is the intent of the Retrieval Accuracy Graph | path : Luma Knowledge Management, Knowledge Building | 0.35 (35) |
What is the intent of the Feedback Response Rate Graph | path : Luma Knowledge Management, Knowledge Building | 0.35 (35) |
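The static calculation in this example can be sketched in a few lines of Python. This is only an illustration and not the product's implementation: the names `STATIC_WEIGHTAGE` and `static_confidence` are hypothetical, and the sketch assumes that Path is counted only when Topic does not match, as the default weightage table suggests.

```python
# Default static weightage per metadata type, taken from the default table.
STATIC_WEIGHTAGE = {"topic": 40, "path": 35, "subject": 30, "action": 25, "motivation": 5}


def static_confidence(matched_types):
    """Sum the weightage of the matched metadata types and scale it to 0..1."""
    types = set(matched_types)
    # Assumption: Path (parent topics) only counts when the Topic itself did not match.
    if "topic" in types:
        types.discard("path")
    return sum(STATIC_WEIGHTAGE[t] for t in types) / 100


print(static_confidence(["topic", "subject", "action"]))  # 0.95
print(static_confidence(["path"]))  # 0.35
```

The two calls reproduce the scores of the first and second Artifacts in the table above.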
Example 2: Multiple Subject match
If a search query generates multiple Subjects and more than one of them matches an Artifact, an additional weightage of 3% per additional matching phrase is assigned to the metadata type ‘Subject’. The same amount is deducted from the weightage of the 'Action' metadata. The adjusted weightages are then used to calculate the confidence score.
For the user query “How to create a dropbox account”, metadata types Topic, Subject, Motivation, and Action are generated.
Metadata type | Ontology generated | Weightage assigned based on the user phrase |
---|---|---|
Topic | dropbox*, Dropbox Account* | 40 |
Subject | dropbox account, dropbox*, Dropbox Account* | 30 |
Action | create | 25 |
Motivation | how | 5 |
Path | | 0 |
Based on the identified metadata, the confidence score for the matching Artifacts is calculated.
Artifact | Matched Metadata | Confidence Score |
---|---|---|
ChangePassword-Dropbox | path : Dropbox | 0.73 (40+33) |
How to Sign in to Dropbox | path : Dropbox | 0.75 (40+30+5) |
How to change your password for Dropbox | path : Dropbox | 0.45 (40+5) |
In case of multiple subject matches, a weightage of 3% is added for every additional matching phrase, that is:
For 2 subject matches, an additional 3% weightage is assigned to the 'Subject' metadata. The confidence weightage is increased to 33%.
For 3 or more subject matches, an additional 6% weightage is assigned. The confidence weightage is increased to 36%.
The additional weightage is deducted from the weightage of the Action metadata. This means that if the weightage for Subject is 33%, Action is reduced to 22% (25-3).
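The adjustment rule above could be expressed as follows. This is a minimal sketch under stated assumptions: the function name `adjust_for_subject_matches` is hypothetical, and the Action value for three or more matches (25-6) is inferred from the rule rather than stated in the text.

```python
def adjust_for_subject_matches(subject_matches, subject=30, action=25):
    """Return (subject, action) weightage after the multiple-subject bonus."""
    if subject_matches >= 3:
        bonus = 6  # 3 or more matching subjects
    elif subject_matches == 2:
        bonus = 3  # exactly 2 matching subjects
    else:
        bonus = 0  # single match: default weightage applies
    # The bonus given to Subject is deducted from Action.
    return subject + bonus, action - bonus


print(adjust_for_subject_matches(2))  # (33, 22)
```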
Dynamic Confidence Weightage Allocation
In Dynamic confidence score allocation, the confidence score is allocated based on the ontology generated from the user query. The confidence score is always calculated based only on the metadata identified. This ensures that Luma Knowledge can find matching Knowledge Artifacts even if the queries are ambiguous or do not generate all the types of metadata.
In dynamic allocation, the metadata types are not assigned a fixed weightage. Instead, the weightage is calculated dynamically from the metadata identified and the Base weightage points for each metadata type.
There are four steps in calculating the Confidence score in dynamic allocation:
Configure Base points
Base points are the base scores allocated to each metadata type. Every phrase/word identified as the metadata is assigned the configured base score. In other words, if the base score of ‘Action’ metadata is 25, each word identified as ‘Action’ from the query is assigned a weightage of 25.
By default, the base points are configured as below and can be updated as required.
Topic, Path and Subject | Action | Motivation |
---|---|---|
70 | 25 | 5 |
The Base points for the metadata types are configurable and can be updated based on your organization’s requirements. Currently, the configuration is available only in the backend. You may contact the Serviceaide support team to update the configuration.
A total of 100 Base points can be divided among the metadata types.
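As an illustration, a base-point configuration could be checked like this before use. The dictionary keys and the function `validate_base_points` are hypothetical (the actual configuration lives in the backend and its format is not described here), and the sketch assumes that the points must add up to exactly 100.

```python
# Default base points from the table above; the key names are illustrative only.
DEFAULT_BASE_POINTS = {"topic_path_subject": 70, "action": 25, "motivation": 5}


def validate_base_points(base_points):
    """Raise ValueError unless the base points add up to 100 (assumed rule)."""
    total = sum(base_points.values())
    if total != 100:
        raise ValueError(f"Base points must total 100, got {total}")
    return base_points


validate_base_points(DEFAULT_BASE_POINTS)
```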
Calculate Total Weightage points for the query and matched phrases
Using the Base points per metadata type, the Total Weightage points for the search are calculated.
For the ontology generated from the search query, the Weightage points for each metadata type and the Total Weightage points for the search query are calculated. The Total Weightage points value is the sum of the weightage points for the identified metadata types. This score is used to derive the Confidence weightage that can be allocated to Artifacts.
For example, if a search query generates metadata Topic, Subject, Action and Motivation:
Base points for Topic and Subjects = 70
Number of words/phrases identified as Topic and Subjects = 2
Weightage points for Topic and Subjects = 140 (calculated as 70 x 2 )
Base points for Action = 25
Number of words/phrases identified as Action = 2
Weightage points for Action = 50 (calculated as 25 x 2 )
Base points for Motivation = 5
Number of words/phrases identified as Motivation = 1
Weightage points for Motivation = 5 (calculated as 5 x 1 )
Total Weightage points for the search = 195 ( calculated as 140+50+5)
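The arithmetic above can be sketched as a small helper. The names `BASE_POINTS` and `weightage_points` are hypothetical and the snippet only mirrors the worked example; it is not the product's code.

```python
# Default base points; Topic and Subject share one entry, as in the example.
BASE_POINTS = {"topic_subject": 70, "action": 25, "motivation": 5}


def weightage_points(phrase_counts, base_points=BASE_POINTS):
    """Multiply the base points by the number of phrases found per metadata type."""
    return {t: base_points[t] * n for t, n in phrase_counts.items() if n > 0}


points = weightage_points({"topic_subject": 2, "action": 2, "motivation": 1})
total = sum(points.values())
print(points)  # {'topic_subject': 140, 'action': 50, 'motivation': 5}
print(total)  # 195
```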
Derive Confidence Weightage
Now using the Weightage points for each metadata type and Total Weightage points, we can derive the Confidence weightage for each metadata type. Based on metadata generated from the user query, the confidence weightage is calculated. The weightage is calculated only for the metadata type identified from the query string.
In the above example,
Total Weightage points for the search = 195
Weightage points for Topic and Subjects = 140
Confidence Weightage for Topic and Subjects = 0.72 (calculated as 140/195)
Weightage for each phrase = 0.36 (calculated as 0.72/2)
Weightage points for Action = 50
Confidence Weightage for Action = 0.26 (calculated as 50/195)
Weightage for each phrase = 0.13 (calculated as 0.26/2)
Weightage points for Motivation = 5
Confidence Weightage for Motivation = 0.02 (calculated as 5/195)
Weightage for each phrase = 0.02 (calculated as 0.02/1)
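The derivation can be sketched as below. The function name `confidence_weightage` is hypothetical. The snippet works with unrounded values, so its output differs slightly from the two-decimal figures listed above.

```python
def confidence_weightage(points, phrase_counts):
    """Return {type: (weightage for the type, weightage per phrase)}."""
    total = sum(points.values())
    weightage = {}
    for metadata_type, type_points in points.items():
        per_type = type_points / total  # share of the total weightage points
        per_phrase = per_type / phrase_counts[metadata_type]
        weightage[metadata_type] = (per_type, per_phrase)
    return weightage


points = {"topic_subject": 140, "action": 50, "motivation": 5}
counts = {"topic_subject": 2, "action": 2, "motivation": 1}
for metadata_type, (per_type, per_phrase) in confidence_weightage(points, counts).items():
    print(f"{metadata_type}: {per_type:.3f} per type, {per_phrase:.3f} per phrase")
```

Rounded to two decimals, Topic and Subjects give 0.72 (0.36 per phrase) and Action gives 0.26 (0.13 per phrase). Motivation evaluates to about 0.026, which the list above shows as 0.02.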
Calculate the Confidence score for Artifact
Using the confidence weightage for the identified metadata, the confidence score for the matching Knowledge artifacts is calculated. Metadata generated from the search query is matched with the Artifact's metadata and the confidence score is assigned accordingly.
For an Artifact with matching metadata, the confidence score is calculated as below:
Metadata | Artifact 1: Matching | Artifact 1: Confidence Score | Artifact 2: Matching | Artifact 2: Confidence Score | Artifact 3: Matching | Artifact 3: Confidence Score |
---|---|---|---|---|---|---|
Topic, Subject, Path | 2 | 0.72 | 1 | 0.36 | 2 | 0.72 |
Action | 2 | 0.26 | 1 | 0.13 | 1 | 0.13 |
Motivation | 0 | 0 | 1 | 0.02 | 1 | 0.02 |
Total | | 0.98 | | 0.51 | | 0.87 |
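The totals in this table can be reproduced with a short sketch that multiplies the per-phrase weightage by the number of matching phrases. The names `PER_PHRASE` and `artifact_confidence` are hypothetical, and the per-phrase values are the rounded figures from the worked example (0.36, 0.13, 0.02), not exact fractions.

```python
# Per-phrase confidence weightage, rounded as in the worked example.
PER_PHRASE = {"topic_subject_path": 0.36, "action": 0.13, "motivation": 0.02}


def artifact_confidence(match_counts, per_phrase=PER_PHRASE):
    """Add up the per-phrase weightage for every matched phrase."""
    score = sum(per_phrase[t] * n for t, n in match_counts.items())
    return round(score, 2)  # rounding hides floating-point noise


print(artifact_confidence({"topic_subject_path": 2, "action": 2, "motivation": 0}))  # 0.98
print(artifact_confidence({"topic_subject_path": 1, "action": 1, "motivation": 1}))  # 0.51
print(artifact_confidence({"topic_subject_path": 2, "action": 1, "motivation": 1}))  # 0.87
```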
Examples
Let us look at the following examples to understand how the Confidence score is calculated:
Example 1: “How to login to dropbox”
From the user query 'How to login to dropbox', metadata types Topic, Subject, Action, and Motivation are generated. Based on the identified Metadata and the configured base points, the confidence score is allocated to each metadata type.
Metadata type | Ontology generated | Weightage points per Metadata | Total Weightage points for the query | Confidence weightage per phrase |
---|---|---|---|---|
Topic | dropbox* | Base points = 70 (shared by Topic and Subject) | 100 | Confidence Weightage = 0.70 (calculated as 70/100) |
Subject | dropbox, dropbox* | (shared with Topic) | 100 | (shared with Topic) |
Action | login | Base points = 25 | 100 | Confidence Weightage = 0.25 (calculated as 25/100) |
Motivation | how | Base points = 5 | 100 | Confidence Weightage = 0.05 (calculated as 5/100) |
Based on the calculated weightage per phrase, the Confidence score for the artifact is calculated.
Artifact | Matched Metadata | Confidence Score |
---|---|---|
How to change your password for Dropbox | path : Dropbox | 0.75 |
ChangePassword-Dropbox | path : Dropbox | 0.7 |
Dropbox Description | path : Dropbox | 0.7 |
Example 2: “Apply for a Building Permit”
From the user query 'Apply for a Building Permit', metadata types Topic, Subject, and Action are generated. Based on the identified Metadata and the configured base points, the confidence score is allocated to each metadata type.
Metadata type | Ontology generated | Weightage points per Metadata | Total Weightage points for the query | Confidence weightage per phrase |
---|---|---|---|---|
Topic | permits*, permit*, building permit | Base points = 70 (shared by Topic and Subject) | 165 | Confidence Weightage = 0.85 (calculated as 140/165) |
Subject | permits*, permit*, building permit | (shared with Topic) | 165 | (shared with Topic) |
Action | Apply | Base points = 25 | 165 | Confidence Weightage = 0.15 (calculated as 25/165) |
Motivation | - | - | 165 | - |
Based on the calculated weightage per phrase, the Confidence score for the artifact is calculated.
Artifact | Matched Metadata | Confidence Score |
---|---|---|
Apply for a Building Permit | path : Building Permits, Construction Permits, Permits, Licenses, and Inspections | 1.0 |
Building Permits Online | path : Building Permits, Construction Permits, Permits, Licenses, and Inspections | 0.85 |
Board of Equalization, BOE | path : Sellers Permit, Business Permits, Permits, Licenses, and Inspections | 0.575 |
Example 3: “Wifi Router”
From the user query 'Wifi Router', metadata types Topic and Subject are generated. Based on the identified Metadata and the configured base points, the confidence score is allocated to each metadata type.
Metadata type | Ontology generated | Weightage points per Metadata | Total Weightage points for the query | Confidence weightage per phrase |
---|---|---|---|---|
Topic | router*, wifi router* | Base points = 70 (shared by Topic and Subject) | 140 | Confidence Weightage = 1 (calculated as 140/140) |
Subject | wifi router, router*, wifi router* | (shared with Topic) | 140 | (shared with Topic) |
Action | - | - | 140 | - |
Motivation | - | - | 140 | - |
Based on the calculated weightage per phrase, the Confidence score for the artifact is calculated.
Artifact | Matched Metadata | Confidence Score |
---|---|---|
How do I make my router perform better in an interference-filled environment? | path : WiFi Router | 1.0 |
To add a translated MAC address to your router: | path : WiFi Extenders | 1.0 |
https://www.actcorp.in/contact-us/faq | subject : router | 0.5 |
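The three dynamic examples can be tied together in one end-to-end sketch. This is an illustration only: `BASE_POINTS` and `dynamic_confidence` are hypothetical names, and the phrase counts per Artifact are inferred from the scores in the tables, because the tables do not list which phrases matched. The results are unrounded, so they may differ from the listed scores in the third decimal place.

```python
BASE_POINTS = {"topic_subject": 70, "action": 25, "motivation": 5}


def dynamic_confidence(query_counts, matched_counts, base_points=BASE_POINTS):
    """Score = weightage points of the matched phrases / total points of the query."""
    total = sum(base_points[t] * n for t, n in query_counts.items())
    if total == 0:
        return 0.0  # the query produced no metadata
    matched = sum(
        # An Artifact cannot match more phrases than the query generated.
        base_points[t] * min(n, query_counts.get(t, 0))
        for t, n in matched_counts.items()
    )
    return matched / total


# Example 1: one Topic/Subject phrase, one Action, one Motivation (total 100).
login = {"topic_subject": 1, "action": 1, "motivation": 1}
print(dynamic_confidence(login, {"topic_subject": 1, "motivation": 1}))  # 0.75

# Example 2: two Topic/Subject phrases and one Action (total 165).
permit = {"topic_subject": 2, "action": 1}
print(round(dynamic_confidence(permit, {"topic_subject": 2}), 2))  # 0.85

# Example 3: two Topic/Subject phrases only (total 140).
router = {"topic_subject": 2}
print(dynamic_confidence(router, {"topic_subject": 1}))  # 0.5
```

Because each phrase of a type carries an equal share of that type's weightage, dividing the matched points by the total points gives the same result as summing the per-phrase confidence weightage.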