Every result that Rainbird produces after completing a query has a “Certainty Factor”. The Certainty Factor represents how sure Rainbird is of the result it has returned. It can vary depending on the data entered by the user, the rules of the knowledge map, and whether Rainbird is working with incomplete data.
Data input into Rainbird is also associated with a certainty factor. During a query, end-users can answer questions and provide a certainty level that reflects how sure they are of their answer. The knowledge map’s facts also have a certainty factor associated with them, and how certain Rainbird is about a fact affects the certainty factor of the results it produces.
This article describes how the “Certainty Factor” of an outcome is calculated, and how it can be used and interpreted by the user. More information about how the certainty factor can be affected by end-users and facts can be found in the allowCF article and the fact / concept instance articles.
Example:
Isaac wants fruit for his dessert after dinner. He is very picky about fruits, and only has:
- Green Apples
- Yellow Apples
- Green Pears
He keeps his fruits in his garage, where there’s no light as the light bulb is broken. He can barely see the colours, and can’t differentiate the shape of the fruit by sight.
When picking a fruit, he’ll try to guess what kind of fruit to pick by looking at it, guessing the colour and touching it to guess the type of fruit.
The following Rainbird model estimates what fruit Isaac has picked and returns a certainty factor of how sure Rainbird is, using Certainty Factor sliders and rules with different certainty factors. Here is what the model looks like:
Figure 1: Fruit selector model
Here is how it behaves:
Figure 2: Fruit selector model behaviour
In the example above, Rainbird is 81% certain that Isaac picked a green apple.
In the fruit picking model, Rainbird returned the result “Green Apple” with a certainty factor of 81%.
How was the certainty factor calculated? The evidence tree, and more specifically its salience view, provides the rationale behind Rainbird’s calculations:
Figure 3: Salience view of the decision
The salience view highlights the different conditions that need to be met to satisfy a rule (see figure 4 below).
The amount by which an individual condition affects the overall certainty factor depends on the condition’s level of certainty and the weighting of the condition inside the rule it is a part of.
In the example rule, each condition makes up 25% of the overall decision because, as shown in figure 4 below, the rule has 4 conditions, each with a weighting of 100 (100/400 = 0.25 = 25%).
When answering the query, Isaac said he was only 50% certain the fruit was green. The condition “Isaac picks a fruit colour with confidence Green” accounts for 25% of the final result, but because Isaac was only 50% sure that the fruit was green, it adds just 12.5% to the final certainty (50% * 25% = 12.5%).
Figure 4: Conditions and their weighting in the main rule
The Rainbird team have built a certainty tool that calculates the certainty factor, for users who need to test the weight and behaviour of conditions and the certainty factor outcome of facts and rules. The certainty calculator uses an approximation of the following equation:
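Reconstructed from the symbol definitions and the 25% weighting example in this article, the equation is assumed to be a weight-normalised average of the condition certainty factors (the calculator itself is described only as an approximation of it):

```latex
\[
\text{Average } CF_{Rule}
= \frac{\sum_{i} W_{Condition,i} \times CF_{Condition,i}}{\sum_{i} W_{Condition,i}}
\]
```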
Where Average CF_Rule = the average Certainty Factor of the conditions of the rule being tested, W_Condition = the weight of a condition, and CF_Condition = the certainty factor of a given condition.
Note: the Average CF_Rule should not be confused with the Given CF_Rule described later in the article.
Substituting in the values from the example query produces 0.81, the certainty factor of the result that Rainbird produced:
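As a minimal sketch of that substitution, the Python below computes the weighted average. The function name `average_cf_rule` is hypothetical. Apart from Isaac's 50% certainty about the colour, the condition values (0.76, 0.99, 0.99) are illustrative assumptions chosen to be consistent with the 0.81 reported here; the actual values are only visible in the evidence tree figure.

```python
def average_cf_rule(conditions):
    """Weighted average of condition certainty factors.

    `conditions` is a list of (weight, certainty_factor) pairs.
    """
    total_weight = sum(weight for weight, _ in conditions)
    if total_weight == 0:
        raise ValueError("at least one condition needs a non-zero weight")
    return sum(weight * cf for weight, cf in conditions) / total_weight


# Four conditions, each weighted 100, as in the example rule.
# Only the 0.50 (Isaac's certainty about the colour) comes from the article;
# the other three certainty factors are assumed for illustration.
conditions = [
    (100, 0.50),  # picks a fruit colour with confidence Green
    (100, 0.76),  # assumed: Isaac's certainty about the fruit type
    (100, 0.99),  # assumed: "has colour" condition
    (100, 0.99),  # assumed: "has fruit type" condition
]

# Contribution of the colour condition alone: 100/400 * 0.50
colour_contribution = conditions[0][0] * conditions[0][1] / 400

print(round(colour_contribution, 3))          # 0.125
print(round(average_cf_rule(conditions), 2))  # 0.81
```

The guard against a zero total weight is a design choice of this sketch; the article does not say how Rainbird handles a rule whose conditions all have weight 0.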
The result does not satisfy Isaac, who believes only his estimation should impact the final Certainty Factor.
As Isaac wants his decisions to have more of an impact on the certainty of the outcome, the weights of the conditions “has colour” and “has fruit type” are set to 0 (see figure 5). Now, with the same input, Rainbird will produce a CF of 0.63 for “green apple”.
Figure 5: New settings of the “picks a fruit” rule
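A zero-weight condition drops out of both the numerator and the denominator of a weighted average, so only Isaac's own answers remain. As a worked sketch, assume (hypothetically) that his two answers carry certainty factors of 0.50 for the colour and 0.76 for the fruit type; only the 0.50 is stated in this article, and 0.76 is chosen to be consistent with the reported 0.63:

```latex
\[
\text{Average } CF_{Rule}
= \frac{100 \times 0.50 + 100 \times 0.76 + 0 \times CF_{colour} + 0 \times CF_{type}}
       {100 + 100 + 0 + 0}
= \frac{126}{200} = 0.63
\]
```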
The Certainty Factor can also be used to calculate the probability of an outcome. Let’s assume that Isaac knows roughly how many apples and pears he has, and what proportions of the fruit are what colour. Isaac now wants to estimate the probability of picking a certain fruit:
Isaac has the following amount of fruit in his garage:
- Green Apples: 50
- Yellow Apples: 25
- Green Pears: 5
- Yellow Pears: 20
Isaac has a total of 75 apples and 25 pears. He knows that he has a 75% chance of picking an apple and that, if he does, there is a 66.67% chance (50 out of 75) of the apple being green. The overall chance of picking a green apple is therefore 50%.
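The arithmetic above can be checked with a short Python sketch. It uses only the fruit counts listed in this article; the function and variable names are illustrative.

```python
from fractions import Fraction

# Fruit counts from the list above.
counts = {
    ("apple", "green"): 50,
    ("apple", "yellow"): 25,
    ("pear", "green"): 5,
    ("pear", "yellow"): 20,
}

total = sum(counts.values())  # 100 fruits in total


def p_type(fruit_type):
    """Probability of picking a given fruit type."""
    n = sum(c for (t, _), c in counts.items() if t == fruit_type)
    return Fraction(n, total)


def p_colour_given_type(colour, fruit_type):
    """Probability of a colour, given that the fruit type is known."""
    n_type = sum(c for (t, _), c in counts.items() if t == fruit_type)
    return Fraction(counts[(fruit_type, colour)], n_type)


p_apple = p_type("apple")                                 # 3/4
p_green_if_apple = p_colour_given_type("green", "apple")  # 2/3
p_green_apple = p_apple * p_green_if_apple                # 1/2

print(float(p_apple), round(float(p_green_if_apple), 4), float(p_green_apple))
# 0.75 0.6667 0.5
```

The same functions give 1/4 for picking a pear and 1/5 for a pear being green, so the chance of a green pear is 5%.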
A new relationship between “Person” and “Fruit type” called “has probability to pick fruit” has been created.
The relationship is set to plural so multiple facts can be created on the relationship (one for the pears, one for the apples).
The facts have been assigned a certainty factor based on the number of apples and pears available for Isaac to pick from (more information on facts can be found in the facts and instances article):
Figure 6: Facts embedded in the “has probability to pick fruit” relationship
A further relationship between “Person” and “Fruit”, called “has probability to pick green knowing fruit type”, is created. The “has probability to pick green knowing fruit type” relationship will:
- Have 2 rules, one for pears, one for apples
- Use and inject the certainty factors of the “has probability to pick fruit” relationship whilst running the query
Here’s how the first rule is created:
Figure 7: Creation of Rule 1, 66% CF
Here’s how the second rule is created:
Figure 8: Creation of Rule 2, 20% CF
Now, when running a query, the following result will be produced:
Figure 9: Query result
Note: To trigger the fact, the query must be given the name of the person linked to the fact created on the “has probability to pick fruit” relationship. In this case, Isaac.
The previous query only produced a result for green apples, rather than results for both green pears and green apples. Rainbird may limit the number of results produced if a relationship is not set to “plural”. However, in the example, “has probability to pick” is a plural relationship.
Rainbird only produces one result because of how the minimum certainty settings are configured for the example knowledge map. As the probability of picking a green pear is very low, Rainbird does not display this outcome. The minimum rule certainty setting can be lowered (down to 1%) so that Rainbird will display results with a lower certainty factor (for more information, see the Minimum Rule Certainty article).
Figure 10: Lowering the Minimum Rule Certainty
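The effect of this setting can be pictured as a simple threshold filter. The sketch below is an illustration only, not Rainbird's implementation: the function name, the 20% starting threshold, and the `>=` comparison are all assumptions. The two certainty factors are taken as 0.495 and 0.05 (75% × 66% and 25% × 20%), assuming the fact values follow Isaac's fruit counts.

```python
def visible_outcomes(outcomes, minimum_certainty):
    """Keep only outcomes whose certainty factor reaches the threshold."""
    return {name: cf for name, cf in outcomes.items() if cf >= minimum_certainty}


# Assumed certainty factors for the two green-fruit outcomes.
outcomes = {"green apple": 0.495, "green pear": 0.05}

# Hypothetical starting threshold of 20%: the green pear is hidden.
print(visible_outcomes(outcomes, 0.20))  # {'green apple': 0.495}

# Threshold lowered to 1%: both outcomes are shown.
print(visible_outcomes(outcomes, 0.01))  # {'green apple': 0.495, 'green pear': 0.05}
```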
Here are the outcomes after the minimum certainty has been lowered:
Figure 11: Query result with all the outcomes
How does Rainbird calculate these certainty factors?
As explained in the first part of this article, Rainbird calculates an average of the conditions’ CF, taking into account the individual condition’s weighting, to calculate the CF of the rule. In the probability example, there’s only one condition that impacts the CF: “probability to pick a fruit type”.
Then, as the rule itself has a CF associated with it, Rainbird multiplies both CFs together to generate the final CF, as detailed by the following formula:
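Since the text describes a simple product of the two certainty factors, the formula can be written as follows (a reconstruction from the description and the symbol definitions):

```latex
\[
CF_{Outcome} = \text{Average } CF_{Rule} \times \text{Given } CF_{Rule}
\]
```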
Where CF_Outcome = the certainty of each outcome, Average CF_Rule = the average certainty factor of the conditions in the rule, and Given CF_Rule = the certainty factor that has been assigned to the rule.
Each rule can be assigned a certainty factor by opening up the rule in the relationship:
Figure 12: Where to input the Given CF of a rule
Substituting in values from the example, the CF calculation and outcome would be:
Note: If the relationship was singular, Rainbird would have selected the outcome with the highest CF.
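The substitution can be sketched in Python as below. The fact certainty factors (0.75 for apples, 0.25 for pears) are assumed from Isaac's fruit counts, the rule certainty factors come from the captions of Rules 1 and 2, and the function name is hypothetical. The last lines illustrate the singular case by keeping only the outcome with the highest CF.

```python
def cf_outcome(average_cf_rule, given_cf_rule):
    """Final certainty: average condition CF multiplied by the rule's given CF."""
    return average_cf_rule * given_cf_rule


# (average CF of the rule's conditions, CF assigned to the rule)
rules = {
    "green apple": (0.75, 0.66),  # fact CF assumed 75%, Rule 1 given CF 66%
    "green pear": (0.25, 0.20),   # fact CF assumed 25%, Rule 2 given CF 20%
}

outcomes = {fruit: cf_outcome(avg, given) for fruit, (avg, given) in rules.items()}
for fruit, cf in outcomes.items():
    print(fruit, round(cf, 3))
# green apple 0.495
# green pear 0.05

# A singular relationship would keep only the highest-certainty outcome.
best = max(outcomes, key=outcomes.get)
print(best)  # green apple
```

Each rule here has a single condition, so the average CF of the rule is simply the certainty factor of the matching fact.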
By repeating the steps required to create the previous relationship, another relationship for yellow fruits can be created. The two relationships can be combined into an overall relationship, called “has probability to pick fruit”. Here’s what querying such an overall relationship would look like:
Figure 13: Overall relationship query
Note: Combining relationships is an efficient way to build complicated relationships with multiple rules, but the overall probability relationship could also be built by creating 4 rules on one relationship only.
The RBLang below will generate the example fruit picker map. Click on ‘Export .rbird’ to download the knowledge map, or ‘copy RBLang’ and paste the code directly into Rainbird.
Query and Results
The relationships ‘pick a fruit’, ‘has probability to pick (colour) fruit’, and ‘has probability to pick fruit’ are the three relationships which should be queried when using the model:
- The outcome of running the query on the relationship ‘pick a fruit’ will show the certainty factor of picking the right fruit, depending on the confidence level the end-user sets.
- The outcome of running the query on the relationship ‘has probability to pick green knowing fruit’ (or yellow) will show the probability of picking the right fruit.
- To run the combined rules for both green and yellow fruits, query the relationship “has probability to pick fruit”.