DupliKit weighting system
On the left side of the DupliKit settings is the DupliKit Weights section that forms the basis of the DupliKit weighting system. Click anywhere in this bar to expand the editor.
The weights shown in the screenshots below are from our test environment and are often changed for testing purposes.
The DupliKit weights are numeric values that are set for different fields. DupliKit uses these values when it syncs to determine what will be marked as a duplicate.
Each entity has a threshold of 5, which is shown at the top of the section. When the total score of a record reaches 5, it is marked as a duplicate. Each field that matches another record increases the score by the weight set for that field.
For example:
The “Candidate First Name” has been given a weight of 2. This is a low weight as it is likely many candidates will share the same first name. Additional matching fields will therefore be required for the score to reach the threshold of 5.
The “Contact Any email” has a weight of 3. Matching email addresses are much more likely to indicate a duplicate record, so a higher weight here returns a more accurate result.
You can use decimal numbers to fine-tune the sync and make your duplicate results more accurate.
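The scoring described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DupliKit's actual implementation: the field keys, the email weight on a candidate, the decimal last-name weight and the case-insensitive comparison are all assumptions made for the example.

```python
THRESHOLD = 5.0  # per-entity threshold shown at the top of the weights section

# Hypothetical candidate weights. Only the first-name value (2) comes from the
# text above; the other values are assumed for illustration.
CANDIDATE_WEIGHTS = {
    "first_name": 2.0,
    "last_name": 2.5,  # assumed decimal weight
    "email": 3.0,      # assumed to mirror the "Any email" weight
}


def _normalise(value):
    """Assumed comparison rule: ignore case and surrounding whitespace."""
    return (value or "").strip().lower()


def duplicate_score(record_a, record_b, weights):
    """Sum the weights of every field whose values match in both records."""
    total = 0.0
    for field, weight in weights.items():
        a = _normalise(record_a.get(field))
        b = _normalise(record_b.get(field))
        if a and a == b:
            total += weight
    return total


def is_duplicate(record_a, record_b, weights, threshold=THRESHOLD):
    """A pair counts as a duplicate once its score reaches the threshold."""
    return duplicate_score(record_a, record_b, weights) >= threshold


anna_1 = {"first_name": "Anna", "email": "anna@example.com"}
anna_2 = {"first_name": "anna ", "email": "Anna@Example.com"}
anna_3 = {"first_name": "Anna", "email": "someone.else@example.com"}

both_match = is_duplicate(anna_1, anna_2, CANDIDATE_WEIGHTS)       # 2 + 3 reaches 5
first_name_only = is_duplicate(anna_1, anna_3, CANDIDATE_WEIGHTS)  # 2 stays below 5
```

In this sketch a shared first name alone is not enough, while a shared first name plus a shared email address reaches the threshold.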
Company Weights
We recommend using the Clean name option when finding company duplicates. Full name only gives a match if the complete company names are identical. Clean name is set to 4.8 by default. It removes the words listed in the company options field “Words to ignore from company names” (e.g. limited, LTD) from all company names before testing for a potential match.
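As a rough illustration of the difference between Full name and Clean name matching, the sketch below strips a list of ignore words before comparing. The function name, the lower-casing and the punctuation handling are assumptions for the example; DupliKit's own cleaning rules may differ.

```python
import re

# Example ignore words taken from the text; the real list is whatever is
# configured in the company options.
IGNORE_WORDS = {"limited", "ltd"}


def clean_name(name, ignore_words=IGNORE_WORDS):
    """Lower-case the name, drop punctuation, and remove ignored words."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(token for token in tokens if token not in ignore_words)


full_name_match = "Acme Widgets Limited" == "ACME Widgets LTD."
clean_name_match = clean_name("Acme Widgets Limited") == clean_name("ACME Widgets LTD.")
```

Here the full names differ, so a Full name comparison would not match, while both cleaned names reduce to the same string.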
Editing any of the field weights will present the save button at the bottom of this section. Clicking the save button will store these weights in DupliKit and refresh the duplicate list.
In addition to the fixed field weights provided, you can influence duplicate matching by selecting two more weight fields per entity. These fields are selected from the drop-downs: two each for Candidate, Contact and Company. Enter weights in the value boxes and, once saved, these field weightings will be applied in the next DupliKit sync.
Any added custom weights contribute towards the same threshold of 5. Adding custom weights is likely to increase the number of duplicates found by DupliKit.
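To see why custom weights tend to surface more duplicates, consider the following sketch. The field names, weight values and helper functions are hypothetical and exact-match comparison is assumed; only the threshold of 5 and the limit of two custom fields per entity come from the text.

```python
THRESHOLD = 5.0
MAX_CUSTOM_FIELDS = 2  # two custom weight fields per entity


def combine_weights(fixed, custom):
    """Merge fixed and custom weights, enforcing the two-field limit."""
    if len(custom) > MAX_CUSTOM_FIELDS:
        raise ValueError("only two custom weight fields are allowed per entity")
    return {**fixed, **custom}


def score(record_a, record_b, weights):
    """Add up the weights of fields that are present and equal in both records."""
    return sum(
        weight
        for field, weight in weights.items()
        if record_a.get(field) is not None and record_a.get(field) == record_b.get(field)
    )


fixed_weights = {"first_name": 2.0, "last_name": 2.5}  # assumed values
custom_weights = {"postcode": 1.0}                     # hypothetical custom field

record_a = {"first_name": "Ana", "last_name": "Silva", "postcode": "AB1 2CD"}
record_b = {"first_name": "Ana", "last_name": "Silva", "postcode": "AB1 2CD"}

score_fixed_only = score(record_a, record_b, fixed_weights)
score_with_custom = score(record_a, record_b, combine_weights(fixed_weights, custom_weights))
```

With the fixed weights alone this pair scores 4.5 and stays below the threshold; with the custom field added it scores 5.5 and would be flagged.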
Editing any of the weights will present the save button at the bottom of this section. Clicking the save button will store these weights in DupliKit and refresh the duplicate list.
Adding custom fields and saving your changes will initiate a new sync of your DupliKit database. This may take some time to complete. Please check the DupliKit dashboard and wait for the sync to complete before making any additional changes to settings.
Beneath the weights is the additional report fields section:
When viewing duplicate records, it is possible to add more fields to the grid. This can make it easier to determine whether the records are indeed duplicates or two separate entities. If custom weight fields are being used, it may be sensible to select the same fields here, which allows for easy comparison in the duplicates review grid.