Matching Methods

Kleber has a number of matching methods that can create unique match keys for your data to help identify duplicates quickly and easily.

Kleber creates match keys in three varieties: Tight, Standard and Loose. Which one you use depends on the situation and on the business need behind finding the matches in the first place.

  • Tight keys allow for little difference between matches. Tight keys are useful for matching where no user interaction is available.
  • Loose keys identify many more matches, some of which may be questionable, but they help find the last few percent of matches that are difficult to detect. Loose keys should never be used without user interaction to verify matches.
  • Standard keys are a good balance between Tight and Loose keys and, depending on the outcome required, may be used without user interaction.
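
The way the three varieties fit together can be sketched in a few lines of Python. This is only an illustration: the field names `tight_key` and `loose_key`, the key values and the routing rule are assumptions, not part of Kleber. Records that share a Tight key are accepted automatically, while groups that only appear under the Loose key are queued for a person to verify.

```python
from collections import defaultdict

def group_by_key(records, key_field):
    """Group record ids that share the same non-empty match key."""
    groups = defaultdict(list)
    for rec in records:
        key = rec.get(key_field)
        if key:  # records without a key cannot be matched
            groups[key].append(rec["id"])
    # Only keys shared by two or more records indicate possible duplicates
    return [ids for ids in groups.values() if len(ids) > 1]

def route_matches(records):
    """Auto-accept Tight groups; send Loose-only groups to manual review."""
    auto = group_by_key(records, "tight_key")
    auto_sets = [set(g) for g in auto]
    review = [
        g for g in group_by_key(records, "loose_key")
        if not any(set(g) <= s for s in auto_sets)  # already fully accepted
    ]
    return auto, review

# Hypothetical key values, for illustration only
records = [
    {"id": 1, "tight_key": "T-AAA", "loose_key": "L-A"},
    {"id": 2, "tight_key": "T-AAA", "loose_key": "L-A"},
    {"id": 3, "tight_key": "T-AAB", "loose_key": "L-A"},
    {"id": 4, "tight_key": "T-ZZZ", "loose_key": "L-Z"},
]
auto, review = route_matches(records)
```

Here records 1 and 2 are accepted on the Tight key, and the wider Loose group that also pulls in record 3 is left for a reviewer.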

 EXPERT TIP

When working out which match key or keys to use from the many available, it is best to test the keys first on large quantities of data with someone verifying the results.

That way you can see which key or combination of keys best suits your business requirements.
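
One way to run such a test is to count how many duplicate groups each key produces on a sample before anyone starts verifying them. The sketch below assumes the keys have already been created and stored on each record under the hypothetical field names `tight_key`, `standard_key` and `loose_key`; the sample values are made up.

```python
from collections import Counter

def duplicate_stats(records, key_field):
    """Count duplicate groups, and the records in them, for one match key."""
    counts = Counter(r[key_field] for r in records if r.get(key_field))
    group_sizes = [n for n in counts.values() if n > 1]
    return {"groups": len(group_sizes), "records": sum(group_sizes)}

# Hypothetical sample with made-up key values
sample = [
    {"tight_key": "T1", "standard_key": "S1", "loose_key": "L1"},
    {"tight_key": "T1", "standard_key": "S1", "loose_key": "L1"},
    {"tight_key": "T2", "standard_key": "S1", "loose_key": "L1"},
    {"tight_key": "T3", "standard_key": "S2", "loose_key": "L1"},
    {"tight_key": "T4", "standard_key": "S3", "loose_key": "L2"},
]

summary = {
    field: duplicate_stats(sample, field)
    for field in ("tight_key", "standard_key", "loose_key")
}
for field, stats in summary.items():
    print(field, stats)
```

The counts show how much extra review work each looser key would create; whether the extra matches are genuine still has to be judged by the person verifying the results.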

What match keys are available in Kleber?

The match methods create the following match keys:

DataTools.Match.Address.Au.CreateKeys

  • Standard Address Key
  • Tight Address Key
  • Loose Address Key
  • Address Locality

DataTools.Match.BusinessName.CreateKeys

  • Standard Business Name Key
  • Tight Business Name Key
  • Loose Business Name Key

DataTools.Match.BusinessNameAndAddress.Au.CreateKeys

  • Standard Business Name and Standard Address Key
  • Tight Business Name and Standard Address Key
  • Loose Business Name and Tight Address Key
  • Standard Business Name Key
  • Tight Business Name Key
  • Loose Business Name Key
  • Standard Address Key
  • Tight Address Key
  • Loose Address Key

DataTools.Match.PersonNameAndAddress.Au.CreateKeys

  • Standard Person Name and Standard Address Key
  • Tight Person Name and Standard Address Key
  • Tight Person Name and Loose Address Key
  • Standard Person Name Key
  • Tight Person Name Key
  • Loose Person Name Key
  • Standard Address Key
  • Tight Address Key
  • Loose Address Key

Detailed explanations of the match keys

Standard Person and Standard Address matches

This is the base method for detecting duplicated people. It allows for missing or differing unit or level numbers, and for a balanced level of phonetic misspelling in the last name, street name or building name. In the example below the first name matches on initials only, i.e. “Alexandra”, “Alex” and “A” will all match as shown, as would “Anne” and “Albert”.
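
The behaviour described above can be illustrated with a rough Python sketch. This is not Kleber's algorithm: the `crude_phonetic` function, the choice of fields (including the postcode) and the key layout are all assumptions made purely for illustration. The key uses only the first initial, reduces the last name and street name to a crude phonetic code, and leaves the unit or level number out entirely so that a missing or different value cannot prevent a match.

```python
import re

def crude_phonetic(word):
    """Rough stand-in for a phonetic code: keep the first letter, drop
    later vowels (and H, W, Y), then collapse repeated letters."""
    letters = re.sub(r"[^A-Z]", "", word.upper())
    if not letters:
        return ""
    tail = re.sub(r"[AEIOUHWY]", "", letters[1:])
    return re.sub(r"(.)\1+", r"\1", letters[0] + tail)

def standard_person_address_key(first, last, street_number, street_name, postcode):
    """Illustrative key: first initial + phonetic last name + address parts.
    The unit or level number is deliberately not part of the key."""
    initial = first.strip()[:1].upper()
    return "|".join([
        initial,
        crude_phonetic(last),
        street_number.strip(),
        crude_phonetic(street_name),
        postcode.strip(),
    ])

# Made-up records: different first names sharing an initial, and
# spelling variations in the last name and street name
k1 = standard_person_address_key("Alexandra", "Smith", "12", "Main", "2000")
k2 = standard_person_address_key("Alex", "Smyth", "12", "Maine", "2000")
k3 = standard_person_address_key("A", "Smith", "12", "Main", "2000")
k4 = standard_person_address_key("Brian", "Smith", "12", "Main", "2000")
```

In this sketch `k1`, `k2` and `k3` are identical, while `k4` differs because the first initial differs. A first name such as “Anne” would also produce the same key as `k1`, which is why results from this kind of key usually benefit from review.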

Tight Person and Standard Address matches

This method keeps the same address rules as the Standard Person and Standard Address method, but restricts the differences allowed in the first and last names.

This is very useful when used alongside the Standard Person and Standard Address method, because it saves you manually checking these more certain results and lets you bring up only the more subjective results for review if desired.

In the example shown above only the first two records would match on Tight Person and Address. In the example below the first two records would be matched but the third record would require the standard Person and Address Method.

Tight Person and Loose Address matches

This method finds matches where the first and last names match as in the Tight Person and Standard Address method, but it loosens the address matching criteria. It detects duplicates where records have missing or differing street numbers, and it allows phonetic matching on street and building names. This is a good way of catching extra, tricky duplicates, and it is also used against other methods so that you only have to review the subjective results if desired.

In the example below the first two records would be matched with Tight Person and Address but the last record would only be found on the Tight Person and Loose Address Method.
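
The way the three person keys complement each other can be sketched as a simple pair classifier. The field names and key values below are hypothetical placeholders for keys that have already been created; the labels are one possible review policy, not something Kleber prescribes.

```python
def classify_pair(a, b):
    """Label a pair of records by the strictest person key they share."""
    def same(field):
        # A missing or empty key never counts as a match
        return bool(a.get(field)) and a.get(field) == b.get(field)

    if same("tight_person_std_address"):
        return "accept"
    if same("std_person_std_address"):
        return "review: names differ"
    if same("tight_person_loose_address"):
        return "review: addresses differ"
    return "no match"

def rec(tight_std, std_std, tight_loose):
    """Build a record holding the three hypothetical key values."""
    return {
        "tight_person_std_address": tight_std,
        "std_person_std_address": std_std,
        "tight_person_loose_address": tight_loose,
    }

# Made-up key values, for illustration only
a = rec("TP1-SA1", "SP1-SA1", "TP1-LA1")
b = rec("TP1-SA1", "SP1-SA1", "TP1-LA1")  # same person, same address
c = rec("TP2-SA1", "SP1-SA1", "TP2-LA1")  # name varies, same address
d = rec("TP1-SA2", "SP1-SA2", "TP1-LA1")  # same name, address varies
e = rec("TP9-SA9", "SP9-SA9", "TP9-LA9")  # unrelated record

labels = [classify_pair(a, other) for other in (b, c, d, e)]
```

Only the pairs labelled for review need a person to look at them; pairs that share the tight name key on a standard address can be set aside as the more certain results.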

Standard Business and Standard Address matches

This is the standard method used when you want to detect duplicate businesses.

This method looks at key elements of the business name and allows for blanks or differences in the unit or level number and a balanced level of misspellings in the street or building names.

Tight Business and Standard Address matches

The same address matching rules as above combined with tighter business name matching that includes extra differentiating components from the business name.

This is very useful when used alongside the Standard Business and Standard Address method, because it saves you manually checking these more certain results and lets you bring up only the more subjective results of the other business match methods for review if desired.

It is also used to detect duplicate businesses where you only want to group unique departments or business units together. In the records shown above only the top two records “Dodsun Print” would be matched on Tight Business and Address Match as the other two are differing divisions of the same organisation.

All the records in the example below would be matched on Business and Address Match, but only the top two would match on Tight Business and Address Match.

Loose Business and Tight Address matches

This allows for a balanced level of misspelling and phonetic variation in the main component of the business name, but requires tight address matching, where the unit or level number must be the same and the street or building name must be spelt the same.

Tight Address matches

The classic one-per-household matching method requires that the street or building name be the same, and that unit numbers be present and match.

This method is also used to pick out duplicates where the differences in person or business names are too vast to be matched with other matching methods.
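
A one-per-household selection based on such a key might look like the sketch below. The field name `tight_address_key` and the sample values are assumptions; the function simply keeps the first record seen for each key and leaves records without a key untouched.

```python
def one_per_household(records, key_field="tight_address_key"):
    """Keep the first record for each address key; keep unkeyed records."""
    seen = set()
    kept = []
    for rec in records:
        key = rec.get(key_field)
        if not key:
            kept.append(rec)  # no key, so it cannot be deduplicated
        elif key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept

# Made-up records sharing a hypothetical address key
people = [
    {"name": "A Smith", "tight_address_key": "ADDR-1"},
    {"name": "J Nguyen", "tight_address_key": "ADDR-1"},
    {"name": "P Jones", "tight_address_key": "ADDR-2"},
    {"name": "No Address", "tight_address_key": ""},
]
households = one_per_household(people)
kept_names = [p["name"] for p in households]
```

Keeping the first record is an arbitrary choice for the sketch; in practice you might prefer the most recently updated or most complete record in each household.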