Microsoft Idea

Ability to add secondary deduplication tiebreakers

Jordan Hoolachan on 4/28/2022 4:29:27 PM

We can currently choose selection rules like "most filled", "most recent", etc. when deciding how to pick the winning record during deduplication. However, particularly when using "most filled", ties are still possible because multiple records can be equally filled. As far as I can tell, the winner is then chosen at random, which could result in different winning records being chosen for the same deduplication group in subsequent runs on the same input files. It would be better if we could add secondary rules (e.g. alphabetical) so we could be sure we'd always get the same results from the same input files.
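
To illustrate the behaviour being requested, here is a minimal sketch in Python. It is not how the product works internally and uses no product API; the function names, the record layout (plain dicts), and the `tiebreak_fields` parameter are all assumptions made up for this example. It picks the most filled record and, when several records are equally filled, falls back to alphabetical order on the listed fields.

```python
from typing import Iterable, Sequence


def filled_count(record: dict) -> int:
    """Count the fields that hold a non-empty value (hypothetical 'most filled' measure)."""
    return sum(1 for value in record.values() if value not in (None, ""))


def pick_winner(group: Iterable[dict], tiebreak_fields: Sequence[str]) -> dict:
    """Return the most filled record; break ties alphabetically on tiebreak_fields."""
    def sort_key(record: dict):
        # Negate the count so that more filled records sort first,
        # then compare the tiebreaker fields as strings.
        tiebreak = tuple(str(record.get(field) or "") for field in tiebreak_fields)
        return (-filled_count(record),) + tiebreak

    return min(group, key=sort_key)


# Example deduplication group: the first two records are equally filled.
group = [
    {"id": "B2", "name": "Bob", "email": ""},
    {"id": "A1", "name": "", "email": "a@example.com"},
    {"id": "C3", "name": "", "email": ""},
]

winner = pick_winner(group, tiebreak_fields=["id"])
winner_reversed = pick_winner(list(reversed(group)), tiebreak_fields=["id"])
```

With a tiebreaker on a field that is unique within the group, the same record wins regardless of the order in which the input arrives. If the tiebreaker fields are also identical, the result still depends on input order, so the last tiebreaker would ideally be a unique key.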

STATUS DETAILS

Needs Votes

Administrator on 6/20/2022 11:31:37 PM

We are grateful for this input. The Customer Insights team would love to consider this idea, but ideally it should receive more votes first. We will make sure to track it over time and update on any change to its status.

Comments

RE: Ability to add secondary deduplication tiebreakers

Scott Stabbert on 2/29/2024 9:21:06 PM

Great suggestion. Would like to see more upvotes. This is on our backlog and will just need to be prioritized.

Category: Data Unification