
💡 Idea Description:


Currently, Dynamics 365 OCR relies on checksum (hash) comparison to detect duplicate invoice files. This approach fails when two files contain identical content but are generated separately, because each regenerated file produces a different hash. As a result, duplicate invoices can slip through undetected, leading to processing errors and inefficiencies.
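To illustrate the failure mode, here is a minimal sketch in Python. It assumes SHA-256 as the checksum and a made-up creation-date header as the only difference between two exports; the actual hash algorithm and file layout used by D365 are not stated here, and `file_checksum` is a hypothetical helper.

```python
import hashlib


def file_checksum(data: bytes) -> str:
    """Hex SHA-256 digest of the raw file bytes (assumed checksum algorithm)."""
    return hashlib.sha256(data).hexdigest()


# Same invoice content, exported twice. Only the (hypothetical) metadata
# header written at generation time differs between the two files.
invoice_text = "Invoice INV-1001\nTotal: 250.00 EUR\n"
export_a = b"CreationDate: 2024-01-01T10:00:00\n" + invoice_text.encode("utf-8")
export_b = b"CreationDate: 2024-01-02T09:30:00\n" + invoice_text.encode("utf-8")

# The byte streams differ, so a checksum-only check treats the files as unique.
checksums_match = file_checksum(export_a) == file_checksum(export_b)
```

Here `checksums_match` is false even though both files carry the same invoice, which is the gap this idea describes.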


Proposal: Introduce a secondary layer of duplicate detection based on file content analysis. This enhancement would allow D365 to:


  • Detect duplicates even when files are regenerated and have different checksums
  • Reduce manual intervention and invoice reconciliation errors
  • Improve overall accuracy and reliability of OCR processing


Why It Matters: In real-world scenarios, invoices are often regenerated or re-exported from ERP systems, especially during corrections or reprocessing. Despite having the same content, these files are treated as unique by D365 because their checksums differ. A content-aware duplicate check would catch these cases, improving invoice automation and reducing operational risk.


Suggested Implementation:


  • Use text extraction or semantic comparison to identify content-level duplicates
  • Provide a configurable threshold for similarity detection
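The two suggestions above could be sketched roughly as follows in Python. This is not how D365 works today and not a proposed final design: it assumes the invoice text has already been extracted by OCR, and it uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever similarity measure is chosen. All function names and the default threshold of 0.95 are hypothetical.

```python
import difflib
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def content_fingerprint(text: str) -> str:
    """SHA-256 of the normalized text; equal whenever normalized content matches."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means identical normalized text."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_content_duplicate(a: str, b: str, threshold: float = 0.95) -> bool:
    """Flag two extracted texts as duplicates at or above the threshold."""
    if content_fingerprint(a) == content_fingerprint(b):
        return True  # fast path: exact match after normalization
    return similarity(a, b) >= threshold


# Illustrative data only: a regenerated copy with different spacing,
# and an unrelated invoice.
original = "Invoice INV-1001\nTotal: 250.00 EUR"
regenerated = "Invoice  INV-1001\n\nTotal: 250.00 EUR "
different = "Invoice INV-2002\nTotal: 99.00 USD"

regenerated_is_duplicate = is_content_duplicate(original, regenerated)
different_is_duplicate = is_content_duplicate(original, different)
```

In this sketch the regenerated copy is flagged and the unrelated invoice is not. The fingerprint check handles exact content matches cheaply, while the threshold leaves room for small OCR variations; a lower threshold catches more near-duplicates at the cost of more false positives, which is why it should be configurable.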


If this feature would benefit your organization, please vote and share!

Category: Development
Status: New