I've noticed an inconsistency in how the SinkCreatedOn field is handled in the parquet/delta files that the Spark pool generates when it converts CSV files from Dynamics Finance and Operations Synapse Link. The field appears to update with every record change, which deviates from the standard practice where a createdon datetime field remains static after the record is created.
A consistent SinkCreatedOn field matters because it should accurately reflect the original creation date of a record, which is crucial for tracking when the record first entered the delta lake format. In addition, because SinkCreatedOn and SinkModifiedOn currently always hold identical values, one of the two columns is redundant. Multiplied across every record and table, this redundancy could result in a significant increase in storage usage, which is neither efficient nor cost-effective.
To align with the standard practices observed in other applications, I recommend that the SinkCreatedOn field not be altered after the initial creation of the record. This change would not only ensure data integrity but also optimize storage utilization within the Synapse Link project.
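To illustrate the behavior I'm suggesting, here is a minimal Python sketch. It is not how Synapse Link or the Spark pool is implemented: the function name upsert_record and the in-memory dict standing in for the delta table are hypothetical, and only the field names SinkCreatedOn and SinkModifiedOn come from the actual files. The idea is simply that an update carries the existing SinkCreatedOn forward and only refreshes SinkModifiedOn.

```python
from datetime import datetime, timezone


def upsert_record(sink, key, payload, now=None):
    """Insert or update a record in `sink` (a dict standing in for the table).

    Hypothetical helper for illustration only.
    SinkCreatedOn is set once, on first insert, and never changed afterwards.
    SinkModifiedOn is refreshed on every insert or update.
    """
    now = now or datetime.now(timezone.utc)
    existing = sink.get(key)
    # Keep the original creation timestamp if the record already exists.
    created_on = existing["SinkCreatedOn"] if existing else now
    sink[key] = {**payload, "SinkCreatedOn": created_on, "SinkModifiedOn": now}
    return sink[key]


# Example: the same record is written twice, a month apart.
sink = {}
t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 2, 1, tzinfo=timezone.utc)
upsert_record(sink, "A1", {"Name": "first"}, now=t1)
upsert_record(sink, "A1", {"Name": "second"}, now=t2)
```

After the second write, the record keeps the first timestamp in SinkCreatedOn while SinkModifiedOn reflects the latest change, so the two columns carry different information.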
Please consider this feedback for future iterations to enhance the project's data management strategy.