Microsoft Idea

Export To Data Lake

Sanjay Sonawane on 6/22/2020 9:06:12 AM

“Export to Data Lake” capability:

1. “Export to Data Lake” is a no-database solution, which is good, but the points below need to be considered from a consumption perspective. From a consumption point of view, the current DataExportService solution is easier to use than Data Lake.
2. We still do not see soft delete. We need to know when this will be available: which month and year?
3. Data is partitioned by the year value from the CreatedOn field of that entity. Partitioning of data should be optional rather than hard-coded at the design level.
a. Expected Behavior:
Data partitioning should be optional rather than a hard-coded design. This will increase usage among customers who do not want data to be partitioned.
b. Downstream System Pain Point:
Sending data to a downstream system is not straightforward. Downstream systems have to scan through all records from all partitioned folders and identify updates (either from the main entity folder or the Snapshot folder).
4. There is no header row in any CSV in the entity folder or under the snapshots folder. This makes it very difficult for a downstream system to align the data in each CSV with the schema of that entity from Model.Json.
5. Data is in CSV format. If a field contains a line feed, that single record is split across more than one line and is difficult to align. This is a setback from the downstream system point of view.
6. It would have been better if data were written as Key: Value pairs rather than CSV; this would help from both SQL and NoSQL consumption perspectives. (Bring some modern way of consumption rather than the last-century way of consumption 😊)
7. It is not easy to consume/view data. Customers must use Power BI, the CDM SDK, Azure Stack, or Azure Synapse. Consumption is not straightforward, and downstream systems have limited options to consume data.
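
To illustrate points 4, 5 and 6, here is a rough Python sketch of the work a downstream system currently has to do itself: take the column names from Model.Json, attach them to a headerless CSV, and turn each record into key/value pairs. It assumes Model.Json has an "entities" list whose items carry "name" and an ordered "attributes" list, and that fields containing line feeds are quoted in the CSV; if they are not quoted, a standard CSV parser cannot rejoin the split record. The function names and the sample data are made up for illustration.

```python
import csv
import io
import json


def attribute_names(model, entity_name):
    """Return the ordered column names for one entity in a parsed Model.Json."""
    # Assumed layout: {"entities": [{"name": ..., "attributes": [{"name": ...}]}]}
    for entity in model.get("entities", []):
        if entity.get("name") == entity_name:
            return [attr["name"] for attr in entity.get("attributes", [])]
    raise KeyError(f"entity {entity_name!r} not found in model")


def rows_as_dicts(model, entity_name, csv_file):
    """Yield each headerless CSV record as a {column: value} dict."""
    names = attribute_names(model, entity_name)
    # csv.reader keeps a quoted field with embedded line feeds as one record.
    for record in csv.reader(csv_file):
        if len(record) != len(names):
            raise ValueError(f"expected {len(names)} fields, got {len(record)}")
        yield dict(zip(names, record))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real export.
    sample_model = json.loads(
        '{"entities": [{"name": "account", "attributes": '
        '[{"name": "accountid"}, {"name": "name"}, {"name": "description"}]}]}'
    )
    sample_csv = io.StringIO(
        '1,Contoso,"line one\nline two"\n2,Fabrikam,plain\n', newline=""
    )
    for row in rows_as_dicts(sample_model, "account", sample_csv):
        print(json.dumps(row))
```

When reading a real file instead of the in-memory sample, open it with newline="" so the csv module sees the embedded line feeds unchanged. A header row or a key/value output format from the export itself would make this step unnecessary.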

STATUS DETAILS

Needs Votes