-
Provide an option to exclude individual fields that are used in a combined unification group
Suggested by Matthew Bennett – New – 0 Comments
"Combine a group of fields" is detailed here: https://learn.microsoft.com/en-us/dynamics365/customer-insights/data/data-unification-merge-tables
Currently, when you combine address fields into a group, the original fields still remain in the unification output (the customer profile table).
This can be confusing, as you end up with two addressline1 fields, two addressline2 fields, and so on: one that is combined and one that isn't.
Please can the original fields be either excluded or hidden from the unification result when they are used in a combined group?
-
Remove the 15 data source limit from "Combine a group of fields"
Suggested by Matthew Bennett – New – 1 Comments
Currently, "Combine a group of fields" has a limit of 15 data sources.
The group provides a way of keeping together fields in unification, such as address fields, that should not be treated as individual fields. The current limit means you would need to pick which data sources to leave out of the group, and their uncombined fields would then appear in the final unification output, which is problematic.
Please can this limit be removed so that a group can cover all of the data sources?
-
Improve Choice Column Handling in Dataverse Tables Ingested in CID
Suggested by Maryam Khoshkar – New – 0 Comments
Currently, columns of type Choice ingested from Dataverse into Customer Insights Data display numeric values instead of readable labels. It would be very helpful to support ingesting Choice columns with their human-readable labels and values for better usability.
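To illustrate what is being asked for, here is a minimal Python sketch of the mapping we currently have to do ourselves after ingestion. The option values, labels, column name, and the `add_choice_labels` helper are all made up for the example; none of this is a Customer Insights or Dataverse API.

```python
# Hypothetical label lookup for one Choice column; the values and labels are invented.
PREFERRED_CONTACT_LABELS = {1: "Email", 2: "Phone", 3: "Mail"}


def add_choice_labels(rows, column, labels):
    """Return copies of rows with a '<column>_label' field next to the numeric value."""
    labeled = []
    for row in rows:
        new_row = dict(row)
        # Unknown or missing values get a None label instead of raising.
        new_row[f"{column}_label"] = labels.get(row.get(column))
        labeled.append(new_row)
    return labeled


rows = [
    {"customerid": "C1", "preferredcontact": 2},
    {"customerid": "C2", "preferredcontact": 9},  # value with no known label
]
result = add_choice_labels(rows, "preferredcontact", PREFERRED_CONTACT_LABELS)
```

Keeping both the numeric value and the label side by side is the behavior requested above: the value stays available for joins and filters, and the label is there for readability.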
-
Extract Markdown from the binary .doc format
Suggested by Joseph-Sacha SCHUTZ – New – 0 Comments
Today, there is no satisfactory solution for a data scientist to extract text from old Word files (binary .doc, not .docx).
This is a big problem. We have millions of documents in this format, and it's very difficult to extract text from them.
- The Word API only works on Windows.
- OpenOffice doesn't handle parallelism very well.
- Antiword supports very few formats.
- Tika is a Java implementation that works well but remains limited.
However, the Doc binary format specification has been published:
https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-doc/ccd7b486-7881-484c-a137-51170af7cc22
The funny thing is that even tools like microsoft/markitdown don't support the .doc format. You have to use OpenOffice.
What we want is a command-line program, written in a compiled language (C++ / Rust), that extracts the text from a .doc file:
extract-doc file.doc > file.txt
If you can't do it, we can do it ourselves in collaboration with Microsoft. But it has to be done. This is an essential need in the age of LLMs.
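As a small illustration of the first step such a tool would need, here is a Python sketch that tells legacy binary files apart from .docx files by their container signature. The function names are hypothetical. It only checks the first bytes of the file: the compound-file signature is shared with other legacy Office formats such as .xls and .ppt, so a match does not prove the file is a Word document, and no text is extracted here.

```python
# Signature of the OLE compound file container used by binary .doc files.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature of a ZIP archive, the container used by .docx.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_container(header: bytes) -> str:
    """Classify a file by its first bytes: 'cfb', 'zip', or 'unknown'."""
    if header.startswith(CFB_MAGIC):
        return "cfb"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of a file and classify its container."""
    with open(path, "rb") as f:
        return sniff_container(f.read(8))
```

A batch job could call `sniff_file` on each document and route the "cfb" ones to whatever .doc extractor ends up being available, which is the part that is still missing.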
-
Customizable Layout & Activity Names in Customer Profile
Suggested by Maryam Khoshkar – New – 0 Comments
Add the ability to customize the layout of the Customer Profile page (for example, moving sections like Activity Timeline and Info) and let users rename activities. Currently, only the source table names are shown.