Hello
Recently I ran into a weird issue with the available aggregations for fields that contain some 'null' values in the data set. Let's say I have a numeric field which normally offers the following aggregations:
- Sum
- Average
- Count
- Min
- Max
and a few more.
But the list is much more restricted once the field's data contains even one 'null' value in a given data set (especially when it happens to be the first value in the set). In that case the pivot only leaves me with two options:
- Count
- Distinct Count
What could be the reason behind this? Is there anything I can do about it? I've done some research and tried to create a fix for it, which does work, but I'm not entirely sure how that change could affect the rest of the code - probably in a bad way. What I'm almost certain of is that the cause lies in the part that looks at the first row of the data set and evaluates lots of defaults for a given data source, like field types, the available aggregations mentioned above, etc. (the GetMembers() private method in the Controller).
It is worth mentioning that I'm also using a custom data server based on your example here (with a few changes here and there to fit my use case):
https://github.com/flexmonster/api-data-source
You can find more information about my implementation, including the code, in my previous support ticket.
Hello, Greg!
Thank you for reaching out to us.
Our example custom API server uses the GetSchema function in CubeController.cs, which builds the JSON response for the /fields request. As you correctly mentioned, data types and available aggregations are set based on the first JSON row. In our implementation, when a column's data contains `null` in the first row, the column's type is considered to be "string". The supported aggregations for fields of the "string" type are "count" and "distinctcount". That's why you see the limited aggregation set for a numeric field with null in the first row.
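For illustration, the first-row resolution described above can be sketched as follows. This is a minimal Python sketch of the idea, not the actual C# code from CubeController.cs, and the helper name `infer_type_from_first_row` is hypothetical:

```python
def infer_type_from_first_row(rows):
    """Naive inference: look only at the first JSON row, as described above."""
    types = {}
    for key, value in rows[0].items():
        if value is None:
            # A null carries no type information, so the column falls
            # back to "string" -- which is why only "count" and
            # "distinctcount" end up being offered for it.
            types[key] = "string"
        elif isinstance(value, bool):
            types[key] = "string"
        elif isinstance(value, (int, float)):
            types[key] = "number"
        else:
            types[key] = "string"
    return types

rows = [
    {"price": None, "city": "Oslo"},
    {"price": 12.5, "city": "Bergen"},
]
# "price" is really numeric, but the null in the first row demotes it:
print(infer_type_from_first_row(rows))  # {'price': 'string', 'city': 'string'}
```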
Currently, there are three data types available for the custom API - "string", "number", and "date". The list of available aggregations differs from data type to data type (please check the /fields request doc page for more information).
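Put together, the restriction behaves like a simple lookup from resolved type to allowed aggregations. The lists below are abridged from what is mentioned in this thread (the "date" entry is an assumption); please treat the /fields request doc page as the authoritative source:

```python
# Abridged, illustrative mapping -- not the complete lists from the docs.
AGGREGATIONS_BY_TYPE = {
    "number": ["sum", "average", "count", "distinctcount", "min", "max"],
    "string": ["count", "distinctcount"],
    "date": ["count", "distinctcount", "min", "max"],  # assumption
}

def available_aggregations(resolved_type):
    """Return the aggregations the pivot would offer for a resolved type."""
    return AGGREGATIONS_BY_TYPE.get(resolved_type, [])

# A numeric field misclassified as "string" loses sum, average, min, max:
print(available_aggregations("string"))  # ['count', 'distinctcount']
```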
Considering the information above, we recommend changing the type resolution logic. A possible solution may include explicitly defining the types for specific fields or even changing the parser's logic.
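The first suggestion, explicitly defining types, could look roughly like this. Again a Python sketch only; the override table and the function name `resolve_type` are hypothetical:

```python
# Hypothetical per-field declarations that take precedence over inference.
EXPLICIT_TYPES = {"price": "number", "orderDate": "date"}

def resolve_type(field, inferred):
    """Prefer an explicit declaration; fall back to the inferred type."""
    return EXPLICIT_TYPES.get(field, inferred)

# Even if a null in the first row made "price" look like a string,
# the explicit declaration keeps it numeric:
print(resolve_type("price", "string"))  # number
print(resolve_type("city", "string"))   # string
```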
Please let us know if further assistance is needed.
Best Regards,
Maksym
Hello Maksym,
Thanks for your help. I've managed to improve the evaluation of column types by extending the search for sample field values from the first row only to logic that performs a deep search, looking for the nearest 'non-null' value for each unique field key in the dataset.
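Roughly, the new resolution works like this (sketched in Python for illustration; my actual implementation is in C#, and the function names here are made up):

```python
def first_non_null_samples(rows):
    """For every field key in the data set, find the nearest
    (first) non-null value to use as a type sample."""
    samples = {}
    for row in rows:
        for key, value in row.items():
            if key not in samples and value is not None:
                samples[key] = value
    # Fields whose values are all null get no sample here and
    # would still need a fallback type (e.g. "string").
    return samples

def infer_types(rows):
    """Infer column types from the first non-null sample per field."""
    types = {}
    for key, value in first_non_null_samples(rows).items():
        if isinstance(value, bool):
            types[key] = "string"
        elif isinstance(value, (int, float)):
            types[key] = "number"
        else:
            types[key] = "string"
    return types

rows = [
    {"price": None, "city": "Oslo"},
    {"price": 12.5, "city": None},
]
# "price" now resolves to "number" despite the leading null:
print(infer_types(rows))  # {'city': 'string', 'price': 'number'}
```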
Best Regards
Hello,
Thank you for the update.
We are glad to hear that you found this solution.
Feel free to contact us if more questions arise.
Best Regards,
Maksym