
Problem handling a large dataset with Elasticsearch and the Data Server

Answered
Ricky Man asked on November 5, 2020

Hi, I have a report with quite a lot of aggregations on a large dataset (~14 GB), but loading time is a problem. Do you have any suggestions for improving performance when loading a large dataset?
 
I have tried Elasticsearch as the data source. Even after I increased the request timeout, the report still takes minutes to load.
 
Then I tried the Data Server with a local CSV file (also ~14 GB so far), but the process is killed when I start the server.
 
The logs look like this:
2020-11-05 06:41:37.1189|INFO|Flexmonster.DataServer.HostedServices.MonitorUserUpdateService|Monitor User Storage Service is running
2020-11-05 06:41:37.4189|INFO|Flexmonster.DataServer.Core.PrepopulatingCacheService|Prepopulation service start working
2020-11-05 06:41:37.4248|INFO|Flexmonster.DataServer.Core.DataStorages.DataStorage|Start loading index sample-index
2020-11-05 06:41:37.7646|INFO|Flexmonster.DataServer.Core.PrepopulatingCacheService|Index sample-index was loaded in 0.3443089 seconds
2020-11-05 06:41:37.7646|INFO|Flexmonster.DataServer.Core.DataStorages.DataStorage|Start loading index test-index
Killed
 
 
 Thanks!

1 answer

Mykhailo Halaida, Flexmonster, November 9, 2020

Hi Ricky,
 
Thank you for writing to us.
 
We would suggest narrowing down your dataset if possible – it is already rather large for a CSV, and it gets even bigger after being processed by the Data Server.
 
The Data Server then stores the processed data in your machine's RAM. The machine may simply not have enough memory for a dataset of this size, which would cause the process to be killed as shown in your log.
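
As one possible way to narrow the dataset down before pointing the Data Server at it, the CSV can be streamed row by row and written back with only the columns and rows the report actually needs. The following is a minimal Python sketch, not part of Flexmonster: the function name `narrow_csv`, the column names, and the file paths are all hypothetical, and it assumes a UTF-8 CSV with a header row.

```python
import csv
from typing import Callable, Dict, Optional, Sequence


def narrow_csv(
    src_path: str,
    dst_path: str,
    keep_columns: Sequence[str],
    row_filter: Optional[Callable[[Dict[str, str]], bool]] = None,
) -> int:
    """Copy src_path to dst_path, keeping only keep_columns and only the
    rows for which row_filter returns True. Returns the rows written.

    The file is processed one row at a time, so memory use stays small
    regardless of the size of the source file.
    """
    written = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)

        # Fail early if a requested column is not in the header.
        header = reader.fieldnames or []
        missing = [c for c in keep_columns if c not in header]
        if missing:
            raise ValueError(f"columns not found in source: {missing}")

        # extrasaction="ignore" drops every column not listed in fieldnames.
        writer = csv.DictWriter(
            dst, fieldnames=list(keep_columns), extrasaction="ignore"
        )
        writer.writeheader()

        for row in reader:
            if row_filter is None or row_filter(row):
                writer.writerow(row)
                written += 1
    return written


# Example call (hypothetical paths and column names):
# narrow_csv("full.csv", "narrowed.csv", ["country", "sales"],
#            row_filter=lambda r: r["year"] == "2020")
```

The narrowed file can then be used as the CSV source for the index instead of the full one. How much this helps depends on how many columns and rows the report can do without.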
 
We hope this helps.
 
Regards,
Mykhailo
