High RAM usage after caching with Flexmonster Data Server

Answered
Ken Wong asked on July 12, 2022

Hi,
Good day.
This post addresses two questions. We would like to check whether there is any way to:
1) Estimate the RAM usage based on the source file size.
2) Reduce the RAM usage (other than reducing the source data).
We are trying to load a 13 GB file into the cache with Flexmonster Data Server, and we noticed that RAM usage spiked to 700%-800% of the source file size.
This massive spike in RAM usage was completely unexpected, and we would like to know whether there is a way to estimate the expected RAM usage.
Below is a glimpse of the sample file format:
Unit Price,Quantity,Total Amount,Region,Branch,Department,Product,Category,Cashier,Receipt No,Group,Year,Month,Day,Transaction Date,SKU,Basket Size,Divison
100,100,1000,"ASIA","Branch A","FOOD","SKU10001 - Burger A","BURGER","CA10001","2022070100001","MAIN COURSE",7,31,"2022-7-31","SKU10001","","MAIN"
We have up to a hundred million lines of data, which brings the file size to 13 GB.
Any help is greatly appreciated.
Thanks.

8 answers

Public
Solomiia Andrusiv (Flexmonster) July 13, 2022

Hello, Ken!
 
Thank you for reaching out to us and for the detailed explanation of your use case.
 
Please find the answers to your questions below:
 
1. Estimate the RAM based on the source file size
From our experience, Flexmonster Data Server tends to use about 200%-300% of the original file size in memory.
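As a rough orientation only: for a 13 GB source file, that rule of thumb would put the expected footprint at roughly 26-39 GB of RAM. The exact figure depends on the data itself, for example on how many of the fields are strings and how many distinct values they contain.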
 
2. Check if there is any way to reduce the RAM usage (other than reducing the source data)
We suggest the following approaches to reduce memory usage:

  • Turn off KeepDataOnRefresh
    This property is enabled by default and keeps a copy of the index data during an index reload, which requires extra RAM while the refresh is in progress. A configuration sketch is shown right after this list.
  • Implement the custom data source API to dynamically generate the response without storing all the data in RAM.
    This is our custom protocol designed to pass the data from your own server implementation to Flexmonster in a ready-to-show format. A minimal sketch of the streaming idea follows below the list.
    You can find more information about the custom data source API in our docs: https://www.flexmonster.com/doc/introduction-to-custom-data-source-api/.
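For illustration, here is a minimal sketch of how turning the property off could look in the Data Server's flexmonster-config.json. Please treat the surrounding structure as an example only: the placement of KeepDataOnRefresh under "DataStorageOptions", the index name, and the file path are assumptions made for this sketch, so please verify them against the Data Server configurations reference.

{
  "DataSources": [
    {
      "Type": "csv",
      "Indexes": {
        "sales-index": { "Path": "data/sales.csv" }
      }
    }
  ],
  "DataStorageOptions": {
    "KeepDataOnRefresh": false
  }
}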

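To make the second suggestion more concrete, below is a small C# sketch of the general idea: stream the CSV and aggregate on the fly instead of materializing every row in memory. The class and method names here are hypothetical, and the response shaping is intentionally simplified; the actual request and response formats are described in the custom data source API documentation linked above.

using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: sums one numeric column grouped by one string column
// by streaming the CSV file instead of keeping every row in RAM.
public static class CsvStreamingAggregator
{
    public static Dictionary<string, double> SumByGroup(
        string csvPath, int groupColumn, int valueColumn)
    {
        var totals = new Dictionary<string, double>();

        using var reader = new StreamReader(csvPath);
        reader.ReadLine(); // skip the header row

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Naive split; a real implementation should use a proper CSV parser
            // to handle quoted fields such as "SKU10001 - Burger A".
            var cells = line.Split(',');

            var group = cells[groupColumn].Trim('"');
            if (!double.TryParse(cells[valueColumn], out var value))
                continue;

            totals.TryGetValue(group, out var current);
            totals[group] = current + value;
        }

        return totals; // only the aggregated result stays in memory
    }
}

A handler for the "select" request could then serialize aggregates like these instead of raw rows, so memory usage scales with the number of groups rather than with the number of lines in the file.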
 
Hope you will find this answer helpful. Feel free to ask if any further questions arise.
 
Regards,
Solomiia.

Public
Solomiia Andrusiv (Flexmonster) July 20, 2022

Hello, Ken!

Hope you are doing well.

Our team is wondering if you had some time to try the suggested approaches to reduce memory use. Could you please let us know if any of them works for your case?

Looking forward to hearing from you.

Regards,
Solomiia

Public
Ken Wong July 21, 2022

Hi Solomiia, 
Turning KeepDataOnRefresh off did reduce memory usage a little, but not by much. We tested with a 12 GB file, and it still took up to 60 GB of RAM, which is close to 500% of the file size.
We have not tried implementing our own caching with the Custom Data Source API yet, but given what we have observed, I believe this is the only way we can go.
If you have any other alternative solution to this, please let us know. Thanks.

Public
Solomiia Andrusiv (Flexmonster) July 21, 2022

Hello, Ken!
 
Thank you for your quick response.
 
We suggest considering the Elasticsearch data source as a possible alternative to implementing the Flexmonster custom data source API. Elasticsearch allows working with large datasets without loading them into RAM. Here is our guide for reference: https://www.flexmonster.com/doc/connecting-to-elasticsearch/.
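For reference, connecting the pivot to an Elasticsearch index is configured in the report's dataSource object, roughly along these lines (the node URL and index name below are placeholders):

"dataSource": {
  "type": "elasticsearch",
  "node": "https://your-elastic-host:9200",
  "index": "sales"
}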
 
Feel free to contact us if any additional details are needed about the custom data source API or Elasticsearch.
 
Regards,
Solomiia

Public
Solomiia Andrusiv (Flexmonster) July 28, 2022

Hello, Ken!

Hope you are having a great week.

Our team is wondering if you had some time to try the Elasticsearch data source. Could you please let us know if it works for your case?

Looking forward to your response.

Regards,
Solomiia

Public
Solomiia Andrusiv (Flexmonster) August 4, 2022

Hello, Ken!

Hope you are doing well.

Just checking in to ask if the Elasticsearch data source works for your case.

Looking forward to hearing from you.

Regards,
Solomiia

Public
Ken Wong August 5, 2022

Hi Solomiia,
We have not actually tried Elasticsearch yet, as we are still implementing our custom API with our own caching.
One thing we found out from the sample source code you provide for the custom API: storing the data retrieved from the CSV as Dictionary<string, dynamic> is probably what causes the 4-5x increase in memory size.
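To illustrate what we mean, here is a rough C# sketch (the field names are just examples from our data, not the Data Server's actual internals): keeping one Dictionary<string, dynamic> per CSV row pays for a hash table, boxed numeric values, and repeated key strings on every line, while a columnar layout with typed arrays stores each column compactly.

using System;
using System.Collections.Generic;

// Row-oriented storage: one dictionary per CSV line.
// Each row carries its own hash table, boxed numbers, and column-name keys.
var rows = new List<Dictionary<string, dynamic>>
{
    new() { ["UnitPrice"] = 100.0, ["Quantity"] = 100, ["Region"] = "ASIA" }
};

// Column-oriented storage: one typed array per column.
// Numbers stay unboxed, and repeated strings can be dictionary-encoded
// (an int code per row plus one table of distinct values).
double[] unitPrice = { 100.0 };
int[] quantity = { 100 };
int[] regionCode = { 0 };            // index into regionValues
string[] regionValues = { "ASIA" };  // distinct values stored once

Console.WriteLine(
    $"row count: {rows.Count}; columns: {unitPrice.Length} price(s), " +
    $"{quantity.Length} quantity value(s), region '{regionValues[regionCode[0]]}'");

Our guess is that holding something like the first shape for a hundred million lines explains most of the multiplier we are seeing.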

Thanks.
 

Public
Solomiia Andrusiv (Flexmonster) August 5, 2022

Hello, Ken!

Thank you for sharing your thoughts with us.

We have put this case in our backlog to research possible ways to optimize data storage.

You are welcome to contact us if you need any assistance with your custom data source API implementation.

Best regards,
Solomiia 
