Create file to access in program

In the program I've written, I stream data into a vector<struct> from files of up to 3 GB. Each time I run the program, it has to read the file and stream the data again. I'm still learning C++, but how could I create a library, maybe, to store this data as a vector<struct> and have the program use that file?

Not sure what steps to take.

Thanks
How often do you have to read the same 3GB file?
If the answer is "many times", there may be a benefit in saving to another file.

How much of the 3GB file do you throw away because it isn't necessary for your needs? For example, only 2 or 3 fields from a CSV file containing dozens of fields.
If the answer is "lots", there may be a benefit in saving to another file.

What variable types are in your struct?
If it's just int, double, char (and arrays of those things), writing to a file can be pretty simple.
But if you've got pointers, and C++ objects like std::string, things get more interesting.
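
For the simple case, here is a minimal sketch of dumping a vector of plain structs to a binary file and reading it back. The `Record` type, its fields, and the `save_records`/`load_records` names are made up for illustration; your own struct will differ. It assumes the file is written and read by the same build on the same machine, since padding and byte order are not portable.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical record type: only fixed-size, pointer-free members.
struct Record {
    std::int32_t id;
    double value;
    char tag[8];
};

// Raw byte copies are only valid for trivially copyable types,
// so a struct holding std::string or pointers would fail here.
static_assert(std::is_trivially_copyable<Record>::value,
              "Record must be trivially copyable for raw binary I/O");

// Write the element count, then the raw bytes of every record.
bool save_records(const std::string& path, const std::vector<Record>& records) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    const std::uint64_t count = records.size();
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    out.write(reinterpret_cast<const char*>(records.data()),
              static_cast<std::streamsize>(count * sizeof(Record)));
    return static_cast<bool>(out);
}

// Read the element count, size the vector once, then read all records in one call.
bool load_records(const std::string& path, std::vector<Record>& records) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::uint64_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof count)) return false;
    records.resize(static_cast<std::size_t>(count));
    in.read(reinterpret_cast<char*>(records.data()),
            static_cast<std::streamsize>(count * sizeof(Record)));
    return static_cast<bool>(in);
}
```

The idea would be to parse the original file once, call `save_records`, and on later runs call `load_records` instead of parsing. A real version should sanity-check the count before resizing, because a corrupt file could request a huge allocation.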

How often might you change struct to incorporate additional fields, or remove old fields?
Any change would invalidate pre-existing saved files, or at least make them harder to handle.

> to store this data as a vector<struct> and have the program use that file?
Serialisation is a minefield.
https://isocpp.org/wiki/faq/serialization

Whilst it's always possible to do it, the benefit might be overshadowed by the complexity of doing it.
It sounds like you already have something. Does it work? Is it too slow?
At 3 GB, and with a file your own code writes, this may be a good place to say 'this would be better in binary file form', if it is not in that form already. "Stream" seems to mean different things to different people, but if you are sending this over a network, or running on a machine with a poorly performing hard disk, you may also want to compress the data. It depends: compression loses your binary advantage if you wanted that, but it reduces the hardware bottleneck. You can also do a once-over pass to see if you are bloating the data (e.g. using double where float works, using int64 where a char would do, etc.).
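
As a rough illustration of the once-over for bloat, the sketch below compares two made-up layouts of the same record. Both structs and the `total_bytes` helper are hypothetical, and whether the narrower types are acceptable depends entirely on the precision and ranges your data actually needs.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical "before" layout: wider types than the data may need.
struct WideSample {
    double reading;
    std::int64_t count;
    std::int64_t flag;
};

// Hypothetical "after" layout, assuming float precision and smaller ranges suffice.
struct NarrowSample {
    float reading;
    std::int32_t count;
    std::int8_t flag;
};

// Bytes needed to store n records of a given size.
constexpr std::uint64_t total_bytes(std::uint64_t n, std::size_t record_size) {
    return n * static_cast<std::uint64_t>(record_size);
}
```

Printing `sizeof(WideSample)` and `sizeof(NarrowSample)` with your own compiler shows the real numbers, since padding is implementation-defined. Multiplying by the record count with `total_bytes` then gives an idea of how much disk and memory the change would save.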

A compiled file with the data in it is not going to magically fix anything. The computer still has to load all of that from disk into memory when your program loads, so all you are doing is moving the issue around a bit. It saves some code, but it may cost you elsewhere depending on what you end up doing. You can certainly pack the data into a library and call a function to get it if you want. If you do, keep the data separate from any code, so that you don't have to recompile or fix things when your data changes; you just swap out the library.