How do you persist data for the next execution of the program?


An application creates a file (whose name contains the date) at the start of each day and periodically writes binary data to it throughout the day. New data is appended to the existing file, and if the file is moved or renamed, the application creates a new file (with the same date) in its place.

I need to write an application to read new binary data after it's been written to the file.

Unlike the application, which is always running, my program will be invoked intermittently by Windows Task Scheduler. Because it isn't always running, it can't receive notifications about changes to the file and act on those changes.

So another approach is to have the program store the number of bytes it read in its current execution and read from that point onwards the next time it executes. To improve on this idea, my program could cache the current state of the binary file (which doesn't grow too large, ~100 KB) to do a byte-by-byte comparison the next time the binary file is read. Reading would then resume on the first byte that doesn't match.

I could write the byte offset (or the cached data) to a file, but I don't like this idea because it involves exposing the file to the user. If that file gets deleted, the previous state of execution is lost and the program would have to parse the binary file from the beginning (which is a nuisance but not the end of the world).
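For what it's worth, here is a minimal sketch of the "write the byte offset to a file" option, in Python purely for illustration. The names `state_path`, `load_offset`, `save_offset`, the `binreader` folder and the `state.json` file are all hypothetical. It assumes the state lives under the per-user `LOCALAPPDATA` directory (falling back to the home directory), and it treats a missing or corrupt state file as "start from byte 0", which matches the fallback described above.

```python
import json
import os
from pathlib import Path

def state_path(app_name="binreader"):
    # Hypothetical location: per-user app data on Windows, home directory otherwise.
    base = os.environ.get("LOCALAPPDATA") or str(Path.home())
    return Path(base) / app_name / "state.json"

def load_offset(path):
    # A missing or unreadable state file simply means "parse from the beginning".
    try:
        return int(json.loads(Path(path).read_text())["offset"])
    except (OSError, ValueError, KeyError, TypeError):
        return 0

def save_offset(path, offset):
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps({"offset": offset}))
    # Swap the finished temp file into place so a crash mid-write
    # does not leave a half-written state file behind.
    os.replace(tmp, path)
```

Keeping the state out of the data directory does not hide it from a determined user, but it does make an accidental delete less likely.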

What strategy would you folks take?
TBH, 100 KB might not be worth the trouble of trying to do anything fancy.
Especially if your "intermittent" means tens of minutes or more between runs.

> New data is appended to the existing file and if the file is moved or renamed,
> the application creates a new file (with the same date) in its place.
So taking an edge case of
- your program runs, and saves an offset
- the data file is deleted, and refilled by the other program to exactly that offset.
- your program runs again and does what?

The other place on Windows to store non-transient data is the Registry.
But that's a whole new can-o-worms.
Simple idea: keep track of where you were in the file and a hash (even a CRC) of the file up to where you have read it.
Open the file and hash up to your read position. Does it match? Yes: read from where you left off. No: read from the start, it's a new file.

You only need 2 values, both integers, which you can store in the registry or a little text file somewhere or whatever else. You can even write them back as the last few bytes of your own executable if you want to get snarky, though this is probably frowned upon in polite society (and it may not work anymore, but it used to).
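A rough sketch of the offset-plus-CRC check, in Python for brevity. The function name `read_new_bytes` and its return shape are made up for illustration, and it assumes the file is small enough (~100 KB) to read whole. `zlib.crc32` is the standard-library CRC-32; on the very first run you can pass offset 0 and CRC 0, since the CRC-32 of an empty prefix is 0.

```python
import zlib

def read_new_bytes(data_path, saved_offset, saved_crc):
    """Return (new_bytes, new_offset, new_crc) for the next run to store."""
    with open(data_path, "rb") as f:
        content = f.read()  # small file, so reading it whole is cheap

    prefix = content[:saved_offset]
    # Same file only if it is at least as long as before AND the bytes
    # we already consumed still hash to the stored value.
    same_file = len(prefix) == saved_offset and zlib.crc32(prefix) == saved_crc
    start = saved_offset if same_file else 0  # mismatch: treat it as a new file

    return content[start:], len(content), zlib.crc32(content)
```

A matching CRC is not proof that the prefix is identical, only strong evidence. If a false match would be costly, a longer hash from `hashlib` could be swapped in the same way.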
@Salem c
> - your program runs, and saves an offset
> - the data file is deleted, and refilled by the other program to exactly that offset.
> - your program runs again and does what?


Point taken. I suppose it's more robust to take into consideration the content of the file.

The binary file is a dumping ground for partial ascii/partial binary data. Something like this:

01031327SKY9958 2019-10-18 00:51:36 AXYS_SSP
$W5M5A,191018,005029,510401c449816173,10,10.1880N,64.0300W,0,12.66,3.4*07

01031327SKY9958 2019-10-18 00:59:47 AXYS_SSP
$W5M5A,191018,005930,510401c449816173,4,213.62,-27.19,28.15,29.33,30.30*22

01031327SKY9958 2019-10-18 01:00:30 AXYS_SSP
Œ  M Ž X¾$  2(JA 	>:ú @
 Ð      }¬²2 " þ  h¼Ò	 ÿ «P  €  
 '93    rSaõÿ 


Where each record, denoted by the header 01031327SKY9958 YYYY-MM-DD HH:MM:SS AXYS_SSP, comes from one of a few data sources (call each a sensor "output"). In this case, there are 3 unique kinds of outputs -- one relaying GPS coordinates, another from sensor 1, another from sensor 2. I only call it a binary file because it contains data that's meant to be parsed byte-wise (like the third output).

The ordering of data output is not fixed (i.e., sensor output is written asynchronously to the file). Also, one record might be garbled with another (considered to be garbage).

There are many strategies for parsing this file for the most recent record. What I was going to try first was:

Write a regex for the header's form and match anything in between two headers (or header and EOF) as data. That way, we can just parse the small file entirely for timestamp strings and use them as keys to determine whether new timestamps exist.
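Roughly what I have in mind, sketched in Python just to show the idea. The `HEADER` pattern and the `split_records` name are my own, and the pattern assumes the station ID and the `AXYS_SSP` suffix are always exactly as in the sample above, followed by an optional line ending. It works on bytes so the binary payloads pass through untouched.

```python
import re

# Assumption: every header looks exactly like the sample,
# "01031327SKY9958 YYYY-MM-DD HH:MM:SS AXYS_SSP" plus an optional newline.
HEADER = re.compile(
    rb"01031327SKY9958 (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) AXYS_SSP\r?\n?"
)

def split_records(content):
    """Return a list of (timestamp, payload) tuples in file order."""
    matches = list(HEADER.finditer(content))
    records = []
    for i, m in enumerate(matches):
        # The payload runs up to the next header, or to EOF for the last record.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(content)
        records.append((m.group(1).decode("ascii"), content[m.end():end]))
    return records
```

One caveat: two outputs can presumably land in the same second, so the timestamp alone may not be a unique key. Pairing it with the payload (or a hash of it) would be safer.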
@jonnin

> keep track of where you were in the file and a hash (even a crc) of the file up to where you have read it. open file, hash up to your read position, does it match? Yes: read from left off point. No: read from start, it's a new file.


That's a neat idea.

> you can even write them back as the last few bytes of your own executable if you want to get snarky, though this is probably frowned upon in polite society (and it may not work anymore, but it used to).


This is my kind of sinister but I don't want the next programmer to find out where I live and chuck a brick through my window.
yea I was mostly kidding. Put it in a little file, and if the user deletes it, that is the user's fault. If your requirement is to keep it safe from the user, that could get pretty involved… you can set it to read-only and toggle that when you need to use it, to prevent accidental deletes if you want a simple protection.
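Something along these lines for the read-only toggle, sketched in Python. The helper name `write_protected` is made up. It relies on `os.chmod` with `stat.S_IREAD` / `stat.S_IWRITE`, which on Windows only flips the file's read-only flag.

```python
import os
import stat

def write_protected(path, text):
    # Clear the read-only flag if the file already exists, so we can write to it.
    if os.path.exists(path):
        os.chmod(path, stat.S_IREAD | stat.S_IWRITE)
    with open(path, "w") as f:
        f.write(text)
    # Put the read-only flag back once the new state is on disk.
    os.chmod(path, stat.S_IREAD)
```

This only guards against casual accidents. Anyone who wants to can clear the flag and delete the file anyway.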