Direct File-write: insertion by stream

I'm not new, but I feel this is probably a simple thing to do.

I know how to replace a line by loading the file into a vector and rewriting the modified vector, but how do I do it directly with the stream?

All things considered, I want the program I am writing to take up as little memory as possible. It loads info relating to stuff that will (possibly) be loaded, and it will have to save directly to the file. This is not a project for school, but a personal endeavor. I would appreciate any help you can give me.

string temps = "";
out.open(file.c_str(), ios::[...?, out deletes everything... and app adds it to the end...]);

^
|
another thing I just realized... lol

I would appreciate all the help you give me.

The reason for this algorithm:

A data structure with a sub-structure represents an object. Since a member of this object has a data type that varies in length, and I want there to be little-to-no limitation on it, I do not want to load it all into memory at any one time.

... I feel that I'm going to hear something about iterators for this, lol.
A standard method would be to create a separate output file. Copy the input to the output, line-by-line. At the required point, insert the new line into the output file. Then continue to copy the rest of the input to the output.
If required, at the end, delete the original input file, and rename the output file.
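The copy-and-rename approach described above could be sketched roughly as follows. This is only a minimal illustration: the function name `insertLine`, the 0-based line numbering, and the `.tmp` suffix are assumptions made for the example. Only one line is held in memory at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>    // std::remove, std::rename
#include <fstream>
#include <string>

// Hypothetical helper: inserts `newLine` before line `lineNo` (0-based) by
// copying the file line by line into a temporary file, then swapping the files.
// If `lineNo` is past the last line, the new line is appended instead.
bool insertLine(const std::string& path, std::size_t lineNo, const std::string& newLine)
{
    assert(!path.empty());
    const std::string tmpPath = path + ".tmp";  // assumed temporary file name
    {
        std::ifstream in(path);
        std::ofstream out(tmpPath, std::ios::trunc);
        if (!in || !out) return false;

        std::string line;
        std::size_t current = 0;
        bool inserted = false;
        while (std::getline(in, line)) {
            if (current == lineNo) {
                out << newLine << '\n';
                inserted = true;
            }
            out << line << '\n';
            ++current;
        }
        if (!inserted) out << newLine << '\n';
        out.flush();
        if (!out) return false;
    }  // both streams are closed here, before the files are swapped

    // Delete the original, then give the copy the original's name.
    if (std::remove(path.c_str()) != 0) return false;
    return std::rename(tmpPath.c_str(), path.c_str()) == 0;
}
```

Calling `insertLine("notes.txt", 0, "hi")` would put "hi" in front of the first line. The cost is one full read and one full write of the file per call.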
Did I mention the sub-data structure is part of a list?

pseudo example:

#include <string>
#include <vector>
using namespace std;

struct structure {
    string data;  // ...
    vector<string> whatever;  // starts out empty
};

// this is what the file represents
struct object {
    vector<structure> stuff;  // starts out empty
};


There is good potential for the file to be very long, which would make your solution (one which I am hoping to avoid) extremely inefficient. Not to mention, I'm also hoping to avoid a lot of heavy file-writing.
The obvious alternative is to add the new data at the end of the file.
If you really need to insert data into the middle of a file, then you should probably be using some sort of database instead.
@Chervil
How experienced are you?

I do not need to insert, or add it to the end. I know how to do that, and both ways would require me to load the entire damn thing, or write a new file and delete the old one.

I need to access the file directly, and while you have been blabbering on, I have looked up a few helpful things I may be able to use.

While I eat my lunch, I will provide you, and any other people who are kind enough to help me, with a rephrased question:

How can I replace part of a file?

and

How can I delete characters in a file (as though they were never written to it)?

No loading into memory, no re-writing stuff. Just direct writing.

notes:

I know I can use seekg()/tellg() and seekp()/tellp() to get positions, but from there I need to erase/rewrite some of the data.

I have found a few functions and I may do some testing after I eat.
The original topic heading said "insertion", I took my cue from that.

If the requirement is to over-write some existing portion of the file with new values, that is relatively straightforward.
How can I replace part of a file?

Write over it, provided what you want to replace it with is the same size as what you're replacing.
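As a rough sketch of what "write over it" might look like with a stream: open the file with both `ios::in` and `ios::out` (plain `ios::out` truncates, `ios::app` always writes at the end), seek to the byte offset, and write the same number of bytes. The helper name `overwriteAt` is made up for this example.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: overwrites the bytes starting at `offset` with
// `replacement`. Nothing is shifted; the bytes already there are replaced
// one for one, so the replacement should be the same size as the old data.
bool overwriteAt(const std::string& path, std::streamoff offset, const std::string& replacement)
{
    assert(offset >= 0);
    // in | out opens an existing file without discarding its contents.
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;

    file.seekp(offset, std::ios::beg);
    file.write(replacement.data(), static_cast<std::streamsize>(replacement.size()));
    file.flush();
    return static_cast<bool>(file);
}
```

For a file containing "hi\nhow\nare\n", `overwriteAt(path, 3, "why")` would turn the second line into "why" and leave everything else alone. If the write runs past the current end of the file, the file simply grows.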


How can I delete characters in a file (as though they were never written to it)?

You cannot.


I do not need to insert, or add it to the end. I know how to do that, and both ways would require me to load the entire damn thing, or write a new file and delete the old one.

Adjust the format of the file so you can do so (or use a database which already supplies this functionality as already recommended and summarily dismissed.)
@Chervil
I apologize for the miscommunication.

---------------------------------------------

Here is exactly what I want to do:

I want to be able to completely replace a single line in a file, without writing a new file or loading all of the data. All data from that line will be 'erased' and replaced with the new data.

Ex:

before file:

hi
how
are
you
today?


Replace hi with hello:

hello
how
are
you
today?


remove the first line:

how
are
you
today?


@cire
I do not understand what you mean by 'format the file'. Would writing the data to it in a certain way allow me to delete it or something?
@IWishIKnew No problem. I realise my answers have been relatively brief and have not fully addressed all your concerns.

I think "format the file" simply means give the file a strictly controlled layout. In the simplest case, each set of data would be of a fixed length, possibly padded with spaces or binary zeros in order to make it a fixed length.

That way, when the new values are written to the file, it will replace the existing data (again, possibly padding with spaces etc which will cover the previous values).

There's an example shown for the seekp function.
http://www.cplusplus.com/reference/ostream/ostream/seekp/


I tried a case with the file opened for input and output in binary mode, something like this:
fstream theFile(filename, ios::in | ios::out | ios::binary);
These examples you gave were problematic.
The word "hello" is longer than the word "hi" so there's no room to do that.
Removing the first line in a simple way isn't possible.

However, you could consider each line as a fixed-length string, and use the normal convention of adding a null terminator to mark the end. Then the word "hi" could look like this:
"hi\0\0\0\0\0\0\0\0"

Hello would look like this:
"hello\0\0\0\0\0"

and the deleted line would look like this:
"\0\0\0\0\0\0\0\0\0\0"
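A minimal sketch of that fixed-length layout, assuming 10-byte slots to match the examples above. The names `kRecordSize`, `writeRecord` and `readRecord` are invented for illustration, and the stream is expected to be open with ios::in | ios::out | ios::binary.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

constexpr std::size_t kRecordSize = 10;  // assumed slot width, as in the examples above

// Writes `text` into slot `index`, padded with '\0' up to kRecordSize.
// Writing an empty string marks the slot as deleted.
bool writeRecord(std::fstream& file, std::size_t index, const std::string& text)
{
    assert(file.is_open());
    if (text.size() > kRecordSize) return false;  // does not fit in one slot

    std::string slot(kRecordSize, '\0');
    slot.replace(0, text.size(), text);

    file.clear();  // reset eof/fail state left over from an earlier read
    file.seekp(static_cast<std::streamoff>(index * kRecordSize), std::ios::beg);
    file.write(slot.data(), static_cast<std::streamsize>(slot.size()));
    file.flush();
    return static_cast<bool>(file);
}

// Reads slot `index` and returns its text up to the first '\0'.
std::string readRecord(std::fstream& file, std::size_t index)
{
    assert(file.is_open());
    char slot[kRecordSize + 1] = {};  // the extra byte guarantees termination

    file.clear();
    file.seekg(static_cast<std::streamoff>(index * kRecordSize), std::ios::beg);
    file.read(slot, static_cast<std::streamsize>(kRecordSize));
    return std::string(slot);
}
```

With this layout, replacing "hi" with "hello" is a single 10-byte write at offset `index * kRecordSize`, and the rest of the file is never touched. The trade-off is that no line can be longer than the slot.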
Each line has to be able to vary in length, and the number of lines must be able to vary.

@Chervil
Interesting. Does that literally write (using the '\') in binary? (in other words, when you open the file, you don't see the 1's and 0's because it's being 'translated' from binary to text)
The "\0" is just the C++ way of representing a character code of zero, which is not the same as the digit zero "0".
If you view the file using a hex editor for example, you will see 00 (hex) which is a single character, in the same way that the digit '0' would appear as 30 (hex).
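To see the difference for yourself, a small helper along these lines prints a character the way a hex editor would show it. The name `toHex` is made up for this example, and the 30 (hex) value assumes an ASCII-compatible character set.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats one byte as two upper-case hex digits,
// the way a hex editor would display it.
std::string toHex(char c)
{
    char buf[3];
    std::snprintf(buf, sizeof buf, "%02X",
                  static_cast<unsigned>(static_cast<unsigned char>(c)));
    return std::string(buf);
}

// Checks the two values discussed above (assumes ASCII).
void checkZeroBytes()
{
    assert(toHex('\0') == "00");  // character code zero
    assert(toHex('0') == "30");   // the digit zero
}
```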

I see the requirements:
Each line has to be able to vary in length, and the number of lines must be able to vary.

An update to an ordinary sequential file with those requirements suggests one of two approaches:

1. Allow extra blank space to accommodate possible changes.
    advantage - simple and fast to implement
    disadvantage - file may be much larger than necessary.
    may not be flexible enough.

2. Apply the updates by rewriting the entire file.
    advantage - file size is kept as small as possible
    disadvantage - performance may be slow. Real-world systems often store up a batch of changes in a transaction file, and apply them all at once, avoiding excessive file i/o.


There are other options:
3. See for example, "Random Access with Variable Length Records" here:
http://cplus.about.com/od/learning1/ss/random_7.htm
Personally I'd avoid this as the programming overhead could be significant.
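To give an idea of the overhead involved, here is one possible sketch of variable-length records. It is not necessarily the scheme the linked article uses. The assumption is that records are only ever appended to the data file, and a small index of (offset, length) pairs says which bytes are live. The class name `RecordFile` and all of its members are invented for this example, and the index is kept in memory only, so a real program would have to save it as well.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Where one record lives inside the data file.
struct Entry {
    std::streamoff offset;
    std::size_t length;
};

// Hypothetical class: an append-only data file plus an index of live records.
class RecordFile {
public:
    explicit RecordFile(const std::string& path)
    {
        // Create the file if it is missing, then reopen it for reading and writing.
        std::ofstream(path, std::ios::app | std::ios::binary);
        file_.open(path, std::ios::in | std::ios::out | std::ios::binary);
        assert(file_.is_open());
    }

    // Appends `text` and returns its record number.
    std::size_t add(const std::string& text)
    {
        index_.push_back(append(text));
        return index_.size() - 1;
    }

    // "Replaces" a record: the new text is appended and the index repointed.
    void replace(std::size_t id, const std::string& text) { index_.at(id) = append(text); }

    // "Deletes" a record: it is dropped from the index, later records move up
    // by one, and the old bytes stay in the file as dead space.
    void erase(std::size_t id) { index_.erase(index_.begin() + static_cast<std::ptrdiff_t>(id)); }

    // Reads back exactly the bytes the index entry points at.
    std::string get(std::size_t id)
    {
        const Entry e = index_.at(id);
        std::string text(e.length, '\0');
        file_.clear();
        file_.seekg(e.offset, std::ios::beg);
        file_.read(&text[0], static_cast<std::streamsize>(e.length));
        return text;
    }

    std::size_t size() const { return index_.size(); }

private:
    // Writes `text` at the end of the file and reports where it landed.
    Entry append(const std::string& text)
    {
        file_.clear();
        file_.seekp(0, std::ios::end);
        const Entry e{static_cast<std::streamoff>(file_.tellp()), text.size()};
        file_.write(text.data(), static_cast<std::streamsize>(text.size()));
        file_.flush();
        return e;
    }

    std::fstream file_;
    std::vector<Entry> index_;
};
```

Lines can then be any length and can be replaced or removed without rewriting the file, but the file keeps growing until the dead space is compacted, which again means rewriting it.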


Another possibility exists:
4. Use a file organisation which allows random access - such as a database.
    advantage - completely flexible as to both the variation in size and the ability to update any part without affecting the rest. Fast.
    disadvantage - requires some sort of database to be set up. Also requires some knowledge of how to access the database, e.g. using SQL.
AFAIK std::fstream doesn't load the entire file into memory.
@naraku
Yes, I know it doesn't. The problem is that what I want involves the following two steps, which cannot be avoided:

1. Select where to input data
2. replace existing/append new data

Obviously, the easiest way is to load it all into a vector, sort it into a data structure, unsort it into a vector, and write the vector. This means that basically all you're doing for saving/loading is going through a list.

The other way, which I'm implementing right now, is to identify the data loaded in memory, identify what is in the file, and write any unloaded data into the new file, while recognizing when we are to write loaded data that replaces old data we are reading. Then, when we have everything saved, we delete the old file and rename the new file to the old file's name.

The first way basically takes up a lot of memory (hypothetically, let's just say we have 10 MB...), while the latter writes directly to the disk, still allowing data to be selected and 'picked' out, at the expense of the hard disk drive.
_____________________________________________________________________

Just an FYI: This is a note-taking program. I'm writing it for college (in my spare time; it's good practice and will be very useful, and maybe I can sell it) because it would be quite awesome to have all my notes saved for every year, and be able to pull them up very quickly. That is the goal of this program. The reason I want it to be memory efficient is that I'm writing it for the long term.

I do suppose I could break up a journal into separate files... so, say we want note[x], and it isn't in file 1, so we look in file 2 to load it, and when we save it, we don't rewrite the whole journal, just the file the note was in... I could implement a file-size limit, but that would probably be something I would want to do after the program is done (something for afterward as an improvement, maybe).

Thank you Chervil for your help/advice, I very much appreciate it.