Need help coming up with approach

What about the case where you want to access data linearly, one by one, and don't need to manipulate the data as a whole (sorting, for example)?

Then, to get the elements, you would either read from the file or index into the vector, one by one.
Is reading from a file more expensive than reading from a vector index?

BTW we're not talking about my program anymore.
Reading from rotating disks is still the slowest thing you can do on a computer.
Flash disks are much faster, and I expect them to get close to RAM speeds soon, if they are not there already (I am slightly out of the loop on those).

Vectors live in RAM, unless your machine has decided to shovel them to the disk via virtual memory/swap file management. It only does that if you run low on memory, so try not to push out to the bleeding edge of your available RAM.
> Is reading from files more expensive than reading from vector index?

Reading the same data repeatedly from a file would be quite slow (unless the OS has cached the data of interest).

If the original data is in a file, and we need to read each item just once, in sequence, directly read it off the file. We would have to read it once from the file anyway to get it into a vector.
> Reading the same data repeatedly from a file would be quite slow (unless the OS has cached the data of interest).


I meant we're reading data line by line from the file. We're not repeatedly reading the file. Just reading the data once to output it. So in that case reading from files is more expensive than reading from RAM?

Can reading from a file be more expensive than reading from an input buffer? Just a thought.
Your question is hard to follow.
If you have data in a file and you want to display it, you MUST read that file. You may choose to store it, or you can just display and discard it, but regardless, you must read the file. There may be some tricks that avoid doing this explicitly (memory mapping), but at some point, somewhere, somehow, the file must be moved from disk to memory, whether it's a RAM disk or whatever other trick you used.

If you just want to display and discard, reading the whole file into one buffer and then writing that buffer to the screen is fastest. If you want to use the data from the file in processing, you probably need to store it in a data structure, which does cost some processing time, but relative to (a rotating) disk it's small. Not sure what "input buffer" means; do you mean cin? No one can type faster than the machine can read a file and spew it back out. If you mean something else, how did the data get into the buffer? If it was from the file, we are back to 'must read file'...
> we're reading data line by line from the file. Just reading the data once to output it.
> So in that case reading from files is more expensive than reading from RAM?

No. Directly reading from the file is obviously more efficient.
Read from file, output what was read.
Read from file into a vector, then read from vector, output it.

We can directly dump the file buffer to stdout:

#include <iostream>
#include <fstream>

int main()
{
    // replace __FILE__ with the path to the actual file to be read
    const char* const file_name = __FILE__ ;
    std::cout << std::ifstream(file_name).rdbuf() ;
}

http://coliru.stacked-crooked.com/a/55c3875b21b79654
Okay so using files is worse because you have to input data from the file INTO a variable. I thought that you could directly read lines from the file to stdout like so:
 
while(getline(file_name, stdout))


But guess not, or is there a function that writes from file to file?

But I did not understand you guys' replies. I think there is confusion.

// I initially thought this (it does not compile: getline cannot write to stdout):
ifstream file("list");

while (getline(file, stdout)) { /* do stuff */ }

// VS

vector<string> words = { /* elements */ };

for (size_t i = 0; i < words.size(); i++) { /* do stuff */ }


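// How it would look now that I know the above is not possible
// (needs <fstream>, <string>, <vector> and using namespace std)
ifstream file("list");   // ifstream, not ofstream: the file is being read
string line;

while (getline(file, line)) { /* do stuff */ }

// VS

vector<string> words = { /* elements */ };

for (size_t i = 0; i < words.size(); i++) { /* do stuff */ }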



In the above example, the file method involves copying data into a variable and outputting it, which, granted, is more expensive than directly picking a vector index and displaying that.

But with the file you are only allocating memory for 1 variable whereas for the vector you're allocating for a lot of variables.

So if you have to work linearly with a lot of data, then surely files are a better option, no?
When you want to hide data from the user, Windows allows you to literally "hide" directories.
But what's a cross-platform way to write data so that the user cannot access it, i.e. to make it invisible to them?


Btw long text ahead:

Back on topic, I think I've thought of a better way to do this game. I'm scrapping the previous code and starting over.

Here's my thought. I would really appreciate suggestions, or somebody telling me whether it's a good idea.

Instead of operating on and sorting the given data, as we do now, how about doing no calculation or sorting at all and relying on a network of directories and files?

(IS IT A BAD IDEA TO have hundreds of thousands of file directories inside a file?)

So, after asking the user how many letters the word has, the computer asks whether the word contains a particular letter (I will come to how this letter is chosen). If it doesn't, the program enters a directory whose subdirectories contain text files of words without that letter. If it does, it enters the other of the two possible directories, which again contains subdirectories with text files.

This loops until each subdirectory (a folder) has only ONE WORD. Which is the answer.

(How can I reduce the number of directories that have to be written?)

The particular letter to be asked would be the one with highest frequency if we weren't worried about randomness.

So in this way we only have to write the file directory once and from that point the program does not need to do ANY calculation of any sort. It just needs to prompt the user, enter the directory and fetch answer.

Now, if we were to add randomness to which letter the user is asked about, instead of always using the highest-frequency letter we can note the top 5 letters by frequency (5 possible letters to choose from) and, at run time, generate a random number and pick one of those 5.

Similarly this is done throughout all directories and subdirectories.

But having to pick from 5 would mean having more directories. So 2 or 3 may be a better idea.
> what's a cross-platform way to write data and not have the user access it

Encrypt the data.

> (IS IT A BAD IDEA TO have hundreds of thousands of file directories inside a file?)
Do you mean inside a directory? In other words, lots of subdirectories. This can be a problem, depending on the file system that you're using. It can also take up a lot of space, again, depending on the file system. Each directory will occupy some minimum amount of space (typically somewhere between 512 bytes and 8k).

It would probably be more efficient to code this as a decision tree in a single file.

The big advantage of this sort of thing is that the amount of code needed to play the game is very small. For a long time, RAM was expensive so programmers did tricks like this to fit their programs into available RAM. Today it isn't so bad.

Can I start reading a file from a specific line?

> It would probably be more efficient to code this as a decision tree in a single file.

But the thing is, if the tree were statically declared in the program itself, you could not update your list of words without having to rewrite parts of the code. So the tree should be outside the program so that when we want to update the tree we can just ask the program to recalculate the tree.

But I'm not sure that's what you meant.

If I can start reading a file from a specific line without having to read the lines before it then that would be great and maybe I can use this.

For now I'm stuck at trying to download HTML properly (for downloading the initial word data).
See this thread: http://www.cplusplus.com/forum/general/247659/

I'm having issues with "encoding", apparently. Is there really no way I can properly download HTML with C++?

> Encrypt the data.

Encrypting won't be useful because we're not trying to hide the words from the user; the game is supposed to support all words. It's more about preventing the user from deleting or tampering with the file. There is no use encrypting if the user types random stuff into the file. If he does that, it will cause a lot of problems for the program, and the program couldn't detect it.

So is there no way to actually prevent the user from modifying the file?

Should I save the text file in the temp folder?
Windows has "GetTempPath" to give the path of the temp folder, but what about other operating systems? How do I get the temp path there?
> Can I start reading a file from a specific line?
No, you can't, because you don't know where a line starts without reading what comes before it. The normal approach would be to read all the lines into a vector. Then you can access each line by an index.

> It's more about preventing the user from deleting or tampering with the file.
One option on Windows would be to use a resource file which gets linked into the .exe
However a skilled user could use a resource editor to modify the text inside the .exe

Why isn't the user allowed to do it? If it is his/her program he/she might want to use her own word lists.
Eventually I want to add a UI to the program itself to allow the user to see and edit the words. Not having an easy-to-use GUI in C++ is a big problem, so I will have to use the console and Windows functions.

I just want to prevent the user from messing up the words by mistake. Typing an extra character to an existing word will ruin that word. Typing a new word with a number might make the program never end.

One can only write so many exceptions (eg. if word has number then that's not normal). Ultimately you cannot check whether the user had edited the file or not, at least not without using another file.

Ah yes resource file. That's perfect. Will google.
I guess I'll throw trying to scrape HTML with C++ in the bin for now. I'll post any updates on the code over here.
