Reading/writing a file on multiple threads

I'm currently working on a server that handles clients for a 2D online game. I wrote some regular fstream code for handling the file that stores their information, and I was about to implement it in the server when I realized there might be a problem with having the file open multiple times concurrently. So I googled it and came up with posts like

http://stackoverflow.com/questions/1632470/accessing-a-single-file-with-multiple-threads

http://stackoverflow.com/questions/1576187/can-createfile-open-one-file-at-the-same-time-in-two-different-thread

I'm wondering if I can just treat the file like everything else, or will I have to do something specific to open it from multiple threads?

P.S. I did read those posts, but I'm very new to multithreading.
I appreciate any insight.


closed account (zb0S216C)
As far as I know, the OS doesn't provide mutually exclusive access to a file between threads/processes. The obvious solution is to introduce a synchronisation device (e.g. a mutex). Each thread would attempt to "lock" the mutex before performing I/O operations on the file. When a thread is finished with the file, it "unlocks" the mutex, allowing the next thread to gain exclusive access to the file.

If you don't know already, a "mutex" is a simple construct that threads use to synchronise access to a common resource shared by multiple threads. In a single-threaded environment a mutex isn't necessary, but in multi-threaded environments mutexes are crucial (unless you write lock-less code).

Typically, a mutex, or any other synchronisation device, is implemented with atomic operations. "Atomic" operations are CPU instructions that prevent other CPU cores from accessing a memory location while one core reads or writes it. Most operating systems provide some form of mutex.

Anyway, back to the question at hand. Yes, it's possible to have synchronised access to a common resource.
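For illustration, here's a minimal sketch of the idea using C++11's std::mutex (the function and file name are placeholders I've made up, not anything from your code):

#include <fstream>
#include <mutex>
#include <string>

std::mutex fileMutex; // guards all access to the shared file

// Any thread that wants to write calls this; the lock_guard locks the
// mutex on entry and unlocks it automatically on return
void WriteRecord(const std::string& line)
{
    std::lock_guard<std::mutex> lock(fileMutex);
    std::ofstream file("players.txt", std::ios::app); // placeholder file name
    file << line << '\n';
} // mutex released here; the next waiting thread gains exclusive access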

Wazzak
Thank you for the response!
I did a bit of looking around and found a site that was using Windows critical sections. Would simply putting this around the resource I want to restrict do the job?

#include <windows.h>

CRITICAL_SECTION critSec;

// setup (once, before any threads are started)
InitializeCriticalSection(&critSec);

// in each thread
EnterCriticalSection(&critSec);
// usage of resource here
LeaveCriticalSection(&critSec);
// end thread

// cleanup (once, after all threads have finished)
DeleteCriticalSection(&critSec);
Could I just check a few things:

1. you're working on Windows (going by the references to CreateFile)
2. you have a thread per client
3. each client thread needs to write to the same file

Yes?

Andy
That is correct.
One thing you could do is use a queue-based approach. You create a writer thread which owns the file. The other threads queue information to be written, and the writer thread dequeues the information and writes it to the file. This means that only the queue and dequeue operations (which are fast) need to be protected by the critical section.
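Roughly along these lines (a sketch only; the names and file are placeholders, error handling omitted):

#include <windows.h>
#include <fstream>
#include <queue>
#include <string>

std::queue<std::string> workQueue; // shared between client threads and the writer
CRITICAL_SECTION queueLock;        // protects workQueue only, not the file
HANDLE dataReady;                  // auto-reset event, signalled when work arrives

// Client threads call this; only the fast push is done under the lock
void QueueLine(const std::string& line)
{
    EnterCriticalSection(&queueLock);
    workQueue.push(line);
    LeaveCriticalSection(&queueLock);
    SetEvent(dataReady);
}

// The writer thread owns the file; no other thread ever touches it
DWORD WINAPI WriterThread(LPVOID)
{
    std::ofstream file("players.txt", std::ios::app);
    for (;;)
    {
        WaitForSingleObject(dataReady, INFINITE); // sleep until there's work
        for (;;)
        {
            EnterCriticalSection(&queueLock);
            if (workQueue.empty()) { LeaveCriticalSection(&queueLock); break; }
            std::string line = workQueue.front();
            workQueue.pop();
            LeaveCriticalSection(&queueLock);
            file << line << '\n'; // the slow I/O happens outside the lock
        }
        file.flush();
    }
    return 0;
}

// Setup, once, before starting any threads:
// InitializeCriticalSection(&queueLock);
// dataReady = CreateEvent(NULL, FALSE, FALSE, NULL); // auto-reset
// CreateThread(NULL, 0, WriterThread, NULL, 0, NULL);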

Andy
Okay, I will look into that. I've heard of a queue-based system like that in a few tutorials.

Would what I posted before work though? Simply calling EnterCriticalSection before resource handling and LeaveCriticalSection after?
If by resource you mean the Visual C++ fstream, then yes.

Apparently the C++ standard does not make any guarantees, but Microsoft (and some other implementations) do so.

The advantage of the queue-based approach is that threads are blocked for less time.

Andy

Thread Safety in the Standard C++ Library
http://msdn.microsoft.com/en-us/library/c9ceah3b.aspx
Okay, thanks.
By the way, the idiomatic way to handle multiple sockets is not to spawn one thread per connection. Instead, you use asynchronous I/O to handle all the sockets using a pool of threads. Of course, this approach is system specific.

The Windows SDK includes a sample (overlap) which shows one way to do this (it's in C):
C:\Program Files\Microsoft SDKs\Windows\v7.0\Samples\netds\winsock\overlap
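The core of the approach looks roughly like this (just a skeleton under my assumptions; socket setup and the actual WSARecv/WSASend calls are omitted):

#include <winsock2.h>
#include <windows.h>

// A small pool of worker threads all block on the same completion port
DWORD WINAPI Worker(LPVOID port)
{
    DWORD bytes;
    ULONG_PTR key; // the socket, stored when it was associated with the port
    OVERLAPPED* overlapped;
    while (GetQueuedCompletionStatus((HANDLE)port, &bytes, &key,
                                     &overlapped, INFINITE))
    {
        // An overlapped operation on socket 'key' has completed; process
        // the 'bytes' received, then issue the next overlapped WSARecv
    }
    return 0;
}

// Setup (error handling omitted):
// HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
// ...then, for each accepted socket:
// CreateIoCompletionPort((HANDLE)sock, iocp, (ULONG_PTR)sock, 0);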

Also see:
A simple application using I/O Completion Ports and WinSock
http://www.codeproject.com/Articles/13382/A-simple-application-using-I-O-Completion-Ports-an

Andy

High Performance I/O on Windows
http://int64.org/2009/05/13/high-performance-io-on-windows/
Okay, so I really went hard on this today. I've spent about 14 hours today (literally) trying to get all my code working. What I've done is this: I have two 2D game clients that send their data (sprite position, which sprite frame is loaded, name) to the server. Because the thread functions are exactly the same, and a thread spawns whenever a connection is established, I'm saving the clients' data to a file and reading from it. The problem is, even when just RECEIVING data from both of the clients, using critical sections as I understand they are supposed to be used, the data is extremely jumbled in the text file. When I remove the critical sections it is (seemingly) almost just as garbled. Also, it seems to break completely: instead of doing what any individual thread should do (replace old information, then put in the new info), it just adds and adds to the file, and both clients are slowed to a crawl.

I'm tired of struggling with this, since it's becoming more and more apparent that this method is bad. Could someone explain to me a bit more how critical sections actually work, and what method I should be using for sending character data like this? Is it necessary to store it in a file, or can I share variables between threads somehow? Is asynchronous I/O the absolute best way to go?
I wrote some regular fstream code for handling the file that stores their information, and I was about to implement it in the server when I realized there might be a problem with having the file open multiple times concurrently
There is no problem opening a file multiple times concurrently. The OS is designed to handle that.

I did a bit of looking around and found a site that was using Windows critical sections. Would simply putting this around the resource I want to restrict do the job?
You only need to synchronise data structures that are accessed from different threads of execution. If a data structure isn't accessed from multiple threads, there's nothing to synchronise.

The problem is, even when just RECEIVING data from both of the clients, using critical sections as I understand they are supposed to be used, the data is extremely jumbled in the text file. When I remove the critical sections it is (seemingly) almost just as garbled.
It's difficult to help without knowing how the server works. We can only speculate about how you're handling this received data, so we can't comment on the synchronisation.

Could someone explain to me a bit more how critical sections actually work, and what method I should be using for sending character data like this? Is it necessary to store it in a file, or can I share variables between threads somehow?
A Windows critical section is a mutex mechanism that works within a single process. It guarantees that only one thread will execute a section of code at a time. It's more efficient than a Mutex because it doesn't force a context switch (which is particularly expensive on Windows). That's probably all you need to know at this point.

As for your server maintaining a model of the clients' views: each client starts from a known state, and the server models this state in some data structures. If a client changes state, it needs to notify the server, so it sends an update to the server.

The server knows what state each client started in, and listens for updates from its clients. These updates are then applied to the server's data structures (the model). The updates need to be synchronised if data is received on different threads, but the whole thing can be done in a single thread. Threading isn't the issue; the big design features are your algorithm for sending/applying the updates and the protocol the clients and server use to communicate.
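For instance, the server-side model could be as simple as this sketch (the fields are guesses based on what you said you're sending; no file involved at all):

#include <windows.h>
#include <map>
#include <string>

struct PlayerState
{
    float x, y;       // sprite position
    int frame;        // which sprite frame is loaded
    std::string name;
};

std::map<int, PlayerState> players; // the server's model, keyed by client id
CRITICAL_SECTION modelLock;         // InitializeCriticalSection() once at startup

// Called by whichever thread received an update from client 'id'
void ApplyUpdate(int id, const PlayerState& update)
{
    EnterCriticalSection(&modelLock);
    players[id] = update; // replace the old state with the new
    LeaveCriticalSection(&modelLock);
}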

Is asynchronous I/O the absolute best way to go?
No. It's an optimisation that removes some wait states at the cost of complexity. But if you don't have a working system, you don't have anything to optimise (yet).
Okay. I'm sending/receiving my data from the client once per iteration of a while loop that has no delay; my server is doing (basically) the same thing. I may try to rewrite my server to see if my file algorithm is the problem (which it sounds like it is).

Is it bad to send/receive in this way? If it's just a matter of waiting to send until there is new information, that would be a pretty easy fix.
Is it bad to send/receive in this way?
Yes. It doesn't scale.

If it's just a matter of waiting to send until there is new information, that would be a pretty easy fix.
That's pretty much the approach. You need to send as little as you can, as infrequently as you can.

The server must always read what clients send, but the clients must reduce the amount and frequency of their updates.

To do that you need to design a scheme where these deltas can be calculated and sent, then decoded and amalgamated. Also, you may find that the clients will need to know about each other, so the server may need to send periodic updates of the whole system.
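On the client side, that can start out as simple as this sketch (reusing the hypothetical PlayerState from above; SendUpdate stands in for your existing send code):

// Only send when something actually changed since the last send
bool Changed(const PlayerState& now, const PlayerState& lastSent)
{
    return now.x != lastSent.x || now.y != lastSent.y
        || now.frame != lastSent.frame;
}

// In the client's main loop:
// if (Changed(state, lastSent)) { SendUpdate(server, state); lastSent = state; }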
What do you mean by it doesn't scale?
What do you mean by it doesn't scale?
It probably comes from this: http://en.wikipedia.org/wiki/Scale_%28ratio%29

But here we take it to mean scale up: increase capacity, support more clients, handle more connections, or do more of what it does.

In this case, I mean support more clients.

The idea is that some algorithms require increasingly more resources as the number of things grows. We call this complexity; see http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions

The idea is to aim for something like O(log n) complexity, where the computational requirements go up more slowly as more work is done. Let's look at your algorithm.

Each client sits in a loop sending updates without delay. What's the impact on the server? For each client, the server must:
1. receive the update
2. apply it to the model
3. eventually, the server will need to send out updates so all clients can know what the other clients are doing.

Assume the model update delay grows linearly with the number of clients, and the same for the receive and for the server update.

The complexity is then O(n * n * n) = O(n³), which is cubic growth; the opposite of what we want.