Parallel WRITE calls in a C WRITE test code

Hi all,

I am writing READ and WRITE test code in C to read and write data to a storage cluster. The test code configures the cluster, creates an I/O context, and begins reading and writing data. Currently I can write data synchronously.

Now I want to test a high number of writes, which means there should be a lot of WRITE calls in parallel in my C WRITE test code. I am struggling with it. How can I improve the following code to issue a high number of WRITEs? Should I use a for loop? Please help me with this. Following is the example WRITE test code:








#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <rados/librados.h>

int main (int argc, const char* argv[])
{

        /* Declare the cluster handle and required arguments. */
        rados_t cluster;
        //char cluster_name[] = "ceph";
        char cluster_name[] = "test cluster";
        char user_name[] = "client.admin";
        uint64_t flags = 0;     /* must be initialized before passing it to rados_create2() */

        /* Initialize the cluster handle with the cluster name and the "client.admin" user */
        int err;
        err = rados_create2(&cluster, cluster_name, user_name, flags);

        if (err < 0) {
                fprintf(stderr, "%s: Couldn't create the cluster handle! %s\n", argv[0], strerror(-err));
                exit(EXIT_FAILURE);
        } else {
                printf("\nCreated a cluster handle.\n");
        }


        /* Read a configuration file to configure the cluster handle. */

        err = rados_conf_read_file(cluster, "/etc/ceph/neo-neo.conf");
        if (err < 0) {
                fprintf(stderr, "%s: cannot read config file: %s\n", argv[0], strerror(-err));
                exit(EXIT_FAILURE);
        } else {
                printf("\nRead the config file.\n");
        }

        /* Read command line arguments */
        err = rados_conf_parse_argv(cluster, argc, argv);
        if (err < 0) {
                fprintf(stderr, "%s: cannot parse command line arguments: %s\n", argv[0], strerror(-err));
                exit(EXIT_FAILURE);
        } else {
                printf("\nRead the command line arguments.\n");
        }

        /* Connect to the cluster */
        err = rados_connect(cluster);
        if (err < 0) {
                fprintf(stderr, "%s: cannot connect to cluster: %s\n", argv[0], strerror(-err));
                exit(EXIT_FAILURE);
        } else {
                printf("\nConnected to the cluster.\n");
        }

        /* First declare an I/O Context */

        rados_ioctx_t io;
        //char *poolname = "data";
        char *poolname = "neo";

        err = rados_ioctx_create(cluster, poolname, &io);
        if (err < 0) {
                fprintf(stderr, "%s: cannot open rados pool %s: %s\n", argv[0], poolname, strerror(-err));
                rados_shutdown(cluster);
                exit(EXIT_FAILURE);
        } else {
                printf("\nCreated I/O context.\n");
        }

        /* Write data to the cluster synchronously.
         * The length (13) matches the payload, and the object name matches the messages below. */
        err = rados_write(io, "neo-obj", "Test Message!", 13, 0);
        if (err < 0) {
                fprintf(stderr, "%s: Cannot write object \"neo-obj\" to pool %s: %s\n", argv[0], poolname, strerror(-err));
                rados_ioctx_destroy(io);
                rados_shutdown(cluster);
                exit(1);
        } else {
                printf("\nWrote \"Test Message!\" to object \"neo-obj\".\n");
        }

        char xattr[] = "en_US";
        err = rados_setxattr(io, "neo-obj", "lang", xattr, 5);
        if (err < 0) {
                fprintf(stderr, "%s: Cannot write xattr to pool %s: %s\n", argv[0], poolname, strerror(-err));
                rados_ioctx_destroy(io);
                rados_shutdown(cluster);
                exit(1);
        } else {
                printf("\nWrote \"en_US\" to xattr \"lang\" for object \"neo-obj\".\n");
        }

        /* Clean up before exiting. */
        rados_ioctx_destroy(io);
        rados_shutdown(cluster);
        return 0;
}

I would call your write tests in parallel, which you can do by running on many machines or VMs, by running many copies of the same program on one good machine, or by adding a bunch of threads to your test on a good machine. The first two are just for loops; the last is threading, of course. I'm not sure what that library you are using is, and the answer depends on what you mean by a "high" number of writes and how parallel those writes will be when the code is in use.

A single for loop writing is just going to produce serial hits on the target. If the library queues those up and parallelizes them so that your loop can spam it, that sort of works, but it's still not a fair test if the system will later have 50 monkeys all sending data at once, each with their own for-loop spam.
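
Roughly, the threading version of such a test looks like the sketch below. do_one_write() is just a stand-in for whatever your library's write call turns out to be, and the thread count and writes-per-thread numbers are placeholders:

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS       8     /* placeholder: however much concurrency you want to test */
#define WRITES_PER_THREAD 1000  /* placeholder */

/* stand-in for one write call with your storage library */
static void do_one_write(long thread_id, int i)
{
        /* e.g. write an object named "obj-<thread_id>-<i>" here */
        (void)thread_id;
        (void)i;
}

static void *writer(void *arg)
{
        long id = (long)arg;
        for (int i = 0; i < WRITES_PER_THREAD; ++i)
                do_one_write(id, i);
        return NULL;
}

int main(void)
{
        pthread_t threads[NUM_THREADS];
        for (long t = 0; t < NUM_THREADS; ++t)
                pthread_create(&threads[t], NULL, writer, (void *)t);
        for (long t = 0; t < NUM_THREADS; ++t)
                pthread_join(threads[t], NULL);
        printf("all %d threads finished\n", NUM_THREADS);
        return 0;
}

Each thread gets its own id so it can write to its own set of objects and the threads don't step on each other.
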
Hi Jonnin,

Thanks for your reply.

By a high number of writes I mean, let's say, ten thousand writes. What then?

Regarding the library, it does support asynchronous I/O: https://docs.ceph.com/docs/master/rados/api/librados/#asynchronous-io

Maybe I can use the asynchronous version of the write operation?

Yes, that should work if you do it right; it is effectively the same as my threading suggestion.
At some point, if this is a real system with lots of users, etc., you need to test it with a bunch of computers hitting it at once, something approximating a higher-than-normal load to start with. That is, you need something that approximates the real environment scaled down, like a small server being hit by a bunch of clients. You can only do so much testing with 1-2 PCs or a PC/phone type setup. If you don't have anything else, you can play with a free cloud account, throw a few bucks into it for a few days, and beat on it with a bunch of VMs.
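
From the names in the docs you linked, queuing a batch of asynchronous writes would look roughly like the sketch below. I haven't used this library, so treat it as a rough outline: NUM_WRITES, the payload, and the object names are placeholders, and it assumes the io context from your earlier code is already set up.

        /* Sketch: queue a batch of asynchronous writes on the existing io context,
         * then wait for all of them to finish. */
        #define NUM_WRITES 10000

        const char payload[] = "Test Message!";
        rados_completion_t *comps = malloc(NUM_WRITES * sizeof(*comps));

        for (int i = 0; i < NUM_WRITES; ++i) {
                char obj_name[64];
                snprintf(obj_name, sizeof(obj_name), "neo-obj-%d", i);

                if (rados_aio_create_completion(NULL, NULL, NULL, &comps[i]) < 0 ||
                    rados_aio_write(io, obj_name, comps[i], payload, sizeof(payload), 0) < 0) {
                        fprintf(stderr, "could not queue write %d\n", i);
                        exit(EXIT_FAILURE);   /* keep the sketch simple: bail out like the sync version */
                }
                /* rados_aio_write returns as soon as the write is queued; it finishes in the background */
        }

        /* block until every queued write has completed, then release the completions */
        for (int i = 0; i < NUM_WRITES; ++i) {
                rados_aio_wait_for_complete(comps[i]);
                rados_aio_release(comps[i]);
        }
        free(comps);

In a real test you would probably cap how many writes are in flight at once instead of queuing all of them before waiting, but this shows the shape of it.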

Hi Jonnin,

Thanks for your feedback. Now I am using the asynchronous version of the read operation from the library, as follows:




        /*
         * Read data from the cluster asynchronously.
         * First, set up an asynchronous I/O completion.
         */

        rados_completion_t comp;
        err = rados_aio_create_completion(NULL, NULL, NULL, &comp);
        if (err < 0) {
                fprintf(stderr, "%s: Could not create aio completion: %s\n", argv[0], strerror(-err));
                rados_ioctx_destroy(io);
                rados_shutdown(cluster);
                exit(1);
        } else {
                printf("\nCreated AIO completion.\n");
        }

        /* Next, queue the read using rados_aio_read. The call returns immediately. */
        char read_res[100];
        err = rados_aio_read(io, "neo-obj", comp, read_res, 16, 0);
        if (err < 0) {
                fprintf(stderr, "%s: Cannot queue read of object from pool %s: %s\n", argv[0], poolname, strerror(-err));
                rados_ioctx_destroy(io);
                rados_shutdown(cluster);
                exit(1);
        }

        /* Wait for the operation to complete (instead of sleeping and hoping). */
        rados_aio_wait_for_complete(comp);

        /* The completion's return value is the number of bytes read, or a negative errno. */
        int bytes_read = rados_aio_get_return_value(comp);
        rados_aio_release(comp);

        if (bytes_read < 0) {
                fprintf(stderr, "%s: Cannot read object from pool %s: %s\n", argv[0], poolname, strerror(-bytes_read));
                rados_ioctx_destroy(io);
                rados_shutdown(cluster);
                exit(1);
        } else {
                read_res[bytes_read] = '\0';   /* null-terminate before printing */
                printf("\nRead object \"neo-obj\". The contents are:\n%s\n", read_res);
        }






The rados_aio_read call in the code above is the asynchronous version of the read operation.

The question is: do I need to call rados_aio_read multiple times to increase the number of read operations?
I would say yes, assuming these things do what I think they do just from the names. And the same things I said already apply: a loop test will exercise it a little, a multi-threaded test will exercise it somewhat, and a botnet spamming it will exercise it a lot :)
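
Something like the sketch below, for instance, for a batch of reads. Again I'm going off the function names in the docs you linked, so treat it as a rough sketch: NUM_READS and the object names are placeholders, and it assumes the io context from your code plus objects that were written beforehand.

        /* Sketch: queue a batch of asynchronous reads on the existing io context,
         * then wait for each one and check how many bytes came back. */
        #define NUM_READS 1000
        #define READ_LEN  16

        rados_completion_t comps[NUM_READS];
        char bufs[NUM_READS][READ_LEN + 1];

        for (int i = 0; i < NUM_READS; ++i) {
                char obj_name[64];
                snprintf(obj_name, sizeof(obj_name), "neo-obj-%d", i);

                if (rados_aio_create_completion(NULL, NULL, NULL, &comps[i]) < 0 ||
                    rados_aio_read(io, obj_name, comps[i], bufs[i], READ_LEN, 0) < 0) {
                        fprintf(stderr, "could not queue read %d\n", i);
                        exit(EXIT_FAILURE);
                }
        }

        for (int i = 0; i < NUM_READS; ++i) {
                rados_aio_wait_for_complete(comps[i]);
                int bytes = rados_aio_get_return_value(comps[i]);  /* bytes read, or negative errno */
                if (bytes >= 0 && bytes <= READ_LEN)
                        bufs[i][bytes] = '\0';                     /* null-terminate before using */
                rados_aio_release(comps[i]);
        }

If an object doesn't exist, rados_aio_get_return_value() comes back negative, which is your per-read error check.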
