Delete/Update a row in a CSV file without rewriting it, using C++?

I am a first-year BSSE student working on a course project. The requirement is to manipulate data using files, so I am using a CSV file.

The problem I am facing is that the whole file has to be rewritten whenever I want to update or delete a specific row. Is there any way to avoid that? I know the database concept, but I am curious to understand how databases do this under the hood, and I am interested in building it with plain files.

Suppose there are 1 million records in the file: wouldn't it take a long time to rewrite the whole thing on every update or delete?

Here is the code I am using to delete a record:

  ofstream myfile;
  myfile.open("example.csv");
  for (unsigned int i = 0; i < persons.size(); i++) {
    if (id == persons[i].id) {
      continue;   // skip the record being deleted
    }
    myfile << persons[i].id << "," << persons[i].name << "," << persons[i].num << endl;
  }
  myfile.close();

I can use the same approach for updates too.

Here is the whole code:

#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>

using namespace std;
struct Person {
    int id;
    string name;
    int num;
};
vector<Person> persons;
void write_data(string data)
{
  ofstream myfile;
  myfile.open("example.csv");   // opens in truncate mode: replaces the whole file
  myfile << data;
  myfile.close();
}
void readData()
{
  persons.clear();              // avoid duplicate entries if called more than once

  ifstream rfile;
  rfile.open("example.csv");
  if (!rfile)
  {
      cout << "File not open\n";
      return;
  }

  string line;
  const char delim = ',';

  while (getline(rfile, line))
  {
      istringstream ss(line);
      Person person;
      ss >> person.id; ss.ignore(10, delim);   // read the id, then skip past the comma
      getline(ss, person.name,delim);
      ss >> person.num;
      if (ss)
          persons.push_back(person);
  }

}
void display_data()
{
  readData();
  for (unsigned int i=0; i< persons.size(); i++)
      cout << setw(5)  << persons[i].id
           << setw(25) << persons[i].name
           << setw(8)  << persons[i].num
           << '\n';
}
void delete_data(int id)
{
  //string data;
  readData();
  ofstream myfile;
  myfile.open ("example.csv");
  for (unsigned int i=0; i< persons.size(); i++) {
    if (id == persons[i].id) {
      continue;               // skip the record being deleted
    }
    myfile << persons[i].id << "," << persons[i].name << "," << persons[i].num << endl;
  }

  myfile.close();
}
int main()
{
  //delete_data(5);
  display_data();
  return 0;
}

Sample CSV file:

1,Ali,230
5,Bilal,255
6,Pasha,430

Link to gist https://gist.github.com/Lablnet/6f135b25640bebeb1bca7ddc97ada523

Thank you so much.

You are correct on both counts. With a text file, any change requires rewriting the entire file, or at least everything from the change itself to the end. It's possible to avoid this if the change is to simply modify some characters without adding or deleting any other characters, but that rarely happens.
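For instance, a same-length overwrite can be done with an fstream opened for both reading and writing. This is only a minimal sketch; the byte offset 2 is just where "Ali" happens to start in the sample file above:

#include <fstream>

int main()
{
    // Open an existing file for reading AND writing; this does not truncate it.
    std::fstream f("example.csv", std::ios::in | std::ios::out);
    if (!f) return 1;

    // Overwriting "Ali" with a replacement of exactly the same length ("Abi")
    // leaves every other byte in the file untouched, so nothing is rewritten.
    f.seekp(2);
    f << "Abi";
}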

For your application, you could use fixed-length records:
struct Person {
    int id;
    char name[80]; // space for the largest name expected.
    int num;
};

Then you can read & write the records to a binary file. Since they are all the same length, you can change one without modifying the others. Deletion still requires copying the other records.
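As a rough sketch of that idea (the file name "people.dat" and the helper functions are made up for illustration, and it assumes the struct holds only plain fixed-size fields so it can be written byte-for-byte):

#include <cstring>
#include <fstream>

struct Person {
    int  id;
    char name[80];   // fixed space for the largest name expected
    int  num;
};

// Append one record to the end of the binary file.
void append_record(const Person &p)
{
    std::ofstream out("people.dat", std::ios::binary | std::ios::app);
    out.write(reinterpret_cast<const char*>(&p), sizeof p);
}

// Overwrite record number 'index' in place: every record is sizeof(Person)
// bytes, so record i starts at byte i * sizeof(Person) and nothing else in
// the file has to move.
void update_record(std::size_t index, const Person &p)
{
    std::fstream f("people.dat",
                   std::ios::binary | std::ios::in | std::ios::out);
    f.seekp(static_cast<std::streamoff>(index * sizeof(Person)));
    f.write(reinterpret_cast<const char*>(&p), sizeof p);
}

int main()
{
    Person p{};
    p.id = 1;
    std::strcpy(p.name, "Ali");
    p.num = 230;
    append_record(p);

    p.num = 999;          // change one field...
    update_record(0, p);  // ...and rewrite only that one record
}

Deletion would still need the remaining records copied down (or the slot marked as free), as noted above.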

Do you have a performance problem with the CSV format? It's very convenient because you can look at it easily, and that's a huge advantage. You can also manipulate CSV files in Excel if they are small enough. Do you really have a million records?

You could also put it in a database and learn to use that.
Do you really have a million records?
No, I do not really have a million records; it is just my semester project, but I was curious to know about it.

Thank you so much for your time.
Databases under the hood use all kinds of different approaches. The larger ones run on powerful machines with lots of memory, multiple CPUs, and multiple disks, and they are written to attack the problem in parallel. They also add hidden items to the records to 'index' the data and make searching faster.

Also, databases are relatively** slow: for a small file like your million records, you can search a local binary file on a PC before an average database even realizes you have queried it. If you get closer to a billion records doing a string-match search, the database's higher CPU count will eventually beat your laptop, so it depends.

But databases support many users reading and writing the same data at once (with a lot of smoke and mirrors to give each user what they need), and they have loads of other things going on; they do a lot more than just store and fetch, and much of it is totally unnecessary if all you want is to store and fetch. Think about it: the database validates whether you have permission to talk to it, then it parses and validates your query, then it executes the query, finds the files it needs, reads and searches them, organizes the data as the query asks, and bundles and transmits the result back (with encryption, etc.). That is very different from even an advanced file problem with one user looking for a record in a file or file-set on the local disk. You have already found your data in a small file by the time the database is done figuring out whether you have permission to do anything :P

** The human may not care: half a second vs a third of a second, the human won't give a flip, but in computer terms that is an eon :)
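To make the indexing idea concrete, here is a toy sketch (reusing the hypothetical fixed-length Person record and "people.dat" file from above; the helper names are illustrative). One pass over the file builds an id-to-offset map, after which a lookup is a single seek and read instead of a scan of the whole file:

#include <fstream>
#include <unordered_map>

struct Person {
    int  id;
    char name[80];
    int  num;
};

// One pass over the file: remember the byte offset where each id lives.
std::unordered_map<int, std::streamoff> build_index(const char *path)
{
    std::unordered_map<int, std::streamoff> index;
    std::ifstream in(path, std::ios::binary);
    Person p;
    std::streamoff offset = 0;
    while (in.read(reinterpret_cast<char*>(&p), sizeof p)) {
        index[p.id] = offset;
        offset += sizeof p;
    }
    return index;
}

// Later lookups jump straight to the record instead of scanning the file.
bool find_by_id(const char *path,
                const std::unordered_map<int, std::streamoff> &index,
                int id, Person &out)
{
    auto it = index.find(id);
    if (it == index.end()) return false;
    std::ifstream in(path, std::ios::binary);
    in.seekg(it->second);
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&out), sizeof out));
}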