Hello, I would like to ask you a question about finding an index of a record in a structure that is in a vector. The searching should be according to some parameters.
There is a structure together with a vector<structure>:
Name: John, address: street 1, Birth certficate number: 801206/2393
Name: Jean, address: street 2, Birth certficate number: 701206/2393
Name: Matt, address: street 3, Birth certficate number: 601206/2393
Name: Peter, address: street 4, Birth certficate number: 501206/2393
Name: Pierre, address: street 5, Birth certficate number: 901206/2393
Name: Ian,address: street 6, Birth certficate number: 861206/2393
Name: Matrix, address: street 7, Birth certficate number: 781208/2393
Now I would like to delete and update a specific record. The first thing I have to do is to find the specific item.
The criteria of searching should be according to:
1) name and address
DeletePerson( "Jean", "street 2" ));
UpdateAddressOfPerson("Jean", "street 2", "new street"));
or
2) birth certificate number
DeletePerson ( "701206/2393" ) );
UpdateAddressOfPerson("701206/2393", "new street"));
Lets say that there are thousands and thousands records and I need to find it quicky - non-linearly for the purpose of deleting and updating.
I can neither use map nor set, it is a requirement of the task that I have to solve.
I would like to mention that it should work insensitevely, so it should ignore the size of letters, for example: John is jOhn is JoHn etc. Street 1 ..sTrEet 1...
Examples that I have now which works, but are slow:
It can't be done in sublinear time using a vector. The requirement of a vector that all elements must be stored contiguously means that any auxiliary search structures must be updated (which takes O(n)) whenever an element is removed.
If you relax that requirement and store the SPersons non-contiguously (by newing each one independently), the process of search and removal can be done in O(log n) (worst case) using binary search structures, or in O(1) (average case) using hash tables.
you can actually reserve a bunch of space in the vector and hash the data yourself into the vector by computing the index off the data and having a deleted flag for dead data. It can give you the best of hash map and vector in one, but it can also be a huge effort depending on the data at hand, and it only works well off 1 key data item, you can't do it (well, you can reduce the search considerably but you can't get O(1)) for searches of multiple columns.
Delete flag data works for slowly changing data. If you delete and add nonstop, it gets horrible fast. Once in a while, in off hours, you can run a cleanup to cull the deleted items and re-map everything for the next day to maintain efficiency.
you can maintain each column in a smaller data structure designed to search for that data item, and have pointers to connect the structures to each other to get the whole record.... this is like making your own database though.
-----------
so what I am saying is if you made some objects...
class address
{
int bc_index;
int name index;
string address;
}
and vectors of those...
you can hash address and produce an index into your address vector to store this.
Then if you look up address, that costs O(1). And it gives you the rest of the record, just grab from your BC number vector at the given index, and the name vector at the given index.
This is obviously a ton of work to get right. High speed code takes a lot of time, design, and work -- which is why so many programs are so slow. This is a really rough starting point. You could use 4 hash maps and tie them together. The implementation can vary, this is just a c++ terms way of explaining the design if you want O(1) lookups. Its a deeper look at Kemort's comment.