I am having trouble putting together a program design which fulfills certain needs for me. My situation is as follows.
I use data files to build objects for use in data analysis. Many of these files are related to one another, but not all of my analyses require every data file. The most basic structure I have is a protein, which takes a specific data file to fill in its properties. Some of my analyses require interface data, whereas others require area data. These data come to me in two separate files. What I want is to create a base class Protein and two further classes (inherited from Protein? Perhaps not? This is where my question really lies) which can be included in a project depending on whether the appropriate data file (interface or area) is available for that project. For projects which use both interface AND area data, I want to be able to create objects that incorporate all of this information and can cross-access it in order to perform further data analysis.
I have studied some polymorphism and inheritance but I do not yet feel as though I have a solid solution. Ideally I would have a system whereby I could include certain header files which would expand my protein class as appropriate to include methods and properties required for the interface or area data AND be such that if I include BOTH interface AND area data I have access to all of the methods and functions.
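One possible shape for this is a set of "mixin" class templates, each of which could live in its own header and wraps whatever class it is given. This is only a minimal sketch: the names WithInterface, WithArea, FullProtein and every member shown are hypothetical placeholders, not my real classes.

```cpp
#include <string>
#include <utility>

// Base layer: what every protein has. The single field is a placeholder.
class Protein {
public:
    explicit Protein(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Hypothetical mixin adding interface data on top of any base class.
template <class Base>
class WithInterface : public Base {
public:
    using Base::Base;  // reuse the constructors of the wrapped class
    void setInterfaceCount(int n) { interfaceCount_ = n; }
    int interfaceCount() const { return interfaceCount_; }
private:
    int interfaceCount_ = 0;
};

// Hypothetical mixin adding area data on top of any base class.
template <class Base>
class WithArea : public Base {
public:
    using Base::Base;
    void setArea(double a) { area_ = a; }
    double area() const { return area_; }
private:
    double area_ = 0.0;
};

// A project with both data files stacks both mixins.
using FullProtein = WithArea<WithInterface<Protein>>;
```

A project with only area data would use WithArea<Protein>, and FullProtein exposes name(), interfaceCount() and area() on the same object.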
I'm not getting any advice; perhaps my problem was not well posed. I will try to explain it further.
I work with several data files all related to proteins. Each data file adds a new layer of information about a protein. Many of my programs do not require every layer to be present, whereas some of them require a certain combination of layers to be present. The basic layer of information is the crystal structure layer. Then, in no particular order, I have solvency data, bond data and protein interaction data. Many of the problems that I solve require the use of two of these which then requires the creation of new methods to deal with both data types.
To add further portability and clarity to my projects, the protein is abstracted in several layers: first at the atom level, then the amino acid level, next the chain level and finally the protein level. Each of these levels is affected by the addition of new data. For example, when I add the solvency data into the mix, there is relevant data to store for each atom.
I have a very basic solution thus far. It involves making the following classes as base classes:
Each of these layers of abstraction has related template functions and private variables to give it access to the correct containers. For example, within each version of chain there is a private member aminoAcids which is a vector of amino acids in the chain. The class type for these amino acids depends on what kind of data is available. Therefore each version has its own private version of the acids vector. The public getAcids method is a template function which can be used to return the appropriate class type.
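An alternative to a template getter with one private vector per version would be to make each level a class template on the type of the level below it, so that getAcids is an ordinary member function. The following is a rough sketch under that assumption, not my actual code; Atom, SolventAtom, the solvency member and totalSolvency are hypothetical names.

```cpp
#include <vector>

// Hypothetical atom types: the second adds one solvency value per atom.
struct Atom { double x = 0, y = 0, z = 0; };
struct SolventAtom : Atom { double solvency = 0; };

// An amino acid owns atoms of whatever type the data layer provides.
template <class AtomT>
class AminoAcid {
public:
    void addAtom(const AtomT& a) { atoms_.push_back(a); }
    const std::vector<AtomT>& getAtoms() const { return atoms_; }
private:
    std::vector<AtomT> atoms_;
};

// A chain owns amino acids; the acid type fixes the return type of getAcids.
template <class AcidT>
class Chain {
public:
    void addAcid(const AcidT& acid) { aminoAcids_.push_back(acid); }
    const std::vector<AcidT>& getAcids() const { return aminoAcids_; }
private:
    std::vector<AcidT> aminoAcids_;
};

// Compiles only for chains whose atoms have a `solvency` member.
template <class ChainT>
double totalSolvency(const ChainT& chain) {
    double total = 0.0;
    for (const auto& acid : chain.getAcids())
        for (const auto& atom : acid.getAtoms())
            total += atom.solvency;
    return total;
}
```

With this layout the atom type chosen at the bottom propagates upward: Chain<AminoAcid<SolventAtom>> returns acids whose atoms carry solvency, while Chain<AminoAcid<Atom>> does not, and calling totalSolvency on the latter fails at compile time.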
This solution requires the creation of a new class for every combination of data layers I wish to include. I want to try to avoid this. I am really looking for an implementation whereby I include one header file and have access to all of the relevant methods as they relate to the corresponding data layer I am using.
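One way to avoid writing a class per combination might be a single variadic template that inherits from whichever layers are requested. This is a sketch under assumptions: SolvencyLayer, BondLayer, their members and solvencyOfBondedAtoms are made-up names, and each layer is reduced to one value per atom for brevity.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// The layer every protein has (placeholder content).
struct CrystalStructure { std::size_t atomCount = 0; };

// Hypothetical optional layers, each holding one value per atom.
struct SolvencyLayer { std::vector<double> solvency; };
struct BondLayer { std::vector<int> bondCount; };

// One template covers every combination: Protein<>, Protein<SolvencyLayer>,
// Protein<SolvencyLayer, BondLayer>, and so on.
template <class... Layers>
struct Protein : CrystalStructure, Layers... {};

// A method that needs two layers at once. The static_asserts turn a missing
// layer into a readable compile-time error.
template <class P>
double solvencyOfBondedAtoms(const P& p) {
    static_assert(std::is_base_of<SolvencyLayer, P>::value,
                  "requires the solvency layer");
    static_assert(std::is_base_of<BondLayer, P>::value,
                  "requires the bond layer");
    double total = 0.0;
    for (std::size_t i = 0; i < p.solvency.size() && i < p.bondCount.size(); ++i)
        if (p.bondCount[i] > 0) total += p.solvency[i];  // only bonded atoms
    return total;
}
```

Functions that combine two data types are then written once as free function templates, rather than once per combined class.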
I would be grateful to anyone willing to help me with this. A back and forth about solutions and my needs will likely lead me to understand the subtleties of inheritance that I need in order to accomplish this in practice.
I want to be able to include the protein header files that represent the data format I need. From there I want to be able to pull from the protein all chains and amino acids within the protein. The exact type of chain or amino acid depends on the type of data present. When I call getChains from the protein class I need it to return a vector of pointers to the correct chain type.
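To illustrate the getChains requirement, here is a hedged sketch in which a small "format" struct names the chain type and the protein is a template on that format. BasicFormat, SolventFormat, SolventChain, exposedFraction and addChain are hypothetical names chosen for the example.

```cpp
#include <memory>
#include <vector>

// Hypothetical chain types for two data formats.
struct Chain { int id = 0; };
struct SolventChain : Chain { double exposedFraction = 0.0; };

// One small traits struct per data format; each could sit in its own header.
struct BasicFormat { using ChainType = Chain; };
struct SolventFormat { using ChainType = SolventChain; };

template <class Format>
class Protein {
public:
    using ChainType = typename Format::ChainType;

    // Creates a chain owned by the protein and returns a reference to it.
    ChainType& addChain() {
        chains_.push_back(std::unique_ptr<ChainType>(new ChainType()));
        return *chains_.back();
    }

    // Returns non-owning pointers of the exact chain type for this format.
    std::vector<ChainType*> getChains() const {
        std::vector<ChainType*> out;
        out.reserve(chains_.size());
        for (const auto& c : chains_) out.push_back(c.get());
        return out;
    }

private:
    std::vector<std::unique_ptr<ChainType>> chains_;
};
```

Protein<SolventFormat>::getChains() yields std::vector<SolventChain*>, so no cast is needed at the call site, and the pointers remain valid for as long as the protein is alive.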
Yes, I understand that. I will be doing similar actions. What I want is to be able to do the same actions with different layers of data: a system whereby I store data in the same way and can use the same functions to access it, regardless of which layer of data I have available.
The problem has been that the residues are stored in the protein as a vector of pointers. I want a way to access those residues with the same methods whether the residue has solvency data (and thus methods relating to solvency) or surface area data (and the respective methods).
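If the residues must stay in one vector of base-class pointers, one option is to give each data layer its own small abstract interface and ask each residue at run time whether it implements it. This is a sketch only: HasSolvency, PlainResidue, SolventResidue and the members shown are hypothetical, and it uses std::unique_ptr where my code holds plain pointers.

```cpp
#include <memory>
#include <vector>

// Common interface every residue supports.
class Residue {
public:
    virtual ~Residue() = default;
    virtual int atomCount() const = 0;
};

// Hypothetical capability interface for the solvency layer.
class HasSolvency {
public:
    virtual ~HasSolvency() = default;
    virtual double solvency() const = 0;
};

class PlainResidue : public Residue {
public:
    explicit PlainResidue(int atoms) : atoms_(atoms) {}
    int atomCount() const override { return atoms_; }
private:
    int atoms_;
};

class SolventResidue : public Residue, public HasSolvency {
public:
    SolventResidue(int atoms, double s) : atoms_(atoms), solvency_(s) {}
    int atomCount() const override { return atoms_; }
    double solvency() const override { return solvency_; }
private:
    int atoms_;
    double solvency_;
};

// Sums solvency over the residues that carry it and skips the rest.
double totalSolvency(const std::vector<std::unique_ptr<Residue>>& residues) {
    double total = 0.0;
    for (const auto& r : residues) {
        // Cross-cast: non-null only if this residue also implements HasSolvency.
        if (const auto* s = dynamic_cast<const HasSolvency*>(r.get()))
            total += s->solvency();
    }
    return total;
}
```

The trade-off is a run-time check on every access instead of a compile-time guarantee, in exchange for being able to mix residue kinds in a single container.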
I do have a solution at the moment that I have worked out. It uses template classes which have the container variables for the various vectors and maps which give the protein access to the residues and atoms and chains, as well as their getter and setter methods of course. These template classes are inherited by each kind of protein, chain, residue and atom. I have not completed the coding so I cannot be sure that I have not run into other problems, but so far I have had good luck. I have not run into any compiler or linker errors.
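Roughly, the idea looks like the sketch below. It is simplified and not my actual code: ResidueContainer, AreaResidue, AreaChain, the string keys and all member names are placeholders, and only the residue level is shown.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Template base holding the container plus its getters and setters.
template <class ResidueT>
class ResidueContainer {
public:
    // Stores the residue and records its position under `key`.
    // Re-using a key makes the key point at the newest residue.
    void addResidue(const std::string& key, const ResidueT& r) {
        index_[key] = residues_.size();
        residues_.push_back(r);
    }
    const std::vector<ResidueT>& getResidues() const { return residues_; }

    // Returns nullptr when no residue was stored under `key`.
    const ResidueT* findResidue(const std::string& key) const {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : &residues_[it->second];
    }

private:
    std::vector<ResidueT> residues_;
    std::map<std::string, std::size_t> index_;
};

// Hypothetical residue types for two data layers.
struct Residue { std::string name; };
struct AreaResidue : Residue { double surfaceArea = 0.0; };

// Each kind of chain inherits the container for its own residue type.
class Chain : public ResidueContainer<Residue> {};

class AreaChain : public ResidueContainer<AreaResidue> {
public:
    double totalArea() const {
        double total = 0.0;
        for (const auto& r : getResidues()) total += r.surfaceArea;
        return total;
    }
};
```

The container logic is written once, and each chain type only adds the methods specific to its data layer.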