Using ESBTL and PDB files

So, I'd just like to preface my post by saying that I'm a biological scientist, and I have very little experience in programming (I've done some writing in Matlab and C++, but that's about it).

Basically, I need to use the ESBTL (Easy Structural Biology Template Library), and input some PDB files containing protein structure details into the library.

Firstly, I'm using a Mac. From my own understanding, I believe I have to create a directory for the ESBTL, so that when I'm actually writing in C++ the ESBTL can then be located when I enter it in the preprocessor command section. I don't know if this is correct, so any info would be appreciated!

Secondly, once the ESBTL has been called in the actual text-editor/compiler, how is each individual PDB file then called in order to input the file to the ESBTL? Do I need to create a directory for the files also?

Once again, I'm very new to this sort of stuff, so any information or help would be greatly appreciated!
Did you go through the "getting started" section?
http://esbtl.sourceforge.net/
I have, but I don't understand where (or what) CPLUS_INCLUDE_DIR is?
@JdittoJ,

Glad to meet you online. I take a particular interest in your post. I've been a consultant for decades to a number of professionals in your exact position, though more often in physics, robotics, AI and other engineering disciplines. My own expertise is primarily software engineering, specifically focused on C++.

CPLUS_INCLUDE_DIR is an environment variable. Environment variables are named strings that the operating system stores, most often directory paths, though they can hold other configuration information too. The instruction from ESBTL is merely a means of adding the library's include path so that future projects can find it when they are built.
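
To make the idea of an environment variable a little more concrete, here is a minimal sketch of how a C++ program could read one and split it into separate directories. The helper names split_path_list and read_path_list are made up for this illustration, and the colon separator is simply the usual UNIX convention for path lists. None of this is part of ESBTL.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Split a colon-separated list such as "/opt/boost:/opt/ESBTL/include"
// into its individual entries, skipping empty ones.
std::vector<std::string> split_path_list(const std::string& value) {
    std::vector<std::string> entries;
    std::istringstream stream(value);
    std::string entry;
    while (std::getline(stream, entry, ':')) {
        if (!entry.empty()) entries.push_back(entry);
    }
    return entries;
}

// Look up an environment variable by name.
// A variable that is not set yields an empty list.
std::vector<std::string> read_path_list(const char* name) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return value ? split_path_list(value) : std::vector<std::string>{};
}
```

Calling read_path_list("CPLUS_INCLUDE_DIR") would return whatever directories are stored under that name, or nothing at all if the variable was never set.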

Though MAC is UNIX, it is particularly configured by Apple to work "their way", though it is fully compliant with UNIX. However, the instruction esbtl provides appears to be for Linux, and I have confirmed from a quick skim of their documentation that they don't support MAC yet, only Linux.

That's not so much a problem as it sounds. However, it will be far less troublesome to install Linux in a virtual machine (please ask here if you're not familiar). That would provide the exact environment ESBTL expects and is supported under. An experienced developer with time and motivation to tinker might get this working on the MAC OS, but I doubt you really want that path.

I've only begun skimming the documentation for ESBTL so I can be better prepared to recognize what issues you'll run into.

Are you familiar with the Boost library, and did you download it? It is a prerequisite for ESBTL, and until you have a Linux install for development on this, I'd suggest you wait for that step.

From my own understanding, I believe I have to create a directory for the ESBTL, so that when I'm actually writing in C++ the ESBTL can then be located when I enter it in the preprocessor command section. I don't know if this is correct, so any info would be appreciated!


The unpacking/installing instructions for most libraries typically require one to choose a directory, and often developers create a root directory for all such libraries because we usually end up with dozens. Boost will be a requirement, aside the ESBTL directory, so you'll refer to both from your development projects.

Secondly, once the ESBTL has been called in the actual text-editor/compiler, how is each individual PDB file then called in order to input the file to the ESBTL? Do I need to create a directory for the files also?


Working backwards: yes, it is always wise to use directories to keep your files organized.

From my quick skim, a few lines establish the components required to build a "System", which is ESBTL's design for storing and accessing hierarchical data. It organizes the information into models made of chains, where chains are made of residues and residues are made of atoms.

Then, you provide the filename of the PDB.
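
To give a feel for that hierarchy, here is a rough sketch that reads the ATOM and HETATM records of a PDB file into plain C++ containers. To be clear, this is not ESBTL's API. The names Atom, Chains, parse_atom_line and read_chains are invented for the illustration, and the column positions are those of the fixed-width PDB coordinate record. It ignores models, alternate locations and insertion codes, which a real library handles for you.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One atom taken from an ATOM/HETATM record of a PDB file.
struct Atom {
    std::string name;          // e.g. "CA"
    std::string residue_name;  // e.g. "MET"
    char chain_id;             // e.g. 'A'
    int residue_number;
    double x, y, z;            // coordinates in angstroms
};

// chain id -> residue number -> atoms of that residue
using Chains = std::map<char, std::map<int, std::vector<Atom>>>;

// Remove the spaces that pad a fixed-width field.
std::string trim(const std::string& field) {
    const auto first = field.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const auto last = field.find_last_not_of(' ');
    assert(last != std::string::npos);  // guaranteed once `first` was found
    return field.substr(first, last - first + 1);
}

// Parse one line; returns nothing for other record types or malformed lines.
std::optional<Atom> parse_atom_line(const std::string& line) {
    if (line.size() < 54) return std::nullopt;  // coordinates end at column 54
    const std::string record = line.substr(0, 6);
    if (record != "ATOM  " && record != "HETATM") return std::nullopt;
    try {
        Atom atom;
        atom.name = trim(line.substr(12, 4));                 // columns 13-16
        atom.residue_name = trim(line.substr(17, 3));         // columns 18-20
        atom.chain_id = line[21];                             // column 22
        atom.residue_number = std::stoi(line.substr(22, 4));  // columns 23-26
        atom.x = std::stod(line.substr(30, 8));               // columns 31-38
        atom.y = std::stod(line.substr(38, 8));               // columns 39-46
        atom.z = std::stod(line.substr(46, 8));               // columns 47-54
        return atom;
    } catch (const std::exception&) {
        return std::nullopt;  // a numeric field did not contain a number
    }
}

// Read every line of a stream and group the atoms by chain and residue.
Chains read_chains(std::istream& input) {
    Chains chains;
    std::string line;
    while (std::getline(input, line)) {
        if (const auto atom = parse_atom_line(line)) {
            chains[atom->chain_id][atom->residue_number].push_back(*atom);
        }
    }
    return chains;
}

// Convenience wrapper for text that is already in memory.
Chains read_chains_from_text(const std::string& text) {
    std::istringstream input(text);
    return read_chains(input);
}
```

To read an actual file, open it with std::ifstream (from the fstream header) and pass the stream to read_chains. The point is only to show the chain, residue, atom nesting; ESBTL builds a richer version of the same idea.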

This informs me that you'll also be asking, soon, about how to select files in a directory, how to loop through a collection of files in a directory...all typical C++ development tasks.
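
Since looping over the files in a directory is likely to come up, here is a small sketch using the C++17 std::filesystem library. The function names find_pdb_files and count_lines are my own, and count_lines is only a stand-in for whatever you would really do with each file, such as handing its name to ESBTL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Collect every regular file ending in ".pdb" directly inside `directory`
// (no recursion into subdirectories), sorted by path.
std::vector<fs::path> find_pdb_files(const fs::path& directory) {
    std::vector<fs::path> files;
    if (!fs::is_directory(directory)) return files;  // missing folder: nothing to list
    for (const auto& entry : fs::directory_iterator(directory)) {
        if (entry.is_regular_file() && entry.path().extension() == ".pdb") {
            files.push_back(entry.path());
        }
    }
    std::sort(files.begin(), files.end());
    return files;
}

// Count the lines of one file; a placeholder for real processing.
std::size_t count_lines(const fs::path& file) {
    assert(!file.empty());
    std::ifstream input(file);
    std::size_t lines = 0;
    std::string line;
    while (std::getline(input, line)) ++lines;
    return lines;
}
```

A program would call find_pdb_files on the folder holding the structures and then loop over the returned paths. Note that the match on ".pdb" is case sensitive, so files named ".PDB" would be skipped as written.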

That, too, suggests to me you may be intent on building a GUI application, but if you're keeping this simple, you might build command line tools - as of yet I can't tell.

As we continue here in this thread it may become of interest to establish a more direct, collaborative exchange, so you can send PMs to me if that becomes appropriate and of interest to you.

EDIT & Update:

I decided to try a build on MAC to see if it works. It does, so Linux is not actually required. There are a few simple steps to get it working, and they are (relatively) easily accomplished.

One must install boost on the MAC first (I used 1.70.0 and built the entire binary set; I'm not certain all of it is required, but I use boost elsewhere anyway).

The simple example provided with ESBTL compiled as expected and worked with the sample PDB provided, so it is a "go".

No doubt you can, and probably should, ignore the "CPLUS_INCLUDE_DIR" instruction, and merely indicate the include paths as part of the build configuration (it is more flexible that way).

You also need CMake (an easy install for MAC).

Here are the basics of the steps (and a few key details to help).

Download boost (UNIX), and unzip it. Move the resulting directory to a preferred location (I chose /Users/niccolo/Development - where upon boost resides in boost_1_70_0 in the Development directory).

Then, as the instructions suggest, open a terminal, navigate to the boost directory (the one I just detailed), and issue the bootstrap command (more in a moment). This command is in the "getting started" instructions for boost, which can be a bit daunting because they don't get to the point right away.

Assuming the terminal is changed to the boost directory, this is the command to enter, substituting your preferred path for mine:

./bootstrap.sh --prefix=/Users/niccolo/Development/boost_1_70_0

This builds the build engine for boost (I know, seems redundant).

Next, you'll build boost. This takes a while...depends on your computer's performance. The command I recommend is this, but read the note(s) following before issuing the command

sudo ./b2 -a cxxflags=-std=c++17 variant=debug,release link=shared,static threading=single,multi address-model=64 install stage --layout=tagged runtime-link=shared,static -j3

This mess of a command line chooses the important options for building boost. This configuration choice worked without issue to build the ESBTL sample on MAC.

It does assume you have a recent XCode installed with the command line tools (automatic upon first launch of XCode). It builds debug and release versions, for static and shared libraries as well as static and shared runtimes, all in 64 bit modes (all recent versions of MAC OS require 64 bit builds), with the tagged layout.

The "-j3" at the end tells the build system to compile at most 3 targets at once in parallel. I don't recommend maximizing your machine, which is to say if you have 4 cores with 8 threads, don't use -j4 or -j8. If you know your machine really well it is ok to do that, but I don't know your machine well and if there's sufficient lint in your heatsink then maximizing the load could overheat your computer. If you know your machine is very, very well cooled, choose as you like, but for smooth operation at the expense of some extra time, let your machine work at around half or so of its core count. On a 4 core, -j2 (maybe -j3) is ok. In winter I do -j8 on a 4 core with 8 threads, but after 15 or 20 minutes the heat would possibly exceed margins of safety in summer, especially with my cats nearby.
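
The rule of thumb above is simple arithmetic, sketched here in C++ purely for illustration. Both function names are my own. Be aware that std::thread::hardware_concurrency() reports logical threads rather than physical cores, so on a machine with 2 threads per core its result would need to go through something like physical_cores_from_threads first.

```cpp
#include <algorithm>
#include <cassert>

// Suggest a value for the -j option: about half of the physical cores,
// but never fewer than one job.
unsigned suggested_jobs(unsigned physical_cores) {
    return std::max(1u, physical_cores / 2);
}

// Convert a logical thread count (e.g. 8) to physical cores (e.g. 4)
// given how many hardware threads each core provides.
unsigned physical_cores_from_threads(unsigned logical_threads,
                                     unsigned threads_per_core) {
    assert(threads_per_core >= 1);
    return logical_threads / threads_per_core;
}
```

For the 4 core, 8 thread machine mentioned above this gives -j2, the conservative end of the range.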

More in another post....

The tagged layout is particularly important here. This means that any library, say chrono, will be built for each of the options (static/shared, debug/release, threaded/not-threaded), and will be named as such. You won't need to know the names, but as an example:

libboost_chrono_d-x64.a
libboost_chrono_mt_d_x64.a


These two (of 6 different permutations) are the debug and multi-threaded debug versions of the chrono library. Fortunately the way boost operates means the appropriate library is automatically selected based on the build configuration of your application.





@JdittoJ,

So, the previous entry builds boost. When it is finished, download and unpack the ESBTL library. It creates a directory tree, which I moved to /Users/niccolo/Development (where the directory ESBTL... is now the second in the list, boost being the first).

So, at this point all is installed and ready for a test...but one thing.

CMake. Merely download and install (very simple).

Run CMake.

It won't be but a few minutes now and you'll have the first sample running.

Click the "Browse Source" button...navigate to the ESBTL directory for examples/Simple, and choose that.

Click the "Browse Build" button, navigate to that same Simple directory but create a "build" directory (named build) inside it.

Click the "advanced" option so it is on - in the center just below the "browse build" path name.

Next, click the "Configure" button (lower left).

The whole window in the center will turn red - this is normal (why they alarm new users with that is beyond me, but it's just how the CMake GUI works). It merely means nothing has been correctly selected in the build options.

Look for "CMAKE_BUILD_TYPE" in this central window, click the empty "Value" field for it, and enter "debug" (release is an option for later).

Look for "CMAKE_CXX_FLAGS" and enter this in the value field, after reading the notes which follow:

-std=c++17 -I/Users/niccolo/Development/boost_1_70_0 -I/Users/niccolo/Development/ESBTL-1.0-beta01/include

Here the flags inform the compiler to use C++17 standards, and the two "-I" options list the include directories (this was the CPLUS_INCLUDE_DIR I otherwise suggest you ignore).
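
If you want to confirm that those flags actually reached the compiler, a tiny check like the following sketch can help. The function names are invented for the example. It relies on the standard __cplusplus macro (201703L for C++17) and on Boost's boost/version.hpp header, which defines BOOST_LIB_VERSION. I have not included a similar check for the ESBTL headers because I'd rather not guess at their names.

```cpp
#include <cassert>
#include <string>

// __has_include is part of C++17; the guard keeps this compiling elsewhere.
#if defined(__has_include)
#  if __has_include(<boost/version.hpp>)
#    include <boost/version.hpp>  // defines BOOST_LIB_VERSION, e.g. "1_70"
#  endif
#endif

// True when the compiler was run in C++17 mode or newer.
constexpr bool using_cpp17_or_newer() {
    return __cplusplus >= 201703L;
}

// Describe whether the Boost headers were found on the include path.
std::string boost_status() {
#ifdef BOOST_LIB_VERSION
    return std::string("Boost headers found, version ") + BOOST_LIB_VERSION;
#else
    return "Boost headers not found on the include path";
#endif
}
```

Printing boost_status() from a small main function built with the same flags tells you whether the first "-I" path is correct.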

Substitute the paths I've given here with your preferred paths for both boost and ESBTL's include directory.

Next, click the "configure" button again - the window should lose its red background.

Finally, click "generate".

This creates the appropriate "makefile" for the example project.

Open a terminal window, navigate to the ESBTL directory, then into the examples directory, then into the Simple directory, then into the build directory

Next, type the command:

make

That compiles and links the executable (including the boost libraries ESBTL uses), producing some sample programs.

The ESBTL instructions said to test it...and because this is in the "build" directory, this is the command to test it with:

./to_xyzr ../../2cfa.pdb

This differs from the instructions slightly. They issue "../2cfa.pdb", but they assumed you would build inside the "Simple" directory where the source files are (that's messy). So, the "../" must move up two directories instead of one - hence the repeated "../../" before the name.
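
If the "../" business is unfamiliar, this sketch shows how such a relative path resolves, using std::filesystem from C++17. The function name resolve_from and the /tmp/ESBTL location in the usage note are made up for the example. The resolution is purely textual and does not look at the disk.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Combine a working directory with a relative path and collapse "." and ".."
// purely textually (the file system is not consulted).
fs::path resolve_from(const fs::path& working_directory, const fs::path& relative) {
    assert(working_directory.is_absolute());
    return (working_directory / relative).lexically_normal();
}
```

With a working directory of /tmp/ESBTL/examples/Simple/build, the path "../../2cfa.pdb" resolves to /tmp/ESBTL/examples/2cfa.pdb, which is the same file that "../2cfa.pdb" names when you start from the Simple directory itself.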

It should produce an output I can't say I understand, but you might recognize as one of the outputs of this library.

The rest....well...that's up to both ESBTL tutorial and further exchanges here if you require.

If all that seems very "un-MAC" like, well....MAC is UNIX, and this is UNIX like it was for decades - RAW, command line, text oriented UNIX.




@Niccolo Thank you so, so much for all of the information!

However, I'm having a bit of an issue when trying to build Boost. When I enter the command line you wrote out, I get the following message:

"Unable to load Boost.Build: could not find "boost-build.jam"

BOOST_ROOT must be set, either in the environment, or on the command-line with -sBOOST_ROOT=..., to the root of the boost installation.

Attempted search from /Users/admin up to the root at /Users/share/boost-build and in these directories from BOOST_BUILD_PATH and BOOST_ROOT: /usr/share/boost-build."

Now, has this issue arisen due to me incorrectly entering the location of the boost directory, or is it some other issue?
I should also say that after I entered the command line, it asked for a password (I simply entered my user account's password), in case this is of any significance (I don't think it is, but hey).

Once again, any information would be greatly appreciated.

First, did you perform the "./bootstrap.sh...." step?

If so, did you edit the example command line I posted to point to the boost directory?

If not, that would explain the problem.

More on that in a moment.

The BOOST_ROOT is an environment variable which "./bootstrap.sh..." should have taken care of if required. I had boost on a High Sierra based MAC from a year ago, but I cleared that machine with a clean install of Mojave (long story). I had not completed the steps of preparing the machine for development beyond installing XCode (which updated just that day, as in Apple published an update right then).

So, I decided since I need boost there anyway, I'd use it to try ESBTL to see if Linux was really required (it's not), prompted by your inquiry. I built boost maybe 5 times, because I kept leaving out some parameter or other, not getting all of the library forms I require (hence the command line I posted, it was experiment 5 that worked).

When I examine the MAC on which I built boost, it doesn't have an environment variable set for BOOST_ROOT. I believe this is a case, common to computer error messages, where the text doesn't actually say what it means, especially from the second message down. The first message, "Unable to load Boost.Build", is likely the most meaningful, hinting that "./bootstrap.sh" may not have been run, or didn't run correctly.

It is important that we are clear as to where the boost library you downloaded was unpacked. In my own example the download arrived at "downloads" (the "downloads" directory). A double click accidentally unpacked boost right there, in "downloads", so I moved it to the new directory "/Users/niccolo/Development/boost_1_70_0", where "niccolo" is the user name on my machine.

I typically use "Development" as a root directory to all code on my machines, with directories in "Development" for each library I download and each application I create, with the additional step of creating directories under Development for various categories of code. For example, on my Windows machine the Development directory has a "cplusplus" directory, which hosts dozens of directories, one for each experimental bit of code I test from this site.

While there are myriad ways to organize code, something like this is useful to define as a personal standard on your own machine. Such a habit means I hardly ever have to wonder where I put libraries or code.

On the old MAC I created "Development" on the root, so I could mount a second drive onto that directory and use that as my development platform, which means that other than the path being "/Development/boost_1_67_0" (an older version), the idea was the same.

So, let's assume, as it was here, that boost was unpacked and located at "/Users/niccolo/Development/boost_1_70_0".

I then open a terminal window. The first step, a prerequisite to issuing these commands that I did not detail above, is to change to the boost directory with something like:

cd /Users/niccolo/Development/boost_1_70_0

Experience shows that's difficult to type out and get right. You'd have to substitute this path with your location.

Each "leaf" is a separate destination directory. "/Users" is a directory, "niccolo" is a directory under it, and so on.

Quite often, I issue several CD commands walking this path in case I don't spell something right.

Like this:

cd /Users
cd niccolo
cd Development
cd boost_1_70_0


When I get one wrong along the way, I just repeat it, attempting to correct spelling.

The underscores are always subject to "fat finger" faults, so I use, occasionally, the shortcut method like this:

cd /Users
cd nic*
cd Dev*
cd boost*


As long as there aren't multiple matches (like a nick and a niccolo), this figures out what I mean.

Then, to verify "where you are", use the "pwd" command. That displays the "current directory"

When I was in the right place, the pwd command displayed

/Users/niccolo/Development/boost_1_70_0
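
For what it's worth, the same "am I in the right place" check can be sketched in C++ with std::filesystem. The function names are my own invention. The two file names it looks for are the bootstrap.sh script and the boost-build.jam file named in the error message, both of which sit at the top of the unpacked boost directory.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Return the names of expected files that are missing from `directory`.
// An empty result suggests the directory is the top of the boost tree.
std::vector<std::string> missing_boost_markers(const fs::path& directory) {
    const std::vector<std::string> markers = {"bootstrap.sh", "boost-build.jam"};
    std::vector<std::string> missing;
    for (const auto& marker : markers) {
        if (!fs::exists(directory / marker)) missing.push_back(marker);
    }
    return missing;
}

// Check the directory the program was started from, i.e. what pwd prints.
bool current_directory_is_boost_root() {
    return missing_boost_markers(fs::current_path()).empty();
}
```

Run from a home directory such as /Users/admin, this would report both files as missing, which matches the "could not find boost-build.jam" complaint.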

That is when I issued the "./bootstrap.sh" command.

Note, in the "bootstrap" command, you must edit the directory entered for the --prefix parameter to the location of your boost installation.

This configures the rest of boost to "know" that directory is home (among a great many other things), and on my machine did that without creating an environment variable BOOST_ROOT.

Once "./bootstrap.sh..." does its work, then the build command (it starts with sudo) is issued.

Now, sudo means "do this under the super user account". That's why it asks for a password. It wants to know you have that authority. It's used because the library might install into a system directory (in some versions of the command), and only the super user has the authority to write to such directories. It is a security step.

That should build the libraries. The process always issues some warnings that mean nothing, it is just the nature of a library made to work on nearly every machine. Some things are "force fitted" to make it work, but that still works for boost (and has for years).

If that still doesn't get boost built, PM me for an exchange of ideas for how I can help more directly (which might mean you need to enable PM messages in this forum).
Thank you!

Okay, I think I'm almost there.

However, I'm having a problem with the CMake/ESBTL part.

Once CMake is open, I select the ESBTL directory for the "Browse Source" line, and the newly created "build" directory (which I created within the ESBTL directory) for the "Browse Build" line.

I then clicked the advanced box and then "Configure".

A box then pops up that asks me to "Specify the generator for this project" and I select Xcode. An error message then pops up stating "Error in configuration process, project files may be invalid" (this may be the red screen you were referring to?).

Now, the main problem is that the central window is blank during this whole time (and after I've done everything stated above), and so I can't select (or search for) "CMAKE_BUILD_TYPE".

Do you have any idea why this might be?