Scanning a header file to save image data

I'm trying to read from a header file to take the image dimensions and other variables but I don't understand how to iterate through the "key" so it saves each line.

This is what the header file will always look like:


!INTERFILE :=
!imaging modality := nucmed
!version of keys := 3.3
;
!GENERAL DATA :=
!name of data file := /Phantoms/130521/130521-full-ATN.i33
;
!GENERAL IMAGE DATA :=
!type of data := Tomographic
!total number of images := 1000
!imagedata byte order := LITTLEENDIAN
;
!process status := Reconstructed ; This is necessary for AMIDE viewing.
!matrix size [1] := 210
!matrix size [2] := 210
!number format := unsigned integer
!number of bytes per pixel := 2
!scaling factor (mm/pixel) [1] := +2.000e+00
!scaling factor (mm/pixel) [2] := +2.000e+00
!slice thickness (pixels) := 1 ; This is the proper slice thickness key read by AMIDE.
!slice thickness (mm/pixel) := 2.0 ; This is an additional key for GATE.
!END OF INTERFILE :=




and right now I have old trial code and I'd like to build from here, but I'm not sure it's the best way to read the header in the first place ...


  string m_fileName = "headerATN.h33";
  FILE* fp=fopen(m_fileName.c_str(),"r");
  if (!fp) {
    G4cerr << G4endl << "Error: Could not open header file '" << m_fileName << "'!" << G4endl;
    return;
  }

  
  char keyBuffer[256],valueBuffer[256];
  fscanf(fp,"%[^=]",keyBuffer);
  fscanf(fp,"%[^\n]",valueBuffer);

  char *keyPtr = keyBuffer;
  while ( ( *keyPtr == '!' ) || ( *keyPtr == ' ' ) || ( *keyPtr == '\n' ) )
    keyPtr++;

  char *endptr = keyPtr + strlen(keyPtr) - 1;
  *(endptr--)=0;
  while ( *endptr == ' ')
    *(endptr--)=0;
  std::string key(keyPtr);

  char *value = valueBuffer+1;
  while ( *value == ' ')
    value++;
  
  cout << keyPtr <<endl;

  if ( key == "total number of images" ) {
    sscanf(value,"%d", n_image);
  } else if ( key ==  "matrix size [1]" ) {
    sscanf(value,"%d",m_dim);
  } else if ( key ==  "matrix size [2]" ) {
    sscanf(value,"%d",m_dim+1);
  } else if ( ( key ==  "number of slices" ) || (key ==  "total number of images") ) {
    sscanf(value,"%d",&m_numPlanes);
  } else if ( key ==  "scaling factor (mm/pixel) [1]" ) {
    sscanf(value,"%f",m_pixelSize);
  } else if ( key ==  "scaling factor (mm/pixel) [2]" ) {
    sscanf(value,"%f",m_pixelSize+1);
  } else if ( key ==  "slice thickness (mm/pixel)" ) {
    sscanf(value,"%f",&m_planeThickness);
  } else if ( key ==  "name of data file" ) {
    m_dataFileName = std::string(value);
  } else if ( key ==  "number format" ) {
    if ( (strcmp(value,"float")==0) || (strcmp(value,"FLOAT")==0) )
      m_dataTypeName = "FLOAT";
    else if ( (strcmp(value,"unsigned integer")==0) || (strcmp(value,"UNSIGNED INTEGER")==0) )
      m_dataTypeName = "UNSIGNED INTEGER";
    else
      cout << "Unrecognised type name '" << value << "'" << endl;
  } else if (key == "imagedata byte order") {
    if ( strcmp(value,"BIGENDIAN") == 0 ) 
      m_dataByteOrder = "BIGENDIAN";
    else if ( strcmp(value,"LITTLEENDIAN") == 0)
      m_dataByteOrder = "LITTLEENDIAN";
    else 
      cout << "Unrecognized data byte order";
  }



Ideally it would just skip the keys that aren't wanted and keep moving through the lines. Should there be a for loop for the key (and if so, how does that work with pointers?) or should this method just be scratched...
Lines 10 and 11 (the two fscanf() calls) are a buffer overflow exploit waiting to happen. Make sure to put some limits on the input.
  /* keyBuffer <-- everything before the first '=' sign; the '=' sign is then consumed.
     valueBuffer <-- everything after it; the newline character is then consumed.
     fooey() stands in for whatever error handling you choose.  */
  char c,keyBuffer[256],valueBuffer[256];
  if ((fscanf(fp,"%255[^=]%c",keyBuffer,  &c) != 2) || (c != '='))   fooey();
  if ((fscanf(fp,"%255[^\n]%c",valueBuffer,&c) != 2) || (c != '\n')) fooey();

The next step is to have an appropriate data structure to store all data from the header.

struct m_header
{
  
};

Holy crap. You're using C++? Why are you dinking around with scanf()?

struct interfile_header
  {
  std::map <std::string, std::string> keyvalues;
  int total_number_of_images;
  int number_of_slices;
  enum { unknown_endian, little_endian, big_endian } image_data_byte_order;
  int matrix_size[ 2 ];
  int number_of_bytes_per_pixel;
  float scaling_factor[ 2 ];
  int slice_thickness_AMIDE;
  float slice_thickness_GATE;

  bool load_from_file( const std::string filename );
  };

bool interfile_header::load_from_file( const std::string filename )
  {
  std::ifstream f( filename.c_str() );
  if (!f) return false;

  // Find and extract all the key-value pairs between INTERFILE begin and end
  std::string s;
  while (std::getline( f, s ))
    {
    if (s.find( "!INTERFILE :=" ) == 0) break;
    };
  while (std::getline( f, s ))
    {
    if (s[ 0 ] != '!') continue;
    if (s.find( "!END OF INTERFILE :=" == 0) break;
    
    size_t n = s.find( ":=" );
    if (n == std::string::npos) continue;

    keyvalues[ trim( s.substr( 1, n ) ) ] = trim( s.substr( n+2 ) );
    }

  // Convenience conversions for data
  total_number_of_images = string_to <int> ( keyvalues[ "total number of images" ] );
  number_of_slices = string_to <int> ( keyvalues[ "number of slices" ] );
  if (keyvalues[ "imagedata byte order" ] == "LITTLEENDIAN") image_data_byte_order = little_endian; else
  if (keyvalues[ "imagedata byte order" ] == "BIGENDIAN")    image_data_byte_order = big_endian; else
    image_data_byte_order = unknown_endian;
  matrix_size[ 0 ] = string_to <int> ( keyvalues[ "matrix size [1]" ] );
  matrix_size[ 1 ] = string_to <int> ( keyvalues[ "matrix size [2]" ] );
  number_of_bytes_per_pixel = string_to <int> ( keyvalues[ "number of bytes per pixel" ] );
  scaling_factor[ 0 ] = string_to <float> ( keyvalues[ "scaling factor (mm/pixel) [1]" ] );
  // and so on...  

  return f.good();
  }

For trim functions (line 19), look here:
http://www.cplusplus.com/faq/sequences/strings/trim/#cpp

And string_to <> ()

// Convenience function to convert a string value to a typed value.
template <typename T>
T string_to( const std::string& s, const T& default_value = T() )
  {
  std::istringstream ss( trim( s ) );
  T result = default_value;
  ss >> result;
  if (!ss.eof()) result = default_value;
  return result;
  }

All this is just typed off the top of my head. Good luck!
I have old trial code and I'd like to build from here

If the code you pasted is representative of the rest of the trial code, I would probably just use it as inspiration... to do better!

Do the m_'s mean that the code is from the method of some class? The style otherwise looks like it's written in "C with cout" style (with the exception of one string used for comparison purposes). It would be better to be consistent with C or C++. But, along with Duoas, I suggest you look for a C++ solution.

If I am to believe that you're talking about GATE for Medical Physics [1], then I have to ask if the app you are writing is for some sort of clinical or related use. If it is, then there is no way the posted code is fit for purpose. For serious use, I would expect to see value validation and error handling and reporting.

Even if it remains in C (or C-like style) the scanf calls should be replaced with calls that can be (more easily) validated (strtol, strtod, plus string manipulation functions.)
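
A minimal sketch of what strtol-based validation could look like. The helper name parse_int and the exact rules (trailing blanks allowed, anything else rejected) are assumptions for illustration, not part of the posted code:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: converts text to int and reports failure instead of
// silently leaving the target variable unchanged, as an unchecked sscanf does.
bool parse_int(const char* text, int& out)
{
    errno = 0;
    char* end = 0;
    const long v = std::strtol(text, &end, 10);

    if (end == text) return false;                 // no digits were consumed
    while (*end == ' ' || *end == '\t') ++end;     // allow trailing blanks
    if (*end != '\0') return false;                // trailing junk, e.g. "210x"
    if (errno == ERANGE) return false;             // value did not fit in a long
    if (v < INT_MIN || v > INT_MAX) return false;  // value does not fit in an int

    out = static_cast<int>(v);
    return true;
}
```

A caller would test the return value and report the offending key and value when it is false. strtod can be wrapped the same way for the floating point fields.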

I'm not sure it's the best way to read the header in the first place ...

I'd have to know more about the file format before I would take a stance on whether to parse directly to data members versus using some sort of intermediary value caching (using a map or otherwise.)

One thing I don't know is how many different fields you have to handle; whether they can each be specified only once, or multiple times; whether some fields are mandatory, so their absence should cause the file load to fail.
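
If some fields do turn out to be mandatory, the check could be sketched as below. It assumes the key-value pairs have already been collected into a std::map, and the list of mandatory keys is made up for illustration (the thread does not say which ones are required):

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> KeyValues;

// Returns the mandatory keys that are absent from the parsed header.
// Which keys are mandatory is an assumption made for this sketch.
std::vector<std::string> missing_required_keys(const KeyValues& kv)
{
    static const char* const required[] = {
        "name of data file",
        "matrix size [1]",
        "matrix size [2]",
        "number of bytes per pixel"
    };

    std::vector<std::string> missing;
    for (std::size_t i = 0; i < sizeof required / sizeof required[0]; ++i)
        if (kv.find(required[i]) == kv.end())
            missing.push_back(required[i]);
    return missing;
}
```

The load function could then fail, and name the missing keys in its error message, whenever the returned vector is not empty.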

And I see from examples on the GATE web site that not all lines begin with a !, though I didn't see an explanation for this difference, e.g.

energy window [1] := Tc99m

and

!matrix size [1] :=64

Is this because only the !-prefixed fields are relevant?

Ideally it would just skip the keys that aren't wanted and keep moving through the lines.

Easy enough to do, in a number of ways.

Should there be a for loop for the key (and if so, how does that work with pointers?)

Not sure what you mean by pointers. The code just needs a loop added around lines 9 to the end (line 59), that is, from the keyBuffer declaration onwards. Then it should nominally work.

or should this method just be scratched...

I would rework it. I think that the effort you'd require to make the original code robust would be better spent on a rework.

Incidentally, if there are a lot of fields to handle, then it's just the kind of case where I would use code generation to help me.
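
The post does not say which form of code generation is meant. One lightweight option that stays inside the compiler is an X-macro table, sketched below. The struct name, the member names and the selection of fields are all hypothetical, and the default member initialisers need C++11:

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical field table: C++ type, member name, header key.
#define HEADER_FIELDS(X) \
    X(int,   total_images, "total number of images") \
    X(int,   matrix_x,     "matrix size [1]") \
    X(int,   matrix_y,     "matrix size [2]") \
    X(float, scale_x,      "scaling factor (mm/pixel) [1]")

typedef std::map<std::string, std::string> KeyValues;

struct GeneratedHeader
{
    // Expands to one zero-initialised member per table row.
#define DECLARE_FIELD(type, name, key) type name = type();
    HEADER_FIELDS(DECLARE_FIELD)
#undef DECLARE_FIELD

    // Expands to one lookup-and-convert step per table row. Returns false
    // if any key is missing or its value does not convert.
    bool assign(const KeyValues& kv)
    {
        bool ok = true;
#define LOAD_FIELD(type, name, key) \
        { \
            const KeyValues::const_iterator it = kv.find(key); \
            std::istringstream ss(it == kv.end() ? std::string() : it->second); \
            if (!(ss >> name)) ok = false; \
        }
        HEADER_FIELDS(LOAD_FIELD)
#undef LOAD_FIELD
        return ok;
    }
};
```

Adding a field then means adding one row to the table; the member declaration and the loading code follow automatically. An external generator script would achieve the same thing with more flexibility.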

Andy

[1] GATE Documentation
http://wiki.opengatecollaboration.org

PS Regarding Duoas's code. It is a lot better than the older code, but the use of std::map's operator[] concerns me. This always succeeds (unless there's some horrible out-of-memory condition), so you can't tell whether you're overwriting a value (when you set a value) or retrieving a non-existent value (if you ask for a value using operator[], it will "helpfully" create a new element, with the appropriate default value, if there is no pre-existing value in the map.) For robustness, I would favour the use of map::insert, to set values, and map::find, to retrieve them.
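
A minimal sketch of the insert/find approach. The helper names store_once and lookup are invented for illustration:

```cpp
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, std::string> KeyValues;

// Stores a pair only if the key is new. Returns false on a duplicate,
// leaving the value that was stored first untouched.
bool store_once(KeyValues& kv, const std::string& key, const std::string& value)
{
    return kv.insert(std::make_pair(key, value)).second;
}

// Looks a key up without creating it. Returns false if it is absent.
bool lookup(const KeyValues& kv, const std::string& key, std::string& value)
{
    const KeyValues::const_iterator it = kv.find(key);
    if (it == kv.end()) return false;
    value = it->second;
    return true;
}
```

With these, a repeated key in the header and a missing key at conversion time both become visible to the caller, which can then decide whether to warn or to fail the load.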

map::find is also one possible solution for key filtering. You could add all the keys you're interested in to the map up front; the file parsing code would then look for a pre-existing map entry before bothering to actually process a value.
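
The filtering idea could be sketched like this. The function names and the choice of wanted keys are assumptions for illustration:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, std::string> KeyValues;

// Builds a map holding only the wanted keys, each with an empty value.
// The selection of keys is illustrative.
KeyValues make_wanted_keys()
{
    static const char* const wanted[] = {
        "matrix size [1]", "matrix size [2]", "name of data file"
    };
    KeyValues kv;
    for (std::size_t i = 0; i < sizeof wanted / sizeof wanted[0]; ++i)
        kv.insert(std::make_pair(std::string(wanted[i]), std::string()));
    return kv;
}

// Stores the value only if the key was registered up front.
// Returns false for keys the caller is not interested in.
bool store_if_wanted(KeyValues& kv, const std::string& key, const std::string& value)
{
    const KeyValues::iterator it = kv.find(key);
    if (it == kv.end()) return false;
    it->second = value;
    return true;
}
```

One consequence of this design is that a wanted key which never appeared in the file is left with an empty value, so the caller still needs a way to tell "absent" from "present but empty" if that distinction matters.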

PPS Also, there is a bug in one of the trim functions Duoas pointed you at:

For trim functions (line 19), look here:
http://www.cplusplus.com/faq/sequences/strings/trim/#cpp


The problem is in trim_left_copy

inline std::string trim_left_copy(
  const std::string& s,
  const std::string& delimiters = " \f\n\r\t\v" )
{
  return s.substr( s.find_first_not_of( delimiters ) );
}


If this function is passed either an empty string or one full of spaces, then find_first_not_of will return npos, which is some large number, and that value is then fed to substr. But the value of the first parameter to substr must be between 0 and the length of the string, otherwise an out_of_range exception will be thrown.

pos

Position of the first character to be copied as a substring.
If this is equal to the string length, the function returns an empty string.
If this is greater than the string length, it throws out_of_range.
Note: The first character is denoted by a value of 0 (not 1).

From substr
http://www.cplusplus.com/reference/string/string/substr/
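
The failure is easy to reproduce. The sketch below wraps the unguarded expression from trim_left_copy in a hypothetical test helper that reports whether it throws:

```cpp
#include <stdexcept>
#include <string>

// Returns true if the unguarded expression used by trim_left_copy throws
// std::out_of_range for the given input.
bool unguarded_trim_left_throws(const std::string& s)
{
    try {
        const std::string trimmed = s.substr(s.find_first_not_of(" \f\n\r\t\v"));
        (void)trimmed;   // only the exception behaviour matters here
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

It returns true for an empty string and for a string containing only spaces, and false for any string with at least one non-delimiter character.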

A quick fix for trim_left_copy is:

inline std::string trim_left_copy(
  const std::string& s,
  const std::string& delimiters = " \f\n\r\t\v" )
{
  const size_t pos = s.find_first_not_of( delimiters );
  if(s.npos == pos)
    return std::string(); // empty or all delimiters: nothing is left after trimming
  return s.substr( pos );
}


A minimal repair job of the original, with just enough extra support code to run, is (built using VC++2010 and GCC 4.6.2 (MinGW)):

Note that I did not alter the indent of the original code so the change would be more diff friendly (154 lines.)

Andy

#define _CRT_SECURE_NO_WARNINGS
// NOTE: to mute VC++ warnings

#include <iostream>
#include <iomanip>
#include <string>
#include <cstdio>
#include <cstring>
using namespace std;

#define G4cout std::cout
#define G4cerr std::cerr
#define G4endl std::endl

class Interfile_Header
{
private:
  int m_image;
  int m_dim[2];
  int m_numPlanes;

  float m_pixelSize[2];
  float m_planeThickness;

  std::string m_dataFileName;

  std::string m_dataType;
  std::string m_dataByteOrder;

public:
  Interfile_Header() : m_image(0), m_numPlanes(0), m_planeThickness(0)
  { 
    m_dim[0] = 0; m_pixelSize[0] = 0;
    m_dim[1] = 0; m_pixelSize[1] = 0;
  }

  int load_from_file(char* file_path);

  void dump(std::ostream& os);
};

int main(int argc, char* argv[]) 
{
  if(2 != argc)
  {
    G4cerr << "usage: test <file path>" << G4endl;
    return 0;
  }

  int ret_val = 0;

  char* file_path = argv[1];

  Interfile_Header header;

  ret_val = header.load_from_file(file_path);

  if(0 == ret_val) {
    header.dump(G4cout);
  }

  return ret_val;
}

int Interfile_Header::load_from_file(char* file_path)
{
  FILE* fp=fopen(file_path,"r");
  if (!fp) {
    G4cerr << G4endl << "Error: Could not open header file '" << file_path << "'!" << G4endl;
    return 2;
  }

  while(!feof(fp)) { // NOTE: start of loop

  char keyBuffer[256] = {0};
  char valueBuffer[256] = {0};
  fscanf(fp,"%255[^=]",keyBuffer); // NOTE: now specifies size
  fscanf(fp,"%255[^\n]",valueBuffer); // NOTE: ditto

  char *keyPtr = keyBuffer;
  while ( ( *keyPtr == '!' ) || ( *keyPtr == ' ' ) || ( *keyPtr == '\n' ) )
    keyPtr++;

  char *endptr = keyPtr + strlen(keyPtr) - 1;
  *(endptr--)=0;
  while ( *endptr == ' ')
    *(endptr--)=0;
  std::string key(keyPtr);

  char *value = valueBuffer+1;
  while ( *value == ' ')
    value++;
  
  G4cerr << "key   = " << keyBuffer   << G4endl
         << "value = " << valueBuffer << G4endl
         << G4endl;

  if ( key == "total number of images" ) {
    sscanf(value,"%d",&m_image); // NOTE: was n_image
  } else if ( key ==  "matrix size [1]" ) {
    sscanf(value,"%d",m_dim);
  } else if ( key ==  "matrix size [2]" ) {
    sscanf(value,"%d",m_dim+1);
  } else if ( ( key ==  "number of slices" ) || (key ==  "total number of images") ) {
    sscanf(value,"%d",&m_numPlanes);
  } else if ( key ==  "scaling factor (mm/pixel) [1]" ) {
    sscanf(value,"%f",m_pixelSize);
  } else if ( key ==  "scaling factor (mm/pixel) [2]" ) {
    sscanf(value,"%f",m_pixelSize+1);
  } else if ( key ==  "slice thickness (mm/pixel)" ) {
    sscanf(value,"%f",&m_planeThickness);
  } else if ( key ==  "name of data file" ) {
    m_dataFileName = std::string(value);
  } else if ( key ==  "number format" ) {
    if ( (strcmp(value,"float")==0) || (strcmp(value,"FLOAT")==0) )
      m_dataType = "FLOAT"; // NOTE: was m_dataTypeName
    else if ( (strcmp(value,"unsigned integer")==0) || (strcmp(value,"UNSIGNED INTEGER")==0) )
      m_dataType = "UNSIGNED INTEGER"; // NOTE: ditto
    else
      cout << "Unrecognised type name '" << value << "'" << endl;
  } else if (key == "imagedata byte order") {
    if ( strcmp(value,"BIGENDIAN") == 0 ) 
      m_dataByteOrder = "BIGENDIAN";
    else if ( strcmp(value,"LITTLEENDIAN") == 0)
      m_dataByteOrder = "LITTLEENDIAN";
    else 
      cout << "Unrecognized data byte order";
  }

  } // while(!feof(fp)) { // NOTE: close of added while loop

  // check // NOTE: copy value of m_numPlanes from m_image if not set
  if (m_numPlanes == 0) {
    m_numPlanes = m_image;
  }

  return 0;
}

void Interfile_Header::dump(std::ostream& os)
{
  os << "m_image          = " << m_image          << G4endl
     << "m_dim[0]         = " << m_dim[0]         << G4endl
     << "m_dim[1]         = " << m_dim[1]         << G4endl
     << "m_numPlanes      = " << m_numPlanes      << G4endl
     << std::scientific << std::setprecision(3)   << std::showpos
     << "m_pixelSize[0]   = " << m_pixelSize[0]   << G4endl
     << "m_pixelSize[1]   = " << m_pixelSize[1]   << G4endl
     << std::fixed << std::setprecision(1) << std::noshowpos
     << "m_planeThickness = " << m_planeThickness << G4endl
     << "m_dataFileName   = " << m_dataFileName   << G4endl
     << "m_dataTypeName   = " << m_dataType       << G4endl
     << "m_dataByteOrder  = " << m_dataByteOrder  << G4endl;
}


Results for file provided in opening post:

key   = !INTERFILE
value = =

key   = 
!imaging modality
value = = nucmed

key   = 
!version of keys
value = = 3.3

key   = 
;
!GENERAL DATA
value = =

key   = 
!name of data file
value = = /Phantoms/130521/130521-full-ATN.i33

key   = 
;
!GENERAL IMAGE DATA
value = =

key   = 
!type of data
value = = Tomographic

key   = 
!total number of images
value = = 1000

key   = 
!imagedata byte order
value = = LITTLEENDIAN

key   = 
;
!process status
value = = Reconstructed ; This is necessary for AMIDE viewing.

key   = 
!matrix size [1]
value = = 210

key   = 
!matrix size [2]
value = = 210

key   = 
!number format
value = = unsigned integer

key   = 
!number of bytes per pixel
value = = 2

key   = 
!scaling factor (mm/pixel) [1]
value = = +2.000e+00

key   = 
!scaling factor (mm/pixel) [2]
value = = +2.000e+00

key   = 
!slice thickness (pixels)
value = = 1 ; This is the proper slice thickness key read by AMIDE.

key   = 
!slice thickness (mm/pixel)
value = = 2.0 ; This is an additional key for GATE.

key   = 
!END OF INTERFILE
value = =

key   = 
value = 

m_image          = 1000
m_dim[0]         = 210
m_dim[1]         = 210
m_numPlanes      = 1000
m_pixelSize[0]   = +2.000e+000
m_pixelSize[1]   = +2.000e+000
m_planeThickness = 2.0
m_dataFileName   = /Phantoms/130521/130521-full-ATN.i33
m_dataTypeName   = UNSIGNED INTEGER
m_dataByteOrder  = LITTLEENDIAN


And for completeness, a repaired version of Duoas's code (147 lines):

Andy

#define _CRT_SECURE_NO_WARNINGS

#include <iostream>
#include <iomanip>
#include <fstream>
#include <sstream>
#include <string>
#include <map>

#define G4cout std::cout
#define G4cerr std::cerr
#define G4endl std::endl

inline std::string trim_right_copy(
  const std::string& s,
  const std::string& delimiters = " \f\n\r\t\v" )
{
  return s.substr( 0, s.find_last_not_of( delimiters ) + 1 );
}

inline std::string trim_left_copy( // NOTE: repaired version
  const std::string& s,
  const std::string& delimiters = " \f\n\r\t\v" )
{
  size_t pos = s.find_first_not_of( delimiters );
  if(s.npos == pos)
    return std::string(); // empty or all delimiters: nothing is left after trimming
  return s.substr( pos );
}

inline std::string trim_copy(
  const std::string& s,
  const std::string& delimiters = " \f\n\r\t\v" )
{
  return trim_left_copy( trim_right_copy( s, delimiters ), delimiters );
}

// Convenience function to convert a string value to a typed value.
template <typename T>
T string_to( const std::string& s, const T& default_value = T() )
  {
  std::istringstream ss( trim_copy( s ) ); // NOTE: was trim
  T result = default_value;
  ss >> result;
  if (!ss.eof()) result = default_value;
  return result;
  }

struct interfile_header
  {
  std::map <std::string, std::string> keyvalues;
  int total_number_of_images;
  int number_of_slices;
  enum { unknown_endian, little_endian, big_endian } image_data_byte_order;
  int matrix_size[ 2 ];
  int number_of_bytes_per_pixel;
  float scaling_factor[ 2 ];
  int slice_thickness_AMIDE;
  float slice_thickness_GATE;

  bool load_from_file( const std::string filename );

  void dump(std::ostream& os); // NOTE: added dump method
  };

int main(int argc, char* argv[]) 
  {
  if(2 != argc)
    {
    G4cerr << "usage: test <file path>" << G4endl;
    return 0;
    }

  int ret_val = 0;

  char* file_path = argv[1];

  interfile_header testData;

  bool ret = testData.load_from_file(file_path);

  if(ret)
    {
    testData.dump(G4cout);
    }

  return ret_val;
}

bool interfile_header::load_from_file( const std::string filename )
  {
  std::ifstream f( filename.c_str() );
  if (!f) return false;

  // Find and extract all the key-value pairs between INTERFILE begin and end
  std::string s;
  while (std::getline( f, s ))
    {
    if (s.find( "!INTERFILE :=" ) == 0) break;
    };
  while (std::getline( f, s ))
    {
    if (s[ 0 ] != '!') continue;
    if (s.find( "!END OF INTERFILE :=" ) == 0) break; // NOTE: added missing )
    
    size_t n = s.find( ":=" );
    if (n == std::string::npos) continue;

    size_t l = s.find( ";", n+2 ); // NOTE: added temp code to strip trailing comment
    if (l != std::string::npos) l -= (n+2); // NOTE: ditto

    keyvalues[ trim_copy( s.substr( 1, n-1 ) ) ] = trim_copy( s.substr( n+2, l ) ); // NOTE: was trim; n changed to n-1 for key
    }

  // Convenience conversions for data
  total_number_of_images = string_to <int> ( keyvalues[ "total number of images" ] );
  number_of_slices = string_to <int> ( keyvalues[ "number of slices" ] );
  if (keyvalues[ "imagedata byte order" ] == "LITTLEENDIAN") image_data_byte_order = little_endian; else
  if (keyvalues[ "imagedata byte order" ] == "BIGENDIAN")    image_data_byte_order = big_endian; else
    image_data_byte_order = unknown_endian;
  matrix_size[ 0 ] = string_to <int> ( keyvalues[ "matrix size [1]" ] );
  matrix_size[ 1 ] = string_to <int> ( keyvalues[ "matrix size [2]" ] );
  number_of_bytes_per_pixel = string_to <int> ( keyvalues[ "number of bytes per pixel" ] );
  scaling_factor[ 0 ] = string_to <float> ( keyvalues[ "scaling factor (mm/pixel) [1]" ] );
  scaling_factor[ 1 ] = string_to <float> ( keyvalues[ "scaling factor (mm/pixel) [2]" ] ); // NOTE: added line
  slice_thickness_AMIDE = string_to <int> ( keyvalues[ "slice thickness (pixels)" ] );      // NOTE: added line
  slice_thickness_GATE = string_to <float> ( keyvalues[ "slice thickness (mm/pixel)" ] );   // NOTE: added line
  // and so on...  

  return f.good();
  }

void interfile_header::dump(std::ostream& os)
  {
  os << "total_number_of_images    = " << total_number_of_images << G4endl
     << "number_of_slices          = " << number_of_slices       << G4endl
     << "image_data_byte_order     = " << image_data_byte_order  << G4endl
     << "matrix_size[0]            = " << matrix_size[0]         << G4endl
     << "matrix_size[1]            = " << matrix_size[1]         << G4endl
     << "number_of_bytes_per_pixel = " << number_of_bytes_per_pixel << G4endl
     << std::scientific << std::setprecision(3)   << std::showpos
     << "scaling_factor[0]         = " << scaling_factor[0]   << G4endl
     << "scaling_factor[1]         = " << scaling_factor[1]   << G4endl
     << std::fixed << std::setprecision(1) << std::noshowpos
     << "slice_thickness_AMIDE     = " << slice_thickness_AMIDE << G4endl
     << "slice_thickness_GATE      = " << slice_thickness_GATE  << G4endl;
  }


Results for file provided in opening post:

total_number_of_images    = 1000
number_of_slices          = 0
image_data_byte_order     = 1
matrix_size[0]            = 210
matrix_size[1]            = 210
number_of_bytes_per_pixel = 2
scaling_factor[0]         = +2.000e+000
scaling_factor[1]         = +2.000e+000
slice_thickness_AMIDE     = 1
slice_thickness_GATE      = 2.0

And a different take -- no map, but does do some basic error handling (293 lines)

Andy

#define _CRT_SECURE_NO_WARNINGS

#include <iostream>
#include <iomanip>
#include <fstream>
#include <sstream>
#include <string>
#include <cstring>

#define G4cout std::cout
#define G4cerr std::cerr
#define G4endl std::endl

#if defined(_MSC_VER) || defined(__MINGW32__)
#define G4strcasecmp _stricmp
#else
#define G4strcasecmp strcasecmp
#endif

enum EStatus {
  eStatus_OK           = 0,
  eStatus_GeneralError = 1,
  eStatus_FileNotFound = 2,
  eStatus_InvalidValue = 3
};

enum EDataType {
  eDataType_Unknown = -1,
  eDataType_Float   =  0,
  eDataType_UInt
};

enum EDataByteOrder {
  eDataByteOrder_Unknown   = -1,
  eDataByteOrder_BigEndian =  0,
  eDataByteOrder_LittleEndian
};

void trim_in_place(std::string& str) {
  size_t pos_from = str.find_first_not_of(' ');
  if(str.npos == pos_from) {
    str.clear();
  } else {
    size_t pos_to = str.find_last_not_of(' ');
    str = str.substr(pos_from, pos_to - pos_from + 1);
  }
}

bool parseValue(const std::string& value, EDataType& dataType) {
  bool ok = true;
  dataType = eDataType_Unknown;
  if ( (G4strcasecmp(value.c_str(),"float")==0) )
    dataType = eDataType_Float;
  else if ( (G4strcasecmp(value.c_str(),"unsigned integer")==0) )
    dataType = eDataType_UInt;
  else
    ok = false;
  return ok;
}

const char* getName(EDataType dataType) {
  const char* name = "UNKNOWN";
  switch(dataType) {
    case eDataType_Float: name = "FLOAT"           ; break;
    case eDataType_UInt : name = "UNSIGNED INTEGER"; break;
    default: { /* already set */ }
  }
  return name;
}

std::ostream& operator<<(std::ostream& os, EDataType dataType) {
  os << getName(dataType);
  return os;
}

bool parseValue(const std::string& value, EDataByteOrder& dataByteOrder) {
  bool ok = true;
  dataByteOrder = eDataByteOrder_Unknown;
  if ( G4strcasecmp(value.c_str(),"BIGENDIAN") == 0 ) 
    dataByteOrder = eDataByteOrder_BigEndian;
  else if ( G4strcasecmp(value.c_str(),"LITTLEENDIAN") == 0)
    dataByteOrder = eDataByteOrder_LittleEndian;
  else
    ok = false;
  return ok;
}

const char* getName(EDataByteOrder dataByteOrder) {
  const char* name = "UNKNOWN";
  switch(dataByteOrder) {
    case eDataByteOrder_BigEndian   : name = "BIGENDIAN"   ; break;
    case eDataByteOrder_LittleEndian: name = "LITTLEENDIAN"; break;
    default: { /* already set */ }
  }
  return name;
}

std::ostream& operator<<(std::ostream& os, EDataByteOrder dataByteOrder) {
  os << getName(dataByteOrder);
  return os;
}

template <typename T>
bool parseValue(std::string value, T& t) {
  trim_in_place(value);
  std::istringstream ss(value);
  ss >> t;
  if (!ss.eof()) {
    t = T();
    return false;
  }
  return true;
}

bool parseValue(const std::string& value, std::string& str) {
  str = value;
  return !str.empty();
}

class Interfile_Header {
private:
  int m_image;
  int m_dim[2];
  int m_numPlanes;

  float m_pixelSize[2];
  float m_planeThickness;

  std::string m_dataFileName;

  EDataType      m_dataType;
  EDataByteOrder m_dataByteOrder;

public:
  Interfile_Header() : m_image(0), m_numPlanes(0), m_planeThickness(0) { 
    m_dim[0] = 0; m_pixelSize[0] = 0;
    m_dim[1] = 0; m_pixelSize[1] = 0;
  }

  int load_from_file(const char* file_path);

  void dump(std::ostream& os);
};

int main(int argc, char* argv[]) {
  if(2 != argc) {
    G4cerr << "usage: test <file path>" << G4endl;
    return 0;
  }

  int status = eStatus_OK;

  char* file_path = argv[1];

  Interfile_Header header;

  status = header.load_from_file(file_path);

  if(eStatus_OK == status) {
    header.dump(G4cout);
  }

  return status;
}

void warn_bad_value(const std::string& key, const std::string& value) {
  G4cerr << "warning : Invalid value for \'" << key << "\' : \'" << value << "\'"
         << G4endl;
}

void warn_unhandled(const std::string& key, const std::string& value) {
  G4cerr << "warning : Unhandled key-value-pair : \'" << key << "\' = \'" << value
         << "\'" << G4endl;
}

void report_error(const std::string& key, const std::string& value, int status) {
  G4cerr << "error " << status << " : while processing key-value-pair: \'" << key
         << "\' = \'" << value << "\'" << G4endl;
}

template<typename TValue>
bool extract_value(const std::string& key, const std::string& value,
    const std::string& match, TValue& output, int& status) {
  bool ok = false;
  if ( key == match ) {
    ok = parseValue(value, output);
    if (!ok) {
      status = eStatus_InvalidValue;
      warn_bad_value(key, value); // report the offending value, not the matched key
    }
  }
  return ok;
}

bool strip_comment(std::string& str) {
  size_t pos = str.find(";");
  if(pos != str.npos) {
    str = str.substr(0, pos);
    trim_in_place(str);
    return true;
  }
  return false;
}

bool split_key_value_pair(const std::string& str, std::string& key, std::string& value) {
  size_t pos_from = str.find_first_not_of("! \n");
  if(pos_from != str.npos) {
    size_t pos_to   = str.find(":=", pos_from);
    if(pos_to != str.npos) {
      key   = str.substr(pos_from, pos_to - pos_from);
      value = str.substr(pos_to + 2);
      trim_in_place(key);
      trim_in_place(value);
      return true;
    }
  }
  return false;
}

int Interfile_Header::load_from_file(const char* file_path) {
  std::ifstream ifs(file_path);
  if (!ifs.is_open()) {
    G4cerr << G4endl << "Error: Could not open header file '" << file_path << "'!" << G4endl;
    return eStatus_FileNotFound;
  }

  int         status = eStatus_OK;
  std::string line;

  while((eStatus_OK == status) && std::getline(ifs, line)) {
    strip_comment(line);

    if(line.empty())
      continue;

    std::string key;
    std::string value;
    split_key_value_pair(line, key, value);

    if(value.empty())
      continue;

#ifdef _DEBUG
    G4cerr << "key   = " << key   << G4endl
           << "value = " << value << G4endl;
#endif

    if (    !extract_value(key, value, "total number of images"  , m_image        , status)
         && !extract_value(key, value, "matrix size [1]"         , m_dim[0]       , status)
         && !extract_value(key, value, "matrix size [2]"         , m_dim[1]       , status)
         && !extract_value(key, value, "number of slices"        , m_numPlanes    , status)
         && !extract_value(key, value, "scaling factor (mm/pixel) [1]", m_pixelSize[0], status)
         && !extract_value(key, value, "scaling factor (mm/pixel) [2]", m_pixelSize[1], status)
         && !extract_value(key, value, "slice thickness (mm/pixel)", m_planeThickness, status)
         && !extract_value(key, value, "name of data file"       , m_dataFileName , status)
         && !extract_value(key, value, "number format"           , m_dataType     , status)
         && !extract_value(key, value, "imagedata byte order"    , m_dataByteOrder, status) ) {
      if(eStatus_OK == status) {
        warn_unhandled(key, value);
      } else {
        report_error(key, value, status);
      }
    }

#ifdef _DEBUG
    G4cerr << G4endl;
#endif

  } // while((eStatus_OK == status) && std::getline(ifs, line)) {

  if(eStatus_OK == status) {
    if (m_numPlanes == 0) {
      m_numPlanes = m_image;
    }
  }

  return status;
}

void Interfile_Header::dump(std::ostream& os) {
  os << "m_image          = " << m_image          << G4endl
     << "m_dim[0]         = " << m_dim[0]         << G4endl
     << "m_dim[1]         = " << m_dim[1]         << G4endl
     << "m_numPlanes      = " << m_numPlanes      << G4endl
     << std::scientific << std::setprecision(3)   << std::showpos
     << "m_pixelSize[0]   = " << m_pixelSize[0]   << G4endl
     << "m_pixelSize[1]   = " << m_pixelSize[1]   << G4endl
     << std::fixed << std::setprecision(1) << std::noshowpos
     << "m_planeThickness = " << m_planeThickness << G4endl
     << "m_dataFileName   = " << m_dataFileName   << G4endl
     << "m_dataTypeName   = " << m_dataType       << G4endl
     << "m_dataByteOrder  = " << m_dataByteOrder  << G4endl;
}

Wow andy, that's +10. You've definitely had more fun with this than I did... :O)

@klw
I don't know anything about the dataset being referenced here, but if it is indeed medical software you should take andywestken's advice and make sure that your software cannot fail to function properly. A program that halts is never acceptable in medical use.

Handling I/O is one of the more difficult tasks in computer science, especially because of all the things that can go wrong with it. The C style I/O functions are particularly dangerous because you need to be especially careful to handle bad input. C++ I/O makes life much easier in this respect, but you must still dot your i's and cross your t's. Make sure to learn where they are.

Andy went above and beyond to help you. Just so you know, you will get general pointers and advice in small cases to help you out.

/me goes away and leaves the excellent responses to andy.