On fread() and arrays, big/little endian

On fread() and arrays, big/little endian integers.

A function in my program is intended to read integer values from a file. Each value may take up 1 to 4 bytes, as indicated by the len parameter. My issue is with fread().

int read_val (FILE *handle, int len)
{
  unsigned char bytes_read[4];
  int *pi; /* pointer to integer, not PI.  */
  if (len == 1)
  {
    return fgetc (handle);
  }
  else if (len < 1 || len > 4)
  {
    return 0;
  }
  else
  {
    pi = (int *) bytes_read; /* cast needed: bytes_read is an unsigned char array */
    *pi = 0;
    fread (bytes_read, sizeof (char), len, handle);
    return *pi;
  }
}

fread() here should put len bytes into bytes_read, should it not? I'm dealing with a 4-byte integer word size, so I would expect that when len is 2, *pi should hold the value I need times 65536 (being shifted two bytes left, no?). I originally had

fread (bytes_read + (4 - len), sizeof (char), len, handle);

but that didn't work -- it works as printed above.

Could somebody explain this? (I'm compiling with mingw on winXP sp3. Any big/little endian differences between the file and my os wouldn't seem to make sense either; if there were a difference would it not screw up for len=4?)

rwan (23)

Hi,

I don't know if this is any different with Win XP, but let me see if I can help...

fread (bytes_read, sizeof (char), len, handle);

reads in "len" characters and puts them in "bytes_read". If on disk you have:

<byte 1><byte 2><byte 3><byte 4>

then if len=2, only the first two bytes are read. byte 3 and byte 4 remain on disk and the next call to fread will start from byte 3. I think what you're thinking of is:

unsigned int bytes_read = 0;
fread (&bytes_read, sizeof (unsigned int), 1, handle);

In this case, bytes 1-4 are read. If you're only using two of those bytes, then the other two will be 0's. Just keep in mind a char is a byte and len means how many bytes you are gobbling from the file. Similarly, an unsigned int is 4 bytes in length (depends on architecture, of course).

Hope this is correct :-) and that it helps! BTW, if you are returning bytes_read to the calling function, you might want to dynamically allocate it with new or malloc...

Ray

xabnu (72)

It may help to know that specifically I'm reading RIFF WAVE files. I read here

https://ccrma.stanford.edu/courses/422/projects/WaveFormat/

that RIFF would indicate little endian, RIFX would indicate big endian.
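A minimal sketch of how that RIFF/RIFX distinction might be applied, assuming the first four bytes of the file have already been read into a buffer. The names wav_is_big_endian and decode_u16 are hypothetical helpers made up for illustration, not part of any library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 for "RIFX" (big endian), 0 for "RIFF"
   (little endian), and -1 if the tag is neither. */
int wav_is_big_endian (const unsigned char tag[4])
{
  if (memcmp (tag, "RIFX", 4) == 0)
    return 1;
  if (memcmp (tag, "RIFF", 4) == 0)
    return 0;
  return -1;
}

/* Hypothetical helper: decode a 2-byte field according to the file's byte
   order. Shifts are used, so the host's own byte order does not matter. */
uint16_t decode_u16 (const unsigned char b[2], int big_endian)
{
  if (big_endian)
    return (uint16_t) ((b[0] << 8) | b[1]);
  return (uint16_t) (b[0] | (b[1] << 8));
}
```

Because the bytes are combined with shifts rather than by aliasing an int pointer, the same code should give the same answer on either kind of machine.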

In the format chunk the data are written back to back in a predefined order. If you look at the file it would be like

<A Byte 0> <A Byte 1>|<B Byte 0> <B Byte 1>|<C Byte 0> <C Byte 1> <C Byte 2> <C Byte 3>|<D Byte 0> <D Byte 1> <D Byte 2> <D Byte 3>|<E Byte 0> <E Byte 1>|<F Byte 0> <F Byte 1>.

If I ran fread() on an int * 1 element, I'd suck up A and B data and it would be useless.

As I understand, in RAM,

fread (bytes_read, sizeof (char), len, handle);

would put into my array <A Byte 0> <A Byte 1> <0> <0>, and that

fread (bytes_read + (4 - len), sizeof (char), len, handle);

would put <0> <0> <A Byte 0> <A Byte 1>.

My integer pointer should be aimed at the 0th byte of my array, such that depending on the preference Windows has, *pi is either

(A0) + (256 * A1) + (256 * 256 * 0) + (256 * 256 *256 * 0)
or
(256 * 256 *256 * A0) + (256 * 256 * A1) + (256 * 0) + (0).

The value should be and really is 256*A0 + A1, so apparently I'm misunderstanding something.
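The two layouts described above can be checked directly. This is only a sketch, assuming a 4-byte integer; load_at_offset and host_is_little_endian are hypothetical names, and memcpy stands in for the int pointer cast to sidestep alignment and aliasing problems.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: place len bytes at the given offset inside a zeroed
   4-byte buffer, then reinterpret the buffer as a host-order 32-bit integer.
   The caller must keep offset + len <= 4. */
uint32_t load_at_offset (const unsigned char *src, int len, int offset)
{
  unsigned char buf[4] = {0, 0, 0, 0};
  uint32_t value;

  memcpy (buf + offset, src, (size_t) len);
  memcpy (&value, buf, sizeof value);
  return value;
}

/* Returns 1 when the host stores the least significant byte first. */
int host_is_little_endian (void)
{
  const uint32_t probe = 1;
  unsigned char first;

  memcpy (&first, &probe, 1);
  return first == 1;
}
```

On a little-endian host, the bytes 01 00 loaded at offset 0 give 1, and the same bytes loaded at offset 2 give 65536, because higher addresses hold the more significant bytes.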

rwan (23)

Hi,

Honestly, I'm not much of an endian-person so I can't answer your question and hopefully someone else can. I was answering your question about fread.

If your data is in this format:

<A Byte 0> <A Byte 1>|<B Byte 0> <B Byte 1>|<C Byte 0> <C Byte 1> <C Byte 2> <C Byte 3>|<D Byte 0> <D Byte 1> <D Byte 2> <D Byte 3>|<E Byte 0> <E Byte 1>|<F Byte 0> <F Byte 1>

How about for A, you do this instead:

unsigned short int bytes_read = 0;
fread (&bytes_read, sizeof (unsigned short int), 1, handle);

I think you're doing what you're doing because you want to use the same function for reading both 2-byte numbers and 4-byte numbers? You might save yourself a headache if you read in short ints for one and unsigned int for the other.

I think endianness matters when the computer that generated the data has a different byte order from the machine that's reading it. If you have four bytes <A, B, C, D>, the reversed-endian order is <D, C, B, A> (I believe).
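For what it's worth, that reversal can be written with shifts and masks. A small sketch; swap_u32 and swap_u16 are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value: <A, B, C, D> becomes
   <D, C, B, A>. */
uint32_t swap_u32 (uint32_t v)
{
  return ((v & 0x000000FFu) << 24)
       | ((v & 0x0000FF00u) << 8)
       | ((v & 0x00FF0000u) >> 8)
       | ((v & 0xFF000000u) >> 24);
}

/* Same idea for a 16-bit value: <A, B> becomes <B, A>. */
uint16_t swap_u16 (uint16_t v)
{
  return (uint16_t) ((v << 8) | (v >> 8));
}
```

Swapping twice gives back the original value, which makes for an easy sanity check.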

Also, in your code:

fread (bytes_read + (4 - len), sizeof (char), len, handle);

If len=2 and bytes_read is an unsigned char array, you are writing bytes 0 and 1 of A into it (from what I can tell), but:

* the first two bytes of bytes_read are uninitialized, unless you did it before
* you should pass the address of (bytes_read + (4 - len)) if you declared bytes_read as: unsigned int bytes_read.

I hope this helps or that someone else corrects me if I am wrong...

Ray

xabnu (72)

Thanks, I probably will end up calling different functions depending on whether len is 1, 2 or 4, although I wanted a function that could handle all of that and the wacky possibility of 3 bytes too lol. So anyway, I actually initialized it with *pi = 0, and your last comment illuminates the exact problem I have -- I have something that works (except that when len < 4, what should be negative numbers appear, understandably, positive), and I can't explain why.
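The positive-instead-of-negative results follow from the upper bytes being zero-filled. A minimal sketch of sign extension, assuming the stored values are two's complement; sign_extend is a hypothetical helper name, not something from the code above.

```c
#include <stdint.h>

/* Hypothetical helper: interpret the low len bytes (1 to 4) of raw as a
   two's-complement signed value and widen it to 32 bits.
   Returns 0 for an unsupported len. */
int32_t sign_extend (uint32_t raw, int len)
{
  uint32_t sign_bit;
  uint32_t mask;

  if (len < 1 || len > 4)
    return 0;

  sign_bit = (uint32_t) 1 << (8 * len - 1);
  mask = (len == 4) ? 0xFFFFFFFFu : (((uint32_t) 1 << (8 * len)) - 1u);
  raw &= mask;

  /* Flipping the sign bit and then subtracting it maps values with the
     sign bit set to their negative counterparts. The arithmetic is done
     in 64 bits so nothing overflows. */
  return (int32_t) ((int64_t) (raw ^ sign_bit) - (int64_t) sign_bit);
}
```

With len = 2, a raw value of 0xFFFF comes back as -1 rather than 65535.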

I'm going to flag as solved to reduce the headaches, but here is the befuddling output:

When using

fread (bytes_read + (4 - len), sizeof (char), len, handle);

  Format           65536
  Channels         131072
  Sample Rate      44100
  Byterate         176400
  Block Align      262141
  Bits Per Sample  1048576

When using

fread (bytes_read, sizeof (char), len, handle);

  Format           1
  Channels         2
  Sample Rate      44100
  Byterate         176400
  Block Align      4
  Bits Per Sample  16

Only the outputs with 2 bytes are affected, and they are the bitreverse of what was intended. I checked a million times to make sure I wasn't running the compiled build of the wrong version, too. Maybe it'll click in my head some day.

Anyway thanks for your interest. Happy 'grammin!

rwan (23)

Hi,

Glad I helped out in some way, and sorry I can't help more. Endianness is something that, if I need to know it, I can dig up some web page about -- but literally days later, I'll forget what I read.

I will say that I had a similar problem to yours, where I had to read in sequences of 2 bytes or 4 bytes, and from a maintenance point of view it would be nice to have one function to do it all, kind of like what you're doing with a len parameter. I never figured out how, and in the end I guess it wasn't worth it. The computer doesn't know the difference...only I do, for wondering if I could have done it better! :-)
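One way such a single function might look, as a sketch only: it assumes the file stores values least significant byte first (as RIFF does), and read_uint_le is a hypothetical name, not the function from the original post.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read len (1 to 4) bytes stored least significant
   byte first and assemble them with shifts, so the result does not depend
   on the host's byte order. Returns 1 on success, 0 on a bad len or a
   short read. */
int read_uint_le (FILE *handle, int len, uint32_t *out)
{
  unsigned char bytes[4];
  uint32_t value = 0;
  int i;

  if (len < 1 || len > 4)
    return 0;
  if (fread (bytes, 1, (size_t) len, handle) != (size_t) len)
    return 0;

  /* bytes[0] is the least significant byte, bytes[len - 1] the most. */
  for (i = 0; i < len; i++)
    value |= (uint32_t) bytes[i] << (8 * i);

  *out = value;
  return 1;
}
```

Returning a status separately from the value lets the caller tell a short read apart from a legitimate 0, and a 3-byte field needs no special case.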

Ray

Topic archived. No new replies allowed.
