C++ neophyte question

3ndal1s · November 1, 2010

Hello i'm a C++ newbie and i am starting my exploration of mangos sourcecode.

The first files i tried to read have been some /libmpq/ sources, huffman.cpp in particular.

I don't really understand a statement in that file, can someone explain it to me with some patience?

The line of interest lies within a function called libmpq_huff_get_8bits:

static unsigned long libmpq_huff_get_8bits(struct huffman_input_stream *is) {
   unsigned long one_byte;

   if (is->bits <= 8) {
       is->bit_buf |= *(unsigned short *)is->in_buf << is->bits;
       is->in_buf  += sizeof(unsigned short);
       is->bits    += 16;
   }

   one_byte      = (is->bit_buf & 0xFF);
   is->bit_buf >>= 8;
   is->bits     -= 8;

   return one_byte;
}

this function takes on a huffman_input_stream * pointer as a parameter. This struct is declared in huffman.h as it follows:

struct huffman_input_stream {
   unsigned char *in_buf;                /* 00 - Input data */
   unsigned long bit_buf;                /* 04 - Input bit buffer */
   unsigned int bits;                /* 08 - Number of bits remaining in 'byte' */
};

so now let's get to the point i'm missing:

is->bit_buf |= *(unsigned short *)is->in_buf << is->bits;

What does this line do exactly?

Anticipated thanks for the ones who will answer me and my apologies if i committed any mistakes in posting this question here.

3ndal1s · November 1, 2010

i'm adding here the results of my research on operator precedence:

1.-> selection of a member of the pointed variable

2.* pointer dereference

3.(typeID) casting in C-style

4.<< bitwise left shift

5.|= bitwise OR and assignment

darkstalker · November 1, 2010

libmpq is not strictly part of mangos source, just a dependency

faramir118 · November 1, 2010

You mixed up step 2 and 3 - cast to ushort*, then dereference

Basically, it gets the next ushort from in_buf and stores it next to the existing value in bit_buf

PS - libmpq is pure C. It's also a 3rd party lib (from libmpq.org), and won't help you understand mangos code. You should look in the src/game/ directory instead.

Patman128 · November 1, 2010

struct huffman_input_stream {
   unsigned char *in_buf; // this is a pointer to our data
   unsigned long bit_buf; // this buffer stores 16 bits of our input (in_buf) at a time
   unsigned int bits; // number of bits in bit_buf (if we have 8 or less we need to load more data into it)
};

static unsigned long libmpq_huff_get_8bits(struct huffman_input_stream *is) {
   unsigned long one_byte; // this is the byte we will return

   if (is->bits <= 8) { // if we have 8 or less bits in our buffer...
       is->bit_buf |= *(unsigned short *)is->in_buf << is->bits; // load the next 16 bits into bit_buf
       is->in_buf  += sizeof(unsigned short); // increment the input so that we are pointing to the next 16 bits to load (for the next time)
       is->bits    += 16; // add 16 to bits so we know that we have 16 more bits loaded
   }

   one_byte      = (is->bit_buf & 0xFF); // set one_byte to the first byte of our buffer
   is->bit_buf >>= 8; // push out the first byte of our buffer (since we've now used it)
   is->bits     -= 8; // remove a byte from our counter (since we just pushed it out)

   return one_byte;
}

darkstalker · November 1, 2010

looks like a C implemetation of what would be std::deque<bool> in C++

Patman128 · November 1, 2010

so now let's get to the point i'm missing:
is->bit_buf |= *(unsigned short *)is->in_buf << is->bits;
What does this line do exactly?

For an example, let's say that in_buf = 0x1000 0000. At 0x1000 0000, we have stored the data, the next 16 bits being "0110 0000 0011 0010". bit_buf = 0000 0000 0001 0110 and bits = 8 (since we have 8 bits still loaded in bit_buf).

Now, we want to load the next 16 bits into bit_buf. The first step is to get the next 16 bits from in_buf, which is accomplished by *(unsigned short *)is->in_buf (since in_buf is a pointer, we need to use * to get the actual data from it, and since we only want 16 bits we cast it to unsigned short.)

Loaded data: 0110 0000 0011 0010

The next step it to make sure we are only loading 16 bits and not losing any data. Right now there is 8 bits still loaded into bit_buf, so if we just set bit_buf to what we just loaded, we will lose those 8 bits. So what we do is shift our newly loaded data left by however many bits are still in bit_buf so we aren't overwriting anything but we aren't getting out of order.

Loaded data is now: 0011 0010 0000 0000

Next, we will use the OR operator to set our data without losing the old data. OR takes 2 binary numbers and combines them so that any bits that = 1 in either numbers = 1 in the new number, and any bits that = 0 in both numbers = 0 in the new number.

bit_buf: 0000 0000 0001 0110

New data: 0011 0010 0000 0000

After OR: 0011 0010 0001 0110

And there you have it. bit_buf is now loaded with the next 16 bits.

Patman128 · November 1, 2010

looks like a C implemetation of what would be std::deque<bool> in C++

It isn't. It works more like C++ streams than any container.

Patman128 · November 1, 2010

Now time for my question.

   if (is->bits <= 8) {
       is->bit_buf |= *(unsigned short *)is->in_buf << is->bits;
       is->in_buf  += sizeof(unsigned short);
       is->bits    += 16;
   }

Wouldn't this part of the code actually lose data? Every time we run this, we are loading 16 bits of data from the input; however, not all of this will actually be put into bit_buf, most of the time only the first 8 bits will actually be loaded, but we are incrementing the input by 16 bits. Wouldn't those last 8 bits then just be lost since the input is moved forward past them?

And that's enough spam from me.

darkstalker · November 1, 2010

looks like a C implemetation of what would be std::deque<bool> in C++

It isn't. It works more like C++ streams than any container.

double ended queues are used to implement streams

faramir118 · November 1, 2010

@Patman128

Basically, the shifts only operate on int32 and uint32 types. This means the bits are only lost if you assign the result into a type that is too narrow to store it.

You can read the details in the C99 standard, sections 6.3.1.1 (for integer promotion) and 6.5.7 (for bitwise-shift semantics)

Patman128 · November 2, 2010

@Patman128
Basically, the shifts only operate on int32 and uint32 types. This means the bits are only lost if you assign the result into a type that is too narrow to store it.

You can read the details in the C99 standard, sections 6.3.1.1 (for integer promotion) and 6.5.7 (for bitwise-shift semantics)

Thanks, but that's not exactly what I meant.

Let's say we had our input data:

1011 0101 0100 0101 0000 0101 0111 0001

When we begin, in_buf points to the beginning:

1011 0101 0100 0101 0000 0101 0111 0001

^ <- in_buf points to first bit

Now, we load our data. We load 16 bits shifted left by however many are still in the bit buffer. So if our bit buffer has 8 bits:

0000 0000 0110 1111

then we want to load:

0000 0101 0111 0001

shifted left 8 bits, or:

0111 0001 0000 0000

so that they will OR together without losing data. However, then the in_buf pointer is moved over 16 bits:

1011 0101 0100 0101 0000 0101 0111 0001

^-----<-----<----<---- move pointer forward 16 bits

So my question is: what happens to the 8 bites that were skipped and never loaded?

1011 0101 0100 0101 0000 0101 0111 0001

|---------| <- these bits were never loaded!

I apologize ahead of time if I'm being obtuse and you already explained it and I just didn't understand.

faramir118 · November 2, 2010

In these snipets, I use little-endian hex - lshifts actually move numbers right, and vise versa.

Also, we read from the left side of the buffer.

8 bits (previous read) | 16 bits (current read, lshifted 8 bits) => bit_buf (32 or 64 bit variable, depending on which source you have):

pp 00 00 00 | (cc cc 00 00 << 8) => pp cc cc 00

Here is a more drawn-out example...

Initial contents:

in_buf    43 7F 20 89 4C 11 2D FF
bit_buf   00 00 00 00
bits      0

First call to get_8bit:

// after extract from in_buf
in_buf    20 89 4C 11 2D FF
bit_buf   43 7F 00 00
bits      16

// after extract from bit_buf
in_buf    20 89 4C 11 2D FF
bit_buf   7F 00 00 00
bits      8
one_byte  43

Second call to get_8bit:

// after extract from in_buf
in_buf    4C 11 2D FF
bit_buf   7F 20 89 00
bits      24

// after extract from bit_buf
in_buf    4C 11 2D FF
bit_buf   20 89 00 00
bits      16
one_byte  7F

Third call to get_8bit:

// don't need to extract from in_buf

// after extract from bit_buf
in_buf    4C 11 2D FF
bit_buf   89 00 00 00
bits      8
one_byte  20

Sign In

C++ neophyte question

Recommended Posts

3ndal1s

3ndal1s

darkstalker

faramir118

Patman128

darkstalker

Patman128

Patman128

Patman128

darkstalker

faramir118

Patman128

faramir118

Recently Browsing 0 members

Contact Us

Repositories

Support

Wiki

Browse

Activity

My Activity Streams

Important Information