Jump to content

[8072][patch] Get rid of data blob (first step)


Auntie Mangos

Recommended Posts

I like the idea to reduce the data field to what-is-really-needed.

But there are several code-snippets and projects around, that use a lot of the current content of the data field.

For my part for example I have written a signature generator that extracts armor, hp, mana, different hit and crit ratings.

As far as I understand these values should not recieve an own field in the new character table because these values are updates at character login.

To calculate values like mana, hp, armor,... such scripts like mine would have to "analyse" the whole character by itself. Calculations for these values are not just additions of values that can be read direcly from the item_template table or others. Class talents and other effects that manipulate these values are pretty much harder to compute...

I'm just thinking about supergadgets MBA (http://getmangos.eu/community/viewtopic.php?id=3867).. he must scream in pain when he gets aware of the changes that have to be done...

Are there any thoughts about such needs of the information in the current data field?

Link to comment
Share on other sites

I like the idea to reduce the data field to what-is-really-needed.

But there are several code-snippets and projects around, that use a lot of the current content of the data field.

For my part for example I have written a signature generator that extracts armor, hp, mana, different hit and crit ratings.

As far as I understand these values should not recieve an own field in the new character table because these values are updates at character login.

To calculate values like mana, hp, armor,... such scripts like mine would have to "analyse" the whole character by itself. Calculations for these values are not just additions of values that can be read direcly from the item_template table or others. Class talents and other effects that manipulate these values are pretty much harder to compute...

I'm just thinking about supergadgets MBA (http://getmangos.eu/community/viewtopic.php?id=3867).. he must scream in pain when he gets aware of the changes that have to be done...

Are there any thoughts about such needs of the information in the current data field?

I don't see why this values that are calculated every login shouldn't receive their own field... it would consume some space as a trade for cpu consuming. I'm wondering with the retail armory needs to calculate anything... :mellow:

I know that this isn't MaNGOS problem directly... but... in some sort of way it is...

I would like to see the blob expanded for the simplicity of reading the database... not to remove anything.

Also hunuza made 3 important points! To learn do not need necessarily to be difficult!

Link to comment
Share on other sites

  • 39 years later...

Hi,

this patch is the first step to get rid of the data blob (characters.data).

For now it increases redundancy by adding new fields to the character table (level, xp, money, etc.),

but when all data from the data blob is saved separately the data blob can be removed.

Patch: http://pastebin.ca/1469152

ATTENTION: If your database contains somehow characters that are not up-2-date the update sqls might not work. Be sure to make backups before applying this patch.

Please report any bugs.

Link to comment
Share on other sites

Minimenager is not updataing for a very long time and after this get implemented into mangos he will be very unusable. It's not the end of php web for mangos you missunderstand me.

But that this is litle offtopic. I'm just compiled mangos whit that patch and so far it's working!

Link to comment
Share on other sites

I think what he meant was that if this patch is applied everyone would have to re-write their web front ends / etc...

Personaly i dont care, i just dont see how this would 'improve' anything, nor does it 'destroy' so could go either way...

One from main reasons (in final state): this will illiminate need to create character `data` convertion at client switches. Instead normal fields add/deleted will be used.

Link to comment
Share on other sites

One from main reasons (in final state): this will illiminate need to create character `data` convertion at client switches. Instead normal fields add/deleted will be used.

Is is really <that> much easier...?

And more colums > more DB usage (not much but still more)...

Personaly i think its easy enough to select from the data-blob, if you keep track of the order...

To simplify the DB to much will only alienate people who join to learn...

Link to comment
Share on other sites

Is is really <that> much easier...?

And more colums > more DB usage (not much but still more)...

Personaly i think its easy enough to select from the data-blob, if you keep track of the order...

To simplify the DB to much will only alienate people who join to learn...

Oh .. so I probably missed something.

Many sql servers try to buffer/cache something when they are able to (and config allows them to) .. also a storage optimization is in place. All those things doesn't apply on text fields, because the server can't know what does the field actually contain. A page in a book? A list of bitmask values? It cannot guess if the field can be fragmented for better performance or cached in parts.

When you split the field into parts which are clear for the sql server, it can handle them better. Imagine someone writing tons of volatile asm() in C just because he thinks it will outperform the same code compiled by a C compiler, which isn't true in many cases as the compiler is aware of many things the programmer might not be.

As for the size (I already wrote that somewhere) - a string of three digits (+ whitespace) is equivalent to normal 32-bit integer in size, both have 4 bytes. If the number has less than 3 digits, it's smaller in it's decimal string representation, if it has more (say .. 4 bilion), it's much smaller in it's normal form.

The question is whether we play the smart guys and try to organize our data blob or leave this work to the sql server. I'd personally prefer the second variant. As for the first one - we could use a simple text file for storing everything instead of database.

Link to comment
Share on other sites

Is is really <that> much easier...?

And more colums > more DB usage (not much but still more)...

Personaly i think its easy enough to select from the data-blob, if you keep track of the order...

To simplify the DB to much will only alienate people who join to learn...

You're wrong. It IS whole lotta easier. It's the hardest thing to convert, decode, operate on, especially for new users.

Thanks, hunuza

Link to comment
Share on other sites

The question is whether we play the smart guys and try to organize our data blob or leave this work to the sql server. I'd personally prefer the second variant. As for the first one - we could use a simple text file for storing everything instead of database.
I don't think what is currently being done with the data blob can be called playing smart.. Not only is the vast majority of the data in it already contained in other tables, a lot of it actually needs to be regenerated at login to avoid bugs. Even in a text file, it's better to only store things once, and not store things you don't need.

For example, initially the stats were just loaded at login from the data blob but that caused an endless stream of bugs and data corruption so now it is calculated at login from other fields instead, meaning the value is loaded from the DB, then discarded/overwritten.

There are some parts of the data field that could use additional caching, like the set of worn items at login (you only need their displayid, it's faster to load one blob than process all the items separately, but not if the blob is as large as it is now), but those can be dealt with separately.

The benefits of removing the blob are clear, a lot more performance in the cases where only parts of the blob were needed. While frontends relying on the current form of the data will need to be changed .. no doubt everyone will find that extracting the information from other fields / tables will be much easier than constantly needing to check field offsets, no need to change the information in several places when it changes and most importantly no more need to update the offsets when the client version changes.

Link to comment
Share on other sites

I don't think what is currently being done with the data blob can be called playing smart.. Not only is the vast majority of the data in it already contained in other tables, a lot of it actually needs to be regenerated at login to avoid bugs. Even in a text file, it's better to only store things once, and not store things you don't need.

For example, initially the stats were just loaded at login from the data blob but that caused an endless stream of bugs and data corruption so now it is calculated at login from other fields instead, meaning the value is loaded from the DB, then discarded/overwritten.

There are some parts of the data field that could use additional caching, like the set of worn items at login (you only need their displayid, it's faster to load one blob than process all the items separately, but not if the blob is as large as it is now), but those can be dealt with separately.

The benefits of removing the blob are clear, a lot more performance in the cases where only parts of the blob were needed. While frontends relying on the current form of the data will need to be changed .. no doubt everyone will find that extracting the information from other fields / tables will be much easier than constantly needing to check field offsets, no need to change the information in several places when it changes and most importantly no more need to update the offsets when the client version changes.

I agree with that, my "playing smart" was just a sarcasm on "try to optimize our data in the Big Blob ourselves".

I found something from 2006:

MySQL: Maximum number of columns in one table - 3398; size of a table row - 65534 (BLOB and TEXT not included).

Oracle: Maximum number of columns in one table - 1000. Up to 32 columns in index key.

PostgreSQL: Rows - unlimited, columns - 1600; size of a table row - 1.6TB.

so it's maybe a good idea to split some to a separate table anyway .. in the long term.
Link to comment
Share on other sites

The question is whether we play the smart guys and try to organize our data blob or leave this work to the sql server. I'd personally prefer the second variant. As for the first one - we could use a simple text file for storing everything instead of database.

I'd have to be against that, though. MaNGOS should avoid as much hard drive access as possible IMO.
Link to comment
Share on other sites

And more colums > more DB usage (not much but still more)...

Actually it's less db load because a lot data in the data blob is not really used and overwritten on player load and you do not have to load/save the whole data field if you just want to update one value (for example money).

Another reason why it is actually less db load was already given by freghar.

Personaly i think its easy enough to select from the data-blob, if you keep track of the order...

Well, but their are several problems:

* changing data field structure (as Vladimir already mentioned)

Every time we switch to a different client version you currently need to update all your tools because the data field offsets have changed.

* need of lots of string operations which are very CPU-intensive

When you access data in the data blob, you probably use SQL functions like SUBSTRING_INDEX() etc. I actually did not test it, but I can't imagine that those string functions are very CPU friendly...

* difficult to understand and read

Especially for people who are new to MaNGOS the data field might be quite incomprehensible. It's not that easy for somebody, who is not very experienced, to for example figure out how the skill data is stored there. With a new table that stores all the skill data it should be a lot easier and more intelligible to figure out how the skill data is stored.

To simplify the DB to much will only alienate people who join to learn...

How should a badly designed database help somebody who wants to learn about MMORPG servers?

Link to comment
Share on other sites

Selecting characters ordered by honor points gives me the creeps, with SUBSTRING_INDEX on WHERE and ORDER BY clauses, i really don't want to know how much CPU time each query eats...apart from the fact that the query is unreadable :)

If our little realm had more characters we'd probably have to cache those values in a special table anyway, or get rid of the statistics page...

Also, there's indeed problems with reading data blobs blindly, while writing the converter for our arcemu DB, i've noticed that when you don't set max item durability in the item instances for example, you get indestructible items because it is never reloaded from item_template apparently...

Link to comment
Share on other sites

Guest
This topic is now closed to further replies.
×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue. Privacy Policy Terms of Use