
[patch][8402] Speeding up guild event loading



Posted

What bug does the patch fix? What features does the patch add?

It improves Guild::RenumBankLogs() and Guild::RenumGuildEventlog(), which are used during guild loading at startup. The SQL currently in use doesn't renumber the LogGuids in the guild_eventlog and guild_bank_eventlog tables properly; the patch fixes that.

The SQL part adds a missing index (by adding primary keys) to guild_eventlog, which speeds up guild loading on big databases considerably.

(On my database with 50k log entries and 850 guilds, the guild loading part went from 102 secs down to ~23 secs.)
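To illustrate where the time goes, here is a minimal sketch (not part of the patch) of the per-guild lookup pattern that benefits from the new primary key; the columns follow the guild_eventlog schema quoted later in this thread:

[HIGHLIGHT=SQL]
-- Illustration only: without an index on guildid, a per-guild query like
-- this scans the whole table once per guild.
EXPLAIN SELECT LogGuid, EventType, PlayerGuid1, PlayerGuid2, NewRank, TimeStamp
FROM guild_eventlog
WHERE guildid = 1
ORDER BY LogGuid;
-- With PRIMARY KEY (guildid, LogGuid) this becomes an index range read,
-- and the ORDER BY needs no extra filesort.
[/HIGHLIGHT]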

For which repository revision was the patch created?

8088 0.12 branch

I think this can be "upported" to the master branch too.

Is there a thread in the bug report section or at lighthouse? If yes, please add a link to the thread.

Nope

Who has been writing this patch? Please include either forum user names or email addresses.

That would be me.

For some reason the code highlighter mangles square brackets...

v4: http://pastebin.com/f6e656ff9

SQL part:

[HIGHLIGHT=SQL]
-- Applying the primary keys
ALTER TABLE `guild_eventlog`
  ADD PRIMARY KEY (`guildid`, `LogGuid`);

-- Drop useless index
ALTER TABLE `guild_bank_eventlog`
  DROP INDEX `guildid_key`;
[/HIGHLIGHT]

Applying the primary keys to that table might not be that simple: I had duplicated primary key values in the live data from my server, so I created a workaround.
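As a quick sanity check (my suggestion, not part of the patch), you can see beforehand whether your data would block the ADD PRIMARY KEY:

[HIGHLIGHT=SQL]
-- Every row returned here is a duplicated (guildid, LogGuid) pair that
-- would make ALTER TABLE ... ADD PRIMARY KEY fail.
SELECT guildid, LogGuid, COUNT(*) AS dupes
FROM guild_eventlog
GROUP BY guildid, LogGuid
HAVING COUNT(*) > 1;
[/HIGHLIGHT]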

If the mangos devs decide to use this, you might also want to provide the following code as an SQL rollup patch.

It cleans up the table with the same SQL statement used in the patch above and applies the primary key afterwards. This is only required as a rollup; on a fresh database, use the SQL above.

[HIGHLIGHT=SQL]
USE characters;

-- Add a temporary index to guild_eventlog to speed up the following renumbering
ALTER TABLE `guild_eventlog`
  ADD INDEX `guildid_key` (`guildid`);

-- Renumber LogGuid of guild_eventlog manually in preparation for applying the primary key
DELIMITER $$
DROP PROCEDURE IF EXISTS `guild_eventlog_cleanup`$$
CREATE PROCEDURE `guild_eventlog_cleanup`()
BEGIN
  DECLARE no_more_rows BOOLEAN;
  DECLARE var INT(11) UNSIGNED;
  DECLARE cur CURSOR FOR SELECT guildid FROM guild;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET no_more_rows = TRUE;

  OPEN cur;
  loop_cur_res: LOOP
    FETCH cur INTO var;
    IF no_more_rows THEN
      CLOSE cur;
      LEAVE loop_cur_res;
    END IF;
    -- Renumber this guild's entries to 0..n-1; the (SELECT @COUNT := -1) in
    -- the derived table initializes the counter inside a single statement.
    UPDATE guild_eventlog AS target
    INNER JOIN (SELECT guildid, logguid, (SELECT @COUNT := -1)
                FROM guild_eventlog
                WHERE guildid = var
                ORDER BY logguid ASC) AS source
            ON source.logguid = target.logguid AND source.guildid = target.guildid
    SET target.logguid = (@COUNT := @COUNT + 1);
  END LOOP loop_cur_res;
END$$
DELIMITER ;

CALL guild_eventlog_cleanup;
DROP PROCEDURE guild_eventlog_cleanup;

-- Applying the primary keys
ALTER TABLE `guild_eventlog`
  ADD PRIMARY KEY (`guildid`, `LogGuid`);

-- Dropping the temporary index (unneeded since the primary key indices do the trick now)
ALTER TABLE `guild_eventlog`
  DROP INDEX `guildid_key`;

-- Same issue for guild_bank_eventlog
ALTER TABLE `guild_bank_eventlog`
  DROP INDEX `guildid_key`;
[/HIGHLIGHT]

Thanks for reading.

Posted

I had that before but it suffers from two major problems:

[HIGHLIGHT=SQL]
SET @count := 1;

UPDATE tabelle
SET nummer = @count := @count + 1
WHERE type = "foo"
ORDER BY nummer;
[/HIGHLIGHT]

It can't deal with duplicate key entries, so it just breaks there, and it's also a two-statement query (the variable has to be initialized separately) - I don't think mangos can deal with that.
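For reference, here is a minimal sketch of how the derived-table trick from the procedure above collapses this into a single statement (the id column is hypothetical, only needed for the self-join):

[HIGHLIGHT=SQL]
-- The (SELECT @count := 0) inside the derived table initializes the
-- variable, so the whole renumbering is one statement.
UPDATE tabelle AS target
INNER JOIN (SELECT id, (SELECT @count := 0)
            FROM tabelle
            WHERE type = "foo"
            ORDER BY nummer) AS source ON source.id = target.id
SET target.nummer = (@count := @count + 1);
[/HIGHLIGHT]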

**edit

v2: http://pastebin.com/f8c42714

This should keep the code compatible with postgres.

Posted

I think the primary key has to be dropped before renumbering.

Posted

I tested this patch on a server with 934 guilds and the loading speedup is incredible. Really, thanks for this patch :)

Btw, is there any way to speed up arena team loading too...? :)

Posted

What are your loading timings?

16:14:22 Loading Guilds...
16:14:44 >> Loaded 832 guild definitions

16:14:44 Loading ArenaTeams...
16:14:46 >> Loaded 3899 arenateam definitions

Don't have any problems with arena team loading so far.

Posted

Hmm, your arena team loading is quite fast...

Loading before your patch

2009-06-28 14:46:21 Loading Guilds...
2009-06-28 14:48:06 >> Loaded 927 guild definitions

2009-06-28 14:48:06 Loading ArenaTeams...
2009-06-28 14:48:42 >> Loaded 6042 arenateam definitions

Loading after your patch

 2009-07-02 14:43:29 Loading Guilds...
2009-07-02 14:43:34 >> Loaded 934 guild definitions

2009-07-02 14:43:34 Loading ArenaTeams...
2009-07-02 14:44:08 >> Loaded 6032 arenateam definitions

Granted, I have more arena teams, but the difference between my results and yours is still pretty big :/

Posted

Hi, I have another possible solution for this problem.

When mangos loads, select all guild events (for one guild) from the table ordered by the TimeStamp column descending, so the first returned row is the latest event. Now count off the first 100 (or however many we need). If there are 100+ records, set an oldest_logguid variable to LogGuid - 1 of the first (latest) record (if that is 0, set it to 100), then delete all the other LogGuids: 101, 102, ...

If there aren't enough LogGuids, we need to start from 100 (which would be used for the first (oldest) log record) and then decrease it towards 0 ...

Then, when we have a new guild event, we just do REPLACE INTO guild_event_log (xxx) VALUES (guildid, oldest_logguid, ...) (or DELETE xxx, then INSERT xxx), so the oldest event gets replaced; then we decrease oldest_logguid by 1, and if it is already 0, we set it back to 100 ...
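A minimal sketch of that write pattern, using the guild_eventlog columns quoted later in this thread (all values are placeholders):

[HIGHLIGHT=SQL]
-- Overwrite the oldest slot in place; with (guildid, LogGuid) as primary
-- key, REPLACE deletes the old row and inserts the new one.
REPLACE INTO guild_eventlog
  (guildid, LogGuid, EventType, PlayerGuid1, PlayerGuid2, NewRank, TimeStamp)
VALUES
  (1, 42, 5, 1026267, 1245903, 2, UNIX_TIMESTAMP());
[/HIGHLIGHT]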

I hope my thoughts are clear. There is one disadvantage to this approach: it is not a user-friendly solution, because LogGuid no longer means the position in the sequence of guild events - it is only used to keep the order.

For this solution, a clustered index on the TimeStamp column would be useful - or keep that column ordered and add an index on guild_id.

We assume the log_guid column is ordered the same way as the TimeStamp column.

And the same thing applies to the bank log...

Posted

True, this fixed-slot log storage screams for some LIFO system to keep it clean and avoid those renumbering queries.

The problem here is that you can't order by timestamp, because timestamps are "just" seconds, and stuff can happen within the same second. So if you have, let's say, 3 log events at the same time and the log slots happen to overflow right there, you have no chance of getting the _real_ next log slot number (LogGuid in our case) back. MySQL, for example, seems to help itself by looking at the primary key to sort equal values (the timestamp in this case), and since LogGuid is part of the primary key, it uses that number for sorting - which is almost meaningless for the order in this overflow case.

The only way I see at the moment to avoid this problem is to store the "next sequence number" outside of that table (maybe in a new field in guild), but that would lead to additional queries, which would totally not help with performance.
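A minimal sketch of what that would look like, assuming a hypothetical NextLogGuid column on guild (the extra UPDATE per event is exactly the cost mentioned above):

[HIGHLIGHT=SQL]
-- Hypothetical column, not part of the current schema or the patch.
ALTER TABLE `guild`
  ADD COLUMN `NextLogGuid` int(11) unsigned NOT NULL DEFAULT 0;

-- One extra statement per new log event to advance the counter
-- (wrapping at 100 slots here, purely as an example).
UPDATE `guild`
SET `NextLogGuid` = (`NextLogGuid` + 1) % 100
WHERE `guildid` = 1;
[/HIGHLIGHT]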

Posted

Hi,

A next sequence number isn't needed, because we can use two columns in the ORDER BY part of the select query. I mean:

SELECT * FROM guild_eventlog WHERE guildid = xx ORDER BY TimeStamp, LogGuid DESC (or ascending order - I'm not sure which one would be necessary...)

So the same timestamp value would not be a problem: if some events happen in the same second, we just assign those events different LogGuids...

I'm not sure whether ascending or descending order would be better, because I didn't look at the code... I'm just suggesting the solution. I hope everything is clear.

Posted

I'll try to make my point clearer with a simple example:

The LogGuids are our 5 log slots per guild, the timestamps are very simplified here, and the data shows the issue I described above, where multiple events have the same timestamp. I also "abused" NewRank to show the order in which the rows were inserted.

So, assuming the server starts up now, we need to get LogGuid 3, because it was the last log entry inserted; then we can decrease that by 1 and get our next log slot in the rotation (I use a 4 -> 0 rotation in this example).

guildid  LogGuid  EventType  PlayerGuid1  PlayerGuid2  NewRank  TimeStamp
     1        0          5      1026267      1245903        2         16
     1        1          5      1183380      1081428        1         16
     1        2          5      1026267      1100058        0         15
     1        3          5      1026267      1166813        4         16
     1        4          5       234234       234234        3         16

The server is starting and trying to get the next log slot for the guild, executing:

SELECT * FROM guild_eventlog ORDER BY TIMESTAMP DESC;

guildid  LogGuid  EventType  PlayerGuid1  PlayerGuid2  NewRank  TimeStamp
     1        0          5      1026267      1245903        2         16
     1        1          5      1183380      1081428        1         16
     1        3          5      1026267      1166813        4         16
     1        4          5       234234       234234        3         16
     1        2          5      1026267      1100058        0         15

As you can see, MySQL uses the primary key as a tie-breaker when sorting equal values (TimeStamp), but no matter how we sort by LogGuid in addition to the timestamp, we always get the wrong value: in this case we get either 4 or 0, which are both wrong, since we need LogGuid 3 - that was the last log entry inserted.

Posted

OK, I understand what you mean.

Then the core would have to handle this special situation differently: if a record has LogGuid = 0 and its timestamp is the same as other records', the first record should be the one where the sequence of LogGuids is broken, like (0, 1, ... i-1, i, !! k, k+1 ...) - a sequence of LogGuids with the same TimeStamp value - k in the example. If all LogGuids had the same timestamp, there would be no way to find out which one is first, and then it doesn't matter anyway...

I think we can assume that there won't be more than 10 guild events with the same timestamp.
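A minimal sketch of that break-detection idea in SQL, based on leak's 5-slot example above (the slot count and the downward rotation are just that example's assumptions):

[HIGHLIGHT=SQL]
-- Among the rows with the newest timestamp, the last-written slot is the
-- one whose successor in the downward rotation (LogGuid - 1, wrapping) is
-- NOT in that group. For the example data this returns LogGuid 3; if all
-- slots share one timestamp, nothing is returned, matching "then it
-- doesn't matter".
SELECT e.LogGuid AS last_written
FROM guild_eventlog AS e
LEFT JOIN guild_eventlog AS s
       ON s.guildid = e.guildid
      AND s.TimeStamp = e.TimeStamp
      AND s.LogGuid = (e.LogGuid + 4) % 5
WHERE e.guildid = 1
  AND e.TimeStamp = (SELECT MAX(TimeStamp) FROM guild_eventlog WHERE guildid = 1)
  AND s.LogGuid IS NULL;
[/HIGHLIGHT]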

Posted

Any good news about this patch?

Another point of view:

It is also possible that we have to store all guild events even though only a smaller part of them is sent to the client.

Maybe there should be another table that stores old events (no longer used by mangos), where we would move the oldest guild events.

Or another solution: we just keep adding new events, and when mangos starts up, we delete the oldest, unused events. I mean something like:

SELECT * FROM guild_event_log WHERE guild_id = xx ORDER BY log_id LIMIT 100;

Then we process the result in mangos and find the log_id of the oldest record.

Then we know minimal_used_log_id:

DELETE FROM guild_event_log WHERE guild_id = xx AND log_id < minimal_used_log_id; 

So log_id will grow towards infinity... but we get fast and simple code to handle it.

Posted

I tried to reimplement the code that loads the guild_eventlog and guild_bank_eventlog tables.

I renamed a few variables, which is why the patch is big.

The patch has been tested a bit; no unexpected behavior was found.

The patch adds new config options: Guild.EventLogRecordsCount and Guild.BankEventLogRecordsCount.

[url]http://paste2.org/p/387609[/url]

The real change in this patch isn't big; I just changed the way events are stored.

Please test it if you have time.

SQL scripts needed to run the patch:


[HIGHLIGHT=SQL]
-- THIS SCRIPT DELETES table `guild_eventlog` - MAKE A BACKUP if you need it.

DROP TABLE IF EXISTS `guild_eventlog`;
CREATE TABLE `guild_eventlog` (
 `guildid` int(11) NOT NULL COMMENT 'Guild Identificator',
 `LogGuid` int(11) NOT NULL COMMENT 'Log record identificator - auxiliary column',
 `EventType` tinyint(1) NOT NULL COMMENT 'Event type',
 `PlayerGuid1` int(11) NOT NULL COMMENT 'Player 1',
 `PlayerGuid2` int(11) NOT NULL COMMENT 'Player 2',
 `NewRank` tinyint(2) NOT NULL COMMENT 'New rank(in case promotion/demotion)',
 `TimeStamp` bigint(20) NOT NULL COMMENT 'Event UNIX time',
 PRIMARY KEY (`guildid`, `LogGuid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT 'Guild Eventlog';

-- The reason I decided on such a dramatic change is that the old guild_eventlog table had no primary key and
-- used LogGuids from 0 to infinity.
-- The new system uses LogGuids from 0 to the number defined in the config.
[/HIGHLIGHT]



[HIGHLIGHT=SQL]
-- THIS SCRIPT DELETES table `guild_bank_eventlog` - MAKE A BACKUP if you need it.

DROP TABLE IF EXISTS `guild_bank_eventlog`;
CREATE TABLE `guild_bank_eventlog` (
 `guildid` int(11) unsigned NOT NULL default '0' COMMENT 'Guild Identificator',
 `LogGuid` int(11) unsigned NOT NULL default '0' COMMENT 'Log record identificator - auxiliary column',
 `TabId` tinyint(3) unsigned NOT NULL default '0' COMMENT 'Guild bank TabId',
 `EventType` tinyint(3) unsigned NOT NULL default '0' COMMENT 'Event type',
 `PlayerGuid` int(11) unsigned NOT NULL default '0',
 `ItemOrMoney` int(11) unsigned NOT NULL default '0',
 `ItemStackCount` tinyint(3) unsigned NOT NULL default '0',
 `DestTabId` tinyint(1) unsigned NOT NULL default '0' COMMENT 'Destination Tab Id',
 `TimeStamp` bigint(20) unsigned NOT NULL default '0' COMMENT 'Event UNIX time',
 PRIMARY KEY (`guildid`,`LogGuid`,`TabId`),
 KEY `guildid_key` (`guildid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- The reason I decided on such a dramatic change is that the old guild_bank_eventlog table used `TabId` = 0 for money events and
-- used `LogGuid` from 0 to infinity.
-- The new system uses `LogGuid` from 0 to the number defined in the config.
[/HIGHLIGHT]
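To make the capped scheme concrete, here is one plausible write pattern under it - a sketch only, not necessarily what the committed code does; the slot value 37 and the cap of 100 (Guild.EventLogRecordsCount) are placeholders:

[HIGHLIGHT=SQL]
-- Overwrite the oldest slot in the 0..99 ring; the core can track which
-- LogGuid comes next in memory, so no renumbering query is ever needed.
DELETE FROM guild_eventlog WHERE guildid = 1 AND LogGuid = 37;
INSERT INTO guild_eventlog
  (guildid, LogGuid, EventType, PlayerGuid1, PlayerGuid2, NewRank, TimeStamp)
VALUES
  (1, 37, 5, 1026267, 1245903, 2, UNIX_TIMESTAMP());
[/HIGHLIGHT]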

Posted

Patch committed in [8402].

Remember, the patch drops the guild_eventlog and guild_bank_eventlog tables, so make a backup if you need any info from those tables.

Thanks to leak for pointing out the problem and for the useful comments.
