Speeding up XWatch
This is about XWATCH-146.
To resume the discussions on the mailing lists about speeding up Watch, here are a couple of summarized results.
The results in the tables that follow were obtained for 3 databases of 20, 60 and 120 thousands articles. I need to mention that the 20 000 articles database is a "real" database while the other two are generated databases. Even if I tried to generate data as random as possible for the two, the fact remains that the entropy is not quite the same as in real situations (for example, all articles belonging to a feed get inserted in the database at successive positions). On the other hand, I tried to randomize as much as possible the values that are used as keys in the database relations: feed names, feed urls, article names.
Keep in mind that most of the time results were pretty unstable, I tried to write down in the tables the 'overall feeling', so that we can get an idea about the improvements these methods could bring on XWiki Watch and their scalability.
Non-indexed database
Here are the results obtained with no indexes on the tables other than the ones in the database administration guide.
<style type="text/css"> table tr.c20 { background-color: #FFF8A5; } table tr.c60 { background-color: #FFD66C; } table tr.c120 { background-color: #FF9849; } table table { width: 100%; } tr.c20 td, tr.c60 td, tr.c120 td { color: #000000; } table.results td { color: #000000; } </style> <table class="results"> <tr><td>Operation</td> <td> Sql </td> <td>Optimized Sql</td> <td>Lucene</td></tr> <tr> <td> Initial load (first time)</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>30-40 seconds </td></tr><tr class="c120"><td>120:</td><td>can go up to 60s</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>around 10</td></tr><tr class="c120"><td>120:</td><td>can go up to 60s</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>up to 20 (15-16)</td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>Initial load</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>~15 seconds</td></tr><tr class="c120"><td>120:</td><td>20-23?</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>7s</td></tr><tr class="c120"><td>120:</td><td>13-14s</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>~4-5 on average (but can go up to 10)</td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>All articles</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>around 7-8-9 seconds </td></tr><tr class="c120"><td>120:</td><td>17-18</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>5 on average, from 3 to 7</td></tr><tr class="c120"><td>120:</td><td>5-6 but can go as low as under a second</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>1</td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>One feed</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>3 seconds</td></tr><tr class="c120"><td>120:</td><td>7-8 seconds, up to 10</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>under a second (0.7-0.8)</td></tr><tr class="c120"><td>120:</td><td>under a second but can go up to 2-3</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>from under a second to a couple seconds </td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>Pagination navigation</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>2-3 seconds </td></tr><tr class="c120"><td>120:</td><td>4-5 seconds but can go up to 10-15</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>2-3 on average (from under one second to 4-5)</td></tr><tr class="c120"><td>120:</td><td>usually under a second but can go to 4-5</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>1s on average</td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>Keyword search</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td>7-8 on average; as low as 5 but as high as 15</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td>3-4 on average. From under a second to 5-6</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>Tag search</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>Flagged articles</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>Combined filter (kw + tag + flag)</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> </table>
Indexed database
Note first that Lucene times should not change if the custom mapping tables for the feeds are indexed, since the Lucene retrieval does not use the database but for getting the documents, and object mapping in documents only uses the feeds tables ids, which are primary keys therefore indexed by default.
And since we can assume that indexing will always be possible for feeds tables, we focused our research on this case, namely to see how good can optimised sql get and how does it scale.
The indexes that we used are:
alter table feeds_feedentry add index (fee_title(255)), add index (fee_author(255)), add index (fee_category(255)), add index (fee_content(1024)), add index (fee_fullcontent(1024));
alter table feeds_aggregatorurl add key (agg_url(255)), add key (agg_name(255));
alter table feeds_aggregatorgroup add key (agg_name (255));
alter table feeds_aggregatorurlgroups add key (agl_value);
alter table feeds_feedentrytags add key (fet_value);
alter table feeds_keyword add key (key_name (255)), add key (key_group (255));
to try to optimise the following queries:
- default select for getting the list of all articles:
from XWikiDocument as doc, BaseObject as obj, XWiki.FeedEntryClass as feedentry
where doc.fullName=obj.name and obj.className='XWiki.FeedEntryClass' and obj.id=feedentry.id and (feedentry.flag>-1 or feedentry.flag is null) and doc.web='Watch'
order by feedentry.date desc
which translates to the following sql query:
from xwikidoc xwikidocum0_, xwikiobjects baseobject1_, feeds_feedentry feedentryc2_
where xwikidocum0_.XWD_FULLNAME=baseobject1_.XWO_NAME and baseobject1_.XWO_CLASSNAME='XWiki.FeedEntryClass' and baseobject1_.XWO_ID=feedentryc2_.fee_id and (feedentryc2_.fee_flag>-1 or feedentryc2_.fee_flag is null) and xwikidocum0_.XWD_WEB='Watch'
order by feedentryc2_.fee_date desc
- the optimised query for the articles list:
from BaseObject as obj, XWiki.FeedEntryClass as feedentry where obj.id=feedentry.id and obj.className='XWiki.FeedEntryClass' and (feedentry.flag>-1 or feedentry.flag is null) and obj.name like 'Watch.%'
order by feedentry.date desc
which is translated to the following query:
from xwikiobjects baseobject0_, feeds_feedentry feedentryc1_
where baseobject0_.XWO_ID=feedentryc1_.fee_id and baseobject0_.XWO_CLASSNAME='XWiki.FeedEntryClass' and (feedentryc1_.fee_flag>-1 or feedentryc1_.fee_flag is null) and (baseobject0_.XWO_NAME like 'Watch.%')
order by feedentryc1_.fee_date desc
A few hacks, for the sake of performance (since "fast" and "clean" are two words that rarely go together) were to:
- drop the documents table from the query and get the document names from the object names. The only place where the documents table was used was to test the space, which we replaced by "obj.name like 'Watch.%'". There are cases in which this can return odd results (if database is mysql then like is case insensitive by default) but I think they're rare enough.
- drop the distinct condition for the returned results: the only situation when a document name can appear twice is when there are two objects of class XWiki.FeedEntryClass in the same document in the Watch space, which does not happen by normal usage of XWiki Watch. The distinct constraint is costfull to evaluat especially for large result sets. Note that, even if we query with a small limit, the distinct and the ordering are performed for the whole result set by the database engine.
The results are the following, noting that we focused on the performance on the 20 000 articles real database and the 120 000 articles database, to see how the optimisation scales.
<table class="results"> <tr><td>Operation</td> <td> Sql </td> <td>Optimized Sql</td> <td>Lucene</td></tr> <tr> <td> Initial load (first time)</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td>4s?</td></tr></table></td> </tr> <tr> <td>Initial load</td> <td><table><tr class="c20"><td>20:</td><td>8-10s</td></tr><tr class="c60"><td>60:</td><td>15-20s</td></tr><tr class="c120"><td>120:</td><td>30s and higher</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td>2-3 seconds on average; can go under 1 second</td></tr><tr class="c60"><td>60:</td><td>4s</td></tr><tr class="c120"><td>120:</td><td>3-4s? can go up to 12 though</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>All articles</td> <td><table><tr class="c20"><td>20:</td><td>6-7s on average</td></tr><tr class="c60"><td>60:</td><td>7-8-9s</td></tr><tr class="c120"><td>120:</td><td>15s and higher</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td>1s; can go a lot under</td></tr><tr class="c60"><td>60:</td><td>0.6-0.7s</td></tr><tr class="c120"><td>120:</td><td>can be very fast, less than 0.5s but also up to 3</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td>1.5s</td></tr></table></td> </tr> <tr> <td>One feed</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td>2-3s</td></tr><tr class="c120"><td>120:</td><td>1-3s but also 10s</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td>0.5 seconds on average; never > 1s</td></tr><tr class="c60"><td>60:</td><td>0.5s</td></tr><tr class="c120"><td>120:</td><td>under a second but also a lot above, 3-4-5</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td>0.8, very low for few articles and as high as 1.5 for 1500+</td></tr></table></td> </tr> <tr> <td>Pagination navigation</td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td>1s on average</td></tr></table></td> </tr> <tr> <td>Keyword search</td> <td><table><tr class="c20"><td>20:</td><td>5s</td></tr><tr class="c60"><td>60:</td><td>5s</td></tr><tr class="c120"><td>120:</td><td>8-9s?</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td>0.8-1s</td></tr><tr class="c60"><td>60:</td><td>1s?</td></tr><tr class="c120"><td>120:</td><td>1.2s?</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td>0.5s</td></tr></table></td> </tr> <tr> <td>Tag search (see note) </td> <td><table><tr class="c20"><td>20:</td><td>6-7s</td></tr><tr class="c60"><td>60:</td><td>2-3s</td></tr><tr class="c120"><td>120:</td><td>6s?</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td>0.8-1s</td></tr><tr class="c60"><td>60:</td><td>0.9s?</td></tr><tr class="c120"><td>120:</td><td>1.2s</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> <tr> <td>Flagged articles</td> <td><table><tr class="c20"><td>20:</td><td>0.5s?</td></tr><tr class="c60"><td>60:</td><td>0.6s</td></tr><tr class="c120"><td>120:</td><td>1-2s</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td>0.3s</td></tr><tr class="c60"><td>60:</td><td>0.3s</td></tr><tr class="c120"><td>120:</td><td>0.5s</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td>0.8</td></tr></table></td> </tr> <tr> <td>Combined filter</td> <td><table><tr class="c20"><td>20:</td><td>0.5s?</td></tr><tr class="c60"><td>60:</td><td>from under a second to 3-4s, depends on the filter</td></tr><tr class="c120"><td>120:</td><td>?</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td>0.3s-0.4s</td></tr><tr class="c60"><td>60:</td><td>very low, 0.1. depends on the criteria too.</td></tr><tr class="c120"><td>120:</td><td>as low as 0.2-0.3 but also depends on filter</td></tr></table></td> <td><table><tr class="c20"><td>20:</td><td></td></tr><tr class="c60"><td>60:</td><td></td></tr><tr class="c120"><td>120:</td><td></td></tr></table></td> </tr> </table>
Remarks
- in the indexed table we notice a substantial improvement of the optimized sql over the "default" sql and its relative proximity to the results brought by lucene. The interesting results, from the speed improvement point of view are the ones one the 20 000 articles database, which is a real database with "real entropy"
- even if the table might not show it, the optimised sql results go often enough under 1 second for basic operations such as listing all articles for a feed and basic filtering and these times scale up pretty well.