Parallel Execution Skew – Addressing Skew Using Manual Rewrites

Comments 0

Share to social media

In this part of the series we will have a look at different approaches to how Parallel Execution Skew can be addressed via manual re-writes of an affected query. Let’s have a look at a couple of options.

Approach 1: Mimic the new 12c skew aware feature

Looking at the new 12c special skew aware distribution feature that I’ve introduced in the previous instalment of this series, we can see that broadcasting the few skewed values and using a hash distribution for the remaining values is a viable solution to the problem, so we could come up with something like the following:

I’ve split the original query into two parts: The first part of the new query covers the non-popular values that should distribute well using a hash distribution – it explicitly filters the popular values.

The second part covers the popular values only. Based on the assumption that there are only a few popular values out of many, restricting T_1 on those popular values should actually result in a very small row source in comparison to T_2 – which is filtered on the popular values, too.

So this very small row source should be distributed by BROADCAST automatically by the optimizer, otherwise this could be hinted using a corresponding PQ_DISTRIBUTE hint. By using the BROADCAST distribution for T_1, the skewed value distribution in T_2.FK_ID_SKEW doesn’t matter as this information is not used to distribute the data – in this case it is simply a random distribution based on the PX BLOCK ITERATOR operator of the parallel full table scan of T_2.

Running this query on my test system I’m indeed back to my optimal 15 seconds runtime. However this construct has several disadvantages:

– You need to know your data and explicitly code into the query the popular values. It might be possible to automate this by dynamically generating the SQL based on a previous query that looks for popular values in the join columns, similar to what 12c does under the cover

– It requires accessing the row sources twice, instead of once – certainly a drawback for large row sources and more complex queries where the row source is not a simple table but itself already the result of a join

– It is a SQL construct that might be difficult to implement for tools that generate SQL automatically based on metadata like reporting tools

Here you can see the beauty of the new 12c skew aware distribution, as it does all above automatically and dynamically, but unfortunately applies only to a limited number of scenarios at present, as demonstrated in the previous part of this series.

Are there other ways we can achieve good distribution that don’t have above disadvantages?

Another possibility is a re-mapping of the popular values, basically spreading the popular values among many other values and by doing so avoiding the skewed distribution.

Assuming that the join keys are positive integers we could make use of the unused range of negative integer values to re-map the popular values, thus making sure that the remapped values don’t overlap with the original values in the columns. Depending on the actual case different solutions might be required for finding suitable values that can be used for re-mapping.

For that purpose let’s create a table that we can use to re-map our original popular values:

There are probably many ways how this table could be set up and used – above code assumes that there are several positive integer values that are popular in some tables and for each of those values it will generate &distrib_size entries of negative integer values in a non-overlapping way.

You can already see at this stage that this approach still requires you to know the data – as the table needs to be pre-populated, so it doesn’t address point number one above.

Now we can re-write our sample query in the following way to make use of the new table:

The approach here is to use a view for the “other” row source holding the popular values as single entries / primary keys that spreads the popular values across &distrib_size other values that don’t overlap with any other existing values in the tables, and doesn’t touch any other value in that table that isn’t popular, hence the outer join between T_1 and T_REDISTRIB. I’ve filtered T_REDISTRIB on the popular values to re-map in that particular usage of the table, assuming that T_REDISTRIB could be used for other joins with other tables / values, too, otherwise this filter is unnecessary.

Then this view is joined to the row source with the popular value. Assuming there is a suitable value in that table (obviously most suitable in this case would be a primary key using positive integers) we can make use of that value to re-map the popular values to the same range of values as used to populate the T_REDISTRIB table and that come out of the view T_1_R for the popular values.

In the example above the popular value 1 according to T_REDISTRIB will be remapped to 10000 values ranging from -10001 to -20000, so instead of one row with value ID = 1 the view T_1_R will return 10000 rows in the negative range mentioned. The join expression “case when t_2.fk_id_skew in (1/*, 3, 7*/) then (t_2.fk_id_skew * -&distrib_size) – mod(t_2.id, &distrib_size) – 1 else t_2.fk_id_skew end” will do the same re-mapping for T_2.FK_ID_SKEW based on the primary key value T_2.ID, so whenever the original value is 1 it will be remapped to some random value in the range between -10001 and -20000.

The net effect of that re-mapping on both sides of the join is that the popular value will be replaced by &distrib_size other values and hence the hash distribution will be based on those replacement values rather than the original popular values without altering the result of the query.

If I run this query on my test system I’m again back to the optimal 15 seconds, so this approach seems to work fine. The good thing here is that we don’t need to access the row sources twice, so we address at least one important disadvantage of the other solution above. However, this construct is still hard to achieve with a generic SQL generator, and when looking at the execution plan the optimizer is clearly confused by the complex join expression and therefore the estimated join cardinality is way off – a potential problem for more complex queries where this join result would be to be joined to other row sources:

An Optimized Approach 2

So can we do better? Yes, we can. We can do two things. First, we can put the inline view T_1_R into a stored view, so that it could be used like a regular table/view:

Second, at least from Oracle 11g on we can make use of a virtual column for the complex join expression (and for pre-11g versions we could at least create a function-based index on the expression to collect statistics for the optimizer to come up with improved cardinality estimates):

The effect of the virtual column is two-fold: First, we can now simply use the new virtual column in our join expression. Second, when gathering statistics for the virtual column, the optimizer can use these column statistics to come up with much improved join cardinality estimates. Our query would now look like this:

That looks pretty similar to our original query and should be something that could be generated by reporting tools or similar. The join cardinality estimate looks much better now as well if you check the resulting execution plan.

So this approach at least addresses two out of the three disadvantages mentioned above, which is not too bad. You still need to know the popular values and code that into the view and join expression, but apart from that this approach looks pretty good.

Article tags

Load comments

About the author

Randolf Geist's contributions