DB:
MySQL 5.7
Table Engine:
ENGINE=InnoDB
Query:
SELECT count(*) as COUNT
FROM Data d
WHERE d.StartDate >=?
AND d.StartDate <?
AND d.EntityID IN (1245)
AND d.Condition01 > 0
AND d.Condition02 = 'abs'
AND (d.Condition03 LIKE '%abs%' OR p.Condition04 LIKE '%abs%');
Parameters:
String start = "2021-12-01 00:00:00";
String end = "2022-04-01 00:00:00";
On a 200+ millions of rows query executes 20 seconds.
If start and date range divide to smaller periods, for example:
2021-12-01 00:00:00 2021-12-02 00:00:00
2021-12-02 00:00:00 2021-12-03 00:00:00
...
2022-03-30 00:00:00 2022-03-31 00:00:00
2022-03-31 00:00:00 2022-04-01 00:00:00
and execute queries with this parameters in parallel results will be obtained in 2 seconds.
Approach works well on Server with MySQL 5.6 but does not work on Server with MySQL 5.7.
Is it MySQL inner optimization or can be something else?
Can TABLE LOCK cause problem?
What else can cause the same long execution for count queries in parallel execution as for one query with big date range?
Solution of the problem
For that query, this is likely to be optimal:
INDEX(Condition02, EntityID, StartDate)
We could discuss a potentially faster approach using Summary Tables, but first, tell us some details about the selectivity of each column.
Aucun commentaire:
Enregistrer un commentaire