Events: Stats sometimes too slow. #8
On IntelMQ-CB-Mailgen installations with large databases, we see that the stats calls are slow. A typical query first selects a time frame, which gives a very large result set, and then applies more criteria, for which no index is used anymore, because the indexes used so far do not carry information that would allow a second index to be applied to the result set of the first. Because the other search criteria also match often, it usually does not make sense to apply the other criteria first either: the intermediate result set would also be large, leading to the same problem. The idea is to change the structure of the database to allow indexes to be applied one after the other. At least the advantage of the time range should be used.
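One possible way to let a second criterion profit from the same index scan as the time range is a multicolumn index. A minimal sketch, assuming an events table with IntelMQ-style column names (the actual schema and the useful column combination are assumptions):

-- Hypothetical multicolumn index: the equality criterion comes first,
-- the time range second, so a query for one classification type within
-- a time frame can be answered by a single narrow index scan.
CREATE INDEX events_type_time_idx
    ON events ("classification.type", "time.source");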
If apache2 is used as the HTTPS server, we can add the time it takes to serve a request to the LogFormat: e431153
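A sketch of how this can look with Apache's %D directive, which logs the time taken to serve the request in microseconds; the exact format string and nickname used in e431153 may differ:

# hypothetical variant of the combined format with the duration appended
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D" combined_duration

Appending %D at the end means the serving time is the last field of each log line, which is what the sed command further below expects.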
If apache is used for logging, the relevant log files are:

Here are some commands that can help to count certain requests, for example requests with parameters in the URL:

or only requests not made by the client python-requests:
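A sketch of what such counting commands could look like; the file name follows the pipeline further below, and the exact patterns are assumptions:

# count requests that carry parameters in the URL (a literal "?"):
zgrep --count '[?]' fody-backend-access.log.2.gz

# the same, but only requests not made by python-requests:
zgrep '[?]' fody-backend-access.log.2.gz | grep --invert-match --count python-requests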
Analysis

One problem seems to be that many queries count the matching rows. This is likely to have an impact, as a real count in PostgreSQL has to visit the rows.

Estimate of table size

As a real count can take long on a big table, PostgreSQL's own estimate of the table size can be queried instead:

SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = to_regclass('public.events');

Understand where time is spent

Long queries can be understood better by issuing a statement like the following:

EXPLAIN (ANALYZE, BUFFERS) statement
Fine tuning

Measurements with EXPLAIN and atop showed that there was room for fine-tuning of PostgreSQL 9.5. One source of optimisation ideas is https://pgtune.leopard.in.ua .

Looking at logs

The following commands, in the spirit of https://en.wikipedia.org/wiki/The_Unix_Programming_Environment , list the slow requests from the apache log:

zgrep --invert-match python-requests fody-backend-access.log.2.gz | \
  sed 's/\(^.*\) \([0-9]*$\)/\2 \1/' | \
  awk '{ if ($1 > 1000) print $0 }' | \
  sort -n

Short explanation: the zgrep drops all requests made by python-requests; the sed moves the last field of each line, the serving time added by the LogFormat above, to the front; the awk keeps only lines where that time exceeds 1000; and sort -n orders the result by serving time.
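A variant of the same pipeline that may be handy: show the slowest requests first, limited to the top ten:

zgrep --invert-match python-requests fody-backend-access.log.2.gz | \
  sed 's/\(^.*\) \([0-9]*$\)/\2 \1/' | \
  sort --reverse --numeric-sort | \
  head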
When running a query over a full month, it may take several minutes (up to 10) on a test system we are running, which has a couple of million events in that time frame (e.g. 25M events).
One reason is that the SQL query sometimes uses the index and sometimes does not.
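One way to see why the planner flips between plans is to compare the plan it picks by default with the plan it produces when sequential scans are discouraged; a sketch, reusing the assumed query from above:

EXPLAIN SELECT count(*) FROM events
 WHERE "time.source" >= '2017-06-01' AND "time.source" < '2017-07-01';

SET enable_seqscan = off;  -- for diagnosis in this session only

EXPLAIN SELECT count(*) FROM events
 WHERE "time.source" >= '2017-06-01' AND "time.source" < '2017-07-01';

RESET enable_seqscan;

If the cost estimates of the two plans are close, small changes in the table statistics can make the planner switch between them, which would explain the varying run times.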