[CWB] [ cwb-Bugs-2893764 ] CQPweb: number of files in query is not cached
SourceForge.net
noreply at sourceforge.net
Sat Nov 7 10:53:33 CET 2009
Bugs item #2893764, was opened at 2009-11-07 09:53
Message generated for change (Tracker Item Submitted) made by andrewhardie
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=722303&aid=2893764&group_id=131809
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: CQPweb
Group: None
Status: Open
Resolution: None
Priority: 8
Private: No
Submitted By: Andrew Hardie (andrewhardie)
Assigned to: Andrew Hardie (andrewhardie)
Summary: CQPweb: number of files in query is not cached
Initial Comment:
KWIC display is _painfully_ slow for large result sets, especially those with more than 1 million matches.
The culprit is the following line in lib/concordance.inc.php:
/* get a list of texts with frequencies && count 'em */
$num_of_files = count( $cqp->execute("group $qname match text_id") );
So whenever a page of query hits is displayed, you use CQP's "group"
command to re-calculate the number of different texts containing
matches. This can be very expensive, so the information _must_ be
cached somewhere in the database.
Solution: add a "number of files" field to table saved_queries, and read this instead of running CQP's group command.
That way, the group command will only be run when a new query is cached for the first time.
(This must also cover the creation of postprocessed (thinned) files; it can probably be generalised in concordance-post.inc.php.)
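The caching idea could look roughly like the sketch below. It is not the actual CQPweb code: the function name get_num_of_files, the field name num_of_files, the StubCQP class, and the array standing in for the saved_queries table are all assumptions made so that the snippet runs on its own. Only the "group $qname match text_id" command is taken from the report above.

```php
<?php
/* Hypothetical stand-in for the CQP interface object; counts how often it is called. */
class StubCQP
{
    public $calls = 0;

    public function execute($command)
    {
        $this->calls++;
        /* pretend the query has matches in three texts (one output line per text) */
        return array("A01\t12", "A02\t7", "B15\t1");
    }
}

/*
 * Return the number of texts containing matches for $qname.
 * $saved_queries stands in for the saved_queries table, keyed by query name;
 * the 'num_of_files' field is an assumed name for the proposed new column.
 */
function get_num_of_files($cqp, array &$saved_queries, $qname)
{
    if (isset($saved_queries[$qname]['num_of_files'])) {
        /* cache hit: no CQP call needed */
        return $saved_queries[$qname]['num_of_files'];
    }

    /* cache miss: run the expensive group command once and store the result */
    $n = count($cqp->execute("group $qname match text_id"));
    $saved_queries[$qname]['num_of_files'] = $n;
    return $n;
}

$cqp = new StubCQP();
$saved_queries = array();

/* first call populates the cache, second call reads from it */
$first  = get_num_of_files($cqp, $saved_queries, 'myquery');
$second = get_num_of_files($cqp, $saved_queries, 'myquery');
```

In this sketch each cached query carries its own count, so a postprocessed (thinned) query stored under a different name would get its count computed once when it is first cached.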
----------------------------------------------------------------------