[CWB] [ cwb-Feature Requests-2888410 ] CQPweb: derive metadata from XML attribute
SourceForge.net
noreply at sourceforge.net
Mon Dec 14 07:28:10 CET 2009
Feature Requests item #2888410, was opened at 2009-10-29 00:23
Message generated for change (Settings changed) made by andrewhardie
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=722306&aid=2888410&group_id=131809
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: CQPweb
Group: None
>Status: Closed
Priority: 8
Private: No
Submitted By: Andrew Hardie (andrewhardie)
Assigned to: Andrew Hardie (andrewhardie)
Summary: CQPweb: derive metadata from XML attribute
Initial Comment:
For corpora with CWB pre-indexing.
Allow text-level metadata to be extracted from existing XML attributes that have been indexed as s-elements.
Use the following CQP trick:
> Texts = <text> [];
> tabulate Texts match text_id, match cat_name, match cat_code, match cat_major > "brown_meta.txt";
(Actually, it might be possible to get the data straight from the CQP::execute method - look into this)
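Once the tabulate command has written its output, turning it into metadata rows could look roughly like the sketch below. It assumes the output is one line per text with tab-separated values in the order requested (text_id, cat_name, cat_code, cat_major). The function name and the sample values are made up for illustration and are not part of CQPweb.

```python
# Hypothetical helper, not CQPweb code: parse tabulate output into rows.
# Assumption: one line per <text> region, values separated by tabs,
# in the same order as the "match ..." fields of the tabulate command.

FIELDS = ["text_id", "cat_name", "cat_code", "cat_major"]


def parse_tabulate_output(raw, fields=FIELDS):
    """Return one dict per text, keyed by metadata field name."""
    rows = []
    seen_ids = set()
    for line_no, line in enumerate(raw.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        values = line.split("\t")
        if len(values) != len(fields):
            raise ValueError(
                f"line {line_no}: expected {len(fields)} columns, got {len(values)}"
            )
        row = dict(zip(fields, values))
        # A text-level metadata table needs exactly one row per text.
        if row["text_id"] in seen_ids:
            raise ValueError(f"line {line_no}: duplicate text_id {row['text_id']!r}")
        seen_ids.add(row["text_id"])
        rows.append(row)
    return rows


# Invented sample data in the assumed format.
sample = "A01\tPress: reportage\tA\tPress\nB01\tPress: editorial\tB\tPress\n"
rows = parse_tabulate_output(sample)
```

The duplicate check is there because a metadata table keyed on text_id cannot hold two rows for the same text. Whether CQPweb should reject or merge such rows is left open here.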
----------------------------------------------------------------------
>Comment By: Andrew Hardie (andrewhardie)
Date: 2009-12-14 06:28
Message:
Both new methods are complete in the latest commit, but extracting metadata
from s-atts has not been tested, as I do not have any corpora to test it on.
----------------------------------------------------------------------
Comment By: Andrew Hardie (andrewhardie)
Date: 2009-11-07 07:07
Message:
Also, allow a metadata table to be auto-generated containing just the
text_ids, for corpora without any metadata fields.
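A minimal sketch of the ID-only table, assuming the text_ids have already been read from the corpus in some way (for example via the tabulate trick with only "match text_id"). The function names are hypothetical and the one-column, tab-separated layout is an assumption, not CQPweb's actual table format.

```python
# Hypothetical helpers, not CQPweb code: build a metadata table that
# contains nothing but the text_id column.


def build_id_only_table(text_ids):
    """Return one row per distinct text_id, keeping first-seen order."""
    seen = set()
    rows = []
    for text_id in text_ids:
        if text_id in seen:
            continue  # each text gets exactly one row
        seen.add(text_id)
        rows.append({"text_id": text_id})
    return rows


def to_tsv(rows):
    """Serialise the rows as one text_id per line."""
    return "".join(row["text_id"] + "\n" for row in rows)


# Invented sample IDs for illustration.
table = build_id_only_table(["A01", "A02", "A01", "B01"])
tsv = to_tsv(table)
```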
----------------------------------------------------------------------