[CWB] Alternatives and pattern order in queries
Sébastien Jacquot
sebastien.jacquot at univ-fcomte.fr
Tue Apr 7 11:35:55 CEST 2015
Hi,
in a corpus with this structure:
<text date="1900">
<body>
<p>...<q>...</q>...</p>
<p>...</p>
</body>
</text>
<text date="1901">
<body>
<p>...</p>
<p>...</p>
</body>
</text>
...
I'd like to get the tokens outside the "q" tags.
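To make the intended result concrete, here is a minimal Python sketch (not CQP) of what "tokens outside the q tags" means. The flat list of tags and tokens, the sample words, and the function name tokens_outside_q are all hypothetical and only serve as illustration; this is not how CWB stores or searches a corpus.

```python
def tokens_outside_q(stream):
    """Return the tokens that are not enclosed in a <q>...</q> region."""
    depth = 0          # how many <q> regions are currently open
    outside = []
    for item in stream:
        if item == "<q>":
            depth += 1
        elif item == "</q>":
            depth = max(depth - 1, 0)
        elif item.startswith("<") and item.endswith(">"):
            continue   # any other structural tag (<text ...>, <p>, ...) is skipped
        elif depth == 0:
            outside.append(item)
    return outside


# Hypothetical sample mirroring the corpus structure described above.
stream = ['<text date="1900">', "<body>", "<p>", "He", "said",
          "<q>", "hello", "there", "</q>", "loudly", "</p>",
          "</body>", "</text>"]
print(tokens_outside_q(stream))
```

A correct query should return exactly the tokens that this kind of filter keeps, whatever the order of its alternatives.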
Do you know why these two queries don't return the same tokens?
<text>[!q]+<q> | </q>[!q]+<q> | </q>[!q]+</text>;
</q>[!q]+</text> | <text>[!q]+<q> | </q>[!q]+<q>;
The first query doesn't work as expected: the returned tokens match only
the first alternative, <text>[!q]+<q>,
as if the pipe character acted like a boolean OR condition instead
of a regular-expression alternation.
The second query seems to work as expected and returns all the tokens
outside the "q" tags.
Could the two behaviors differ because of the matching strategy
setting?
Thanks in advance for the help.
Cheers,
Sebastian
--
ELLIADD, EA 4661
UFR SLHS - Université de Franche-Comté
30-32 rue Mégevand
25030 Besançon cedex
03.81.66.54.22