<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns="http://www.w3.org/TR/REC-html40" xmlns:v =
"urn:schemas-microsoft-com:vml" xmlns:o =
"urn:schemas-microsoft-com:office:office" xmlns:w =
"urn:schemas-microsoft-com:office:word" xmlns:m =
"http://schemas.microsoft.com/office/2004/12/omml"><HEAD>
<META content="text/html; charset=utf-8" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 9.00.8112.16872">
<STYLE>@font-face {
        font-family: Cambria Math;
}
@font-face {
        font-family: Calibri;
}
@font-face {
        font-family: Verdana;
}
@page WordSection1 {size: 612.0pt 792.0pt; margin: 72.0pt 72.0pt 72.0pt 72.0pt; }
P.MsoNormal {
        MARGIN: 0cm 0cm 0pt; FONT-FAMILY: "Times New Roman",serif; FONT-SIZE: 12pt
}
LI.MsoNormal {
        MARGIN: 0cm 0cm 0pt; FONT-FAMILY: "Times New Roman",serif; FONT-SIZE: 12pt
}
DIV.MsoNormal {
        MARGIN: 0cm 0cm 0pt; FONT-FAMILY: "Times New Roman",serif; FONT-SIZE: 12pt
}
A:link {
        COLOR: blue; TEXT-DECORATION: underline; mso-style-priority: 99
}
SPAN.MsoHyperlink {
        COLOR: blue; TEXT-DECORATION: underline; mso-style-priority: 99
}
A:visited {
        COLOR: purple; TEXT-DECORATION: underline; mso-style-priority: 99
}
SPAN.MsoHyperlinkFollowed {
        COLOR: purple; TEXT-DECORATION: underline; mso-style-priority: 99
}
P.msonormal0 {
        FONT-FAMILY: "Times New Roman",serif; MARGIN-LEFT: 0cm; FONT-SIZE: 12pt; MARGIN-RIGHT: 0cm; mso-style-name: msonormal; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto
}
LI.msonormal0 {
        FONT-FAMILY: "Times New Roman",serif; MARGIN-LEFT: 0cm; FONT-SIZE: 12pt; MARGIN-RIGHT: 0cm; mso-style-name: msonormal; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto
}
DIV.msonormal0 {
        FONT-FAMILY: "Times New Roman",serif; MARGIN-LEFT: 0cm; FONT-SIZE: 12pt; MARGIN-RIGHT: 0cm; mso-style-name: msonormal; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto
}
SPAN.EmailStyle18 {
        FONT-STYLE: normal; FONT-FAMILY: "Verdana",sans-serif; COLOR: #1f497d; FONT-WEIGHT: normal; TEXT-DECORATION: none; mso-style-type: personal-reply
}
.MsoChpDefault {
        FONT-SIZE: 10pt; mso-style-type: export-only
}
DIV.WordSection1 {
        page: WordSection1
}
</STYLE>
</HEAD>
<BODY lang=EN-GB bgColor=white vLink=purple link=blue>
<DIV><FONT size=2 face="Courier New">Thanks, Andrew, for those constructive
ideas.</FONT></DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New">I have experimented with your second
suggestion of adding a "lemma" column to the input. (For info, what
is marked up in my text is *partial* lemmatisation, covering changes at the
beginning of words, so I'll call it "demut(ation)" rather than
"lemma". Full lemmatisation would require attention to terminal
inflection as well.)</FONT></DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New">So, I could generate extra columns, like
this:</FONT></DIV>
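<DIV><FONT size=2 face="Courier New">b^hean&nbsp;&nbsp;&nbsp;&nbsp;bean&nbsp;&nbsp;&nbsp;&nbsp;bhean</FONT></DIV>
<DIV><FONT size=2 face="Courier New">^mbean&nbsp;&nbsp;&nbsp;&nbsp;bean&nbsp;&nbsp;&nbsp;&nbsp;mbean</FONT></DIV>
<DIV><FONT size=2 face="Courier New">Bean&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bean&nbsp;&nbsp;&nbsp;&nbsp;Bean</FONT></DIV>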
<DIV><FONT size=2 face="Courier New">The first column is what is in the
text; this column can be removed from the file when the
other two have been generated from it. The second is the index term
("demut"). The third is what I want to see in contexts
("word").</FONT></DIV>
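<DIV><FONT size=2 face="Courier New">A minimal sketch of how columns 2 and 3 could be generated from column 1, assuming the only markup is "^" placed before the mutation letter. The function names demut and word are illustrative, not part of CWB.</FONT></DIV>
<PRE>
```python
import re

def demut(token):
    """Index term: drop "^" plus the character right after it, then lower-case."""
    return re.sub(r"\^.", "", token).lower()

def word(token):
    """Display form: drop only the "^" marker and keep everything else."""
    return token.replace("^", "")

def columns(token):
    """Return (marked-up text, demut, word) for one token."""
    return token, demut(token), word(token)

for t in ["b^hean", "^mbean", "Bean"]:
    print("\t".join(columns(t)))
```
</PRE>
<DIV><FONT size=2 face="Courier New">Run on the three tokens above, this reproduces the three rows of the table, tab-separated.</FONT></DIV>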
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New">While this will work, <FONT size=2>I am not
comfortable with storing two columns to hold things which (unlike normal
lemmatisation) can be generated automatically from one column. If a
user-supplied script could be run during the indexing process, it could act
on the text shown in column 1 to produce what is shown in
column 2.</FONT></FONT></DIV>
<DIV><FONT size=2 face="Courier New"></FONT><FONT face="Courier New"><FONT
size=2></FONT></FONT> </DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT face="Courier New"><FONT size=2>Turning from the index keywords to
the contexts,</FONT></FONT><FONT face="Courier New"><FONT size=2> I am
unsure how the extra-column approach will handle the case where a single
token of text is to be split into two index items (column 2), which should be
displayed in context without any space between them.</FONT></FONT></DIV>
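<DIV><FONT size=2 face="Courier New">sean+b^hean&nbsp;&nbsp;&nbsp;&nbsp;sean&nbsp;&nbsp;&nbsp;&nbsp;sean+</FONT></DIV>
<DIV><FONT size=2 face="Courier New">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bean&nbsp;&nbsp;&nbsp;&nbsp;bhean</FONT></DIV>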
<DIV><FONT size=2 face="Courier New">Here I have used a + sign at the end of an
item in column 3 to show that no space should be inserted in the context
before the following word. Is there already a way of doing this in
CWB? If not, a user-supplied script run during the production of
contexts could act on the text shown in column 1 to
produce "seanbhean".</FONT></DIV>
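<DIV><FONT size=2 face="Courier New">A sketch of the splitting and rejoining described above, assuming "+" only ever marks a join inside a token. The names rows and context are hypothetical, and nothing here relies on an existing CWB feature.</FONT></DIV>
<PRE>
```python
import re

def rows(token):
    """Split a marked-up token on "+" into (demut, word) rows.

    Every part except the last keeps a trailing "+" in the word column,
    meaning "no space before the next word".
    """
    parts = token.split("+")
    out = []
    for i, part in enumerate(parts):
        index_term = re.sub(r"\^.", "", part).lower()
        display = part.replace("^", "")
        if i != len(parts) - 1:
            display += "+"
        out.append((index_term, display))
    return out

def context(words):
    """Join word-column values, suppressing the space after a trailing "+"."""
    text = ""
    for w in words:
        if w.endswith("+"):
            text += w[:-1]
        else:
            text += w + " "
    return text.strip()

print(rows("sean+b^hean"))
print(context(["sean+", "bhean"]))
```
</PRE>
<DIV><FONT size=2 face="Courier New">Here rows supplies the two index items and context rebuilds the display form without the internal space.</FONT></DIV>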
<DIV><FONT size=2 face="Courier New"></FONT><FONT size=2
face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New">My own software provides a proof of
concept for processing text marked up as in column 1 above: it interprets
the markup during both the extraction of indexing terms and the production
of contexts. I would still like the CWB developers to consider my request
for a facility to execute a user-supplied script at these two points in
the process.</FONT></DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New"></FONT> </DIV>
<DIV><FONT size=2 face="Courier New">Many thanks again for your
advice,</FONT></DIV>
<DIV><FONT size=2 face="Courier New">Ciarán.</FONT></DIV>
<BLOCKQUOTE
style="BORDER-LEFT: #000000 2px solid; PADDING-LEFT: 5px; PADDING-RIGHT: 0px; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px">
<DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>
<DIV
style="FONT: 10pt arial; BACKGROUND: #e4e4e4; font-color: black"><B>From:</B>
<A title=a.hardie@lancaster.ac.uk
href="mailto:a.hardie@lancaster.ac.uk">Hardie, Andrew</A> </DIV>
<DIV style="FONT: 10pt arial"><B>To:</B> <A title=cwb@sslmit.unibo.it
href="mailto:cwb@sslmit.unibo.it">Open source development of the Corpus
WorkBench</A> </DIV>
<DIV style="FONT: 10pt arial"><B>Sent:</B> Friday, March 16, 2018 7:04
PM</DIV>
<DIV style="FONT: 10pt arial"><B>Subject:</B> Re: [CWB] Suggestion: user
intervention in constructing an index</DIV>
<DIV><BR></DIV>
<DIV class=WordSection1>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">Hi</SPAN>
<SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">Ciarán,<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">There
are two answers here… <o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">First,
it most certainly is already possible to adjust the form of the words as they
are indexed. Simply prepare a script to make the change and pipe your files
through it into the cwb-encode standard input (cwb-encode reads from standard
input if no files are specified).<o:p></o:p></SPAN></P>
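<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">A
minimal sketch of such a filter, assuming one token per line, no XML tags in
the input, and the "^" markup described in the original message. The names
convert and run_filter are illustrative only.<o:p></o:p></SPAN></P>
<PRE>
```python
import io
import re
import sys

def convert(line):
    """Turn one marked-up token into a "word TAB index term" line."""
    token = line.rstrip("\n")
    word = token.replace("^", "")
    index_term = re.sub(r"\^.", "", token).lower()
    return word + "\t" + index_term

def run_filter(src, dst):
    """Copy src to dst, converting every non-blank line."""
    for line in src:
        if line.strip():
            dst.write(convert(line) + "\n")

# Demonstration on an in-memory stream; as a real filter one would call
# run_filter(sys.stdin, sys.stdout) instead.
demo = io.StringIO("b^hean\n^mbean\nBean\n")
run_filter(demo, sys.stdout)
```
</PRE>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">Placed
in a pipe in front of cwb-encode, a script like this would hand over
two-column input, so the second column would need to be declared as an
additional attribute when encoding.<o:p></o:p></SPAN></P>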
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">(Or
just run your converter separately on the data to create a modified version,
and then index that, to avoid mucking about with pipes!)<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">Second,
although that is the direct answer to your question, actually it is probably
not “the right thing” to do. What you are talking about here is effectively
lemmatisation – since <I>bean/bhean/mbean</I> are different forms of a single
lemma, converting them all to “bean” means lemmatising. So what you’re talking
about is indexing the lemma in place of the wordform. But the “right way” to
do this in CWB is to add the lemma as a separate attribute – allowing the
lemma to be queried, as well as / instead of the word.<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">This
means adding the lemma as a second column of the input file, like
this:<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P style="MARGIN-LEFT: 36pt" class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">Bean
bean<o:p></o:p></SPAN></P>
<P style="MARGIN-LEFT: 36pt" class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">(…)<o:p></o:p></SPAN></P>
<P style="MARGIN-LEFT: 36pt" class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">ar
ar<o:p></o:p></SPAN></P>
<P style="MARGIN-LEFT: 36pt" class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">mbean
bean<o:p></o:p></SPAN></P>
<P style="MARGIN-LEFT: 36pt" class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">(…)<o:p></o:p></SPAN></P>
<P style="MARGIN-LEFT: 36pt" class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">mo
mo<o:p></o:p></SPAN></P>
<P style="MARGIN-LEFT: 36pt" class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">bhean
bean<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">(and
likewise for plural forms of <I>bean</I>, etc.)<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">I
don’t know what lemmatisation tool is considered standard for Gaelic at the
moment, but I guess there must be options out there?
<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">You
can then do queries like this:<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">
[lemma="bean"];<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">…
to retrieve <I>bean/mbean/bhean</I> all at the same
time.<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">The
advantage of encoding the lemma as a separate attribute is that the
concordance can <I>display</I> the actual form that appears in the
word-attribute, even if you have <I>searched</I> on the lemma-attribute.
Whereas if you replace the word forms, you don’t get
that.<o:p></o:p></SPAN></P>
<P class=MsoNormal><I><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></I></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">Hope
this helps!<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">best<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US">Andrew.<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Verdana',sans-serif; COLOR: #1f497d; FONT-SIZE: 10pt; mso-fareast-language: EN-US"><o:p> </o:p></SPAN></P>
<DIV>
<DIV
style="BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0cm; PADDING-LEFT: 0cm; PADDING-RIGHT: 0cm; BORDER-TOP: #e1e1e1 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt">
<P class=MsoNormal><B><SPAN
style="FONT-FAMILY: 'Calibri',sans-serif; FONT-SIZE: 11pt"
lang=EN-US>From:</SPAN></B><SPAN
style="FONT-FAMILY: 'Calibri',sans-serif; FONT-SIZE: 11pt" lang=EN-US>
cwb-bounces@sslmit.unibo.it [mailto:cwb-bounces@sslmit.unibo.it] <B>On Behalf
Of </B>Ciarán Ó Duibhín<BR><B>Sent:</B> 16 March 2018 18:18<BR><B>To:</B>
cwb@sslmit.unibo.it<BR><B>Subject:</B> [CWB] Suggestion: user intervention in
constructing an index<o:p></o:p></SPAN></P></DIV></DIV>
<P class=MsoNormal><o:p> </o:p></P>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">I would like to
suggest/request a facility in CWB (or its successor) where a user can
intervene in the construction of an index.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">I envisage allowing
the user to supply a script which can receive the token, extracted from the
text and destined to be placed in an index, and can transform it.
The transformed token would be placed in the index, rather than the
original form.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">The attached
concordance output (tobar.jpg) — if attachments are allowed on the list
— was made by another program, and shows an example of why I need this
facility.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">In my example, under
the keyword "bean" are indexed/concorded several different forms, including
"bean" and "bhean" and "mbean" and "Bean", among others. As far as I am
aware, this cannot be achieved with CWB at
present.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">In my texts, "bhean"
is marked up as "b^hean", and "mbean" as "^mbean". I would like to be
able to supply a script which, in my case, would drop the character "^"
and the letter immediately following it.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">In displayed
contexts, I would need to be able to drop the character "^" but retain the
letter following it. This is what happens in the program which produced
the screenshot.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">In my case again, I
would also make my script lower-case the token, bringing "Bean" into the
family.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">It would further be
necessary to allow the script to return more than one keyword. For
example, the text might contain "seanbhean", which I encode as
"sean+b^hean". My script here would act on the character "+" and return
TWO words for the index, "sean" and "bean". Contexts would show
"seanbhean", with "^" and "+" both deleted.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">For contexts, it
might suffice (for my needs) to give CWB a list of characters to be
dropped from contexts, without going to the lengths of allowing a user script
for contexts, in addition to the script for
keywords.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">With
thanks,</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-FAMILY: 'Arial',sans-serif; FONT-SIZE: 10pt">Ciarán Ó
Duibhín.</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV></DIV>
<P>
<HR>
<P></P>_______________________________________________<BR>CWB mailing
list<BR>CWB@sslmit.unibo.it<BR>http://liste.sslmit.unibo.it/mailman/listinfo/cwb<BR></BLOCKQUOTE></BODY></HTML>