Re: Corpora: kwic concordances with Perl

Doug Cooper (doug@th.net)
Fri, 08 Oct 1999 02:36:26 +0700

At 14:20 7/10/99 +0000, you wrote:
>a) will not detect multiple occurrences on a line,
>b) nor find complex patterns across several lines
>Can someone suggest other ways of writing simple kwic programs in Perl?

Try this:

#!/usr/bin/perl
($fileName, $string, $width) = @ARGV; # eg: kwic data "find me" 10
open (F, "$fileName") || die "Could not open file $fileName: $!. Bailing";
undef $/; # undefine the input record separator so <F> reads the whole file
$data = <F>; # snarf in the entire file
close (F);
$string =~ s/ /\\s/g; # Let a space in the query match any whitespace, so hits can span (and print) newlines
#$data =~ s/\n/ /g; # Uncomment this to match/print newlines as spaces
while ($data =~ /(.{0,$width}$string.{0,$width})/g ) { # $1 holds the match; "." stops at newlines
    print "$1\n"; # print the string with 0..width characters on either side
} # all done
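
One caveat with the loop above: the trailing .{0,$width} is consumed by each match, so a second occurrence that falls within $width characters of the first is swallowed as context and never reported on its own. Here is a rough sketch of one way around that, matching only the keyword and cutting the context out with substr. The name kwic_lines and the sample text are made up for illustration, and turning newlines into spaces on output is my own choice, not a requirement.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): return one KWIC line
# per occurrence of $pattern in $data, with up to $width characters of
# context on each side, even when occurrences sit close together.
sub kwic_lines {
    my ($data, $pattern, $width) = @_;
    my @lines;
    while ($data =~ /$pattern/g) {
        my $start = $-[0];    # offset where this match begins
        my $end   = $+[0];    # offset just past the end of this match
        my $left  = $start < $width ? 0 : $start - $width;
        # substr quietly truncates if the window runs past the end of $data
        my $line  = substr($data, $left, ($end - $left) + $width);
        $line =~ tr/\n/ /;    # assumption: show newlines as spaces
        push @lines, $line;
    }
    return @lines;
}

# Small demo on an inline sample (made-up text), width 10.
my $sample = "find me, then find me again.\nAnd find\nme once more.";
print "$_\n" for kwic_lines($sample, 'find\sme', 10);
```

Because only the keyword itself is consumed, the next search resumes right after it, and each hit gets its full left and right context regardless of where the previous hit ended. On the sample text this reports three hits, including the second "find me" that sits inside the first hit's right-hand context.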

Best,
Doug Cooper
__________________________________________________
1425 VP Tower, 21/45 Soi Chawakun
Rangnam Road, Rajthevi, Bangkok, 10400
doug@th.net (662) 246-8946 fax (662) 246-8789

Southeast Asian Software Research Center, Bangkok
http://seasrc.th.net --> SEASRC Web site
http://seasrc.th.net/sealang --> SEALANG Web site