OpenEuphoria: Forum: strings, mirc-Eu

1. strings, mirc-Eu

Posted by Kat <KSMiTH at PELL.NET> Dec 28, 1999

Ok, i sent a zip to RDS, included are:

1) the mirc script to talk to Eu
2) the strtok library for Eu (formerly named mirctok, renamed for copyright
reasons)
3) String_9 , a backwards step from String.e
4) robobot3 , where you handle protocols; it is what handles Srvsckip
5) Srvsckip , which handles
6) Win32lib
7) EuBot_5 an example in Eu
8) a readme file

1) The mirc script runs only in mIRC, it's not Eu code, and it's easy enough
to learn and modify to do what you need. I have trapped and sent to Eu
almost everything that can happen on irc, if it's useful for irc bots. It
shows advanced mirc scripting and what can be done to connect mirc to other
servers besides irc. My first http server and browser i wrote in mirc, it
can also do email and telnet, after all, it's just code...

2) This is the asked-for strings library, aka mirctoks . It lets you relate
to sequences at a "higher" level, to the words of strings, and to what
separates the words if you like. It's easier, to me, to deal with strings as
collections of words, rather than bytes in a sequence, altho the sequences
used in strtok are real Eu sequences, so you can still byte them. Gabriel
had the foresight to make parse(), which is now global. I added getxml() for
parsing <tags>..</tags> in databases, you can easily change it to use
[tags]..[/tags] if you wish. Lucius is prolly still working on speeding this
up,, not sure. I may add some more xml processing code any day now. With
strtok you can count words, specify words, locate them, delete them, change
them, add them, retrieve specific words, etc.
Email me suggestions.
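
For anyone who hasn't unpacked the zip yet, here is a rough sketch of the "strings as words" idea. It is written in Python rather than Eu, and every function name in it (words, get_word, find_word, delete_word, replace_word, get_tagged) is made up for illustration; these are not strtok's actual routine names or signatures. It assumes 1-based word positions and a single separator character.

```python
def words(text, sep=" "):
    """Split text into its non-empty words on one separator character."""
    return [w for w in text.split(sep) if w]


def count_words(text, sep=" "):
    """Number of words in text."""
    return len(words(text, sep))


def get_word(text, n, sep=" "):
    """Return word number n (counting from 1), or "" if n is out of range."""
    ws = words(text, sep)
    return ws[n - 1] if 1 <= n <= len(ws) else ""


def find_word(text, word, sep=" "):
    """Return the 1-based position of word, or 0 if it is not present."""
    ws = words(text, sep)
    return ws.index(word) + 1 if word in ws else 0


def delete_word(text, n, sep=" "):
    """Return text with word number n removed (unchanged if out of range)."""
    ws = words(text, sep)
    if 1 <= n <= len(ws):
        del ws[n - 1]
    return sep.join(ws)


def replace_word(text, n, new, sep=" "):
    """Return text with word number n replaced by new."""
    ws = words(text, sep)
    if 1 <= n <= len(ws):
        ws[n - 1] = new
    return sep.join(ws)


def get_tagged(text, tag):
    """Return what sits between the first <tag> and the next </tag>, else ""."""
    open_tag, close_tag = "<" + tag + ">", "</" + tag + ">"
    start = text.find(open_tag)
    if start < 0:
        return ""
    start += len(open_tag)
    end = text.find(close_tag, start)
    return text[start:end] if end >= 0 else ""
```

Switching get_tagged to [tags]..[/tags] would only mean changing the two delimiter strings it builds.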

3) More string routines, required for useful bot workings, most notable are
strippunct() and strip(), removing punctuation and excess spaces and color
codes,, again, for dealing with words in a sentence. In NLP it is often best
to have punctuation removed initially, and no one uses any part of English
(or other languages) properly on irc anyhow...
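
A minimal sketch of that kind of cleanup, in Python rather than Eu. The function names are hypothetical and this is not how strippunct() or strip() are actually written; it only assumes the usual mIRC control characters (\x03 for colour, \x02 bold, \x1f underline, \x16 reverse, \x0f reset) and ASCII punctuation.

```python
import re
import string

# \x03 starts a colour code: optional foreground, optional ",background",
# one or two digits each.
_COLOR = re.compile(r"\x03(\d{1,2}(,\d{1,2})?)?")
# Bold, reset, reverse and underline toggles carry no arguments.
_CONTROLS = re.compile(r"[\x02\x0f\x16\x1f]")
# Translation table that deletes every ASCII punctuation character.
_PUNCT = str.maketrans("", "", string.punctuation)


def strip_codes(text):
    """Remove mIRC colour codes and formatting toggles."""
    return _CONTROLS.sub("", _COLOR.sub("", text))


def strip_punct(text):
    """Remove ASCII punctuation, leaving letters, digits and spaces."""
    return text.translate(_PUNCT)


def squeeze_spaces(text):
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(text.split())


def clean(text):
    """Codes first, then punctuation, then excess spaces."""
    return squeeze_spaces(strip_punct(strip_codes(text)))
```

The order matters: colour codes contain digits and commas, so they have to go before punctuation is removed or stray digits get left behind in the words.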

4) Robobot is the place to do different protocols, this version handles what
we set up for EuBot.ew or Bot.e files and the mirc code included in the zip
file. It handles the init, handshaking, sock open/close/block, and processes
the header for the mirc code, you'll change this if you do an http
interface, or telnet, email, etc.,, otherwise it's a stable version,, right,
Greg?

5) an enhanced Srvsckip, but i didn't look to see what was enhanced, all i
know is that it works now, and faster. Greg may be still working and editing
it.

6) Win32lib ,, what else can one say? I included this because i know the
other files in the zip work with this edition. /me bows to David Cuny.

7) an example of what can be done to the data coming from mirc, and how mirc
can tell Eu what to do, and how Eu can tell mirc what to do. We did this to
connect mirc to Eu and build frames to do other things.

8) the readme,, just some setup stuff from me, but if you don't read it,
don't blame me if it doesn't run!

One point that can be made is that Eu doesn't know its connection is to
mIRC, it could be another Eu or Pirch or Perl (barf) or C code. And same for
mIRC, it could be connected to another mirc, etc. Also, i/we all used
127.0.0.1/localhost as the ip to connect to, so the code as it is will look
for handshakes on only your machine, to change it to another puter on an
intra/internet, change the ip.
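
To make the "neither end knows what the other end is" point concrete, here is a small Python sketch of a loopback handshake. The greeting and reply bytes are invented for the example and are not the header format robobot or the mirc script use; HOST is the one thing you would change to reach another machine.

```python
import socket
import threading

HOST = "127.0.0.1"  # loopback only; put another machine's ip here


def serve_once(listener):
    """Accept one peer, read its greeting, answer it, then hang up."""
    conn, _addr = listener.accept()
    with conn:
        greeting = conn.recv(1024)
        conn.sendall(b"ACK " + greeting)


def handshake(host, port, greeting=b"HELLO"):
    """Connect, send a greeting, and return whatever the peer answers."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(greeting)
        return sock.recv(1024)


def demo():
    """Run both ends in one process and return the reply the client saw."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)      # do not wait forever if nobody connects
    listener.bind((HOST, 0))    # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return handshake(HOST, port)
    finally:
        worker.join()
        listener.close()
```

Nothing in handshake() depends on what is listening, which is why the other end can just as well be mIRC, another Eu program, or anything else that speaks the same bytes.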

Enjoy.

Kat


2. Re: strings, mirc-Eu

Posted by "Cuny, David at DSS" <David.Cuny at DSS.CA.GOV> Dec 28, 1999

I've started some work on perl-like pattern matching. I don't know if I'll
have it done in a timely manner, so here's the basic rundown of how it works
(in theory), in case anyone finds it interesting enough to continue with.
The basic pattern matcher has been written, so I know the approach is
feasible.

Here's a simple basic perl pattern:

   [a-z]

This says that the character needs to match once (and only once) a character
in the range of "a" through "z". That gets coded into a Euphoria sequence
like so:

   { IS, 1, 1, "az" }

The first argument specifies that the range must match (as opposed to NOT
matching). Arguments 2-3 specify the number of times this match is to take
place. The final argument is the range string. The pattern:

   [a-zA-Z_]

specifies that the character must be within a-z or A-Z or '_'. That gets
coded as:

   { IS, 1, 1, "azAZ__" }

The modifiers ? (zero or one), * (zero or more), + (one or more) and
{min,max} (at least min, but no more than max) get coded as:

   { .. 0, 1, .. }     -- ? in perl; zero or one
   { .. 0, MANY, .. }  -- * in perl; zero or more
   { .. 1, MANY, .. }  -- + in perl; one or more
   { .. min, max, .. } -- {min,max} in perl; at least min, no more than max

Incidentally, the constant MANY is merely -1. As an example, here's the
specification for a string of alphanumeric characters:

   [a-zA-Z0-9_]+

It gets coded as:

   { IS, 1, -1, "azAZ09__" }
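
As a rough illustration of the encoding so far, here is a Python sketch (not Euphoria, and not the actual parser) that turns one bracketed class plus an optional modifier into that four-element form. The function name encode_class is hypothetical, and it assumes every character in the range string is written as a low/high pair, so a lone '_' becomes "__". The [^...] case mapping to IS_NOT is likewise an assumption.

```python
import re

IS, IS_NOT = "IS", "IS_NOT"
MANY = -1  # same convention as above: -1 stands for "no upper limit"

# One [...] class, optionally negated, optionally followed by ? * + or {m,n}.
_CLASS = re.compile(r"\[(\^?)([^\]]+)\](\?|\*|\+|\{(\d+),(\d+)\})?$")

# Repeat counts for the single-character modifiers; None means no modifier.
_COUNTS = {None: (1, 1), "?": (0, 1), "*": (0, MANY), "+": (1, MANY)}


def encode_class(pattern):
    """Return [IS or IS_NOT, min, max, range_string] for one class pattern."""
    m = _CLASS.match(pattern)
    if m is None:
        raise ValueError("not a single character class: %r" % (pattern,))
    negated, body, quant, low, high = m.groups()

    # Flatten "a-z" into "az" and a lone character c into the pair "cc".
    ranges = ""
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            ranges += body[i] + body[i + 2]
            i += 3
        else:
            ranges += body[i] * 2
            i += 1

    if low is not None:
        lo, hi = int(low), int(high)    # explicit {min,max}
    else:
        lo, hi = _COUNTS[quant]
    return [IS_NOT if negated else IS, lo, hi, ranges]
```

For example, encode_class("[a-zA-Z0-9_]+") gives the same four values as the coding shown just above. Escapes such as \d or \w and anything outside a single class are left out on purpose.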

There are only a couple more operators needed to complete the code. The
STRING operator matches against a given string:

   { STRING, <string> }


AND and OR allow tests to be strung together. A simple example:

   fee|fie|foe

gets coded:

   { OR,
      { STRING, "fee" },
      { STRING, "fie" },
      { STRING, "foe" }
   }

More practically, here's a specification for a Euphoria word. It begins with
an alpha character, and is optionally followed by any number of alpha,
numeric, or underscore characters:

   [a-zA-Z][a-zA-Z0-9_]*

The coding is:

   { AND,
      { IS, 1, 1, "azAZ" },
      { IS, 0, MANY, "azAZ09__" }
   }
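
To show how such a structure could be walked, here is a minimal matcher sketch in Python (lists standing in for Euphoria sequences). It is not the matcher described in this post: the function names are invented, it is greedy with no backtracking, and it assumes the range string is always low/high pairs.

```python
IS, IS_NOT, STRING, AND, OR = "IS", "IS_NOT", "STRING", "AND", "OR"
MANY = -1  # no upper limit on repeats


def in_ranges(ch, ranges):
    """True if ch falls inside any low/high pair of the range string."""
    return any(ranges[i] <= ch <= ranges[i + 1]
               for i in range(0, len(ranges) - 1, 2))


def match(node, text, pos=0):
    """Return the position just past the match, or None if node fails at pos."""
    op = node[0]
    if op == STRING:
        literal = node[1]
        return pos + len(literal) if text.startswith(literal, pos) else None
    if op in (IS, IS_NOT):
        _, low, high, ranges = node
        want = (op == IS)
        count = 0
        # Greedily take characters while they qualify and the maximum allows.
        while (pos + count < len(text)
               and (high == MANY or count < high)
               and in_ranges(text[pos + count], ranges) == want):
            count += 1
        return pos + count if count >= low else None
    if op == AND:
        # Every child must match, each starting where the previous one ended.
        for child in node[1:]:
            pos = match(child, text, pos)
            if pos is None:
                return None
        return pos
    if op == OR:
        # The first child that matches at pos wins.
        for child in node[1:]:
            end = match(child, text, pos)
            if end is not None:
                return end
        return None
    raise ValueError("unknown operator: %r" % (op,))
```

With the word specification above written as a nested list, match(word, "foo_1 bar") stops just past "foo_1". Because nothing backtracks, a greedy element followed by one that needs the same characters will fail where Perl would succeed.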

I haven't forgotten about the parenthesis operators; I just haven't coded
them up yet. Externally, they simply get coded as {OPEN} and {CLOSE}. For
example:

   (\d+) (\d+)

would be coded as:

   { AND,
      { OPEN },
      { IS, 1, MANY, "09" },
      { CLOSE },
      { OPEN },
      { IS, 1, MANY, "09" },
      { CLOSE }
   }

Since I haven't actually coded the parenthesis handlers (although I *think*
I know how to do it), I won't pretend to explain them.

Here are some small 'real life' examples (parenthesis coding omitted):

Perl expression:

   /^Subject: (.*)/

Euphoria coding:

   { AND,
      { STRING, "Subject: " },
      { IS_NOT, 0, MANY, "\n\n" }
   }

Perl expression:

   /^Date (\d+) (\w+) (\d+) (\d+): (\d+) (.*)$/

Euphoria coding:

   { AND,
      { STRING, "Date:" },
      { IS, 1, MANY, "09" },
      { IS, 1, MANY, "azAZ09__" },
      { IS, 1, MANY, "09" },
      { IS, 1, MANY, "09" },
      { STRING, ":" },
      { IS, 1, MANY, "09" },
      { IS_NOT, 0, MANY, "\n\n" }
   }

I think this shows that an implementation of perl-like pattern matching in
Euphoria is certainly feasible. I've already coded up a simple pattern
matcher, *without* parenthesis assignment (I'm working on it!). Writing a
parser to convert Perl patterns into code is a bit more difficult.

Is anyone interested in this sort of thing?

Thanks.

-- David Cuny


3. Re: strings, mirc-Eu

Posted by Caballero Rojo <pampeano at ROCKETMAIL.COM> Dec 29, 1999

Hello David,
      Although I don't know Perl, I always wanted to learn it. I think
      that if you can make that work, you can really help me and
      others who want to learn Perl and already know Eu4.
      Well, if it doesn't take too much time, I think you should keep
      working on it.
      That's all.
      Good luck

--
Best regards,
 Caballero Rojo                            mailto:pampeano at rocketmail.com


Tuesday, December 28, 1999, 7:00:39 PM, you wrote:

CDD> I've started some work on perl-like pattern matching. I don't know if I'll
CDD> have it done in a timely manner, so here's the basic rundown of how it works
CDD> (in theory), in case anyone finds it interesting enough to continue with.
CDD> The basic pattern matcher has been written, so I know the approach is
CDD> feasible.

CDD> <-- CUTTED MAIL HERE -->

CDD> I think this shows that an implementation of perl-like pattern matching in
CDD> Euphoria is certainly feasible. I've already coded up a simple pattern
CDD> matcher, *without* parenthesis assignment (I'm working on it!). Writing a
CDD> parser to convert Perl patterns into code is a bit more difficult.

CDD> Is anyone interested in this sort of thing?

CDD> Thanks.

CDD> -- David Cuny



