OpenEuphoria: Forum: clarification needed: regex new()

1. clarification needed: regex new()

Posted by _tom (admin) Oct 04, 2010
1173 views

Here is a code sample that I cannot explain:

 
include std/regex.e as re 
include std/console.e 
 
object x 
 
display("This is not expected to work" ) 
x = re:find( `World`, "Hello World!" ) 
? x 
display(  error_to_string( x ) ) 
-- ERROR_NULL 
 
display("\n This is expected to work" ) 
regex r=re:new( `World` ) 
x = re:find( r, "Hello World!" ) 
? x 
 
---- 
-- given a preceding re:new(`World`) 
-- this now works 
display( "\n Why does this work?" ) 
x = re:find( `World`, "Hello World!" ) 
display( x ) 
-- { 
--     {7,11} 
-- }

So, what is going on in the internals of PCRE?

2. Re: clarification needed: regex new()

Posted by useless Oct 04, 2010
1218 views

Re:

regex r=re:new( `World` )

, why not

regex r=re:new( "World" )

? It's a sequence, no?

useless

3. Re: clarification needed: regex new()

Posted by mattlewis (admin) Oct 04, 2010
1143 views

_tom said...

Here is a code sample that I cannot explain: [snip]

So, what is going on in the internals of PCRE?

This is due to the way regular expressions are implemented. We've added a bit of metadata to doubles and sequences. The metadata can serve several purposes, but one of them is to deal with the extra data for a regular expression. PCRE allocates some memory for its own use.

I think the regex data is being attached to the literal `World` object itself rather than to a copy, as it probably should be. So when you pass `World` again after it was passed to regex:new(), it is still carrying the PCRE information. Before that final re:find() call, add:

delete( `World` )

...and the final call will stop working, because that PCRE data has been freed. In reality, we should probably always return a new copy of the sequence when you call regex:new().
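Until that changes, a minimal sketch of the safer habit is to compile the pattern once and pass the returned handle everywhere, rather than relying on the literal still carrying PCRE data. It uses only re:new() and re:find() from the sample above; the variable names and the atom() check on the result of re:new() (assuming it returns an atom on a compile error) are illustrative assumptions, not a statement of how the library must be used.

```euphoria
include std/regex.e as re

object r   -- compiled pattern handle, or an atom if compilation failed (assumption)
object x   -- match result

-- Compile once and keep the handle; do not pass the bare literal to re:find().
r = re:new(`World`)

if atom(r) then
    puts(1, "pattern did not compile\n")
else
    x = re:find(r, "Hello World!")
    ? x   -- the original post reports {{7,11}} for this match
end if
```

Passing r instead of the literal means the result does not depend on whether some earlier call happened to attach regex data to the same literal.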

Matt

4. Re: clarification needed: regex new()

Posted by jeremy (admin) Oct 05, 2010
1120 views

useless said...

Re:

regex r=re:new( `World` )

, why not

regex r=re:new( "World" )

? It's a sequence, no?

"World" and `World` are the same in 4.x. `...` is a new string delimiter. It is different in that escape characters are not evaluated in a `...` string, thus nice to use for regular expressions. For example:

sequence seq1 = "Hello\nWorld" 
sequence seq2 = `Hello\nWorld` 
 
printf(1, "seq1=%s\nseq2=%s\n", { seq1, seq2 }) 
 
-- Output: 
-- seq1=Hello 
-- World 
-- seq2=Hello\nWorld

Where this comes in handy with regular expressions is when you use character classes, modifiers, etc... For example:

-- String to match: 48594: A - This is a description of ticket 48594

sequence reg1 = `\d+:\s+[A-C]\s+-\s+\b[\w\s]+`

-- Written using "" it would have to look like:

sequence reg2 = "\\d+:\\s+[A-C]\\s+-\\s+\\b[\\w\\s]+"

There are of course many other uses, but that's why you'll see it used in regular expressions even when the regular expression does not (yet) contain any escapes. It is simply easier to extend the pattern later.
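As a quick check of the equivalence described above, the following sketch compares the two spellings of one pattern. The pattern and the variable names are made up for illustration; only the built-in equal() and length() routines are used.

```euphoria
-- The raw (backtick) form and the escaped (double-quote) form
-- spell the same sequence of characters.
sequence raw     = `\d+\s+\w+`
sequence escaped = "\\d+\\s+\\w+"

? equal(raw, escaped)   -- 1: both spellings produce the same sequence
? length(raw)           -- 9: each backslash is stored as one character
```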

Jeremy

5. Re: clarification needed: regex new()

Posted by useless Oct 05, 2010
1068 views

So i can do:

sequence bloo = 'This is 
a sentence across 
three lines' 
 
? bloo 
This is 
a sentence across 
three lines

?
useless

6. Re: clarification needed: regex new()

Posted by mattlewis (admin) Oct 05, 2010
1097 views

useless said...

So i can do:

sequence bloo = 'This is 
a sentence across 
three lines' 
 
? bloo 
This is 
a sentence across 
three lines

?

Not quite. You used a regular single quote. In order to do that, you need to use the back tick. On a US keyboard, at least, it's up next to the 1 key and shares the key with the tilde.

 
sequence bloo = `This is
a sentence across
three lines`

puts(1, bloo)
This is 
a sentence across 
three lines

Matt
