Phix segfault when running wildcard_match over a file of between 1.5 and 2 million lines

Hi,

I set up a program to test the execution speed of Phix.
The program should read a text file line by line and discard every line that matches either of two patterns, as well as the line that follows it.
All other lines should be kept.
The program does not write the kept lines to disk, because I didn't want the timing to include the time needed for saving.

When the text file file.txt has up to about 1.5 million lines, the program finishes successfully after a few seconds.
But if the text file is over 2 million lines, there's a segmentation fault.
The 2-million-line file is just a repetition of segments of the 1.5-million-line file, so there are no new symbols, characters or patterns in the extra 500,000 lines.
The text file contains some Unicode characters, for example: ★★ Some word ★★

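integer file_in
constant ERROR = 2 -- file number 2 is stderr
string t_file = "file.txt"
file_in = open(t_file, "r")
if file_in = -1 then
	puts(ERROR, "Could not open " & t_file)
	abort(2)
end if

-- the whole file is also loaded into memory here; txt is not used below
object txt = read_lines(t_file)
integer match0, match1
sequence buffer = {} -- the lines that are kept
object line
integer skip_next = 0, i = 0 -- i counts the lines read and is otherwise unused
while 1 do
	i += 1
	line = gets(file_in)
	if atom(line) then -- gets() returns -1 (an atom) at end of file
		exit
	end if
	if skip_next = 1 then -- the line before this one matched, so drop this one too
		skip_next = 0
		continue
	end if
	match0 = wildcard_match("*some string*", line)
	match1 = wildcard_match("*other*", line)
	if match0 + match1 = 0 then -- neither pattern matched: keep the line
		buffer = append(buffer, line)
	else -- at least one pattern matched: drop this line and the next
		skip_next = 1
	end if
end while
close(file_in)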

Pete, I can send you the text file if you want to try it yourself.
