OpenEuphoria: Forum: [clickbait] Slowest oe/p program

1. [clickbait] Slowest oe/p program

Posted by _tom (admin) Jun 24, 2020

A shocking discovery...Python dictionary works really well!

Read about 130_000 lines whose contents look like:

... 
?QUESTION-MARK  K W EH1 S CH AH0 N M AA1 R K 
ABACUS  AE1 B AH0 K AH0 S 
...

This is the CMU Pronouncing Dictionary. You can download it from http://www.speech.cs.cmu.edu/cgi-bin/cmudict or from http://thinkpython2.com/code/c06d

The idea is to:

read the c06d file
make the key lowercase (like ABACUS --> abacus)
make the value the pronunciation (like AE1 B AH0 K AH0 S)
put this into a dictionary
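
As a quick illustration of those four steps, here is a minimal Python sketch that works on the two sample lines above instead of the real c06d file. The helper name `parse_line` is made up for this example; it is not part of any of the programs in this post.

```python
def parse_line(line):
    """Split one dictionary line into (lowercase word, pronunciation)."""
    fields = line.split()                    # splits on any run of whitespace
    return fields[0].lower(), " ".join(fields[1:])

# Two sample lines copied from the excerpt above.
sample = [
    "?QUESTION-MARK  K W EH1 S CH AH0 N M AA1 R K",
    "ABACUS  AE1 B AH0 K AH0 S",
]

pron = dict(parse_line(line) for line in sample)
print(pron["abacus"])
```

Splitting on whitespace and rejoining with single spaces also normalizes the double space between the word and its phonemes.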

Benchmarks

Python dictionary is really fast.

`tables` is by Jiri Babor (a clever use of sequences)

`map` is using OE map.e hash

`tree` is a Phix tree

Benchmarking is simplistic:

atom t = time() 
{} = system_exec( "p64 tables.ex" ) 
? time() - t
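
For anyone who wants to repeat the timing without Phix, the same idea can be sketched in Python. This is only an assumption about how one might do it, not what I actually ran: `time_command` is a made-up helper, and the command below is a placeholder so the sketch runs anywhere.

```python
import subprocess
import sys
import time

def time_command(argv):
    """Return the wall-clock seconds needed to run argv to completion."""
    start = time.perf_counter()
    # Discard the program's output; check=False ignores the exit code.
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start

# Placeholder command; swap in something like ["p64", "tables.ex"].
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.2f} s")
```

Like the Phix snippet, this measures the whole run, including interpreter start-up. Unlike it, this sketch throws the printed output away, so the numbers would not be directly comparable.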

Test machine: Linux Mint 19, 64-bit, i5

More seconds is bad.

program       run with   time (seconds)   note
python.py     python3    1.6              !
map.ex        eui        2.7
map           euc        1.9
tables.ex     eui        1.6
oe_tables     euc        1.5              *
tables.ex     p64        1.6
p64_tables    p64 -c     1.55
tree.ex       p64        6.5
tree          p64 -c     6.47             ?

be well
_tom

Python Code

"""This module contains a code example related to

Think Python, 2nd Edition 
by Allen Downey 
http://thinkpython2.com 
 
Copyright 2015 Allen Downey 
 
License: http://creativecommons.org/licenses/by/4.0/ 
""" 

 
from __future__ import print_function, division 
 
 
def read_dictionary(filename='c06d'):
    """Reads from a file and builds a dictionary that maps from
    each word to a string that describes its primary pronunciation.
 
    Secondary pronunciations are added to the dictionary with 
    a number, in parentheses, at the end of the key, so the 
    key for the second pronunciation of "abdominal" is "abdominal(2)". 
 
    filename: string 
    returns: map from string to pronunciation 
    """ 

    d = dict()
    # the with-block closes the file once reading is done
    with open(filename) as fin:
        for line in fin:

            # skip over the comments
            if line[0] == '#': continue

            t = line.split()
            word = t[0].lower()
            pron = ' '.join(t[1:])
            d[word] = pron

    return d
 
 
if __name__ == '__main__': 
    d = read_dictionary() 
    for k, v in d.items(): 
        print(k, v)

Phix Dictionary

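atom fn = open( "c06d", "r" )
sequence raw = get_text( fn, 1 )    -- the file as a sequence of lines
close( fn )

integer d = new_dict()

for i=1 to length(raw) do
    if raw[i][1] == '#' then continue end if    -- skip comment lines
    raw[i] = split(raw[i])
    -- key: lowercase word; value: phonemes rejoined with single spaces
    -- pass d explicitly, otherwise putd() stores into the default dictionary
    putd( lower(raw[i][1]), join( raw[i][2..$], ' ' ), d )
end for

function show( object key,data,user )
    printf( 1, "%s  %s \n", {key,data} )
    return 1    -- non-zero keeps the traversal going
end function

traverse_dict( routine_id("show"), 0, d )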

OE Map

include std/map.e
include std/io.e
include std/search.e
include std/sequence.e
include std/text.e

sequence raw = read_lines( "c06d" )    -- every line of the file

map d = new()

for i=1 to length(raw) do
    if raw[i][1] = '#' then continue end if    -- skip comment lines
    integer n = find( ' ', raw[i] )            -- the first space ends the word
    -- key: lowercase word; value: the rest of the line
    put( d, lower(raw[i][1..n-1]), raw[i][n+1..$] )
end for

sequence foo = pairs( d, 1 )    -- {key, value} pairs, sorted
for i=1 to length(foo) do
    printf( 1, "%s  %s\n", { foo[i][1], foo[i][2] } )
end for

Babor Table

I cheat in creating the `table` without using stables.e functions.

include stables.e 
 
ifdef PHIX then 
    atom fn = open( "c06d", "r") 
    sequence raw = get_text(fn, 1 ) 
    ? length(raw)
 
elsedef 
    include std/sequence.e 
    include std/text.e 
    include std/io.e 
    sequence raw = read_lines("c06d") 
    ? length(raw) 
end ifdef 
 
sequence d = ET 
 
sequence data={}, keys={} 
 
for i=1 to length(raw) do 
    if raw[i][1] = '#' then continue end if 
    integer n = find(' ', raw[i]) 
    keys = append(keys, lower(raw[i][1..n-1]) ) 
    data = append(data, raw[i][n+1..$] ) 
end for 
 
d = append(data, keys) 
 
for i=1 to length(d)-1 do 
    printf(1,"%s  %s \n", { d[$][i], d[i] } )
end for
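
To make the layout that this listing builds easier to picture, here is a small Python sketch of the same shape: the values come first and the list of keys is appended as the last element. It only mirrors `d = append(data, keys)` above. The `lookup` helper is hypothetical and uses a plain linear search, which is not necessarily how stables.e finds a key.

```python
# Values first, then the key list appended as the final element.
keys = ["?question-mark", "abacus"]
data = ["K W EH1 S CH AH0 N M AA1 R K", "AE1 B AH0 K AH0 S"]
table = data + [keys]          # mirrors d = append(data, keys)

def lookup(table, key):
    """Hypothetical helper: find the key's position, then index the values."""
    return table[table[-1].index(key)]

# Same traversal as the printf loop: keys live in the last element.
for i in range(len(table) - 1):
    print(table[-1][i], table[i])
```

The point of the layout is that key `i` and value `i` share an index, so no per-entry pair needs to be allocated.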

... continues on reply ...


2. Re: [clickbait] Slowest oe/p program

Posted by _tom (admin) Jun 24, 2020

J. Babor's table code is in the pastebin

https://openeuphoria.org/pastey/328.wc

or, get it from the archive
