1. [clickbait] Slowest oe/p program
- Posted by _tom (admin) Jun 24, 2020
- 1148 views
A shocking discovery...Python dictionary works really well!
Read about 130_000 lines whose contents look like:

    ...
    ?QUESTION-MARK K W EH1 S CH AH0 N M AA1 R K
    ABACUS AE1 B AH0 K AH0 S
    ...
This is the CMU Pronouncing Dictionary. You can download it from http://www.speech.cs.cmu.edu/cgi-bin/cmudict or from http://thinkpython2.com/code/c06d
The idea is to:
- read the c06d file
- make the key lowercase (like ABACUS --> abacus)
- make the value the pronunciation (like AE1 B AH0 K AH0 S)
- put this into a dictionary
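
For anyone skimming, here is a minimal Python sketch of those steps applied to just the two sample lines shown above. The helper name `parse_line` is made up for illustration; it is not part of any of the benchmarked programs.

```python
def parse_line(line):
    """Split one dictionary line into (lowercase word, pronunciation)."""
    fields = line.split()
    return fields[0].lower(), " ".join(fields[1:])

# The two sample lines quoted in this post, nothing more.
sample = [
    "?QUESTION-MARK K W EH1 S CH AH0 N M AA1 R K",
    "ABACUS AE1 B AH0 K AH0 S",
]

d = dict(parse_line(line) for line in sample)
print(d["abacus"])  # AE1 B AH0 K AH0 S
```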
Benchmarks
Python dictionary is really fast.
`tables` is by Jiri Babor (a clever use of sequences)
`map` uses the OE std/map.e hash map
`tree` is a Phix tree
Benchmarking is simplistic:

    atom t = time()
    {} = system_exec( "p64 tables.ex" )
    ? time() - t
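
The same kind of one-shot wall-clock timing can be sketched in Python. This is only an illustration of the approach, not the harness used for the numbers in this post: the function name `time_command` is hypothetical, and unlike `system_exec` above it throws away the child's output, so console printing time is not measured.

```python
import subprocess
import sys
import time

def time_command(argv):
    """Run argv once and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    # Assumption: stdout is discarded here; the original lets it print.
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start

if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere;
    # swap in something like ["p64", "tables.ex"] on a real setup.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"{elapsed:.2f} s")
```

A single run like this includes interpreter start-up and file reading, so small differences between rows should not be read too closely.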
Mint19 64bit i5
More seconds is bad.
program | run with | time (seconds) | |
---|---|---|---|
python.py | python3 | 1.6 | ! |
map.ex | eui | 2.7 | |
map | euc | 1.9 | |
tables.ex | eui | 1.6 | |
oe_tables | euc | 1.5 | * |
tables.ex | p64 | 1.6 | |
p64_tables | p64 -c | 1.55 | |
tree.ex | p64 | 6.5 | |
tree | p64 -c | 6.47 | ? |
be well
_tom
Python Code
"""This module contains a code example related to Think Python, 2nd Edition by Allen Downey http://thinkpython2.com Copyright 2015 Allen Downey License: http://creativecommons.org/licenses/by/4.0/ """ from __future__ import print_function, division def read_dictionary(filename='c06d'): """Reads from a file and builds a dictionary that maps from each word to a string that describes its primary pronunciation. Secondary pronunciations are added to the dictionary with a number, in parentheses, at the end of the key, so the key for the second pronunciation of "abdominal" is "abdominal(2)". filename: string returns: map from string to pronunciation """ d = dict() fin = open(filename) for line in fin: # skip over the comments if line[0] == '#': continue t = line.split() word = t[0].lower() pron = ' '.join(t[1:]) d[word] = pron return d if __name__ == '__main__': d = read_dictionary() for k, v in d.items(): print(k, v)
Phix Dictionary

    atom fn = open( "c06d", "r")
    sequence raw = get_text(fn, 1 )
    integer d = new_dict()

    for i=1 to length(raw) do
        if raw[i][1] == '#' then continue end if
        raw[i] = split(raw[i])
        putd( lower(raw[i][1]), join( raw[i][2..$], ' ') )
    end for

    function show( object key,data,user)
        printf(1, "%s %s \n", {key,data} )
        return 1
    end function

    traverse_dict( routine_id("show") )
OE Map

    include std/map.e
    include std/io.e
    include std/search.e
    include std/sequence.e
    include std/text.e

    atom fn = open( "c06d", "r" )
    sequence raw = read_lines(fn)

    map d = new()
    for i=1 to length(raw) do
        if raw[i][1] = '#' then continue end if
        integer n = find( ' ', raw[i] )
        put(d, lower(raw[i][1..n-1]), raw[i][n+1..$] )
    end for

    sequence foo = pairs( d, 1 )
    for i=1 to length(foo) do
        printf(1, "%s %s\n", { foo[i][1], foo[i][2] } )
    end for
Babor Table
I cheat in creating the `table` without using stables.e functions.

    include stables.e

    ifdef PHIX then
        atom fn = open( "c06d", "r")
        sequence raw = get_text(fn, 1 )
        ? length(raw)
    elsedef
        include std/sequence.e
        include std/text.e
        include std/io.e
        sequence raw = read_lines("c06d")
        ? length(raw)
    end ifdef

    sequence d = ET
    sequence data={}, keys={}

    for i=1 to length(raw) do
        if raw[i][1] = '#' then continue end if
        integer n = find(' ', raw[i])
        keys = append(keys, lower(raw[i][1..n-1]) )
        data = append(data, raw[i][n+1..$] )
    end for

    d = append(data, keys)

    for i=1 to length(d)-1 do
        printf(1,"%s %s \n", { d[$][i], d[i] } )
    end for
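
To make the layout concrete, here is a rough Python sketch of what the loop above ends up building: the data items come first and the whole list of keys sits in the last element. It mirrors only what this test program does; the names `build_table` and `items` are hypothetical, and nothing here says how stables.e itself searches or maintains a table.

```python
def build_table(pairs):
    """Return the data items followed by one final element holding all keys."""
    keys = [k for k, _ in pairs]
    data = [v for _, v in pairs]
    return data + [keys]

def items(table):
    """Yield (key, value) pairs, matching table[-1][i] with table[i]."""
    keys = table[-1]
    for i in range(len(table) - 1):
        yield keys[i], table[i]

# Only the two sample entries quoted in this post.
table = build_table([
    ("?question-mark", "K W EH1 S CH AH0 N M AA1 R K"),
    ("abacus", "AE1 B AH0 K AH0 S"),
])

for key, value in items(table):
    print(key, value)
```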
... continues on reply ...
2. Re: [clickbait] Slowest oe/p program
- Posted by _tom (admin) Jun 24, 2020
- 1072 views
Jiri Babor's table code is in the pastebin:
https://openeuphoria.org/pastey/328.wc
or, get it from the archive