Re: Euphoria web usage analysis
- Posted by rforno at tutopia.com Feb 16, 2002
Rob: Do you know why indentation of programs written with 'ed' is screwed up when you post the file to the list?

----- Original Message -----
From: "Robert Craig" <rds at RapidEuphoria.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: Euphoria web usage analysis

>
> J. Kenneth Riviere writes:
> > Do you have such a program that you would be willing to put in the archive?
>
> It's really very specific to my needs.
> I doubt that anyone else could make much use of it,
> but it's a good example of the kind of thing
> Euphoria is good at, since it requires speed,
> but it's also something that I wanted to develop quickly
> and play around with a lot. (without having to compile, link and
> resolve machine crashes).
>
> I'm using it right now to evaluate various "pay-per-clickthrough"
> advertising sites. It tells me how many people came from
> various search keywords that I bid on, and how "interested" they were
> when they arrived, based on the number of extra pages that they
> viewed after seeing the main page. I've found significant
> differences in the "quality" of the visitors that various
> places send me, and of course differences depending on
> what the keyword is. This will influence which places I continue
> with, and how much I bid for various words.
>
> A typical line in my log file looks like (wrapped onto 5 lines here):
>
> 195.92.168.171 - - [03/feb/2002:09:59:55 -0800]
> "get /spellchk.zip http/1.1"
> 200 32768
> "http://www.programmersheaven.com/search/download.asp?fileid=14415"
> "mozilla/4.6 [en-gb]c-cck-mcd netscapeonline.co.uk (win98; i)"
>
> It shows the IP address of the visitor, the date, the file that he accessed,
> info on the success of the access, the URL the person was referred from,
> what kind of browser they are using, their o/s etc.
>
> By the way, there were 28,533 visits to the RapidEuphoria Web site
> in January, smashing the previous record.
>
> Here's the code, for what it's worth.
> Sorry about the indentation and lack of comments.
>
> -- extract stats from RapidEuphoria.com access_log
> without type_check
>
> include sort.e
>
> constant TO_LOWER = 'a' - 'A'
> function fast_lower(sequence s)
> -- Faster than the standard lower().
> -- Speed of lower() is very important for "any-case" search.
>     integer c
>
>     for i = 1 to length(s) do
>         c = s[i]
>         if c <= 'Z' then
>             if c >= 'A' then
>                 s[i] = c + TO_LOWER
>             end if
>         end if
>     end for
>     return s
> end function
>
> sequence target_list, target_count, referrer_list, referrer_count
> integer line_count, gif_count
> integer total_referrers, unknown_referrers
> sequence referrer
> sequence ip_address
> sequence cl
>
> cl = command_line()
> if length(cl) < 3 then
>     puts(2, "Usage: ex stats access_log\n")
>     abort(1)
> end if
>
> sequence special_referrer, special_target, special_words
>
> special_referrer = {
>     "freshmeat",
>     "linkexchange",
>     "directhit.com",
>     "google.com",
>     "altavista.com"
> }
>
> special_target = {
>     "?sp981", -- all Sprinks
>     "?bayf", -- Bay9 freeware
>     "?bayc", -- Bay9 C
>     "?baysh", -- Bay9 Shareware
>     "?bayso", -- Bay9 Software
>     "?bayfs", -- Bay9 Free Software
>     "?bayd", -- Bay9 DOS
>     "?gc981", -- all goCLick
>     "?fw981", -- Overture freeware
>     "?pl981", -- Overture programming language
>     "?f981", -- all FindWhat
>     "?7se" -- all 7Search
> }
>
> constant S_WORD = 1,
>          S_LIST = 2,
>          S_DUPS = 3
>
> constant L_EXTRA = 1,
>          L_IP = 2,
>          L_LINE = 3
>
> special_words = special_referrer & special_target
> for i = 1 to length(special_words) do
>     special_words[i] = {special_words[i], {}, 0}
> end for
>
> procedure visitor(sequence word)
> -- a person has entered with a special target or referrer
>     integer dups
>
>     -- ignore visualbasic from sprinks
>     -- if equal(word, "?sp981") then
>     -- if not match("basic", referrer) and not match("visual", referrer) then
>     -- if not match("cplus", referrer) then
>     -- return
>     -- end if
>     -- end if
>
>     for i = 1 to length(special_words) do
>         if equal(word, special_words[i][S_WORD]) then
>             dups = special_words[i][S_DUPS]
>             for j = 1 to length(special_words[i][S_LIST]) do
>                 if equal(ip_address, special_words[i][S_LIST][j][L_IP]) then
>                     dups += 1
>                     exit
>                 end if
>             end for
>             special_words[i][S_LIST] = prepend(special_words[i][S_LIST],
>                                                {0, ip_address, line_count})
>             special_words[i][S_DUPS] = dups
>             return
>         end if
>     end for
>     puts(2, "Couldn't find " & word & '\n')
> end procedure
>
> procedure credit(sequence ip_address)
> -- give credit for this ip_address to special target or referrer
>     sequence list, temp
>
>     for i = 1 to length(special_words) do
>         list = special_words[i][S_LIST]
>         for j = 1 to length(list) do
>             if line_count > list[j][L_LINE]+3000 then
>                 exit
>             end if
>             if equal(ip_address, list[j][L_IP]) then
>                 if line_count < list[j][L_LINE]+3000 then
>                     special_words[i][S_LIST][j][L_EXTRA] += 1
>                     special_words[i][S_LIST][j][L_LINE] = line_count
>
>                     -- move it to first position
>                     temp = special_words[i][S_LIST][j]
>                     special_words[i][S_LIST][j] = special_words[i][S_LIST][1]
>                     special_words[i][S_LIST][1] = temp
>                     exit -- allow double credit for two or more words,
>                          -- but not for the same word
>                 end if
>             end if
>         end for
>     end for
> end procedure
>
> procedure gather_stats()
> -- one pass through the access log
>     integer q, s, p, special
>     object line
>     sequence target
>     integer log_file
>
>     log_file = open(cl[3], "r")
>     if log_file = -1 then
>         puts(2, "Couldn't open " & cl[3] & '\n')
>     end if
>     target_list = {}
>     target_count = {}
>     referrer_list = {}
>     referrer_count = {}
>     line_count = 0
>     gif_count = 0
>
>     total_referrers = 0
>     unknown_referrers = 0
>
>     while 1 do
>         line = gets(log_file)
>         if atom(line) then
>             exit
>         end if
>         line_count += 1
>         line = fast_lower(line)
>
>         if match(".gif ", line) or match(".jpg ", line) then
>             gif_count += 1
>         else
>             q = find(' ', line)
>             if q then
>                 ip_address = line[1..q-1]
>             else
>                 ip_address = ""
>             end if
>
>             credit(ip_address)
>
>             q = find('"', line)
>             if q then
>                 -- target address
>                 line = line[q+1..length(line)]
>                 s = find('/', line)
>                 if s then
>                     target = "/"
>                     while 1 do
>                         s += 1
<snip>
>     if v > 0 then
>         printf(1, "Average extra pages: %.2f\n", total / v)
>     end if
>     puts(1, '\n')
> end for
> puts(2, '\n')
>
> print(2, time()-t)
>
> Regards,
> Rob Craig
> Rapid Deployment Software
> http://www.RapidEuphoria.com
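
For anyone following along, the log-line layout Rob describes (IP address first, then the request, referrer and browser each inside double quotes) can be picked apart with a few calls to find(). What follows is only a rough sketch, not Rob's code: quoted_fields() and parse_line() are hypothetical helper names made up for illustration, and it assumes every line has the same layout as his sample line.

```euphoria
-- Sketch only: quoted_fields() and parse_line() are hypothetical helpers,
-- not routines from the program quoted above.

function quoted_fields(sequence line)
-- return the text found between each pair of double quotes
    sequence fields
    integer q

    fields = {}
    while 1 do
        q = find('"', line)
        if q = 0 then
            exit
        end if
        line = line[q+1..length(line)]  -- drop everything up to the opening quote
        q = find('"', line)
        if q = 0 then
            exit                        -- unbalanced quote: ignore the tail
        end if
        fields = append(fields, line[1..q-1])
        line = line[q+1..length(line)]  -- continue after the closing quote
    end while
    return fields
end function

function parse_line(sequence line)
-- return {ip_address, request, referrer, browser}
    sequence ip, q_fields
    integer sp

    sp = find(' ', line)
    if sp then
        ip = line[1..sp-1]
    else
        ip = ""
    end if
    q_fields = quoted_fields(line)
    while length(q_fields) < 3 do
        q_fields = append(q_fields, "")  -- pad so missing fields come back empty
    end while
    return {ip, q_fields[1], q_fields[2], q_fields[3]}
end function

-- the sample line from the message, joined back onto one line
constant SAMPLE = "195.92.168.171 - - [03/feb/2002:09:59:55 -0800] " &
    "\"get /spellchk.zip http/1.1\" 200 32768 " &
    "\"http://www.programmersheaven.com/search/download.asp?fileid=14415\" " &
    "\"mozilla/4.6 [en-gb]c-cck-mcd netscapeonline.co.uk (win98; i)\""

sequence rec
rec = parse_line(SAMPLE)
printf(1, "ip:       %s\n", {rec[1]})
printf(1, "request:  %s\n", {rec[2]})
printf(1, "referrer: %s\n", {rec[3]})
printf(1, "browser:  %s\n", {rec[4]})
```

Slicing off the part of the line already scanned, as the quoted program does with line[q+1..length(line)], keeps the sketch to plain find() calls. A referrer field picked out this way could then be checked with match() against a keyword list like special_referrer.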