Re: Euphoria web usage analysis
- Posted by rforno at tutopia.com
Feb 16, 2002
Rob:
Do you know why the indentation of programs written with 'ed' gets screwed up
when you post the file to the list?
----- Original Message -----
From: "Robert Craig" <rds at RapidEuphoria.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: Euphoria web usage analysis
>
> J. Kenneth Riviere writes:
> > Do you have such a program that you would be willing to put in the
> > archive?
>
> It's really very specific to my needs.
> I doubt that anyone else could make much use of it,
> but it's a good example of the kind of thing
> Euphoria is good at, since it requires speed,
> but it's also something that I wanted to develop quickly
> and play around with a lot. (without having to compile, link and
> resolve machine crashes).
>
> I'm using it right now to evaluate various "pay-per-clickthrough"
> advertising sites. It tells me how many people came from
> various search keywords that I bid on, and how "interested" they were
> when they arrived, based on the number of extra pages that they
> viewed after seeing the main page. I've found significant
> differences in the "quality" of the visitors that various
> places send me, and of course differences depending on
> what the keyword is. This will influence which places I continue
> with, and how much I bid for various words.
>
> A typical line in my log file looks like (wrapped onto 5 lines here):
>
> 195.92.168.171 - - [03/feb/2002:09:59:55 -0800]
> "get /spellchk.zip http/1.1"
> 200 32768
> "http://www.programmersheaven.com/search/download.asp?fileid=14415"
> "mozilla/4.6 [en-gb]c-cck-mcd netscapeonline.co.uk (win98; i)"
>
> It shows the IP address of the visitor, the date, the file that he accessed,
> info on the success of the access, the URL the person was referred from,
> what kind of browser they are using, their o/s etc.
>
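
For anyone who wants to try this on their own logs, here is a minimal sketch of
pulling the fields described above out of one line. It is not taken from Rob's
program: quoted_fields() and parse_log_line() are names made up for
illustration, and it assumes the request, referrer and browser are the first
three double-quoted strings on the line, as in the sample.

```euphoria
-- Sketch only: split one access-log line into a few fields.
-- quoted_fields() and parse_log_line() are hypothetical helpers,
-- not part of the original stats program.

function quoted_fields(sequence line)
-- return the text found between each pair of double quotes
    sequence fields
    integer q, e

    fields = {}
    while 1 do
        q = find('"', line)
        if q = 0 then
            exit
        end if
        line = line[q+1..length(line)]
        e = find('"', line)
        if e = 0 then
            exit -- unbalanced quote: stop rather than guess
        end if
        fields = append(fields, line[1..e-1])
        line = line[e+1..length(line)]
    end while
    return fields
end function

function parse_log_line(sequence line)
-- returns {ip, request, referrer, browser}; missing parts stay ""
    sequence result, fields
    integer sp

    result = {"", "", "", ""}
    sp = find(' ', line)
    if sp then
        result[1] = line[1..sp-1]
    end if
    fields = quoted_fields(line)
    for i = 1 to length(fields) do
        if i > 3 then
            exit -- only the first three quoted strings are used
        end if
        result[i+1] = fields[i]
    end for
    return result
end function

sequence r
r = parse_log_line("195.92.168.171 - - [03/feb/2002:09:59:55 -0800] " &
    "\"get /spellchk.zip http/1.1\" 200 32768 " &
    "\"http://www.programmersheaven.com/search/download.asp?fileid=14415\" " &
    "\"mozilla/4.6 [en-gb]c-cck-mcd netscapeonline.co.uk (win98; i)\"")
for i = 1 to length(r) do
    puts(1, r[i] & '\n')
end for
```

Run on the sample line, this should print the IP address, the request, the
referring URL and the browser string, one per line. The date and the status
fields are skipped here to keep the sketch short.
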
> By the way, there were 28,533 visits to the RapidEuphoria Web site
> in January, smashing the previous record.
>
> Here's the code, for what it's worth.
> Sorry about the indentation and lack of comments.
>
> -- extract stats from RapidEuphoria.com access_log
> without type_check
>
> include sort.e
>
> constant TO_LOWER = 'a' - 'A'
>
> function fast_lower(sequence s)
> -- Faster than the standard lower().
> -- Speed of lower() is very important for "any-case" search.
>     integer c
>
>     for i = 1 to length(s) do
>         c = s[i]
>         if c <= 'Z' then
>             if c >= 'A' then
>                 s[i] = c + TO_LOWER
>             end if
>         end if
>     end for
>     return s
> end function
>
> sequence target_list, target_count, referrer_list, referrer_count
> integer line_count, gif_count
> integer total_referrers, unknown_referrers
> sequence referrer
> sequence ip_address
> sequence cl
>
> cl = command_line()
> if length(cl) < 3 then
>     puts(2, "Usage: ex stats access_log\n")
>     abort(1)
> end if
>
> sequence special_referrer, special_target, special_words
>
> special_referrer = {
>     "freshmeat",
>     "linkexchange",
>     "directhit.com",
>     "google.com",
>     "altavista.com"
> }
>
> special_target = {
>     "?sp981", -- all Sprinks
>     "?bayf",  -- Bay9 freeware
>     "?bayc",  -- Bay9 C
>     "?baysh", -- Bay9 Shareware
>     "?bayso", -- Bay9 Software
>     "?bayfs", -- Bay9 Free Software
>     "?bayd",  -- Bay9 DOS
>     "?gc981", -- all goCLick
>     "?fw981", -- Overture freeware
>     "?pl981", -- Overture programming language
>     "?f981",  -- all FindWhat
>     "?7se"    -- all 7Search
> }
>
> constant S_WORD = 1,
> S_LIST = 2,
> S_DUPS = 3
>
> constant L_EXTRA = 1,
> L_IP = 2,
> L_LINE = 3
>
> special_words = special_referrer & special_target
> for i = 1 to length(special_words) do
>     special_words[i] = {special_words[i], {}, 0}
> end for
>
> procedure visitor(sequence word)
> -- a person has entered with a special target or referrer
>     integer dups
>
>     -- ignore visualbasic from sprinks
>     -- if equal(word, "?sp981") then
>     -- if not match("basic", referrer) and not match("visual", referrer) then
>     -- if not match("cplus", referrer) then
>     -- return
>     -- end if
>     -- end if
>
>     for i = 1 to length(special_words) do
>         if equal(word, special_words[i][S_WORD]) then
>             dups = special_words[i][S_DUPS]
>             for j = 1 to length(special_words[i][S_LIST]) do
>                 if equal(ip_address, special_words[i][S_LIST][j][L_IP]) then
>                     dups += 1
>                     exit
>                 end if
>             end for
>             special_words[i][S_LIST] = prepend(special_words[i][S_LIST],
>                                                {0, ip_address, line_count})
>             special_words[i][S_DUPS] = dups
>             return
>         end if
>     end for
>     puts(2, "Couldn't find " & word & '\n')
> end procedure
>
> procedure credit(sequence ip_address)
> -- give credit for this ip_address to special target or referrer
>     sequence list, temp
>
>     for i = 1 to length(special_words) do
>         list = special_words[i][S_LIST]
>         for j = 1 to length(list) do
>             if line_count > list[j][L_LINE]+3000 then
>                 exit
>             end if
>             if equal(ip_address, list[j][L_IP]) then
>                 if line_count < list[j][L_LINE]+3000 then
>                     special_words[i][S_LIST][j][L_EXTRA] += 1
>                     special_words[i][S_LIST][j][L_LINE] = line_count
>
>                     -- move it to first position
>                     temp = special_words[i][S_LIST][j]
>                     special_words[i][S_LIST][j] = special_words[i][S_LIST][1]
>                     special_words[i][S_LIST][1] = temp
>                     exit -- allow double credit for two or more words,
>                          -- but not for the same word
>                 end if
>             end if
>         end for
>     end for
> end procedure
>
> procedure gather_stats()
> -- one pass through the access log
>     integer q, s, p, special
>     object line
>     sequence target
>     integer log_file
>
>     log_file = open(cl[3], "r")
>     if log_file = -1 then
>         puts(2, "Couldn't open " & cl[3] & '\n')
>         abort(1) -- nothing to read, so stop here
>     end if
>     target_list = {}
>     target_count = {}
>     referrer_list = {}
>     referrer_count = {}
>     line_count = 0
>     gif_count = 0
>
>     total_referrers = 0
>     unknown_referrers = 0
>
>     while 1 do
>         line = gets(log_file)
>         if atom(line) then
>             exit
>         end if
>         line_count += 1
>         line = fast_lower(line)
>
>         if match(".gif ", line) or match(".jpg ", line) then
>             gif_count += 1
>         else
>             q = find(' ', line)
>             if q then
>                 ip_address = line[1..q-1]
>             else
>                 ip_address = ""
>             end if
>
>             credit(ip_address)
>
>             q = find('"', line)
>             if q then
>                 -- target address
>                 line = line[q+1..length(line)]
>                 s = find('/', line)
>                 if s then
>                     target = "/"
>                     while 1 do
>                         s += 1
<snip>
>     if v > 0 then
>         printf(1, "Average extra pages: %.2f\n", total / v)
>     end if
>     puts(1, '\n')
> end for
> puts(2, '\n')
>
> print(2, time()-t)
>
> Regards,
> Rob Craig
> Rapid Deployment Software
> http://www.RapidEuphoria.com
>
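
Since the part that fills in total and v was snipped, here is a rough sketch of
how an "average extra pages" figure could be computed from one word's visit
list. This is my guess, not Rob's code: average_extra() is a made-up name, and
I am assuming each entry has the {extra_pages, ip_address, last_line} shape
that visitor() prepends to S_LIST, and that v is just the number of entries.

```euphoria
-- Sketch only: average "extra pages" over one list of visits.
-- Assumes each entry is {extra_pages, ip_address, last_line};
-- average_extra() is a hypothetical helper, not from the original program.

constant L_EXTRA = 1

function average_extra(sequence list)
    atom total
    integer v

    total = 0
    v = length(list)
    for j = 1 to v do
        total += list[j][L_EXTRA]
    end for
    if v > 0 then
        return total / v
    end if
    return 0 -- no visits recorded for this word
end function

-- made-up sample data: three visits with 2, 0 and 1 extra pages
printf(1, "Average extra pages: %.2f\n",
       average_extra({{2, "1.2.3.4", 10}, {0, "5.6.7.8", 42}, {1, "9.9.9.9", 77}}))
```

With the made-up sample data this prints an average of 1.00. Whether the real
program divides by all visits or only by unique IP addresses is not visible in
the posted part.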