1. MIT's Technology Review Article

Interesting technology article about Tim Berners-Lee. Hopefully it doesn't
require a subscription... :)

In particular, Kat, I'd be interested in your viewpoint on what the
article refers to as the Semantic Web and its use in AI.

http://www.technologyreview.com/articles/04/10/frauenfelder1004.asp?trk=nl

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


2. Re: MIT's Technology Review Article

On 27 Sep 2004, at 7:00, cklester wrote:

> 
> posted by: cklester <cklester at yahoo.com>
> 
> Interesting technology article about Tim Berners-Lee. Hopefully it doesn't
> require a subscription... :)
> 
> In particular, Kat, I'd be interested in your viewpoint on what the
> article refers to as the Semantic Web and its use in AI.
> 
> http://www.technologyreview.com/articles/04/10/frauenfelder1004.asp?trk=nl

It's no more than what some web domains have been using for years, and
nothing is, or can be, tightly standardised. It's basically the XML spec
for web design. Some sites which don't use XML will use <!-- tags --> for
blocks of "data", or they use classes linked to / defined in the CSS
file. These not-quite-semantic tags are the way to go, imho, if there
were a general agreement on using <!-- words with only one meaning -->.
Right now, I am parsing out data on some domains using these tags.
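
A minimal sketch of that kind of comment-block mining, in Euphoria (the
routine name is mine, not from any library):

    -- Pull every <!-- ... --> comment block out of a page of HTML.
    -- Returns a sequence containing the inner text of each block.
    function get_comment_blocks(sequence html)
        sequence blocks
        integer s, e
        blocks = {}
        s = match("<!--", html)
        while s > 0 do
            html = html[s+4..length(html)]  -- drop everything up to the opener
            e = match("-->", html)
            if e = 0 then
                exit  -- unterminated comment block, give up
            end if
            blocks = append(blocks, html[1..e-1])
            html = html[e+3..length(html)]
            s = match("<!--", html)
        end while
        return blocks
    end function

Each returned block can then be checked against the agreed list of
single-meaning words.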

The article has this line: "It still doesn't solve the semantic one,
though. For that, the Semantic Web first gives names to the basic
concepts involved in the data: date and time, an event, a check, a
transaction, temperature and pressure, and location. These are all
defined just to mean whatever they mean in the system which produces the
data", which leaves out the system which you just bought to understand
that data. It also leaves out other human languages, which is important
because the XML isn't machine generated now; it's hand coded, in
English. I've seen none in other human languages, so only the
USA-Brit-Aussie point of view is semantically tagged? That makes for a
pretty biased AI, CK. The same paragraph goes on to discuss ontologies
from on high, which leaves out the human tagger, I suppose? Then we need
supercomputers that don't exist yet; just ask Lenat, who has been
working on and off on Cyc and its predecessors since the early 1980s.
It has millions of (wo)man-years of semantic data and is still not off
the ground. I've got some of Cyc's files, and they are horrendous. I
cannot imagine that mess on everyone's PDA. (Not to say it's a useless
mess, but the presentation, with the repetitive nature of the tags,
disregards the computer's strong suit of being able to reconstruct the
redundant data itself. Hint: "class inheritance".)

It goes on to discuss interoperability, which I am going to assume
doesn't mean solo standard meanings for each and every word, but is
meant to be understood as "all the computers that have anything valuable
must be left on 24-7 on a broadband connection, otherwise all the rest
of the internet will crash waiting on CK to turn his laptop back on".
Distributed data works only as well as the data is valid and available
when needed. I am finding many commercial hosts drop off the internet at
least once an hour, due to split-second breaks here and there, no matter
how reliable the host is. And I did a random spot check of some data,
and found 2/3 missing, 2/3 wrong data (and no human interested in fixing
it), and 1/6 correct. (I forget the other 1/6, but it wasn't good data.)

He discusses "FOAF" files, which brings up a huge wall to climb over:
the data must be available and free from artificial constraints. Much
data isn't online or isn't indexed at all, but it would suffice to
merely point to it. Some data is in such a form on webpages that not
even Google indexes it, such as valid data buried in JavaScript code or
linked framesets. Creating a whole new XML file, and having an XML tag
on each word in every existing file, would bloat the internet to a
crawl. I've seen 5K XML semantic files that had nothing to say.
Literally. But even files which do appear online often disappear after a
month, a year, sometimes a few hours. If there were an automagic tagger
built, so as to not use a human to tag a file which has a lifetime of
mere hours, then why not dispense with tagging, move that tagger to the
recipient, and not spew XML/semantics all over the internet? Frankly, I
have decided NOT to mine some semantically XML-tagged sites, because of
the page bloat.
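
A recipient-side tagger could be as small as this (a hedged sketch in
Euphoria; the word list and concept names are invented for
illustration): the page travels untagged, and a local table of
single-meaning words does the tagging on arrival.

    -- Tag fetched text locally instead of shipping XML with the page.
    constant WORDS = {"temperature", "pressure", "date"},
             TAGS  = {"measurement", "measurement", "when"}

    function tag_locally(sequence text)
        sequence tagged
        integer p
        tagged = {}
        for i = 1 to length(WORDS) do
            p = match(WORDS[i], text)
            if p then
                -- record {word, concept, position}, built on our side
                tagged = append(tagged, {WORDS[i], TAGS[i], p})
            end if
        end for
        return tagged
    end function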

And, if we disregard the code to process and "understand" the data: I
have occasionally presented semantically/syntactically tagged data here,
and the means to retrieve it was/is in strtok.e. I am looking forward to
seeing if Eu v2.5 spawns a fast string execution function or procedure,
because "understanding" in mIRC is just abysmally slow.

Kat


3. Re: MIT's Technology Review Article

Kat wrote:
> 
> On 27 Sep 2004, at 7:00, cklester wrote:
> > Kat, I'd be interested in your viewpoint on what the
> > article refers to as the Semantic Web and its use in AI.
> > http://www.technologyreview.com/articles/04/10/frauenfelder1004.asp?trk=nl
> 
> That makes for a pretty biased AI...

That was somewhat my thought, as well. There are hundreds of languages
on the planet, many of which are represented on the web. How will this
Semantic Web be able to handle the translations?

> Some data is in such a form on webpages that not even Google indexes
> it, such as valid data buried in JavaScript code or linked framesets.
> Creating a whole new XML file, and having an XML tag on each word in
> every existing file, would bloat the internet to a crawl. I've seen
> 5K XML semantic files that had nothing to say. Literally. But even
> files which do appear online often disappear after a month, a year,
> sometimes a few hours. If there were an automagic tagger built, so as
> to not use a human to tag a file which has a lifetime of mere hours,
> then why not dispense with tagging, move that tagger to the recipient,
> and not spew XML/semantics all over the internet?

Not quite the efficiency needed to make "the internet a database." :/

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


4. Re: MIT's Technology Review Article

> On 27 Sep 2004, at 7:00, cklester wrote:
> 
> > posted by: cklester <cklester at yahoo.com>
> > 
> > Interesting technology article about Tim Berners-Lee. Hopefully it doesn't
> > require a subscription... :)
> > 
> > In particular, Kat, I'd be interested in your viewpoint on what the
> > article refers to as the Semantic Web and its use in AI.
> > 
> > http://www.technologyreview.com/articles/04/10/frauenfelder1004.asp?trk=nl

No one else here had a comment?

Kat

