Kant in the information age

Thursday, March 25, 2004

Devilish Definition of the Semantic Web

Finally read the proper definition of the Semantic Web by Danny Ayers.

March 22, 2004
Devilish Definition
Semantic Web, proper noun

An attempt to apply the Dewey Decimal system to an orgy.

The Devil's Dictionary (2.0)

// posted by Rogier Brussee @ 8:30 AM (0) comments

Wednesday, March 17, 2004

Save the planets!

So they say that a new planet called Sedna (2003 VB12) has been discovered. Therefore all astrological computations have to be redone and life takes a completely different course, right? According to the discoverers, Mike Brown (Caltech), Chad Trujillo (Gemini Observatory) and David Rabinowitz (Yale), my life will change course, but for an unexpected reason. Sedna is not a planet but a planetoid. It is just a large object in the inner Oort cloud, almost certainly not more massive than the rest of the Oort cloud put together. Maybe it is not even the largest one in its class. However, they also argue on similar grounds that Pluto should be demoted to the status of planetoid, since it is just the largest Kuiper belt object.

Astrologists of all nations, unite! Save the planets! If this continues, what remains certain in life?

// posted by Rogier Brussee @ 1:35 AM (0) comments

Monday, March 15, 2004

Croeso: Google within companies?

In Croeso: Google within companies? Andy Boyd comments on the fact that Google does not work particularly well for the intranet, because most of the information is in databases, mailing lists and ERP systems. I am a little surprised about the mailing lists (I guess there are just no external links to the posts in the archive, and the mails do not link to each other well enough), but that the databases are overlooked is completely expected. The thing is, Google does not work very well for the web either, because most of the information on the web is in databases too. You usually do not realise this because Google does not find it for you, and on the web you know less well what ought to be found. This is called the deep web phenomenon. People estimate that most of the information on the web is actually hidden in the deep web. Some people say it's 99% of the information, but it all depends on how you measure things. The human genome accounts for lots of bytes on the web, but I do not think every gene should count as a webpage.

But Google is smart and they want to go IPO, right?

The people at Google are very smart indeed, and Google does an amazing job at what it does, but it is not clairvoyant. Google does not even have an understanding of what it reads; it just does some statistics. Now suppose you are 13 3/4 years old and want to know the production of the Norwegian leather industry. If you poke around with Google you eventually find this

http://www.euroleather.com/database.html

It is a webpage, on the internet, and it wants to be found. I could find it only by asking Google to look for leather production and database. Google found a page containing leather and a link called database. The database has a pull-down menu containing Norway. I.e. I found the page above by outguessing the system and with a little luck. But what should have happened is that

the database describes itself to the world, and particularly to Google, as follows:

1. it contains information which you can query on a geo:country in geo:Europe, and optionally on leather:tannery and the following list of goods,
2. it returns the amount of leather:leather in the unit unit:tonsPerYear,
3. it has a well-defined interface.

Google should "know", having indexed the geo ontology, that Norway is a country in Europe.

Then Google could query the database for you and you would get the answer. The possibility for an information source to describe itself (in terms more general than simply listing all its entries) is a key challenge for the semantic web. Apparently Amazon and Google do something like this already. With such a delegation model you have a much cleaner division of responsibilities.
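
To make the delegation idea a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the class names, the three-entry toy "geo ontology" and the numbers in the database are placeholders I made up for illustration, not real production figures and not anything Google or the euroleather site actually does.

```python
from dataclasses import dataclass

# Toy stand-in for an indexed geo ontology (hypothetical, three entries only).
GEO_ONTOLOGY = {
    "geo:Norway": "geo:Europe",
    "geo:Italy": "geo:Europe",
    "geo:Japan": "geo:Asia",
}


@dataclass
class SourceDescription:
    """What a source says about itself, without listing its entries."""
    query_on: str  # kind of thing you can ask about, e.g. "geo:country"
    region: str    # which countries are covered, e.g. "geo:Europe"
    returns: str   # what comes back, e.g. "leather:leather"
    unit: str      # unit of the answer, e.g. "unit:tonsPerYear"


class LeatherDatabase:
    """Hypothetical database with a self description and a query interface."""
    description = SourceDescription(
        "geo:country", "geo:Europe", "leather:leather", "unit:tonsPerYear"
    )

    def __init__(self, rows):
        self._rows = rows

    def query(self, country):
        return self._rows.get(country)


def answer(country, wanted, sources):
    """Search engine side: pick a source by its description, then delegate."""
    region = GEO_ONTOLOGY.get(country)  # "knows" that Norway is in Europe
    for source in sources:
        d = source.description
        if d.query_on == "geo:country" and d.returns == wanted and d.region == region:
            return source.query(country), d.unit
    return None


# Placeholder values, NOT real production figures.
db = LeatherDatabase({"geo:Norway": 1.0, "geo:Italy": 2.0})
result = answer("geo:Norway", "leather:leather", [db])
```

The point of the sketch is the division of labour: the search engine only matches the question against the description, and the database itself does the actual lookup.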

// posted by Rogier Brussee @ 9:40 AM (0) comments

Thirteen

Last week my girlfriend and I went to see Thirteen. It is a disconcerting movie because you see how easy it is for a sweet teenage girl to get lost in the wide wild world. Actually I also felt sorry for the mean teenage girl, who is depicted as having had every possibility to get damaged by life. Incidentally, the mean girl is played by the girl who wrote the script and who had the sweet girl's real-life experience. I remembered that while celebrating my own daughter's birthday. She is getting to be a big girl. Fortunately it was just a passing thought; we didn't let it spoil the fun.

// posted by Rogier Brussee @ 3:19 AM (0) comments

Getting an RSS feed

Lilia kindly pointed me to this site, which allows me to have an RSS feed generated from my Atom feed. Basically it is a little service that takes my Atom feed and converts it to an RSS feed, like so:

http://www.2rss.com/atom2rss.php?atom=http://www.2rss.com/blog/atom.xml

Thanks, Lilia!
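
For the curious, a conversion like this could look roughly as follows in Python. This is only a sketch of the idea, not how the 2rss service actually works. It assumes the Atom 1.0 namespace (an older Atom 0.3 feed would need a different namespace string), and it copies only title, link and summary, so the result is not a complete RSS 2.0 channel.

```python
import xml.etree.ElementTree as ET

# Assumption: Atom 1.0 namespace. Adjust for older Atom versions.
ATOM = "{http://www.w3.org/2005/Atom}"


def atom_to_rss(atom_xml: str) -> str:
    """Convert an Atom feed (as a string) to a bare-bones RSS 2.0 document."""
    feed = ET.fromstring(atom_xml)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed.findtext(ATOM + "title", default="")

    for entry in feed.findall(ATOM + "entry"):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry.findtext(ATOM + "title", default="")
        # Atom keeps the URL in the href attribute, RSS in the element text.
        link = entry.find(ATOM + "link")
        ET.SubElement(item, "link").text = link.get("href", "") if link is not None else ""
        ET.SubElement(item, "description").text = entry.findtext(ATOM + "summary", default="")

    return ET.tostring(rss, encoding="unicode")


# Made-up sample feed for illustration.
SAMPLE = (
    '<feed xmlns="http://www.w3.org/2005/Atom"><title>Blog</title>'
    '<entry><title>Hello</title><link href="http://example.org/1"/>'
    "<summary>Hi</summary></entry></feed>"
)
rss_text = atom_to_rss(SAMPLE)
```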

// posted by Rogier Brussee @ 1:04 AM (0) comments

Friday, March 12, 2004

The relative importance of things

BBC NEWS | Europe | Spain mourns train attacks dead

The shocking news from Madrid makes you realise the relative unimportance of all our elegant little arguments.

// posted by Rogier Brussee @ 1:12 AM (0) comments

Wednesday, March 10, 2004

On the classification of Weblogs

This is the result of a brainstorm session on the blogging phenomenon and on the classification of weblogs, which I had with Henk de Poot.

We distinguish three different types:

P:: the personal weblog, the personal voice of someone.
A:: an aggregation of personal weblogs,
C:: community weblogs, basically mailing lists NG, or the voice of a whole community.

Those types are roughly in ascending order of publicness.

The personal weblog is the weblog "classic". Somebody writes a diary or simply makes public what he/she is interested in.

The aggregated weblog is a more recent phenomenon, and seems to be taken up mostly by the hardcore hacker community. Aggregation at the personal level is part of the core business of the blogger: to make life easier while trying to follow what goes on in the blogosphere, he/she simply selects the RSS feeds of those bloggers he/she wants to follow. However, the mechanism can clearly be reused for first aggregating the RSS feeds of a few bloggers and then pushing the combined feed back out to the world. Crucially, writers of aggregated weblogs subscribe to the aggregation channel themselves, although there will often be a moderator who decides whether the blogger really belongs to the community or not. Thus it is an active act of choosing the community in which one wants one's voice to be heard, and it seems that the posters to the aggregated list read the list themselves and react to it.

An aggregated blog has a much “louder voice”, and it can therefore be easier (and possibly more prestigious) to be heard as an individual inside this channel. On the other hand, it is also easier to get drowned out by the rest of the crowd. I guess it is a matter of scale: if the channel is open to too many people, it loses its distinguishing feature and the signal-to-noise ratio drops too much. But my guess is that the same phenomenon is at work that accounts for competing shops crowding together in a mall, or for the synchronised flashing of certain tropical fireflies to attract females: those you want to attract go where the action is, so you have to be there yourself.

Examples: Planet RDF, Planet GNOME.
(Both are very nerdy sites, but especially on the latter people also tell about the bloggers' lives outside of their frantic coding activity.)
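
The aggregation mechanism described above is simple enough to sketch in a few lines of Python. This is only an illustration under my own assumptions: the `Entry` type, the `aggregate` function and the blogger names are made up, and the moderator's decision is reduced to a plain set of accepted members.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Entry:
    author: str
    title: str
    published: datetime


def aggregate(feeds, members, limit=10):
    """Merge the feeds of moderator-approved members, newest entries first.

    feeds:   dict mapping an author name to that author's list of entries
    members: set of authors the moderator has accepted into the community
    """
    merged = [
        entry
        for author, entries in feeds.items()
        if author in members
        for entry in entries
    ]
    merged.sort(key=lambda e: e.published, reverse=True)
    return merged[:limit]


# Hypothetical bloggers and posts.
feeds = {
    "alice": [Entry("alice", "RDF tips", datetime(2004, 3, 8))],
    "bob": [Entry("bob", "My weekend", datetime(2004, 3, 9))],
    "mallory": [Entry("mallory", "Buy now", datetime(2004, 3, 10))],
}
combined = aggregate(feeds, members={"alice", "bob"})
```

The combined list can then be pushed back out as a single feed; the `limit` is a crude way of keeping the channel from growing without bound.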

If blog entries can be (and are) given keywords (or semantic categories), then there is also the possibility to filter on keyword, and mutatis mutandis to aggregate on that basis.

Example

my Girls and Beer blog

my Political Blog

my knowledge mgt blog

my company blog

my Movies blog

Of course some blog entries can go to several channels simultaneously. I am not sure whether this works in practice, because it assumes that people tag their logs with keywords, or that reasonable automatic indexing takes place.
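
Assuming entries really are tagged, routing them to channels is straightforward. Here is a small Python sketch; the `route` function, the channel definitions and the sample entries are all made up for illustration.

```python
from collections import defaultdict


def route(entries, channels):
    """Send each entry to every channel that shares a keyword with it.

    entries:  list of (title, set of keywords)
    channels: dict mapping a channel name to the set of keywords it accepts
    """
    routed = defaultdict(list)
    for title, keywords in entries:
        for name, accepted in channels.items():
            if keywords & accepted:  # at least one keyword in common
                routed[name].append(title)
    return dict(routed)


# Hypothetical channels and entries.
channels = {
    "Movies": {"movie", "film"},
    "Knowledge mgt": {"km", "weblogs"},
}
entries = [
    ("Thirteen", {"movie"}),
    ("On the classification of Weblogs", {"weblogs", "km"}),
    ("Blogging in the movies", {"film", "weblogs"}),
    ("Untagged rant", set()),
]
by_channel = route(entries, channels)
```

Note that the untagged entry ends up nowhere, which is exactly the practical worry: the scheme is only as good as the tagging.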

// posted by Rogier Brussee @ 9:16 AM (0) comments

Thursday, March 04, 2004

Root systems

I realised two things about root and weight systems that I could have realised years ago.

The first realisation is that for a representation V of a Lie algebra g with maximal torus t, the weights of V can be identified with the support of the sheaf V^~ obtained by localising the S^* t-module V over the scheme t^* = Spec S^* t. For semisimple Lie algebras this support consists of reduced points, and the restriction of V^~ to a weight λ is just the weight space V_λ. To prove this, write

v = ∑_μ v_μ

in the weight decomposition. If t - <λ,t>·1 is the equation of a hyperplane in t^* passing through λ, then for all n we have

(t - <λ,t>)ⁿ v = ∑_μ (t - <λ,t>)ⁿ v_μ = ∑_μ (<μ,t> - <λ,t>)ⁿ v_μ.

If v has support in λ, then (t - <λ,t>)ⁿ v = 0 for n >> 0 and for every t, and we see that v_μ = 0 for all μ ≠ λ. Conversely, if v_μ = 0 for all μ ≠ λ (so v = v_λ), then already (t - <λ,t>) v = 0. Therefore v has support in λ, and the support of the sheaf V^~ is reduced there.
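
As a sanity check, here is the smallest example worked out, the standard case of sl_2. The notation h for the diagonal generator of t and V_n for the irreducible representation of dimension n+1 is my choice for this example.

```latex
\[
\mathfrak{g} = \mathfrak{sl}_2, \qquad \mathfrak{t} = \mathbb{C}h, \qquad
S^*\mathfrak{t} = \mathbb{C}[h], \qquad
\mathfrak{t}^* = \operatorname{Spec}\mathbb{C}[h] = \mathbb{A}^1 .
\]
\[
V_n = \bigoplus_{k=0}^{n} V_{n-2k}, \qquad
h\,v = (n-2k)\,v \quad \text{for } v \in V_{n-2k},
\]
\[
V_n \;\cong\; \bigoplus_{k=0}^{n} \mathbb{C}[h]/\bigl(h-(n-2k)\bigr)
\quad \text{as a } \mathbb{C}[h]\text{-module},
\qquad
\operatorname{Supp}\widetilde{V_n} = \{\, n,\ n-2,\ \dots,\ -n \,\} \subset \mathbb{A}^1 .
\]
```

So the support is the set of n+1 reduced points n, n-2, ..., -n on the affine line, and the fibre over each point is the corresponding one-dimensional weight space.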

The other realisation, which I got from Wikipedia, is that the exceptional root system G₂ is just the configuration of the Star of David.
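
This is easy to check numerically. The Python sketch below builds the 12 roots of G₂ from a short simple root α and a long simple root β (a standard choice of coordinates, picked by me for the example) and compares them with the 12 marked points of a hexagram: the 6 tips and the 6 points where the two triangles cross. The function names are mine.

```python
import math


def g2_roots():
    """The 12 roots of G2 in the plane, from simple roots alpha (short), beta (long)."""
    alpha = (1.0, 0.0)
    beta = (-1.5, math.sqrt(3) / 2)
    # Positive roots as integer combinations a*alpha + b*beta.
    coeffs = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2)]
    positive = [
        (a * alpha[0] + b * beta[0], a * alpha[1] + b * beta[1]) for a, b in coeffs
    ]
    return positive + [(-x, -y) for x, y in positive]


def hexagram_points(radius=math.sqrt(3)):
    """Tips and crossing points of a Star of David inscribed in a circle."""
    tips = [
        (radius * math.cos(math.radians(30 + 60 * k)),
         radius * math.sin(math.radians(30 + 60 * k)))
        for k in range(6)
    ]
    # The two triangles cross on a smaller hexagon of radius r / sqrt(3),
    # rotated by 30 degrees with respect to the tips.
    inner = radius / math.sqrt(3)
    crossings = [
        (inner * math.cos(math.radians(60 * k)),
         inner * math.sin(math.radians(60 * k)))
        for k in range(6)
    ]
    return tips + crossings


def same_points(a, b, tol=1e-9):
    """True if both lists contain the same points, up to rounding error."""
    return len(a) == len(b) and all(
        any(math.dist(p, q) < tol for q in b) for p in a
    )


matches = same_points(g2_roots(), hexagram_points())
```

The long roots are the tips of the star and the short roots are the crossing points, with length ratio √3.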

// posted by Rogier Brussee @ 2:23 AM (0) comments