Wednesday 22 February 2006

Connectedness - 3

This is the sixth in a series on "Information, language, knowledge and connectedness". The previous posts are part 1, part 2, part 3, part 4 and part 5.

We manipulate objects (information, concepts, goods etc) via handles. If you are managing a retail shop of fashion wear, you don't move all the clothes daily. You look at your computer records. Each item in your store is represented by an item number in your database. (Yes, occasionally, you do try to match the real thing with your computer record - that's called stocktaking, I think.)

When we are managing information, e.g. the paper submitted by a student, we give that paper a code ("John's essay"). After we mark the paper, it is given a score. This score is linked to John in our record book. At the end of the year, all these scores are collated to give an overall score for John. This is again a sort of handle, representing a much larger collection of information and the teacher's opinion of John's performance.

Librarians used to handle a large collection of information (books, we used to call them). Every book is given a unique identifier. Multiple copies of the same book are each given an item identifier too.

The same applies to information when it is digitised. Information is given a URI.

End of story? Wrong! It is the beginning actually.

In order for us to find information, we try to collect some of the characteristics of the information. We found that some common characteristics are useful, e.g. the author, the date of publication, the subject domain the information is about. So librarians added "metadata" to the information.

But some may call the author the "writer", while others call the same role the "creator". So we need to create a common handle for similar concepts (similar internal world views of different people). So we have metadata standards which standardise
1. the handle (the name of the characteristic),
2. the meaning of the handle,
3. the way the value is to be expressed (first name first or last? how to handle middle names? etc.)
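
To make the three points concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the alias table, the choice of "creator" as the common handle, and the "Lastname, Firstname" convention are hypothetical and not taken from any particular metadata standard.

```python
# Hypothetical alias table: different local names mapped to one common handle.
FIELD_ALIASES = {
    "author": "creator",
    "writer": "creator",
    "creator": "creator",
    "date of publication": "date",
    "published": "date",
    "date": "date",
    "subject": "subject",
    "topic": "subject",
}


def normalise_name(name):
    """Express a personal name as 'Lastname, Firstname Middle' (an assumed convention)."""
    parts = name.strip().split()
    # Leave the value alone if it already looks normalised or is a single word.
    if "," in name or len(parts) < 2:
        return name.strip()
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


def normalise_record(record):
    """Rewrite a record so it uses the common handles and value format."""
    out = {}
    for key, value in record.items():
        handle = FIELD_ALIASES.get(key.strip().lower())
        if handle is None:
            continue  # unknown field: simply dropped in this sketch
        if handle == "creator":
            value = normalise_name(value)
        out[handle] = value
    return out


record = normalise_record({"Writer": "John Q Smith", "Topic": "history"})
```

Two records that started out with "Writer" and "Author" both end up with a "creator" field, so they can be searched the same way. A real standard would also have to say what to do with fields it does not recognise, which this sketch simply drops.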

Soon we found that there are variations of "similar" concepts. The metadata needed "qualifiers" both to extend the grey area of "similarity" and to "overload" the handle.

"Metadata" are data themselves. So we can apply the same principle on metadata which leads to "metametadata". Metametadata are data themselves. So we can apply the same principle on metametadata which leads to "metametametadata".... [see Meta Meta Meta Data Draft 0.2]

Metadata is NOT the only way to find information. Google has demonstrated another way - using an inverted index of the words within the information. Tagging (folksonomy) is the current trend (see also my view).
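
For readers who have not met an inverted index, here is a toy sketch in Python. It only illustrates the general idea (each word points to the documents that contain it); it is certainly not how Google actually builds or ranks its index, and the function names and sample documents are made up for the example.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample documents, keyed by a handle (the document id).
docs = {
    "a": "Metadata about books",
    "b": "Books and tags",
    "c": "Tags only",
}
idx = build_inverted_index(docs)
hits = search(idx, "books tags")
```

Notice that nobody had to describe the documents first: the words inside the information act as the handles.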

Information is intrinsically connected. How can we exploit this to make the search better?

The metadata world would use citations - the explicit connectedness provided by the author of the information (example). Google would use "key word" clustering. Folksonomy uses the "tag cloud".

How do these techniques compare with the "connectedness" of an expert in the subject domain?

If by "connectedness" we mean connectedness as implemented in the above examples, are we approaching a society that can learn?
