Monday, October 10, 2005

Review Assignment on New Search Engines

     Phil Bradley (2004), in one of his regular columns in ARIADNE, takes a look at new search engines to see whether they are comparable to the current biggest and best, including Google.  In his article, “Search Engines:  A Mixed Bag:  A review of some new search engines”, Bradley reviews four new search engines to see how useful they are at providing helpful information, rather than simply listing them as the usual commercial sites.  These sites are Euroclips:  The definitive European directory, YouSearched:  The accessible Web search, Ujiko, and A9.  The results were split down the middle, with two sites getting a “thumbs up” and the other two being “very disappointing”.

     Ujiko is based on the Yahoo search technology and gives quick access to over 4 billion web pages.  It is described as an “unusual looking” search engine, but the test results are quite impressive, particularly the level of “personalization” that it offers.  Personalization is a hot topic for the future of search engines, and Ujiko has taken an early and interesting step down this road.  However, one question remains unanswered for the author:  why does this search engine have such an unusual name?

     A9 is another good search engine, from a specially branded and operated subsidiary of amazon.com.  It provides good and quick access to data and, like Ujiko, has interesting features that allow for personalization.  This site opened in Palo Alto, California in October 2003 and offers Web search results powered by Google, as well as access to excerpts from books (but you have to be registered with Amazon).  Another good feature is that the search history can be stored on the server and can therefore be accessed by the searcher from any computer.  Because A9 can check the content of books as well as web pages, and keeps the search history available from any computer, it is a good choice for people who move from computer to computer a lot.

     Euroclips:  The definitive European directory looks much like an older style of “portal” service.  The home page is very busy, with many different types of information (e.g., business news, travel, and European holidays).  The results of the searches are very commercial in nature and not very helpful.  This site is dedicated to commerce:  it returns very few useful and informative sites and is good mainly for finding a product to purchase.

     YouSearched:  The accessible Web search has been specifically designed for people with visual impairments and is an approved site from RNIB and Bobby.  The site is poorly designed, with images that one would expect to find in a “primary school”.  The point that comes to mind is that having a visual impairment does not mean that people should be treated like children and provided with second-rate search engines.

     After reading this article, I read another article that I would like to compare with it.  In a recent PC Magazine article titled “Forward thinking”, Michael J. Miller (2005) writes about new search technology.  Miller notes that search technology has come a long way, but he would like to see more improvement and progress.  Even though the search engines have indexed a tremendous number of documents (Yahoo 19.2 billion and Google 8.1 billion), the question remains the same:  how accurate, complete, and current are these Web sites?  Miller would also like search engines to deliver more “local content” and to let users rate the sites they visit, rather than having computers themselves determine which sites are popular.  This concept, called “community search”, allows users to comment and share information on the usefulness of the sites they find.  However, Miller goes on to say that most of the sites he reviewed (including Yahoo’s My Web, del.icio.us, Shadows, Clipmarks, and Jeteye) had yet to achieve the critical mass of information that would make any of them “my primary research tool”.

     So it would appear that both Bradley and Miller agree that, while many of these new sites are inventive and provide interesting information on popular topics, they may not be able to gain enough users to stand up to the big search engines and so may eventually be swallowed up.  Miller is also concerned that the new community search engines may not have enough participants for their collections to be complete.  The other problem with these community search engines is that spammers and people with political agendas could bias the content.  I agree with both Bradley and Miller that we will just have to wait and see how useful these new search engines really become.
Reference:

Bradley, Phil (2004).  Search Engines:  A Mixed Bag:  A review of some new search
     engines.  ARIADNE, July (40).  Retrieved October 4, 2005, from
     http://www.ariadne.ac.uk/issue40/search-engines/intro.html.

Miller, Michael J. (2005).  Forward thinking.  PC Magazine, 24(17), 7-8.

Monday, October 03, 2005

A Digital Object Identification System

Introduction
A Digital Object Identifier (DOI) is a unique name (not a location) for an entity on digital networks and it provides a system for identification and exchange of this information (International DOI Foundation (IDF), 2004).

DOIs assigned in one context may be re-used in another place (or time) without consulting the assigner. As the services provided by DOIs are outside the direct control of the assigner, they must be designed to be interoperable, persistent and extensible. For example, in a web context a DOI may be used in an http form as a URL (through a proxy server). Hence a DOI is designed as a generic framework applicable to any digital object, providing a structured, extensible means of identification, description and resolution.

The DOI system was built using several existing standards-based components, which have been brought together and further developed by the International DOI Foundation (IDF). The IDF is a cross-industry, cross-sector, not-for-profit organisation, which was founded in 1998. Membership in the IDF is open to all organizations with an interest in electronic publishing and related enabling technologies.

The DOI has recently been accepted for standardisation (ISO TC46/SC9) and is currently in widespread use for scientific publishing and in government documents. New applications, which demonstrate more added value and enhanced functionality, are generating strong interest from the music recording and other related publishing industries (many of which have led the way in developing it).

DOI System Components
The DOI system is made up of the following components:
  • A specified standard numbering syntax;

  • A resolution service (based on an existing Handle System);

  • A data model incorporating a data dictionary (based on the indecs Data Dictionary); and

  • An implementation mechanism of policies and procedures for the governance and application of DOIs.

DOI Syntax
The DOI syntax is a standard for constructing an opaque string with naming authority and delegation (NISO Z39.84, DOI Syntax) and provides a "container" which can accommodate any existing identifier. For example:

10.1234/NP5678
10.5678/ISBN-0-7645-4889-4 and
10.2224/2004-10-ISO-DOI
are all valid DOI syntax.

The DOI has two components, the prefix and the suffix, which together form the DOI. The portion following the "/" character (the DOI Suffix) may be an existing identifier (e.g., an ISBN or bar code). The portion preceding the "/" character (the DOI Prefix) denotes a unique naming authority that is assigned to an organization that wishes to register DOIs. This combination of a unique prefix and suffix avoids the need for the centralised allocation of DOI numbers.
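
The prefix/suffix split described above can be sketched in a few lines of Python. This is only an illustration, not part of the DOI system itself: the function name split_doi is hypothetical, and the sketch assumes that the first "/" is the separator, so that any later "/" characters stay inside the suffix.

```python
def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first "/" character.

    Assumption: only the first "/" separates prefix from suffix, so an
    existing identifier that itself contains "/" stays whole in the suffix.
    """
    prefix, separator, suffix = doi.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix DOI string: {doi!r}")
    return prefix, suffix


# The three example DOIs from the text above.
for doi in ("10.1234/NP5678",
            "10.5678/ISBN-0-7645-4889-4",
            "10.2224/2004-10-ISO-DOI"):
    prefix, suffix = split_doi(doi)
    print(f"prefix={prefix}  suffix={suffix}")
```

Because the suffix is treated as an opaque string, the sketch does not try to interpret it (for example, it does not check whether it is a valid ISBN).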

DOI Resolution
Resolution is the process in which a DOI request is input to a network service to provide a specific output of one or more pieces of current information related to the identified entity (such as a URL where the object can be found). Resolution provides a level of managed indirection between an identifier and the output and can be summarised as shown in the following diagram (International DOI Foundation (IDF), 2005).

http://www.doi.org/doi_presentations/resolution/doi_resolution_feb05.jpg
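
The idea of managed indirection can also be sketched in Python. This is a toy model under stated assumptions, not how the Handle System is implemented: the Resolver class, its in-memory dictionary, and the example.org addresses are all hypothetical, and the proxy address https://doi.org/ is an assumption that does not come from the text above.

```python
from urllib.parse import quote

# Assumed public proxy address; it is not named in the text above.
PROXY = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Write a DOI in http form as a URL that goes through the proxy."""
    return PROXY + quote(doi, safe="/")


class Resolver:
    """Hypothetical in-memory stand-in for a resolution service."""

    def __init__(self) -> None:
        self._records: dict[str, str] = {}

    def register(self, doi: str, url: str) -> None:
        # Registering again overwrites the record; the DOI is unchanged.
        self._records[doi] = url

    def resolve(self, doi: str) -> str:
        # Return the current information held for this DOI.
        return self._records[doi]


resolver = Resolver()
resolver.register("10.1234/NP5678", "http://example.org/old/np5678")
print(resolver.resolve("10.1234/NP5678"))

# The object moves: only the record changes, the identifier stays the same.
resolver.register("10.1234/NP5678", "http://example.org/new/np5678")
print(resolver.resolve("10.1234/NP5678"))
print(doi_to_url("10.1234/NP5678"))
```

The point of the sketch is that anyone who holds the DOI never needs to know that the object moved; only the record behind the resolver is updated.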

DOI Data Model
The DOI data model consists of a data dictionary and a framework for applying it. Together these provide tools for defining what a DOI specifies (through use of a data dictionary) and how DOIs relate to each other (through a grouping mechanism, Application Profiles, which associate DOIs with defined common properties). This provides semantic interoperability, enabling information that originates in one context to be used in another in ways that are as highly automated as possible.

The DOI system uses an interoperable data dictionary, which contains terms from different computerized systems and shows the relationships they have with one another in a formal way. The purpose of an interoperable data dictionary is to allow terms from different systems to be used together. The IDF is the Registration Authority (RA) for one such dictionary, the ISO/IEC MPEG-21 Rights Data Dictionary, and is the co-developer of a wider indecs Data Dictionary that includes this and is used by DOIs.

DOI Implementation
DOI is implemented through the IDF, which governs and safeguards (owns or licences on behalf of registrants) all intellectual property rights relating to the DOI System. It works with Registration Authorities to ensure that any improvements made to the DOI system (including creation, maintenance, registration, resolution and policymaking of DOIs) are available to any DOI registrant, and that no third party licenses might reasonably be required to practice the DOI standard (or the resolution of a DOI).

The IDF is not a standards body, but a central authority and maintenance agency. The IDF is the appointed Registration Authority (RA) for the ISO/IEC MPEG-21 Rights Data Dictionary, and is the proposed RA for the DOI System within ISO TC46/SC9. The IDF licenses authority to use the system through Registration Agencies, each of which can develop its own applications and use DOI in ways appropriate for its community and/or industry.

Links to Additional Information

http://www.doi.org/ Main DOI Web Page

http://www.doi.org/welcome.html International DOI Foundation

http://www.doi.org/about_the_doi.html Overview

http://www.doi.org/faq.html Frequently Asked Questions

http://www.doi.org/announce.html News and Events


Reference:
The International DOI Foundation (IDF). (n.d.). Introductory Overview: The Digital Object Identifier System. Updated 17 December, 2004, from
http://www.doi.org/overview/sys_overview_021601.html

The International DOI Foundation (IDF). (n.d.). Illustration showing DOI resolution process. (JPG). Updated 14 February 2005, from
http://www.doi.org/doi_presentations/resolution/doi_resolution_feb05.jpg