This work continues the 1st DELOS Network of Excellence Working Group on Evaluation (2000-2002).

Aims & Objectives

Digital libraries need to be evaluated as systems and as services to determine how useful, usable, and economical they are and whether they achieve reasonable cost-benefit ratios. Results of evaluation studies can provide strategic guidance for the design and deployment of future systems, and can assist in determining whether digital libraries address the appropriate social, cultural, and economic problems and whether they are as maintainable as possible. Consistent evaluation methods will also enable comparison between systems and services.
The evaluation cluster will work both on evaluation methodologies in general and on providing the infrastructure for specific evaluations. Thus, the following objectives are addressed:
- Development of a comprehensive theoretical framework for DL evaluation, which can serve as reference point for evaluation studies in the DL area.
- Support for research on new methodologies, in order to overcome the lack of appropriate evaluation approaches and methods.
- Development of corresponding toolkits and testbeds in order to enable new evaluations and to ease the application of standard evaluation methods.

In order to reach these goals, the following activities will be carried out:
- Workshops on DL evaluation, for collecting existing evaluation approaches and methods.
- Evaluation support to the DL community, by creating an evaluation forum for enabling communication between evaluation specialists and DL developers.
- Development of new evaluation approaches and methods, in order to overcome the weaknesses of current approaches and the lack of methods for new types of applications.
- Development of evaluation toolkits, e.g. for collecting and analysing experimental data.
- Creation of testbeds for new content and usage types in DLs, by starting from the existing testbeds for XML and cross-lingual retrieval and extending these towards new media, applications and usage types.
- Creation of testbeds for usage-oriented evaluation, by extending existing testbeds or by creating testbeds of user interactions.

WP7 consists of 5 tasks plus the management task.

Task 0: Cluster management
(Task leader: Norbert Fuhr, Univ. of Duisburg-Essen)

This task will oversee the work of the Evaluation cluster. It will include the following activities:
- organization of cluster workshops in which the past and future activities of the cluster will be discussed and monitored
- setting up and maintenance of the cluster website

Task 1: Evaluation forum
(Task leader: Sarantos Kapidakis, Ionian University)

In order to bring together DL developers and evaluators, an electronic forum will be created and maintained, comprising the following components:
- a discussion forum for communication about evaluation issues and for facilitating bilateral prototype evaluations
- a collection of existing evaluation approaches
- a collection of existing testbeds and toolkits for DL evaluation
Creation of this forum will be supported by a working group of evaluation specialists. The survey on evaluation approaches will be based on the results of a corresponding workshop.

Task 2: Evaluation models and methods
(Task leader: Norbert Fuhr, Univ. of Duisburg-Essen) - partly in cooperation with the Information Access and Personalization cluster.

This task will focus on integrated research on theoretical and practical evaluation issues. During the first 18 months, work will focus on the specification of standard evaluation methods for DLs, starting with a comparison and evaluation of existing evaluation methodology, and then developing new techniques, methods and measures.

Task 3: INEX
(Task leader: Mounia Lalmas, Queen Mary University of London)

INEX 2002 was the first round of a large-scale evaluation of XML retrieval, and the 2003 round has just started, with more than 40 participating groups.
For the first 18 months of DELOS, we envisage four actions:
- Evaluation of retrieval effectiveness, especially by refining the evaluation criteria, in order to consider how XML elements satisfy information needs in the context of digital libraries.
- Evaluation of efficiency, taking into account the larger number of possible answers (XML elements) and their possible overlap.
- Prototype evaluation of usability, considering various types of information-seeking activities in an interactive setting.
- Investigation of new testbeds for structured multimedia documents, aiming at a transition from a purely text-based XML collection to the inclusion of other media, e.g. as in MPEG-7 annotated documents.

Task 4: CLEF
(Task leader: Carol Peters, CNR-ISTI)

The CLEF activity within DELOS is aimed at evaluating components of multilingual retrieval systems that are of particular relevance to the DL application area. During the first 18 months, the infrastructure and the organization of an evaluation campaign will be implemented, enabling the following tasks:
- Evaluation of multilingual information retrieval systems
- Evaluation of interactive components of cross-language systems focused on query formulation, document selection and/or translation issues
- Evaluation of cross-language systems using controlled vocabularies
- Evaluation of monolingual (non-English) and cross-language question answering systems
- Evaluation of systems for cross-language spoken document retrieval

Task 5: Evaluation testbeds
(Task leader: Norbert Fuhr, Univ. of Duisburg-Essen)

This task aims at the creation of new testbeds, especially for user studies and user-centered evaluation. Initially, the work will focus on the creation of a testbed for usage-oriented evaluation, either by collecting user interactions with DLs or by creating a testbed framework with extensible services.

Expected results
- Survey of existing evaluation methodology
- A conceptual framework for DL evaluation
- Collection of evaluation approaches and methods
- Collection of evaluation toolkits and testbeds
- New testbeds for new content and usage types and for usage-oriented evaluation

Norbert Fuhr
Universität Duisburg-Essen, Germany

Previous evaluation WG website
More information about the previous evaluation Working Group is available on the DELOS Working Group 2.1 website.

Last update: 26-09-05