Aims & Objectives
Objectives
Digital libraries need to be evaluated as systems and
as services to determine how useful, usable, and economical
they are and whether they achieve reasonable cost-benefit
ratios. Results of evaluation studies can provide strategic
guidance for the design and deployment of future systems,
can assist in determining whether digital libraries address
the appropriate social, cultural, and economic problems,
and whether they are as maintainable as possible. Consistent
evaluation methods will also enable comparison between
systems and services.
The evaluation cluster will work both on evaluation methodologies
in general and on providing the infrastructure
for specific evaluations. Thus, the following objectives
are addressed:
- Development of a comprehensive theoretical framework
for DL evaluation, which can serve as reference point
for evaluation studies in the DL area.
- Support for research on new methodologies, in order
to overcome the lack of appropriate evaluation approaches
and methods.
- Development of corresponding toolkits and testbeds in
order to enable new evaluations and to ease the application
of standard evaluation methods.
Activities
In order to reach these goals, the following activities
will be carried out:
- Workshops on DL evaluation, for collecting existing
evaluation approaches and methods.
- Evaluation support to the DL community, by creating
an evaluation forum for enabling communication between
evaluation specialists and DL developers.
- Development of new evaluation approaches and methods,
in order to overcome the weaknesses of current approaches
and the lack of methods for new types of applications.
- Development of evaluation toolkits, e.g. for collecting
and analysing experimental data.
- Creation of testbeds for new content and usage types
in DLs, by starting from the existing testbeds for XML
and cross-lingual retrieval and extending these towards
new media, applications and usage types.
- Creation of testbeds for usage-oriented evaluation,
by extending existing testbeds or by creating testbeds
of user interactions.
Tasks
WP7 consists of 5 tasks plus the management task.
Task 0: Cluster management
(Task leader: Norbert Fuhr, Univ. of Duisburg-Essen)
This task will oversee the
work of the Evaluation cluster. It will include the following
activities:
- organization of cluster workshops in which the past
and future activities of the cluster will be discussed
and monitored
- setting up and maintenance of the cluster website
Task 1: Evaluation forum
(Task leader: Sarantos Kapidakis, IoU)
In order to bring together
DL developers and evaluators, an electronic forum will
be created and maintained, comprising the following components:
- a discussion forum for enabling communication about
evaluation issues and for facilitating bilateral
prototype evaluations
- a collection of existing evaluation approaches
- a collection of existing testbeds and toolkits for DL
evaluation
Creation of this forum will be supported by a working
group of evaluation specialists. The survey on evaluation
approaches will be based on the results of a corresponding
workshop.
Task 2: Evaluation models and methods
(Task leader: Norbert Fuhr, Univ. of Duisburg-Essen) -
partly in cooperation with the Information Access and
Personalization cluster.
This task will focus on
integrated research on theoretical and practical evaluation
issues. During the first 18 months, work will focus on
the specification of standard evaluation methods for DLs,
starting with a comparison and evaluation of existing
evaluation methodology, and then developing new techniques,
methods and measures.
Task 3: INEX
(Task leader: Mounia Lalmas, Queen Mary University of
London)
INEX 2002 was the first
round of a large-scale evaluation of XML retrieval, and
the 2003 round has just started, with more than 40 participating
groups.
For the first 18 months of DELOS, we envisage four actions.
- Evaluation of retrieval effectiveness, especially by
refining the evaluation criteria, in order to consider
how XML elements satisfy information needs in the context
of digital libraries.
- Evaluation of efficiency, taking into account the larger
number of possible answers (XML elements) and their possible
overlap.
- Prototype evaluation of usability, considering various
types of information-seeking activities in an interactive
setting.
- Investigation of new testbeds for structured multimedia
documents, aiming at a transition from a purely text-based
XML collection to the inclusion of other media, as for example
in MPEG-7 annotated documents.
Task 4: CLEF
(Task leader: Carol Peters, CNR-ISTI)
The CLEF activity within
DELOS is aimed at evaluating components of multilingual
retrieval systems that are of particular relevance to
the DL application area. During the first 18 months, the
infrastructure and the organization of an evaluation campaign
will be implemented, enabling the following tasks:
- Evaluation of multilingual information retrieval systems
- Evaluation of interactive components of cross-language
systems focused on query formulation, document selection
and/or translation issues
- Evaluation of cross-language systems using controlled
vocabularies
- Evaluation of monolingual (non-English) and cross-language
question answering systems
- Evaluation of systems for cross-language spoken document
retrieval
Task 5: Evaluation testbeds
(Task leader: Norbert Fuhr, Univ. of Duisburg-Essen)
This task aims at the creation
of new testbeds, especially for user studies and user-centered
evaluation. Initially, the work will focus on the creation
of a testbed for usage-oriented evaluation, either by
collecting user interactions with DLs or by creating a
testbed framework with extensible services.
Expected results
- Survey of existing evaluation methodology
- A conceptual framework for DL evaluation
- Collection of evaluation approaches and methods
- Collection of evaluation toolkits and testbeds
- New testbeds for new content and usage types and for
usage-oriented evaluation
Coordinator
Norbert Fuhr
Universität Duisburg-Essen, Germany
Previous evaluation WG website
More information about the previous evaluation Working
Group is available on the DELOS Working Group 2.1 website.