Main Page
This is the TREC Enterprise Track WIKI | House Rules | Member Contributions | Track Homepage
TREC Homepage
Introduction
The goal of the Enterprise Track is to study the issues that arise when searching the documents of an enterprise (organisation). It involves some things that are new to TREC:
1. New data. Organisations have a mixture of document types: web pages, news/email archives and document archives (for example on a shared fileserver or in a version control system). Each type has particular characteristics, including inter- and intra-document structure.
2. New tasks. Experiments should be representative of what enterprise users search for on enterprise data. This might involve a single data type (the ‘find a known item in email’ task) or integrate search across types (the ‘adhoc search across mixed data types’ task).
Enterprise search is interesting because it has not been sufficiently addressed in research, and it is of immense practical importance in real organisations.
In 2004 we built and made available an enterprise test collection, the W3C corpus. We started to develop ideas for search tasks, and invited Web Track participants to download the data and think about what tasks interest them. We then discussed the possible experiments at the TREC-2004 pre-track workshop.
TREC 2005 was the first year of the track. There were two main tasks: Email search and Expert search. Official details of the experiments appear on the TREC 2005 page and in the TREC 2005 Proceedings.
One reason this site is a wiki is to make it easily manageable by the coordinators. Another reason is to encourage Member Contributions from track participants. These pages are where you can add your comments and suggestions regarding these and future tasks. Do you have notes from the workshop? Add them here. Perhaps you have references to add to the bibliography page. Maybe you have data that people might be interested in, or suggestions of data we could use. Maybe you talked to your local W3C member and have some insights to add to the interview page. How do you think people search mailing lists? What are plausible information needs in the case of "expert search"? Please read the House Rules, and in particular note that most pages are world-readable.
TREC 2008 Information
The TREC 2008 guidelines include information on the current tasks and track deadlines. The 2008 topics will be available to active TREC participants from http://trec.nist.gov/act_part/tracks.html.
Past enterprises
TREC 2005:
Track guidelines
Topics and relevance judgments
Track overview paper
TREC 2006:
Track guidelines
Topics and relevance judgments
Track overview paper
TREC 2007:
Track guidelines
Topics and relevance judgments
Track overview paper
Data
TREC 2007-2008
The CSIRO Enterprise Research Collection corpus is available from CSIRO's Enterprise Search website.
You must complete an Organisational Agreement to be sent to CSIRO, and keep signed Individual Agreements for each person who has access to the data.
TREC 2005-2006
For 2005 and 2006, the track used the W3C corpus. TREC participants (i.e., those who responded to the CFP) who have signed the TREC results-dissemination agreement can gain download access by contacting lori.buckland@nist.gov.
- W3C;
- W3C corpus;
- During topic development, Glasgow University kindly provided a search interface on the W3C data using Terrier.
Bibliography
Enterprise Search Bibliography
Mailing List
The mailing list for the TREC Enterprise Track: trec-ent@nist.gov.
To subscribe, send a mail message to listproc@nist.gov whose body consists of the line:
subscribe trec-ent <FirstName> <LastName>
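For anyone scripting the request, the subscription message can be sketched in Python as below. This is only an illustration: the function name, the sample name and the sender address are hypothetical, and the sketch builds the message without sending it, since how mail is delivered depends on your local setup.

```python
from email.message import EmailMessage

# Address and list name are taken from the instructions above.
LISTPROC_ADDRESS = "listproc@nist.gov"
LIST_NAME = "trec-ent"


def build_subscribe_message(first_name: str, last_name: str, sender: str) -> EmailMessage:
    """Build (but do not send) a subscription request for the track mailing list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = LISTPROC_ADDRESS
    # The body consists of the single subscribe line; no subject is set
    # because the instructions do not specify one.
    msg.set_content(f"subscribe {LIST_NAME} {first_name} {last_name}")
    return msg


if __name__ == "__main__":
    # Hypothetical name and address, for illustration only.
    print(build_subscribe_message("Ada", "Lovelace", "ada@example.org"))
```

The resulting message can be handed to whatever mail client or SMTP relay you normally use.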
Information on the mailing list archive.