NTCIR-17 Transfer Task

Hideo Joho
September 28, 2022

Introducing a new pilot task called Transfer at the kick-off event of NTCIR-17

Transcript

  1. NTCIR-17 Transfer Task: Resource Transfer Based Dense Retrieval
    English version with audio / Japanese version (with audio)
    Hideo Joho (University of Tsukuba), Atsushi Keyaki (Hitotsubashi University), Yuki Ohba (University of Tsukuba)
  2. Examples of resource transfer
    • Task transfer
      ◦ Fine-tuning from navigational queries to informational queries
      ◦ Fine-tuning from a language model to a ranking model
    • Domain transfer
      ◦ Domain adaptation from Web documents to academic writing
    • Language transfer
      ◦ English models to Japanese models
    • and so on…
  3. Data to be made available
    • Existing data
      ◦ MS MARCO (ver 1) English version (aka eMARCO)
      ◦ NTCIR-1 Ad-hoc test collection (Ja)
      ◦ NTCIR-2 Ad-hoc test collection (Ja)
      ◦ BERT models (En/Ja)
    • Data to be constructed and provided
      ◦ MS MARCO (ver 1) Japanese translation version (aka jMARCO)
        ▪ Document collection and dev topics (Initial translation has been completed)
        ▪ JParaCrawl version 2 + DeepL API
      ◦ ColBERT Model trained on jMARCO
      ◦ BERT-Reranker trained on dev / jMARCO
  4. Subtask 1: Dense First Stage Retrieval
    • Input/Output
      ◦ Input: Ad-Hoc task topic description
      ◦ Output: Ranked list of top 1,000 document IDs
    • Dev/Test
      ◦ Dev: NTCIR-1 Ad-Hoc/CLIR (Ja) 83 topics
      ◦ Test: NTCIR-2 Ad-Hoc/CLIR (Ja) 49 topics
    • Metrics
      ◦ nDCG
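As a rough illustration of the Subtask 1 input/output contract, the sketch below ranks documents by the inner product between a query vector and precomputed document vectors, then scores the ranking with nDCG. It is not the organizers' implementation. The function names, the inner-product scoring, the linear-gain and log2-discount form of nDCG, and the `qrels` dictionary layout are all assumptions made for illustration; the slides do not specify the encoder, the similarity function, or the nDCG variant.

```python
import numpy as np


def dense_retrieve(query_vec, doc_vecs, doc_ids, k=1000):
    """Return up to k document IDs ranked by inner-product score (hypothetical helper)."""
    scores = doc_vecs @ query_vec
    k = min(k, len(doc_ids))
    if k == 0:
        return []
    # argpartition selects the k best candidates without a full sort;
    # only those k candidates are then sorted by descending score.
    top = np.argpartition(-scores, k - 1)[:k]
    top = top[np.argsort(-scores[top], kind="stable")]
    return [doc_ids[i] for i in top]


def ndcg_at_k(ranked_ids, qrels, k=1000):
    """nDCG@k with linear gain and log2 discount (one common variant, assumed here).

    qrels maps doc_id -> graded relevance; unjudged documents count as 0.
    """
    gains = [qrels.get(d, 0) for d in ranked_ids[:k]]
    dcg = sum(g / np.log2(i + 2) for i, g in enumerate(gains))
    ideal = sorted(qrels.values(), reverse=True)[:k]
    idcg = sum(g / np.log2(i + 2) for i, g in enumerate(ideal))
    return float(dcg / idcg) if idcg > 0 else 0.0
```

In practice the query and document vectors would come from a dense encoder such as the BERT or ColBERT models listed among the resources, and exhaustive scoring would usually be replaced by an approximate nearest-neighbour index for large collections.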
  5. Subtask 2: Dense Reranking Subtask
    • Input/Output
      ◦ Input: Top 1,000 documents from the 1st stage retrieval (Doc IDs, vectors, etc.)
        ▪ Provided by the organizer
      ◦ Output: Reranked list of top 100 document IDs
    • Dev/Test
      ◦ Dev: NTCIR-1 Ad-Hoc/CLIR (Ja) 83 topics
      ◦ Test: NTCIR-2 Ad-Hoc/CLIR (Ja) 49 topics
    • Metrics
      ◦ nDCG / MRR
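The reranking step and the MRR metric of Subtask 2 can be sketched in a similar way. This is a minimal illustration under stated assumptions, not the task's official tooling: `rerank`, `reciprocal_rank`, and `mean_reciprocal_rank` are hypothetical names, `score_fn` stands in for whatever reranking model a participant uses, and relevance is treated as binary for MRR.

```python
def rerank(candidate_ids, score_fn, k=100):
    """Re-score first-stage candidates and keep the k highest-scoring IDs.

    score_fn maps a doc ID to a relevance score (placeholder for a real reranker).
    """
    scored = [(score_fn(d), d) for d in candidate_ids]
    # list.sort is stable, so tied documents keep their first-stage order.
    scored.sort(key=lambda pair: -pair[0])
    return [d for _, d in scored[:k]]


def reciprocal_rank(ranked_ids, relevant):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, d in enumerate(ranked_ids, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs, qrels):
    """Average reciprocal rank over topics.

    runs:  topic_id -> ranked list of doc IDs
    qrels: topic_id -> set of relevant doc IDs (binary relevance assumed)
    """
    if not runs:
        return 0.0
    total = sum(reciprocal_rank(r, qrels.get(t, set())) for t, r in runs.items())
    return total / len(runs)
```

For example, with two topics whose first relevant documents appear at ranks 2 and 1, the reciprocal ranks are 0.5 and 1.0, so `mean_reciprocal_rank` returns 0.75.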
  6. Tentative Schedule
    • September 28th, 2022: Kick-off event
    • January 30th, 2023: Final task guideline release, all resources release
    • February 1st, 2023: Formal Run: Dev/Test topics release
    • May 1st, 2023: Formal Run: Task registration due
    • June 1st, 2023: Formal Run: Run submission due
    • August 1st, 2023: Formal Run: Evaluation results returned
    • August 1st, 2023: Task overview paper release (Draft)
    • September 1st, 2023: Participant paper submission due (Draft)
    • November 1st, 2023: Camera-ready submission due
    • December 2023: NTCIR-17 Conference
  7. Task Design Considerations
    1. No sparse runs (e.g., BM25 only), but a simple fine-tuned model is acceptable
    2. Subtask 2 has a fixed set of 1K documents (Use outputs from Subtask 1?)
    3. Currently focusing on Japanese in the target task (Other languages?)
    4. Currently no restrictions on the data/models used to generate runs
    5. Currently no Dry Run period
    6. Accepts 3-5 runs per team (More?)
    7. We trust participants not to look at the qrels of the test sets (Important)
    8. We might perform additional relevance assessments
    9. We might introduce a leaderboard
    10. We aim to build a resource guide / best practice information
  8. Advisory Board
    • Noriko Kando (NII, Japan)
    • Doug Oard (University of Maryland, USA)
    • to be added