Comparison of Embedding Methods for Retrieval Under Noisy Institutional Labels

Authors

NOVOTNÁ Tereza, HARAŠTA Jakub

Year of publication 2025
Type Article in Proceedings
Conference JURIX 2025 Proceedings (Frontiers in Artificial Intelligence and Applications, volume 416: Legal Knowledge and Information Systems)
MU Faculty or unit

Faculty of Law

Citation
Web Full text of the result
Doi https://doi.org/10.3233/FAIA251605
Keywords legal information retrieval; case law; embeddings; evaluation; noisy labels; Czech Constitutional Court
Description Retrieving relevant case law remains a time-consuming task. We compare two embedding models for Czech Constitutional Court decisions: (i) a large general-purpose OpenAI embedder and (ii) a domain-specific BERT trained from scratch on ~34,000 decisions. We introduce a noise-aware evaluation using IDF-weighted keyword overlap as graded relevance, dual thresholds (0.20, 0.28), paired-bootstrap significance, and nDCG diagnostics. Despite conservative absolute nDCG due to noisy institutional labels, the OpenAI embedder consistently and significantly outperforms the domain BERT across all ranks and thresholds. Our framework enables robust evaluation under imperfect gold standards typical of legacy judicial databases.
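
The evaluation pipeline named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record does not give the exact formulas, so the following are assumptions made for illustration: the overlap is an IDF-weighted Jaccard ratio, overlaps below a threshold are treated as non-relevant (gain 0), nDCG uses linear gains, and the bootstrap resamples per-query score differences. All function names and the toy keyword sets are hypothetical; only the thresholds 0.20 and 0.28 come from the description.

```python
import math
import random
from collections import Counter

def idf_weights(keyword_sets):
    """Smoothed IDF per keyword, computed over a corpus of keyword sets."""
    n = len(keyword_sets)
    df = Counter(k for kws in keyword_sets for k in set(kws))
    return {k: math.log((1 + n) / (1 + c)) + 1.0 for k, c in df.items()}

def weighted_overlap(a, b, idf):
    """IDF-weighted Jaccard overlap of two keyword sets (assumed form)."""
    a, b = set(a), set(b)
    union = sum(idf.get(k, 0.0) for k in a | b)
    if union == 0.0:
        return 0.0
    return sum(idf.get(k, 0.0) for k in a & b) / union

def graded_gain(overlap, threshold):
    """Assumption: an overlap below the threshold counts as non-relevant."""
    return overlap if overlap >= threshold else 0.0

def ndcg_at_k(ranked_gains, all_gains, k):
    """nDCG@k with linear gains; the ideal ranking sorts all candidate gains."""
    def dcg(gains):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = dcg(sorted(all_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0

def paired_bootstrap(scores_a, scores_b, n_resamples=2000, seed=0):
    """Resample per-query differences with replacement.

    Returns the observed mean difference (A minus B) and the share of
    resamples in which A's mean advantage is not positive.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    not_better = 0
    for _ in range(n_resamples):
        sample_mean = sum(diffs[rng.randrange(n)] for _ in range(n)) / n
        if sample_mean <= 0:
            not_better += 1
    return sum(diffs) / n, not_better / n_resamples

# Toy data (hypothetical keyword labels, not taken from the court database).
corpus = {
    "d1": {"fair trial", "property"},
    "d2": {"fair trial", "evidence"},
    "d3": {"property", "restitution"},
    "d4": {"privacy"},
}
idf = idf_weights(list(corpus.values()))

# Graded relevance of every other decision with respect to query decision d1,
# evaluated at both thresholds mentioned in the description.
candidates = ["d2", "d3", "d4"]
for threshold in (0.20, 0.28):
    gains = {d: graded_gain(weighted_overlap(corpus["d1"], corpus[d], idf), threshold)
             for d in candidates}
    ranking = ["d4", "d2", "d3"]  # order returned by some hypothetical retriever
    score = ndcg_at_k([gains[d] for d in ranking], list(gains.values()), k=3)
    print(threshold, round(score, 3))
```

With real data, `ndcg_at_k` would be computed per query for each embedding model, and the two per-query score lists passed to `paired_bootstrap` to judge whether the observed difference is stable under resampling.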