evaluation

Evaluate a search function against a set of queries and their expected answers.

Parameters

  • search

    Search function or pipeline to evaluate; it should return a ranked list of documents for each query.

  • query_answers (List[Tuple[str, List[Dict[str, str]]]])

    List of (query, answers) pairs, where answers is the list of documents expected for the query (a minimal sketch of the expected shape follows this list).

  • hits_k (range) – defaults to range(1, 6)

    Values of k at which to compute Precision@k, Recall@k and F1@k.

  • batch_size (Optional[int]) – defaults to None

    Batch size used when passing queries to the search function.

  • k (Optional[int]) – defaults to None

    Number of documents to retrieve for each query.
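
To make the expected structure concrete, here is a minimal hand-built sketch of query_answers. The queries and tag values are placeholders; the "uri" field matches the key used in the example below.

>>> query_answers = [
...     ("query about machine learning", [{"uri": "tag:1"}, {"uri": "tag:2"}]),
...     ("query about databases", [{"uri": "tag:3"}]),
... ]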

Examples

>>> from pprint import pprint as print
>>> from cherche import data, evaluate, retrieve
>>> from lenlp import sparse

>>> documents, query_answers = data.arxiv_tags(
...     arxiv_title=True, arxiv_summary=False, comment=False
... )

>>> search = retrieve.TfIdf(
...     key="uri",
...     on=["prefLabel_text", "altLabel_text"],
...     documents=documents,
...     tfidf=sparse.TfidfVectorizer(normalize=True, ngram_range=(3, 7), analyzer="char"),
... ) + documents
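
The trailing + documents builds a pipeline that maps each retrieved key back to its full document, so the search returns complete documents rather than bare keys.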

>>> scores = evaluate.evaluation(search=search, query_answers=query_answers, k=10)

>>> print(scores)
{'F1@1': '26.52%',
 'F1@2': '29.41%',
 'F1@3': '28.65%',
 'F1@4': '26.85%',
 'F1@5': '25.19%',
 'Precision@1': '63.06%',
 'Precision@2': '43.47%',
 'Precision@3': '33.12%',
 'Precision@4': '26.67%',
 'Precision@5': '22.55%',
 'R-Precision': '26.95%',
 'Recall@1': '16.79%',
 'Recall@2': '22.22%',
 'Recall@3': '25.25%',
 'Recall@4': '27.03%',
 'Recall@5': '28.54%'}
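
The reported metrics follow standard information-retrieval definitions: Precision@k is the fraction of the top k retrieved documents that are relevant, Recall@k is the fraction of relevant documents found in the top k, F1@k is their harmonic mean, and R-Precision is the precision at rank R, where R is the number of relevant documents for the query. As a rough sketch (not cherche's actual implementation), the per-query values at a given k could be computed as follows, with retrieved the ranked list of keys returned by search and relevant the set of expected keys:

>>> def metrics_at_k(retrieved, relevant, k):
...     hits = sum(1 for key in retrieved[:k] if key in relevant)
...     precision = hits / k
...     recall = hits / len(relevant) if relevant else 0.0
...     f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
...     return {"precision": precision, "recall": recall, "f1": f1}

>>> metrics_at_k(retrieved=["a", "b", "c"], relevant={"a", "c", "d"}, k=2)
{'precision': 0.5, 'recall': 0.3333333333333333, 'f1': 0.4}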