Vote¶
Voting operator. Computes the score for each document based on it's number of occurences and based on documents ranks: \(nb_occurences * sum_{rank \in ranks} 1 / rank\). The higher the score, the higher the document is ranked in output of the vote.
Parameters¶
-
models (list)
List of models of the vote.
Examples¶
>>> from pprint import pprint as print
>>> from cherche import retrieve, rank
>>> from sentence_transformers import SentenceTransformer
>>> documents = [
... {"id": 0, "town": "Paris", "country": "France", "continent": "Europe"},
... {"id": 1, "town": "Montreal", "country": "Canada", "continent": "North America"},
... {"id": 2, "town": "Madrid", "country": "Spain", "continent": "Europe"},
... ]
>>> search = (
... retrieve.TfIdf(key="id", on="town", documents=documents) *
... retrieve.TfIdf(key="id", on="country", documents=documents) *
... retrieve.Flash(key="id", on="continent")
... )
>>> search = search.add(documents)
>>> retriever = retrieve.TfIdf(key="id", on=["town", "country", "continent"], documents=documents)
>>> ranker = rank.Encoder(
... key="id",
... on=["town", "country", "continent"],
... encoder=SentenceTransformer("sentence-transformers/all-mpnet-base-v2").encode,
... ) * rank.Encoder(
... key="id",
... on=["town", "country", "continent"],
... encoder=SentenceTransformer(
... "sentence-transformers/multi-qa-mpnet-base-cos-v1"
... ).encode,
... )
>>> search = retriever + ranker
>>> search = search.add(documents)
>>> print(search("What is the capital of Canada ? Is it paris, montreal or madrid ?"))
[{'id': 1, 'similarity': 2.5},
{'id': 0, 'similarity': 1.4},
{'id': 2, 'similarity': 1.0}]
Methods¶
call
Call self as a function.
Parameters
- q (Union[List[List[Dict[str, str]]], List[Dict[str, str]]])
- batch_size (Optional[int]) – defaults to
None
- k (Optional[int]) – defaults to
None
- documents (Optional[List[Dict[str, str]]]) – defaults to
None
- kwargs
add
Add new documents.
Parameters
- documents (list)
- kwargs