Providing feedback about search results

Search quality can be improved directly with ground-truth relevance data for search results. Our feedback API lets you provide this data as tonita.datatypes.feedback.FeedbackItem objects.

Each FeedbackItem holds the following information for a given search request and listing pair:

A search request in the form of a tonita.datatypes.search.SearchRequest. At minimum, a value for the query field must be provided.
The corpus ID for the search.
The ID of a listing from the search results for the search request provided. This listing must belong to the corpus specified.
The ground-truth relevance between the search request and the listing.

This relevance value can be anywhere from 0 to 1, and how it’s determined is entirely up to you. For example, you might choose to simply classify listings as either relevant or irrelevant (e.g., based on whether an end user engaged with a particular listing at all), or you might consider gradations of relevance (e.g., based on the amount of time an end user engaged with a particular listing).
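The two approaches described above can be sketched as plain functions. This is only an illustration: the function names and the 60-second cap are hypothetical choices and are not part of the Tonita API.

```python
def binary_relevance(engaged: bool) -> float:
    """Return 1.0 if the end user engaged with the listing at all, else 0.0."""
    return 1.0 if engaged else 0.0


def graded_relevance(seconds_engaged: float, cap_seconds: float = 60.0) -> float:
    """Scale engagement time linearly into [0, 1], saturating at cap_seconds.

    cap_seconds is a hypothetical tuning knob, not a Tonita parameter.
    """
    if cap_seconds <= 0:
        raise ValueError("cap_seconds must be positive")
    # Clamp so that negative or very long durations stay within [0, 1].
    return min(max(seconds_engaged / cap_seconds, 0.0), 1.0)


print(binary_relevance(True))
print(graded_relevance(15.0))
print(graded_relevance(300.0))
```

Either function yields a value that could be passed as the relevance of a FeedbackItem; which signal and scaling to use depends on your application.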

Note

The data for a given FeedbackItem can come from many sources. For example, it may come from search results produced by Tonita, hand-curated data sets, or historical logs from a legacy search engine.

For search results that were not produced by Tonita, please populate each SearchRequest as you would if you were to perform the same search using Tonita.

Upcoming feature

In a forthcoming version of the API, you will be able to provide a search event ID for searches performed by Tonita in lieu of providing the search request and corpus ID.

As an example, let’s construct a few FeedbackItems:

from tonita.datatypes.search import SearchRequest
from tonita.datatypes.feedback import FeedbackItem

search_request1 = SearchRequest(
    query="short story anthology with both sci-fi and fantasy",
    categories=["hardcover"],
)

feedback_item1 = FeedbackItem(
    search_request=search_request1,
    listing_id="sf826503qk",
    relevance=1,
    corpus_id="new_books"
)

feedback_item2 = FeedbackItem(
    search_request=search_request1,
    listing_id="aa359316on",
    relevance=0.56,
    corpus_id="new_books"
)

search_request2 = SearchRequest(
    query="alice in wonderland",
    categories=["first-edition", "signed"],
    facet_restrictions=[
        {
            "name": "price",
            "type": "NUMERIC",
            "operation": "LESS_THAN_EQUAL",
            "value": 500
        }
    ],
)

feedback_item3 = FeedbackItem(
    search_request=search_request2,
    listing_id="wn225968dr",
    relevance=0.93,
    corpus_id="rare_books"
)

Here, we’ve created three FeedbackItems: all have categories data, two pertain to the same search request, and one has a facet restriction. Note also that the corpus IDs differ among the FeedbackItems: two of them share a corpus ID, but the third one has a different corpus ID.

We can submit them all at the same time through the feedback API as follows:

import tonita

tonita.feedback.submit(
    feedback_items=[feedback_item1, feedback_item2, feedback_item3],
    api_key="my_api_key"
)

# Example return value:
# SubmitFeedbackResponse(
#     feedback_submission_id="a9042b8c0396"
# )

If the submission succeeds, the response contains an ID for the feedback submission, which you can keep for bookkeeping.

If you have many FeedbackItems to submit, you can also provide a path to a JSONL file. Each line of this file should be a FeedbackItem in JSON format. For example, consider the following dataclass:

FeedbackItem(
    search_request=SearchRequest(
        query="famous 90s romcom"
    ),
    corpus_id="movies",
    listing_id="sa395823fn",
    relevance=0.98
)

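As JSON, this becomes the following (pretty-printed here for readability; in the JSONL file, each FeedbackItem must occupy a single line):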

{
    "search_request": {
        "query": "famous 90s romcom"
    },
    "corpus_id": "movies",
    "listing_id": "sa395823fn",
    "relevance": 0.98
}

For details on how to make this conversion more generally, see Dataclasses and JSONs.
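As a rough sketch of producing such a file, the snippet below writes plain dictionaries in the JSON shape shown above, one per line, using only the standard library. The helper name write_feedback_jsonl and the temporary output location are hypothetical, and the second item is a simplified, query-only variant of an earlier example.

```python
import json
import os
import tempfile


def write_feedback_jsonl(items, path):
    """Write one JSON object per line and return the number of lines written."""
    with open(path, "w", encoding="utf-8") as f:
        for item in items:
            # json.dumps without `indent` emits a single line, as JSONL requires.
            f.write(json.dumps(item) + "\n")
    return len(items)


items = [
    {
        "search_request": {"query": "famous 90s romcom"},
        "corpus_id": "movies",
        "listing_id": "sa395823fn",
        "relevance": 0.98,
    },
    {
        "search_request": {"query": "alice in wonderland"},
        "corpus_id": "rare_books",
        "listing_id": "wn225968dr",
        "relevance": 0.93,
    },
]

# Hypothetical output location; any writable path works.
jsonl_path = os.path.join(tempfile.mkdtemp(), "my_feedback_items.jsonl")
print(write_feedback_jsonl(items, jsonl_path))
```

The resulting path is what you would pass as jsonl_path when submitting.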

A JSONL file can be submitted as follows:

tonita.feedback.submit(
    jsonl_path="path/to/my_feedback_items.jsonl",
    api_key="my_api_key"
)

# Example return value:
# SubmitFeedbackResponse(
#     feedback_submission_id="2aa64f3967b3"
# )
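Before submitting a large file, it may help to check it locally against the requirements described on this page: each line is one JSON object with the four FeedbackItem fields, the search request has a query, and the relevance lies between 0 and 1. The checker below is a minimal sketch under those assumptions; find_problems is a hypothetical helper, and this page does not say which checks the API itself performs.

```python
import json

REQUIRED_KEYS = ("search_request", "corpus_id", "listing_id", "relevance")


def find_problems(lines):
    """Return (line_number, message) pairs for lines that look malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            item = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"not valid JSON: {exc.msg}"))
            continue
        if not isinstance(item, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            problems.append((number, "missing keys: " + ", ".join(missing)))
            continue
        request = item["search_request"]
        if not isinstance(request, dict) or not request.get("query"):
            problems.append((number, "search_request has no query"))
        relevance = item["relevance"]
        is_number = isinstance(relevance, (int, float)) and not isinstance(relevance, bool)
        if not is_number or not 0 <= relevance <= 1:
            problems.append((number, "relevance must be a number from 0 to 1"))
    return problems


good = '{"search_request": {"query": "famous 90s romcom"}, "corpus_id": "movies", "listing_id": "sa395823fn", "relevance": 0.98}'
bad = '{"search_request": {}, "corpus_id": "movies", "listing_id": "sa395823fn", "relevance": 1.5}'

for number, message in find_problems([good, bad]):
    print(f"line {number}: {message}")
```

To check a file, pass its lines (for example, from open(path).read().splitlines()) to find_problems and submit only when the returned list is empty.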