July 19: Support for OpenAI GPT-4o Mini, BYO-key for Azure AI, similarity by summary, bug fixes

New Features

💡 Graphlit now supports the OpenAI GPT-4o Mini model, with 16k output tokens.
💡 Graphlit now supports 'bring-your-own-key' for Azure AI Document Intelligence models. We have added custom endpoint and key properties, which can be assigned to use your own Azure AI resource.
Updated to use Jina reranker v2 (jina-reranker-v2-base-multilingual) by default.
Updated to assign the summary, bullets, etc. properties when calling the summarizeContents mutation. Now when summarizing contents, we store the resulting summary in the content itself, in addition to returning the summarized results.
Added relevance property to all entity types, which will be assigned when searching for these entities. Entity results will be sorted (descending) by this search relevance score.
Added the ability to manually update summary, bullets, etc. properties when calling the updateContent mutation.
Added offset property to AtlassianJiraFeedProperties, so the timezone offset can be properly assigned for paging of the Jira feed. (Defaults to zero offset, i.e. UTC.) Jira does not store dates in UTC format, and the timestamps are based on the server timezone of the hosted Jira instance. By assigning the timezone offset with the Jira feed, we can reliably page the updated date timestamps from the Jira API.
⚡ We have changed the content similarity search behavior to find similar content by summary, rather than by the text of the document, when a summary has been previously generated. For long documents, this provides a more accurate similarity than comparing only the first few pages of text in a document.
⚡ We have changed how the offset in entity filter objects is applied when paging through entities. With vector or hybrid search, the offset is now ignored (i.e. zero offset), and paging through those results is not supported. With keyword search, the offset continues to be used, along with the limit property, to page through the search results. We made this change because we found that index-based paging is not reliable with our vector/hybrid search approach. We are investigating ways to support this reliably with vector/hybrid search in the future.
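
The paging change above can be illustrated with a minimal sketch. It does not use the Graphlit SDK: the `EntityFilter` class, the search type strings, and the helper names are hypothetical stand-ins, chosen only to show which offset takes effect for each search type.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class EntityFilter:
    # Hypothetical stand-in for an entity filter object (not the SDK type).
    # The search type strings "KEYWORD", "VECTOR", "HYBRID" are assumed names.
    search_type: str = "VECTOR"
    offset: int = 0
    limit: int = 100


def effective_offset(f: EntityFilter) -> int:
    """Offset that applies under the behavior described above:
    honored for keyword search, treated as zero otherwise."""
    return f.offset if f.search_type == "KEYWORD" else 0


def keyword_pages(limit: int, pages: int) -> Iterator[EntityFilter]:
    """Yield one keyword-search filter per page, advancing offset by limit."""
    for page in range(pages):
        yield EntityFilter(search_type="KEYWORD", offset=page * limit, limit=limit)
```

Under these assumptions, `effective_offset(EntityFilter("HYBRID", offset=20, limit=10))` evaluates to 0, while the same filter with `"KEYWORD"` keeps its offset of 20, so only keyword search can be paged with offset and limit.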

Bugs Fixed

GPLA-2915: Add retry on OpenAI API HTTP 524 error (gateway timeout).
GPLA-2908: Not paging through Jira feed correctly.
GPLA-2917: Search by similar content is not giving expected results on long documents.
GPLA-2244: Keyword search not finding text in latter part of long PDF.
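
For readers calling the OpenAI API directly, the retry fix in GPLA-2915 can be approximated client-side. The sketch below is not Graphlit's implementation: `call_with_retry`, its parameters, and the exponential backoff schedule are assumptions made for illustration, and `send` is a hypothetical callable that performs one request and returns a status code and body.

```python
import time
from typing import Callable, Tuple

# Only the status named in GPLA-2915; no other retryable codes are assumed.
RETRYABLE_STATUS = {524}


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out.

    Waits base_delay * 2**(attempt - 1) seconds between attempts.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return on success, on any non-retryable status, or on the last attempt.
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            return status, body
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.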

