August 8: Support for LLM-based document extraction, .NET SDK, bug fixes

New Features

  • 💡 Graphlit now supports LLM-based document preparation, using vision-capable models such as OpenAI GPT-4o and Anthropic Claude 3.5 Sonnet. This is available via the MODEL_DOCUMENT preparation service type; you can assign a custom specification object and bring your own LLM keys.
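As a sketch, the preparation service above might be wired up through a workflow whose preparation stage references a custom specification. The GraphQL field names below (`createWorkflow`, `preparation`, `jobs`, `modelDocument`) are illustrative assumptions, not the verified Graphlit schema, so check the API reference for the exact shape.

```python
# Hypothetical sketch of configuring LLM-based document preparation.
# Payload field names are assumptions, not the verified Graphlit schema.

def build_preparation_workflow_request(specification_id: str) -> dict:
    """Build a GraphQL request that creates a workflow whose preparation
    stage uses the MODEL_DOCUMENT service with a custom specification."""
    mutation = """
    mutation CreateWorkflow($workflow: WorkflowInput!) {
      createWorkflow(workflow: $workflow) { id name }
    }
    """
    workflow = {
        "name": "LLM Document Preparation",
        "preparation": {
            "jobs": [{
                "connector": {
                    "type": "MODEL_DOCUMENT",
                    # Reference a specification that assigns a vision-capable
                    # model (e.g. GPT-4o) and your own LLM API key.
                    "modelDocument": {"specification": {"id": specification_id}},
                }
            }]
        },
    }
    return {"query": mutation, "variables": {"workflow": workflow}}
```

The builder only assembles the request body; sending it to the Graphlit GraphQL endpoint with your credentials is left to your HTTP client or SDK of choice.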

  • 💡 Graphlit now provides an open-source .NET SDK, supporting .NET 6 and .NET 8 (and above). The SDK package can be found on NuGet.org, and code samples can be found on GitHub.

  • Added an identifier property to the Content object for mapping content to external database identifiers. The property is also supported for content filtering.
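For example, once content carries your external identifier, a query can filter on it directly. The query and filter shape below are illustrative assumptions rather than the verified Graphlit schema.

```python
# Hypothetical sketch: filter content by the new identifier property.
# The query and filter shape are assumptions, not the verified Graphlit schema.

def build_query_contents_request(external_id: str) -> dict:
    """Build a GraphQL request that filters content by external identifier,
    e.g. a primary key from your own database."""
    query = """
    query QueryContents($filter: ContentFilter!) {
      contents(filter: $filter) {
        results { id name identifier }
      }
    }
    """
    return {"query": query, "variables": {"filter": {"identifier": external_id}}}
```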

  • Added support for Claude 3 vision models for image-based entity extraction, using the MODEL_IMAGE entity extraction service.

  • Added context augmentation to conversations, via the augmentedFilter property on the Conversation object. Any content that matches the augmented filter is injected into the LLM prompt context, without needing to be related by vector similarity to the user prompt. This is useful for specifying domain knowledge which should always be referenced by the RAG pipeline.
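A minimal sketch of assigning an augmented filter at conversation creation time; the mutation and field names here (including filtering by a collection) are assumptions for illustration, not the verified Graphlit schema.

```python
# Hypothetical sketch: always-injected domain knowledge via augmentedFilter.
# Field names are assumptions, not the verified Graphlit schema.

def build_create_conversation_request(domain_collection_id: str) -> dict:
    """Build a GraphQL request for a conversation whose augmented filter
    injects matching content into every LLM prompt, regardless of
    vector similarity to the user prompt."""
    mutation = """
    mutation CreateConversation($conversation: ConversationInput!) {
      createConversation(conversation: $conversation) { id }
    }
    """
    conversation = {
        "name": "Support Conversation",
        # Content matching this filter (e.g. a collection of policy docs)
        # is always added to the prompt context.
        "augmentedFilter": {"collections": [{"id": domain_collection_id}]},
    }
    return {"query": mutation, "variables": {"conversation": conversation}}
```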

  • Added support for the latest snapshot of OpenAI GPT-4o, with the model enum GPT4O_128K_20240806.

  • Added reranking of related entities when preparing the LLM prompt context for GraphRAG. If reranking is enabled, the metadata from the related entities is reranked by the same reranker assigned to the conversation specification.

  • We have changed the type of the duration field in the AudioMetadata and VideoMetadata types from string to TimeSpan, to be more consistent with the rest of the API data model.
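Clients that previously treated duration as an opaque string may now want to parse the serialized value. Assuming durations serialize in the .NET TimeSpan format (`[d.]hh:mm:ss[.fffffff]`), a non-.NET client could convert them like this sketch:

```python
# Sketch: parse a .NET-style TimeSpan string into a Python timedelta.
# Assumes the "[d.]hh:mm:ss[.fffffff]" serialization format; negative
# durations are not handled here.
from datetime import timedelta

def parse_timespan(value: str) -> timedelta:
    """Parse e.g. '00:03:27', '00:03:27.5000000', or '1.02:03:04'."""
    days = 0
    # An optional leading day count is separated from hours by a dot.
    if "." in value.split(":")[0]:
        day_part, value = value.split(".", 1)
        days = int(day_part)
    hours, minutes, rest = value.split(":")
    return timedelta(days=days, hours=int(hours),
                     minutes=int(minutes), seconds=float(rest))
```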

Bugs Fixed

  • GPLA-2884: Support retry on HTTP 529 (Overloaded) error from Anthropic API.
