June 9: Support for DeepSeek models, JSON-LD webpage parsing, performance improvements and bug fixes

New Features

  • 💡 Graphlit now supports DeepSeek LLMs for prompt completion. We offer the deepseek-chat and deepseek-coder models.

  • 💡 Graphlit now supports parsing embedded JSON-LD from web pages. If a web page contains 'script' tags with JSON-LD, we will automatically parse the JSON-LD and inject it into the knowledge graph.

  • We have changed the default model for entity extraction and image completions to OpenAI GPT-4o. This provides faster performance and higher-quality output.

  • Knowledge graph generation from a prompted conversation is now opt-in. To receive the graph's nodes and edges with the response, you now need to set generateGraph to True in the specification's graphStrategy object. This improves performance when the graph is not needed for visualization.

  • Added thing property for observable entities, which returns the JSON-LD metadata associated with the entity.

  • Added regex-based filtering for URI paths during feed ingestion, link crawling, and workflow filtering. You can assign regex patterns in allowedPaths and excludedPaths.

  • Added observableLimit to configure how many observed entities will be added to the GraphRAG context. It defaults to 1000.

  • Added prompt to suggestConversation mutation, which allows customization of the follow-up question generation.

  • Updated suggestConversation to incorporate the past conversation message history, in addition to the filtered set of content sources.

  • 🔥 We have improved performance in knowledge graph retrieval and generation, via better parallelization and batching.
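
To illustrate the JSON-LD item above, here is a minimal sketch of how embedded JSON-LD can be pulled out of a page's 'script' tags. It uses only the Python standard library and is not Graphlit's implementation; the names JsonLdExtractor and extract_jsonld are hypothetical, and the sketch assumes the tags carry the standard type="application/ld+json" attribute.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed JSON from <script type="application/ld+json"> tags.

    Hypothetical helper for illustration; not part of any Graphlit SDK.
    """

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. a bare `type`), so guard before lower().
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed JSON-LD blocks instead of failing the page.


def extract_jsonld(html):
    """Return a list of decoded JSON-LD documents found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


sample = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
</script>
<script>var unrelated = 1;</script>
</head></html>"""

items = extract_jsonld(sample)
```

In this sketch, ordinary 'script' tags are ignored and malformed JSON-LD is skipped silently; how the platform handles those cases is not stated in these notes.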
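
The regex-based path filtering above can be pictured with a short sketch. The semantics shown here are assumptions, since these notes do not spell them out: patterns are matched against the URI path with a search (not a full match), an excluded pattern wins over an allowed one, and an empty allowed list permits every path. The function name is_path_allowed is hypothetical; only allowedPaths and excludedPaths come from the release notes.

```python
import re
from urllib.parse import urlparse


def is_path_allowed(uri, allowed_paths=None, excluded_paths=None):
    """Decide whether a URI passes the path filters.

    Assumed semantics (not confirmed by the release notes):
    - patterns are applied to the URI path only, using re.search
    - any excluded match rejects the URI
    - if allowed patterns are given, at least one must match
    """
    path = urlparse(uri).path or "/"
    if excluded_paths and any(re.search(p, path) for p in excluded_paths):
        return False
    if allowed_paths:
        return any(re.search(p, path) for p in allowed_paths)
    return True


allowed = [r"^/blog/"]       # stands in for allowedPaths
excluded = [r"/drafts/"]     # stands in for excludedPaths

published = is_path_allowed("https://example.com/blog/2024/post", allowed, excluded)
draft = is_path_allowed("https://example.com/blog/drafts/wip", allowed, excluded)
other = is_path_allowed("https://example.com/about", allowed, excluded)
```

Anchoring a pattern with ^ or $ restricts it to the start or end of the path, which is usually what you want when filtering by site section.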

Bugs Fixed

  • GPLA-2748: Optimize the retrieval performance of observed entities during GraphRAG

  • GPLA-2732: Invalid user-provided URI causing parsing exception

  • GPLA-2666: Shouldn't require tenant ID for Microsoft email or Teams

  • GPLA-2772: Not returning labels or categories from graph in API

  • GPLA-2762: Failed to extract spreadsheet images

  • GPLA-2687: Email to/from not getting added as observations on emails

  • GPLA-2738: API is returning 'audio' metadata from podcast HTML document
