December 27: Support for LLM fallbacks, native Google Docs formats, website unblocking, bug fixes

New Features

  • 💡 Graphlit now supports LLM fallbacks, which can help protect your application from model provider downtime. By assigning the `fallbacks` property when creating your conversation, you can provide an optional list of LLM specifications to be used, in order. These fallback specifications are used only when prompting the conversation via the main specification fails. One caveat: the RAG pipeline uses only the strategies provided in the main specification for prompt rewriting, content retrieval, etc. Content is not re-retrieved upon fallback; the formatted LLM prompt is tried against each fallback specification in succession until one succeeds. (Colab Notebook Example)

  • 💡 Graphlit now supports querying all available models through the new `models` query in the API. This returns the model enum, model service type enum, description, and several other useful details about the models. (Colab Notebook Example)

  • Graphlit now supports the ingestion of native Google Docs, Google Sheets and Google Slides documents from Google Drive feeds. These formats will be auto-exported to the corresponding Microsoft Office format (DOCX, XLSX, PPTX) prior to ingesting as content.

  • Graphlit now supports unblocking of websites, such as those using Cloudflare. You can set `enableUnblockedCapture` to true on the `PreparationWorkflowStage` to enable unblocking, which works through our integration with the Browserless.io headless browser service. This incurs an additional cost per page compared to normal web page ingestion.

  • We have added support for assigning observations to contents ingested via feeds. By assigning `observations` to the `IngestionWorkflowStage` in the workflow object, you can assign Labels, Organizations, etc. without needing to use entity extraction.

  • We have added support for assigning observations when ingesting content via mutations such as `ingestUri` and `ingestText`. By passing `observations` as a parameter, similar to `collections`, you can assign Labels, Organizations, etc. without needing to use entity extraction.

  • We have changed the response type of the `publishContents` mutation to return the `PublishContents` type. This new `PublishContents` type wraps the published `Content` object, and includes the new `Details` property of `PublishingDetails` type. We have added an `includeDetails` parameter to the `publishContents` mutation, which will fill in the `Details` property with a list of intermediate content summaries and the published text, among other publishing metrics.

  • We have changed the behavior of `publishContents` such that, if no content was retrieved for publishing, the mutation returns a null content object rather than returning an error.
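
The fallback behavior described in the first item above can be pictured with a small, self-contained sketch. This is not the Graphlit SDK: `prompt_with_fallbacks`, `unavailable`, and `backup` are hypothetical names, and each "specification" is modeled as a plain Python callable. It only illustrates the ordering described above: the same formatted prompt is tried against the main specification, then against each fallback in turn, until one succeeds.

```python
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for prompting one LLM specification; not a Graphlit API.
PromptFn = Callable[[str], str]


def prompt_with_fallbacks(
    formatted_prompt: str,
    main: PromptFn,
    fallbacks: Sequence[PromptFn] = (),
) -> str:
    """Try the main specification, then each fallback in the order given.

    The prompt is formatted once and reused as-is; nothing is re-retrieved
    between attempts.
    """
    last_error: Optional[Exception] = None
    for call in (main, *fallbacks):
        try:
            return call(formatted_prompt)
        except Exception as exc:  # any failure moves on to the next specification
            last_error = exc
    raise RuntimeError("all specifications failed") from last_error


def unavailable(prompt: str) -> str:
    """Simulates a model provider that is currently down."""
    raise ConnectionError("provider down")


def backup(prompt: str) -> str:
    """Simulates a healthy fallback provider."""
    return f"backup answered: {prompt}"


result = prompt_with_fallbacks("What is RAG?", unavailable, [unavailable, backup])
print(result)
```

In this sketch the order of the `fallbacks` list is the order of attempts, so the most preferred alternative goes first.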

Bugs Fixed

  • GPLA-3645: Table headers merged together on web scrape

  • GPLA-3634: Failed to extract pages from PDF with empty hyperlink text

  • GPLA-3633: Not handling empty observables properly for reranking
