December 27: Support for LLM fallbacks, native Google Docs formats, website unblocking, bug fixes
Graphlit now supports LLM fallbacks, which can help protect your application from model provider downtime. By assigning the `fallbacks` property when creating your conversation, you can provide an optional list of LLM specifications to be used (in order). These fallback specifications are used only when prompting the conversation with the main specification fails. One caveat: the RAG pipeline uses only the strategies provided in the main specification for prompt rewriting, content retrieval, etc. Content is not re-retrieved upon fallback; the formatted LLM prompt is tried against each fallback specification in succession until one succeeds. (Colab Notebook Example)
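The fallback order described above can be sketched in a few lines of Python. This is only an illustration of the behavior, not Graphlit's implementation: `prompt_with_fallbacks`, `call_model`, and `AllSpecificationsFailed` are hypothetical names, and specifications are represented as plain strings.

```python
from typing import Callable, Sequence, Tuple


class AllSpecificationsFailed(Exception):
    """Raised when the main specification and every fallback fail."""


def prompt_with_fallbacks(
    formatted_prompt: str,
    main_spec: str,
    fallback_specs: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try the main specification first, then each fallback in order.

    The same formatted prompt is reused for every attempt, so nothing is
    re-retrieved between attempts. Returns (specification_used, completion).
    """
    errors = []
    for spec in [main_spec, *fallback_specs]:
        try:
            # call_model is a hypothetical stand-in for the provider call.
            return spec, call_model(spec, formatted_prompt)
        except Exception as exc:  # assumption: any provider failure triggers fallback
            errors.append((spec, exc))
    raise AllSpecificationsFailed(errors)
```

If the main specification raises and the first fallback succeeds, the function returns the first fallback's name together with its completion; later fallbacks are never called.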
Graphlit now supports querying all available models through the new `models` query in the API. This returns the model enum, model service type enum, description, and several other useful details about the models. (Colab Notebook Example)
Graphlit now supports the ingestion of native Google Docs, Google Sheets and Google Slides documents from Google Drive feeds. These formats will be auto-exported to the corresponding Microsoft Office format (DOCX, XLSX, PPTX) prior to ingesting as content.
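For reference, the format correspondence can be written as a small lookup table. The MIME types below are the standard Google Workspace and Office Open XML identifiers; the table and the `export_target` helper are illustrative names only and say nothing about how Graphlit performs the export internally.

```python
from typing import Optional, Tuple

# Native Google Workspace MIME type -> (Office extension, Office MIME type).
GOOGLE_NATIVE_EXPORTS = {
    "application/vnd.google-apps.document": (
        "DOCX",
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ),
    "application/vnd.google-apps.spreadsheet": (
        "XLSX",
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ),
    "application/vnd.google-apps.presentation": (
        "PPTX",
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ),
}


def export_target(mime_type: str) -> Optional[Tuple[str, str]]:
    """Return the Office export target for a native Google format, else None."""
    return GOOGLE_NATIVE_EXPORTS.get(mime_type)
```

Files that are already in a non-native format (for example, a PDF stored in Google Drive) have no entry in the table, so the helper returns `None` for them.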
Graphlit now supports unblocking of websites, such as those using Cloudflare. You can set `enableUnblockedCapture` to true on the `PreparationWorkflowStage` to enable unblocking through our integration with the Browserless.io headless browser service. This does incur an additional cost per page, compared to normal web page ingestion.
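As a rough sketch, a workflow input with unblocking enabled might be assembled like this in Python. Only `enableUnblockedCapture` and its placement on the preparation stage come from the note above; the `name` and `preparation` keys and the `build_unblocking_workflow` helper are assumptions made for illustration.

```python
import json


def build_unblocking_workflow(name: str) -> dict:
    """Build a workflow input that turns on unblocked web capture.

    Assumption: the preparation stage is exposed under a "preparation" key
    and the workflow carries a "name". Check the API schema before use.
    """
    return {
        "name": name,
        "preparation": {
            # From the release note: set on the PreparationWorkflowStage.
            "enableUnblockedCapture": True,
        },
    }


def to_graphql_variables(workflow: dict) -> str:
    """Serialize the workflow as a JSON variables payload."""
    return json.dumps({"workflow": workflow})
```

Because unblocked capture costs more per page, it may make sense to keep a separate workflow for feeds that actually need it rather than enabling it everywhere.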
We have added support for assigning observations to contents ingested via feeds. By assigning `observations` to the `IngestionWorkflowStage` in the workflow object, you can assign Labels, Organizations, etc. without needing to use entity extraction.
We have added support for assigning observations when ingesting content via the `ingestUri`, `ingestText`, etc. mutations. By passing `observations` as a parameter, similar to `collections`, you can assign Labels, Organizations, etc. without needing to use entity extraction.
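A minimal sketch of what the mutation variables could look like follows. The exact input shape is not given here, so treat everything except the `observations` and `collections` parameter names as hypothetical: the `type`/`observable` structure of each observation, the `{"id": ...}` collection references, and the `build_ingest_variables` helper are all assumptions.

```python
from typing import Dict, List, Optional


def build_ingest_variables(
    uri: str,
    observations: Optional[List[Dict]] = None,
    collections: Optional[List[str]] = None,
) -> Dict:
    """Assemble variables for an ingestUri-style mutation.

    Assumed shapes (not confirmed by the release note):
      - collections are passed as [{"id": "<collection id>"}]
      - each observation looks like
        {"type": "LABEL", "observable": {"name": "<entity name>"}}
    """
    variables: Dict = {"uri": uri}
    if collections:
        variables["collections"] = [{"id": cid} for cid in collections]
    if observations:
        # Copy so later edits to the caller's list do not change the payload.
        variables["observations"] = list(observations)
    return variables
```

Both parameters are optional in this sketch, so content ingested without labels or collections produces a payload containing only the URI.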
We have changed the response type of the `publishContents` mutation to return the `PublishContents` type. This new `PublishContents` type wraps the published `Content` object, and includes the new `Details` property of `PublishingDetails` type. We have added an `includeDetails` parameter to the `publishContents` mutation, which will fill in the `Details` property with a list of intermediate content summaries and the published text, among other publishing metrics.
We have changed the behavior of `publishContents` such that, if no content was retrieved for publishing, the mutation returns a null content object rather than returning an error.
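Client code that previously caught an error for the empty case now needs a null check instead. A hedged sketch, assuming the response is a decoded JSON dictionary with `publishContents`, `content`, and `id` keys (these field names and the `published_content_id` helper are assumptions, not confirmed API details):

```python
from typing import Optional


def published_content_id(response: dict) -> Optional[str]:
    """Return the published content's id, or None if nothing was published.

    Assumption: the mutation result lives under "publishContents" and wraps
    the published content under "content".
    """
    result = response.get("publishContents") or {}
    content = result.get("content")
    if content is None:
        # No content was retrieved for publishing; this is not an error.
        return None
    return content.get("id")
```

Callers can then branch on `None` to skip downstream steps, such as notifying users, when there was nothing to publish.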
GPLA-3645: Table headers merged together on web scrape
GPLA-3634: Failed to extract pages from PDF with empty hyperlink text
GPLA-3633: Not handling empty observables properly for reranking