July 15: Support for SharePoint feeds, new Conversation features
New Features
💡 Added support for SharePoint feeds: you can now create a feed to ingest files from a SharePoint document library (and, optionally, a folder within the document library)
💡 Added support for PII detection during entity extraction from text documents and audio transcripts: now we will create labels such as PII: Social Security Number automatically when PII is detected
💡 Added support for developer's own OpenAI API keys and Azure OpenAI deployments in Specifications
ℹ️ Changed semantics of deleteFeed to delete the contents ingested by the feed; since contents are linked to feeds, now feeds can be disabled, while keeping the lineage to the feed, and if feeds are deleted, they will delete the linked contents, so we never lose the feed-to-content lineage
Added GraphQL query for SharePoint consent URI, for registered Graphlit Platform Azure AD application
Better handling of web sitemap indexes: now if a sitemap.xml contains a sitemapindex element, we will load all linked sitemaps for evaluating web pages to ingest from Web feed
Added new GraphQL mutations for openConversation, closeConversation and undoConversation
Added timestamps to Conversation messages
Added new GraphQL mutations for openCollection and closeCollection
Added more configuration for content search: now can specify searchType (KEYWORD, VECTOR, HYBRID) and queryType (SIMPLE, FULL - aka Lucene syntax)
Better parsing of iTunes podcast metadata
⚡ Renamed listingLimit field on feeds to readLimit
⚡ Renamed topK to numberSimilar for content vector search type
⚡ Changed GraphQL feed properties: split out azure into azureBlob and azureFile properties
⚡ Changed GraphQL specification properties: split out openAI into openAI and azureOpenAI properties
⚡ Removed count fields on query results, and replaced with explicit count{Entity} queries, which support search and filtering.
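For clients that build request payloads as plain dictionaries, the two field renames above (listingLimit to readLimit, topK to numberSimilar) could be applied with a small helper like the one below. This is only a sketch: the helper name rename_keys, the RENAMES table, and the payload shape in the usage note are illustrative assumptions, not part of the Graphlit API, and the recursive rename assumes these key names are not used for anything else in your payloads.

```python
from typing import Any, Dict

# Old field name -> new field name, taken from the renames listed above.
RENAMES: Dict[str, str] = {
    "listingLimit": "readLimit",
    "topK": "numberSimilar",
}


def rename_keys(value: Any, renames: Dict[str, str] = RENAMES) -> Any:
    """Return a copy of `value` with renamed dictionary keys at any depth.

    Hypothetical helper: it walks nested dicts and lists and leaves all
    other values untouched. The input is not modified.
    """
    if isinstance(value, dict):
        return {renames.get(k, k): rename_keys(v, renames) for k, v in value.items()}
    if isinstance(value, list):
        return [rename_keys(item, renames) for item in value]
    return value
```

For example, under the assumed payload shape, `rename_keys({"name": "posts", "reddit": {"listingLimit": 10}})` yields a dictionary whose nested key is readLimit instead of listingLimit. The azure and openAI property splits are not handled here, since the correct target property depends on the feed or specification type.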
Bugs Fixed
GPLA-1043: Reddit readLimit not taking effect: now the specified limit of Reddit posts will be leveraged for Reddit feeds
GPLA-1064: Performance on entity extraction and observation creation for large PDFs was under expectations: now able to build knowledge graph from large PDFs much faster (4x speed improvement)
GPLA-1053: If rendition generation errored during content workflow, the content was not properly marked as errored
GPLA-1102: Large Web sitemaps were slow to load; rewrote sitemap index handling, and now can process sitemaps with 150K+ entries in seconds.
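The sitemap index handling mentioned in these notes can be pictured with a short sketch. It follows the public sitemaps.org protocol, in which a sitemapindex element lists child sitemaps and a urlset element lists pages. It is not Graphlit's implementation: the function name collect_page_urls, the fetch callback, and the max_depth guard are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_page_urls(
    url: str,
    fetch: Callable[[str], str],
    max_depth: int = 3,
) -> List[str]:
    """Return page URLs from a sitemap, following sitemap indexes.

    `fetch` is a caller-supplied function (hypothetical here) that returns
    the XML text for a URL. `max_depth` limits how many levels of nested
    indexes are followed, as a guard against loops.
    """
    root = ET.fromstring(fetch(url))
    # Both <sitemap> and <url> entries carry their address in a <loc> child.
    locs = [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]

    if root.tag == f"{SITEMAP_NS}sitemapindex":
        if max_depth <= 0:
            return []
        pages: List[str] = []
        for child_sitemap in locs:
            pages.extend(collect_page_urls(child_sitemap, fetch, max_depth - 1))
        return pages

    # A plain <urlset>: the <loc> values are the pages themselves.
    return locs
```

Passing the fetch step in as a callback keeps the sketch independent of any HTTP library; a production loader would also need to handle compressed sitemaps, network errors, and very large files.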