July 15: Support for SharePoint feeds, new Conversation features
New Features
💡 Added support for SharePoint feeds: you can now create a feed to ingest files from a SharePoint document library (and, optionally, a folder within that document library)
💡 Added support for PII detection during entity extraction from text documents and audio transcripts: labels such as PII: Social Security Number are now created automatically when PII is detected
💡 Added support for developers' own OpenAI API keys and Azure OpenAI deployments in Specifications
ℹ️ Changed semantics of deleteFeed to delete the contents ingested by the feed. Contents are linked to feeds, so a feed can now be disabled while keeping the lineage to the feed; if a feed is deleted, its linked contents are deleted with it, so the feed-to-content lineage is never lost
Added GraphQL query for SharePoint consent URI, for registered Graphlit Platform Azure AD application
Better handling of web sitemap indexes: if a sitemap.xml contains a sitemapindex element, all linked sitemaps are now loaded when evaluating web pages to ingest from a Web feed
Added new GraphQL mutations for openConversation, closeConversation and undoConversation
Added timestamps to Conversation messages
Added new GraphQL mutations for openCollection and closeCollection
Added more configuration for content search: you can now specify searchType (KEYWORD, VECTOR, HYBRID) and queryType (SIMPLE, or FULL, aka Lucene syntax)
Better parsing of iTunes podcast metadata
⚡ Renamed listingLimit field on feeds to readLimit
⚡ Renamed topK to numberSimilar for content vector search type
⚡ Changed GraphQL feed properties: split out azure into azureBlob and azureFile properties
⚡ Changed GraphQL specification properties: split out openAI into openAI and azureOpenAI properties
⚡ Removed count fields on query results, and replaced with explicit count{Entity} queries, which support search and filtering.
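For client code that still builds request variables with the old field names, the two renames above could be applied mechanically. The following is a minimal sketch, not part of any Graphlit SDK: the helper name `migrate_field_names` and the payload shapes in the usage example are hypothetical, and only the old and new field names come from this changelog.

```python
from typing import Any, Dict

# Old field name -> new field name, taken from the renames listed above.
RENAMED_FIELDS: Dict[str, str] = {
    "listingLimit": "readLimit",
    "topK": "numberSimilar",
}


def migrate_field_names(value: Any) -> Any:
    """Return a copy of a nested dict/list payload with renamed keys.

    Hypothetical client-side helper: it walks the structure recursively
    and rewrites only the keys listed in RENAMED_FIELDS.
    """
    if isinstance(value, dict):
        return {
            RENAMED_FIELDS.get(key, key): migrate_field_names(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [migrate_field_names(item) for item in value]
    return value


# Usage with an illustrative (assumed) payload shape.
old_variables = {"feed": {"reddit": {"listingLimit": 10}}}
new_variables = migrate_field_names(old_variables)
```

The split of azure into azureBlob and azureFile (and of openAI into openAI and azureOpenAI) is deliberately left out of the mapping, since which new property applies depends on the feed or specification in question.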
Bugs Fixed
GPLA-1043: Reddit readLimit not taking effect: the specified limit of Reddit posts is now applied to Reddit feeds
GPLA-1064: Performance of entity extraction and observation creation for large PDFs was below expectations: knowledge graphs can now be built from large PDFs much faster (4x speed improvement)
GPLA-1053: If rendition generation errored during content workflow, the content was not properly marked as errored
GPLA-1102: Large Web sitemaps were slow to load; rewrote sitemap index handling, and now can process sitemaps with 150K+ entries in seconds.
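To illustrate what following a sitemap index involves, here is a minimal sketch based on the public sitemaps.org protocol. It is not Graphlit's implementation: the function name `collect_page_urls` and the injected `fetch` callable are assumptions made for the example, and a real loader would also need HTTP handling, size limits, and error recovery.

```python
import xml.etree.ElementTree as ET
from collections import deque
from typing import Callable, List

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_page_urls(sitemap_url: str, fetch: Callable[[str], str]) -> List[str]:
    """Return page URLs from a sitemap, following any <sitemapindex> links.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the sitemap XML as a string.
    """
    pages: List[str] = []
    seen = set()
    pending = deque([sitemap_url])
    while pending:
        url = pending.popleft()
        if url in seen:  # guard against indexes that reference each other
            continue
        seen.add(url)
        root = ET.fromstring(fetch(url))
        locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
        if root.tag == SITEMAP_NS + "sitemapindex":
            pending.extend(locs)  # each <loc> points at another sitemap
        else:
            pages.extend(locs)  # each <loc> is a page URL in a <urlset>
    return pages
```

Passing the fetcher in as a parameter keeps the traversal logic separate from network access, so it can be exercised with in-memory XML.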