July 15: Support for SharePoint feeds, new Conversation features
Added support for SharePoint feeds: you can now create a feed to ingest files from a SharePoint document library (and, optionally, a folder within the document library)
Added support for PII detection during entity extraction from text documents and audio transcripts: labels such as PII: Social Security Number are now created automatically when PII is detected
Added support for developer's own OpenAI API keys and Azure OpenAI deployments in Specifications
Changed semantics of deleteFeed to delete the contents ingested by the feed. Since contents are linked to feeds, a feed can now be disabled while keeping the lineage between the feed and its contents; if a feed is deleted, its linked contents are deleted too, so the feed-to-content lineage is never lost
Added GraphQL query for SharePoint consent URI, for registered Graphlit Platform Azure AD application
Better handling of web sitemap indexes: if a sitemap.xml contains a sitemapindex element, we now load all linked sitemaps when evaluating web pages to ingest from a Web feed
Added new GraphQL mutations for openConversation, closeConversation and undoConversation
Added timestamps to Conversation messages
Added new GraphQL mutations for openCollection and closeCollection
Added more configuration for content search: you can now specify searchType (KEYWORD, VECTOR, HYBRID) and queryType (SIMPLE, FULL - aka Lucene syntax)
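As a rough illustration of the new search options, the sketch below builds a filter payload that carries searchType and queryType. Only those two field names and their enum values come from these release notes; the search key, the overall payload shape, the defaults, and the helper name build_content_filter are assumptions made for the example, not the documented schema.

```python
from enum import Enum
from typing import Any, Dict


class SearchType(str, Enum):
    KEYWORD = "KEYWORD"
    VECTOR = "VECTOR"
    HYBRID = "HYBRID"


class QueryType(str, Enum):
    SIMPLE = "SIMPLE"
    FULL = "FULL"  # full Lucene query syntax


def build_content_filter(
    search: str,
    search_type: SearchType = SearchType.HYBRID,  # illustrative default, not a documented one
    query_type: QueryType = QueryType.SIMPLE,     # illustrative default, not a documented one
) -> Dict[str, Any]:
    """Build a hypothetical filter dict to pass as GraphQL variables.

    The "search" key and the flat shape are assumptions; check the
    actual schema before relying on them.
    """
    return {
        "search": search,
        "searchType": search_type.value,
        "queryType": query_type.value,
    }
```

For example, build_content_filter("title:graph*", SearchType.KEYWORD, QueryType.FULL) would request a keyword search whose text is interpreted with Lucene syntax.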
Better parsing of iTunes podcast metadata
Renamed listingLimit field on feeds to readLimit
Renamed topK to numberSimilar for content vector search type
Changed GraphQL feed properties: split out azure into azureBlob and azureFile properties
Changed GraphQL specification properties: split out openAI into openAI and azureOpenAI properties
Removed count fields on query results, and replaced with explicit count{Entity} queries, which support search and filtering.
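Client code that still sends the old field names may need updating. A minimal sketch of a migration helper for the two pure renames above (listingLimit to readLimit, topK to numberSimilar) follows. The helper name and the example payload shape are hypothetical, and the azure and openAI splits are left out on purpose because they depend on which storage or model service is in use and cannot be renamed mechanically.

```python
from typing import Any

# Pure renames taken from the release notes above.
RENAMED_FIELDS = {
    "listingLimit": "readLimit",
    "topK": "numberSimilar",
}


def migrate_field_names(payload: Any) -> Any:
    """Return a copy of payload with renamed keys applied at every nesting level."""
    if isinstance(payload, dict):
        return {
            RENAMED_FIELDS.get(key, key): migrate_field_names(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [migrate_field_names(item) for item in payload]
    return payload  # scalars are returned unchanged
```

The function does not mutate its input, so it can be applied to stored request templates before they are sent.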
GPLA-1043: Reddit readLimit was not taking effect: Reddit feeds now honor the specified limit of Reddit posts
GPLA-1064: Entity extraction and observation creation for large PDFs was slower than expected: knowledge graphs can now be built from large PDFs much faster (4x speed improvement)
GPLA-1053: If rendition generation errored during content workflow, the content was not properly marked as errored
GPLA-1102: Large Web sitemaps were slow to load; rewrote sitemap index handling, and now can process sitemaps with 150K+ entries in seconds.
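To illustrate what following a sitemap index involves, here is a minimal sketch based on the public sitemaps.org format. It is not the platform's implementation; the function name collect_page_urls and the fetch callable (which takes a URL and returns the XML as a string) are assumptions introduced for the example.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Optional, Set

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_page_urls(
    url: str,
    fetch: Callable[[str], str],
    seen: Optional[Set[str]] = None,
) -> List[str]:
    """Return page URLs from a sitemap, following sitemapindex entries recursively."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against indexes that reference each other
        return []
    seen.add(url)

    root = ET.fromstring(fetch(url))
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]

    if root.tag == SITEMAP_NS + "sitemapindex":
        # Each <loc> points at another sitemap: load it and gather its pages.
        pages: List[str] = []
        for child_url in locs:
            pages.extend(collect_page_urls(child_url, fetch, seen))
        return pages

    # A plain <urlset>: each <loc> is a page URL.
    return locs
```

Passing fetch in as a parameter keeps the sketch free of any particular HTTP client and makes it easy to exercise with in-memory documents.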