September 4: Workflow configuration; support for Notion feeds; document OCR

New Features

  • 🔥
    Added the Workflow entity to the data model for configuring the stages of the content workflow. A Workflow can be assigned to a Feed, or passed to the ingestPage, ingestFile, or ingestText mutations, to control how content is ingested, prepared, extracted and enriched into the knowledge graph.
  • 💡
    Added support for Notion feeds: you can now create a feed to ingest files from Notion pages or databases (e.g. wikis).
  • 💡
    Added support for API-created Observation entities, which allow custom observations of observable entities (e.g. Person, Label) to be attached to Content.
  • 💡
    Added support for Azure AI Document Intelligence as an optional method for preparing PDF files, using OCR and advanced layout analysis.
  • 💡
    Added summarization strategies, which let content be summarized into paragraphs, bullet points or a headline.
  • Added more well-known link types recognized during link crawling, such as Discord, Airtable and TypeForm.
  • ℹ️
    The Free/Hobby plan now has a 5GB storage quota; any content ingested past that limit will be auto-deleted.
  • Actions have been moved into the Workflow entity.
  • Link enrichment for Feeds has been moved into the Workflow enrichment stage and is now called link crawling. The ExcludeContentDomain property has been inverted and renamed to IncludeContentDomain.
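
To illustrate how a Workflow groups per-stage settings, and what the ExcludeContentDomain to IncludeContentDomain reversal could mean for existing link settings, here is a minimal Python sketch. It only builds plain dictionaries and does not call the API. The stage keys, the `pdf` and `link` fields, the `AZURE_DOCUMENT_INTELLIGENCE` value, `maximumLinks`, and both helper functions are assumptions made for illustration; the actual schema may use different names, and the reversal is assumed here to be a simple boolean negation.

```python
# Stage names follow the wording of the release note ("ingested, prepared,
# extracted and enriched"); they are assumptions, not the confirmed schema.
STAGES = ("ingestion", "preparation", "extraction", "enrichment")


def build_workflow(name, **stage_settings):
    """Return a workflow description holding only the stages that were configured."""
    unknown = sorted(set(stage_settings) - set(STAGES))
    if unknown:
        raise ValueError(f"unknown stage(s): {unknown}")
    workflow = {"name": name}
    for stage in STAGES:
        if stage in stage_settings:
            workflow[stage] = stage_settings[stage]
    return workflow


def migrate_link_settings(old):
    """Rewrite settings that used excludeContentDomain into the reversed
    includeContentDomain form (assumed to be a plain negation)."""
    new = {k: v for k, v in old.items() if k != "excludeContentDomain"}
    if "excludeContentDomain" in old:
        new["includeContentDomain"] = not old["excludeContentDomain"]
    return new


# Hypothetical settings that previously lived on a Feed's link enrichment.
old_link_settings = {"excludeContentDomain": True, "maximumLinks": 10}

workflow = build_workflow(
    "docs-with-ocr",
    # Hypothetical switch selecting OCR-based preparation for PDF files.
    preparation={"pdf": "AZURE_DOCUMENT_INTELLIGENCE"},
    # Link crawling now sits in the enrichment stage.
    enrichment={"link": migrate_link_settings(old_link_settings)},
)
```

Keeping each stage under its own key mirrors the note's description of a Workflow as a set of configurable stages; stages that are not configured are simply left out of the dictionary.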

Bugs Fixed

  • GPLA-1204: Failed to ingest content with a backslash in its name.
  • GPLA-1276: Failed to ingest RSS posts that contained an enclosure URI but no post URI.
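
The RSS case in GPLA-1276 can be pictured with a small Python sketch: an item that carries an `<enclosure>` but no `<link>`. The fallback shown here, using the enclosure URL when the post URI is missing, is an assumption about one reasonable way to handle such items, not a description of the actual fix. The `post_uri` helper and the sample feed are hypothetical.

```python
import xml.etree.ElementTree as ET


def post_uri(item):
    """Return the <link> text of an RSS <item> when present, otherwise the
    url attribute of its <enclosure>, otherwise None."""
    link = item.findtext("link")
    if link and link.strip():
        return link.strip()
    enclosure = item.find("enclosure")
    if enclosure is not None:
        return enclosure.get("url") or None
    return None


# Hypothetical feed: the first item has only an enclosure, the second only a link.
SAMPLE = """<rss version="2.0"><channel>
  <item><title>Episode 1</title>
    <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
  </item>
  <item><title>Post 2</title><link>https://example.com/post-2</link></item>
</channel></rss>"""

items = ET.fromstring(SAMPLE).findall("./channel/item")
uris = [post_uri(item) for item in items]
```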