October 21: Support OpenAI, Cohere, Jina, Mistral, Voyage and Google AI embedding models
Graphlit now supports the configuration of image and text embedding models at the Project level. You can create an embedding specification for a text or image embedding model and assign it to the Project; all further embedding requests will then use that model. See this Colab notebook for an example of how to configure the project.
Graphlit now supports the OpenAI Embedding-3-Small and Embedding-3-Large, Cohere Embed 3.0, Jina Embed 3.0, Mistral Embed, and Voyage 2.0 and 3.0 text embedding models. Graphlit also now supports Jina CLIP image embeddings, which are used by default for image search.
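For illustration, creating a text embedding specification and assigning it to the project might look roughly like the sketch below. The input field names, enum values, and the `updateProject` shape are assumptions based on the description above, not verified API shapes:

```graphql
# Hypothetical sketch: create an OpenAI Embedding-3-Small text
# embedding specification (field and enum names are assumptions).
mutation CreateEmbeddingSpecification {
  createSpecification(specification: {
    name: "OpenAI Embedding-3-Small"
    type: TEXT_EMBEDDING
    serviceType: OPEN_AI
    openAI: { model: EMBEDDING_3_SMALL }
  }) {
    id
  }
}

# Assign the specification at the Project level, so all further
# embedding requests use this model (project input shape assumed).
mutation AssignProjectEmbeddings {
  updateProject(project: {
    embeddings: { textSpecification: { id: "SPECIFICATION_ID" } }
  }) {
    id
  }
}
```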
Graphlit now supports the `chunkTokenLimit` property in Specifications, which specifies the number of tokens for each embedded text chunk. If this is not configured, Graphlit defaults to 600 tokens per chunk.
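For example, the limit might be raised on a specification like so. This is a hedged sketch: the property placement comes from the description above, the rest is assumed:

```graphql
# Hypothetical sketch: override the 600-token default chunk size
# on an embedding specification.
mutation CreateLargeChunkSpecification {
  createSpecification(specification: {
    name: "Large-chunk embeddings"
    type: TEXT_EMBEDDING
    chunkTokenLimit: 1000  # each embedded text chunk holds up to 1000 tokens
  }) {
    id
  }
}
```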
Graphlit now supports the Voyage reranking model.
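Presumably reranking is configured on a specification as well; a hedged sketch follows, where the `rerankingStrategy` field and `VOYAGE` enum value are assumptions rather than confirmed names:

```graphql
# Hypothetical sketch: enable Voyage reranking of retrieved results
# on a specification (field and enum names are assumptions).
mutation CreateRerankingSpecification {
  createSpecification(specification: {
    name: "Voyage reranking"
    rerankingStrategy: { serviceType: VOYAGE }
  }) {
    id
  }
}
```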
Graphlit now supports the `ingestTextBatch` mutation, which accepts an array of text and name pairs and asynchronously ingests them into content objects.
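A hedged sketch of the batch call follows; only the mutation name comes from the changelog, while the argument name and per-item fields are assumptions based on the "text and name pairs" description:

```graphql
# Hypothetical sketch: asynchronously ingest two text snippets as
# content objects (argument and field names are assumptions).
mutation IngestTextBatch {
  ingestTextBatch(batch: [
    { name: "Release Notes", text: "Graphlit now supports project-level embedding models." }
    { name: "FAQ Entry", text: "Text embeddings are not compatible across models." }
  ]) {
    id  # ingestion is asynchronous; poll content state for completion
  }
}
```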
We have moved the `chunkTokenLimit` property from the Workflow storage embeddings strategy to the Specification object. The Workflow storage property has now been deprecated.
We have deprecated the `openAIImage` property in the Workflow entity extraction properties. Use the `modelImage` property instead.
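A hedged before-and-after sketch is shown below; only the `openAIImage` and `modelImage` property names come from the changelog, and the surrounding workflow extraction connector shape is assumed:

```graphql
# Hypothetical sketch: image entity extraction in a Workflow.
# Deprecated: connector: { type: OPEN_AI_IMAGE, openAIImage: { ... } }
mutation CreateImageExtractionWorkflow {
  createWorkflow(workflow: {
    name: "Image entity extraction"
    extraction: {
      jobs: [
        {
          connector: {
            type: MODEL_IMAGE  # enum value assumed
            modelImage: { specification: { id: "SPECIFICATION_ID" } }
          }
        }
      ]
    }
  }) {
    id
  }
}
```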
Once the text embedding model has been updated at the project level, any existing content, conversations, or observed entities will no longer be semantically searchable. Text embeddings are not compatible across models, so you will need to delete and reingest any content, or recreate conversations and knowledge graph entities, with the new embedding model for them to become searchable again.
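Re-embedding existing content might look like the following hedged sketch; the `deleteContents` and `ingestUri` mutation names are assumptions based on Graphlit's general API conventions:

```graphql
# Hypothetical sketch: after switching the project's embedding model,
# delete stale content (its old embeddings are unusable)...
mutation DeleteStaleContent {
  deleteContents(ids: ["CONTENT_ID"]) {
    id
  }
}

# ...then reingest the same source, so its embeddings are generated
# with the newly assigned project-level embedding specification.
mutation ReingestContent {
  ingestUri(uri: "https://example.com/document.pdf") {
    id
  }
}
```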