JUHE API Marketplace

Gmail to Vector Embeddings with PGVector and Ollama

Workflow Overview

This workflow extracts emails from Gmail, converts them into structured records, and generates vector embeddings so that messages can be searched by similarity. It imports mail in bulk, one weekly date range at a time, so large mailboxes can be processed and queried without manual sorting. It starts from a manual trigger, so it runs only on demand, and it relies on LangChain integration for the text-splitting and embedding steps.

  • Data Scientists: Those looking to analyze and extract insights from email data.
  • Marketing Professionals: Individuals who want to leverage email data for targeted campaigns.
  • Developers: Coders who seek to automate email processing and storage.
  • Business Analysts: Analysts interested in understanding communication trends and patterns within email data.
  • Researchers: Academics who require structured data from emails for studies.

This workflow automates the process of extracting email data from Gmail, transforming it into structured records and vector embeddings for advanced analysis. It solves the problem of manual data entry and organization, enabling users to efficiently manage large volumes of emails and perform similarity searches on the content.
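
As a rough illustration of what a similarity search over stored embeddings computes, the sketch below ranks text chunks by cosine distance in plain Python. It is a toy under stated assumptions: the three-dimensional vectors, the sample labels, and the function names are made up for the example. In the actual setup the vectors would come from the embedding model and the ranking would be done by PGVector inside PostgreSQL, whose `<=>` operator computes cosine distance.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: list[float], rows: list[tuple[str, list[float]]], k: int = 2):
    """Return the k rows whose vectors are closest to the query vector."""
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]


# Hypothetical stored chunks: (label, embedding). Real embeddings have
# hundreds of dimensions; three are used here to keep the example readable.
stored = [
    ("invoice", [0.9, 0.1, 0.0]),
    ("meeting", [0.0, 1.0, 0.0]),
    ("receipt", [0.8, 0.2, 0.1]),
]

closest = nearest([1.0, 0.0, 0.0], stored, k=2)
```

In SQL, the same ranking would be an `ORDER BY` on the distance between the stored vector column and the query vector, followed by a `LIMIT`; the column names depend on how the embeddings table is defined and are not given here.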

  • Manual Trigger: The workflow begins when manually triggered by the user.
  • Create the Table: A PostgreSQL table named emails_metadata is created if it doesn't already exist, providing structured storage for email data.
  • Explode Interval into Weeks: The workflow calculates weekly intervals from a specified Gmail account creation date, generating a list of weeks.
  • Set Before and After Dates: It assigns the after and before dates based on the generated weeks for filtering emails.
  • Get a Batch of Messages: Emails received between the specified dates are fetched from Gmail.
  • Extract Email Fields: Relevant fields such as email_text, email_from, date, and email_id are extracted from the fetched emails.
  • Store Structured: The extracted email data is inserted or updated in the emails_metadata table in PostgreSQL.
  • Loop Over Items: Each email is processed in batches, allowing for scalable handling of large datasets.
  • Recursive Character Text Splitter: Email text is split into manageable chunks for further processing.
  • Embeddings Ollama: The split email text is transformed into vector embeddings using the nomic-embed-text model.
  • Store Vectorized: The generated vector embeddings are stored in a separate emails_embeddings table for similarity searches.
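
The date-window steps (Explode Interval into Weeks, Set Before and After Dates) and the chunking step can be sketched in plain Python. This is a minimal illustration, not the workflow's actual node code: the function names, the example dates, and the chunk sizes are assumptions, and the fixed-size chunker is a simplified stand-in for the Recursive Character Text Splitter. The `after:` and `before:` strings follow Gmail's search operator syntax.

```python
from datetime import date, timedelta


def explode_into_weeks(start: date, end: date) -> list[tuple[date, date]]:
    """Split [start, end) into consecutive windows of at most 7 days."""
    windows = []
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=7), end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


def gmail_query(after: date, before: date) -> str:
    """Build a Gmail search string for one window."""
    return f"after:{after:%Y/%m/%d} before:{before:%Y/%m/%d}"


def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Cut text into fixed-size chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += chunk_size - overlap
    return chunks


# Example run with an assumed account creation date and end date.
windows = explode_into_weeks(date(2024, 1, 1), date(2024, 1, 20))
queries = [gmail_query(after, before) for after, before in windows]
chunks = split_text("x" * 1000, chunk_size=500, overlap=50)
```

Adjacent windows share a boundary date, so a message could in principle be returned twice. Because the Store Structured step inserts or updates rows, a repeat fetch should overwrite the existing record rather than duplicate it, assuming the update is keyed on email_id.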

Statistics

Nodes: 20
Downloads: 0
Views: 16
File Size: 11752

Quick Info

Categories: Communication & Messaging, Complex Workflow, +2
Complexity: complex

Tags

manual, advanced, noop, logic, conditional, complex, sticky note, langchain, +7 more