This workflow is designed for:

Data Scientists: Who need to automate loading, processing, and storing document embeddings.
Developers: Who want an efficient way to pull files from Google Drive and load them into a database.
Researchers: Who need a systematic way to manage and analyze large document sets in PDF, text, and JSON formats.
Business Analysts: Who want to use document data for insights and reporting.
Automation Enthusiasts: Who want to streamline their workflows and minimize manual tasks.

This workflow addresses the challenge of:

Manual File Handling: Reducing the time spent downloading, processing, and storing files from Google Drive.
Data Integration: Loading PDF, text, and JSON documents into a PostgreSQL database for further analysis.
File Organization: Automatically moving processed files to a designated folder, so they stay organized and easy to find.
Complex Workflow Management: Handling several document types and their embeddings in one structured, automated process.

The workflow runs through these steps:

Schedule Trigger: The workflow starts automatically every day at 3 AM.
Search Folder: It searches a specific Google Drive folder for files to process.
Loop Over Items: Each file found is processed one by one.
Download File: Each file is downloaded from Google Drive.
Switch Node: The workflow determines the file type (PDF, text, or JSON) based on its MIME type.
Extract from File: Depending on the file type, the appropriate extraction method is applied:
- For PDFs, it uses the Extract from PDF node.
- For text files, it uses the Extract from Text node.
- For JSON files, it uses the Extract from JSON node.
Embeddings OpenAI: Embeddings are generated from the extracted text with an OpenAI embedding model.
Postgres PGVector Store: The embeddings are stored in a PostgreSQL database through the PGVector vector store.
Move File: Finally, the processed file is moved to a designated Google Drive folder, which keeps processed files organized.
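
The Loop Over Items, Switch, and Extract from File steps above can be sketched outside n8n as a small dispatch table keyed by MIME type. This is only an illustration under stated assumptions, not the workflow's actual implementation: the function names (`extract_pdf`, `extract_text`, `extract_json`, `route`, `process`), the exact MIME strings, and the `(name, mime_type, data)` tuple shape are all hypothetical, and PDF parsing is left as a placeholder because it needs a third-party library.

```python
import json
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical extractors standing in for the workflow's extraction nodes.
def extract_pdf(data: bytes) -> str:
    # Placeholder: real PDF text extraction needs a PDF library.
    raise NotImplementedError("plug in a PDF text extractor here")

def extract_text(data: bytes) -> str:
    # Assumes UTF-8 encoded text files.
    return data.decode("utf-8")

def extract_json(data: bytes) -> str:
    # Parse and re-serialize so the text handed to the embedding step is normalized.
    return json.dumps(json.loads(data.decode("utf-8")), sort_keys=True)

# MIME type -> extractor, mirroring the three branches of the Switch step.
# The MIME strings are assumptions about what Google Drive reports.
EXTRACTORS: Dict[str, Callable[[bytes], str]] = {
    "application/pdf": extract_pdf,
    "text/plain": extract_text,
    "application/json": extract_json,
}

def route(mime_type: str) -> Callable[[bytes], str]:
    """Return the extractor for a MIME type, or raise for unsupported types."""
    try:
        return EXTRACTORS[mime_type]
    except KeyError:
        raise ValueError(f"unsupported MIME type: {mime_type}") from None

def process(files: Iterable[Tuple[str, str, bytes]]) -> Dict[str, str]:
    """Handle (name, mime_type, data) tuples one at a time and extract their text."""
    results: Dict[str, str] = {}
    for name, mime_type, data in files:
        results[name] = route(mime_type)(data)
    return results
```

Calling `process([("notes.txt", "text/plain", b"hello")])` would return a dictionary mapping `notes.txt` to its extracted text. In the real workflow, that text would then go to the embedding and PGVector storage steps, and the file would be moved afterwards. Raising on unknown MIME types is one possible design choice; skipping such files would be equally reasonable.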

Vector DB Loader from Google Drive