JUHE API Marketplace

AI-Ready LLMs.txt Generator from Screaming Frog Crawls

This n8n workflow generates an AI-ready `llms.txt` file from a Screaming Frog website crawl. It extracts key data from the Screaming Frog CSV export, filters URLs by status and indexability, and formats the output for use with language models. The result is a downloadable file that helps AI applications discover and analyze your site's most valuable content.

Workflow Overview

This workflow is ideal for:

  • SEO Professionals: Those who need to generate structured content files from website crawls to improve search engine optimization strategies.
  • Content Marketers: Individuals looking to curate high-quality content for AI models or content discovery.
  • Web Developers: Developers who want to automate the extraction and organization of website data for further analysis or integration.
  • Data Analysts: Analysts needing to process and filter large amounts of data from web crawls efficiently.
  • Small Business Owners: Owners of small websites who want to leverage AI for content generation without extensive technical knowledge.

Building an llms.txt file by hand from a Screaming Frog export is cumbersome and time-consuming, because every URL has to be checked against specific criteria before it is listed. This workflow automates that filtering so that only valuable content is included, which saves time and keeps the generated file suited to AI models.

The workflow runs in nine steps:

  1. Trigger the Workflow: The user fills out a form with the website name and description, and uploads the internal_html.csv file exported from Screaming Frog.
  2. Extract Data: The workflow extracts data from the uploaded CSV file, ensuring it is in a usable format for subsequent steps.
  3. Set Useful Fields: Key fields such as URL, title, description, status, indexability, content type, and word count are defined for processing.
  4. Filter URLs: The workflow filters URLs to retain only those with a 200 status code, marked as indexable, and with a content type of text/html.
  5. Classify Content (Optional): A text classifier can be activated to intelligently filter and classify URLs based on their content quality.
  6. Format Rows for llms.txt: Each URL is formatted into a specific row structure for the llms.txt file.
  7. Concatenate Rows: All formatted rows are concatenated to create a single output string.
  8. Set Content for llms.txt: The content of the llms.txt file is prepared, including the website title, description, and concatenated rows.
  9. Generate the File: Finally, the workflow creates the llms.txt file, which can be downloaded directly from n8n or uploaded to a preferred storage solution.
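
The filtering, formatting, and assembly steps above (2 to 4 and 6 to 8) can be sketched outside n8n in plain Python. This is a minimal illustration, not the workflow's actual node code. The column names (`Address`, `Title 1`, `Meta Description 1`, `Status Code`, `Indexability`, `Content Type`) are assumed from a typical Screaming Frog internal_html.csv export, and the row layout (`- [title](url): description`) is an assumed convention. Both may differ from what the workflow produces. The helper names `keep_row`, `format_row`, and `build_llms_txt` are hypothetical, and the optional classifier in step 5 is not covered.

```python
import csv
import io

# Assumed Screaming Frog column names; verify them against your own export.
COL_URL = "Address"
COL_TITLE = "Title 1"
COL_DESC = "Meta Description 1"
COL_STATUS = "Status Code"
COL_INDEXABILITY = "Indexability"
COL_CONTENT_TYPE = "Content Type"


def cell(row: dict, column: str) -> str:
    """Return a trimmed cell value, treating missing cells as empty."""
    return (row.get(column) or "").strip()


def keep_row(row: dict) -> bool:
    """Step 4: keep only pages with status 200, indexable, and text/html."""
    return (
        cell(row, COL_STATUS) == "200"
        and cell(row, COL_INDEXABILITY).lower() == "indexable"
        # Content types often carry a charset suffix, so match on the prefix.
        and cell(row, COL_CONTENT_TYPE).lower().startswith("text/html")
    )


def format_row(row: dict) -> str:
    """Step 6: format one URL as a link line (assumed row layout)."""
    url = cell(row, COL_URL)
    title = cell(row, COL_TITLE) or url  # fall back to the URL if untitled
    desc = cell(row, COL_DESC)
    line = f"- [{title}]({url})"
    return f"{line}: {desc}" if desc else line


def build_llms_txt(csv_text: str, site_name: str, site_description: str) -> str:
    """Steps 2 to 8: parse the CSV, filter, format, and assemble the file body."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [format_row(r) for r in reader if keep_row(r)]
    header = f"# {site_name}\n\n> {site_description}\n\n"
    return header + "\n".join(rows) + "\n"
```

To try it on a real export, read the file with `open("internal_html.csv", encoding="utf-8-sig", newline="")`, which also tolerates a byte-order mark if the export has one, then pass its contents along with the site name and description to `build_llms_txt` and write the result to `llms.txt`.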

Statistics

  • Nodes: 23
  • Downloads: 0
  • Views: 72
  • File Size: 17918

Quick Info

  • Categories: Complex Workflow, Manual Triggered
  • Complexity: complex

Tags

manual, advanced, noop, logic, conditional, complex, sticky note, files, and 7 more
