Dataset asset, Open Source Community, Legal Data, Data Indexing
orbiter/bundestag_gesetze_index_bulk_20240507
The dataset Deutsche Bundesgesetze und -verordnungen (German federal laws and regulations) contains index files of German federal laws and regulations in Elasticsearch bulk format. The files were generated with the tool bundestag_gesetze_parser from source data published at https://www.gesetze-im-internet.de/. The dataset is primarily intended as a retrieval corpus for Retrieval-Augmented Generation (RAG) with large language models (LLMs). The README also gives detailed steps for importing the data into Elasticsearch and YaCy.
Source
hugging_face
Created
Nov 28, 2025
Updated
May 14, 2024
Signals
52 views
Availability
Linked source ready
Dataset Overview
Basic Information
- License: cc0-1.0
- Task Category: Text Generation
- Language: German
- Tags: Legal
- Size Category: 100K<n<1M
Dataset Content
- Name: Deutsche Bundesgesetze und -verordnungen
- Format: Elasticsearch Index Bulk Format
- Source: German laws from https://www.gesetze-im-internet.de/
Use Cases
- Serves as a retrieval corpus for RAG (Retrieval-Augmented Generation) with large language models.
- Elasticsearch is recommended for indexing the data, either as a full-text index or as an embedding-based semantic index.
Dataset Import
- Elasticsearch import: Start an Elasticsearch container with Docker and import the dataset via a series of curl commands.
- YaCy import: Data can also be imported via YaCy; detailed steps are described on the YaCy forum.
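The Elasticsearch import can also be scripted instead of run through curl. The following is a minimal sketch, not the procedure from the README: it assumes an Elasticsearch instance reachable at http://localhost:9200 without authentication, a bulk file that contains only index or create actions (so lines always come in action/source pairs), and a hypothetical local file name. It splits the file into smaller request bodies and posts each one to the standard `_bulk` endpoint.

```python
import json
import urllib.request
from typing import Iterable, Iterator, List


def chunk_bulk_lines(lines: Iterable[str], pairs_per_chunk: int = 500) -> Iterator[str]:
    """Group bulk-format lines into request bodies of whole action/source pairs.

    Assumption: every action line is followed by exactly one source line,
    which holds for index/create actions but not for delete actions.
    """
    buffer: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines between entries
        buffer.append(line)
        if len(buffer) == 2 * pairs_per_chunk:
            # The bulk API requires the body to end with a newline.
            yield "\n".join(buffer) + "\n"
            buffer = []
    if buffer:
        if len(buffer) % 2:
            raise ValueError("dangling action line without a source document")
        yield "\n".join(buffer) + "\n"


def post_bulk(body: str, es_url: str = "http://localhost:9200") -> dict:
    """POST one chunk to the _bulk endpoint and return the parsed JSON response."""
    request = urllib.request.Request(
        es_url + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def import_file(path: str, es_url: str = "http://localhost:9200") -> int:
    """Import a bulk file chunk by chunk; return how many chunks reported errors."""
    failed_chunks = 0
    with open(path, encoding="utf-8") as handle:
        for body in chunk_bulk_lines(handle):
            result = post_bulk(body, es_url)
            if result.get("errors"):
                failed_chunks += 1
    return failed_chunks


# Example call (hypothetical file name, adjust to the downloaded file):
# import_file("bundestag_gesetze_index_bulk.json")
```

Sending the file in chunks keeps each request well below Elasticsearch's HTTP request size limit; the chunk size of 500 pairs is an arbitrary starting point, not a value taken from the dataset documentation.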