AI-powered automation: streamlined document processing for a German retail enterprise

About the client

Company name
Hagebau
Location
Soltau, Germany
Company size
1,400 employees
Industry
Building materials retail

Hagebau is one of Germany’s leading retailers in the building materials industry. Its network consists of 300 medium-sized shareholder companies, uniting independent businesses across different European countries. From specialty trade outlets to retail stores and online platforms, Hagebau operates across almost 1,500 locations.

The challenge

As a retail company, Hagebau works with various suppliers who manufacture building materials, some of which may be hazardous. To comply with regulatory requirements, such items come with safety data sheets. These documents outline important information about the product’s chemical composition and storage recommendations. Any updates to safety data sheets must also be reflected directly on product labels, both online and offline.

Previously, Hagebau had to pay for costly third-party solutions to process supplier documents and identify labeling data. However, the results came in incompatible formats, so Hagebau employees had to manually prepare the data for further processing in the company’s systems. The company wanted to improve this process and approached Lemberg Solutions with the following objectives:

Automate the process for extracting safety data sheet updates
Suppliers use different formats for their safety data sheets. Some documents even contain scanned images instead of text, which makes the content harder to extract. Hagebau needed a tool to extract data from a large number of data sheets and structure it into a standardized file.
Ensure seamless integration into the internal enterprise AI platform
Our client had their own AI platform that was already automating some tasks. Therefore, the new solution had to stay within the system’s boundaries while maintaining high accuracy.

Delivered value

AI-enabled automation for extracting updates from safety data sheets
The new system can process up to 10,000 complex, unstructured documents and accurately detect the required data fields. This allows the Hagebau team to access up-to-date and accurate safety information with minimal manual effort.
Reduced data extraction time and third-party vendor expenses
The automation of safety data sheet updates made work more efficient for Hagebau employees. They no longer need to invest significant time or rely on third-party providers to extract and structure critical information.
High accuracy
The solution is fine-tuned to provide safety data with high accuracy, considerably outperforming the manual process. Built-in validation and error logging mechanisms make outcomes transparent and easy to track.
Decreased AI implementation costs
High operational costs are a key barrier to AI adoption. Our engineers optimized the solution and the way the AI model is used to keep the cost of processing and extracting data controllable and low.

Solution

During the discovery phase, our team studied the architecture of Hagebau’s internal platform to ensure our solution would be a proper fit. The main requirement was to work with a specific LLM, OpenAI GPT-5.1, which became the core of our tool.

We decided that a stateless application was the most suitable approach for the client’s needs. The system runs in single-user mode, meaning only one user can run the app at a time. This keeps it efficient while also keeping operational costs low.

Once the files are uploaded into the system, our tool applies semantic search to find the relevant sections in safety data sheets. Specifically, it targets “Section 2: Hazards identification” with subsections “2.1 Classification of the substance or mixture” and “2.2 Label elements”.
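To make the idea of semantic section lookup more concrete, here is a minimal sketch of similarity-based ranking over document chunks. It is not the production implementation: the function names are hypothetical, and `toy_embed` is a plain word-count stand-in that only matches shared words. A real system would pass an embedding model as `embed` so that differently worded or German-language headings still score as similar.

```python
import math
import re
from collections import Counter
from typing import Callable

Vector = dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedder (plain word counts), used only to keep the sketch runnable."""
    return dict(Counter(re.findall(r"[a-zäöüß]+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def rank_chunks(query: str, chunks: list[str],
                embed: Callable[[str], Vector] = toy_embed, top_k: int = 2):
    """Return the top_k (score, chunk) pairs most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Illustrative chunks, not taken from a real safety data sheet.
chunks = [
    "SECTION 1: Identification of the company",
    "2.1 Classification of the substance or mixture: Flam. Liq. 2, H225",
    "2.2 Label elements: signal word Danger, hazard pictograms GHS02",
    "SECTION 7: Handling and storage",
]
top = rank_chunks(
    "classification substance mixture label elements hazard pictograms signal word",
    chunks,
)
```

With these sample chunks, the two Section 2 subsections rank above the unrelated sections, so only they would be passed on to the extraction step.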

Then the system performs structured data extraction to capture compliance-critical information: H-phrases, P-phrases, EUH phrases, signal words, and hazard pictograms.
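The extracted fields follow well-defined code formats, which makes it possible to check model output before it reaches a label. The sketch below is an illustration only, assuming a hypothetical `LabelData` record and simplified format patterns for GHS/CLP codes. It is not the validation logic used in the project.

```python
import re
from dataclasses import dataclass, field

# Simplified patterns: combined statements such as H302+H312 are allowed.
H_CODE = re.compile(r"^H[2-4]\d{2}[A-Za-z]{0,2}(\+H[2-4]\d{2}[A-Za-z]{0,2})*$")
P_CODE = re.compile(r"^P[1-5]\d{2}(\+P[1-5]\d{2})*$")
EUH_CODE = re.compile(r"^EUH\d{3}A?$")
PICTOGRAM = re.compile(r"^GHS0[1-9]$")
# English and German signal words.
SIGNAL_WORDS = {"Danger", "Warning", "Gefahr", "Achtung"}


@dataclass
class LabelData:
    """Hypothetical container for the fields extracted from one data sheet."""
    h_phrases: list[str] = field(default_factory=list)
    p_phrases: list[str] = field(default_factory=list)
    euh_phrases: list[str] = field(default_factory=list)
    signal_word: str = ""
    pictograms: list[str] = field(default_factory=list)


def validate(data: LabelData) -> list[str]:
    """Return human-readable problems; an empty list means all formats look valid."""
    problems = []
    checks = (
        ("h_phrases", H_CODE, data.h_phrases),
        ("p_phrases", P_CODE, data.p_phrases),
        ("euh_phrases", EUH_CODE, data.euh_phrases),
        ("pictograms", PICTOGRAM, data.pictograms),
    )
    for name, pattern, values in checks:
        for value in values:
            if not pattern.match(value):
                problems.append(f"{name}: unexpected code {value!r}")
    if data.signal_word and data.signal_word not in SIGNAL_WORDS:
        problems.append(f"signal_word: unexpected value {data.signal_word!r}")
    return problems


sample = LabelData(
    h_phrases=["H225", "H302+H312", "H36"],  # "H36" is deliberately malformed
    p_phrases=["P210", "P301+P310"],
    euh_phrases=["EUH066"],
    signal_word="Danger",
    pictograms=["GHS02", "GHS10"],  # "GHS10" does not exist
)
problems = validate(sample)
```

A check like this only confirms that a code is well-formed; it cannot tell whether the code actually appears in the source document.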

Despite variations in the structure and formatting of safety data sheets from different suppliers, the system automatically arranges the data in the relevant predefined columns of the Excel file. Alongside the final output, the system generates an error report highlighting any missing or inconsistent data. 
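The mapping from extracted records to fixed columns, together with a separate error report, could look roughly like the sketch below. The column names and the `REQUIRED` set are assumptions made for illustration, and CSV text is used as a dependency-free stand-in for the Excel files that the real system produces.

```python
import csv
import io

# Hypothetical column layout; the real Excel template is not public.
COLUMNS = ["file", "signal_word", "h_phrases", "p_phrases", "euh_phrases", "pictograms"]
# Hypothetical choice of columns whose absence is reported as an error.
REQUIRED = {"signal_word", "h_phrases"}


def build_outputs(records: list[dict]) -> tuple[str, str]:
    """Return (data_csv, error_csv) for a batch of extracted records."""
    data_buf, error_buf = io.StringIO(), io.StringIO()
    data_writer = csv.DictWriter(data_buf, fieldnames=COLUMNS)
    error_writer = csv.writer(error_buf)
    data_writer.writeheader()
    error_writer.writerow(["file", "column", "issue"])
    for record in records:
        row = {}
        for column in COLUMNS:
            value = record.get(column)
            if isinstance(value, (list, tuple)):
                value = "; ".join(value)  # several codes share one cell
            if not value:
                value = ""
                if column in REQUIRED:
                    error_writer.writerow([record.get("file", "?"), column, "missing"])
            row[column] = value
        data_writer.writerow(row)
    return data_buf.getvalue(), error_buf.getvalue()


records = [
    {"file": "a.pdf", "signal_word": "Danger", "h_phrases": ["H225", "H319"],
     "p_phrases": ["P210"], "pictograms": ["GHS02"]},
    {"file": "b.pdf", "h_phrases": [], "p_phrases": ["P102"]},
]
data_csv, error_csv = build_outputs(records)
```

Every input record still produces a row in the main output, so a sheet with gaps stays visible in both files instead of silently disappearing.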

This smart automation application is also built to be resilient to any operational issues. Without interrupting the overall workflow, it can handle situations such as broken source files or temporary service disruptions.
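One common way to get this kind of resilience is to retry temporary failures with exponential backoff and to isolate errors per file, so that a single bad document never stops the batch. The sketch below illustrates that pattern under stated assumptions: `TransientError`, `with_retries`, and `process_batch` are hypothetical names, and the source does not describe which mechanism the project actually uses.

```python
import time
from typing import Callable, Iterable


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def with_retries(task: Callable[[], dict], attempts: int = 3,
                 base_delay: float = 1.0, sleep: Callable[[float], None] = time.sleep) -> dict:
    """Run task, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except TransientError:
            if attempt == attempts:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")


def process_batch(files: Iterable[str], extract: Callable[[str], dict],
                  sleep: Callable[[float], None] = time.sleep) -> tuple[dict, dict]:
    """Process every file; failures are recorded per file instead of aborting the run."""
    results, errors = {}, {}
    for name in files:
        try:
            results[name] = with_retries(lambda: extract(name), sleep=sleep)
        except Exception as exc:  # broken file or retries exhausted
            errors[name] = f"{type(exc).__name__}: {exc}"
    return results, errors
```

The `errors` mapping is the natural input for an error report, while `results` only ever contains files that were processed successfully.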

Technologies
Python
React
OpenAI SDK
Docker

By developing a customized solution to extract safety data sheets on our Hagebau AI platform in collaboration with Lemberg Solutions, we can now provide our stores with the product data required by regulations faster and at a lower cost. The collaboration went smoothly: on time, on budget, and within the scope.

Hauke Kay Pless
AI Innovation Manager at Hagebau

How it works

User uploads documents
Users can either upload multiple PDF files directly or provide a single Excel file containing a list of document URLs.
System validates data input
The frontend performs initial input validation, after which the backend checks whether all files have the expected structure. Each file is then preprocessed to meet the constraints of the LLM.
AI model analyzes and extracts data
The system quickly identifies relevant sections and extracts the required data into a standardized Excel format.
The system sends results back to the user
Via email, the user receives the final Excel file, ready for product labeling. A detailed error log is sent alongside in a separate Excel file.
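The input validation step in the flow above can be illustrated with a small sketch. It assumes two basic checks that are not confirmed by the project description: uploaded files must start with the PDF header bytes, and listed URLs must be well-formed HTTP(S) links. All function names are hypothetical.

```python
from urllib.parse import urlparse

PDF_MAGIC = b"%PDF-"  # every PDF file starts with this header


def looks_like_pdf(data: bytes) -> bool:
    """Cheap structural check before any expensive processing."""
    return data.startswith(PDF_MAGIC)


def is_fetchable_url(value: str) -> bool:
    """Accept only absolute http(s) URLs with a host part."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def validate_uploads(files: dict[str, bytes]) -> dict[str, str]:
    """Return {file name: reason} for every rejected upload."""
    rejected = {}
    for name, data in files.items():
        if not data:
            rejected[name] = "empty file"
        elif not looks_like_pdf(data):
            rejected[name] = "missing PDF header"
    return rejected


def validate_urls(urls: list[str]) -> list[str]:
    """Return the URLs that cannot be downloaded as given."""
    return [url for url in urls if not is_fetchable_url(url)]


rejected = validate_uploads({
    "sheet_a.pdf": b"%PDF-1.7\n% sample content",
    "sheet_b.pdf": b"<html>error page</html>",
    "sheet_c.pdf": b"",
})
bad_urls = validate_urls(["https://example.com/sds.pdf", "ftp://example.com/sds.pdf", "not a url"])
```

Rejecting malformed input this early means no model calls are spent on files that could never be processed.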
