
Hamburg Regional Court ruling: AI training as permissible “text and data mining” under copyright law?
- Case number: 310 O 227/23
- Decision date: 27th September 2024
One statutory exception, created a few years ago in response to digital progress, is that of “text and data mining”. A distinction must be made between the provision that applies to everyone for personal use on the one hand and the more generous provision for research and cultural purposes on the other. In both cases, third parties are free to reproduce a work in order to automatically analyze texts and data in digital form and obtain information about patterns, trends and correlations, among other things. The prerequisite is that they have lawful access to the work and store the reproduction only to the extent necessary. The difference is that “text and data mining” for personal use may also be carried out for commercial purposes, but is not permitted if reproduction of the work has been expressly prohibited and this is indicated by a machine-readable reservation of use.
The extent to which the “text and data mining” exception applies to the training of AI systems is still unclear, partly because of a lack of relevant case law.
A decision has recently been issued by the Hamburg Regional Court (310 O 227/23). The defendant, an association, made a tabular document containing hyperlinks to publicly accessible images or image files available on the internet free of charge. This data set, comprising 5.85 billion image-text pairs, can also be used to train generative AI and is based on existing data sets on the internet. It also covers the photo at issue, which appeared on the website of an image agency and which the defendant automatically captured, downloaded, analyzed and included in its data set together with its metadata.
The plaintiff, the photographer who took the photo, had granted the photo agency a license to it. According to him, however, this license was limited to displaying the photo at issue on the agency’s website and to offering the photo itself or a license to it; it did not cover a grant of rights for the purpose of AI training. In his view, the reproduction made by the defendant as part of the analysis process constituted copyright infringement, and the “text and data mining” exception did not apply. The plaintiff further argued that aggregating data for the purpose of AI training constitutes a case of “AI web scraping” and that the work serves to create identical or similar competing products.
Result:
The Hamburg Regional Court ruled that the defendant’s reproduction of the photo at issue was in principle permissible “text and data mining” and dismissed the claim. The court specifically affirmed the exception for research purposes, but left open whether the exception for personal use applies, because some of the facts of the case pointed to a validly declared reservation of use on the photo agency’s website. Furthermore, the court subsumed only the defendant’s creation of the data set under “text and data mining” and left open whether the downstream training of AI systems also falls under it. In light of the grounds of the decision, this is likely to be the case.
Even if no clear legal conclusions can yet be drawn from the German court decision in question – especially for Austria – the decision can be seen as an initial indication that the reproduction of copyrighted works for the training of AI systems will constitute permissible “text and data mining”.