Text & Data Mining (TDM)

Would you like to use our licensed resources for a TDM project? We can offer support!

"Text and data mining" (TDM) refers to algorithm- or statistics-based analytical methods used to discover structured information, patterns or trends in digital text or other data sources. TDM projects depend on the availability of extensive data sets (Big Data).

We offer two platforms that provide news content for use in TDM projects:

  • With Swissdox@LiRI, you get access to the "Schweizerische Mediendatenbank" (SMD), with 30 million news articles from 300 sources that can be used for TDM.
  • TDM Studio is a cloud-based tool with access to licensed content from ProQuest, the highlights being "Global Newsstream" for international press and "Dissertations & Theses". Our information page has further details on available sources and coverage.

Unfortunately, most of our other licensed electronic resources are not available for TDM. Specifically, aggregated resources such as EBSCO, Factiva, WISO and LexisNexis cannot be used in this way. Furthermore, the web interfaces of the majority of our licensed products are not suited to downloading large amounts of data.

Systematic downloading, e.g. by using crawlers, scraping or scripts, is not allowed! Providers regard such activities and the use of such tools as an infringement of license agreements, which can lead to access to these resources being restricted or even blocked for the entire University!

The Library actively supports making licensed data available for academic re-use in TDM projects. We offer assistance so that you do not run into restrictions imposed by license agreements, copyright law or other regulations:

  • We can act as an intermediary between you and specific providers.
  • We check whether the data you need is generally available for TDM projects through our licensed electronic resources.
  • We clarify with providers whether and how the data you need can either be accessed by you directly or be made available by the provider.