ProQuest TDM Studio is a cloud-based tool that lets HSG researchers perform text and data mining (TDM) on large sets of content from news, scholarly, and other publications that the Library of the University of St.Gallen licenses from ProQuest. TDM Studio offers two levels of working with the data: Visualizations and Workbench.
Feature | TDM Studio Visualizations | TDM Studio Workbench |
Corpus | A limited number of newspapers, covering primarily the 1990s to the present and based on current HSG subscriptions (Chicago Tribune, Globe and Mail, The Guardian, Los Angeles Times, New York Times, Sydney Morning Herald, Times of India, Wall Street Journal, Washington Post), plus ProQuest Dissertations and Theses. | Includes most scholarly journals, newspapers, industry and popular magazines (ProQuest One Business and Global Newsstream), dissertations, theses, and other primary source texts for all time periods and publication dates available through HSG subscriptions to ProQuest databases: List of available titles. |
Coding Skills | No advanced coding skills are required to mine data and generate visualisations. Employs a graphical user interface and provides pre-built visualisations that can be applied to the content being analyzed, including support for geographical analysis, topic modeling, and sentiment analysis. | Requires one or more team members with knowledge of the R or Python programming language for use within TDM Studio's Jupyter Notebook coding environment. |
Access | All HSG students, faculty, and staff with a current HSG account. | All HSG students, faculty, and staff with a current HSG account. Research teams (2-5 people) who need to collaborate on a project using the same workbench should email tdmstudio@clarivate.com to request that a research team workbench be created. At least one team member must be a current HSG student, faculty member, or staff member with a current HSG account. |
Period of Access for Research or Teaching | 24/7, for as long as you are a current HSG student, faculty, or staff member with a valid HSG account. | 24/7, for as long as you are a current HSG student, faculty, or staff member with a valid HSG account. |
Storage Limits | Each researcher can work on up to 10 projects simultaneously, each consisting of 10,000 documents or fewer. | Research team members can work on up to 10 dataset projects simultaneously, each consisting of up to 2 million documents. |
Data Export Limits | Citation data and geographic locations from each specific dataset may be exported. Screen captures of visualisations may be taken, saved, and published. The full-text corpus remains in TDM Studio; full-text datasets cannot be exported. | A rolling seven-day limit of 30 MB applies to exports out of the TDM Studio environment. The full-text corpus remains in TDM Studio; full-text datasets cannot be exported, only programs (code) and secondary analysis. |
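The rolling seven-day export limit listed for the Workbench can be read as a sliding window: only exports made in the last seven days count against the 30 MB. The following is a minimal sketch of that arithmetic, not a description of how TDM Studio enforces the limit. The function name `remaining_quota`, the sample export log, and the reading of "MB" as 2^20 bytes are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumption: 30 MB is read as 30 * 2**20 bytes; the service may count differently.
LIMIT_BYTES = 30 * 1024 * 1024
WINDOW = timedelta(days=7)


def remaining_quota(exports, now):
    """Return the bytes still exportable at `now`.

    `exports` is a list of (timestamp, size_in_bytes) pairs (hypothetical log).
    Only exports made within the seven days before `now` count as used.
    """
    used = sum(size for ts, size in exports if now - WINDOW < ts <= now)
    return max(LIMIT_BYTES - used, 0)


# Hypothetical export history: 20 MB on 1 January, 5 MB on 5 January.
exports = [
    (datetime(2024, 1, 1, 9, 0), 20 * 1024 * 1024),
    (datetime(2024, 1, 5, 9, 0), 5 * 1024 * 1024),
]

left_on_jan_6 = remaining_quota(exports, datetime(2024, 1, 6, 9, 0))
left_on_jan_9 = remaining_quota(exports, datetime(2024, 1, 9, 9, 0))
print(left_on_jan_6 // (1024 * 1024), "MB left on 6 January")
print(left_on_jan_9 // (1024 * 1024), "MB left on 9 January")
```

In this sketch, the export from 1 January stops counting once it is more than seven days old, so the available quota grows again without any fixed weekly reset.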
HSG students, faculty, and staff must register to use TDM Studio:
Available to all HSG students, faculty, and staff, the Visualizations component of TDM Studio does not require advanced coding skills. It supports point-and-click creation of data visualisations, enabling users to:
The range of content available in Visualizations is limited. Visualizations currently includes dissertations, theses, and a very small selection of newspapers: Chicago Tribune, The Guardian, Globe and Mail, Los Angeles Times, New York Times, Sydney Morning Herald, Wall Street Journal and Washington Post. You can see the full list, as well as date coverage, in our current title list. For some newspaper dates, only the citation or abstract is available for TDM.
Current HSG faculty, staff, and students can register for access to the ProQuest TDM Studio Workbench.
Individuals or research teams of up to five people wishing to use the TDM Studio Workbench should:
You can download up to 1 million metadata records per week.
There are two types of metadata exports available:
Both metadata exports can be analyzed using various tools and methods outside of TDM Studio. The metadata export files have been tailored to the two most commonly used data types in TDM Studio: Newspapers and Dissertations.
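As one illustration of analysing a metadata export outside TDM Studio, the sketch below counts records per publication using only the Python standard library. It assumes the export is available as CSV text; the column names (`Title`, `Publication`, `Date`), the sample rows, and the helper `count_by_column` are hypothetical and do not reflect the documented export schema, so adjust them to match the actual file.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for a newspaper metadata export.
# Column names and rows are assumptions, not the real TDM Studio schema.
SAMPLE_EXPORT = """Title,Publication,Date
Example headline A,New York Times,2020-03-01
Example headline B,The Guardian,2020-03-02
Example headline C,New York Times,2021-07-15
"""


def count_by_column(csv_file, column):
    """Count how many records share each value in the given column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# With a real export, replace io.StringIO(...) with
# open("your_export.csv", newline="", encoding="utf-8").
counts = count_by_column(io.StringIO(SAMPLE_EXPORT), "Publication")
for publication, n in counts.most_common():
    print(f"{publication}: {n}")
```

The same helper can be pointed at any other column, for example a date or subject field, once the real column names are known.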
ProQuest offers a guide with documentation on getting started with TDM Studio. More documentation is contained in the Jupyter Notebook in the TDM Studio Workbench.
Content-related questions can be sent to elibrary@unisg.ch.