
TDM Studio

(ProQuest)

ProQuest TDM Studio is a cloud-based tool that lets researchers at HSG apply text and data mining (TDM) to large sets of content from news, scholarly, and other publications that the Library of the University of St.Gallen licenses from ProQuest. TDM Studio offers two levels of working with the data: Visualizations and Workbench.

 

TDM Studio Visualizations and TDM Studio Workbench compared:

Corpus
  • Visualizations: Limited number of Newspapers covering primarily the 1990s to the present based on current HSG subscriptions: Chicago Tribune, Globe and Mail, The Guardian, Los Angeles Times, New York Times, Sydney Morning Herald, Times of India, Wall Street Journal, Washington Post, ProQuest Dissertations and Theses.
  • Workbench: Includes most scholarly journals, newspapers, industry and popular magazines (ProQuest One Business and Global Newsstream), dissertations, theses, and other primary source texts for all time periods and publication dates available through HSG subscriptions to ProQuest databases: List of available titles.

Coding Skills
  • Visualizations: No advanced coding skills are required to mine and generate data visualisations. Employs a graphical user interface and provides pre-built visualisations one can apply to the content being analyzed, including support of geographical analysis, topic modeling, and sentiment analysis.
  • Workbench: Requires one or more team members with knowledge of either R or Python programming languages for use within TDM Studio's Jupyter Notebook coding environment.

Access
  • Visualizations: All HSG students, faculty, and staff with current HSG account.
  • Workbench: All HSG students, faculty, and staff with current HSG account. Research teams (2-5 people) who need to collaborate on a project using the same workbench should email tdmstudio@clarivate.com to request the research team workbench be created. At least one team member must be a current HSG faculty, staff, or student with a current HSG account.

Period of Access for Research or Teaching
  • Visualizations: 24/7, for as long as you are a current HSG student, faculty, or staff member with a valid HSG account.
  • Workbench: 24/7, for as long as you are a current HSG student, faculty, or staff member with a valid HSG account.

Storage Limits
  • Visualizations: Each researcher can work on up to a maximum of 10 projects simultaneously, each consisting of 10,000 documents or less.
  • Workbench: Research team members can work on as many as ten dataset projects simultaneously, each consisting of as many as 2 million documents.

Data Export Limits
  • Visualizations: Citation and geographic locations from each specific dataset may be exported. Screen captures of visualisations may be taken, saved, and published. The corpus of full text remains in the TDM Studio. The full text data sets cannot be exported.
  • Workbench: Rolling seven-day maximum limit of 30 MB is available for export outside of the TDM Studio environment. The corpus of full text remains in the TDM Studio. The full text data sets cannot be exported, only programmes and secondary analysis.

 

HSG students, faculty, and staff must register to use TDM Studio:

  1. Go to: https://tdmstudio.proquest.com/createaccount
  2. Submit your account information, using your unisg email address
  3. You will receive an email message asking you to verify your email address. 
  4. Follow the steps to verify your email address
  5. Log in and start using TDM Studio

 

Quick help

Available to all HSG students, faculty, and staff, the Visualizations component of TDM Studio does not require advanced coding skills. It supports point-and-click creation of data visualisations, enabling users to:

  • Mine and analyse thousands of articles in up to 10 newspapers
  • Visualise global trends by comparing impact of a news topic across multiple geographic locations
  • View interactive chronological displays to reveal changes in the subject of news coverage over time
  • Use an embedded topic modeling component to analyze major themes in those news reports
  • Manage as many as 10 simultaneous research projects of 10,000 documents each
NOTE: Depending on your query, it may take hours to produce the visualisation. 

The range of content available in Visualizations is limited. Visualizations currently includes dissertations, theses, and a very small selection of newspapers: Chicago Tribune, The Guardian, Globe and Mail, Los Angeles Times, New York Times, Sydney Morning Herald, Wall Street Journal and Washington Post. You can see the full list as well as date coverage in our current title list.  For some newspaper dates, TDM is only available for the citation or abstract.    

Current HSG faculty, staff, and students can register for access to the ProQuest TDM Studio Workbench.

Individuals or research teams of up to 5 persons wishing to use the TDM Studio Workbench should:

  • Create a TDM Studio Account 
  • If working with a team, the Team Lead should email tdmstudio@clarivate.com to request that additional users be added to the workbench. Team Leads may request that up to 4 additional researchers be added. Include the researchers' first and last names and email addresses. Note: the Team Lead should be someone with an HSG account, although additional researchers can be from other institutions.
  • The Team Lead should schedule an Onboarding and Question & Answer session using the TDM Studio self-scheduling tool (office365.com). Attention: the times offered are in US Eastern Time, which is normally six hours behind local time in St.Gallen.
  • To access your Workbench, log in on the TDM Studio page: https://tdmstudio.proquest.com/home.
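
The note about US Eastern Time can be checked with a few lines of Python. This is only a sketch: it assumes that "America/New_York" matches the time zone used by the scheduling tool and that "Europe/Zurich" is your local zone, and the helper names `to_local` and `offset_hours` are made up for illustration. It also shows why the difference is only "normally" six hours: the United States and Europe change their clocks on different dates.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

EASTERN = ZoneInfo("America/New_York")  # assumed zone of the scheduling tool
ZURICH = ZoneInfo("Europe/Zurich")      # assumed local zone for St.Gallen


def to_local(year, month, day, hour, minute=0):
    """Convert a slot shown in US Eastern Time to Europe/Zurich time."""
    slot = datetime(year, month, day, hour, minute, tzinfo=EASTERN)
    return slot.astimezone(ZURICH)


def offset_hours(year, month, day, hour=12):
    """Return how many hours Zurich is ahead of US Eastern at that moment."""
    slot = datetime(year, month, day, hour, tzinfo=EASTERN)
    delta = slot.astimezone(ZURICH).utcoffset() - slot.utcoffset()
    return delta.total_seconds() / 3600


if __name__ == "__main__":
    # A 10:00 Eastern slot in mid-January is 16:00 in Zurich.
    print(to_local(2024, 1, 15, 10).strftime("%Y-%m-%d %H:%M"))
    # In mid-March 2024 the US had already switched to daylight saving
    # time but Europe had not, so the gap was five hours instead of six.
    print(offset_hours(2024, 1, 15), offset_hours(2024, 3, 15))
```

Checking the offset for the actual date of the session avoids being an hour off during the weeks in spring and autumn when only one side has changed its clocks.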

 

The Fine Print
Access Requirements
  • All Users are responsible for following and abiding by the ProQuest TDM Studio terms of use.
  • For Research Teams
    • Each Team member must have an individual TDM Studio Account.
    • At least one Team member must be a current HSG faculty, staff, or student with a valid HSG account.
    • One HSG-affiliated Team member will be designated as the Team Lead and will handle all communication between the Team, the HSG Library, and ProQuest TDM Studio support. 
    • A Team may include researchers who are not at HSG if they are working on a team with one or more HSG-affiliated researchers.
    • ProQuest will create a Workbench and provide passwords to the individual members of the Research Team.
Data Limits
  • Up to ten datasets can exist on the Workbench at any one time, each containing up to 2 million documents.
  • The Team mines content from sources aggregated within ProQuest databases to which the HSG Library has current licensing agreements/subscriptions in place with ProQuest.
    • Title List
    • Contact elibrary@unisg.ch to verify which publication titles of interest are contained within TDM Studio.
  • A maximum of 30 MB per week is available for export outside of TDM Studio. NOTE: the corpus of text remains in TDM Studio. Datasets cannot be exported. Scripts and analysis results created by the Team may be exported from TDM Studio. Screenshots of the visualisations may be taken. If you have problems with export, please contact tdmstudio@clarivate.com.
  • The Team's files will be deleted from the Workbench at the end of the Team's scheduled time.
Metadata Export

You can download up to 1 million metadata records per week.
There are two types of metadata exports available: 

  • Citation Metadata: This export file provides metadata fields typically used to cite individual documents such as “Title”, “Date”, and “Author(s)”.
  • Extended Metadata: This export file includes all of the fields in Citation Metadata but also includes additional, valuable metadata for text and data mining purposes such as subject fields and extra publication information. 

Both metadata exports can be analyzed using various tools and methods outside of TDM Studio. The metadata export files have been tailored to the two most commonly used data types in TDM Studio: Newspapers and Dissertations.
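
As an illustration of working with an export outside of TDM Studio, the following Python sketch counts records per publication year. It rests on assumptions that are not confirmed by this page: that the export is a CSV file, that its columns are named "Title", "Date", and "Author(s)" as in the field names quoted above, and that "Date" starts with a four-digit year. The function name `documents_per_year` and the sample rows are invented placeholders, so adjust them to the file you actually download.

```python
import csv
import io
from collections import Counter


def documents_per_year(csv_file):
    """Count records per publication year in a citation metadata export.

    Assumes a CSV with a "Date" column whose value starts with a
    four-digit year; anything else is counted under "unknown".
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        year = (row.get("Date") or "").strip()[:4]
        if len(year) == 4 and year.isdigit():
            counts[year] += 1
        else:
            counts["unknown"] += 1
    return counts


# Invented placeholder rows standing in for a real export file.
SAMPLE = """Title,Date,Author(s)
Example headline A,2019-03-02,"Doe, Jane"
Example headline B,2019-11-20,"Roe, Richard"
Example headline C,2021-01-05,
"""

counts = documents_per_year(io.StringIO(SAMPLE))
for year, n in sorted(counts.items()):
    print(year, n)
```

For a real file, replace `io.StringIO(SAMPLE)` with `open("export.csv", newline="", encoding="utf-8")`, where the file name is again only a placeholder.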

Downloading
  • All members of a Team have read/write access to all data, programmes, and results.
  • TDM Studio users will be able to analyze, but not download, full data sets. The full corpus of text remains in TDM Studio.
  • Team members can download all of their programmes and related analytical results, up to 30 MB per week. If you have problems with export, please contact tdmstudio@clarivate.com.
  • Downloads are queued and team members are sent an email when the download is ready.
  • The link in the email will work only once.
  • Team member coordination for the use of the emailed links is advised. 
  • The Team should retain search strings used to retrieve content so that the searches can be re-run at a later time if necessary.
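
Because the 30 MB export allowance is shared by the whole Team, it can help to keep a simple log of past exports. The Python sketch below shows one way to check whether a planned export still fits. It is not part of TDM Studio: it assumes the limit applies to a rolling seven-day window and that 1 MB means 1024 × 1024 bytes, and the names `remaining_budget` and `can_export` are hypothetical. How ProQuest actually measures the limit may differ.

```python
from datetime import datetime, timedelta

LIMIT_BYTES = 30 * 1024 * 1024  # 30 MB, assuming 1 MB = 1024 * 1024 bytes
WINDOW = timedelta(days=7)      # assumed rolling seven-day window


def remaining_budget(past_exports, now):
    """Return the bytes still available for export at time `now`.

    `past_exports` is a list of (timestamp, size_in_bytes) tuples kept
    by the Team; only exports from the last seven days are counted.
    """
    used = sum(
        size
        for when, size in past_exports
        if when <= now and now - when < WINDOW
    )
    return max(LIMIT_BYTES - used, 0)


def can_export(past_exports, size, now):
    """Return True if an export of `size` bytes fits the remaining budget."""
    return size <= remaining_budget(past_exports, now)


MB = 1024 * 1024
log = [
    (datetime(2024, 5, 1, 9, 0), 20 * MB),  # older than seven days, ignored
    (datetime(2024, 5, 8, 9, 0), 12 * MB),  # still inside the window
]
now = datetime(2024, 5, 10, 12, 0)
print(remaining_budget(log, now) // MB, "MB left")
print(can_export(log, 19 * MB, now))
```

With the example log, only the 12 MB export from 8 May counts, so 18 MB remain and a 19 MB export would not fit.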

ProQuest offers a guide with documentation on getting started with TDM Studio.  More documentation is contained in the Jupyter Notebook in the TDM Studio Workbench.

Content-related questions can be sent to elibrary@unisg.ch.
