Text and data mining (TDM) uses automated tools to identify and extract data relevant to one's research from large or numerous sources. By processing data in this way, researchers hope to reveal trends or patterns. TDM is used in both the humanities and the sciences and can be applied to a wide variety of datasets. Researchers can use some AI and machine-learning tools for TDM, but these tools are not required for TDM-based research.
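As a rough illustration of what "identifying and extracting data to show trends" can look like in practice, here is a minimal Python sketch that counts how often a term appears per year in a small corpus. The function name `term_counts_by_year`, the `(year, text)` input format, and the toy corpus are all assumptions made for this example; a real project would read from a source whose license permits TDM and would likely use more careful text processing.

```python
import re
from collections import Counter


def term_counts_by_year(documents, term):
    """Count whole-word, case-insensitive matches of `term` per year.

    `documents` is an iterable of (year, text) pairs (a hypothetical format).
    Returns a dict mapping each year to its total match count, sorted by year.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    counts = Counter()
    for year, text in documents:
        # Adding zero still records the year, so years with no matches appear.
        counts[year] += len(pattern.findall(text))
    return dict(sorted(counts.items()))


# Hypothetical toy corpus, invented purely for illustration.
corpus = [
    (1990, "The telegraph office closed. The telegraph was obsolete."),
    (1990, "Email arrived on campus."),
    (2000, "Email replaced the memo; email was everywhere. Email!"),
]

email_trend = term_counts_by_year(corpus, "email")
print(email_trend)
```

Note that this sketch uses no AI or machine-learning tools at all, only pattern matching and counting, which is consistent with the point that such tools are optional for TDM-based research.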
We are currently updating our policies and guidelines on acquiring, using, and accessing e-resources for TDM projects and/or AI large language models. The AI environment is rapidly changing! Current guidelines are now available. Where legally and financially possible, Emory Libraries will liaise with vendors to accommodate researchers who wish to use corpora derived from licensed e-resources for computational analysis and AI learning. Before using AI tools with any licensed library resource, or accessing large amounts of its content, researchers should contact the library for more information about the governing terms. Here are some example terms and conditions.
As a general rule, check with the relevant subject librarian before beginning any project that involves use of Emory e-resources for TDM or AI purposes! Here are some questions to consider as you develop a proposal that uses TDM or AI techniques.
Databases often have their own rules and restrictions on what is and is not permissible when it comes to applying TDM methods to their data. In addition, access to these databases comes in a variety of forms, mediated by Emory Libraries.
Broadly, databases fall into four categories.