A great place to start with publicly available APIs is the Public API GitHub site. The categories most likely to be useful are Books, Geocoding, Government, and Health.
The best current guide for APIs is MIT's APIs for Scholarly Resources. Each entry provides information on how to access the API, the file format of results, any limitations on access, and contact information. Because the guide was created by MIT staff, some resources on the list may not be available at Emory; check Databases@Emory for availability. Please also note that Emory has recently renegotiated its contract with Elsevier (which owns the ScienceDirect platform), and the new contract has more restrictive terms regarding text and data mining. Please contact Chris Palazzolo (below) before pursuing an API.
Note that in some cases you will need to access the resource through Emory's instance (i.e., through the proxy server).
For any general questions or inquiries, please contact Chris Palazzolo.
Web scraping typically refers to the systematic extraction of data (either automated or manual) from a website into a spreadsheet or database for later analysis and retrieval. Because of the licensing agreements Emory has with various publishers, using text scrapers or crawlers is typically prohibited, and users must instead access this information through the publisher's API or through other vendor/publisher tools (such as Gale's Digital Scholar Lab). In addition, most websites frown upon web scraping, citing copyright and legal considerations.
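To illustrate what working through an API (rather than a scraper) can look like, here is a minimal Python sketch. It is not based on any particular publisher's service: the endpoint `https://api.example.org/v1/search`, the `X-API-Key` header, the query parameters, and the response shape are all assumptions made for illustration. Consult the publisher's own API documentation and terms of use for the real values and limits.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint and key: replace with values from the publisher's API documentation.
BASE_URL = "https://api.example.org/v1/search"
API_KEY = "YOUR-API-KEY"


def build_url(query, page=1, per_page=25):
    """Build the request URL for one page of search results."""
    params = {"q": query, "page": page, "per_page": per_page}
    return f"{BASE_URL}?{urlencode(params)}"


def parse_titles(payload):
    """Extract titles from a JSON response body.

    Assumed (hypothetical) shape: {"results": [{"title": "..."}, ...]}
    """
    data = json.loads(payload)
    return [item["title"] for item in data.get("results", [])]


def fetch_page(query, page=1, delay_seconds=1.0):
    """Request one page of results, pausing first to respect rate limits."""
    time.sleep(delay_seconds)
    request = Request(
        build_url(query, page),
        # The header name is an assumption; many APIs use a different scheme.
        headers={"X-API-Key": API_KEY, "Accept": "application/json"},
    )
    with urlopen(request, timeout=30) as response:
        return parse_titles(response.read().decode("utf-8"))
```

The pause before each request is a simple way to stay within whatever access limitations the provider documents; the appropriate delay depends on the specific API.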