{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Web Scraping with the Art of Literary Text Analysis\n", "\n", "Our objective here is to create a local copy of all the articles in the journal _Digital Humanities Quarterly_. As always, there are multiple ways to accomplish this, but this will be our plan:\n", "\n", "* find a page that lists all the articles (since one of those conveniently exists)\n", "* fetch its contents\n", "* create a list of the URLs that the article titles link to\n", "* loop through the URLs and save the contents of each one locally\n", "\n", "We'll begin by fetching the page that lists the articles, something we already saw in the [Getting Started](GettingStarted.ipynb) notebook. We import the built-in [urllib.request](https://docs.python.org/3/library/urllib.request.html#module-urllib.request) library, define the URL to fetch, and fetch it (as HTML source code, so plain text)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "b'\\n\\nDHQ: Digital Humanities Quarterly: Title Index
ISSN 1938-4122
DHQ: Digital Humanities Quarterly

Title Index

ABCDEFGHIJKLMNOPQRSTUVWXYZ

A

Acknowledgements and Dedications, 2009: v3 n1
Gregory Crane, Tufts University; Brent Seales, University of Kentucky; Melissa Terras, University College London
An Agent-based Model for the Humanities, 2013: v7 n1
Belinda Roman, St. Mary\\'s University
Agent-Based Modeling and Historical\\n Simulation, 2014: v8 n4
Michael Gavin, University of South Carolina
All and Each: A Socio-Technical Review of the Europeana\\n Project, 2017: v11 n3
Rhiannon Stephanie Bettivia, University of Illinois, Urbana-Champaign; Elizabeth Stainforth, University of Leeds
All Hope Abandon: Biblical Text and Interactive Fiction, 2007: v1 n2
Eric Eve, Harris Manchester College, University of Oxford
All Relate to Art:\\n The William Blake Archive and Its Web of Relations, 2018: v12 n1
Michael Fox, University of North Carolina-Chapel Hill; Joseph Fletcher, University of North Carolina-Chapel Hill
The Almanac Archive: Theorizing\\n Marginalia and Duplicate Copies in the Digital Realm, 2016: v10 n1
Lindsey Eckert, Georgia State University; Julia Grandison, University of Toronto
[en]Analyzing the interlocking spatio-temporal scales of\\n a forest: from geographic information systems to the SyMoGIH method (Avesnois,\\n France), 2018: v12 n1
Marie Delcourte-Debarre, Universit\\xc3\\xa9 de Valenciennes et du Hainaut-Cambr\\xc3\\xa9sis
Aporias of the Digital Avant-Garde, 2007: v1 n2
Steve F. Anderson, University of Southern California
The App-Maker Model: An Embodied Expansion of Mobile\\n Cyberinfrastructure, 2016: v10 n3
Brett Oppegaard, University of Hawaii; Michael Rabby, Washington State University Vancouver
Archival Liveness: Designing with Collections Before and\\n During Cataloguing and Digitization, 2015: v9 n3
Tom Schofield, Culture Lab, Newcastle University; David Kirk, Digital Interactions Group, Newcastle University; Telmo Amaral, Digital Interactions Group, Newcastle University; Marian D\\xc3\\xb6rk, Potsdam University of Applied Sciences, Institute for Urban Futures; Mitchell Whitelaw, Faculty of Arts and Design, University of Canberra; Guy Schofield, Digital Interactions Group, Newcastle University; Thomas Ploetz, Digital Interactions Group, Newcastle University
The Archive as Repertoire: Transience and Sustainability in\\n Digital Archives , 2016: v10 n4
Miguel Escobar Varela, National University of Singapore
As You Can See: Applying Visual Collaborative Filtering to Works of Art, 2008: v2 n1
Gerhard Jan Nauta, Leiden University
Attention Ecology: Trend Circulation and the Virality\\n Threshold, 2016: v10 n4
Nicholas M Van Horn, Capital University; Aaron Beveridge, University of Florida; Sean Morey, University of Tennessee, Knoxville
Automated Pattern Analysis in Gesture Research: Similarity\\n Measuring in 3D Motion Capture Models of Communicative Action, 2017: v11 n2
Daniel Sch\\xc3\\xbcller, Natural Media Lab, Human Technology Centre, RWTH Aachen University; Christian Beecks, University of M\\xc3\\xbcnster; Marwan Hassani, Data Management and Exploration Group, RWTH Aachen University; Jennifer Hinnell, Department of Linguistics, University of Alberta; Bela Brenger, Natural Media Lab, Human Technology Centre, RWTH Aachen University; Thomas Seidl, Ludwig Maximilian University of Munich; Irene Mittelberg, Natural Media Lab, Human Technology Centre, RWTH Aachen University
Avatar Emergency, 2011: v5 n3
Gregory L. Ulmer, Professor of English and Media Studies University of Florida, Gainesville
Avatari: Disruption and Imago in Video Games, 2009: v3 n3
Philip Sandifer, University of Florida

B

Because It\\'s Not There: Ekphrasis and the Threat of Graphics in\\n Interactive Fiction, 2011: v5 n1
Aaron Kashtan, Department of English, University of Florida
Behind the Scenes of a Dissertation in Comics Form, 2015: v9 n4
Nick Sousanis, University of Calgary
[en]Between words and music: towards a digital\\n methodology for analysing song settings of the poems of Charles Baudelaire, 2018: v12 n1
Caroline Ardrey, The University of Birmingham; Myl\\xc3\\xa8ne Dubiau, Universit\\xc3\\xa9 de Toulouse \\xe2\\x80\\x94 Jean Jaur\\xc3\\xa8s; Helen Abbott, The University of Birmingham
Beyond Gutenberg: Transcending the Document Paradigm in\\n Digital Humanities, 2014: v8 n4
David Schloen, University of Chicago; Sandra Schloen, University of Chicago
Beyond Representation: Embodied Expression and Social Me-dia, 2012: v6 n2
Lissa Holloway-Attaway, Blekinge Tekniska H\\xc3\\xb6gskola
Beyond the Margins: Intersectionality and the Digital\\n Humanities, 2015: v9 n2
Roopika Risam, Salem State University
BigDIVA and Networked Browsing: A Case for Generous\\n Interfacing and Joyous Searching, 2018: v12 n2
Joel Schneier, North Carolina State University; Timothy Stinson, North Carolina State University; Matthew Davis, McMaster University
The Boundless Book: A Conversation between the Pre-modern and\\n Posthuman, 2013: v7 n1
Alison Tara Walker, Saint Louis University
Building a Student-Centered (Digital) Learning Community With\\n Undergraduates, 2017: v11 n3
Danica Savonick, The Graduate Center, CUNY; Lisa Tagliaferri, The Graduate Center, CUNY
Building a Toolkit for Digital Pedagogy, 2017: v11 n3
Alex Christie, Brock University
Building A Volunteer Community: Results and Findings from\\n Transcribe Bentham\\n , 2012: v6 n2
Tim Causer, Bentham Project, University College London; Valerie Wallace, Bentham Project, University College London,and Center for History and Economics, Harvard University
Building Better Digital Humanities Tools: Toward broader\\n audiences and user-centered designs, 2012: v6 n2
Fred Gibbs, George Mason University; Trevor Owens, Library of Congress
Burying Dead Projects: Depositing the\\n Globalization Compendium, 2014: v8 n2
Geoffrey Rockwell, University of Alberta; Shawn Day, University College Cork; Joyce Yu, University of Alberta; Maureen Engel, University of Alberta
By the People, For the\\n People: Assessing the Value of Crowdsourced, User-Generated\\n Metadata, 2015: v9 n1
Christina Manzo, Simmons College, USA; Geoff Kaufman, Tiltfactor Laboratory, Dartmouth College, USA; Sukdith Punjasthitkul, Tiltfactor Laboratory, Dartmouth College, USA; Mary Flanagan, Tiltfactor Laboratory, Dartmouth College, USA

C

[es]Cartograf\\xc3\\xadas de la sociedad\\n red, 2018: v12 n1
Paulo Antonio Gatica Cote, Universidad de Salamanca
[en]Cartographies of the Network\\n Society, 2018: v12 n1
Paulo Antonio Gatica Cote, Universidad de Salamanca
Cervantes Project: The Digital Quixote Iconography Collection, 2009: v3 n3
Eduardo Urbina, Texas A&M University; Richard Furuta, Texas A&M University; Steven E. Smith, Texas A&M University
Circling around texts and language: towards\\n pragmatic modelling in Digital Humanities, 2016: v10 n3
Arianna Ciula, University of Roehampton; Cristina Marras, Istituto per il Lessico Intellettuale Europeo e Storia delle Idee, Consiglio Nazionale delle Ricerche
Citation in Classical Studies, 2009: v3 n1
Neel Smith, College of the Holy Cross
[en]Citizen Laboratories and Digital\\n Humanities, 2018: v12 n1
Paola Ricaurte Quijano, Tecnol\\xc3\\xb3gico de Monterrey
Classics in the Million Book Library, 2009: v3 n1
Gregory Crane, Tufts University; Alison Babeu, Tufts University; David Bamman, Tufts University; Thomas Breuel, Technical University of Kaiserslautern; Lisa Cerrato, Tufts University; Daniel Deckers, Hamburg University; Anke L\\xc3\\xbcdeling, Humboldt-University, Berlin; David Mimno, University of Massachusetts, Amherst; Rashmi Singhal, Tufts University; David A. Smith, University of Massachusetts, Amherst; Amir Zeldes, Humboldt-University, Berlin
Coca-Cola: An Icon of the American Way of Life.\\n An Iterative Text Mining Workflow for Analyzing Advertisements in Dutch\\n Twentieth-Century Newspapers, 2017: v11 n4
Melvin Wevers, DH Group, KNAW HUC, Amsterdam, The Netherlands; Jesper Verhoef, TU Delft, Delft, The Netherlands
Code as Ritualized Poetry: The Tactics of the Transborder Immigrant Tool, 2013: v7 n1
Mark C. Marino, University of Southern California
Collaboration in Digital Humanities Research \\xe2\\x80\\x93\\n Persisting Silences, 2018: v12 n1
Gabriele Griffin, Uppsala University; Matt Steven Hayler, Birmingham University, UK
\\n Collaboration Must Be Fundamental or It\\'s Not Going to Work: an Oral\\n History Conversation between Harold Short and Julianne Nyhan , 2012: v6 n3
Harold Short, King\\'s College London and University of Western Sydney; Julianne Nyhan, University College London; Anne Welsh, University College London; Jessica Salmon, University of Trier
Comic Book Markup Language: An Introduction and Rationale, 2012: v6 n1
John A. Walsh, Indiana University
Communitizing Electronic Literature, 2009: v3 n2
Scott Rettberg, The University of Bergen Dept. of Literary, Linguistic, and Aesthetic Studies
Comparative rates of text reuse in classical Latin hexameter\\n poetry, 2015: v9 n3
Neil Bernstein, Ohio University; Kyle Gervais, University of Western Ontario; Wei Lin, Ohio University
Comparing Disciplinary Patterns: Exploring the Humanities\\n through the Lens of Scholarly Communication, 2017: v11 n2
Daniel Burckhardt, Humboldt-Universit\\xc3\\xa4t zu Berlin
Computational Linguistics and Classical Lexicography, 2009: v3 n1
David Bamman, Tufts University; Gregory Crane, Tufts University
Computational Stylistic Analysis of\\n Popular Songs of Japanese Female Singer-songwriters, 2014: v8 n1
Takafumi Suzuki, Toyo University; Mai Hosoya, Toyo University
Computers, Comics and Cult Status: A Forensics of Digital\\n Graphic Novels, 2014: v8 n3
Jaime Lee Kirtz, University of Colorado
Conclusion: Cyberinfrastructure, the Scaife Digital Library and Classics in a\\n Digital age, 2009: v3 n1
Christopher Blackwell, Furman University; Gregory Crane, Tufts University
Conjectural Criticism: Computing Past and Future Texts, 2009: v3 n4
Kari Kraus, College of Information Studies and Department of English, University of Maryland
[en]The Conquest of Jerusalem: by Cervantes?\\n Styometric analysis on authorship in the Golden Age Spanish theater, 2018: v12 n1
Jos\\xc3\\xa9 Calvo Tello, Universidad de W\\xc3\\xbcrzburg; Juan Cerezo Soler, Universidad Aut\\xc3\\xb3noma de Madrid
Continuous Integration and Unit Testing of\\n Digital Editions, 2017: v11 n4
Bridget Almas, The Alpheios Project, Ltd.; Thibault Cl\\xc3\\xa9rice, Centre Jean-Mabillon (\\xc3\\x89cole des chartes) - PSL
Covers and Corpus wanted! Some Digital Humanities\\n Fragments , 2016: v10 n3
Claire Clivaz, Swiss Institute of Bioinformatics
Crafting in\\n Games, 2017: v11 n4
April Grow, University of California Santa Cruz; Melanie Dickinson, University of California Santa Cruz; Johnathan Pagnutti, University of California Santa Cruz; Noah Wardrip-Fruin, University of California Santa Cruz; Michael Mateas, University of California Santa Cruz
Crafting the User-Centered Document Interface: The Hypertext\\n Editing System (HES) and the File Retrieval and Editing System (FRESS), 2010: v4 n1
Belinda Barnet, Lecturer in Media at Swinburne University Melbourne, in association with Smart Services CRC.
Creating a regional DH community \\xe2\\x80\\x93 A Case Study of the\\n RedHD, 2015: v9 n3
Isabel Galina Russell, Instituto de Investigaciones Bibliotecol\\xc3\\xb3gicas, Universidad Nacional Aut\\xc3\\xb3noma de M\\xc3\\xa9xico (UNAM)
Creative Data Literacy: A Constructionist\\n Approach to Teaching Information Visualization , 2018: v12 n4
Catherine D\\'Ignazio, Emerson College; Rahul Bhargava, Massachusetts Institute of Technology
Criminal Code: Procedural Logic and Rhetorical Excess in\\n Videogames, 2013: v7 n1
Mark L. Sample, George Mason University
[en]Criminocorpus. A digital project for History of\\n Justice, 2018: v12 n1
Marc Renneville, CLAMOR. Center for Digital Humanities and History of Justice (CNRS); Jean-Lucien Sanchez, CLAMOR. Center for Digital Humanities and History of Justice (CNRS); Sophie Victorien, CLAMOR. Center for Digital Humanities and History of Justice (CNRS)
[fr]Criminocorpus. Un projet num\\xc3\\xa9rique pour l\\'histoire\\n de la justice, 2018: v12 n1
Marc Renneville, CLAMOR. Center for Digital Humanities and History of Justice (CNRS); Jean-Lucien Sanchez, CLAMOR. Center for Digital Humanities and History of Justice (CNRS); Sophie Victorien, CLAMOR. Center for Digital Humanities and History of Justice (CNRS)
A Culture of non-citation: Assessing the digital impact of\\n British History Online and the Early English Books Online Text Creation\\n Partnership, 2017: v11 n1
Jonathan Blaney, Institute of Historical Research, University of London; Judith Siefring, Bodleian Libraries, University of Oxford
Curating Digital Spaces, Making Visual Arguments: A Case Study in New Media Presentations of Ancient Objects, 2013: v7 n2
Daniel Price, University of Houston; Rex Koontz, University of Houston; Lauren Lovings, Independent scholar
Curating Electronic Literature as Critical and Scholarly\\n Practice, 2014: v8 n4
Dene Grigar, Washington State University Vancouver
cut to fit the tool-spun course, 2013: v7 n1
Nick Montfort, MIT; Stephanie Strickland, Independent scholar
Cyberinfrastructure for Classical Philology, 2009: v3 n1
Gregory Crane, Tufts University; Brent Seales, University of Kentucky; Melissa Terras, University College London

D

Data Assemblages: A Call to Conceptualize Materiality in the\\n Academic Ecosystem, 2015: v9 n2
Nabeel Siddiqui, College of William and Mary
The Data Sprint Approach: Exploring the field of Digital\\n Humanities through Amazon\\xe2\\x80\\x99s Application Programming Interface, 2015: v9 n3
David M. Berry, University of Sussex; Erik Borra, University of Amsterdam; Anne Helmond, University of Amsterdam; Jean-Christophe Plantin, London School of Economics and Political Science; Jill Walker Rettberg, University of Bergen
Deconstructing Bricolage: Interactive Online Analysis of Compiled\\n Texts with Factotum, 2015: v9 n1
Tomas Zahora, Monash University; Dmitri Nikulin, Google; Constant J. Mews, Monash University; David Squire, Monash University
Defining scholarly practices, methods and tools\\n in the Lithuanian digital humanities research community, 2018: v12 n4
Ingrida Kelp\\xc5\\xa1ien\\xc4\\x97, Vilnius University, Lithuania
A Design Methodology for Web-based Sound Archives, 2014: v8 n2
Annie Murray, University of Calgary; Jared Wiercinski, Concordia University
The Design of an International Social Media Event: A Day in the Life of the Digital Humanities, 2012: v6 n2
Geoffrey Rockwell, University of Alberta; Peter Organisciak, University of Illinois, Urbana-Champaign; Megan Meredith-Lobay, University of Alberta; Kamal Ranaweera, University of Alberta; Stan Ruecker, Illinois Institute of Technology; Julianne Nyhan, University College London
Designing Choreographies for the New Economy of Attention\\n , 2009: v3 n2
Eric Gordon, Emerson College; David Bogen, Rhode Island School of Design
Designing Data Mining Droplets: New Interface Objects for the Humanities\\n Scholar, 2009: v3 n3
Stan Ruecker, University of Alberta, Canada; Milena Radzikowska, Mount Royal College, Canada; St\\xc3\\xa9fan Sinclair, McMaster University, Canada
Determining Value for Digital Humanities Tools: Report on a Survey\\n of Tool Developers , 2010: v4 n2
Susan Schreibman, Digital Humanities Observatory; Ann M. Hanlon, Marquette University
DH for History Students: A Case Study at the Facultad de\\n Filosof\\xc3\\xada y Letras (National Autonomous University of Mexico) , 2017: v11 n3
Adriana \\xc3\\x81lvarez S\\xc3\\xa1nchez, National Autonomous University of Mexico; Miriam Pe\\xc3\\xb1a Pimentel, National Autonomous University of Mexico
DHBeNeLux: Incubator for Digital Humanities in Belgium, the\\n Netherlands and Luxembourg, 2017: v11 n4
Joris van Zundert, Huygens Institute for the History of the Netherlands, Royal Netherlands Academy of Arts and Sciences; Sally Chambers, Ghent Centre for Digital Humanities, Ghent University; Mike Kestemont, Antwerp University; Marijn Koolen, Netherlands Institute for Sound and Vision; Catherine Jones, University of Luxembourg
DHQ in the Public Eye, 2007: v1 n2
Melissa Terras, University College London
Diachronic trends in Homeric translations, 2017: v11 n2
Yuri Bizzoni, University of Gothenburg; Marianne Reboul, Universit\\xc3\\xa9 Paris-Sorbonne; Angelo Del Grosso, Institute for Computational Liguistics A. Zampolli
Digital Caricature, 2014: v8 n3
Sean Sturm, The University of Auckland; Stephen Francis Turner, The University of Auckland
The Digital Classicist: building a Digital\\n Humanities Community, 2017: v11 n3
Simon Mahony, University College London
Digital Criticism: Editorial Standards for the Homer Multitext , 2009: v3 n1
Casey Du\\xc3\\xa9, University of Houston, Texas; Mary Ebbott, College of the Holy Cross
Digital Encoding as a Hermeneutic and Semiotic Act: The Case of\\n Valerio Magrelli, 2010: v4 n1
Domenico Fiormonte, Universit\\xc3\\xa0 Roma Tre, Dipartimento di Italianistica; Valentina Martiradonna, Universit\\xc3\\xa0 di Roma, La Sapienza; Desmond Schmidt, Queensland University of Technology, Information Security Institute
The Digital Future is Now: A Call to Action for the Humanities, 2009: v3 n4
Christine L. Borgman, Professor & Presidential Chair in Information Studies, UCLA
The Digital Future of Humanities through the Lens of DIY\\n Culture, 2016: v10 n4
Henriette Roued-Cunlife, University of Copenhagen
Digital Geography and Classics, 2009: v3 n1
Tom Elliott, New York University; Sean Gillies, New York University
A Digital Humanities Approach to Narrative Voice in The Secret Scripture: Proposing a New Research\\n Method, 2014: v8 n2
Sonia Howell, University of Notre Dame, USA; Margaret Kelleher, University College Dublin (UCD); Aja Teehan, National University of Ireland, Maynooth, Ireland; John Keating, National University of Ireland, Maynooth, Ireland
Digital Humanities, Copyright Law, and the Literary, 2013: v7 n1
Robin Wharton, Independent Scholar
Digital Humanities in the 21st Century: Digital Material as a\\n Driving Force , 2016: v10 n2
Niels Br\\xc3\\xbcgger, The Centre for Internet Studies, and NetLab Aarhus University
Digital Humanities, Postfoundationalism, Postindustrial\\n Culture, 2014: v8 n1
James Smithies, University of Canterbury
Digital Humanities Quarterly Special Cluster on Arts and\\n Humanities e-Science, 2009: v3 n4
Stuart Dunn, Centre for e-Research, King\\'s College London; Tobias Blanke, Centre for e-Research, King\\'s College London
Digital Literature and the Modernist Problem, 2011: v5 n3
Maria Engberg, Blekinge Institute of Technology; Jay David Bolter, Georgia Institute of Technology
Digital Methods and Classical Studies, 2016: v10 n2
Neil Coffee, University at Buffalo; Neil W. Bernstein, Ohio University
Digital Oulipo: Programming Potential Literature, 2017: v11 n3
Natalie Berkman, Princeton University
Digital Pedagogy Unplugged, 2011: v5 n3
Paul Fyfe, Florida State University
Digital Surrealism: Visualizing Walt Disney Animation\\n Studios, 2017: v11 n1
Kevin L. Ferguson, Queens College / CUNY
[en]Digital text: hermeneutic issues , 2018: v12 n1
Jean Guy Meunier, Universit\\xc3\\xa9 du Qu\\xc3\\xa9bec \\xc3\\xa0 Montr\\xc3\\xa9al
Digitizing Latin Incunabula: Challenges, Methods, and Possibilities, 2009: v3 n1
Jeffrey A. Rydberg-Cox, University of Missouri-Kansas City
Distracted Reading: Acts of Attention in the Age\\n of the Internet, 2018: v12 n2
Marion Thain, King\\'s College London
Distributed reading: Literary reading in diverse\\n environments, 2018: v12 n2
Tully Barnett, Flinders University
\\n Do You Want to Save Your Progress?: The Role of Professional and Player\\n Communities in Preserving Virtual Worlds, 2012: v6 n2
Kari Kraus, University of Maryland; Rachel Donahue, University of Maryland
Does your historical collection need a database-driven\\n website?, 2015: v9 n1
Adam Crymble, University of Hertfordshire, United Kingdom
Done: Finishing Projects in the Digital Humanities, 2009: v3 n2
Matthew G. Kirschenbaum, University of Maryland

E

The e Prefix: e-Science, e-Art & the New Creativity, 2009: v3 n4
Gregory Sporton, Director, Visualisation Research Unit, School of Art, Birmingham City University
Edition, Project, Database, Archive, Thematic Research Collection: What\\'s in a\\n Name?, 2009: v3 n3
Kenneth M. Price, University of Nebraska-Lincoln
[en]Editors\\'\\n Introduction, 2018: v12 n1
Ernesto Priani Saiso, Universidad Nacional Aut\\xc3\\xb3noma de M\\xc3\\xa9xico (UNAM); Elena Gonzalez-Blanco Garc\\xc3\\xada, Universidad Nacional de Educaci\\xc3\\xb3n a Distancia (UNED)
Encoding for Endangered Tibetan Texts, 2007: v1 n1
Linda E. Patrik, Department of Philosophy, Union College
The Ends of Editing, 2009: v3 n3
Peter M. W. Robinson, University of Birmingham
An Enlightenment Utopia: The Network of Sociability in Corinne, 2017: v11 n2
Chloe Edmondson, Stanford University
Enlisting Vertues Noble &\\n Excelent: Behavior, Credit, and Knowledge Organization in the Social\\n Edition, 2015: v9 n2
Constance Crompton, University of British Columbia, Okanagan; Raymond Siemens, University of Victoria; Alyssa Arbuckle, University of Victoria; Implementing New Knowledge Environment (INKE)
[fr]Entre musique et lettres\\xc2\\xa0: vers une m\\xc3\\xa9thodologie\\n num\\xc3\\xa9rique pour l\\xe2\\x80\\x99analyse de la mise en musique des po\\xc3\\xa9sies de Charles\\n Baudelaire, 2018: v12 n1
Caroline Ardrey, The University of Birmingham; Myl\\xc3\\xa8ne Dubiau, Universit\\xc3\\xa9 de Toulouse \\xe2\\x80\\x94 Jean Jaur\\xc3\\xa8s; Helen Abbott, The University of Birmingham
Envisioning the Digital Humanities, 2012: v6 n1
Patrik Svensson, University of Ume\\xc3\\xa5
Epigraphy in 2017, 2009: v3 n1
Hugh Cayless, University of North Carolina; Charlotte Rouech\\xc3\\xa9, King\\'s College London; Tom Elliott, New York University; Gabriel Bodard, King\\'s College London
e-Science for Medievalists: Options, Challenges, Solutions and\\n Opportunities, 2009: v3 n4
Peter Ainsworth, Dept of French and Humanities Research Institute, University of Sheffield; Michael Meredith, Humanities Research Institute, University of Sheffield
Experiential Analogies: A Sonic Digital Ekphrasis as a Digital\\n Humanities Project, 2016: v10 n2
Anna Foka, Ume\\xc3\\xa5 University; Viktor Arvidsson, Swedish Center for Digital Innovation. Department of Informatics, University of Oslo
Explaining Events to Computers: Critical Quantification,\\n Multiplicity and Narratives in Cultural Heritage, 2016: v10 n3
Stuart Dunn, King\\xe2\\x80\\x99s College London; Mareike Schumacher, Hamburg University
Exploratory Search Through Visual Analysis of Topic\\n Models, 2017: v11 n2
Patrick J\\xc3\\xa4hnichen, Machine Learning Group, Humboldt-Universit\\xc3\\xa4t zu Berlin; Patrick Oesterling, Image and Signal Processing Group, Leipzig University, Germany; Gerhard Heyer, Natural Language Processing Group, Leipzig University, Germany; Tom Liebmann, Image and Signal Processing Group, Leipzig University, Germany; Gerik Scheuermann, Image and Signal Processing Group, Leipzig University, Germany; Christoph Kuras, Natural Language Processing Group, Leipzig University, Germany
Exploring Citation Networks to Study Intertextuality in\\n Classics, 2016: v10 n2
Matteo Romanello, Deutsches Arch\\xc3\\xa4ologisches Institut, Berlin / \\xc3\\x89cole Polytechnique F\\xc3\\xa9d\\xc3\\xa9rale de Lausanne
Exploring Historical RDF with Heml, 2009: v3 n1
Bruce Robertson, Mount Allison University

F

FairCite, 2013: v7 n2
Adam Crymble, Kings College London; Julia Flanders, Northeastern University
Fitting Personal Interpretation with the Semantic Web: lessons\\n learned from Pliny, 2017: v11 n1
John Bradley, King\\'s College London; Michele Pasin, Springer Nature
Foreword, 2009: v3 n1
Gregory Nagy, Harvard University; James O\\'Donnell, Georgetown University
Forward to the Past: Nostalgia for Handwriting in Scribblenauts and The World Ends with\\n You\\n , 2011: v5 n3
Aaron Kashtan, Department of English University of Florida
Friedrich Kittler\\'s Digital Legacy \\xe2\\x80\\x93 PART I - Challenges,\\n Insights and Problem-Solving Approaches in the Editing of Complex Digital Data\\n Collections, 2017: v11 n2
J\\xc3\\xbcrgen Enge, FHNW Academy of Art and Design, Basel; Heinz WernerKramski, German Literature Archive Marbach
From Disclaimer to Critique: Race and the Digital Image\\n Archivist, 2017: v11 n3
Kate Holterhoff, Georgia Institute of Technology
From Distracted to Recursive Reading:\\n Facilitating Knowledge Transfer through Annotation Software, 2018: v12 n2
Sarah E. Kersh, Dickinson College; Chelsea Skalak, Dickinson College
From Optical Fiber To Conceptual Cyberinfrastructure, 2011: v5 n1
Patrik Svensson, HUMlab, Ume\\xc3\\xa5 University
From Stone to Screen: Digital\\n Revitalization of Ancient Epigraphy, 2016: v10 n1
Lisa Tweten, University of British Columbia; Gwynaeth McIntyre, University of British Columbia; Chelsea Gardner, University of British Columbia

G

Gender, Race, and Nationality in Black Drama, 1950-2006: Mining Differences in Language Use\\n in Authors and their Characters, 2009: v3 n2
Shlomo Argamon, Linguistic Cognition Lab, Dept. of Computer Science, Illinois Institute of Technology, Chicago; Charles Cooney, ARTFL Project, University of Chicago; Russell Horton, Digital Library Development Center, University of Chicago; Mark Olsen, ARTFL Project, University of Chicago; Sterling Stein, Linguistic Cognition Lab, Dept. of Computer Science, Illinois Institute of Technology, Chicago; Robert Voyer, Powerset
A Genealogy of Distant Reading, 2017: v11 n2
Ted Underwood, University of Illinois, Urbana-Champaign
Generous Interfaces for Digital Cultural Collections, 2015: v9 n1
Mitchell Whitelaw, University of Canberra, Australia
Ghosts in the Machine: a motion-capture\\n experiment in distributed reception, 2018: v12 n3
Helen Slaney, University of Roehampton; Anna Foka, DH Uppsala and Humlab Ume\\xc3\\xa5; Sophie Bocksberger, University of Oxford
GIS and Literary History: Advancing Digital Humanities\\n research through the Spatial Analysis of historical travel writing and\\n topographical literature, 2017: v11 n1
Patricia Murrieta-Flores, Digital Humanities Research Centre, University of Chester; Christopher Donaldson, History Department, Lancaster University; Ian Gregory, History Department, Lancaster University
Graphic Sublime: On the Art and Designwriting of Kate Armstrong\\n and Michael Tippett, 2012: v6 n2
Joseph Tabbi, University of Illinois at Chicago (UIC)
Grid-enabling Humanities Datasets, 2009: v3 n4
Mark Hedges, Centre for e-Research, King\\xe2\\x80\\x99s College London

H

A Historical Geographic Information System (HGIS) of Nubia\\n Based on the William J. Bankes Archive (1815-1822), 2017: v11 n2
Daniele Salvoldi, Dahlem Research School, Freie Universit\\xc3\\xa4t Berlin
History, People, and Informatics: A Conversation between Sharon\\n Irish and Wendy Plotkin, 2010: v4 n2
Sharon Irish, Interim Director, Community Informatics Initiative, Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign; Wendy Plotkin, Editor-in-Chief, H-Urban
How do we get to the Humanitarium from here?, 2016: v10 n3
Stan Ruecker, IIT Institute of Design
How Literary Works Exist: Convenient Scholarly Editions, 2009: v3 n3
Peter Shillingsburg, Loyola University Chicago
How Scholars Read Now: When the Signal\\n Is the Noise, 2018: v12 n1
Jennifer Edmond, Trinity College Dublin
Humanities Approaches to Graphical Display, 2011: v5 n1
Johanna Drucker, Breslauer Professor of Bibliographical Studies Department of Information Studies, UCLA
Humanities Computing as Digital Humanities, 2009: v3 n3
Patrik Svensson, Ume\\xc3\\xa5 University
The Humanities HyperMedia Centre @ Acadia University: An Invitation to Think About Higher\\n Education, 2008: v2 n1
Richard Cunningham, Acadia University; David Duke, Acadia University; John Eustace, Acadia University; Anna Galway; Erin Patterson, Acadia University
Humanities Unbound: Supporting Careers and Scholarship Beyond the\\n Tenure Track, 2015: v9 n1
Katina Rogers, The Graduate Center, City University of New York
A Hybrid Model for Managing DH Projects, 2017: v11 n1
Edin Tabak, University of Zenica

I

Impractical Applications, 2011: v5 n1
Wendell Piez, Senior Consultant, Mulberry Technologies, Inc.
In One\\'s Own Hand: Seeing Manuscripts in a Digital Age, 2012: v6 n2
Anna Chen, The University of Texas at Austin
Information access in the art history domain: Evaluating a\\n federated search engine for Rembrandt research , 2016: v10 n4
Suzan Verberne, Radboud University; Lou Boves, Radboud University; Antal van den Bosch, Radboud University
An Information Science Question in DH Feminism, 2015: v9 n2
Tanya Clement, School of Information, University of Texas at Austin
Interdisciplinary Collaboration and Brokerage in\\n the Digital Humanities, 2017: v11 n3
Anela Chan, Independent Scholar; Richard Chenhall, University of Melbourne; Tamara Kohn, University of Melbourne; Carolyn Stevens, Monash University
Interpretative Quests in Theory and Pedagogy, 2007: v1 n1
Jeff Howard, University of Texas, Austin
[es]Introducci\\xc3\\xb3n de los\\n editores, 2018: v12 n1
Ernesto Priani Saiso, Universidad Nacional Aut\\xc3\\xb3noma de M\\xc3\\xa9xico (UNAM); Elena Gonzalez-Blanco Garc\\xc3\\xada, Universidad Nacional de Educaci\\xc3\\xb3n a Distancia (UNED)
Introducing DREaM (Distant Reading Early Modernity), 2017: v11 n4
Matthew Milner; Stephen Wittek, Carnegie Mellon; St\\xc3\\xa9fan Sinclair, McGill University
Introducing Issues in Humanities Computing, 2007: v1 n1
Joseph Raben, Queens College, City University of New York
Introduction, 2009: v3 n3
Amy Earhart, Texas A&M University; Maura Ives, Texas A&M University
Introduction to Feminisms and DH special issue, : v9 n02
Jacqueline Wernimont, Arizona State University
Introduction to the DHQ Special Issue: Digital Technology in the Study of the\\n Past, 2018: v12 n2
Anna Foka, Ume\\xc3\\xa5 University; Jonathan Westin, University of Gothenburg; Adam Chapman, University of Gothenburg
Introduction to the Digital Humanities\\n Summer Institute Colloquium Special Issue, 2016: v10 n1
James O\\xe2\\x80\\x99Sullivan, University of Sheffield; Mary Galvin, University College Cork; Diane Jakacki, Bucknell University
Introduction: Comics and the Digital Humanities, 2015: v9 n4
Roger Todd Whitson, Washington State University; Anastasia Salter, University of Central Florida
Is this Article a Comic?, 2015: v9 n4
Jason Muir Helms, Texas Christian University
\\n It May Change My Understanding of the Field: Understanding Reading Tools for Scholars and Professional Readers, 2009: v3 n4
Ray Siemens, University of Victoria; Cara Leitch, University of Victoria; Analisa Blake, University of Victoria; Karin Armstrong, University of Victoria; John Willinsky, University of British Columbia/Stanford

J

J. M. Coetzee\\'s Work in Stylostatistics, 2014: v8 n3
Peter Johnston, Royal Holloway, University of London
Jane, John \\xe2\\x80\\xa6 Leslie? A Historical Method\\n for Algorithmic Gender Prediction, 2015: v9 n3
Cameron Blevins, Rutgers University; Lincoln Mullen, George Mason University

K

Kindling, Disappearing, Reading, 2013: v7 n1
Yung-Hsing Wu, University of Louisiana, Lafayette
Knowledge Organization and Cultural Heritage in\\n the Semantic Web \\xe2\\x80\\x93 A Review of a Conference and a Special Journal Issue of\\n JLIS, 2018: v12 n1
Marcia Lei Zeng, Kent State University, Kent, Ohio, USA; Sophy Shu-Jiun Chen, Academia Sinica, Taiwan
The Kuzushiji Project: Developing a Mobile Learning Application for\\n Reading Early Modern Japanese Texts, 2017: v11 n1
Yuta Hashimoto, Kyoto University; Yoichi Iikura, Osaka University; Yukio Hisada, Osaka University; SungKook Kang, Osaka University; Tomoyo Arisawa, Osaka University; Daniel Kobayashi-Better, Osaka University

L

[es]La conquista de Jerusal\\xc3\\xa9n \\xc2\\xbfde Cervantes?\\n An\\xc3\\xa1lisis estilom\\xc3\\xa9trico sobre autor\\xc3\\xada en el teatro del Siglo de Oro\\n espa\\xc3\\xb1ol, 2018: v12 n1
Jos\\xc3\\xa9 Calvo Tello, Universidad de W\\xc3\\xbcrzburg; Juan Cerezo Soler, Universidad Aut\\xc3\\xb3noma de Madrid
[es]Laboratorios ciudadanos y humanidades\\n digitales, 2018: v12 n1
Paola Ricaurte Quijano, Tecnol\\xc3\\xb3gico de Monterrey
The Landscape of Digital Humanities, 2010: v4 n1
Patrik Svensson, HUMlab, Ume\\xc3\\xa5 University
Language DNA: Visualizing a Language Decomposition , 2016: v10 n4
Adam James Bradley, University of Waterloo; Travis Kirton, University of Calgary; Mark Hancock, University of Waterloo; Sheelagh Carpendale, University of Calgary
Laptop Policy:\\xc2\\xa0Notes on Boredom, 2018: v12 n2
Grant Wythoff, Pennsylvania State University
[fr]Le texte num\\xc3\\xa9rique : enjeux herm\\xc3\\xa9neutiques, 2018: v12 n1
Jean Guy Meunier, Universit\\xc3\\xa9 du Qu\\xc3\\xa9bec \\xc3\\xa0 Montr\\xc3\\xa9al
The Leipzig Open Fragmentary Texts Series (LOFTS), 2016: v10 n2
Monica Berti, University of Leipzig; Bridget Almas, Tufts University; Gregory R. Crane, Tufts University and University of Leipzig
[fr]Les Sganarelle de Moli\\xc3\\xa8re\\xc2\\xa0: un nom, des\\n syntaxes\\xc2\\xa0?, 2018: v12 n1
\\xc3\\x89lodie B\\xc3\\xa9nard, Universit\\xc3\\xa9 Paris-Sorbonne; Francesca Frontini, Universit\\xc3\\xa9 Paul Val\\xc3\\xa9ry Montpellier
[en]Let a thousand readings flourish . . . A\\n crowdreading experience, 2018: v12 n1
Ioana Galleron, University of Grenoble; Fatiha Idmhand, Universit\\xc3\\xa9 de Poitiers; C\\xc3\\xa9cile Meynard, University of Angers
A Life Lived in Media, 2012: v6 n1
Mark Deuze, Department of Telecommunications, Indiana University; Peter Blank, Department of Telecommunications, Indiana University; Laura Speers, King\\'s College, London
The Literary And/As the Digital Humanities, 2013: v7 n1
Jessica Pressman, ACLS Fellow; Lisa Swanstrom, Florida Atlantic University

M

Machine Enhanced (Re)minding: the Development of\\n Storyspace, 2012: v6 n2
Belinda Barnet, Swinburne University of Technology
The Machine in the Text, and the Text in the Machine , 2010: v4 n1
Manuel Portela, University of Coimbra
Machine Reading the Primeros\\n Libros, 2016: v10 n4
Hannah Alpert-Abrams, University of Texas at Austin
Machine-aided\\n close listening: Prosthetic synaesthesia and the 3D phonotext, 2018: v12 n3
Chris Mustazza, University of Pennsylvania
A Macroscope for Global History: Seshat Global History\\n Databank, a methodological overview, 2016: v10 n4
Pieter Fran\\xc3\\xa7ois, University of Hertfordshire, University of Oxford; J.G. Manning, Yale University; Harvey Whitehouse, University of Oxford; Rob Brennan, Trinity College Dublin; Thomas Currie, University of Exeter, Penryn Campus; Kevin Feeney, Trinity College Dublin; Peter Turchin, University of Connecticut
Making and Breaking: Teaching Information Ethics\\n through Curatorial Practice, 2018: v12 n4
Christina Boyles, Michigan State University
The Making of Our Cultural Commonwealth\\n , 2009: v3 n4
John Unsworth, Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign
Manuscript Study in Digital Spaces: The State of\\n the Field and New Ways Forward, 2018: v12 n2
Bridget Almas, The Alpheios Project, Ltd.; Emad Khazraee, School of Information, Kent State University; Matthew Thomas Miller, Roshan Institute for Persian Studies, University of Maryland College Park; Joshua Westgard, University Libraries, University of Maryland College Park
The Materialities of Close Reading: 1942, 1959, 2009, 2012: v6 n1
David Ciccoricco, University of Otago
Materiality Comics, 2015: v9 n4
Aaron Jacob Kashtan, Miami University
\\n May the Text Rise up to Meet You: New Ways of Reading Old Manuscripts, 2009: v3 n3
Eugene Lyman, University of Rhode Island
Metaphors in Digital Hermeneutics: Zooming\\n through Literary, Didactic and Historical Representations of Imaginary and\\n Existing Cities, 2017: v11 n3
Florentina Armaselu, University of Luxembourg; Charles van den Heuvel, Huygens ING, The Hague, Netherlands; University of Amsterdam
Methodological Nearness and the Question of\\n Computational Literature, 2018: v12 n2
Michael Marcinkowski, Bath Spa University
Mining Eighteenth Century Ontologies: Machine Learning and Knowledge\\n Classification in the Encyclop\\xc3\\xa9die\\n , 2009: v3 n2
Russell Horton, Digital Library Development Center, University of Chicago; Robert Morrissey, University of Chicago; Mark Olsen, ARTFL Project, University of Chicago; Glenn Roe, ARTFL Project, University of Chicago; Robert Voyer, Powerset
Mining Embodied Emotions: A Comparative Analysis\\n of Sentiment and Emotion in Dutch Texts,\\n 1600-1800., 2017: v11 n4
Inger Leemans, Faculty of Humanities, Vrije Universiteit Amsterdam, The Netherlands; Janneke M. van der Zwaan, Netherlands eScience Center, Amsterdam, The Netherlands; Isa Maks, Faculty of Humanities, Vrije Universiteit Amsterdam, The Netherlands; Erika Kuijpers, Faculty of Humanities, Vrije Universiteit, Amsterdam, The Netherlands; Kristine Steenbergh, Faculty of Humanities, Vrije Universiteit, Amsterdam, The Netherlands
Mining for characterising patterns in literature using\\n correspondence analysis: an experiment on French novels, 2017: v11 n2
Francesca Frontini, Universit\\xc3\\xa9 Paul-Val\\xc3\\xa9ry Montpellier 3 - Praxiling UMR 5267 CNRS - UPVM3; Mohamed Amine Boukhaled, Laboratoire d\\'Informatique de Paris 6 (LIP6 UPMC) / Labex OBVIL; Jean-Gabriel Ganascia, Laboratoire d\\'Informatique de Paris 6 (LIP6 UPMC) / Labex OBVIL
Mining for the Meanings of a Murder: The Impact of OCR Quality\\n on the Use of Digitized Historical Newspapers, 2014: v8 n1
Carolyn Strange, Australian National University; Daniel McNamara; Josh Wodak, Australian National University; Ian Wood, Research School of Computer Science, Australian National University
Mining Public Discourse for Emerging Dutch Nationalism, 2016: v10 n3
Maarten van den Bos, Utrecht University; Hermione Giffard, Utrecht University
Missed Connections: The Collective Novel and the Metropolis, 2011: v5 n2
J.J. Butts, Assistant Professor of English, Wartburg College
Modeling Afro-Latin American Artistic Representations in Topic Maps: Cuba\\xe2\\x80\\x99s Prominence in Latin American Discourse, 2013: v7 n1
Eduard A. Arriaga, University of Western Ontario; Fernando Sancho Caparrini, University of Sevilla; Juan Luis Su\\xc3\\xa1rez, University of Western Ontario
The MoEML Pedagogical Partnership\\n Program, 2017: v11 n3
Janelle Jenstad, University of Victoria; Kim McLean-Fiander, University of Victoria; Kathryn R. McPherson, Utah Valley University
[en]Moli\\xc3\\xa8re\\'s Sganarelles: one name, several\\n syntaxes?, 2018: v12 n1
\\xc3\\x89lodie B\\xc3\\xa9nard, Universit\\xc3\\xa9 Paris-Sorbonne; Francesca Frontini, Universit\\xc3\\xa9 Paul Val\\xc3\\xa9ry Montpellier

N

The New Edition of the Letters of Vincent van Gogh on the\\n Web, 2010: v4 n2
Arianna Ciula, Independent Scholar
New Media in the Academy: Labor and the Production of Knowledge in\\n Scholarly Multimedia, 2011: v5 n3
Helen J. Burgess, University of Maryland Baltimore County; Jeanne Hamming, Centenary College of Louisiana
The New Place of Reading: Locative Media and the Future of\\n Narrative, 2011: v5 n3
Brian Greenspan, Carleton University
Nodalism, 2011: v5 n3
Phillip H. Gochenour, Towson University
Now is the Future Now? The Urgency of Digital Curation in the\\n Digital Humanities, 2013: v7 n2
Alex H. Poole, University of North Carolina at Chapel Hill

O

Obama\\xe2\\x80\\x99s Sixth Annual Address: Image, Affordance, Flow, 2016: v10 n4
Dan Faltesek, Oregon State University
OCR of historical printings with an application to building\\n diachronic corpora: A case study using the RIDGES herbal corpus, 2017: v11 n2
Uwe Springmann, LMU Munich & Humboldt-Universit\\xc3\\xa4t zu Berlin; Anke L\\xc3\\xbcdeling, Humboldt-Universit\\xc3\\xa4t zu Berlin
Old Content and Modern Tools \\xe2\\x80\\x93 Searching Named Entities in a\\n Finnish OCRed Historical Newspaper Collection 1771\\xe2\\x80\\x931910, 2017: v11 n3
Kimmo Kettunen, National Library of Finland, Mikkeli, Finland; Eetu M\\xc3\\xa4kel\\xc3\\xa4, University of Helsinki, Helsinki Centre for Digital Humanities; Teemu Ruokolainen, National Library of Finland, Mikkeli, Finland; Juha Kuokkala, University of Helsinki, Department of Modern Languages, Helsinki, Finland; Laura L\\xc3\\xb6fberg, Department of Linguistics and English Language, Lancaster University, UK
Old Ways for Linking Texts in the Digital Reading Environment: The\\n Case of the Thompson Chain Reference Bible, 2012: v6 n2
Brent Nelson, University of Saskatchewan; Jon Bath, University of Saskatchewan
Ontologies and Logic Reasoning as Tools in Humanities?, 2009: v3 n4
Am\\xc3\\xa9lie Z\\xc3\\xb6llner-Weber, Uni Digital, Bergen, Norway
An Ontology for Gendered Content Representation of Cultural\\n Heritage Artefacts, 2017: v11 n3
Ioanna Kyvernitou, National University of Ireland, Galway (NUIG); Antonis Bikakis, University College London (UCL)
Open Access and the Theological\\n Imagination, 2017: v11 n4
Talea Anderson, Washington State University; David Squires, University of Louisiana at Lafayette
Orientation: Man and His Tool,\\n Again?, 2015: v9 n2
Nicole Starosielski, New York University

P

Past Visions and Reconciling Views: Visualizing Time, Texture\\n and Themes in Cultural Collections, 2017: v11 n2
Katrin Glinka, University of Applied Sciences Potsdam; Christopher Pietsch, University of Applied Sciences Potsdam; Marian D\\xc3\\xb6rk, University of Applied Sciences Potsdam
A Pedagogy for Computer-Assisted Literary Analysis:\\n Introducing GALGO (Golden Age Literature Glossary\\n Online) , 2017: v11 n3
Nuria Alonso Garc\\xc3\\xada, Providence College; Alison Caplan, Providence College; Brad Mering, Mervideo
Pertinent Discussions Toward Modeling the Social\\n Edition: Annotated Bibliographies, 2012: v6 n1
Ray Siemens, University of Victoria; Meagan Timney, University of Victoria; Cara Leitch, University of Victoria; Corina Koolen, University of Victoria; Alex Garnett, University of Victoria
Picture Problems: X-Editing Images 1992-2010, 2009: v3 n3
Morris Eaves, University of Rochester
Playing with Chance: On Random Generation in Playable Media\\n and Electronic Literature, 2013: v7 n3
Robert Schoenbeck, University of California, Irvine
The Poetess Archive Database, 2009: v3 n3
Laura Mandell, Miami University of Ohio
\\n Postmodern Culture and More: an Oral History\\n Conversation between John Unsworth and Anne Welsh , 2012: v6 n3
John Unsworth, Brandeis University; Anne Welsh, University College London; Julianne Nyhan, University College London; Jessica Salmon, University of Trier
[en]Potentialities and difficulties of a digital\\n humanities (DH) project: confrontation with tools and reorientations of\\n research, 2018: v12 n1
Christelle Cocco, Universit\\xc3\\xa9 de Lausanne; Gr\\xc3\\xa9gory Dessart, Universit\\xc3\\xa9 de Lausanne; Olga Serbaeva, Universit\\xc3\\xa9s de Lausanne et de Z\\xc3\\xbcrich; Pierre-Yves Brandt, Universit\\xc3\\xa9 de Lausanne; Dominique Vinck, Universit\\xc3\\xa9 de Lausanne; Fr\\xc3\\xa9d\\xc3\\xa9ric Darbellay, Universit\\xc3\\xa9 de Gen\\xc3\\xa8ve
[fr]Potentialit\\xc3\\xa9s et difficult\\xc3\\xa9s d\\xe2\\x80\\x99un projet en\\n humanit\\xc3\\xa9s num\\xc3\\xa9riques (DH)\\xc2\\xa0: confrontation aux outils et r\\xc3\\xa9orientations de\\n recherche, 2018: v12 n1
Christelle Cocco, Universit\\xc3\\xa9 de Lausanne; Gr\\xc3\\xa9gory Dessart, Universit\\xc3\\xa9 de Lausanne; Olga Serbaeva, Universit\\xc3\\xa9s de Lausanne et de Z\\xc3\\xbcrich; Pierre-Yves Brandt, Universit\\xc3\\xa9 de Lausanne; Dominique Vinck, Universit\\xc3\\xa9 de Lausanne; Fr\\xc3\\xa9d\\xc3\\xa9ric Darbellay, Universit\\xc3\\xa9 de Gen\\xc3\\xa8ve
[fr]Pour une analyse automatique du jugement critique\\xc2\\xa0:\\n les citations modalis\\xc3\\xa9es dans le discours litt\\xc3\\xa9raire du XIXe si\\xc3\\xa8cle, 2018: v12 n1
Marine Riguet, Labex OBVIL (Paris-Sorbonne); Motasem Alrahabi, Paris-Sorbonne Abou Dhabi
Predicting the\\n Past, 2018: v12 n2
Tobias Blanke, King\\'s College London, Department of Digital Humanities
The Printing Press as Metaphor, 2016: v10 n3
Elyse Graham, SUNY Stony Brook
The Productive Unease of 21st-century Digital Scholarship, 2009: v3 n3
Julia Flanders, Brown University
Published Yet Never Done: The Tension Between Projection and Completion in\\n Digital Humanities Research, 2009: v3 n2
Susan Brown, University of Guelph; Patricia Clements, University of Alberta; Isobel Grundy, University of Alberta; Stan Ruecker, University of Alberta; Jeffery Antoniuk, University of Alberta; Sharon Balazs, University of Alberta

Q

[fr]Que mille lectures s\\xe2\\x80\\x99\\xc3\\xa9panouissent\\xe2\\x80\\xa6 Mod\\xc3\\xa9lisation du\\n personnage et exp\\xc3\\xa9rience de \\xc2\\xa0crowdreading\\xc2\\xa0, 2018: v12 n1
Ioana Galleron, University of Grenoble; Fatiha Idmhand, Universit\\xc3\\xa9 de Poitiers; C\\xc3\\xa9cile Meynard, University of Angers
\\n Questioning, Asking and Enduring Curiosity: an Oral History Conversation between Julianne Nyhan and Willard McCarty \\n , 2012: v6 n3
Willard McCarty, King\\'s College London and University of Western Sydney; Julianne Nyhan, University College London; Anne Welsh, University College London; Jessica Salmon, University of Trier

R

Racial Proxies in Daily News: A Case Study of the Use of\\n Directional Euphemisms, 2016: v10 n4
Timothy Messer-Kruse, Bowling Green State University
The Radical Historicity of Everything: Exploring Shakespearean Identity with Web\\n 2.0, 2009: v3 n3
Katheryn Giglio, University of Central Florida (English Department); John Venecek, University of Central Florida Libraries
Raiders of the Lost Corpus, 2016: v10 n2
Caroline T Schroeder, University of the Pacific; Amir Zeldes, Georgetown University
Readies Online, 2011: v5 n3
Craig Saper, University of Maryland, Baltimore County
Reading, Making, and Metacognition: Teaching\\n Digital Humanities for Transfer, 2018: v12 n2
Paul Fyfe, North Carolina State University
Reading Potential: The Oulipo and the Meaning of Algorithms, 2007: v1 n1
Mark Wolff, Hartwick College
Reading Today, 2014: v8 n4
Fr\\xc3\\xa9d\\xc3\\xa9ric Clavert, Universit\\xc3\\xa9 Paris-Sorbonne
Reconstructing\\n Brandon (1998-1999): A Cross-disciplinary\\n Digital Humanities Study of Shu Lea Cheang\\xe2\\x80\\x99s Early Web Artwork, 2018: v12 n2
Deena Engel, New York University; Lauren Hinkson, Solomon R. Guggenheim Museum; Joanna Phillips, Solomon R. Guggenheim Museum; Marion Thain, New York University
Reinventing the Classroom Edition: Paradise Lost\\n Book IX Flash Audiotext, 2009: v3 n3
Olin Bjork, Georgia Institute of Technology
Renaissance Remix. Isabella\\n d\\xe2\\x80\\x99Este: Virtual Studiolo, 2018: v12 n4
Deanna Shemek, University of California, Irvine, USA; Antonella Guidazzoli, VisitLab - Cineca Interuniversity Consortium, Italy; Maria Chiara Liguori, VisitLab - Cineca Interuniversity Consortium, Italy; Giovanni Bellavia, VisitLab - Cineca Interuniversity Consortium, Italy; Daniele De Luca, VisitLab - Cineca Interuniversity Consortium, Italy; Luigi Verri, VisitLab - Cineca Interuniversity Consortium, Italy; Silvano Imboden, VisitLab - Cineca Interuniversity Consortium, Italy
Researcher as Bricoleur: Contextualizing\\n humanists\\xe2\\x80\\x99 digital workflows, 2018: v12 n3
Smiljana Antonijevic, The Pennsylvania State University; Ellysa Stern Cahoy, The Pennsylvania State University
[es]Retorno a trazos de mil\\n historias, 2018: v12 n1
Suzana Sukovic, HETI (Health Education and Training Institute); Peter Read, Australian National University
[en]A Return to the Traces of a Thousand\\n Stories, 2018: v12 n1
Suzana Sukovic, HETI (Health Education and Training Institute); Peter Read, Australian National University
Reverse Engineering the First Humanities\\n Computing Center, 2018: v12 n2
Steven Jones, University of South Florida
A Review of Memes in Digital\\n Culture, 2016: v10 n2
Kevin Lewis, Virginia Tech
Review: The Electronic Literature Collection Volume I: A New Media\\n Primer\\n , 2008: v2 n1
Mark C. Marino, University of Southern California
Revista Digital Universitaria: A Workshop of Digital Editing at the Universidad\\n\\t\\t\\tNacional Aut\\xc3\\xb3noma de M\\xc3\\xa9xico, 2007: v1 n2
Ernesto Priani Sais\\xc3\\xb3, Universidad Nacional Aut\\xc3\\xb3noma de M\\xc3\\xa9xico
Ross Scaife (1960-2008), 2009: v3 n1
Dot Porter, Digital Humanities Observatory

S

Scaffolding and Play Approaches to Digital\\n Humanities Pedagogy: Assessment and Iteration in Topically-Driven\\n Courses, 2017: v11 n4
Daniel G. Tracy, University of Illinois at Urbana-Champaign; Elizabeth Massa Hoiem, University of Illinois at Urbana-Champaign
Semantic Enrichment of a Multilingual Archive with Linked Open\\n Data, 2017: v11 n4
Max De Wilde, Universit\\xc3\\xa9 libre de Bruxelles(ULB), Information Science Department; Simon Hengchen, University of Helsinki (UH)
Sequential Rhetoric: Using Freire and Quintilian to Teach\\n Students to Read and Create Comics , 2015: v9 n4
Robert Dennis Watkins, Idaho State University; Tom Lindsley, Interaction Designer, Workiva
Service-Oriented Software in the Humanities: A Software Engineering Perspective, 2009: v3 n4
Nicolas Gold, King\\'s College London, Department of Computer Science
Shakespeare\\xe2\\x80\\x99s Tragic Social Network; or Why All the World\\xe2\\x80\\x99s a\\n Stage, 2017: v11 n2
James Lee, University of Cincinnati; Jason Lee, Independent Scholar
\\n A short Introduction to the Hidden Histories project and interviews\\n , 2012: v6 n3
Julianne Nyhan, Lecturer, UCL; Andrew Flinn, Lecturer, UCL; Anne Welsh, Lecturer, UCL
Simulated Visuals: Some Rhetorical and Ethical Implications, 2009: v3 n3
Aimee Roundtree, University of Houston-Downtown
Six Degrees of Francis Bacon: A Statistical Method for\\n Reconstructing Large Historical Social Networks , 2016: v10 n3
Christopher N. Warren, Carnegie Mellon University; Daniel Shore, Georgetown University; Jessica Otis, Carnegie Mellon University; Lawrence Wang, Carnegie Mellon University; Mike Finegold, Carnegie Mellon University; Cosma Shalizi, Carnegie Mellon University
Some principles for making collaborative scholarly editions in\\n digital form, 2017: v11 n2
Peter Robinson, University of Saskatchewan
Something Called Digital Humanities\\n , 2008: v2 n1
Wendell Piez, Mulberry Technologies, Inc.
Sound and Digital Humanities: reflecting on a DHSI\\n course, 2016: v10 n1
John F. Barber, The Creative Media & Digital Culture Program, Washington State University Vancouver
Sounding for Meaning: Using Theories of Knowledge Representation\\n to Analyze Aural Patterns in Texts, 2013: v7 n1
Tanya Clement, University of Texas, Austin; David Tcheng, University of Illinois, Urbana-Champaign; Loretta Auvil, University of Illinois, Urbana-Champaign; Boris Capitanu, University of Illinois, Urbana-Champaign; Megan Monroe, University of Maryland, College Park
SpotiBot \\xe2\\x80\\x94 Turing Testing\\n Spotify, 2018: v12 n1
Pelle Snickars, Ume\\xc3\\xa5 University; Roger M\\xc3\\xa4hler, Ume\\xc3\\xa5 University
Starting From Scratch? Workshopping New Directions in\\n Undergraduate Digital Humanities, 2017: v11 n3
Caitlin Christian-Lamb, Davidson College; Anelise Hanson Shrout, California State University Fullerton
Stealing a Corpus: Appropriating Aesop\\xe2\\x80\\x99s Body in the\\n Early Age of Print , 2018: v12 n2
Alex Mueller, University of Massachusetts Boston
Stretched Skulls: Anamorphic Games and the memento mortem\\n mortis\\n , 2012: v6 n2
Stephanie Boluk, Vassar College; Patrick LeMieux, Duke University
Structure over Style: Collaborative Authorship and the Revival\\n of Literary Capitalism, 2017: v11 n1
Simon Fuller, National University of Ireland, Maynooth; James O\\'Sullivan, University of Sheffield
Student Labour and Training in Digital Humanities, 2016: v10 n1
Katrina Anderson, Simon Fraser University; Lindsey Bannister, Simon Fraser University; Janey Dodd, University of British Columbia; Deanna Fong, Simon Fraser University; Michelle Levy, Simon Fraser University; Lindsey Seatter, University of Victoria
Studying Up: A Review of Alice Marwick\\xe2\\x80\\x99s Status Update, 2015: v9 n2
Luke Fernandez, Weber State University
The Stuff of Science Fiction: An Experiment in Literary\\n History, 2016: v10 n1
Stefania Forlini, University of Calgary; Uta Hinrichs, University of St. Andrews; Bridget Moynihan, University of Calgary
Supporting the Exploration of Online Cultural Heritage\\n Collections: The Case of the Dutch Folktale Database, 2017: v11 n4
Iwe Everhardus Christiaan Muiser, University of Twente, Enschede / Meertens Institute, Amsterdam; Mari\\xc3\\xabt Theune, University of Twente, Enschede; Ruud de Jong, University of Twente, Enschede; Nigel Smink, University of Twente, Enschede; Dolf Trieschnigg, MyDatafactory, Meppel; Djoerd Hiemstra, University of Twente, Enschede; Theo Meder, Meertens Institute, Amsterdam / University of Groningen, Groningen

T

TaDiRAH: a Case Study in Pragmatic Classification, 2016: v10 n1
Luise Borek, Technical University of Darmstadt; Quinn Dombrowski, Miami University; Jody Perkins, Miami University; Christof Sch\\xc3\\xb6ch, University of W\\xc3\\xbcrzburg
A Tale of Two Internships: Developing Digital Skills through\\n Engaged Scholarship, 2017: v11 n3
Patricia Hswe, The Andrew W. Mellon Foundation; Tara LaLonde, The Pennsylvania State University; Kate Miffitt, The Pennsylvania State University; James O\\'Sullivan, University College Cork; Sarah Pickle, The Claremont Colleges Library; Nathan Piekielek, The Pennsylvania State University; Heather Ross, The Pennsylvania State University; Albert Rozo, The Pennsylvania State University
Teaching and Learning from the U.S. South in Global Contexts: A Case Study of Southern Spaces and Southcomb, 2009: v3 n2
Sarah Toton, Emory University; Stacey Martin, Emory University
Teaching Electronic Literature as Digital Humanities: A\\n Proposal, 2017: v11 n3
Alex Saum-Pascual, University of California, Berkeley
Teaching Spatial Literacy in the Classical Studies\\n Curriculum, 2016: v10 n2
Rebecca K. Schindler, DePauw University
The Technical Evolution of Vannevar Bush\\xe2\\x80\\x99s Memex , 2008: v2 n1
Belinda Barnet, Swinburne University of Technology, Melbourne
Technology, Collaboration, and Undergraduate Research, 2009: v3 n1
Christopher Blackwell, Furman University; Thomas R. Martin, College of the Holy Cross
Tenure, Promotion and Digital Publication, 2007: v1 n1
Joseph Raben, Queens College, City University of New York
Textual Artifacts and their Digital\\n Representations: Teaching Graduate Students to Build Online\\n Archives, 2015: v9 n1
Deena Engel, Courant Institute of Mathematical Sciences, New York University; Marion Thain, New York University
Textual Reuse in the Eighteenth Century: Mining Eliza\\n Haywood\\xe2\\x80\\x99s Quotations, 2016: v10 n1
Douglas Ernest Duhaime, University of Notre Dame
Theorizing Connectivity: Modernism and the Network Narrative, 2011: v5 n2
Wesley Beal, Lyon College; Stacy Lavin, Independent Scholar
To tree, or not to tree? On the Empirical Basis\\n for Having Past Landscapes to Experience., 2018: v12 n2
Philip I. Buckland, Ume\\xc3\\xa5 University, Sweden; Nicol\\xc3\\xb2 Dell\\'Unto, Lund University, Sweden; G\\xc3\\xadsli P\\xc3\\xa1lsson, Ume\\xc3\\xa5 University, Sweden
To Visualize Past Communities: A Solution from Contemporary\\n Practices in the Industry for the Digital Humanities , 2017: v11 n2
G\\xc3\\xa9rald P\\xc3\\xa9oux, Universit\\xc3\\xa9 Paris Ouest Nanterre La D\\xc3\\xa9fense & Institut d\\'Histoire Moderne et Contemporaine (CNRS); Jean-Roch Houllier, Thales University
Topic Modeling Genre: An Exploration of French Classical and\\n Enlightenment Drama, 2017: v11 n2
Christof Sch\\xc3\\xb6ch, University of W\\xc3\\xbcrzburg, Germany
[en]Toward an automatic analysis of critical judgment:\\n modified quotes in the literary discourse of the nineteenth century, 2018: v12 n1
Marine Riguet, Labex OBVIL (Paris-Sorbonne); Motasem Alrahabi, Paris-Sorbonne Abou Dhabi
Toward an Open Digital Tutorial for Ancient Greek v.\\n 2.0, 2016: v10 n2
Jeffrey Rydberg-Cox, The University of Missouri-Kansas City
Towards a Conceptual Framework for the Digital Humanities, 2012: v6 n2
Paul S. Rosenbloom, Department of Computer Science and Institute for Creative Technologies, University of Southern California
Towards a Rationale of Audio-Text, 2016: v10 n2
Tanya E. Clement, University of Texas at Austin
Towards a Seamful Design of Networked Knowledge: Practical\\n Pedagogies in Collaborative Teams, 2017: v11 n3
Aaron Mauro, Penn State Behrend; Daniel Powell, University of Victoria; Sarah Potvin, Texas A&M University; Jacob Heil, The Five Colleges of Ohio; Eric Dye, Penn State Behrend; Bridget Jenkins, Penn State Behrend; Dene Grigar, Washington State University
Tracking the telepathic sublime as a phenomenon\\n in a digital humanities archive, 2017: v11 n4
Isabel Pedersen, University of Ontario Institute of Technology; Quinn DuPont, University of Washington
Trading Stories: an Oral History Conversation between\\n Geoffrey Rockwell and Julianne Nyhan , 2012: v6 n3
Geoffrey Rockwell, University of Alberta; Julianne Nyhan, University College London; Anne Welsh, University College London; Jessica Salmon, University of Trier
Traveling the Silk Road on a Virtual Globe: Pedagogy,\\n Technology and Evaluation for Spatial History, 2013: v7 n2
Ruth Mostern, University of California Merced; Elana Gainor, University of California Merced
Treebanking in the world of Thucydides. Linguistic annotation\\n for the Hellespont Project, 2016: v10 n2
Francesco Mambrini, Deutsches Arch\\xc3\\xa4ologisches Institut, Berlin
Twisty Little Passages Almost All Alike: Applying the FRBR\\n Model to a Classic Computer Game, 2010: v4 n2
Jerome McDonough, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign; Matthew Kirschenbaum, Maryland Institute for Technology in the Humanities, University of Maryland; Doug Reside, Maryland Institute for Technology in the Humanities, University of Maryland; Neil Fraistat, Maryland Institute for Technology in the Humanities, University of Maryland; Dennis Jerz, English \\xe2\\x80\\x94 New Media Journalism, Seton Hill University
TypeWright: An Experiment in Participatory Curation, 2015: v9 n4
Alan Bilansky, University of Illinois, Urbana-Champaign

U

Uncovering Latent Metadata in the FSA-OWI Photographic Archive, 2017: v11 n2
Taylor Arnold, University of Richmond; Stacey Maples, Stanford University; Lauren Tilton, University of Richmond; Laura Wexler, Yale University
Undergraduate Students and Digital Humanities Belonging:\\n Metaphors and Methods for Including Undergraduate Research in DH\\n Communities, 2017: v11 n3
Emily Christina Murphy, Queen\\'s University; Shannon R. Smith, Bader International Study Centre, Herstmonceux Castle, Queen\\'s University
The Underside of the Digital Field, 2012: v6 n2
Terry Harpold, University of Florida
Unraveling reported dreams with text\\n analytics, 2017: v11 n4
Iris Hendrickx, Centre for Language Studies, Radboud University, Nijmegen, The Netherlands; Centre for Language and Speech Technology, Radboud University, Nijmegen, The Netherlands; Louis Onrust, Centre for Language Studies, Radboud University, Nijmegen, The Netherlands; Florian Kunneman, Centre for Language Studies, Radboud University, Nijmegen, The Netherlands; Ali H\\xc3\\xbcrriyeto\\xc4\\x9flu, Centre for Language Studies, Radboud University, Nijmegen, The Netherlands; Wessel Stoop, Centre for Language and Speech Technology, Radboud University, Nijmegen, The Netherlands; Antal van den Bosch, Meertens Institute, Amsterdam, The Netherlands

V

\\n Video-gaming, Paradise Lost and TCP/IP: an Oral History Conversation between Ray Siemens and Anne Welsh\\n , 2012: v6 n3
Ray Siemens, University of Victoria; Anne Welsh, University College London; Julianne Nyhan, University College London; Jessica Salmon, University of Trier
A View from IT, 2011: v5 n3
James Smithies, University of Canterbury, New Zealand
\\n A Visual Sense is Born in the Fingertips: Towards a Digital Ekphrasis, 2013: v7 n1
Cecilia Lindh\\xc3\\xa9, Ume\\xc3\\xa5 University
Visualizing and Analyzing the Hollywood Screenplay with\\n ScripThreads, 2014: v8 n4
Eric Hoyt, University of Wisconsin-Madison; Kevin Ponto, University of Wisconsin-Madison; Carrie Roy, University of Wisconsin-Madison
Visualizing Theatrical Text: From Watching the Script to the\\n Simulated Environment for Theatre (SET), 2013: v7 n3
Jennifer Roberts-Smith, University of Waterloo; Shawn DeSouza-Coelho, University of Waterloo; Teresa M. Dobson, University of British Columbia; Sandra Gabriele, York University; Omar Rodriguez-Arenas, University of Alberta; Stan Ruecker, Illinois Institute of Technology; St\\xc3\\xa9fan Sinclair, McGill University; Annmarie Akong, York University; Matt Bouchard, University of Alberta; Marcelo Hong, York University; Diane Jakacki, Bucknell University; David Lam, University of Waterloo; Alexandra Kovacs, University of Toronto; Lesley Northam, University of Waterloo; Daniel So, York University
Vive la Diff\\xc3\\xa9rence! Text Mining Gender Difference in French Literature\\n , 2009: v3 n2
Shlomo Argamon, Linguistic Cognition Lab, Dept. of Computer Science, Illinois Institute of Technology; Jean-Baptiste Goulain, Linguistic Cognition Lab, Dept. of Computer Science, Illinois Institute of Technology; Russell Horton, Digital Library Development Center, University of Chicago; Mark Olsen, ARTFL Project, University of Chicago

W

War in Parliament: What a Digital Approach Can Add to the\\n Study of Parliamentary History , 2014: v8 n1
Hinke Piersma, NIOD Institute for War, Holocaust and Genocide Studies; Ismee Tames, NIOD Institute for War, Holocaust and Genocide Studies; Lars Buitinck, Informatics Institute, University of Amsterdam; Johan van Doornik, Informatics Institute, University of Amsterdam; Maarten Marx, Informatics Institute, University of Amsterdam
Web 2.0 and the Ontology of the Digital, 2012: v6 n2
Aden Evens, Dartmouth College
Webbots and Machinic Agency, 2012: v6 n2
John Johnston, Emory University
Welcome to Digital Humanities Quarterly, 2007: v1 n1
Julia Flanders, Brown University; Wendell Piez, Mulberry Technologies, Inc.; Melissa Terras, University College London
What can the digital humanities learn from feminist game\\n studies?, 2015: v9 n2
Elizabeth Losh, University of California, San Diego
What Your Teacher Told You is True: Latin Verbs Have Four Principal Parts , 2009: v3 n1
Raphael Finkel, University of Kentucky; Gregory Stump, University of Kentucky
The Why and How of Middleware, 2016: v10 n2
Johanna Drucker, UC Los Angeles; Patrik BO Svensson, Ume\\xc3\\xa5 University
Winesburg, Ohio: A Modernist Kluge, 2011: v5 n2
Molly Gage, University of Minnesota
Word Processor Art: How User-friendly Inhibits\\n Creativity, 2016: v10 n1
Ren\\xc3\\xa9e Farrar, United States Military Academy at West Point
Words, Patterns and Documents: Experiments in Machine Learning and Text\\n Analysis, 2009: v3 n2
Shlomo Argamon, Linguistic Cognition Lab, Dept. of Computer Science, Illinois Institute of Technology; Mark Olsen, ARTFL Project, University of Chicago
The Writeprints of Man: a Stylometric Study of\\n Lafayette\\'s Hand in Paine\\'s \\'Rights of\\n Man\\', 2018: v12 n1
Richard Forsyth, Independent Researcher; David Holmes, Department of Statistics, George Mason University
Writing to be Found and Writing Readers, 2011: v5 n3
John Cayley, Brown University

X

XML, Interoperability and the Social Construction of Markup Languages: The\\n Library Example, 2009: v3 n3
Jerome McDonough, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Xpos\\xe2\\x80\\x99re: A Tool for Rich Internet Publications, 2014: v8 n2
Leen Breure, Utrecht University & DANS (Dutch Data Archive); Maarten Hoogerwerf, DANS (Dutch Data Archive); Ren\\xc3\\xa9 van Horik, DANS (Dutch Data Archive)

Y

Z

\\n URL: http://www.digitalhumanities.org/dhq/index/title.html
Last updated:\\n
Comments: dhqinfo@digitalhumanities.org
Published by:\\n The Alliance of Digital Humanities Organizations
Affiliated with: Literary and Linguistic Computing
Copyright 2005 -
\"Creative
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.\\n
'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import urllib.request\n", "\n", "urlRoot = \"http://digitalhumanities.org\"\n", "titlesUrl = \"http://digitalhumanities.org/dhq/index/title.html\" # define the URL\n", "titlesSource = urllib.request.urlopen(titlesUrl).read() # fetch the source from the URL\n", "titlesSource # preview the contents from the URL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now have the page's HTML source. The next step is to find the list of titles in the document." ] }, { "cell_type": "code", "execution_count": 105, "metadata": {}, "outputs": [], "source": [ "from bs4 import BeautifulSoup\n", "\n", "titlesSoup = BeautifulSoup(titlesSource, \"html.parser\") # parse the source document with Python's built-in HTML parser\n", "titleIndex = titlesSoup.find(id=\"titleIndex\") # find the tag with ID=titleIndex" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next step will be to compile a list of URLs, but before we do that we'll take a detour to explain lists and loops in Python.\n", "\n", "A list in Python is an ordered collection of elements, sometimes called an array in other languages. A list can hold just about any data type, including strings, numbers, objects, and even other lists.\n", "\n", "One very easy way to create a list in Python is to use square brackets." ] }, { "cell_type": "code", "execution_count": 106, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 3, 4, 5]" ] }, "execution_count": 106, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers = [0, 1, 2, 3, 4, 5] # new list of numbers\n", "numbers # preview the list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's say we want to do something with each element in the list; that's called iterating or looping. In Python, as with many languages, there's the concept of a [for loop](https://docs.python.org/3/tutorial/controlflow.html?highlight=loop#for-statements). 
In this example we'll multiply each of our single digits by ten." ] }, { "cell_type": "code", "execution_count": 107, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 10, 20, 30, 40, 50]" ] }, "execution_count": 107, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tens = [] # create an empty list for our tens\n", "for number in numbers: # for each number in our numbers list\n", "    tens.append(number*10) # add to our tens list the number times ten\n", "tens # preview the output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another useful concept is that of the [conditional structure](http://en.wikibooks.org/wiki/Python_Programming/Conditional_Statements) in Python, where we test for a boolean value (true or false).\n", "\n", "Python uses a colon and indentation to indicate the parts of the conditional block. If we want to execute a block when a condition evaluates to true (like ```1 < 5```, one _is_ smaller than five):\n", "\n", "    if _condition_:\n", "        _block_\n", "\n", "Or if we want to execute a block when a condition is not true (like ```1 > 5```, one _is not_ greater than five):\n", "\n", "    if *not* _condition_:\n", "        _block_\n", "\n", "\n", "Using this syntax we can determine whether one string is in another." ] }, { "cell_type": "code", "execution_count": 108, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "dhq is in http://digitalhumanities.org/dhq/index/title.html\n", "qhd is not in http://digitalhumanities.org/dhq/index/title.html\n" ] } ], "source": [ "if \"dhq\" in titlesUrl:\n", "    print(\"dhq is in\", titlesUrl)\n", "if \"qhd\" not in titlesUrl:\n", "    print(\"qhd is not in\", titlesUrl)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we've introduced lists, conditionals and the \"in\" operator we can understand the code below more easily. 
This code uses the BeautifulSoup [find_all](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find-all) method to create a list of all the `<a>` (anchor or link) tags in the title index, then keeps only the links that point to articles." ] }, { "cell_type": "code", "execution_count": 109, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['/dhq/vol/12/1/000354/000354.html',\n", " '/dhq/vol/4/1/000084/000084.html',\n", " '/dhq/vol/3/1/000021/000021.html',\n", " '/dhq/vol/8/4/000187/000187.html',\n", " '/dhq/vol/7/1/000142/000142.html',\n", " '/dhq/vol/8/4/000195/000195.html',\n", " '/dhq/vol/11/3/000318/000318.html',\n", " '/dhq/vol/1/2/000010/000010.html',\n", " '/dhq/vol/12/1/000372/000372.html',\n", " '/dhq/vol/10/1/000240/000240.html',\n", " '/dhq/vol/12/1/000367/000367.html',\n", " '/dhq/vol/12/1/000367/000367.html',\n", " '/dhq/vol/10/2/000250/000250.html',\n", " '/dhq/vol/1/2/000011/000011.html',\n", " '/dhq/vol/10/3/000267/000267.html',\n", " '/dhq/vol/9/3/ 000227 / 000227 .html',\n", " '/dhq/vol/10/4/000269/000269.html',\n", " '/dhq/vol/9/2/000213/000213.html',\n", " '/dhq/vol/2/1/000019/000019.html',\n", " '/dhq/vol/10/4/000271/000271.html',\n", " '/dhq/vol/11/2/000309/000309.html',\n", " '/dhq/vol/5/3/000100/000100.html',\n", " '/dhq/vol/3/3/000066/000066.html',\n", " '/dhq/vol/7/2/000158/000158.html',\n", " '/dhq/vol/5/1/000101/000101.html',\n", " '/dhq/vol/8/3/000189/000189.html',\n", " '/dhq/vol/9/4/000234/000234.html',\n", " '/dhq/vol/12/1/000364/000364.html',\n", " '/dhq/vol/8/4/000196/000196.html',\n", " '/dhq/vol/6/2/000118/000118.html',\n", " '/dhq/vol/9/2/000208/000208.html',\n", " '/dhq/vol/12/2/000376/000376.html',\n", " '/dhq/vol/7/1/000140/000140.html',\n", " '/dhq/vol/11/3/000306/000306.html',\n", " '/dhq/vol/11/3/000310/000310.html',\n", " '/dhq/vol/6/2/000125/000125.html',\n", " '/dhq/vol/6/2/000136/000136.html',\n", " '/dhq/vol/8/2/000179/000179.html',\n", " '/dhq/vol/9/1/000204/000204.html',\n", " '/dhq/vol/8/2/000181/000181.html',\n", " '/dhq/vol/12/1/000347/000347.html',\n", " 
'/dhq/vol/12/1/000347/000347.html',\n", " '/dhq/vol/3/3/000061/000061.html',\n", " '/dhq/vol/10/3/000258/000258.html',\n", " '/dhq/vol/3/1/000028/000028.html',\n", " '/dhq/vol/12/1/000352/000352.html',\n", " '/dhq/vol/3/1/000034/000034.html',\n", " '/dhq/vol/9/1/000207/000207.html',\n", " '/dhq/vol/11/4/000338/000338.html',\n", " '/dhq/vol/7/1/000157/000157.html',\n", " '/dhq/vol/12/1/000351/000351.html',\n", " '/dhq/vol/6/3/000133/000133.html',\n", " '/dhq/vol/11/3/000320/000320.html',\n", " '/dhq/vol/6/1/000117/000117.html',\n", " '/dhq/vol/3/2/000046/000046.html',\n", " '/dhq/vol/9/3/000237/000237.html',\n", " '/dhq/vol/11/2/000298/000298.html',\n", " '/dhq/vol/10/4/000278/000278.html',\n", " '/dhq/vol/3/1/000033/000033.html',\n", " '/dhq/vol/12/1/000368/000368.html',\n", " '/dhq/vol/8/1/000170/000170.html',\n", " '/dhq/vol/8/3/000185/000185.html',\n", " '/dhq/vol/3/1/000035/000035.html',\n", " '/dhq/vol/2/1/000018/000018.html',\n", " '/dhq/vol/3/4/000069/000069.html',\n", " '/dhq/vol/12/1/000346/000346.html',\n", " '/dhq/vol/11/4/000350/000350.html',\n", " '/dhq/vol/10/3/000257/000257.html',\n", " '/dhq/vol/11/4/000339/000339.html',\n", " '/dhq/vol/4/1/000081/000081.html',\n", " '/dhq/vol/9/3/ 000221 / 000221 .html',\n", " '/dhq/vol/12/4/000403/000403.html',\n", " '/dhq/vol/7/1/000153/000153.html',\n", " '/dhq/vol/12/1/000365/000365.html',\n", " '/dhq/vol/12/1/000365/000365.html',\n", " '/dhq/vol/11/1/000282/000282.html',\n", " '/dhq/vol/7/2/000159/000159.html',\n", " '/dhq/vol/8/4/000194/000194.html',\n", " '/dhq/vol/7/1/000149/000149.html',\n", " '/dhq/vol/3/1/000023/000023.html',\n", " '/dhq/vol/9/2/000215/000215.html',\n", " '/dhq/vol/9/3/000222/000222.html',\n", " '/dhq/vol/9/1/000203/000203.html',\n", " '/dhq/vol/7/1/000150/000150.html',\n", " '/dhq/vol/12/4/000401/000401.html',\n", " '/dhq/vol/8/2/000173/000173.html',\n", " '/dhq/vol/6/2/000123/000123.html',\n", " '/dhq/vol/3/2/000049/000049.html',\n", " '/dhq/vol/3/3/000067/000067.html',\n", " 
'/dhq/vol/4/2/000083/000083.html',\n", " '/dhq/vol/10/4/000275/000275.html',\n", " '/dhq/vol/7/1/000114/000114.html',\n", " '/dhq/vol/12/1/000354/000354.html',\n", " '/dhq/vol/11/3/000312/000312.html',\n", " '/dhq/vol/11/4/000326/000326.html',\n", " '/dhq/vol/1/2/000013/000013.html',\n", " '/dhq/vol/11/2/000297/000297.html',\n", " '/dhq/vol/8/3/000182/000182.html',\n", " '/dhq/vol/11/3/000335/000335.html',\n", " '/dhq/vol/3/1/000029/000029.html',\n", " '/dhq/vol/4/1/000082/000082.html',\n", " '/dhq/vol/3/4/000077/000077.html',\n", " '/dhq/vol/10/4/000277/000277.html',\n", " '/dhq/vol/12/1/000353/000353.html',\n", " '/dhq/vol/3/1/000031/000031.html',\n", " '/dhq/vol/9/4/ 000232 / 000232 .html',\n", " '/dhq/vol/8/2/000178/000178.html',\n", " '/dhq/vol/7/1/000147/000147.html',\n", " '/dhq/vol/10/2/000256/000256.html',\n", " '/dhq/vol/11/3/000303/000303.html',\n", " '/dhq/vol/8/1/000172/000172.html',\n", " '/dhq/vol/3/4/000079/000079.html',\n", " '/dhq/vol/7/3/000167/000167.html',\n", " '/dhq/vol/10/4/000270/000270.html',\n", " '/dhq/vol/5/3/000099/000099.html',\n", " '/dhq/vol/10/2/000253/000253.html',\n", " '/dhq/vol/11/3/000325/000325.html',\n", " '/dhq/vol/5/3/000106/000106.html',\n", " '/dhq/vol/11/1/000276/000276.html',\n", " '/dhq/vol/12/1/000362/000362.html',\n", " '/dhq/vol/3/1/000027/000027.html',\n", " '/dhq/vol/12/2/000393/000393.html',\n", " '/dhq/vol/12/2/000389/000389.html',\n", " '/dhq/vol/6/2/000129/000129.html',\n", " '/dhq/vol/9/1/000206/000206.html',\n", " '/dhq/vol/3/2/000037/000037.html',\n", " '/dhq/vol/3/4/000074/000074.html',\n", " '/dhq/vol/3/3/000053/000053.html',\n", " '/dhq/vol/12/1/000345/000345.html',\n", " '/dhq/vol/1/1/000004/000004.html',\n", " '/dhq/vol/7/1/000152/000152.html',\n", " '/dhq/vol/1/2/000012/000012.html',\n", " '/dhq/vol/3/3/000051/000051.html',\n", " '/dhq/vol/11/2/000300/000300.html',\n", " '/dhq/vol/9/2/000202/000202.html',\n", " '/dhq/vol/12/1/000364/000364.html',\n", " '/dhq/vol/11/3/000361/000361.html',\n", " 
'/dhq/vol/6/1/000112/000112.html',\n", " '/dhq/vol/3/1/000030/000030.html',\n", " '/dhq/vol/8/2/000180/000180.html',\n", " '/dhq/vol/3/4/000071/000071.html',\n", " '/dhq/vol/10/2/000246/000246.html',\n", " '/dhq/vol/10/3/000262/000262.html',\n", " '/dhq/vol/11/2/000296/000296.html',\n", " '/dhq/vol/10/2/000255/000255.html',\n", " '/dhq/vol/3/1/000026/000026.html',\n", " '/dhq/vol/7/2/000164/000164.html',\n", " '/dhq/vol/11/1/000279/000279.html',\n", " '/dhq/vol/3/1/000036/000036.html',\n", " '/dhq/vol/5/3/000098/000098.html',\n", " '/dhq/vol/11/2/000307/000307.html',\n", " '/dhq/vol/11/2/000308/000308.html',\n", " '/dhq/vol/11/3/000324/000324.html',\n", " '/dhq/vol/12/2/000386/000386.html',\n", " '/dhq/vol/12/2/000387/000387.html',\n", " '/dhq/vol/10/2/000242/000242.html',\n", " '/dhq/vol/5/1/000090/000090.html',\n", " '/dhq/vol/10/1/000236/000236.html',\n", " '/dhq/vol/6/2/000124/000124.html',\n", " '/dhq/vol/3/2/000043/000043.html',\n", " '/dhq/vol/11/2/000317/000317.html',\n", " '/dhq/vol/9/1/000205/000205.html',\n", " '/dhq/vol/12/1/000353/000353.html',\n", " '/dhq/vol/11/3/000330/000330.html',\n", " '/dhq/vol/12/3/000395/000395.html',\n", " '/dhq/vol/11/1/000283/000283.html',\n", " '/dhq/vol/5/2/000096/000096.html',\n", " '/dhq/vol/9/4/000218/000218.html',\n", " '/dhq/vol/6/2/000139/000139.html',\n", " '/dhq/vol/3/4/000078/000078.html',\n", " '/dhq/vol/11/2/000294/000294.html',\n", " '/dhq/vol/4/2/000086/000086.html',\n", " '/dhq/vol/10/3/000260/000260.html',\n", " '/dhq/vol/3/3/000054/000054.html',\n", " '/dhq/vol/12/1/000388/000388.html',\n", " '/dhq/vol/12/3/000398/000398.html',\n", " '/dhq/vol/11/2/000360/000360.html',\n", " '/dhq/vol/5/1/000091/000091.html',\n", " '/dhq/vol/3/3/000065/000065.html',\n", " '/dhq/vol/2/1/000016/000016.html',\n", " '/dhq/vol/9/1/000198/000198.html',\n", " '/dhq/vol/11/1/000284/000284.html',\n", " '/dhq/vol/7/1/000155/000155.html',\n", " '/dhq/vol/12/2/000390/000390.html',\n", " '/dhq/vol/5/1/000095/000095.html',\n", " 
'/dhq/vol/11/4/000341/000341.html',\n", " '/dhq/vol/6/2/000138/000138.html',\n", " '/dhq/vol/8/1/000171/000171.html',\n", " '/dhq/vol/10/4/000265/000265.html',\n", " '/dhq/vol/9/2/000186/000186.html',\n", " '/dhq/vol/11/3/000336/000336.html',\n", " '/dhq/vol/9/3/000193/000193.html',\n", " '/dhq/vol/1/1/000002/000002.html',\n", " '/dhq/vol/12/1/000345/000345.html',\n", " '/dhq/vol/11/4/000313/000313.html',\n", " '/dhq/vol/1/1/000008/000008.html',\n", " '/dhq/vol/3/3/000050/000050.html',\n", " '/dhq/vol/9/02/000217/000217.html',\n", " '/dhq/vol/12/2/000396/000396.html',\n", " '/dhq/vol/10/1/\\n 000241\\n /\\n 000241\\n .html',\n", " '/dhq/vol/9/4/000210/000210.html',\n", " '/dhq/vol/9/4/000230/000230.html',\n", " '/dhq/vol/10/3/000261/000261.html',\n", " '/dhq/vol/3/4/000075/000075.html',\n", " '/dhq/vol/3/2/000039/000039.html',\n", " '/dhq/vol/8/3/000188/000188.html',\n", " '/dhq/vol/9/3/000223/000223.html',\n", " '/dhq/vol/7/1/000115/000115.html',\n", " '/dhq/vol/12/1/000370/000370.html',\n", " '/dhq/vol/11/1/000281/000281.html',\n", " '/dhq/vol/12/1/000346/000346.html',\n", " '/dhq/vol/12/1/000352/000352.html',\n", " '/dhq/vol/4/1/000080/000080.html',\n", " '/dhq/vol/10/4/000259/000259.html',\n", " '/dhq/vol/12/2/000391/000391.html',\n", " '/dhq/vol/3/2/000038/000038.html',\n", " '/dhq/vol/12/1/000362/000362.html',\n", " '/dhq/vol/10/2/000245/000245.html',\n", " '/dhq/vol/12/1/000357/000357.html',\n", " '/dhq/vol/12/1/000363/000363.html',\n", " '/dhq/vol/6/1/000110/000110.html',\n", " '/dhq/vol/7/1/000154/000154.html',\n", " '/dhq/vol/10/3/000266/000266.html',\n", " '/dhq/vol/3/4/000076/000076.html',\n", " '/dhq/vol/11/3/000315/000315.html',\n", " '/dhq/vol/6/2/000128/000128.html',\n", " '/dhq/vol/4/1/000087/000087.html',\n", " '/dhq/vol/10/4/000268/000268.html',\n", " '/dhq/vol/12/3/000397/000397.html',\n", " '/dhq/vol/10/4/000272/000272.html',\n", " '/dhq/vol/12/4/000404/000404.html',\n", " '/dhq/vol/3/4/000073/000073.html',\n", " 
'/dhq/vol/9/2/000216/000216.html',\n", " '/dhq/vol/8/1/000174/000174.html',\n", " '/dhq/vol/12/2/000374/000374.html',\n", " '/dhq/vol/3/3/000057/000057.html',\n", " '/dhq/vol/6/1/000113/000113.html',\n", " '/dhq/vol/9/4/000212/000212.html',\n", " '/dhq/vol/3/3/000058/000058.html',\n", " '/dhq/vol/11/3/000332/000332.html',\n", " '/dhq/vol/11/3/000337/000337.html',\n", " '/dhq/vol/12/2/000378/000378.html',\n", " '/dhq/vol/11/3/000329/000329.html',\n", " '/dhq/vol/3/2/000044/000044.html',\n", " '/dhq/vol/11/4/000343/000343.html',\n", " '/dhq/vol/11/2/000295/000295.html',\n", " '/dhq/vol/8/1/000168/000168.html',\n", " '/dhq/vol/10/3/000263/000263.html',\n", " '/dhq/vol/5/2/000092/000092.html',\n", " '/dhq/vol/7/1/000145/000145.html',\n", " '/dhq/vol/11/3/000302/000302.html',\n", " '/dhq/vol/12/1/000357/000357.html',\n", " '/dhq/vol/9/3/000214/000214.html',\n", " '/dhq/vol/8/1/000175/000175.html',\n", " '/dhq/vol/8/3/000191/000191.html',\n", " '/dhq/vol/5/2/000094/000094.html',\n", " '/dhq/vol/4/2/000088/000088.html',\n", " '/dhq/vol/11/3/000304/000304.html',\n", " '/dhq/vol/5/3/000102/000102.html',\n", " '/dhq/vol/5/3/000103/000103.html',\n", " '/dhq/vol/5/3/000105/000105.html',\n", " '/dhq/vol/7/2/000163/000163.html',\n", " '/dhq/vol/10/4/000280/000280.html',\n", " '/dhq/vol/11/2/000288/000288.html',\n", " '/dhq/vol/11/3/000333/000333.html',\n", " '/dhq/vol/6/2/000137/000137.html',\n", " '/dhq/vol/3/4/000068/000068.html',\n", " '/dhq/vol/11/3/000316/000316.html',\n", " '/dhq/vol/11/4/000340/000340.html',\n", " '/dhq/vol/9/2/000211/000211.html',\n", " '/dhq/vol/3/3/000062/000062.html',\n", " '/dhq/vol/11/2/000290/000290.html',\n", " '/dhq/vol/11/1/000274/000274.html',\n", " '/dhq/vol/11/3/000323/000323.html',\n", " '/dhq/vol/7/1/000143/000143.html',\n", " '/dhq/vol/6/1/000111/000111.html',\n", " '/dhq/vol/1/1/000001/000001.html',\n", " '/dhq/vol/3/3/000052/000052.html',\n", " '/dhq/vol/12/4/000406/000406.html',\n", " '/dhq/vol/7/3/000165/000165.html',\n", " 
'/dhq/vol/11/3/000331/000331.html',\n", " '/dhq/vol/3/3/000059/000059.html',\n", " '/dhq/vol/6/3/000132/000132.html',\n", " '/dhq/vol/3/4/000070/000070.html',\n", " '/dhq/vol/12/1/000359/000359.html',\n", " '/dhq/vol/12/1/000359/000359.html',\n", " '/dhq/vol/12/1/000349/000349.html',\n", " '/dhq/vol/12/2/000377/000377.html',\n", " '/dhq/vol/10/3/000264/000264.html',\n", " '/dhq/vol/3/3/000055/000055.html',\n", " '/dhq/vol/3/2/000040/000040.html',\n", " '/dhq/vol/12/1/000363/000363.html',\n", " '/dhq/vol/6/3/000134/000134.html',\n", " '/dhq/vol/10/4/000273/000273.html',\n", " '/dhq/vol/3/3/000063/000063.html',\n", " '/dhq/vol/10/2/ 000247 / 000247 .html',\n", " '/dhq/vol/5/3/000108/000108.html',\n", " '/dhq/vol/12/2/000394/000394.html',\n", " '/dhq/vol/1/1/000005/000005.html',\n", " '/dhq/vol/8/4/000197/000197.html',\n", " '/dhq/vol/11/4/000356/000356.html',\n", " '/dhq/vol/12/1/000348/000348.html',\n", " '/dhq/vol/11/2/000292/000292.html',\n", " '/dhq/vol/12/2/000379/000379.html',\n", " '/dhq/vol/12/1/000348/000348.html',\n", " '/dhq/vol/12/3/000384/000384.html',\n", " '/dhq/vol/11/3/000321/000321.html',\n", " '/dhq/vol/3/3/000056/000056.html',\n", " '/dhq/vol/12/4/000400/000400.html',\n", " '/dhq/vol/12/3/000399/000399.html',\n", " '/dhq/vol/12/1/000355/000355.html',\n", " '/dhq/vol/12/1/000355/000355.html',\n", " '/dhq/vol/7/1/000148/000148.html',\n", " '/dhq/vol/12/2/000380/000380.html',\n", " '/dhq/vol/8/2/000177/000177.html',\n", " '/dhq/vol/3/2/000048/000048.html',\n", " '/dhq/vol/10/2/000243/000243.html',\n", " '/dhq/vol/7/2/000160/000160.html',\n", " '/dhq/vol/4/2/000085/000085.html',\n", " '/dhq/vol/2/1/000017/000017.html',\n", " '/dhq/vol/1/2/000014/000014.html',\n", " '/dhq/vol/12/3/000385/000385.html',\n", " '/dhq/vol/3/1/000022/000022.html',\n", " '/dhq/vol/11/4/000358/000358.html',\n", " '/dhq/vol/11/4/000328/000328.html',\n", " '/dhq/vol/10/1/000231/000231.html',\n", " '/dhq/vol/9/4/000225/000225.html',\n", " '/dhq/vol/3/4/000072/000072.html',\n", " 
'/dhq/vol/8/3/000183/000183.html',\n", " '/dhq/vol/11/2/000289/000289.html',\n", " '/dhq/vol/9/2/000201/000201.html',\n", " '/dhq/vol/6/3/000130/000130.html',\n", " '/dhq/vol/3/3/000060/000060.html',\n", " '/dhq/vol/10/3/000244/000244.html',\n", " '/dhq/vol/8/3/000184/000184.html',\n", " '/dhq/vol/11/2/000293/000293.html',\n", " '/dhq/vol/2/1/000020/000020.html',\n", " '/dhq/vol/1/2/000009/000009.html',\n", " '/dhq/vol/10/1/000239/000239.html',\n", " '/dhq/vol/6/2/000119/000119.html',\n", " '/dhq/vol/7/1/000146/000146.html',\n", " '/dhq/vol/12/1/000373/000373.html',\n", " '/dhq/vol/11/3/000311/000311.html',\n", " '/dhq/vol/12/2/000382/000382.html',\n", " '/dhq/vol/6/2/000122/000122.html',\n", " '/dhq/vol/11/1/000286/000286.html',\n", " '/dhq/vol/10/1/000233/000233.html',\n", " '/dhq/vol/9/2/000219/000219.html',\n", " '/dhq/vol/10/1/000228/000228.html',\n", " '/dhq/vol/11/4/000327/000327.html',\n", " '/dhq/vol/3/1/000025/000025.html',\n", " '/dhq/vol/10/1/000235/000235.html',\n", " '/dhq/vol/7/1/000144/000144.html',\n", " '/dhq/vol/11/3/000319/000319.html',\n", " '/dhq/vol/3/2/000047/000047.html',\n", " '/dhq/vol/11/3/000314/000314.html',\n", " '/dhq/vol/10/2/000252/000252.html',\n", " '/dhq/vol/2/1/000015/000015.html',\n", " '/dhq/vol/3/1/000024/000024.html',\n", " '/dhq/vol/1/1/000006/000006.html',\n", " '/dhq/vol/3/2/000045/000045.html',\n", " '/dhq/vol/9/1/000199/000199.html',\n", " '/dhq/vol/9/3/000224/000224.html',\n", " '/dhq/vol/10/1/ 000229 / 000229 .html',\n", " '/dhq/vol/7/3/000162/000162.html',\n", " '/dhq/vol/5/2/000097/000097.html',\n", " '/dhq/vol/12/01/000369/000369.html',\n", " '/dhq/vol/12/2/000383/000383.html',\n", " '/dhq/vol/11/2/000285/000285.html',\n", " '/dhq/vol/11/2/000291/000291.html',\n", " '/dhq/vol/12/1/000349/000349.html',\n", " '/dhq/vol/10/2/000249/000249.html',\n", " '/dhq/vol/6/2/000127/000127.html',\n", " '/dhq/vol/10/2/000254/000254.html',\n", " '/dhq/vol/6/2/000121/000121.html',\n", " '/dhq/vol/11/3/000322/000322.html',\n", " 
'/dhq/vol/11/4/000344/000344.html',\n", " '/dhq/vol/6/3/000135/000135.html',\n", " '/dhq/vol/9/2/000209/000209.html',\n", " '/dhq/vol/7/2/000116/000116.html',\n", " '/dhq/vol/10/2/000251/000251.html',\n", " '/dhq/vol/4/2/000089/000089.html',\n", " '/dhq/vol/7/1/000151/000151.html',\n", " '/dhq/vol/9/4/000220/000220.html',\n", " '/dhq/vol/11/2/000299/000299.html',\n", " '/dhq/vol/11/3/000305/000305.html',\n", " '/dhq/vol/6/2/000141/000141.html',\n", " '/dhq/vol/11/4/000342/000342.html',\n", " '/dhq/vol/12/2/000392/000392.html',\n", " '/dhq/vol/8/4/000192/000192.html',\n", " '/dhq/vol/6/3/000131/000131.html',\n", " '/dhq/vol/5/3/000107/000107.html',\n", " '/dhq/vol/7/1/000161/000161.html',\n", " '/dhq/vol/8/4/000190/000190.html',\n", " '/dhq/vol/7/3/000166/000166.html',\n", " '/dhq/vol/3/2/000042/000042.html',\n", " '/dhq/vol/8/1/000176/000176.html',\n", " '/dhq/vol/6/2/000120/000120.html',\n", " '/dhq/vol/6/2/000126/000126.html',\n", " '/dhq/vol/1/1/000003/000003.html',\n", " '/dhq/vol/1/1/000007/000007.html',\n", " '/dhq/vol/9/2/000200/000200.html',\n", " '/dhq/vol/3/1/000032/000032.html',\n", " '/dhq/vol/7/1/000156/000156.html',\n", " '/dhq/vol/11/1/000287/000287.html',\n", " '/dhq/vol/10/2/000248/000248.html',\n", " '/dhq/vol/5/2/000093/000093.html',\n", " '/dhq/vol/10/1/000238/000238.html',\n", " '/dhq/vol/3/2/000041/000041.html',\n", " '/dhq/vol/12/1/000371/000371.html',\n", " '/dhq/vol/9/4/000226/000226.html',\n", " '/dhq/vol/5/3/000104/000104.html',\n", " '/dhq/vol/3/3/000064/000064.html',\n", " '/dhq/vol/8/2/000169/000169.html']" ] }, "execution_count": 109, "metadata": {}, "output_type": "execute_result" } ], "source": [ "urls = [] # create a list of URLs\n", "anchors = titleIndex.find_all(\"a\") # create a list from finding all a (link) tags\n", "for a in anchors: # loop through each element of the anchors list\n", " href = a['href'] # determine what the href (url) is for the a tag using list subscripting (explained later)\n", " if \"/dhq/vol/\" in href: # 
ensure that we have a proper-looking article URL (not an #anchor)\n", "        urls.append(href) # add the href to the urls list\n", "urls # preview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a digression, there's a more \"Pythonic\" way of doing things, which is to use a [list comprehension](https://docs.python.org/3/tutorial/datastructures.html?highlight=list%20comprehension#list-comprehensions). This syntax is very compact and convenient (though arguably not as legible). The entire looping code block above can be written in one line. The two approaches produce identical results (in this case) and you should be familiar with both syntaxes." ] }, { "cell_type": "code", "execution_count": 110, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['/dhq/vol/12/1/000354/000354.html',\n", " '/dhq/vol/4/1/000084/000084.html',\n", " '/dhq/vol/3/1/000021/000021.html',\n", " '/dhq/vol/8/4/000187/000187.html',\n", " '/dhq/vol/7/1/000142/000142.html',\n", " '/dhq/vol/8/4/000195/000195.html',\n", " '/dhq/vol/11/3/000318/000318.html',\n", " '/dhq/vol/1/2/000010/000010.html',\n", " '/dhq/vol/12/1/000372/000372.html',\n", " '/dhq/vol/10/1/000240/000240.html',\n", " '/dhq/vol/12/1/000367/000367.html',\n", " '/dhq/vol/12/1/000367/000367.html',\n", " '/dhq/vol/10/2/000250/000250.html',\n", " '/dhq/vol/1/2/000011/000011.html',\n", " '/dhq/vol/10/3/000267/000267.html',\n", " '/dhq/vol/9/3/ 000227 / 000227 .html',\n", " '/dhq/vol/10/4/000269/000269.html',\n", " '/dhq/vol/9/2/000213/000213.html',\n", " '/dhq/vol/2/1/000019/000019.html',\n", " '/dhq/vol/10/4/000271/000271.html',\n", " '/dhq/vol/11/2/000309/000309.html',\n", " '/dhq/vol/5/3/000100/000100.html',\n", " '/dhq/vol/3/3/000066/000066.html',\n", " '/dhq/vol/7/2/000158/000158.html',\n", " '/dhq/vol/5/1/000101/000101.html',\n", " '/dhq/vol/8/3/000189/000189.html',\n", " '/dhq/vol/9/4/000234/000234.html',\n", " '/dhq/vol/12/1/000364/000364.html',\n", " '/dhq/vol/8/4/000196/000196.html',\n", " 
'/dhq/vol/6/2/000118/000118.html',\n", " '/dhq/vol/9/2/000208/000208.html',\n", " '/dhq/vol/12/2/000376/000376.html',\n", " '/dhq/vol/7/1/000140/000140.html',\n", " '/dhq/vol/11/3/000306/000306.html',\n", " '/dhq/vol/11/3/000310/000310.html',\n", " '/dhq/vol/6/2/000125/000125.html',\n", " '/dhq/vol/6/2/000136/000136.html',\n", " '/dhq/vol/8/2/000179/000179.html',\n", " '/dhq/vol/9/1/000204/000204.html',\n", " '/dhq/vol/8/2/000181/000181.html',\n", " '/dhq/vol/12/1/000347/000347.html',\n", " '/dhq/vol/12/1/000347/000347.html',\n", " '/dhq/vol/3/3/000061/000061.html',\n", " '/dhq/vol/10/3/000258/000258.html',\n", " '/dhq/vol/3/1/000028/000028.html',\n", " '/dhq/vol/12/1/000352/000352.html',\n", " '/dhq/vol/3/1/000034/000034.html',\n", " '/dhq/vol/9/1/000207/000207.html',\n", " '/dhq/vol/11/4/000338/000338.html',\n", " '/dhq/vol/7/1/000157/000157.html',\n", " '/dhq/vol/12/1/000351/000351.html',\n", " '/dhq/vol/6/3/000133/000133.html',\n", " '/dhq/vol/11/3/000320/000320.html',\n", " '/dhq/vol/6/1/000117/000117.html',\n", " '/dhq/vol/3/2/000046/000046.html',\n", " '/dhq/vol/9/3/000237/000237.html',\n", " '/dhq/vol/11/2/000298/000298.html',\n", " '/dhq/vol/10/4/000278/000278.html',\n", " '/dhq/vol/3/1/000033/000033.html',\n", " '/dhq/vol/12/1/000368/000368.html',\n", " '/dhq/vol/8/1/000170/000170.html',\n", " '/dhq/vol/8/3/000185/000185.html',\n", " '/dhq/vol/3/1/000035/000035.html',\n", " '/dhq/vol/2/1/000018/000018.html',\n", " '/dhq/vol/3/4/000069/000069.html',\n", " '/dhq/vol/12/1/000346/000346.html',\n", " '/dhq/vol/11/4/000350/000350.html',\n", " '/dhq/vol/10/3/000257/000257.html',\n", " '/dhq/vol/11/4/000339/000339.html',\n", " '/dhq/vol/4/1/000081/000081.html',\n", " '/dhq/vol/9/3/ 000221 / 000221 .html',\n", " '/dhq/vol/12/4/000403/000403.html',\n", " '/dhq/vol/7/1/000153/000153.html',\n", " '/dhq/vol/12/1/000365/000365.html',\n", " '/dhq/vol/12/1/000365/000365.html',\n", " '/dhq/vol/11/1/000282/000282.html',\n", " '/dhq/vol/7/2/000159/000159.html',\n", " 
'/dhq/vol/8/4/000194/000194.html',\n", " '/dhq/vol/7/1/000149/000149.html',\n", " '/dhq/vol/3/1/000023/000023.html',\n", " '/dhq/vol/9/2/000215/000215.html',\n", " '/dhq/vol/9/3/000222/000222.html',\n", " '/dhq/vol/9/1/000203/000203.html',\n", " '/dhq/vol/7/1/000150/000150.html',\n", " '/dhq/vol/12/4/000401/000401.html',\n", " '/dhq/vol/8/2/000173/000173.html',\n", " '/dhq/vol/6/2/000123/000123.html',\n", " '/dhq/vol/3/2/000049/000049.html',\n", " '/dhq/vol/3/3/000067/000067.html',\n", " '/dhq/vol/4/2/000083/000083.html',\n", " '/dhq/vol/10/4/000275/000275.html',\n", " '/dhq/vol/7/1/000114/000114.html',\n", " '/dhq/vol/12/1/000354/000354.html',\n", " '/dhq/vol/11/3/000312/000312.html',\n", " '/dhq/vol/11/4/000326/000326.html',\n", " '/dhq/vol/1/2/000013/000013.html',\n", " '/dhq/vol/11/2/000297/000297.html',\n", " '/dhq/vol/8/3/000182/000182.html',\n", " '/dhq/vol/11/3/000335/000335.html',\n", " '/dhq/vol/3/1/000029/000029.html',\n", " '/dhq/vol/4/1/000082/000082.html',\n", " '/dhq/vol/3/4/000077/000077.html',\n", " '/dhq/vol/10/4/000277/000277.html',\n", " '/dhq/vol/12/1/000353/000353.html',\n", " '/dhq/vol/3/1/000031/000031.html',\n", " '/dhq/vol/9/4/ 000232 / 000232 .html',\n", " '/dhq/vol/8/2/000178/000178.html',\n", " '/dhq/vol/7/1/000147/000147.html',\n", " '/dhq/vol/10/2/000256/000256.html',\n", " '/dhq/vol/11/3/000303/000303.html',\n", " '/dhq/vol/8/1/000172/000172.html',\n", " '/dhq/vol/3/4/000079/000079.html',\n", " '/dhq/vol/7/3/000167/000167.html',\n", " '/dhq/vol/10/4/000270/000270.html',\n", " '/dhq/vol/5/3/000099/000099.html',\n", " '/dhq/vol/10/2/000253/000253.html',\n", " '/dhq/vol/11/3/000325/000325.html',\n", " '/dhq/vol/5/3/000106/000106.html',\n", " '/dhq/vol/11/1/000276/000276.html',\n", " '/dhq/vol/12/1/000362/000362.html',\n", " '/dhq/vol/3/1/000027/000027.html',\n", " '/dhq/vol/12/2/000393/000393.html',\n", " '/dhq/vol/12/2/000389/000389.html',\n", " '/dhq/vol/6/2/000129/000129.html',\n", " '/dhq/vol/9/1/000206/000206.html',\n", " 
'/dhq/vol/3/2/000037/000037.html',\n", " '/dhq/vol/3/4/000074/000074.html',\n", " '/dhq/vol/3/3/000053/000053.html',\n", " '/dhq/vol/12/1/000345/000345.html',\n", " '/dhq/vol/1/1/000004/000004.html',\n", " '/dhq/vol/7/1/000152/000152.html',\n", " '/dhq/vol/1/2/000012/000012.html',\n", " '/dhq/vol/3/3/000051/000051.html',\n", " '/dhq/vol/11/2/000300/000300.html',\n", " '/dhq/vol/9/2/000202/000202.html',\n", " '/dhq/vol/12/1/000364/000364.html',\n", " '/dhq/vol/11/3/000361/000361.html',\n", " '/dhq/vol/6/1/000112/000112.html',\n", " '/dhq/vol/3/1/000030/000030.html',\n", " '/dhq/vol/8/2/000180/000180.html',\n", " '/dhq/vol/3/4/000071/000071.html',\n", " '/dhq/vol/10/2/000246/000246.html',\n", " '/dhq/vol/10/3/000262/000262.html',\n", " '/dhq/vol/11/2/000296/000296.html',\n", " '/dhq/vol/10/2/000255/000255.html',\n", " '/dhq/vol/3/1/000026/000026.html',\n", " '/dhq/vol/7/2/000164/000164.html',\n", " '/dhq/vol/11/1/000279/000279.html',\n", " '/dhq/vol/3/1/000036/000036.html',\n", " '/dhq/vol/5/3/000098/000098.html',\n", " '/dhq/vol/11/2/000307/000307.html',\n", " '/dhq/vol/11/2/000308/000308.html',\n", " '/dhq/vol/11/3/000324/000324.html',\n", " '/dhq/vol/12/2/000386/000386.html',\n", " '/dhq/vol/12/2/000387/000387.html',\n", " '/dhq/vol/10/2/000242/000242.html',\n", " '/dhq/vol/5/1/000090/000090.html',\n", " '/dhq/vol/10/1/000236/000236.html',\n", " '/dhq/vol/6/2/000124/000124.html',\n", " '/dhq/vol/3/2/000043/000043.html',\n", " '/dhq/vol/11/2/000317/000317.html',\n", " '/dhq/vol/9/1/000205/000205.html',\n", " '/dhq/vol/12/1/000353/000353.html',\n", " '/dhq/vol/11/3/000330/000330.html',\n", " '/dhq/vol/12/3/000395/000395.html',\n", " '/dhq/vol/11/1/000283/000283.html',\n", " '/dhq/vol/5/2/000096/000096.html',\n", " '/dhq/vol/9/4/000218/000218.html',\n", " '/dhq/vol/6/2/000139/000139.html',\n", " '/dhq/vol/3/4/000078/000078.html',\n", " '/dhq/vol/11/2/000294/000294.html',\n", " '/dhq/vol/4/2/000086/000086.html',\n", " '/dhq/vol/10/3/000260/000260.html',\n", " 
'/dhq/vol/3/3/000054/000054.html',\n", " '/dhq/vol/12/1/000388/000388.html',\n", " '/dhq/vol/12/3/000398/000398.html',\n", " '/dhq/vol/11/2/000360/000360.html',\n", " '/dhq/vol/5/1/000091/000091.html',\n", " '/dhq/vol/3/3/000065/000065.html',\n", " '/dhq/vol/2/1/000016/000016.html',\n", " '/dhq/vol/9/1/000198/000198.html',\n", " '/dhq/vol/11/1/000284/000284.html',\n", " '/dhq/vol/7/1/000155/000155.html',\n", " '/dhq/vol/12/2/000390/000390.html',\n", " '/dhq/vol/5/1/000095/000095.html',\n", " '/dhq/vol/11/4/000341/000341.html',\n", " '/dhq/vol/6/2/000138/000138.html',\n", " '/dhq/vol/8/1/000171/000171.html',\n", " '/dhq/vol/10/4/000265/000265.html',\n", " '/dhq/vol/9/2/000186/000186.html',\n", " '/dhq/vol/11/3/000336/000336.html',\n", " '/dhq/vol/9/3/000193/000193.html',\n", " '/dhq/vol/1/1/000002/000002.html',\n", " '/dhq/vol/12/1/000345/000345.html',\n", " '/dhq/vol/11/4/000313/000313.html',\n", " '/dhq/vol/1/1/000008/000008.html',\n", " '/dhq/vol/3/3/000050/000050.html',\n", " '/dhq/vol/9/02/000217/000217.html',\n", " '/dhq/vol/12/2/000396/000396.html',\n", " '/dhq/vol/10/1/\\n 000241\\n /\\n 000241\\n .html',\n", " '/dhq/vol/9/4/000210/000210.html',\n", " '/dhq/vol/9/4/000230/000230.html',\n", " '/dhq/vol/10/3/000261/000261.html',\n", " '/dhq/vol/3/4/000075/000075.html',\n", " '/dhq/vol/3/2/000039/000039.html',\n", " '/dhq/vol/8/3/000188/000188.html',\n", " '/dhq/vol/9/3/000223/000223.html',\n", " '/dhq/vol/7/1/000115/000115.html',\n", " '/dhq/vol/12/1/000370/000370.html',\n", " '/dhq/vol/11/1/000281/000281.html',\n", " '/dhq/vol/12/1/000346/000346.html',\n", " '/dhq/vol/12/1/000352/000352.html',\n", " '/dhq/vol/4/1/000080/000080.html',\n", " '/dhq/vol/10/4/000259/000259.html',\n", " '/dhq/vol/12/2/000391/000391.html',\n", " '/dhq/vol/3/2/000038/000038.html',\n", " '/dhq/vol/12/1/000362/000362.html',\n", " '/dhq/vol/10/2/000245/000245.html',\n", " '/dhq/vol/12/1/000357/000357.html',\n", " '/dhq/vol/12/1/000363/000363.html',\n", " 
'/dhq/vol/6/1/000110/000110.html',\n", " '/dhq/vol/7/1/000154/000154.html',\n", " '/dhq/vol/10/3/000266/000266.html',\n", " '/dhq/vol/3/4/000076/000076.html',\n", " '/dhq/vol/11/3/000315/000315.html',\n", " '/dhq/vol/6/2/000128/000128.html',\n", " '/dhq/vol/4/1/000087/000087.html',\n", " '/dhq/vol/10/4/000268/000268.html',\n", " '/dhq/vol/12/3/000397/000397.html',\n", " '/dhq/vol/10/4/000272/000272.html',\n", " '/dhq/vol/12/4/000404/000404.html',\n", " '/dhq/vol/3/4/000073/000073.html',\n", " '/dhq/vol/9/2/000216/000216.html',\n", " '/dhq/vol/8/1/000174/000174.html',\n", " '/dhq/vol/12/2/000374/000374.html',\n", " '/dhq/vol/3/3/000057/000057.html',\n", " '/dhq/vol/6/1/000113/000113.html',\n", " '/dhq/vol/9/4/000212/000212.html',\n", " '/dhq/vol/3/3/000058/000058.html',\n", " '/dhq/vol/11/3/000332/000332.html',\n", " '/dhq/vol/11/3/000337/000337.html',\n", " '/dhq/vol/12/2/000378/000378.html',\n", " '/dhq/vol/11/3/000329/000329.html',\n", " '/dhq/vol/3/2/000044/000044.html',\n", " '/dhq/vol/11/4/000343/000343.html',\n", " '/dhq/vol/11/2/000295/000295.html',\n", " '/dhq/vol/8/1/000168/000168.html',\n", " '/dhq/vol/10/3/000263/000263.html',\n", " '/dhq/vol/5/2/000092/000092.html',\n", " '/dhq/vol/7/1/000145/000145.html',\n", " '/dhq/vol/11/3/000302/000302.html',\n", " '/dhq/vol/12/1/000357/000357.html',\n", " '/dhq/vol/9/3/000214/000214.html',\n", " '/dhq/vol/8/1/000175/000175.html',\n", " '/dhq/vol/8/3/000191/000191.html',\n", " '/dhq/vol/5/2/000094/000094.html',\n", " '/dhq/vol/4/2/000088/000088.html',\n", " '/dhq/vol/11/3/000304/000304.html',\n", " '/dhq/vol/5/3/000102/000102.html',\n", " '/dhq/vol/5/3/000103/000103.html',\n", " '/dhq/vol/5/3/000105/000105.html',\n", " '/dhq/vol/7/2/000163/000163.html',\n", " '/dhq/vol/10/4/000280/000280.html',\n", " '/dhq/vol/11/2/000288/000288.html',\n", " '/dhq/vol/11/3/000333/000333.html',\n", " '/dhq/vol/6/2/000137/000137.html',\n", " '/dhq/vol/3/4/000068/000068.html',\n", " '/dhq/vol/11/3/000316/000316.html',\n", " 
'/dhq/vol/11/4/000340/000340.html',\n", " '/dhq/vol/9/2/000211/000211.html',\n", " '/dhq/vol/3/3/000062/000062.html',\n", " '/dhq/vol/11/2/000290/000290.html',\n", " '/dhq/vol/11/1/000274/000274.html',\n", " '/dhq/vol/11/3/000323/000323.html',\n", " '/dhq/vol/7/1/000143/000143.html',\n", " '/dhq/vol/6/1/000111/000111.html',\n", " '/dhq/vol/1/1/000001/000001.html',\n", " '/dhq/vol/3/3/000052/000052.html',\n", " '/dhq/vol/12/4/000406/000406.html',\n", " '/dhq/vol/7/3/000165/000165.html',\n", " '/dhq/vol/11/3/000331/000331.html',\n", " '/dhq/vol/3/3/000059/000059.html',\n", " '/dhq/vol/6/3/000132/000132.html',\n", " '/dhq/vol/3/4/000070/000070.html',\n", " '/dhq/vol/12/1/000359/000359.html',\n", " '/dhq/vol/12/1/000359/000359.html',\n", " '/dhq/vol/12/1/000349/000349.html',\n", " '/dhq/vol/12/2/000377/000377.html',\n", " '/dhq/vol/10/3/000264/000264.html',\n", " '/dhq/vol/3/3/000055/000055.html',\n", " '/dhq/vol/3/2/000040/000040.html',\n", " '/dhq/vol/12/1/000363/000363.html',\n", " '/dhq/vol/6/3/000134/000134.html',\n", " '/dhq/vol/10/4/000273/000273.html',\n", " '/dhq/vol/3/3/000063/000063.html',\n", " '/dhq/vol/10/2/ 000247 / 000247 .html',\n", " '/dhq/vol/5/3/000108/000108.html',\n", " '/dhq/vol/12/2/000394/000394.html',\n", " '/dhq/vol/1/1/000005/000005.html',\n", " '/dhq/vol/8/4/000197/000197.html',\n", " '/dhq/vol/11/4/000356/000356.html',\n", " '/dhq/vol/12/1/000348/000348.html',\n", " '/dhq/vol/11/2/000292/000292.html',\n", " '/dhq/vol/12/2/000379/000379.html',\n", " '/dhq/vol/12/1/000348/000348.html',\n", " '/dhq/vol/12/3/000384/000384.html',\n", " '/dhq/vol/11/3/000321/000321.html',\n", " '/dhq/vol/3/3/000056/000056.html',\n", " '/dhq/vol/12/4/000400/000400.html',\n", " '/dhq/vol/12/3/000399/000399.html',\n", " '/dhq/vol/12/1/000355/000355.html',\n", " '/dhq/vol/12/1/000355/000355.html',\n", " '/dhq/vol/7/1/000148/000148.html',\n", " '/dhq/vol/12/2/000380/000380.html',\n", " '/dhq/vol/8/2/000177/000177.html',\n", " '/dhq/vol/3/2/000048/000048.html',\n", " 
'/dhq/vol/10/2/000243/000243.html',\n", " '/dhq/vol/7/2/000160/000160.html',\n", " '/dhq/vol/4/2/000085/000085.html',\n", " '/dhq/vol/2/1/000017/000017.html',\n", " '/dhq/vol/1/2/000014/000014.html',\n", " '/dhq/vol/12/3/000385/000385.html',\n", " '/dhq/vol/3/1/000022/000022.html',\n", " '/dhq/vol/11/4/000358/000358.html',\n", " '/dhq/vol/11/4/000328/000328.html',\n", " '/dhq/vol/10/1/000231/000231.html',\n", " '/dhq/vol/9/4/000225/000225.html',\n", " '/dhq/vol/3/4/000072/000072.html',\n", " '/dhq/vol/8/3/000183/000183.html',\n", " '/dhq/vol/11/2/000289/000289.html',\n", " '/dhq/vol/9/2/000201/000201.html',\n", " '/dhq/vol/6/3/000130/000130.html',\n", " '/dhq/vol/3/3/000060/000060.html',\n", " '/dhq/vol/10/3/000244/000244.html',\n", " '/dhq/vol/8/3/000184/000184.html',\n", " '/dhq/vol/11/2/000293/000293.html',\n", " '/dhq/vol/2/1/000020/000020.html',\n", " '/dhq/vol/1/2/000009/000009.html',\n", " '/dhq/vol/10/1/000239/000239.html',\n", " '/dhq/vol/6/2/000119/000119.html',\n", " '/dhq/vol/7/1/000146/000146.html',\n", " '/dhq/vol/12/1/000373/000373.html',\n", " '/dhq/vol/11/3/000311/000311.html',\n", " '/dhq/vol/12/2/000382/000382.html',\n", " '/dhq/vol/6/2/000122/000122.html',\n", " '/dhq/vol/11/1/000286/000286.html',\n", " '/dhq/vol/10/1/000233/000233.html',\n", " '/dhq/vol/9/2/000219/000219.html',\n", " '/dhq/vol/10/1/000228/000228.html',\n", " '/dhq/vol/11/4/000327/000327.html',\n", " '/dhq/vol/3/1/000025/000025.html',\n", " '/dhq/vol/10/1/000235/000235.html',\n", " '/dhq/vol/7/1/000144/000144.html',\n", " '/dhq/vol/11/3/000319/000319.html',\n", " '/dhq/vol/3/2/000047/000047.html',\n", " '/dhq/vol/11/3/000314/000314.html',\n", " '/dhq/vol/10/2/000252/000252.html',\n", " '/dhq/vol/2/1/000015/000015.html',\n", " '/dhq/vol/3/1/000024/000024.html',\n", " '/dhq/vol/1/1/000006/000006.html',\n", " '/dhq/vol/3/2/000045/000045.html',\n", " '/dhq/vol/9/1/000199/000199.html',\n", " '/dhq/vol/9/3/000224/000224.html',\n", " '/dhq/vol/10/1/ 000229 / 000229 .html',\n", " 
'/dhq/vol/7/3/000162/000162.html',\n", " '/dhq/vol/5/2/000097/000097.html',\n", " '/dhq/vol/12/01/000369/000369.html',\n", " '/dhq/vol/12/2/000383/000383.html',\n", " '/dhq/vol/11/2/000285/000285.html',\n", " '/dhq/vol/11/2/000291/000291.html',\n", " '/dhq/vol/12/1/000349/000349.html',\n", " '/dhq/vol/10/2/000249/000249.html',\n", " '/dhq/vol/6/2/000127/000127.html',\n", " '/dhq/vol/10/2/000254/000254.html',\n", " '/dhq/vol/6/2/000121/000121.html',\n", " '/dhq/vol/11/3/000322/000322.html',\n", " '/dhq/vol/11/4/000344/000344.html',\n", " '/dhq/vol/6/3/000135/000135.html',\n", " '/dhq/vol/9/2/000209/000209.html',\n", " '/dhq/vol/7/2/000116/000116.html',\n", " '/dhq/vol/10/2/000251/000251.html',\n", " '/dhq/vol/4/2/000089/000089.html',\n", " '/dhq/vol/7/1/000151/000151.html',\n", " '/dhq/vol/9/4/000220/000220.html',\n", " '/dhq/vol/11/2/000299/000299.html',\n", " '/dhq/vol/11/3/000305/000305.html',\n", " '/dhq/vol/6/2/000141/000141.html',\n", " '/dhq/vol/11/4/000342/000342.html',\n", " '/dhq/vol/12/2/000392/000392.html',\n", " '/dhq/vol/8/4/000192/000192.html',\n", " '/dhq/vol/6/3/000131/000131.html',\n", " '/dhq/vol/5/3/000107/000107.html',\n", " '/dhq/vol/7/1/000161/000161.html',\n", " '/dhq/vol/8/4/000190/000190.html',\n", " '/dhq/vol/7/3/000166/000166.html',\n", " '/dhq/vol/3/2/000042/000042.html',\n", " '/dhq/vol/8/1/000176/000176.html',\n", " '/dhq/vol/6/2/000120/000120.html',\n", " '/dhq/vol/6/2/000126/000126.html',\n", " '/dhq/vol/1/1/000003/000003.html',\n", " '/dhq/vol/1/1/000007/000007.html',\n", " '/dhq/vol/9/2/000200/000200.html',\n", " '/dhq/vol/3/1/000032/000032.html',\n", " '/dhq/vol/7/1/000156/000156.html',\n", " '/dhq/vol/11/1/000287/000287.html',\n", " '/dhq/vol/10/2/000248/000248.html',\n", " '/dhq/vol/5/2/000093/000093.html',\n", " '/dhq/vol/10/1/000238/000238.html',\n", " '/dhq/vol/3/2/000041/000041.html',\n", " '/dhq/vol/12/1/000371/000371.html',\n", " '/dhq/vol/9/4/000226/000226.html',\n", " '/dhq/vol/5/3/000104/000104.html',\n", " 
'/dhq/vol/3/3/000064/000064.html',\n", " '/dhq/vol/8/2/000169/000169.html']" ] }, "execution_count": 110, "metadata": {}, "output_type": "execute_result" } ], "source": [ "urls = [a['href'] for a in titleIndex.find_all(\"a\") if \"/dhq/vol/\" in a['href']]\n", "urls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before saving a local copy of our articles, we'll create a data directory in which to save our files." ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "directory = \"data\" # data directory\n", "if not os.path.exists(directory): # if the directory doesn't exist\n", "    os.makedirs(directory) # create it" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can go ahead and fetch the URLs, writing out the results as we go along. Notice that our list contains some duplicate URLs and a few with stray whitespace, so we'll loop over a `set` of the URLs (to visit each one only once) and strip out the whitespace before fetching. A few concepts, such as [regular expressions](https://docs.python.org/3.7/library/re.html) (`re`) and the `with open` syntax, won't be explained right away." 
] }, { "cell_type": "code", "execution_count": 113, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "fetching: /dhq/vol/7/1/000153/000153.html\n", "fetching: /dhq/vol/9/2/000208/000208.html\n", "fetching: /dhq/vol/6/2/000121/000121.html\n", "fetching: /dhq/vol/11/3/000321/000321.html\n", "fetching: /dhq/vol/3/4/000078/000078.html\n", "fetching: /dhq/vol/11/3/000331/000331.html\n", "fetching: /dhq/vol/5/3/000098/000098.html\n", "fetching: /dhq/vol/9/4/000220/000220.html\n", "fetching: /dhq/vol/10/2/000254/000254.html\n", "fetching: /dhq/vol/2/1/000019/000019.html\n", "fetching: /dhq/vol/3/3/000053/000053.html\n", "fetching: /dhq/vol/8/2/000180/000180.html\n", "fetching: /dhq/vol/12/4/000403/000403.html\n", "fetching: /dhq/vol/5/1/000101/000101.html\n", "fetching: /dhq/vol/6/2/000139/000139.html\n", "fetching: /dhq/vol/1/1/000005/000005.html\n", "fetching: /dhq/vol/5/3/000107/000107.html\n", "fetching: /dhq/vol/11/3/000310/000310.html\n", "fetching: /dhq/vol/11/4/000328/000328.html\n", "fetching: /dhq/vol/3/1/000022/000022.html\n", "fetching: /dhq/vol/12/1/000346/000346.html\n", "fetching: /dhq/vol/11/3/000333/000333.html\n", "fetching: /dhq/vol/9/4/000226/000226.html\n", "fetching: /dhq/vol/12/1/000371/000371.html\n", "fetching: /dhq/vol/11/1/000282/000282.html\n", "fetching: /dhq/vol/10/3/000260/000260.html\n", "fetching: /dhq/vol/10/2/000252/000252.html\n", "fetching: /dhq/vol/10/1/000231/000231.html\n", "fetching: /dhq/vol/7/2/000159/000159.html\n", "fetching: /dhq/vol/5/2/000092/000092.html\n", "fetching: /dhq/vol/10/4/000277/000277.html\n", "fetching: /dhq/vol/11/4/000327/000327.html\n", "fetching: /dhq/vol/5/3/000106/000106.html\n", "fetching: /dhq/vol/5/1/000095/000095.html\n", "fetching: /dhq/vol/3/1/000034/000034.html\n", "fetching: /dhq/vol/1/2/000014/000014.html\n", "fetching: /dhq/vol/5/2/000097/000097.html\n", "fetching: /dhq/vol/12/1/000359/000359.html\n", "fetching: /dhq/vol/11/2/000297/000297.html\n", "fetching: 
/dhq/vol/12/4/000404/000404.html\n", "fetching: /dhq/vol/6/2/000122/000122.html\n", "fetching: /dhq/vol/12/1/000351/000351.html\n", "fetching: /dhq/vol/3/1/000036/000036.html\n", "fetching: /dhq/vol/9/2/000186/000186.html\n", "fetching: /dhq/vol/11/3/000304/000304.html\n", "fetching: /dhq/vol/3/2/000049/000049.html\n", "fetching: /dhq/vol/3/4/000073/000073.html\n", "fetching: /dhq/vol/3/1/000026/000026.html\n", "fetching: /dhq/vol/11/1/000284/000284.html\n", "fetching: /dhq/vol/7/1/000150/000150.html\n", "fetching: /dhq/vol/12/2/000389/000389.html\n", "fetching: /dhq/vol/3/3/000063/000063.html\n", "fetching: /dhq/vol/8/3/000182/000182.html\n", "fetching: /dhq/vol/10/4/000265/000265.html\n", "fetching: /dhq/vol/7/1/000144/000144.html\n", "fetching: /dhq/vol/11/3/000319/000319.html\n", "fetching: /dhq/vol/11/3/000315/000315.html\n", "fetching: /dhq/vol/11/3/000361/000361.html\n", "fetching: /dhq/vol/7/1/000151/000151.html\n", "fetching: /dhq/vol/8/3/000188/000188.html\n", "fetching: /dhq/vol/5/2/000096/000096.html\n", "fetching: /dhq/vol/10/2/000250/000250.html\n", "fetching: /dhq/vol/5/1/000090/000090.html\n", "fetching: /dhq/vol/5/3/000102/000102.html\n", "fetching: /dhq/vol/3/3/000067/000067.html\n", "fetching: /dhq/vol/1/1/000007/000007.html\n", "fetching: /dhq/vol/11/2/000285/000285.html\n", "fetching: /dhq/vol/4/1/000084/000084.html\n", "fetching: /dhq/vol/4/2/000086/000086.html\n", "fetching: /dhq/vol/1/1/000003/000003.html\n", "fetching: /dhq/vol/10/2/000249/000249.html\n", "fetching: /dhq/vol/11/3/000336/000336.html\n", "fetching: /dhq/vol/12/1/000373/000373.html\n", "fetching: /dhq/vol/7/1/000149/000149.html\n", "fetching: /dhq/vol/7/1/000154/000154.html\n", "fetching: /dhq/vol/7/1/000148/000148.html\n", "fetching: /dhq/vol/8/4/000187/000187.html\n", "fetching: /dhq/vol/12/2/000396/000396.html\n", "fetching: /dhq/vol/11/2/000289/000289.html\n", "fetching: /dhq/vol/12/1/000367/000367.html\n", "fetching: /dhq/vol/8/4/000195/000195.html\n", "fetching: 
/dhq/vol/10/1/000236/000236.html\n", "fetching: /dhq/vol/10/1/000235/000235.html\n", "fetching: /dhq/vol/11/4/000340/000340.html\n", "fetching: /dhq/vol/5/3/000103/000103.html\n", "fetching: /dhq/vol/6/2/000126/000126.html\n", "fetching: /dhq/vol/3/4/000068/000068.html\n", "fetching: /dhq/vol/12/1/000345/000345.html\n", "fetching: /dhq/vol/10/2/000246/000246.html\n", "fetching: /dhq/vol/10/3/000264/000264.html\n", "fetching: /dhq/vol/11/4/000338/000338.html\n", "fetching: /dhq/vol/11/3/000325/000325.html\n", "fetching: /dhq/vol/2/1/000017/000017.html\n", "fetching: /dhq/vol/6/2/000120/000120.html\n", "fetching: /dhq/vol/12/2/000387/000387.html\n", "fetching: /dhq/vol/9/1/000198/000198.html\n", "fetching: /dhq/vol/8/2/000177/000177.html\n", "fetching: /dhq/vol/3/4/000075/000075.html\n", "fetching: /dhq/vol/12/1/000357/000357.html\n", "fetching: /dhq/vol/6/2/000118/000118.html\n", "fetching: /dhq/vol/12/2/000380/000380.html\n", "fetching: /dhq/vol/4/1/000081/000081.html\n", "fetching: /dhq/vol/11/3/000322/000322.html\n", "fetching: /dhq/vol/10/3/000244/000244.html\n", "fetching: /dhq/vol/12/1/000388/000388.html\n", "fetching: /dhq/vol/5/3/000105/000105.html\n", "fetching: /dhq/vol/10/4/000259/000259.html\n", "fetching: /dhq/vol/7/1/000152/000152.html\n", "fetching: /dhq/vol/6/2/000125/000125.html\n", "fetching: /dhq/vol/12/2/000392/000392.html\n", "fetching: /dhq/vol/1/1/000006/000006.html\n", "fetching: /dhq/vol/3/2/000045/000045.html\n", "fetching: /dhq/vol/10/3/000263/000263.html\n", "fetching: /dhq/vol/12/2/000377/000377.html\n", "fetching: /dhq/vol/3/3/000057/000057.html\n", "fetching: /dhq/vol/7/1/000155/000155.html\n", "fetching: /dhq/vol/9/1/000199/000199.html\n", "fetching: /dhq/vol/3/3/000052/000052.html\n", "fetching: /dhq/vol/3/3/000061/000061.html\n", "fetching: /dhq/vol/1/2/000009/000009.html\n", "fetching: /dhq/vol/3/1/000035/000035.html\n", "fetching: /dhq/vol/6/3/000134/000134.html\n", "fetching: /dhq/vol/6/3/000133/000133.html\n", "fetching: 
/dhq/vol/3/3/000055/000055.html\n", "fetching: /dhq/vol/3/1/000021/000021.html\n", "fetching: /dhq/vol/8/3/000189/000189.html\n", "fetching: /dhq/vol/11/4/000343/000343.html\n", "fetching: /dhq/vol/11/3/000312/000312.html\n", "fetching: /dhq/vol/3/1/000031/000031.html\n", "fetching: /dhq/vol/3/4/000069/000069.html\n", "fetching: /dhq/vol/8/4/000196/000196.html\n", "fetching: /dhq/vol/6/2/000137/000137.html\n", "fetching: /dhq/vol/12/2/000382/000382.html\n", "fetching: /dhq/vol/8/2/000181/000181.html\n", "fetching: /dhq/vol/6/2/000128/000128.html\n", "fetching: /dhq/vol/9/4/000232/000232.html\n", "fetching: /dhq/vol/11/2/000299/000299.html\n", "fetching: /dhq/vol/8/4/000194/000194.html\n", "fetching: /dhq/vol/6/3/000132/000132.html\n", "fetching: /dhq/vol/11/2/000293/000293.html\n", "fetching: /dhq/vol/10/1/000233/000233.html\n", "fetching: /dhq/vol/11/1/000279/000279.html\n", "fetching: /dhq/vol/9/2/000215/000215.html\n", "fetching: /dhq/vol/8/2/000178/000178.html\n", "fetching: /dhq/vol/12/3/000395/000395.html\n", "fetching: /dhq/vol/7/1/000115/000115.html\n", "fetching: /dhq/vol/9/02/000217/000217.html\n", "fetching: /dhq/vol/12/1/000368/000368.html\n", "fetching: /dhq/vol/9/4/000234/000234.html\n", "fetching: /dhq/vol/8/1/000170/000170.html\n", "fetching: /dhq/vol/3/2/000048/000048.html\n", "fetching: /dhq/vol/10/4/000268/000268.html\n", "fetching: /dhq/vol/3/4/000079/000079.html\n", "fetching: /dhq/vol/3/1/000027/000027.html\n", "fetching: /dhq/vol/11/3/000320/000320.html\n", "fetching: /dhq/vol/11/3/000302/000302.html\n", "fetching: /dhq/vol/11/4/000313/000313.html\n", "fetching: /dhq/vol/11/1/000283/000283.html\n", "fetching: /dhq/vol/7/3/000162/000162.html\n", "fetching: /dhq/vol/10/4/000278/000278.html\n", "fetching: /dhq/vol/10/4/000273/000273.html\n", "fetching: /dhq/vol/7/3/000167/000167.html\n", "fetching: /dhq/vol/12/1/000370/000370.html\n", "fetching: /dhq/vol/9/4/000225/000225.html\n", "fetching: /dhq/vol/10/2/000256/000256.html\n", "fetching: 
/dhq/vol/12/1/000347/000347.html\n", "fetching: /dhq/vol/8/2/000169/000169.html\n", "fetching: /dhq/vol/2/1/000016/000016.html\n", "fetching: /dhq/vol/7/1/000147/000147.html\n", "fetching: /dhq/vol/6/3/000131/000131.html\n", "fetching: /dhq/vol/3/4/000072/000072.html\n", "fetching: /dhq/vol/9/4/000218/000218.html\n", "fetching: /dhq/vol/9/2/000216/000216.html\n", "fetching: /dhq/vol/10/3/000261/000261.html\n", "fetching: /dhq/vol/12/1/000363/000363.html\n", "fetching: /dhq/vol/11/4/000344/000344.html\n", "fetching: /dhq/vol/12/3/000385/000385.html\n", "fetching: /dhq/vol/7/1/000114/000114.html\n", "fetching: /dhq/vol/3/4/000076/000076.html\n", "fetching: /dhq/vol/3/2/000043/000043.html\n", "fetching: /dhq/vol/9/2/000200/000200.html\n", "fetching: /dhq/vol/8/4/000192/000192.html\n", "fetching: /dhq/vol/11/2/000292/000292.html\n", "fetching: /dhq/vol/9/3/000221/000221.html\n", "fetching: /dhq/vol/3/2/000040/000040.html\n", "fetching: /dhq/vol/3/3/000054/000054.html\n", "fetching: /dhq/vol/12/2/000393/000393.html\n", "fetching: /dhq/vol/12/1/000355/000355.html\n", "fetching: /dhq/vol/11/4/000341/000341.html\n", "fetching: /dhq/vol/1/2/000012/000012.html\n", "fetching: /dhq/vol/12/1/000353/000353.html\n", "fetching: /dhq/vol/6/1/000112/000112.html\n", "fetching: /dhq/vol/5/3/000104/000104.html\n", "fetching: /dhq/vol/12/2/000386/000386.html\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "fetching: /dhq/vol/8/1/000172/000172.html\n", "fetching: /dhq/vol/12/2/000376/000376.html\n", "fetching: /dhq/vol/8/1/000168/000168.html\n", "fetching: /dhq/vol/9/3/000214/000214.html\n", "fetching: /dhq/vol/2/1/000015/000015.html\n", "fetching: /dhq/vol/8/4/000197/000197.html\n", "fetching: /dhq/vol/11/2/000298/000298.html\n", "fetching: /dhq/vol/11/3/000323/000323.html\n", "fetching: /dhq/vol/3/1/000032/000032.html\n", "fetching: /dhq/vol/9/2/000211/000211.html\n", "fetching: /dhq/vol/7/1/000143/000143.html\n", "fetching: /dhq/vol/10/3/000267/000267.html\n", 
"fetching: /dhq/vol/11/4/000342/000342.html\n", "fetching: /dhq/vol/1/1/000001/000001.html\n", "fetching: /dhq/vol/3/3/000060/000060.html\n", "fetching: /dhq/vol/6/3/000130/000130.html\n", "fetching: /dhq/vol/10/2/000242/000242.html\n", "fetching: /dhq/vol/12/2/000379/000379.html\n", "fetching: /dhq/vol/11/4/000358/000358.html\n", "fetching: /dhq/vol/11/2/000317/000317.html\n", "fetching: /dhq/vol/11/3/000311/000311.html\n", "fetching: /dhq/vol/10/4/000271/000271.html\n", "fetching: /dhq/vol/8/3/000183/000183.html\n", "fetching: /dhq/vol/6/2/000124/000124.html\n", "fetching: /dhq/vol/11/3/000305/000305.html\n", "fetching: /dhq/vol/11/2/000295/000295.html\n", "fetching: /dhq/vol/6/2/000141/000141.html\n", "fetching: /dhq/vol/3/1/000030/000030.html\n", "fetching: /dhq/vol/9/1/000206/000206.html\n", "fetching: /dhq/vol/6/1/000111/000111.html\n", "fetching: /dhq/vol/11/2/000308/000308.html\n", "fetching: /dhq/vol/9/3/000223/000223.html\n", "fetching: /dhq/vol/7/1/000156/000156.html\n", "fetching: /dhq/vol/12/2/000383/000383.html\n", "fetching: /dhq/vol/3/1/000024/000024.html\n", "fetching: /dhq/vol/7/1/000142/000142.html\n", "fetching: /dhq/vol/10/4/000275/000275.html\n", "fetching: /dhq/vol/4/2/000085/000085.html\n", "fetching: /dhq/vol/9/3/000222/000222.html\n", "fetching: /dhq/vol/10/3/000266/000266.html\n", "fetching: /dhq/vol/3/2/000039/000039.html\n", "fetching: /dhq/vol/3/3/000056/000056.html\n", "fetching: /dhq/vol/11/3/000316/000316.html\n", "fetching: /dhq/vol/11/4/000339/000339.html\n", "fetching: /dhq/vol/12/01/000369/000369.html\n", "fetching: /dhq/vol/12/1/000372/000372.html\n", "fetching: /dhq/vol/8/1/000175/000175.html\n", "fetching: /dhq/vol/11/2/000291/000291.html\n", "fetching: /dhq/vol/7/1/000140/000140.html\n", "fetching: /dhq/vol/11/3/000324/000324.html\n", "fetching: /dhq/vol/11/4/000350/000350.html\n", "fetching: /dhq/vol/10/3/000257/000257.html\n", "fetching: /dhq/vol/10/2/000247/000247.html\n", "fetching: /dhq/vol/6/2/000127/000127.html\n", 
"fetching: /dhq/vol/11/1/000287/000287.html\n", "fetching: /dhq/vol/9/3/000227/000227.html\n", "fetching: /dhq/vol/12/3/000398/000398.html\n", "fetching: /dhq/vol/3/4/000077/000077.html\n", "fetching: /dhq/vol/9/1/000203/000203.html\n", "fetching: /dhq/vol/11/2/000307/000307.html\n", "fetching: /dhq/vol/6/1/000113/000113.html\n", "fetching: /dhq/vol/3/3/000065/000065.html\n", "fetching: /dhq/vol/7/1/000145/000145.html\n", "fetching: /dhq/vol/11/1/000281/000281.html\n", "fetching: /dhq/vol/7/2/000160/000160.html\n", "fetching: /dhq/vol/7/2/000163/000163.html\n", "fetching: /dhq/vol/11/3/000330/000330.html\n", "fetching: /dhq/vol/10/1/000240/000240.html\n", "fetching: /dhq/vol/12/2/000378/000378.html\n", "fetching: /dhq/vol/2/1/000018/000018.html\n", "fetching: /dhq/vol/3/3/000051/000051.html\n", "fetching: /dhq/vol/10/2/000255/000255.html\n", "fetching: /dhq/vol/10/4/000272/000272.html\n", "fetching: /dhq/vol/9/4/000212/000212.html\n", "fetching: /dhq/vol/12/1/000349/000349.html\n", "fetching: /dhq/vol/6/2/000119/000119.html\n", "fetching: /dhq/vol/11/3/000337/000337.html\n", "fetching: /dhq/vol/3/1/000023/000023.html\n", "fetching: /dhq/vol/12/1/000365/000365.html\n", "fetching: /dhq/vol/5/3/000100/000100.html\n", "fetching: /dhq/vol/12/4/000400/000400.html\n", "fetching: /dhq/vol/1/2/000010/000010.html\n", "fetching: /dhq/vol/10/2/000243/000243.html\n", "fetching: /dhq/vol/7/1/000146/000146.html\n", "fetching: /dhq/vol/4/1/000087/000087.html\n", "fetching: /dhq/vol/11/2/000290/000290.html\n", "fetching: /dhq/vol/7/1/000157/000157.html\n", "fetching: /dhq/vol/3/2/000042/000042.html\n", "fetching: /dhq/vol/6/2/000123/000123.html\n", "fetching: /dhq/vol/9/2/000209/000209.html\n", "fetching: /dhq/vol/4/1/000082/000082.html\n", "fetching: /dhq/vol/10/3/000258/000258.html\n", "fetching: /dhq/vol/2/1/000020/000020.html\n", "fetching: /dhq/vol/9/2/000202/000202.html\n", "fetching: /dhq/vol/11/4/000356/000356.html\n", "fetching: /dhq/vol/3/4/000070/000070.html\n", 
"fetching: /dhq/vol/7/2/000116/000116.html\n", "fetching: /dhq/vol/8/1/000171/000171.html\n", "fetching: /dhq/vol/3/3/000058/000058.html\n", "fetching: /dhq/vol/1/1/000002/000002.html\n", "fetching: /dhq/vol/10/1/000229/000229.html\n", "fetching: /dhq/vol/10/2/000251/000251.html\n", "fetching: /dhq/vol/7/2/000164/000164.html\n", "fetching: /dhq/vol/3/1/000033/000033.html\n", "fetching: /dhq/vol/3/3/000066/000066.html\n", "fetching: /dhq/vol/10/2/000248/000248.html\n", "fetching: /dhq/vol/11/2/000360/000360.html\n", "fetching: /dhq/vol/10/2/000245/000245.html\n", "fetching: /dhq/vol/6/2/000136/000136.html\n", "fetching: /dhq/vol/3/1/000025/000025.html\n", "fetching: /dhq/vol/8/3/000191/000191.html\n", "fetching: /dhq/vol/12/1/000354/000354.html\n", "fetching: /dhq/vol/12/1/000364/000364.html\n", "fetching: /dhq/vol/9/4/000230/000230.html\n", "fetching: /dhq/vol/5/3/000108/000108.html\n", "fetching: /dhq/vol/10/1/000239/000239.html\n", "fetching: /dhq/vol/12/2/000374/000374.html\n", "fetching: /dhq/vol/12/4/000406/000406.html\n", "fetching: /dhq/vol/3/2/000041/000041.html\n", "fetching: /dhq/vol/8/1/000174/000174.html\n", "fetching: /dhq/vol/4/2/000089/000089.html\n", "fetching: /dhq/vol/5/2/000094/000094.html\n", "fetching: /dhq/vol/7/2/000158/000158.html\n", "fetching: /dhq/vol/10/4/000270/000270.html\n", "fetching: /dhq/vol/12/3/000399/000399.html\n", "fetching: /dhq/vol/11/2/000300/000300.html\n", "fetching: /dhq/vol/3/3/000062/000062.html\n", "fetching: /dhq/vol/12/1/000352/000352.html\n", "fetching: /dhq/vol/7/3/000166/000166.html\n", "fetching: /dhq/vol/4/2/000088/000088.html\n", "fetching: /dhq/vol/12/1/000348/000348.html\n", "fetching: /dhq/vol/3/2/000046/000046.html\n", "fetching: /dhq/vol/6/1/000117/000117.html\n", "fetching: /dhq/vol/7/1/000161/000161.html\n", "fetching: /dhq/vol/9/2/000213/000213.html\n", "fetching: /dhq/vol/3/2/000044/000044.html\n", "fetching: /dhq/vol/10/1/000228/000228.html\n", "fetching: /dhq/vol/12/2/000391/000391.html\n", 
"fetching: /dhq/vol/7/3/000165/000165.html\n", "fetching: /dhq/vol/11/2/000309/000309.html\n", "fetching: /dhq/vol/11/1/000276/000276.html\n", "fetching: /dhq/vol/1/1/000004/000004.html\n", "fetching: /dhq/vol/8/4/000190/000190.html\n", "fetching: /dhq/vol/4/1/000080/000080.html\n", "fetching: /dhq/vol/3/4/000074/000074.html\n", "fetching: /dhq/vol/11/2/000294/000294.html\n", "fetching: /dhq/vol/3/2/000037/000037.html\n", "fetching: /dhq/vol/8/3/000184/000184.html\n", "fetching: /dhq/vol/10/2/000253/000253.html\n", "fetching: /dhq/vol/3/1/000028/000028.html\n", "fetching: /dhq/vol/11/2/000296/000296.html\n", "fetching: /dhq/vol/3/4/000071/000071.html\n", "fetching: /dhq/vol/8/2/000179/000179.html\n", "fetching: /dhq/vol/4/2/000083/000083.html\n", "fetching: /dhq/vol/11/3/000306/000306.html\n", "fetching: /dhq/vol/9/1/000205/000205.html\n", "fetching: /dhq/vol/9/3/000193/000193.html\n", "fetching: /dhq/vol/12/3/000384/000384.html\n", "fetching: /dhq/vol/11/3/000329/000329.html\n", "fetching: /dhq/vol/5/2/000093/000093.html\n", "fetching: /dhq/vol/12/4/000401/000401.html\n", "fetching: /dhq/vol/6/2/000138/000138.html\n", "fetching: /dhq/vol/3/1/000029/000029.html\n", "fetching: /dhq/vol/5/1/000091/000091.html\n", "fetching: /dhq/vol/3/2/000038/000038.html\n", "fetching: /dhq/vol/11/4/000326/000326.html\n", "fetching: /dhq/vol/3/3/000059/000059.html\n", "fetching: /dhq/vol/11/3/000314/000314.html\n", "fetching: /dhq/vol/10/3/000262/000262.html\n", "fetching: /dhq/vol/1/2/000013/000013.html\n", "fetching: /dhq/vol/8/1/000176/000176.html\n", "fetching: /dhq/vol/9/4/000210/000210.html\n", "fetching: /dhq/vol/9/2/000201/000201.html\n", "fetching: /dhq/vol/10/4/000280/000280.html\n", "fetching: /dhq/vol/8/2/000173/000173.html\n", "fetching: /dhq/vol/11/3/000303/000303.html\n", "fetching: /dhq/vol/3/3/000050/000050.html\n", "fetching: /dhq/vol/12/1/000362/000362.html\n", "fetching: /dhq/vol/12/3/000397/000397.html\n", "fetching: /dhq/vol/11/3/000335/000335.html\n", 
"fetching: /dhq/vol/9/2/000219/000219.html\n", "fetching: /dhq/vol/10/1/000241/000241.html\n", "fetching: /dhq/vol/9/3/000224/000224.html\n", "fetching: /dhq/vol/11/1/000286/000286.html\n", "fetching: /dhq/vol/9/1/000204/000204.html\n", "fetching: /dhq/vol/11/3/000318/000318.html\n", "fetching: /dhq/vol/6/2/000129/000129.html\n", "fetching: /dhq/vol/3/3/000064/000064.html\n", "fetching: /dhq/vol/1/1/000008/000008.html\n", "fetching: /dhq/vol/10/4/000269/000269.html\n", "fetching: /dhq/vol/5/3/000099/000099.html\n", "fetching: /dhq/vol/9/1/000207/000207.html\n", "fetching: /dhq/vol/9/3/000237/000237.html\n", "fetching: /dhq/vol/8/3/000185/000185.html\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "fetching: /dhq/vol/10/1/000238/000238.html\n", "fetching: /dhq/vol/6/3/000135/000135.html\n", "fetching: /dhq/vol/11/3/000332/000332.html\n", "fetching: /dhq/vol/11/1/000274/000274.html\n", "fetching: /dhq/vol/12/2/000394/000394.html\n", "fetching: /dhq/vol/3/2/000047/000047.html\n", "fetching: /dhq/vol/6/1/000110/000110.html\n", "fetching: /dhq/vol/12/2/000390/000390.html\n", "fetching: /dhq/vol/1/2/000011/000011.html\n", "fetching: /dhq/vol/11/2/000288/000288.html\n" ] } ], "source": [ "import re\n", "import time\n", "\n", "for url in set(urls): # for each unique URL in our list of URLs\n", "    clean_url = re.sub(r'\\s+', \"\", url) # clean our URLs of superfluous whitespace\n", "    filename = os.path.basename(clean_url) # determine the filename\n", "    path = os.path.join(directory, filename) # construct a full path\n", "    if os.path.exists(path): # if we already have the file locally\n", "        print(\"already fetched:\", clean_url) # let the user know we're skipping it\n", "    else: # otherwise we need to fetch it\n", "        print(\"fetching:\", clean_url) # let the user know we're fetching it\n", "        contents = urllib.request.urlopen(urlRoot+clean_url).read() # grab the contents of the URL\n", "        with open(path, \"wb\") as f: # open the file for writing (in binary mode)\n", "            f.write(contents) # write the contents\n", "        time.sleep(1) # be nice to the server by waiting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And there we are: we now have a local copy of all the articles from _Digital Humanities Quarterly_!\n", "\n", "---\n", "[CC BY-SA](https://creativecommons.org/licenses/by-sa/4.0/) From [The Art of Literary Text Analysis](ArtOfLiteraryTextAnalysis.ipynb) by [Stéfan Sinclair](http://stefansinclair.name) & [Geoffrey Rockwell](http://geoffreyrockwell.com).
Created January 24, 2019 (Jupyter 5), last updated January 31, 2018." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 2 }