{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction\n", "\n", "This IPython notebook illustrates how to perform matching using the rule-based matcher.\n", "\n", "First, we need to import the py_entitymatching package and other libraries as follows:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Import the py_entitymatching package and supporting libraries\n", "import py_entitymatching as em\n", "import os\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, locate the (sample) input tables used for matching." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Get the datasets directory\n", "datasets_dir = em.get_install_path() + os.sep + 'datasets'\n", "\n", "# Paths to the two input tables and the labeled candidate pairs\n", "path_A = datasets_dir + os.sep + 'dblp_demo.csv'\n", "path_B = datasets_dir + os.sep + 'acm_demo.csv'\n", "path_labeled_data = datasets_dir + os.sep + 'labeled_data_demo.csv'" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | _id | \n", "ltable_id | \n", "rtable_id | \n", "ltable_title | \n", "ltable_authors | \n", "ltable_year | \n", "rtable_title | \n", "rtable_authors | \n", "rtable_year | \n", "label | \n", "
---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "0 | \n", "l1223 | \n", "r498 | \n", "Dynamic Information Visualization | \n", "Yannis E. Ioannidis | \n", "1996 | \n", "Dynamic information visualization | \n", "Yannis E. Ioannidis | \n", "1996 | \n", "1 | \n", "
1 | \n", "1 | \n", "l1563 | \n", "r1285 | \n", "Dynamic Load Balancing in Hierarchical Parallel Database Systems | \n", "Luc Bouganim, Daniela Florescu, Patrick Valduriez | \n", "1996 | \n", "Dynamic Load Balancing in Hierarchical Parallel Database Systems | \n", "Luc Bouganim, Daniela Florescu, Patrick Valduriez | \n", "1996 | \n", "1 | \n", "
2 | \n", "2 | \n", "l1514 | \n", "r1348 | \n", "Query Processing and Optimization in Oracle Rdb | \n", "Gennady Antoshenkov, Mohamed Ziauddin | \n", "1996 | \n", "prospector: a content-based multimedia server for massively parallel architectures | \n", "S. Choo, W. O'Connell, G. Linerman, H. Chen, K. Ganapathy, A. Biliris, E. Panagos, D. Schrader | \n", "1996 | \n", "0 | \n", "
3 | \n", "3 | \n", "l206 | \n", "r1641 | \n", "An Asymptotically Optimal Multiversion B-Tree | \n", "Thomas Ohler, Peter Widmayer, Bruno Becker, Stephan Gschwind, Bernhard Seeger | \n", "1996 | \n", "A complete temporal relational algebra | \n", "Debabrata Dey, Terence M. Barron, Veda C. Storey | \n", "1996 | \n", "0 | \n", "
4 | \n", "4 | \n", "l1589 | \n", "r495 | \n", "Evaluating Probabilistic Queries over Imprecise Data | \n", "Reynold Cheng, Dmitri V. Kalashnikov, Sunil Prabhakar | \n", "2003 | \n", "Evaluating probabilistic queries over imprecise data | \n", "Reynold Cheng, Dmitri V. Kalashnikov, Sunil Prabhakar | \n", "2003 | \n", "1 | \n", "
\n", " | _id | \n", "ltable_id | \n", "rtable_id | \n", "ltable_title | \n", "ltable_authors | \n", "ltable_year | \n", "rtable_title | \n", "rtable_authors | \n", "rtable_year | \n", "label | \n", "pred_label | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "0 | \n", "l1223 | \n", "r498 | \n", "Dynamic Information Visualization | \n", "Yannis E. Ioannidis | \n", "1996 | \n", "Dynamic information visualization | \n", "Yannis E. Ioannidis | \n", "1996 | \n", "1 | \n", "1 | \n", "
1 | \n", "1 | \n", "l1563 | \n", "r1285 | \n", "Dynamic Load Balancing in Hierarchical Parallel Database Systems | \n", "Luc Bouganim, Daniela Florescu, Patrick Valduriez | \n", "1996 | \n", "Dynamic Load Balancing in Hierarchical Parallel Database Systems | \n", "Luc Bouganim, Daniela Florescu, Patrick Valduriez | \n", "1996 | \n", "1 | \n", "1 | \n", "
2 | \n", "2 | \n", "l1514 | \n", "r1348 | \n", "Query Processing and Optimization in Oracle Rdb | \n", "Gennady Antoshenkov, Mohamed Ziauddin | \n", "1996 | \n", "prospector: a content-based multimedia server for massively parallel architectures | \n", "S. Choo, W. O'Connell, G. Linerman, H. Chen, K. Ganapathy, A. Biliris, E. Panagos, D. Schrader | \n", "1996 | \n", "0 | \n", "0 | \n", "
3 | \n", "3 | \n", "l206 | \n", "r1641 | \n", "An Asymptotically Optimal Multiversion B-Tree | \n", "Thomas Ohler, Peter Widmayer, Bruno Becker, Stephan Gschwind, Bernhard Seeger | \n", "1996 | \n", "A complete temporal relational algebra | \n", "Debabrata Dey, Terence M. Barron, Veda C. Storey | \n", "1996 | \n", "0 | \n", "0 | \n", "
4 | \n", "4 | \n", "l1589 | \n", "r495 | \n", "Evaluating Probabilistic Queries over Imprecise Data | \n", "Reynold Cheng, Dmitri V. Kalashnikov, Sunil Prabhakar | \n", "2003 | \n", "Evaluating probabilistic queries over imprecise data | \n", "Reynold Cheng, Dmitri V. Kalashnikov, Sunil Prabhakar | \n", "2003 | \n", "1 | \n", "1 | \n", "
5 | \n", "5 | \n", "l43 | \n", "r1415 | \n", "Optimization of Run-time Management of Data Intensive Web-sites | \n", "Khaled Yagoub, Dan Suciu, Alon Y. Levy, Daniela Florescu | \n", "1999 | \n", "On random sampling over joins | \n", "Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya | \n", "1999 | \n", "0 | \n", "0 | \n", "
6 | \n", "6 | \n", "l1466 | \n", "r1348 | \n", "Access Path Support for Referential Integrity in SQL2 | \n", "Joachim Reinert, Theo Hrder | \n", "1996 | \n", "prospector: a content-based multimedia server for massively parallel architectures | \n", "S. Choo, W. O'Connell, G. Linerman, H. Chen, K. Ganapathy, A. Biliris, E. Panagos, D. Schrader | \n", "1996 | \n", "0 | \n", "0 | \n", "
7 | \n", "7 | \n", "l1535 | \n", "r1800 | \n", "Mariposa: A Wide-Area Distributed Database System | \n", "Carl Staelin, Paul M. Aoki, Witold Litwin, Michael Stonebraker, Adam Sah, Jeff Sidell, Andrew Yu... | \n", "1996 | \n", "Further Improvements on Integrity Constraint Checking for Stratifiable Deductive Databases | \n", "Sin Yeung Lee, Tok Wang Ling | \n", "1996 | \n", "0 | \n", "0 | \n", "
8 | \n", "8 | \n", "l1317 | \n", "r1676 | \n", "QuickStore: A High Performance Mapped Object Store | \n", "David J. DeWitt, Seth J. White | \n", "1994 | \n", "An Overview of Repository Technology | \n", "Philip A. Bernstein, Umeshwar Dayal | \n", "1994 | \n", "0 | \n", "0 | \n", "
9 | \n", "9 | \n", "l621 | \n", "r175 | \n", "Communication Efficient Distributed Mining of Association Rules | \n", "Ran Wolff, Assaf Schuster | \n", "2001 | \n", "Editorial | \n", "Richard Snodgrass | \n", "2001 | \n", "0 | \n", "0 | \n", "
10 | \n", "10 | \n", "l668 | \n", "r1694 | \n", "Indexing Multimedia Databases (Tutorial) | \n", "Christos Faloutsos | \n", "1995 | \n", "Information finding in a digital library: the Stanford perspective | \n", "Tak W. Yan, Héctor García-Molina | \n", "1995 | \n", "0 | \n", "0 | \n", "
11 | \n", "11 | \n", "l1189 | \n", "r1674 | \n", "Weimin Du, Xiangning Liu, Abdelsalam Helal | \n", "Multiview Access Protocols for Large-Scale Replication | \n", "1998 | \n", "Multiview access protocols for large-scale replication | \n", "Xiangning Liu, Abdelsalam Helal, Weimin Du | \n", "1998 | \n", "1 | \n", "0 | \n", "
12 | \n", "12 | \n", "l1657 | \n", "r110 | \n", "Semantic B2B Integration | \n", "Christoph Bussler | \n", "2001 | \n", "Monitoring business processes through event correlation based on dependency model | \n", "Asaf Adii, David Botzer, Opher Etzion, Tali Yatzkar-Haham | \n", "2001 | \n", "0 | \n", "0 | \n", "
13 | \n", "13 | \n", "l1490 | \n", "r599 | \n", "Extracting Large Data Sets using DB2 Parallel Edition | \n", "Sriram Padmanabhan | \n", "1996 | \n", "Extracting Large Data Sets using DB2 Parallel Edition | \n", "Sriram Padmanabhan | \n", "1996 | \n", "1 | \n", "1 | \n", "
14 | \n", "14 | \n", "l595 | \n", "r87 | \n", "Of Crawlers, Portals, Mice and Men: Is there more to Mining the Web? (Panel) | \n", "Kyuseok Shim, Rajeev Rastogi, Minos N. Garofalakis, Sridhar Ramaswamy | \n", "1999 | \n", "Of crawlers, portals, mice, and men: is there more to mining the Web? | \n", "Minos N. Garofalakis, Sridhar Ramaswamy, Rajeev Rastogi, Kyuseok Shim | \n", "1999 | \n", "1 | \n", "1 | \n", "
15 | \n", "15 | \n", "l380 | \n", "r1337 | \n", "Outerjoin Simplification and Reordering for Query Optimization | \n", "Csar A. Galindo-Legaria, Arnon Rosenthal | \n", "1997 | \n", "Outerjoin simplification and reordering for query optimization | \n", "César Galindo-Legaria, Arnon Rosenthal | \n", "1997 | \n", "1 | \n", "1 | \n", "
16 | \n", "16 | \n", "l165 | \n", "r1118 | \n", "Cache-and-Query for Wide Area Sensor Databases | \n", "Phillip B. Gibbons, Srinivasan Seshan, Suman Kumar Nath, Amol Deshpande | \n", "2003 | \n", "Cache-and-query for wide area sensor databases | \n", "Amol Deshpande, Suman Nath, Phillip B. Gibbons, Srinivasan Seshan | \n", "2003 | \n", "1 | \n", "1 | \n", "
17 | \n", "17 | \n", "l796 | \n", "r588 | \n", "Generating Dynamic Content at Database-Backed Web Servers: cgi-bin vs. mod_perl | \n", "Alexandros Labrinidis, Nick Roussopoulos | \n", "2000 | \n", "Novel Approaches in Query Processing for Moving Object Trajectories | \n", "Dieter Pfoser, Christian S. Jensen, Yannis Theodoridis | \n", "2000 | \n", "0 | \n", "0 | \n", "
18 | \n", "18 | \n", "l1160 | \n", "r1733 | \n", "Khaled Alsabti, Vineet Singh, Sanjay Ranka | \n", "A One-Pass Algorithm for Accurately Estimating Quantiles for Disk-Resident Data | \n", "1997 | \n", "A One-Pass Algorithm for Accurately Estimating Quantiles for Disk-Resident Data | \n", "Khaled Alsabti, Sanjay Ranka, Vineet Singh | \n", "1997 | \n", "1 | \n", "0 | \n", "
19 | \n", "19 | \n", "l1752 | \n", "r3 | \n", "SHORE: Combining the Best Features of OODBMS and File Systems | \n", "Shore Team | \n", "1995 | \n", "The LyriC language: querying constraint objects | \n", "Alexander Brodsky, Yoram Kornatzky | \n", "1995 | \n", "0 | \n", "0 | \n", "
20 | \n", "20 | \n", "l1647 | \n", "r945 | \n", "Cost Based Query Scrambling for Initial Delays | \n", "Tolga Urhan, Michael J. Franklin, Laurent Amsaleg | \n", "1998 | \n", "The Cubetree Storage Organization | \n", "Nick Roussopoulos, Yannis Kotidis | \n", "1998 | \n", "0 | \n", "0 | \n", "
21 | \n", "21 | \n", "l1135 | \n", "r1127 | \n", "Sampling-Based Estimation of the Number of Distinct Values of an Attribute | \n", "Peter J. Haas, Lynne Stokes, S. Seshadri, Jeffrey F. Naughton | \n", "1995 | \n", "View maintenance in a warehousing environment | \n", "Yue Zhuge, Héctor García-Molina, Joachim Hammer, Jennifer Widom | \n", "1995 | \n", "0 | \n", "0 | \n", "
22 | \n", "22 | \n", "l1776 | \n", "r987 | \n", "Walking Through a Very Large Virtual Environment in Real-time | \n", "Yixin Ruan, Kian-Lee Tan, Jason Chionh, Lidan Shou, Zhiyong Huang | \n", "2001 | \n", "Walking Through a Very Large Virtual Environment in Real-time | \n", "Lidan Shou, Jason Chionh, Zhiyong Huang, Yixin Ruan, Kian-Lee Tan | \n", "2001 | \n", "1 | \n", "1 | \n", "
23 | \n", "23 | \n", "l676 | \n", "r1395 | \n", "Datawarehousing Has More Colours Than Just Black & White | \n", "Thomas Zurek, Markus Sinnwell | \n", "1999 | \n", "Datawarehousing Has More Colours Than Just Black &; White | \n", "Thomas Zurek, Markus Sinnwell | \n", "1999 | \n", "1 | \n", "1 | \n", "
24 | \n", "24 | \n", "l1087 | \n", "r648 | \n", "The Grid: An Application of the Semantic Web | \n", "Carole A. Goble, David De Roure | \n", "2002 | \n", "An XML query engine for network-bound data | \n", "Zachary G. Ives, A. Y. Halevy, D. S. Weld | \n", "2002 | \n", "0 | \n", "0 | \n", "
25 | \n", "25 | \n", "l629 | \n", "r1478 | \n", "Engineering Federated Information Systems: Report of EFIS '99 Workshop | \n", "Flix Saltor, Uwe Hohenstein, Ralf-Detlef Kutsche, Wilhelm Hasselbring, Gunter Saake, Stefan Conr... | \n", "1999 | \n", "Engineering federated information systems: report of EEFIS '99 workshop | \n", "S. Conrad, W. Hasselbring, U. Hohenstein, R.-D. Kutsche, M. Roantree, G. Saake, F. Saltor | \n", "1999 | \n", "1 | \n", "1 | \n", "
26 | \n", "26 | \n", "l649 | \n", "r1366 | \n", "Random Sampling for Histogram Construction: How much is enough? | \n", "Vivek R. Narasayya, Rajeev Motwani, Surajit Chaudhuri | \n", "1998 | \n", "Random sampling for histogram construction: how much is enough? | \n", "Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya | \n", "1998 | \n", "1 | \n", "1 | \n", "
27 | \n", "27 | \n", "l211 | \n", "r1490 | \n", "BeSS: Storage Support for Interactive Visualization Systems | \n", "William O'Connell, Thomas A. Funkhouser, Alexandros Biliris, Euthimios Panagos | \n", "1996 | \n", "BeSS: storage support for interactive visualization systems | \n", "A. Biliris, T. A. Funkhouser, W. O'Connell, E. Panagos | \n", "1996 | \n", "1 | \n", "1 | \n", "
28 | \n", "28 | \n", "l734 | \n", "r384 | \n", "Min-Max Compression Methods for Medical Image Databases | \n", "John M. Tyler, Kosmas Karadimitriou | \n", "1997 | \n", "Min-max compression methods for medical image databases | \n", "Kosmas Karadimitriou, John M. Tyler | \n", "1997 | \n", "1 | \n", "1 | \n", "
29 | \n", "29 | \n", "l611 | \n", "r141 | \n", "Mining Generalized Association Rules | \n", "Ramakrishnan Srikant, Rakesh Agrawal | \n", "1995 | \n", "Multi-table joins through bitmapped join indices | \n", "Patrick O'Neil, Goetz Graefe | \n", "1995 | \n", "0 | \n", "0 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
420 | \n", "420 | \n", "l834 | \n", "r883 | \n", "Estimating the Selectivity of XML Path Expressions for Internet Scale Applications | \n", "Ashraf Aboulnaga, Jeffrey F. Naughton, Alaa R. Alameldeen | \n", "2001 | \n", "Estimating the Selectivity of XML Path Expressions for Internet Scale Applications | \n", "Ashraf Aboulnaga, Alaa R. Alameldeen, Jeffrey F. Naughton | \n", "2001 | \n", "1 | \n", "1 | \n", "
421 | \n", "421 | \n", "l746 | \n", "r301 | \n", "Providing Database Migration Tools - A Practicioner's Approach | \n", "Andreas Meier | \n", "1995 | \n", "Providing Database Migration Tools - A Practicioner's Approach | \n", "Andreas Meier | \n", "1995 | \n", "1 | \n", "1 | \n", "
422 | \n", "422 | \n", "l1332 | \n", "r619 | \n", "Workshop on Workflow Management in Scientific and Engineering Applications - Report | \n", "Gottfried Vossen, Richard McClatchey | \n", "1997 | \n", "Workshop on workflow management in scientific and engineering applications-report | \n", "R. McClatchey, G. Vossen | \n", "1997 | \n", "1 | \n", "1 | \n", "
423 | \n", "423 | \n", "l942 | \n", "r1473 | \n", "Research in Databases and Data-Intensive Applications - Computer Science Department and FZI, Uni... | \n", "Birgitta Knig-Ries, Peter C. Lockemann | \n", "1997 | \n", "Research in databases and data-intensive applications: Computer Science Dept. and FIZ, Universit... | \n", "Brigitta König-Ries, Peter C. Lockermann | \n", "1997 | \n", "1 | \n", "1 | \n", "
424 | \n", "424 | \n", "l806 | \n", "r356 | \n", "Tribeca: A Stream Database Manager for Network Traffic Analysis | \n", "Mark Sullivan | \n", "1996 | \n", "Type-safe relaxing of schema consistency rules for flexible modelling in OODBMS | \n", "Eric Amiel, Marie-Jo Bellosta, Eric Dujardin, Eric Simon | \n", "1996 | \n", "0 | \n", "0 | \n", "
425 | \n", "425 | \n", "l794 | \n", "r784 | \n", "Spatial Data Management for Computer Aided Design | \n", "Andreas Mller, Marco Ptke, Thomas Seidl, Hans-Peter Kriegel | \n", "2001 | \n", "Dynamic content acceleration: a caching solution to enable scalable dynamic Web page generation | \n", "Anindya Datta, Kaushik Dutta, Krithi Ramamritham, Helen Thomas, Debra VanderMeer | \n", "2001 | \n", "0 | \n", "0 | \n", "
426 | \n", "426 | \n", "l28 | \n", "r1618 | \n", "Storage Technology: RAID and Beyond | \n", "Garth A. Gibson | \n", "1995 | \n", "Tutorial on storage technology: RAID and beyond | \n", "Garth A. Gibson | \n", "1995 | \n", "1 | \n", "1 | \n", "
427 | \n", "427 | \n", "l1183 | \n", "r1409 | \n", "Stephen Blott, Roger Weber, Hans-Jrg Schek | \n", "A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional ... | \n", "1998 | \n", "A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional ... | \n", "Roger Weber, Hans-Jörg Schek, Stephen Blott | \n", "1998 | \n", "1 | \n", "0 | \n", "
428 | \n", "428 | \n", "l1122 | \n", "r232 | \n", "Interview with Jim Gray | \n", "Marianne Winslett | \n", "2003 | \n", "In-context peer-to-peer information filtering on the Web | \n", "Aris M. Ouksel | \n", "2003 | \n", "0 | \n", "0 | \n", "
429 | \n", "429 | \n", "l1430 | \n", "r1444 | \n", "Condition Handling in SQL Persistent Stored Modules | \n", "Jeff Richey | \n", "1995 | \n", "Condition handling in SQL persistent stored modules | \n", "Jeff Richey | \n", "1995 | \n", "1 | \n", "1 | \n", "
430 | \n", "430 | \n", "l1494 | \n", "r1257 | \n", "The Mariposa Distributed Database Management System | \n", "Jeff Sidell | \n", "1996 | \n", "Open issues in parallel query optimization | \n", "Waqar Hasan, Daniela Florescu, Patrick Valduriez | \n", "1996 | \n", "0 | \n", "0 | \n", "
431 | \n", "431 | \n", "l1592 | \n", "r439 | \n", "Report on the 18th British National Conference on Databases (BNCOD) | \n", "Carole A. Goble, Brian J. Read | \n", "2002 | \n", "Contracting in the days of eBusiness | \n", "W. Hümmer, W. Lehner, H. Wedekind | \n", "2002 | \n", "0 | \n", "0 | \n", "
432 | \n", "432 | \n", "l1015 | \n", "r45 | \n", "Database Systems - Breaking Out of the Box | \n", "Abraham Silberschatz, Stanley B. Zdonik | \n", "1997 | \n", "Dynamic Memory Adjustment for External Mergesort | \n", "Weiye Zhang, Per-Åke Larson | \n", "1997 | \n", "0 | \n", "0 | \n", "
433 | \n", "433 | \n", "l1147 | \n", "r1016 | \n", "Xiaolei Qian | \n", "Scientist's Called Upon to Take Actions | \n", "1996 | \n", "Scientists called upon to take actions | \n", "Xiaolei Qian | \n", "1996 | \n", "1 | \n", "0 | \n", "
434 | \n", "434 | \n", "l1756 | \n", "r310 | \n", "ARIES/CSA: A Method for Database Recovery in Client-Server Architectures | \n", "C. Mohan, Inderpal Narang | \n", "1994 | \n", "Enterprise information architectures-they're finally changing | \n", "Wesley P. Melling | \n", "1994 | \n", "0 | \n", "0 | \n", "
435 | \n", "435 | \n", "l1044 | \n", "r67 | \n", "Digital Library Services in Mobile Computing | \n", "Evaggelia Pitoura, Melliyal Annamalai, Bharat K. Bhargava | \n", "1995 | \n", "Ordered shared locks for real-time databases | \n", "Divyakant Agrawal, Amr El Abbadi, Richard Jeffers, Lijing Lin | \n", "1995 | \n", "0 | \n", "0 | \n", "
436 | \n", "436 | \n", "l412 | \n", "r651 | \n", "Phoenix: Making Applications Robust | \n", "David B. Lomet, Roger S. Barga | \n", "1999 | \n", "DataBlitz storage manager: main-memory database performance for critical applications | \n", "J. Baulier, P. Bohannon, S. Gogate, C. Gupta, S. Haldar | \n", "1999 | \n", "0 | \n", "0 | \n", "
437 | \n", "437 | \n", "l796 | \n", "r1808 | \n", "Generating Dynamic Content at Database-Backed Web Servers: cgi-bin vs. mod_perl | \n", "Alexandros Labrinidis, Nick Roussopoulos | \n", "2000 | \n", "On wrapping query languages and efficient XML integration | \n", "Vassilis Christophides, Sophie Cluet, Jérǒme Simèon | \n", "2000 | \n", "0 | \n", "0 | \n", "
438 | \n", "438 | \n", "l1570 | \n", "r1468 | \n", "Instance-based attribute identification in database integration | \n", "Roger H. L. Chiang, Ee-Peng Lim, Chua Eng Huang Cecil | \n", "2003 | \n", "Index-driven similarity search in metric spaces | \n", "Gisli R. Hjaltason, Hanan Samet | \n", "2003 | \n", "0 | \n", "0 | \n", "
439 | \n", "439 | \n", "l1577 | \n", "r688 | \n", "Data Mining Using Two-Dimensional Optimized Accociation Rules: Scheme, Algorithms, and Visualiza... | \n", "Shinichi Morishita, Yasuhiko Morimoto, Takeshi Tokuyama, Takeshi Fukuda | \n", "1996 | \n", "Static detection of security flaws in object-oriented databases | \n", "Keishi Tajima | \n", "1996 | \n", "0 | \n", "0 | \n", "
440 | \n", "440 | \n", "l617 | \n", "r310 | \n", "Fine-Grained Sharing in a Page Server OODBMS | \n", "Michael J. Carey, Markos Zaharioudakis, Michael J. Franklin | \n", "1994 | \n", "Enterprise information architectures-they're finally changing | \n", "Wesley P. Melling | \n", "1994 | \n", "0 | \n", "0 | \n", "
441 | \n", "441 | \n", "l1304 | \n", "r1178 | \n", "Query Rewriting for Semistructured Data | \n", "Vasilis Vassalos, Yannis Papakonstantinou | \n", "1999 | \n", "The Aqua approximate query answering system | \n", "Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy | \n", "1999 | \n", "0 | \n", "0 | \n", "
442 | \n", "442 | \n", "l727 | \n", "r597 | \n", "Design and Analysis of Parametric Query Optimization Algorithms | \n", "Sumit Ganguly | \n", "1998 | \n", "Incremental distance join algorithms for spatial databases | \n", "Gísli R. Hjaltason, Hanan Samet | \n", "1998 | \n", "0 | \n", "0 | \n", "
443 | \n", "443 | \n", "l1205 | \n", "r395 | \n", "Proxy-Server Architectures for OLAP | \n", "Panos Kalnis, Dimitris Papadias | \n", "2001 | \n", "Proxy-server architectures for OLAP | \n", "Panos Kalnis, Dimitris Papadias | \n", "2001 | \n", "1 | \n", "1 | \n", "
444 | \n", "444 | \n", "l915 | \n", "r1532 | \n", "Efficient k-NN search on vertically decomposed data | \n", "Niels Nes, Martin L. Kersten, Nikos Mamoulis, Arjen P. de Vries | \n", "2002 | \n", "Efficient k-NN search on vertically decomposed data | \n", "Arjen P. de Vries, Nikos Mamoulis, Niels Nes, Martin Kersten | \n", "2002 | \n", "1 | \n", "1 | \n", "
445 | \n", "445 | \n", "l365 | \n", "r53 | \n", "50,000 Users on an Oracle8 Universal Server Database | \n", "Ashok Josji, Tirthankar Lahiri, Amit Jasuja, Sumanta Chatterjee | \n", "1998 | \n", "A workflow-based electronic marketplace on the Web | \n", "Asuman Dogac, Ilker Durusoy, Sena Arpinar, Nesime Tatbul, Pinar Koksal, Ibrahim Cingil, Nazife D... | \n", "1998 | \n", "0 | \n", "0 | \n", "
446 | \n", "446 | \n", "l458 | \n", "r767 | \n", "Comparing Hierarchical Data in External Memory | \n", "Sudarshan S. Chawathe | \n", "1999 | \n", "Context-Based Prefetch for Implementing Objects on Relations | \n", "Philip A. Bernstein, Shankar Pal, David Shutt | \n", "1999 | \n", "0 | \n", "0 | \n", "
447 | \n", "447 | \n", "l655 | \n", "r412 | \n", "The SDSS skyserver: public access to the sloan digital sky server data | \n", "Tanu Malik, Jordan Raddick, Alexander S. Szalay, Peter Z. Kunszt, Jim Gray, Christopher Stoughto... | \n", "2002 | \n", "Report on the ACM fourth international workshop on data warehousing and OLAP (DOLAP 2001) | \n", "Joachim Hammer | \n", "2002 | \n", "0 | \n", "0 | \n", "
448 | \n", "448 | \n", "l123 | \n", "r1493 | \n", "Change-Centric Management of Versions in an XML Warehouse | \n", "Laurent Mignet, Amlie Marian, Gregory Cobena, Serge Abiteboul | \n", "2001 | \n", "A Sequential Pattern Query Language for Supporting Instant Data Mining for e-Services | \n", "Reza Sadri, Carlo Zaniolo, Amir M. Zarkesh, Jafar Adibi | \n", "2001 | \n", "0 | \n", "0 | \n", "
449 | \n", "449 | \n", "l590 | \n", "r295 | \n", "Skew handling techniques in sort-merge join | \n", "Richard T. Snodgrass, Wei Li, Dengfeng Gao | \n", "2002 | \n", "QURSED: querying and reporting semistructured data | \n", "Yannis Papakonstantinou, Michalis Petropoulos, Vasilis Vassalos | \n", "2002 | \n", "0 | \n", "0 | \n", "
450 rows × 11 columns
\n", "